# Models & model choice

> Choose the language model behind each agent and the embedding model behind its retrieval. Compare the built-in models, understand per-model credit costs, and learn how to match a model to your agent's job.

Every agent runs on a **language model** that composes its replies, and an **embedding model** that powers retrieval. Bookbag ships a built-in catalog of both, and you pick per agent. This page explains the choices and their trade-offs.

> **MODEL CHOICE IS PER AGENT:** Set the model in [Agent settings](/docs/agents/settings). A fast model for a high-volume FAQ agent and a stronger model for a complex-policy agent can coexist in the same workspace.

## The built-in model catalog

These language models are available out of the box. Each has a **credit cost** — the number of credits one reply consumes — that reflects its relative expense. See [Credits & usage](/docs/agents/credits) for how credits work.

| Model | Provider | Credit cost / reply | Good for |
| --- | --- | --- | --- |
| Bookbag Local | Built-in | 1 | Keyless testing and simple extractive answers — no external provider needed. |
| GPT-4o mini | OpenAI | 1 | The everyday workhorse: fast, cheap, handles the bulk of routine support well. |
| Claude Haiku 4.5 | Anthropic | 2 | Fast and capable; strong, well-grounded support answers. |
| Claude Sonnet 4.6 | Anthropic | 4 | Higher reasoning for nuanced or multi-step policy questions. |
| GPT-4o | OpenAI | 5 | Top-tier reasoning and the broadest tool/function support. |

> **START CHEAP, UPGRADE ONLY WHERE IT PAYS:** Most routine ecommerce questions are answered just as well by a 1-credit model as a 5-credit one — because the *answer is in your data*, and the model is mostly composing it. Reserve premium models for agents that genuinely reason over complex policies or long context.
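
To make the trade-off concrete, here is a minimal sketch of the arithmetic behind the table above: credits consumed are the per-reply cost multiplied by the number of replies. The costs are copied from the catalog; the `monthly_credits` helper and the 10,000-reply workload are assumptions made up for illustration, not part of Bookbag.

```python
# Credit cost per reply, copied from the catalog table above.
CREDITS_PER_REPLY = {
    "Bookbag Local": 1,
    "GPT-4o mini": 1,
    "Claude Haiku 4.5": 2,
    "Claude Sonnet 4.6": 4,
    "GPT-4o": 5,
}


def monthly_credits(model: str, replies_per_month: int) -> int:
    """Estimate credits consumed: per-reply cost times number of replies."""
    return CREDITS_PER_REPLY[model] * replies_per_month


# Hypothetical workload: 10,000 replies a month on each model.
for model in CREDITS_PER_REPLY:
    print(f"{model}: {monthly_credits(model, 10_000):,} credits")
```

At that volume, the gap between a 1-credit and a 5-credit model is 40,000 credits a month, which is why it pays to reserve premium models for the agents that need them.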

## How to choose a language model

For grounded support, the data does most of the work; the model's job is to phrase retrieved facts naturally and follow your guardrails. So:

- **Default to a fast, low-cost model** (GPT-4o mini or Claude Haiku 4.5) for FAQ, order-status, and policy-lookup agents.
- **Step up to a stronger model** (Claude Sonnet 4.6 or GPT-4o) only when an agent must reason across multiple sources, handle long conversations, or interpret ambiguous requests.
- **Use Bookbag Local** for quick keyless trials and offline testing — no provider key required, though answers are simpler.
- **Pick a function-calling model** when your agent uses tool-style [custom actions](/docs/actions/custom-action). In the built-in catalog, the OpenAI models (GPT-4o mini and GPT-4o) support function calling.
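
The guidance above can be sketched as a small decision helper. This is only an illustration of the rules of thumb: the `suggest_model` function and its flags are hypothetical, the order of the checks is an assumption, and picking Claude Sonnet 4.6 over GPT-4o for complex reasoning is an arbitrary choice between two options the list treats as equivalent.

```python
def suggest_model(
    keyless_trial: bool = False,
    needs_function_calling: bool = False,
    complex_reasoning: bool = False,
) -> str:
    """Suggest a built-in language model from the rules of thumb above.

    Hypothetical helper for illustration; not a Bookbag API.
    """
    # Assumption: a keyless trial takes priority over every other need.
    if keyless_trial:
        return "Bookbag Local"
    # In the built-in catalog, the OpenAI models support function calling.
    if needs_function_calling:
        return "GPT-4o" if complex_reasoning else "GPT-4o mini"
    # Multi-source reasoning, long conversations, or ambiguous requests.
    if complex_reasoning:
        return "Claude Sonnet 4.6"  # GPT-4o would fit this rule equally well
    # Default: a fast, low-cost model for FAQ and lookup agents.
    return "GPT-4o mini"


print(suggest_model())                             # FAQ agent
print(suggest_model(complex_reasoning=True))       # complex-policy agent
print(suggest_model(needs_function_calling=True))  # agent with custom actions
```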

> **FUNCTION CALLING:** Function/tool calling lets an agent call your APIs as structured actions. In the built-in catalog, the OpenAI models support function calling; check the model's capabilities when wiring up [custom actions](/docs/actions/custom-action).
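
For a sense of what a "structured action" looks like to the model, here is the general shape of a function tool definition in OpenAI's Chat Completions API: a name, a description, and a JSON Schema for the arguments. This is a sketch only. The `get_order_status` action and its `order_id` parameter are hypothetical, and how Bookbag's custom actions map onto this format is an assumption not covered on this page.

```python
import json

# Hypothetical action: the name, description, and parameters are illustrative only.
order_status_tool = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the current status of a customer's order.",
        "parameters": {
            # Arguments are described with JSON Schema.
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order number the customer provided.",
                },
            },
            "required": ["order_id"],
        },
    },
}

print(json.dumps(order_status_tool, indent=2))
```

A model that supports function calling can respond with the function name and a JSON object of arguments matching this schema, instead of free text, so the call can be routed to your API.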

## Embedding models

The **embedding model** turns your sources and each incoming question into vectors so retrieval can find the most relevant chunks. It's separate from the language model and pinned per agent — every chunk an agent stores is embedded with the same model so dimensions never mix.

| Embedding model | Provider | Dimensions | Notes |
| --- | --- | --- | --- |
| Local MiniLM | Built-in | 384 | Default. Strong keyless retrieval with no external provider. |
| Local Hashing 256 | Built-in | 256 | Lightweight fully-offline option. |
| OpenAI text-embedding-3-small | OpenAI | 1536 | Higher-fidelity embeddings; requires an OpenAI key. |

> **CHANGING AN EMBEDDING MODEL MEANS RETRAINING:** Retrieval can only compare vectors from the same embedding model. If you change an agent's embedding model, retrain its [data sources](/docs/agents/data-sources) so the whole index is rebuilt. Otherwise, chunks embedded with the old model can't be matched against questions embedded with the new one, and they become unsearchable.
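
The sketch below shows why mixed indexes fail, using cosine similarity as a stand-in for the retrieval comparison (an assumption; this page does not say which similarity measure Bookbag uses). The vectors are dummy values sized to match the table above, not real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Dummy stand-ins: a chunk stored under Local MiniLM (384 dimensions) and a
# question embedded with OpenAI text-embedding-3-small (1536 dimensions).
stored_chunk = [0.1] * 384
new_question = [0.1] * 1536

try:
    cosine_similarity(stored_chunk, new_question)
except ValueError as err:
    print(f"Cannot compare: {err}")
```

Even when two models happen to share a dimension count, their vector spaces are unrelated, so similarity scores across models are meaningless. Rebuilding the index is what puts every chunk and every question back in the same space.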

## Bringing your own keys

By default, replies run on Bookbag's infrastructure and are metered in credits. If you'd rather run on your own OpenAI or Anthropic account, you can [bring your own API keys](/docs/agents/byo-keys) — which takes your workspace off credit metering for provider-backed models and bills usage directly to your provider account instead.

## What's next

- [Credits & usage](/docs/agents/credits) — How per-model credit costs add up and how to track spend.
- [Bring your own API keys](/docs/agents/byo-keys) — Run on your own OpenAI/Anthropic account and skip metering.
- [Agent settings](/docs/agents/settings) — Set the model and temperature for an agent.
- [Data sources](/docs/agents/data-sources) — Embedding models and the vector index in context.
