Models & model choice

Choose the language model behind each agent and the embedding model behind its retrieval. Compare the built-in models, understand per-model credit costs, and learn how to match a model to your agent's job.


Every agent runs on a language model that composes its replies, and an embedding model that powers retrieval. Bookbag ships a built-in catalog of both, and you pick per agent. This page explains the choices and their trade-offs.

Model choice is per agent

Each agent has its own model, set in Agent settings. A fast model for a high-volume FAQ agent and a stronger model for a complex-policy agent can coexist in the same workspace.

The built-in model catalog

These language models are available out of the box. Each has a credit cost — the number of credits one reply consumes — that reflects its relative expense. See Credits & usage for how credits work.

Model	Provider	Credit cost / reply	Good for
Bookbag Local	Built-in	1	Keyless testing and simple extractive answers — no external provider needed.
GPT-4o mini	OpenAI	1	The everyday workhorse: fast, cheap, handles the bulk of routine support well.
Claude Haiku 4.5	Anthropic	2	Fast and capable; strong, well-grounded support answers.
Claude Sonnet 4.6	Anthropic	4	Higher reasoning for nuanced or multi-step policy questions.
GPT-4o	OpenAI	5	Top-tier reasoning and the broadest tool/function support.

Start cheap, upgrade only where it pays

Most routine ecommerce questions are answered just as well by a 1-credit model as a 5-credit one — because the answer is in your data, and the model is mostly composing it. Reserve premium models for agents that genuinely reason over complex policies or long context.
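The trade-off is easy to put in numbers. The sketch below multiplies the per-reply credit costs from the catalog table by a reply volume; the volume of 10,000 replies per month and the helper name `monthly_credits` are assumptions chosen for illustration, not Bookbag figures or APIs.

```python
# Credit cost per reply, copied from the catalog table above.
CREDITS_PER_REPLY = {
    "Bookbag Local": 1,
    "GPT-4o mini": 1,
    "Claude Haiku 4.5": 2,
    "Claude Sonnet 4.6": 4,
    "GPT-4o": 5,
}


def monthly_credits(model: str, replies_per_month: int) -> int:
    """Credits consumed if every reply in the month uses the given model."""
    return CREDITS_PER_REPLY[model] * replies_per_month


# Hypothetical volume, for illustration only.
replies = 10_000
for model in CREDITS_PER_REPLY:
    print(f"{model}: {monthly_credits(model, replies):,} credits")
```

At that volume, a 1-credit model uses 10,000 credits and a 5-credit model uses 50,000, so the premium model has to earn a fivefold difference in spend.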

How to choose a language model

For grounded support, the data does most of the work; the model's job is to phrase retrieved facts naturally and follow your guardrails. So:

Default to a fast, low-cost model (GPT-4o mini or Claude Haiku 4.5) for FAQ, order-status, and policy-lookup agents.
Step up to a stronger model (Claude Sonnet 4.6 or GPT-4o) only when an agent must reason across multiple sources, handle long conversations, or interpret ambiguous requests.
Use Bookbag Local for quick keyless trials and offline testing. No provider key is required, though answers are simpler.
Pick a function-calling model when your agent uses tool-style custom actions. In the built-in catalog, the OpenAI models (GPT-4o mini and GPT-4o) support function calling.
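The guidance above can be written down as a small decision function. This is a sketch of the heuristic only: `AgentProfile`, its fields, and `suggest_model` are hypothetical names, not Bookbag settings or APIs, and the tie-breaks (for example, preferring Claude Sonnet 4.6 over GPT-4o when function calling is not needed) are one reasonable reading of the list.

```python
from dataclasses import dataclass


@dataclass
class AgentProfile:
    # Hypothetical descriptors of an agent's job, used only for this sketch.
    multi_source_reasoning: bool = False
    long_conversations: bool = False
    ambiguous_requests: bool = False
    needs_function_calling: bool = False
    keyless_testing: bool = False


def suggest_model(profile: AgentProfile) -> str:
    """Return a catalog model name following the rules of thumb above."""
    if profile.keyless_testing:
        return "Bookbag Local"
    needs_reasoning = (
        profile.multi_source_reasoning
        or profile.long_conversations
        or profile.ambiguous_requests
    )
    if profile.needs_function_calling:
        # Per this page, the OpenAI models support function calling.
        return "GPT-4o" if needs_reasoning else "GPT-4o mini"
    if needs_reasoning:
        return "Claude Sonnet 4.6"
    return "GPT-4o mini"


print(suggest_model(AgentProfile()))
print(suggest_model(AgentProfile(multi_source_reasoning=True)))
```

Treat the result as a starting point: try the cheaper suggestion first and upgrade only if answer quality on real conversations requires it.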

Function calling

Function/tool calling lets an agent call your APIs as structured actions. In the built-in catalog, the OpenAI models support function calling; check the model's capabilities when wiring up custom actions.
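To give a sense of what a "structured action" looks like, here is a sketch of a tool definition in the shape OpenAI's Chat Completions API accepts in its `tools` parameter. The action name `get_order_status` and its `order_id` parameter are hypothetical, and whether Bookbag's custom actions use this exact shape is an assumption; consult the custom actions documentation for the real format.

```python
import json

# Hypothetical custom action: look up an order's status by ID.
# The outer structure follows OpenAI's function-calling tool format;
# the name, description, and parameters are made up for illustration.
order_status_tool = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the current status of a customer's order.",
        "parameters": {
            "type": "object",  # JSON Schema describing the arguments
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order number the customer provided.",
                },
            },
            "required": ["order_id"],
        },
    },
}

print(json.dumps(order_status_tool, indent=2))
```

The model never calls your API itself. It returns the action name and JSON arguments that match this schema, and the platform performs the actual request.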

Embedding models

The embedding model turns your sources and each incoming question into vectors so retrieval can find the most relevant chunks. It's separate from the language model and pinned per agent — every chunk an agent stores is embedded with the same model so dimensions never mix.

Embedding model	Provider	Dimensions	Notes
Local MiniLM	Built-in	384	Default. Strong keyless retrieval with no external provider.
Local Hashing 256	Built-in	256	Lightweight fully-offline option.
OpenAI text-embedding-3-small	OpenAI	1536	Higher-fidelity embeddings; requires an OpenAI key.

Changing an embedding model means retraining

Retrieval can only compare vectors from the same embedding model. If you change an agent's embedding model, retrain its data sources so the whole index is rebuilt; otherwise chunks embedded with the old model can no longer be matched against incoming questions.
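A minimal sketch of why mixed indexes fail, assuming retrieval ranks chunks by cosine similarity (a common choice, though this page does not say which metric Bookbag uses). The vectors are placeholder values sized to match the dimensions in the table above, not real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity; defined only for vectors of equal length."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Placeholder vectors: a chunk stored under Local MiniLM (384 dimensions)
# and a question embedded with text-embedding-3-small (1536 dimensions).
old_chunk = [0.1] * 384
new_query = [0.1] * 1536

try:
    cosine_similarity(old_chunk, new_query)
except ValueError as err:
    print(err)
```

Even two models with the same number of dimensions place text in unrelated vector spaces, so a matching length would only hide the problem. Rebuilding the index is the fix in both cases.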

Bringing your own keys

By default, replies run on Bookbag's infrastructure and are metered in credits. If you'd rather run on your own OpenAI or Anthropic account, you can bring your own API keys — which takes your workspace off credit metering for provider-backed models and bills usage directly to your provider account instead.

What's next

Credits & usage

How per-model credit costs add up and how to track spend.

Bring your own API keys

Run on your own OpenAI/Anthropic account and skip metering.

Agent settings

Set the model and temperature for an agent.

Data sources

Embedding models and the vector index in context.
