Frequently Asked Questions

How much content does an AI agent need before it can go live?

Coverage of your top five to seven ticket categories is enough to launch. Accuracy matters far more than volume — a small, precise knowledge base outperforms a large, contradictory one. Pull your recent tickets, sort by reason, write the most common categories well, and expand later as escalations reveal real gaps.

Should I use my existing help center or rewrite it for AI?

Start with what you have, then revise. Most help center articles were written for human readers, so they need to become more specific, more self-contained, and explicit about exceptions before they retrieve well. Plan to review your top 10 to 15 articles against retrieval principles rather than assuming the existing content is AI-ready.

What if our policies change often?

Build the knowledge update into the policy-change workflow rather than treating it as a follow-up task. When operations changes the return window or a carrier, the knowledge base should be updated the same day. Assign one owner who is notified on every policy or process change so nothing lags behind reality.

Does the knowledge base also answer 'where is my order'?

No — and it should not try to. Order status changes per customer and per hour, so it belongs in live store and carrier data, not a document. A capable agent reads your documented policy from the knowledge base and looks up the specific order from your store, then combines them into one accurate answer.

Can the agent learn from conversations automatically?

It improves as your content improves, not by silently learning from individual chats. Platforms like Bookbag surface escalation patterns so you can see exactly what knowledge to add or fix, but a human review step before changes go live keeps quality under control. The loop is human-guided by design.

Building a Knowledge Base Your AI Agent Can Actually Use

A weak knowledge base produces confident wrong answers. A strong one lets an AI agent resolve the bulk of routine contacts on its own. The gap is almost entirely in how you write, structure, and maintain the content.

The Bookbag Team · June 2026 · 13 min read

In this article

Why the knowledge base is the real bottleneck
How an AI agent actually reads your knowledge
What to put in first
How to write for AI retrieval
Structuring policy documents
Knowledge plus live store data
Mistakes that wreck accuracy
How to measure if it's working
Maintenance and freshness
How Bookbag builds and maintains it

Why the knowledge base is the real bottleneck for AI support

When merchants are unhappy with an AI support agent, they almost always blame the model. The model is rarely the problem. Building a knowledge base your AI agent can use is the single highest-leverage thing you can do to improve answer quality, and most teams skip it because it is unglamorous content work rather than a settings toggle.

Here is the mechanism. A modern language model is good at reasoning over information you hand it. It is bad at inventing facts it was never given. When a customer asks "can I return a worn pair of boots after 40 days" and your knowledge base says nothing specific about worn items or the exact window, the agent has two options: escalate, or guess. A poorly grounded agent guesses, and a confident wrong answer is worse than no answer at all because the customer acts on it.

So the failures you see in testing — the made-up return window, the wrong shipping cutoff, the refund promise you never authorized — are usually not reasoning errors. They are gaps. The content the agent needed was missing, buried, ambiguous, or contradicted by another page. Fix the content and the same model suddenly looks far smarter.

This reframes the work in a useful way. You are not tuning an AI; you are writing the operating manual a sharp new hire would need to answer customers without bothering anyone. If a fact is not in that manual, no amount of model quality will conjure it. If two pages of the manual disagree, your new hire will sometimes pick the wrong one. Everything in this guide is about making that manual accurate, retrievable, and current.

The 70% rule of thumb

Across AI support audits, the large majority of wrong or unhelpful answers (roughly 70% as a rule of thumb) trace back to knowledge problems such as missing facts, stale content, or ambiguity, rather than the underlying model. That is good news: content is something you control, and fixing it requires no engineering.

How an AI agent actually reads your knowledge

An AI agent does not read your help center top to bottom the way a new hire would. It uses retrieval. When a question comes in, the system converts it into a vector, finds the chunks of your content that are semantically closest, and feeds only those chunks to the model as context. The model answers from what it was handed. This is the part most people misunderstand, and it changes how you should write.

Two consequences fall out of this. First, a fact that exists in your knowledge base but never gets retrieved is, for practical purposes, invisible. If the relevant paragraph is phrased nothing like how customers ask the question, retrieval misses it and the agent answers without it. Second, content is read in fragments. The agent might pull one 200-word passage out of the middle of a 2,000-word policy page, with none of the surrounding context. If that passage only makes sense after reading the three paragraphs above it, the answer will be wrong.
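The retrieval step described above can be sketched in a few lines. This is a toy illustration, not how any particular platform implements it: real systems use learned embeddings that capture meaning, while this sketch substitutes plain word-count vectors and cosine similarity so it runs with no dependencies. The function names and the sample chunks are assumptions made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[tuple[float, str]]:
    """Score every chunk against the question and return the top_k closest."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(chunk))), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical knowledge chunks. The first two state the same rule in
# customer language and in internal jargon respectively.
CHUNKS = [
    "Can I change my address? You can change the shipping address until the order is dispatched.",
    "Order mutation policy: modifications are permitted before dispatch.",
    "How long is the return window? You can return most items within 30 days of the delivery date.",
]

for score, chunk in retrieve("can I change my address", CHUNKS, top_k=3):
    print(f"{score:.2f}  {chunk}")
```

In this word-overlap stand-in, the jargon-phrased chunk shares no words with the customer's question, so it scores zero and would never reach the model. Embedding-based retrieval is more forgiving than word overlap, but the same pressure applies: the closer the content is to how customers phrase the question, the more reliably it is retrieved.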

Good knowledge content is therefore written for retrieval, not for a linear human reader. The requirements are concrete:

Specific over vague. "We process returns quickly" is noise. "We process refunds within 2 business days of receiving the returned item" is a fact the agent can quote.
Self-contained chunks. Each paragraph should stand on its own. Cut "see above" and "as mentioned earlier" — the agent may retrieve that paragraph alone.
Phrased like the question. Customers say "can I change my address," not "order mutation policy." Mirror their language so retrieval finds the right passage.
Unambiguous. If a rule has exceptions, list them in the same place as the rule. A general policy with a vague "contact us for exceptions" forces the agent to guess.
Current. An outdated article is worse than a missing one, because the agent will cite it with full confidence.
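The self-contained requirement is easy to check mechanically. The sketch below is one possible lint, assuming your articles are available as a list of paragraph strings. The phrase list starts from the two examples above ("see above", "as mentioned earlier") and adds two more as assumptions; extend it with whatever cross-references your own writers tend to use.

```python
# Phrases that only make sense with surrounding context. The first two come
# from the guidance above; the last two are assumed additions.
DANGLING_REFERENCES = [
    "see above",
    "as mentioned earlier",
    "as noted above",
    "see below",
]


def dangling_references(paragraph: str) -> list[str]:
    """Return the context-dependent phrases found in one paragraph."""
    lowered = paragraph.lower()
    return [phrase for phrase in DANGLING_REFERENCES if phrase in lowered]


def audit(paragraphs: list[str]) -> dict[int, list[str]]:
    """Map paragraph index -> offending phrases, skipping clean paragraphs."""
    findings = {}
    for index, paragraph in enumerate(paragraphs):
        hits = dangling_references(paragraph)
        if hits:
            findings[index] = hits
    return findings


# Hypothetical article paragraphs for illustration.
sample = [
    "We process refunds within 2 business days of receiving the returned item.",
    "As mentioned earlier, the same window applies to exchanges (see above).",
]

print(audit(sample))
```

A check like this only catches explicit cross-references. A paragraph can still depend on context without using any of these phrases, so it complements a human read rather than replacing one.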

What to put in your knowledge base first

Build in ticket-volume order, not alphabetical order and not in the order your operations team happens to think of things. The goal at launch is not a complete encyclopedia. It is precise coverage of the questions that actually drive your contact volume. A tight knowledge base covering your top handful of categories will outperform a sprawling one that is comprehensive but inconsistent.

Pull your last 60 to 90 days of tickets, tag them by reason, and sort by frequency. For most ecommerce stores the same categories dominate: where is my order, returns and exchanges, sizing and fit, order changes, and discount questions. Write those first, write them well, then expand. Here is the build order that works for the typical store:

1. Returns and exchanges. The complete policy: window, condition requirements, who pays return shipping, what is eligible, what is final sale, and the exact steps to start one. This single category often drives the most repeat contacts.
2. Shipping and delivery. Processing time before dispatch, transit times by region and method, carrier options, holiday and peak-season cutoffs, and what happens when a parcel is delayed or marked delivered but missing.
3. Order changes and cancellation. What can be changed (address, size, items), the cutoff window for changes, and how a customer requests one.
4. Product and fit questions. Sizing charts with real measurements, materials, care instructions, allergens or ingredients, and compatibility notes for tech. Cover the specific questions your tickets show, not generic filler.
5. Promotions and discounts. How codes apply, expiry, stacking rules, exclusions, and what to do when a valid code is rejected at checkout.
6. Subscriptions, if you sell them. How to pause, skip, swap, change frequency, update payment, and cancel, with the exact path for each.
7. Escalation and contact. When and how a customer reaches a human. The agent should surface this proactively for anything outside policy, never hide it.

Coverage beats volume

You do not need a thousand articles to launch. Seven well-written categories covering 80% of your ticket reasons will resolve more contacts than a bloated help center riddled with contradictions. Add depth later, guided by what customers actually escalate.
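To find out which categories get you to roughly 80% coverage, count tagged tickets and walk down the list by frequency. The sketch below assumes each ticket has already been tagged with a single reason string. The sample counts are invented for illustration and are not benchmarks.

```python
from collections import Counter


def categories_for_coverage(reasons: list[str], target: float = 0.8) -> list[tuple[str, int]]:
    """Return the most frequent ticket reasons that together reach `target` coverage."""
    counts = Counter(reasons)
    total = sum(counts.values())
    selected: list[tuple[str, int]] = []
    covered = 0
    for reason, count in counts.most_common():
        if total == 0 or covered >= target * total:
            break
        selected.append((reason, count))
        covered += count
    return selected


# Hypothetical 90-day ticket export, one tagged reason per ticket.
tickets = (
    ["where is my order"] * 40
    + ["returns and exchanges"] * 25
    + ["sizing and fit"] * 15
    + ["order changes"] * 10
    + ["discounts"] * 6
    + ["other"] * 4
)

for reason, count in categories_for_coverage(tickets):
    print(f"{reason}: {count}")
```

With this made-up distribution, three categories already account for 80 of 100 tickets. Your own split will differ, which is the reason to run the count on real ticket data instead of guessing.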

How to write knowledge content for AI retrieval

The core shift is this: a human reader infers context, and a retrieval system does not. Write every paragraph as if it might be read on its own, by someone who has not seen the rest of the page — because that is exactly what happens. The four habits below do most of the work.

None of this requires a rewrite of your whole help center on day one. Run your top 10 to 15 articles through these principles first; those cover the bulk of retrievals.

Lead with the answer, then explain

Put the key fact in the first sentence of a section, before any context or caveats. "Returns are free on orders over $75. Orders under $75 carry a $7 return shipping fee." Then explain edge cases. Burying the rule three paragraphs down means retrieval may grab the context and miss the rule.

Use a question-and-answer format

Structure content as the question a customer would type, immediately followed by the answer. "How long is the return window? You can return most items within 30 days of the delivery date." This mirrors how questions arrive, which sharpens retrieval, and it happens to make the page more scannable for humans too.

List every exception next to the rule

"All items can be returned within 30 days, except: final-sale items (marked on the product page), custom or personalized orders, and opened intimates." Keeping exceptions beside the rule stops the agent from applying the general policy where a carve-out should win. Exceptions stored on a separate page often never get retrieved with the rule they modify.

Use the customer's words, not your internal ones

Customers ask about "my order," not your "fulfillment record," and "the charge on my card," not a "settlement event." Write in the vocabulary customers use in real tickets. Internal jargon and SKU codes degrade retrieval because nobody phrases a question that way.

Structuring policy documents for an AI agent

Policy pages — returns, shipping, terms — are usually drafted by legal or operations in formal language built to limit liability, not to answer a customer. That language is hard for retrieval to parse and easy for the model to misread. Keep the legal version for compliance, and write a separate plain-language operational version for the agent to use. They serve different masters.

Use a consistent skeleton for every policy. When all your policies share the same shape, retrieval gets more predictable and your own audits get faster. This structure works well:

Section	What to include	Why it matters for the agent
Summary	The 2-3 rules a customer most needs, stated plainly up front	Most retrievals land here; it should answer the common case alone
Eligibility	Explicit list of what qualifies and what does not	Prevents the agent from guessing edge cases
Process steps	Numbered, exact actions to complete the request	Lets the agent walk a customer through it step by step
Timelines	Specific durations for each step, with numbers	Stops vague "a few days" answers that generate repeat contacts
Exceptions	Every carve-out with its condition, beside the rule	Keeps general rules from overriding a valid exception
If something goes wrong	The direct path to a human for true edge cases	Gives the agent a clean escalation instead of a guess
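A shared skeleton also makes audits scriptable. As a minimal sketch, assume each policy is stored as a mapping from section name to text. The section names come from the table above; the storage format and the sample draft are assumptions for illustration.

```python
# Section names taken from the skeleton table above.
REQUIRED_SECTIONS = [
    "Summary",
    "Eligibility",
    "Process steps",
    "Timelines",
    "Exceptions",
    "If something goes wrong",
]


def missing_sections(policy: dict[str, str]) -> list[str]:
    """List required sections that are absent or contain only whitespace."""
    return [name for name in REQUIRED_SECTIONS if not policy.get(name, "").strip()]


# Hypothetical draft of a returns policy, using example wording from this guide.
returns_draft = {
    "Summary": "You can return most items within 30 days of the delivery date.",
    "Eligibility": "Most items qualify. Final-sale items do not.",
    "Process steps": "",
    "Exceptions": "Final-sale items, custom or personalized orders, and opened intimates.",
}

print(missing_sections(returns_draft))
```

This only checks that each section exists and is not blank. Whether a section actually contains specific numbers and explicit exceptions still needs a human reviewer.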

Why static knowledge is only half the job

A perfect knowledge base still cannot answer the most common ecommerce question: where is my order. That answer does not live in a document. It lives in your store and your carrier's tracking, and it is different for every customer and every minute. This is the line between a chatbot that recites policy and an agent that resolves the contact.

Static knowledge handles the rules — your return window, your shipping policy, your sizing chart. Live data handles the specifics of this customer's situation — their order status, tracking number, eligibility for a refund under the rules you set. A capable agent uses both: it reads the policy from your knowledge base, looks up the order from your store, and combines them into one accurate answer. "Your order shipped Tuesday and is due Thursday" plus "and yes, it is inside your 30-day return window" is something neither the document nor the data could produce alone.

When you plan your knowledge base, decide deliberately which questions are answered by content and which require a live lookup. Trying to document things that change — inventory, an individual order's status, a specific refund amount — is a losing game. Document the rules; connect the system for the facts.

Question type	Answered by	Example
Policy and rules	Knowledge base content	"What is your return window?"
This order's status	Live store / carrier data	"Where is order #10482?"
Eligibility for an action	Knowledge rules + live data	"Can I still return this order?"
Product details	Catalog + knowledge content	"Does this run small?"
Taking the action	Connected store actions	"Start my return for the blue one."
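The "knowledge rules + live data" row is the one most worth seeing concretely. The sketch below combines a documented return window with a single order's delivery date. The `ReturnPolicy` and `Order` types, their fields, and the order-level tags are hypothetical simplifications; a real integration would read the order from your store platform's API and would likely check eligibility per item.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ReturnPolicy:
    """The documented rule: lives in the knowledge base."""
    window_days: int
    excluded_tags: frozenset


@dataclass(frozen=True)
class Order:
    """The live facts: would come from the store, not from a document."""
    order_id: str
    delivered_on: date
    tags: frozenset


def can_return(policy: ReturnPolicy, order: Order, today: date) -> tuple[bool, str]:
    """Apply the documented policy to one specific order."""
    excluded = sorted(policy.excluded_tags & order.tags)
    if excluded:
        return False, f"Order {order.order_id} is not returnable: {', '.join(excluded)}."
    deadline = order.delivered_on + timedelta(days=policy.window_days)
    if today > deadline:
        return False, f"The return window for order {order.order_id} closed on {deadline.isoformat()}."
    return True, f"Order {order.order_id} can be returned until {deadline.isoformat()}."


# Hypothetical policy and order for illustration.
policy = ReturnPolicy(window_days=30, excluded_tags=frozenset({"final-sale", "personalized"}))
order = Order(order_id="10482", delivered_on=date(2026, 6, 1), tags=frozenset())

print(can_return(policy, order, today=date(2026, 6, 20)))
```

Neither input answers the question alone. The policy knows nothing about order 10482, and the order record knows nothing about the 30-day rule. Keeping them separate also means a policy change is one edit to the rule, not a rewrite of every answer.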

Document the rules, connect the facts

If a piece of information changes per customer or per hour — order status, stock, a refund total — do not try to write it into a help article. Connect the live source instead and let the agent combine it with your documented policy.

Mistakes that quietly wreck AI accuracy

Most knowledge base damage is self-inflicted and invisible until a customer hits it. These are the patterns that show up again and again in audits, and each one is fixable in an afternoon once you know to look for it.

The worst offender is contradiction. When two pages disagree — your returns page says 30 days, an old FAQ says 14 — retrieval may surface either one, so the agent's answer becomes a coin flip. Conflicting content is more dangerous than missing content, because at least a gap produces a clean escalation.

Contradictory pages. The same fact stated two ways across articles. Pick one source of truth per fact and delete or redirect the rest.
Marketing copy as policy. "Hassle-free returns!" is not a rule the agent can apply. It needs the window, the conditions, and the steps.
PDFs and screenshots. Information trapped in an image or an un-parsed PDF often never gets retrieved. Put it in real text.
Stale dates and codes. A holiday cutoff or promo code left in after it expired will be quoted confidently months later. Date-stamp time-sensitive content.
Over-stuffing. Pasting your entire 4,000-word terms page in as one block dilutes retrieval. Break it into focused, titled sections.
No escalation language. If nothing tells the agent when to hand off, it will try to answer things it should not. Write explicit "if this, escalate" guidance.
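Contradictions like the 30-day versus 14-day example can often be caught with a crude scan before a customer finds them. The sketch below is a heuristic under stated assumptions: articles are plain text keyed by title, and any "N days" figure in a sentence that mentions "return" is treated as a possible return-window claim. It will produce false positives, so treat the output as a review list, not a verdict.

```python
import re
from collections import defaultdict

# Matches figures such as "30 days", "30-day", or "14 days".
DAY_COUNT = re.compile(r"\b(\d+)[\s-]*days?\b", re.IGNORECASE)


def return_window_claims(articles: dict[str, str]) -> dict[int, list[str]]:
    """Map each day count found in a return-related sentence to the articles stating it."""
    claims: dict[int, list[str]] = defaultdict(list)
    for title, body in articles.items():
        for sentence in re.split(r"(?<=[.!?])\s+", body):
            if "return" not in sentence.lower():
                continue
            for match in DAY_COUNT.finditer(sentence):
                claims[int(match.group(1))].append(title)
    return dict(claims)


# Hypothetical articles reproducing the conflict described above.
articles = {
    "Returns policy": "You can return most items within 30 days of the delivery date.",
    "Old FAQ": "Returns are accepted within 14 days. Orders are dispatched the next morning.",
}

claims = return_window_claims(articles)
if len(claims) > 1:
    print("Possible contradiction:", claims)
```

The same pattern extends to other facts that should have exactly one value, such as a free-shipping threshold or a processing time. In each case the fix is the one described above: pick one source of truth and delete or redirect the rest.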

Test like a skeptical customer

Before you call the knowledge base done, ask the agent the 20 nastiest real questions from your ticket queue — the worn-item return, the expired code, the address change after dispatch. The answers will tell you exactly which content is missing or contradictory.

How to measure if your knowledge base is working

You cannot improve a knowledge base by feel. Tie it to a few hard numbers and the weak spots become obvious. The two that matter most are resolution rate (the share of contacts the agent closes without a human) and escalation reasons (why it handed the rest off). Falling resolution or a cluster of escalations in one category is a content signal, not a model problem.

Watch these metrics, and read escalations as a to-do list rather than a failure log:

Metric	What it tells you	What to do when it slips
Resolution rate	Share of contacts closed without a human	Trace drops to the categories driving new escalations
Escalation reasons	Why the agent handed off	Each recurring reason is a missing or weak article
Repeat-contact rate	Customers coming back on the same issue	The answer existed but was vague; tighten the content
Answer accuracy / thumbs-down	Where customers flag a wrong answer	Find the article it cited and correct the fact
Coverage of top reasons	Whether your top 10 ticket types each have a dedicated article	Write the missing ones first
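The first two metrics in the table can be computed from a simple contact log. The sketch below assumes each contact records a category and, if it was handed to a human, a short escalation reason. The `Contact` type and the sample log are hypothetical, and most support platforms will report these figures directly.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contact:
    category: str
    # None means the agent closed the contact without a human.
    escalation_reason: Optional[str] = None


def resolution_rate(contacts: list[Contact]) -> float:
    """Share of contacts the agent closed without a human."""
    if not contacts:
        return 0.0
    resolved = sum(1 for contact in contacts if contact.escalation_reason is None)
    return resolved / len(contacts)


def top_escalation_reasons(contacts: list[Contact], n: int = 3) -> list[tuple[str, int]]:
    """Most frequent escalation reasons, which double as a content to-do list."""
    reasons = Counter(
        contact.escalation_reason
        for contact in contacts
        if contact.escalation_reason is not None
    )
    return reasons.most_common(n)


# Hypothetical contact log for illustration.
log = [
    Contact("shipping"),
    Contact("shipping"),
    Contact("returns"),
    Contact("discounts"),
    Contact("sizing"),
    Contact("returns", "worn-item return not documented"),
    Contact("returns", "worn-item return not documented"),
    Contact("order changes", "address change after dispatch"),
]

print(f"Resolution rate: {resolution_rate(log):.1%}")
print(top_escalation_reasons(log))
```

Grouping escalations by reason is what turns the log into a work queue: a reason that recurs points at one specific article to write or fix.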

Maintenance and keeping knowledge fresh

A knowledge base is not a project you finish; it is a loop you run. Escalations reveal gaps, gaps trigger updates, updates lift resolution, and the cycle repeats. The stores that get strong AI performance are not the ones with the most content on day one — they are the ones that close gaps fastest and never let stale facts accumulate. Treat the knowledge base as a living system with a clear owner.

Assign one person as the knowledge base owner, the same way you would own a product page. Then run this cadence:

Weekly (first three months): read every escalation from the past week. Any handoff caused by a knowledge gap should produce an article update within 24 hours.
Monthly: verify shipping timelines and any time-sensitive content — promo codes, seasonal cutoffs — especially after a carrier or fulfillment change.
Quarterly: full audit. Flag anything not reviewed in 90 days, re-test your top 10 ticket categories, and remove contradictions.
Event-triggered: whenever a policy actually changes — return window, carrier, pricing — update the same day. Build it into the change workflow so it is never an afterthought.
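The quarterly "not reviewed in 90 days" check is simple to automate if each article carries a last-reviewed date. A minimal sketch, assuming that date is tracked somewhere you can export as a title-to-date mapping (the titles and dates below are invented):

```python
from datetime import date, timedelta


def stale_articles(last_reviewed: dict[str, date], today: date, max_age_days: int = 90) -> list[str]:
    """Return titles whose last review is more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(title for title, reviewed in last_reviewed.items() if reviewed < cutoff)


# Hypothetical review log for illustration.
review_log = {
    "Returns and exchanges": date(2026, 6, 1),
    "Holiday shipping cutoffs": date(2025, 12, 1),
    "Promo code rules": date(2026, 3, 15),
}

print(stale_articles(review_log, today=date(2026, 6, 30)))
```

A stale flag means "someone should look at this", not "this is wrong". The event-triggered updates above still matter more, because a policy can change the day after an article was reviewed.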

The agent does not get smarter on its own. It gets smarter every time you close the gap an escalation just showed you.
— Ecommerce CX, Bookbag

How Bookbag builds and maintains the knowledge base

Bookbag is an AI customer support agent built for Shopify and ecommerce, and the knowledge work above is built into its setup. You point it at your existing help center, website, and policy pages, and it imports and indexes them for retrieval rather than asking you to start from a blank document. Most stores get a working agent live in under a day from that import alone.

Because Bookbag is an agent and not a script-based chatbot, it pairs that imported knowledge with live store data. It reads your documented return policy from the knowledge base, looks up the actual order from Shopify, WooCommerce, or BigCommerce, and combines them to answer "where is my order" (WISMO), returns, refunds, and product questions. It can also take the action itself, within the rules and caps you set. The static content and the live facts work together, which is the whole point of the section on static knowledge and live data.

On maintenance, the agent surfaces escalation patterns and flags content that looks stale, so the gap-closing loop has a dashboard instead of a spreadsheet. Scheduled auto-retrain re-indexes your content on a cadence you choose, with a human-in-the-loop review before changes affect answers. Pricing is flat and credit-based rather than per-resolution, so a knowledge base that deflects more does not quietly inflate your bill.

Key takeaways

Most AI support failures are knowledge problems — missing, stale, or contradictory content — not model limitations. Content is the highest-leverage fix.
AI agents read by retrieval, not top to bottom. Write self-contained chunks, lead with the fact, and phrase content the way customers ask.
Build in ticket-volume order. Seven precise categories beat a thousand inconsistent articles.
Document the rules in your knowledge base; connect live store data for order status, eligibility, and actions.
Run the loop: escalations reveal gaps, gaps trigger same-week updates, and resolution rate climbs over time.
Keep a legal policy for compliance and a separate plain-language operational version for the agent.
