What counts as PHI in AI-generated messages?

Any of the 18 identifier categories listed in HIPAA's Safe Harbor de-identification standard, when tied to an individual's health information: names, dates, phone numbers, geographic data below the state level, Social Security numbers, medical record numbers, email addresses, and other unique identifiers. If your AI includes any of these in a patient-facing message, it's a potential HIPAA exposure.

How is PHI detection different from data masking?

Data masking redacts PHI for storage or display. PHI detection catches PHI in AI-generated outbound communications before they ship — it's a pre-delivery governance control, not a post-processing step.

Does PHI detection replace a HIPAA compliance program?

No. Bookbag supports compliance workflows — we don't replace compliance teams or provide legal compliance services. PHI detection is one governance control within a broader compliance program.

PHI Detection

Automated scanning of AI-generated communications for protected health information (PHI) as defined by HIPAA — including the 18 identifiers that constitute PHI — before messages reach patients or customers.

What It Means

Key Insight

Your AI doesn't know what PHI is. PHI detection catches what your model was never trained to avoid.

Protected health information includes 18 categories of identifiers under HIPAA: names, dates (birth, admission, discharge, death), telephone numbers, geographic subdivisions smaller than a state, fax numbers, Social Security numbers, email addresses, medical record numbers, account numbers, health plan beneficiary numbers, certificate/license numbers, vehicle identifiers, device identifiers, web URLs, IP addresses, biometric identifiers, full-face photos, and any other unique identifying number, characteristic, or code. When AI generates patient-facing communications, it can inadvertently include PHI from training data, context windows, or retrieval-augmented generation. PHI detection scans every AI-generated message for these identifiers before the message ships, flagging or blocking messages that contain protected data elements.
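To make the scanning step concrete, here is a minimal sketch in Python, not Bookbag's implementation. It assumes a handful of regular expressions standing in for a few of the 18 categories; the names `PATTERNS`, `Finding`, `scan_message`, and `decide` are hypothetical, and the medical record number format is an invented example. Regexes alone will miss names, addresses, and free-text dates, so a production detector would need much broader coverage.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; each covers one identifier category.
# The "mrn" format (the label MRN followed by 6-10 digits) is an assumption.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "mrn": re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE),
}


@dataclass(frozen=True)
class Finding:
    category: str  # which identifier category matched
    start: int     # character offset where the match begins
    end: int       # character offset just past the match


def scan_message(text: str) -> list[Finding]:
    """Return every pattern match in the message, ordered by position."""
    findings = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append(Finding(category, match.start(), match.end()))
    return sorted(findings, key=lambda f: f.start)


def decide(findings: list[Finding]) -> str:
    """Hold any message with a finding; let clean messages through."""
    return "hold_for_review" if findings else "deliver"


draft = "Hi Dana, call 555-867-5309 or email dana@example.com about MRN: 00123456."
findings = scan_message(draft)
print([f.category for f in findings], decide(findings))
```

Note that the name "Dana" passes through undetected in this sketch, which is why pattern matching is usually only one layer of a detector.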

Why It Matters

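The HITECH Act set civil penalties for HIPAA violations at $100 to $50,000 per violation, with an annual cap of $1.5 million for identical violations; HHS adjusts these amounts each year for inflation, so current figures are higher. A single AI-generated message containing a patient's medical record number sent to the wrong recipient is presumed to be a reportable breach unless a risk assessment shows a low probability that the data was compromised. PHI detection is the frontline defense: it catches protected data before it leaves your system, rather than leaving you to discover the exposure after the fact.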

How Bookbag Helps

18-identifier scanning

Checks every message against all 18 HIPAA-defined PHI categories before delivery.

Automatic routing

Messages with detected PHI are blocked or routed to qualified reviewers — never shipped automatically.

Audit-ready logging

Every PHI detection event is logged with the identifier type, message context, and resolution for compliance documentation.
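The routing and logging behavior described above could be sketched roughly as follows. This is an illustration under stated assumptions, not Bookbag's API: `DetectionEvent`, `route`, and the field names are hypothetical, the audit log is a plain in-memory list of JSON lines, and `context` is assumed to be a short label (such as a template name) rather than raw message text, so the log does not itself copy PHI.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class DetectionEvent:
    message_id: str
    identifier_types: list[str]  # e.g. ["ssn", "mrn"]
    context: str                 # assumed: a label, not the message body
    resolution: str              # what happened to the message
    logged_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def route(message_id: str, identifier_types: list[str],
          context: str, audit_log: list[str]) -> str:
    """Deliver clean messages; hold and log anything with detected PHI."""
    if not identifier_types:
        return "deliver"
    resolution = "routed_to_reviewer"
    event = DetectionEvent(message_id, identifier_types, context, resolution)
    audit_log.append(json.dumps(asdict(event)))  # one JSON record per event
    return resolution


audit_log: list[str] = []
print(route("msg-001", ["ssn"], "appointment_reminder", audit_log))
print(route("msg-002", [], "appointment_reminder", audit_log))
print(len(audit_log))
```

Only messages with findings produce a log record here; whether clean messages should also be recorded is a design choice this sketch leaves open.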

Frequently Asked Questions
