What It Means
Your AI doesn't know what PHI is. PHI detection catches what your model was never trained to avoid.
Under HIPAA, health information counts as protected health information (PHI) when it can be tied to an individual. The Safe Harbor de-identification standard names 18 categories of identifiers that create that link: names, dates directly related to an individual (birth, admission, discharge, death), telephone numbers, geographic subdivisions smaller than a state, fax numbers, Social Security numbers, email addresses, medical record numbers, account numbers, health plan beneficiary numbers, certificate/license numbers, vehicle identifiers, device identifiers, web URLs, IP addresses, biometric identifiers, full-face photos, and any other unique identifying number, characteristic, or code.
When AI generates patient-facing communications, it can inadvertently reproduce PHI from its training data, from the context window, or from documents pulled in by retrieval-augmented generation. PHI detection scans every AI-generated message for these identifiers before the message ships, and flags or blocks any message that contains a protected data element.
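As a rough illustration of the scanning step, the sketch below checks a message against regular expressions for a handful of format-based identifiers. It is a minimal example under stated assumptions, not a description of Bookbag's detector: the pattern set, the `PHI_PATTERNS` name, and the `scan_for_phi` function are all hypothetical, the patterns cover US-style formats only, and identifiers without a fixed format (names, dates in free text, photos) need techniques that regular expressions alone do not provide.

```python
import re

# Hypothetical, illustrative patterns for a few format-based identifiers.
# They cover 5 of the 18 categories and will miss many real-world variants.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "phone": re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "url": re.compile(r"https?://\S+"),
}


def scan_for_phi(message: str) -> list[tuple[str, str]]:
    """Return (identifier_type, matched_text) pairs found in the message."""
    findings = []
    for label, pattern in PHI_PATTERNS.items():
        for match in pattern.finditer(message):
            findings.append((label, match.group()))
    return findings


if __name__ == "__main__":
    draft = "Call 555-123-4567 or email jane@example.com. SSN 123-45-6789."
    for label, text in scan_for_phi(draft):
        print(f"{label}: {text}")
```

An empty result means only that none of these particular patterns matched, not that the message is free of PHI.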
Why It Matters
HIPAA's statutory civil penalties range from $100 to $50,000 per violation, with annual maximums of $1.5 million per violation category; the dollar amounts are adjusted periodically for inflation. A single AI-generated message containing a patient's medical record number, sent to the wrong recipient, can be a reportable breach. PHI detection is the frontline defense: it catches protected data before it leaves your system, so you are not discovering the exposure after the fact.
How Bookbag Helps
18-identifier scanning
Checks every message against all 18 HIPAA-defined PHI categories before delivery.
Automatic routing
Messages with detected PHI are blocked or routed to qualified reviewers. They are never shipped automatically.
Audit-ready logging
Every PHI detection event is logged with the identifier type, message context, and resolution for compliance documentation.
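The routing and logging behavior described above could look something like the following sketch. Everything here is an assumption made for illustration rather than Bookbag's API: the `review_message` function, the decision strings, the log field names, and the single SSN pattern standing in for a full detector. The sketch also records a message ID as the context instead of the message text, so that the audit log does not become a second copy of the PHI.

```python
import json
import re
from datetime import datetime, timezone

# Stand-in detector for the sketch: one hypothetical pattern (US SSN format).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def review_message(message_id: str, message: str, audit_log: list[str]) -> str:
    """Decide whether a message may ship, logging any detection event."""
    findings = SSN_PATTERN.findall(message)
    decision = "hold_for_review" if findings else "deliver"
    if findings:
        # One JSON line per detection event; the matched text is not stored.
        audit_log.append(json.dumps({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "message_id": message_id,
            "identifier_type": "ssn",
            "match_count": len(findings),
            "resolution": decision,
        }))
    return decision


if __name__ == "__main__":
    log: list[str] = []
    print(review_message("msg-001", "Your visit is confirmed.", log))
    print(review_message("msg-002", "SSN on file: 123-45-6789", log))
    print(log)
```

In a real deployment the log would go to append-only storage, and the resolution field would be updated once a reviewer acts on the held message.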