The Business Associate Subcontractor Trap When Your AI Vendor Uses OpenAI
You did your due diligence. You found an AI vendor that handles PHI, negotiated a Business Associate Agreement, confirmed they encrypt data in transit and at rest, and filed it all away. Solid compliance hygiene. But here is a question that keeps surfacing in breach investigations and OCR audits: did your vendor's vendor sign a BAA too?
If your AI tool routes PHI through a large language model provider like OpenAI, Anthropic, or Google, and there is no BAA between your vendor and that LLM provider, you have a broken chain. And under HIPAA, a broken chain is your problem.
The Subcontractor BAA Requirement
The 2013 HIPAA Omnibus Rule made this explicit. Under 45 CFR 164.502(e)(1)(ii) and 45 CFR 164.504(e)(1)(i), a business associate that engages a subcontractor to handle PHI must ensure that subcontractor agrees to the same restrictions and conditions that apply to the business associate. In practice, this means a written BAA at every link in the chain.
The regulatory logic is straightforward. If a covered entity shares PHI with Business Associate A, and Business Associate A shares that PHI with Subcontractor B to process it, Subcontractor B is a business associate of Business Associate A. The Omnibus Rule extended direct liability to subcontractors, meaning OCR can and does enforce against them independently. But the obligation to secure the BAA runs through the chain.
So when your healthcare AI vendor sends patient data, clinical notes, or any individually identifiable health information to an LLM API for inference, that LLM provider is a subcontractor performing a function involving PHI. No BAA between them? The chain is broken.
Why This Keeps Happening
Most major LLM providers have historically been reluctant to sign BAAs for their general API products. OpenAI introduced a BAA option for its Enterprise tier in early 2024, but the standard API tier and ChatGPT Team plans do not come with BAA coverage. Anthropic and Google have their own enterprise agreements with varying levels of HIPAA support, but the details matter enormously and change frequently.
The problem is that many health tech startups and mid-market SaaS vendors build on the standard API tiers. They are moving fast, they want GPT-4o or Claude capabilities, and they integrate the API before fully working through the compliance implications. Some genuinely do not realize they need a downstream BAA. Others assume their own BAA with the covered entity is sufficient, or that de-identification before the API call solves the problem.
Sometimes it does. If PHI is genuinely de-identified per the Safe Harbor method under 45 CFR 164.514(b)(2), it is no longer PHI and the subcontractor BAA requirement does not apply. But "we strip out names" is not Safe Harbor de-identification. Safe Harbor requires removal or generalization of 18 specific identifier categories, and the entity must have no actual knowledge that the remaining information could identify an individual. Clinical notes are notoriously difficult to fully de-identify because they contain contextual details, rare diagnoses, family histories, and geographic references that can re-identify patients even without explicit identifiers.
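To make this concrete, here is a minimal sketch of the kind of regex-based "name stripping" the next section flags as a red flag. The note text and name list are hypothetical, invented for illustration:

```python
import re

# Hypothetical clinical note and a toy "strip the names" redactor.
# This is NOT Safe Harbor de-identification.
KNOWN_NAMES = re.compile(r"\b(?:Jane|Doe)\b")

note = (
    "Jane Doe, seen 03/14/2024 at our Duluth clinic. "
    "62-year-old retired firefighter with Erdheim-Chester disease; "
    "twin sister was treated here last year."
)

redacted = KNOWN_NAMES.sub("[REDACTED]", note)
print(redacted)
# The explicit name is gone, but the date, town, occupation, rare
# diagnosis, and family detail all survive: contextual identifiers
# that Safe Harbor's 18 categories (dates, geographic subdivisions,
# and so on) require removing or generalizing.
```

The redacted note would still be trivially re-identifiable to anyone familiar with the community, which is exactly the "actual knowledge" problem Safe Harbor guards against.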
How to Spot a Broken Link
When evaluating an AI vendor that will touch PHI, you need to ask pointed questions. Not "are you HIPAA compliant?" (a meaningless question), but specific ones about the data flow.
- Does any PHI or partially de-identified data leave your infrastructure during inference? If the vendor calls an external API with patient data, you need to know.
- Which LLM provider do you use, and do you have a signed BAA with them? Ask to see it, or at minimum get a written attestation specifying the provider and the BAA's effective date.
- If you rely on de-identification before the API call, what method do you use? Ask whether it is Safe Harbor under 45 CFR 164.514(b)(2) or Expert Determination under 45 CFR 164.514(b)(1). Ask for documentation of the process. If they say "we use regex to strip PII," that is a red flag.
- Do you use a self-hosted or private instance of the model? Some vendors deploy open-weight models (Llama, Mistral) on their own infrastructure, which avoids the subcontractor issue entirely because no data leaves their environment. This is a meaningfully different architecture.
- What happens to data sent to the LLM provider? Even with a BAA in place, you want to confirm the LLM provider does not use input data for model training. OpenAI's Enterprise API and Anthropic's API both state they do not train on API inputs, but you should verify this is contractually guaranteed, not just a policy page.
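The questions above can be folded into a rough triage sketch. This is an illustrative toy, not a compliance framework; the field names and decision logic are assumptions layered on the article's reasoning, and real assessments need counsel review:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VendorAssessment:
    phi_leaves_vendor_infra: bool   # does inference call an external API with PHI?
    self_hosted_model: bool         # open-weight model inside the vendor's environment?
    deid_method: Optional[str]      # "safe_harbor", "expert_determination", or None
    downstream_baa_verified: bool   # signed BAA with the LLM provider, seen or attested?

def chain_is_intact(a: VendorAssessment) -> bool:
    """First-pass triage of the subcontractor BAA chain."""
    if not a.phi_leaves_vendor_infra or a.self_hosted_model:
        return True   # nothing leaves the vendor: no LLM subcontractor in the chain
    if a.deid_method in ("safe_harbor", "expert_determination"):
        return True   # properly de-identified data is no longer PHI
    return a.downstream_baa_verified  # otherwise a downstream BAA must exist

# Vendor calling a standard LLM API tier with PHI and no downstream BAA:
print(chain_is_intact(VendorAssessment(True, False, None, False)))  # False: broken chain
```

Note the ordering: self-hosting or genuine de-identification each resolve the question on their own, and only when neither applies does the analysis come down to verifying the downstream BAA.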
The Enforcement Reality
OCR has not yet brought a marquee enforcement action specifically about an AI vendor's missing subcontractor BAA. But the enforcement framework is well established. In 2016, OCR settled with North Memorial Health Care for $1.55 million in part because it had given a business associate, Accretive Health, access to PHI without a BAA in place. The missing agreement was a central finding. In the 2023 OCR investigation of Banner Health following a 2016 breach affecting 3.7 million individuals, subcontractor oversight failures were part of the broader compliance breakdown.
OCR's audit protocol explicitly examines whether business associates have obtained satisfactory assurances from subcontractors. The December 2023 OCR guidance on online tracking technologies reinforced that third parties receiving PHI through technology integrations are business associates and need BAAs. While that guidance focused on web trackers, the principle applies directly to API calls that transmit PHI to LLM providers.
The FTC has also entered this space. Its enforcement actions against BetterHelp ($7.8 million settlement in 2023) and GoodRx ($1.5 million penalty in 2023) involved health data sharing with third parties without adequate protections. While these were FTC Act and Health Breach Notification Rule cases rather than HIPAA, they signal a regulatory environment with very little patience for "we didn't realize our vendor was sharing data downstream."
Contractual Risk Beyond Regulatory Fines
Fines are one dimension. The contractual exposure is another. If your BAA with a vendor includes standard indemnification provisions and breach notification obligations, a breach at the LLM subcontractor layer triggers a cascade. Your vendor must notify you. You must notify affected individuals and HHS. If the breach affects more than 500 individuals, it goes on the OCR breach portal, colloquially known as the Wall of Shame, and stays there.
Litigation follows. Class action plaintiffs' firms monitor the breach portal closely. A 2024 analysis by BakerHostetler found that the average cost of a healthcare data breach reached $10.93 million, with legal fees and settlements comprising a growing share. A broken subcontractor BAA chain makes the covered entity's legal position materially worse because it suggests a compliance program that did not adequately vet downstream risk.
How FirmAdapt Addresses This
FirmAdapt was built to avoid the subcontractor trap entirely. The platform processes data within its own infrastructure using models deployed in environments that FirmAdapt controls. PHI does not get routed to third-party LLM APIs, which means there is no subcontractor BAA gap to worry about. The data flow is straightforward: your data stays within a compliance boundary that FirmAdapt maintains and can contractually guarantee.
FirmAdapt signs BAAs directly with covered entities and business associates. Because there is no downstream LLM provider in the chain, the BAA relationship is clean. For compliance teams doing vendor assessments, this simplifies the analysis considerably. You are evaluating one entity's security controls and contractual commitments, not trying to validate a chain of agreements that your vendor may or may not have actually secured.