
Health Information Exchanges, Interoperability, and the AI Aggregation Risk

By Basel Ismail · May 4, 2026

Health Information Exchanges have been around long enough that most compliance teams treat them as mature infrastructure. Routine plumbing. Data flows in from hospitals, labs, pharmacies, and payers; data flows out to authorized participants. The compliance framework is well understood: Business Associate Agreements, minimum necessary standards, HIPAA's transaction and code set rules, and state-level consent requirements layered on top. Straightforward enough.

Then you add AI, and the risk profile changes in ways that the existing control framework was never designed to handle.

The Scale of What HIEs Actually Hold

To appreciate the problem, consider the volume. Civitas Networks for Health, the successor organization to the Strategic Health Information Exchange Collaborative (SHIEC), reported in 2023 that HIEs collectively process billions of clinical transactions annually across all 50 states. The largest networks, such as CommonWell Health Alliance and Carequality, connect tens of thousands of provider organizations. The Indiana Health Information Exchange (IHIE), one of the oldest in the country, holds records on more than 15 million patients. New York's SHIN-NY spans the entire state, connecting regional health information organizations that together serve more than 20 million residents.

This is not a single hospital's EHR. This is longitudinal, cross-institutional, multi-payer health data aggregated at population scale. When you train or fine-tune an AI model on data at this density, you create something qualitatively different from what HIPAA's framers were contemplating in 1996.

The 21st Century Cures Act Made This Worse (On Purpose)

The Cures Act, signed in December 2016, was designed to accelerate interoperability. Section 4002 directed HHS to develop conditions and maintenance of certification requirements for health IT, and Section 4004 defined information blocking. The information blocking provisions, which became applicable on April 5, 2021, made it affirmatively illegal for providers, health IT developers, and HIEs to unreasonably restrict the access, exchange, or use of electronic health information (EHI).

ONC's final rule (85 FR 25642) defined EHI broadly, encompassing essentially all electronic PHI in a designated record set. The eight exceptions to information blocking are narrow and specific. If you are an HIE and a connected AI vendor requests data access through a legitimate channel, your ability to say "no" on privacy grounds is constrained unless you can fit your refusal into one of those exceptions, most likely the Privacy Exception (45 CFR 171.202) or the Security Exception (45 CFR 171.203).

So the regulatory environment simultaneously demands broad data sharing and imposes strict privacy controls. Those two mandates coexist in tension, and AI is the thing that turns that tension into actual risk.

What AI Aggregation Risk Actually Looks Like

The core problem is concentration. When an AI system ingests data from an HIE, it can correlate across institutions, time periods, and data types in ways that no individual provider could. This creates several concrete risks:

  • Re-identification at scale. De-identified data under the HIPAA Safe Harbor method (45 CFR 164.514(b)) relies on removing 18 identifiers. But research has repeatedly shown that combining de-identified datasets makes re-identification feasible. A 2019 study in Nature Communications by Rocher, Hendrickx, and de Montjoye demonstrated that 99.98% of Americans could be correctly re-identified in any dataset using just 15 demographic attributes. An AI model trained on HIE-scale data has access to far more than 15 attributes per person; a sketch of how to measure this on your own extracts follows this list.
  • Model memorization. Large language models and deep learning systems can memorize training data. Carlini et al. demonstrated in 2021 that GPT-2 could regurgitate verbatim training sequences. If a model is trained on raw or insufficiently anonymized PHI from an HIE, it may leak that data during inference. This is a breach under HIPAA, full stop.
  • Purpose creep beyond the BAA. A Business Associate Agreement specifies permitted uses and disclosures. But AI models, once trained, are portable. If a model trained on HIE data is later used for insurance underwriting, employment screening, or marketing, the original BAA's scope has been violated. Enforcement here is lagging, but OCR has signaled interest. The December 2022 bulletin on tracking technologies showed OCR is willing to treat downstream data use as a covered disclosure.
  • Compounding breach impact. A breach of a single provider's EHR is bad. A breach of a model trained on an entire HIE's data is catastrophically worse. The average cost of a healthcare data breach hit $10.93 million in 2023 according to IBM's Cost of a Data Breach Report. Scale that by the population coverage of a major HIE, and the liability exposure is extraordinary.
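
That re-identification risk is measurable on your own data. Below is a minimal sketch in Python with pandas that estimates how identifying a given combination of quasi-identifiers is: it counts the fraction of records that are unique on those attributes. The file name and column names are hypothetical stand-ins for whatever fields an HIE extract actually carries.

```python
# Sketch: estimate how identifying a combination of quasi-identifiers is
# in a de-identified extract. The file and column names are hypothetical;
# substitute the fields your HIE extract actually carries.
import pandas as pd

def uniqueness_rate(df: pd.DataFrame, quasi_identifiers: list[str]) -> float:
    """Fraction of records that are unique on the given attributes.

    A high rate means most individuals can be singled out by anyone who
    knows those attributes, which is the precondition for re-identification.
    """
    group_sizes = df.groupby(quasi_identifiers).size()
    unique_records = (group_sizes == 1).sum()
    return unique_records / len(df)

records = pd.read_csv("hie_extract_deidentified.csv")  # hypothetical file
rate = uniqueness_rate(records, ["zip3", "birth_year", "sex", "dx_code"])
print(f"{rate:.1%} of records are unique on just 4 attributes")
```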

Where Current Controls Fall Short

HIPAA's minimum necessary standard (45 CFR 164.502(b)) requires covered entities to limit PHI disclosures to what is reasonably necessary for the purpose. But "reasonably necessary" was written for transactional data exchange, not for training datasets. How do you apply minimum necessary to a model that improves with more data by design?
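
There is no settled answer, but one operational reading is to make "purpose" a first-class object in the pipeline: every training job declares a purpose, and the pipeline rejects any field not allowlisted for it. Here is a hypothetical sketch in Python; the purposes, field names, and mapping are all illustrative, not a standard.

```python
# Hypothetical sketch: enforce a minimum-necessary allowlist per declared
# purpose before any record reaches a training pipeline, failing closed.
# The purposes and field names below are illustrative, not a standard.
ALLOWED_FIELDS = {
    "readmission_model": {"age_band", "dx_codes", "prior_admits"},
    "med_reconciliation": {"age_band", "rx_codes", "allergy_flags"},
}

def minimum_necessary(record: dict, purpose: str) -> dict:
    """Return only the fields allowlisted for this purpose; reject unknown purposes."""
    if purpose not in ALLOWED_FIELDS:
        raise PermissionError(f"no minimum-necessary policy for {purpose!r}")
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS[purpose]}

record = {"age_band": "60-69", "dx_codes": ["I50.9"], "ssn": "XXX-XX-XXXX", "prior_admits": 2}
print(minimum_necessary(record, "readmission_model"))
# -> {'age_band': '60-69', 'dx_codes': ['I50.9'], 'prior_admits': 2}
```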

The HIPAA Security Rule's administrative, physical, and technical safeguards (45 CFR 164.308, 164.310, and 164.312) address data at rest and in transit. They do not address data embedded in model weights. There is no provision in the Security Rule for auditing what a neural network has "learned." OCR has not issued guidance on this, and the NPRM for the HIPAA Security Rule update published in January 2025 (90 FR 898) focuses on patching, encryption, and access controls, not on AI-specific model governance.

State laws add complexity but not clarity. Texas HB 300 imposes stricter consent requirements for PHI disclosure. California's CMIA layers additional protections. But none of these frameworks address the specific problem of PHI being encoded into model parameters rather than stored in a traditional database.

What Different Controls Would Look Like

Organizations running AI on HIE data need controls that go beyond standard HIPAA compliance:

  • Data provenance tracking that follows PHI from the HIE into training pipelines, with auditable lineage showing what data was used, when, and under what authorization.
  • Differential privacy or federated learning architectures that allow model training without centralizing raw PHI. These are not theoretical; Google and Apple have both deployed federated learning in production, with Google's Gboard keyboard the canonical example.
  • Model-level access controls that restrict who can query a trained model and what types of outputs are permitted, treating the model itself as a repository of PHI.
  • BAA amendments that explicitly address model training, model portability, and downstream inference uses. Standard BAA templates do not cover this.
  • Regular membership inference testing to determine whether a model can be used to confirm that a specific individual's data was in the training set; a minimal sketch follows this list. If it can, you have a disclosure problem.
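
As a concrete starting point for that last item, here is a minimal loss-threshold membership inference check in the style of Yeom et al. (2018): if records the model trained on receive systematically lower loss than comparable records it never saw, the model is leaking membership. The loss values below are stand-in numbers; in practice you would compute per-record losses on a sample of training records and on held-out records drawn from the same HIE population.

```python
# Minimal loss-threshold membership inference check (in the style of
# Yeom et al., 2018). Assumes you can compute a per-record loss for the
# trained model on (a) a sample of its training records and (b) records
# it never saw, drawn from the same population.
import numpy as np

def membership_attack_auc(member_losses, nonmember_losses) -> float:
    """AUC of the attack 'predict member if the loss is low'.

    ~0.5 means training records are indistinguishable from held-out ones;
    values well above 0.5 mean the model leaks membership, which for PHI
    is a potential impermissible disclosure.
    """
    m = np.asarray(member_losses, dtype=float)
    n = np.asarray(nonmember_losses, dtype=float)
    # Rank-based AUC: probability that a random member has a lower loss
    # than a random non-member (ties counted as half).
    wins = (m[:, None] < n[None, :]).sum()
    ties = (m[:, None] == n[None, :]).sum()
    return (wins + 0.5 * ties) / (m.size * n.size)

# Stand-in numbers for illustration only.
train_losses = [0.12, 0.08, 0.30, 0.05]
holdout_losses = [0.45, 0.60, 0.25, 0.55]
print(f"attack AUC: {membership_attack_auc(train_losses, holdout_losses):.2f}")
```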

The gap between what HIPAA requires and what AI-on-HIE-data demands is significant. Organizations that treat standard HIPAA compliance as sufficient for AI workloads are carrying risk they may not have quantified.

How FirmAdapt Addresses This

FirmAdapt's architecture was built around the premise that AI operating on regulated data requires controls at the model layer, not just the data layer. For organizations working with HIE data, this means data provenance tracking is built into the pipeline, BAA scope is mapped against actual model behavior, and access controls extend to inference outputs. The platform treats a trained model as a regulated artifact, which is the correct posture when that model has ingested population-scale PHI.

FirmAdapt also maintains continuous compliance mapping against both HIPAA and the Cures Act information blocking exceptions, so organizations can document that their data access restrictions are legally justified under 45 CFR 171.202 and 171.203 rather than constituting prohibited information blocking. When the regulatory framework pulls in two directions at once, having that documentation automated and audit-ready is the difference between a defensible position and an expensive one.

Ready to uncover operational inefficiencies and learn how to fix them with AI?
Try FirmAdapt free with 10 analysis credits. No credit card required.