Why Air-Gapped AI Deployments Are Coming Back
Air-gapped networks were supposed to be a relic. The cloud migration wave of the 2010s made them feel like an anachronism, something you maintained for legacy SCADA systems or classified defense programs but certainly not where you'd run modern workloads. And yet here we are in 2024 and 2025, watching some of the most sophisticated organizations in defense, finance, and healthcare actively choosing to deploy large language models and other AI systems on networks with no internet connectivity whatsoever.
The reasons are partly regulatory, partly economic, and partly architectural. They're worth understanding because this trend tells you something important about where information governance is heading.
The Regulatory Pressure That Changed the Math
Start with defense. DFARS 252.204-7012 has required controlled unclassified information (CUI) to be handled on systems meeting NIST SP 800-171 since December 2017. CMMC 2.0, for which the DoD published its final rule in October 2024, layers a certification and assessment regime on top of that. For Level 2 and Level 3 contractors, the practical reality is that any AI system touching CUI needs to live in an environment where data exfiltration risk is near zero. Air-gapping is the most straightforward way to demonstrate that.
ITAR adds another dimension. The International Traffic in Arms Regulations, administered by DDTC under 22 CFR Parts 120-130, impose strict controls on technical data related to defense articles. Running an AI model that ingests ITAR-controlled technical data through a cloud API, even one hosted in the U.S., creates export control exposure that most compliance teams simply do not want to manage. The penalties are severe: 3D Systems agreed to a $20 million settlement with the State Department in 2023 over alleged ITAR violations that included unauthorized exports of technical data.
In healthcare, the calculus is different but converges on a similar answer. HIPAA's Security Rule (45 CFR Part 164) requires covered entities and business associates to implement technical safeguards that control access to electronic protected health information. When you're using AI to analyze patient records, radiology images, or genomic data, the question of where that data travels becomes acute. The HHS Office for Civil Rights issued guidance in December 2023 emphasizing that AI tools used in healthcare settings must comply with existing HIPAA obligations, including the minimum necessary standard. Several large health systems have concluded that the simplest compliance posture is to keep the AI, and the data, on a network that physically cannot reach the internet.
Financial services rounds out the picture. The SEC's Regulation S-P amendments, finalized in May 2024, tightened incident response and data protection requirements for broker-dealers and investment advisers. The Federal Reserve and the OCC have had model risk management guidance in place since 2011 (SR 11-7 and OCC Bulletin 2011-12), and examiners are now asking pointed questions about where AI models are trained and what data flows outside the institution's perimeter. For firms handling material nonpublic information, the combination of Regulation FD, insider trading liability, and the SEC's increasing focus on AI-related conflicts of interest creates a strong incentive to keep AI workloads contained.
The Economics Actually Work Now
Two years ago, running a capable LLM on-premises required hardware budgets that were hard to justify. That has changed meaningfully. A few factors:
- Smaller, more capable models. Models like Llama 3.1 (8B and 70B parameter variants), Mistral 7B, and Phi-3 can run on hardware that costs tens of thousands of dollars rather than millions. A single NVIDIA A100 node can serve a 70B parameter model with reasonable latency for internal use cases. You don't need a hyperscaler's infrastructure anymore.
- Quantization and optimization. Quantization methods such as GPTQ, and the quantized formats packaged as GGUF files for llama.cpp, let you run models at 4-bit or 8-bit precision with modest quality loss for most workloads. Relative to 16-bit weights, 8-bit roughly halves the memory needed for the weights and 4-bit roughly quarters it, which means your existing GPU inventory might already be sufficient.
- Inference-focused hardware. NVIDIA's L40S, AMD's Instinct MI300X, and even Intel's Gaudi 2 give you options that are priced sensibly for inference, so you are no longer forced to buy flagship training GPUs just to serve a model. A two-node L40S setup runs around $60,000 to $80,000 and can handle enterprise inference loads comfortably.
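To make the quantization point concrete, here is a back-of-the-envelope sketch of how much memory the weights alone need at different precisions. It is an approximation under stated assumptions: it counts only the weights and ignores the KV cache, activations, and the small overhead that quantization schemes add for scales and zero points. The function name is ours, not from any library.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 10**9 bytes).

    Assumption: every parameter is stored at the same precision and there is
    no per-group quantization overhead. Real deployments need extra headroom.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Compare the two Llama 3.1 sizes mentioned above at 16-, 8-, and 4-bit.
for label, params in [("8B", 8), ("70B", 70)]:
    for bits in (16, 8, 4):
        print(f"{label} at {bits}-bit: ~{weight_memory_gb(params, bits):.0f} GB")
```

By this estimate a 70B model drops from about 140 GB of weights at 16-bit to about 35 GB at 4-bit, which is the difference between needing several GPUs and fitting on one or two.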
Compare that to the fully loaded cost of a cloud AI deployment in a regulated environment. Once you add private endpoints, dedicated tenancy, encryption at rest and in transit, logging, audit trails, and the compliance overhead of managing a BAA or an ITAR-compliant arrangement with a cloud provider, the annual cost for a mid-size deployment can easily run $200,000 to $500,000. The on-premises air-gapped option starts looking like the cheaper path over a three-year horizon, especially when you factor in the reduced compliance surface area.
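One way to pressure-test that comparison is to ask how much you could spend per year running the on-premises system before it stops being cheaper than cloud. The sketch below uses only the figures quoted above (hardware at $60,000 to $80,000, cloud at $200,000 to $500,000 per year). It deliberately does not assume an on-premises operating cost, since staffing, power, and support vary too much to guess; it solves for the break-even instead. Function names are illustrative.

```python
def total_cost(upfront: float, annual: float, years: int = 3) -> float:
    """Simple undiscounted total cost over a planning horizon."""
    return upfront + annual * years


def break_even_annual_opex(cloud_annual: float, hardware_upfront: float,
                           years: int = 3) -> float:
    """Annual on-prem operating cost at which both options cost the same."""
    return (total_cost(0, cloud_annual, years) - hardware_upfront) / years


# Least favorable case for on-prem: cheapest cloud, most expensive hardware.
print(f"{break_even_annual_opex(200_000, 80_000):,.0f}")
# Most favorable case for on-prem: priciest cloud, cheapest hardware.
print(f"{break_even_annual_opex(500_000, 60_000):,.0f}")
```

Even in the least favorable case, the on-premises option comes out ahead as long as its annual operating cost stays below roughly $173,000. This ignores discounting, hardware refresh, and migration costs, so treat it as a first-pass filter rather than a business case.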
Architectural Patterns That Are Emerging
The interesting part is how organizations are actually structuring these deployments. A few patterns keep showing up.
The "Clean Room" Pattern
Data is ingested into the air-gapped environment through a one-way data diode or a manual transfer process with strict chain-of-custody logging. The AI system operates entirely within the enclave. Results are reviewed by a human before anything leaves the network. This is common in defense and intelligence-adjacent work, and it maps well to NIST SP 800-171's access control family (3.1.x) and audit requirements (3.3.x).
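The chain-of-custody logging in this pattern is often implemented as an append-only, hash-chained log, so that editing or deleting any earlier entry is detectable. The following is a minimal sketch of that idea using only the Python standard library. The field names and the in-memory list are assumptions for illustration; a real deployment would persist entries to write-once storage and would likely sign them as well.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _hash_entry(body: dict) -> str:
    """Hash a log entry in a canonical (sorted-key) JSON form."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log: list, actor: str, action: str, payload: bytes) -> dict:
    """Record a transfer event, linking it to the previous entry's hash."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
        "prev_hash": log[-1]["entry_hash"] if log else GENESIS_HASH,
    }
    entry["entry_hash"] = _hash_entry(entry)  # computed before the key exists
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Return True only if every entry is intact and correctly linked."""
    prev = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev or entry["entry_hash"] != _hash_entry(body):
            return False
        prev = entry["entry_hash"]
    return True


custody_log = []
append_entry(custody_log, "analyst-1", "ingest_via_diode", b"dataset contents")
append_entry(custody_log, "reviewer-2", "approve_release", b"model output")
print(verify_chain(custody_log))
```

Because each entry embeds the hash of the one before it, an auditor can replay the chain and confirm that the record of who moved what, and when, has not been altered after the fact.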
The "Federated Island" Pattern
Multiple air-gapped environments each run their own instance of the same model. Model updates are distributed via encrypted physical media or secure, intermittent connections. Each island maintains its own data and fine-tuning. This shows up in healthcare systems with multiple facilities and in financial institutions with geographically distributed trading operations.
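When model updates arrive on physical media, each island needs a way to confirm that the bundle it received is the bundle that was shipped. A common approach is a manifest of SHA-256 digests that is checked before anything is loaded. The sketch below shows the integrity check only, and it assumes the manifest itself reaches the island through a trusted channel (for example, a detached signature verified separately). All function names are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weight files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(bundle_dir: Path) -> dict:
    """Map each file's relative path to its SHA-256 digest."""
    return {
        path.relative_to(bundle_dir).as_posix(): sha256_file(path)
        for path in sorted(bundle_dir.rglob("*"))
        if path.is_file()
    }


def verify_bundle(bundle_dir: Path, manifest: dict) -> list:
    """Return a list of problems; an empty list means the bundle matches."""
    actual = build_manifest(bundle_dir)
    problems = []
    for name, expected in sorted(manifest.items()):
        if name not in actual:
            problems.append(f"missing: {name}")
        elif not hmac.compare_digest(actual[name], expected):
            problems.append(f"hash mismatch: {name}")
    for name in sorted(actual.keys() - manifest.keys()):
        problems.append(f"unexpected: {name}")
    return problems
```

The sending side would call build_manifest before the media leaves, and each island would call verify_bundle on arrival and refuse to load anything if the returned list is not empty. Flagging unexpected files matters as much as flagging changed ones, since an extra file on the media is also a sign that something went wrong in transit.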
The "Inference-Only Enclave" Pattern
Training and fine-tuning happen on a connected (but heavily controlled) network using synthetic or de-identified data. The resulting model weights are then transferred to the air-gapped environment where all production inference occurs on real, sensitive data. This lets organizations benefit from cloud-scale training resources while keeping their most sensitive data completely isolated.
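The weak point in this pattern is the quality of the de-identification step on the connected side. As a rough illustration of what a first-pass scrub looks like, here is a small regex-based redactor. To be clear about the assumptions: the three patterns (US-style SSNs, email addresses, and phone numbers) are examples we chose, not a complete identifier list, and pattern matching of this kind does not by itself meet a formal de-identification standard such as HIPAA's. It is a sketch of the mechanism, not a compliance control.

```python
import re

# Illustrative patterns only; a real pipeline needs far broader coverage
# (names, addresses, dates, record numbers) and human or statistical review.
REDACTION_PATTERNS = [
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[EMAIL]", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("[PHONE]", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
]


def redact(text: str) -> str:
    """Replace each matched identifier with a fixed placeholder token."""
    for placeholder, pattern in REDACTION_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


sample = "Call 555-123-4567 or mail jane@example.org, SSN 123-45-6789"
print(redact(sample))
```

Anything a scrub like this misses ends up in the training set on the connected network, which is why many teams prefer fully synthetic data for that side and reserve real records for the enclave.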
The Trade Secret Angle
There's a dimension here that doesn't get enough attention. Under the Defend Trade Secrets Act of 2016 (18 U.S.C. 1836), a trade secret owner must demonstrate "reasonable measures" to keep the information secret; the requirement sits in the statute's definition of a trade secret at 18 U.S.C. 1839(3). Courts have interpreted this requirement with increasing specificity. In Compulife Software Inc. v. Newman (11th Cir. 2020), the dispute turned on a database that was partly exposed through a public web interface, and the court had to work through the technical detail of how the data was accessed and scraped in bulk.
If your organization is using AI to process proprietary formulations, trading strategies, source code, or client data that qualifies as trade secret material, the question of whether you took "reasonable measures" will eventually come down to your technical architecture. Running that AI on a third-party cloud platform, even with contractual protections, is a harder argument than running it on an air-gapped network you control. Defense counsel will tell you the same thing. The air gap is a concrete, demonstrable measure that courts understand.
The Practical Challenges
None of this is free of tradeoffs. Air-gapped AI deployments create real operational friction. Model updates are slower. You lose access to retrieval-augmented generation against live internet sources. Your IT team needs skills in GPU cluster management that they may not have today. And there's a latent risk of model drift if you're not disciplined about periodic revalidation.
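Periodic revalidation does not need to be elaborate to be useful. One lightweight approach is a fixed "golden set" of prompts with known acceptable answers that you rerun on a schedule and after every model update. The sketch below assumes a hypothetical generate callable that wraps whatever local inference endpoint you use, and it uses a simple must-contain check; real validation suites typically add scoring rubrics or human review on top.

```python
def revalidate(generate, golden_set: list, min_pass_rate: float = 0.95) -> dict:
    """Rerun a fixed prompt set and report whether quality has regressed.

    generate: callable taking a prompt string and returning the model's text
              (an assumption standing in for your local inference client).
    golden_set: list of dicts with "id", "prompt", and "must_contain" keys.
    """
    failures = []
    for case in golden_set:
        output = generate(case["prompt"]).lower()
        if not all(term.lower() in output for term in case["must_contain"]):
            failures.append(case["id"])
    pass_rate = 1 - len(failures) / len(golden_set)
    return {
        "pass_rate": pass_rate,
        "failures": failures,
        "ok": pass_rate >= min_pass_rate,
    }


# Stub model for demonstration; replace with a call to the local model.
def fake_model(prompt: str) -> str:
    return "CUI must be handled under NIST SP 800-171."


golden = [
    {"id": "cui-1", "prompt": "Which standard covers CUI?",
     "must_contain": ["800-171"]},
    {"id": "phi-1", "prompt": "Which rule covers ePHI safeguards?",
     "must_contain": ["security rule"]},
]
print(revalidate(fake_model, golden, min_pass_rate=0.9))
```

Storing each run's result alongside the model version gives you a dated record of validation, which is the kind of evidence examiners and auditors tend to ask for.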
The organizations doing this well treat the air-gapped AI environment as a first-class production system with its own update cadence, monitoring, and governance framework, not as a side project.
How FirmAdapt Addresses This
FirmAdapt's architecture was designed from the start to operate in environments with restricted or no internet connectivity. The platform supports fully air-gapped deployment with local model inference, on-premises document processing, and audit logging that meets the chain-of-custody requirements of NIST SP 800-171, HIPAA, and ITAR-controlled environments. Model updates are distributed through a secure transfer mechanism that maintains integrity verification without requiring a persistent network connection.
For organizations evaluating air-gapped AI, FirmAdapt provides the compliance scaffolding (access controls, audit trails, data classification, and policy enforcement) that turns a raw model deployment into a governed system. The platform handles the information governance layer so your team can focus on the use cases rather than rebuilding compliance infrastructure from scratch.