Model Cards, Data Sheets, and the Vendor Disclosures You Should Be Demanding
I've been reviewing AI vendor documentation for regulated clients for the better part of three years now, and I can tell you with confidence that roughly 80% of what vendors hand over as "transparency documentation" is marketing copy dressed up in technical formatting. It looks like disclosure. It reads like disclosure. It is not disclosure.
If your organization operates in a regulated sector and you're onboarding AI tools, you need to know the difference between a real model card and a glossy PDF that exists to close a deal. The stakes are not abstract. Under the EU AI Act (which entered into force on August 1, 2024, with tiered compliance deadlines running through August 2027), high-risk AI system providers must furnish detailed technical documentation covering training data, model architecture, performance metrics across subgroups, and known limitations. The FTC has been equally direct; its April 2024 staff report on AI specifically flagged inadequate vendor disclosures as a deceptive practice risk. And if you're in financial services, OCC Bulletin 2023-17 on third-party risk management makes clear that "the bank's due diligence should be commensurate with the risk" of the technology being adopted.
So what should you actually be looking at?
Model Cards: What They Are and What They Should Contain
The concept of model cards was formalized by Margaret Mitchell et al. in a 2019 paper out of Google. The idea was straightforward: every ML model should ship with a standardized disclosure document covering its intended use, training data composition, evaluation metrics, ethical considerations, and performance breakdowns across demographic groups.
A real model card tells you:
- What the model was trained on. Not "a diverse dataset." Specifics. Corpus size, data sources, temporal range, whether synthetic data was used, and what filtering or deduplication was applied.
- How it was evaluated. Benchmark names, metric definitions, and disaggregated performance results. If a vendor gives you a single F1 score with no breakdowns, that's a red flag.
- Known failure modes. Every model has them. If the documentation doesn't list any, the vendor either hasn't tested adequately or isn't being honest.
- Intended vs. out-of-scope uses. A model built for document summarization in English shouldn't be quietly repurposed for multilingual contract analysis. The card should draw these boundaries explicitly.
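To make the point about disaggregated results concrete, here is a minimal sketch of how a single F1 score can hide a subgroup gap. The group names and evaluation records are hypothetical, invented purely for illustration, and the labels are assumed to be binary.

```python
from collections import defaultdict


def f1_score(pairs):
    """F1 for binary labels, given (actual, predicted) pairs."""
    tp = sum(1 for actual, pred in pairs if actual == 1 and pred == 1)
    fp = sum(1 for actual, pred in pairs if actual == 0 and pred == 1)
    fn = sum(1 for actual, pred in pairs if actual == 1 and pred == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def disaggregated_f1(records):
    """Overall F1 plus F1 per subgroup; records are (group, actual, predicted)."""
    by_group = defaultdict(list)
    for group, actual, pred in records:
        by_group[group].append((actual, pred))
    overall = f1_score([(a, p) for _, a, p in records])
    per_group = {g: f1_score(pairs) for g, pairs in by_group.items()}
    return overall, per_group


# Hypothetical evaluation records: (subgroup, actual label, predicted label).
records = (
    [("group_a", 1, 1)] * 45 + [("group_a", 1, 0)] * 5 + [("group_a", 0, 0)] * 50
    + [("group_b", 1, 1)] * 4 + [("group_b", 1, 0)] * 6 + [("group_b", 0, 0)] * 10
)

overall, per_group = disaggregated_f1(records)
print(f"overall F1: {overall:.2f}")
for group, score in sorted(per_group.items()):
    print(f"{group} F1: {score:.2f}")
```

With this made-up data the overall F1 comes out near 0.90, while the smaller subgroup sits near 0.57. A vendor reporting only the first number would be telling you nothing about the second.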
What you'll typically receive instead is a two-page overview with vague references to "state-of-the-art performance" and "rigorous testing." No benchmarks named. No failure modes listed. No disaggregated metrics. This is the marketing-grade version, and it's useless for compliance purposes.
Datasheets: The Training Data Problem
Datasheets for datasets, proposed by Timnit Gebru et al. in 2018, serve a parallel function for training data. They document how data was collected, what consent mechanisms were in place, what populations are represented, and what preprocessing occurred.
This matters enormously for information governance. If your vendor trained a model on scraped web data that includes personally identifiable information, you may have a GDPR Article 6 lawful basis problem, a CCPA issue, or both. The ongoing litigation in Doe 1 v. GitHub (filed November 2022, still active in the Northern District of California) raises closely related questions about training data provenance for code-generation tools. And the A$60 million penalty the Federal Court of Australia ordered Google to pay in August 2022, in a case brought by the ACCC over misleading representations about location data collection, should remind everyone that "we used publicly available data" is not a legal shield.
When you ask for a datasheet and receive something that says "trained on proprietary and publicly available data," push back. You need to know:
- Whether the training data includes data from your industry or jurisdiction
- Whether any PII, PHI, or regulated data categories are present
- What data retention and deletion policies apply to the training corpus
- Whether your inputs to the system are used for model retraining (this one catches people constantly)
The Trade Secrets Tension
Vendors will push back on detailed disclosures by invoking trade secret protection, and honestly, they have a point. Model architectures, proprietary training techniques, and curated datasets can represent genuine competitive advantages protected under the Defend Trade Secrets Act (18 U.S.C. § 1836) and state equivalents.
But here's where it gets interesting: trade secret protection and regulatory compliance disclosure are not mutually exclusive. You can structure vendor agreements with tiered disclosure frameworks. The vendor provides full technical documentation under NDA to your security and compliance teams, while keeping that information out of broadly circulated procurement documents. NIST's AI Risk Management Framework (AI RMF 1.0, released January 2023) is consistent with this kind of structured transparency: it describes meaningful transparency as access to appropriate levels of information, tailored to the role and knowledge of whoever is receiving it.
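A tiered disclosure arrangement like this can be written down as a simple access map. The sketch below is only an illustration: the tier names, document names, and team names are all hypothetical, and a real agreement would define them contractually rather than in code.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical disclosure tiers; higher values are more restricted."""
    PROCUREMENT = 1  # broadly circulated procurement documents
    NDA = 2          # full technical documentation, shared under NDA


# Hypothetical document names mapped to the lowest tier allowed to see them.
DOCUMENT_TIERS = {
    "product_overview": Tier.PROCUREMENT,
    "model_card_summary": Tier.PROCUREMENT,
    "full_model_card": Tier.NDA,
    "training_data_datasheet": Tier.NDA,
}

# Hypothetical mapping from internal team to its clearance.
TEAM_CLEARANCE = {
    "procurement": Tier.PROCUREMENT,
    "security": Tier.NDA,
    "compliance": Tier.NDA,
}


def visible_documents(team):
    """Return the documents a team may see; unknown teams see nothing."""
    clearance = TEAM_CLEARANCE.get(team)
    if clearance is None:
        return []
    return sorted(doc for doc, tier in DOCUMENT_TIERS.items() if tier <= clearance)


print(visible_documents("procurement"))
print(visible_documents("compliance"))
```

The design choice worth noting is the default: a team that is not explicitly listed gets nothing, which mirrors how an NDA-scoped disclosure should behave.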
If a vendor refuses any meaningful disclosure and hides entirely behind trade secrets, that itself is information. It tells you they either haven't done the work to document their system properly or they've found things in that documentation they'd rather you not see. Neither possibility should make you comfortable.
A Practical Checklist for Procurement
When you're evaluating an AI vendor, here's what I'd recommend demanding before signing:
- A model card following the Mitchell et al. framework (or an equivalent structured disclosure). If the vendor hasn't heard of model cards, that tells you something about their maturity.
- A datasheet or data provenance document covering training data sources, consent mechanisms, and PII handling.
- Third-party audit results if available. SOC 2 Type II is table stakes. Look for AI-specific assessments aligned with ISO/IEC 42001 (published December 2023), the first international standard for AI management systems.
- A clear data flow diagram showing where your inputs go, whether they're stored, whether they're used for training, and what jurisdiction they're processed in.
- Incident disclosure commitments with defined timelines. If the model's behavior degrades or a data breach occurs, when do you hear about it?
- Contractual representations about training data legality. If the vendor won't warrant that their training data was lawfully obtained, factor that risk into your assessment.
None of this is unreasonable. Vendors who are doing the work will have these documents ready or be willing to produce them. Vendors who aren't will tell you it's "not standard practice" or "not something other customers ask for." Other customers might not be operating under the same regulatory obligations you are.
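If you want to track the checklist above across several vendors, a small structure is enough. This is a rough sketch, not a compliance tool: the item identifiers paraphrase the list above and are hypothetical, as are the helper names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChecklistItem:
    name: str
    received: bool = False
    note: Optional[str] = None


def new_checklist():
    """Fresh checklist; identifiers paraphrase the items listed above."""
    return [
        ChecklistItem("model_card"),
        ChecklistItem("datasheet_or_provenance"),
        ChecklistItem("third_party_audit"),
        ChecklistItem("data_flow_diagram"),
        ChecklistItem("incident_disclosure_commitments"),
        ChecklistItem("training_data_representations"),
    ]


def mark_received(checklist, name, note=None):
    """Mark one item as received; raise KeyError if the name is not listed."""
    for item in checklist:
        if item.name == name:
            item.received = True
            item.note = note
            return
    raise KeyError(name)


def outstanding(checklist):
    """Names of items the vendor has not yet provided."""
    return [item.name for item in checklist if not item.received]


checklist = new_checklist()
mark_received(checklist, "model_card", note="structured card with disaggregated metrics")
mark_received(checklist, "data_flow_diagram")
print("Outstanding before signing:", outstanding(checklist))
```

Anything still in the outstanding list when the contract arrives is a risk you are accepting, so it should be recorded as a decision rather than left as an oversight.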
How FirmAdapt Approaches This
FirmAdapt was built for environments where this level of documentation isn't optional. The platform's architecture is designed to provide customers with clear, auditable records of how AI models process their data, including data flow transparency, model provenance documentation, and explicit boundaries on data retention and reuse. Customers get the kind of structured disclosures described above as a standard part of onboarding, not as a special request.
FirmAdapt also maintains compliance mappings to frameworks including the NIST AI RMF, ISO/IEC 42001, and sector-specific requirements like HIPAA and GLBA, so that your legal and compliance teams can evaluate the platform against the regulatory obligations that actually apply to your organization. The goal is to make vendor due diligence straightforward rather than adversarial, which is how it should work when a vendor has nothing to hide.