AI supply chain & infrastructure

The AI supply chain is broader than software’s: data (pre-training, fine-tuning, and RAG corpora - see II.2), weights (base models, adapters, quantizations), code (frameworks, MLOps, connectors), and infrastructure (serving, vector stores, GPUs). Most of it is pulled from public hubs on implicit trust.

Model files are code

Loading a pickled checkpoint can execute arbitrary code, so downloading and loading a model amounts to running a stranger’s program. Safer formats (safetensors) and scanners help, but unsafe deserialization remains a top hub risk, and the weights themselves can be backdoored (Sleeper Agents, II.3), which no format check detects.
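
The load-time execution described above can be shown with a deliberately harmless payload. The sketch below is illustrative only: the `Payload` class, the `SUSPICIOUS_OPCODES` set, and `scan_pickle` are hypothetical names, and the callable is the benign built-in `len`, where an attacker would choose something like `os.system`. The scan walks the opcode stream with the standard-library `pickletools` module and never unpickles the data.

```python
import pickle
import pickletools

# Opcodes that import or call objects while a pickle is being loaded.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


class Payload:
    """Stand-in for a malicious object: loading it calls a function the author chose."""

    def __reduce__(self):
        # Benign here (len); an attacker would return a dangerous callable instead.
        return (len, ("pwned",))


def scan_pickle(data: bytes) -> list[str]:
    """Return the suspicious opcode names found, without unpickling anything."""
    return [
        opcode.name
        for opcode, _arg, _pos in pickletools.genops(data)
        if opcode.name in SUSPICIOUS_OPCODES
    ]


malicious = pickle.dumps(Payload())
plain = pickle.dumps({"weights": [0.1, 0.2, 0.3]})

loaded = pickle.loads(malicious)        # the callable runs during load
flags_malicious = scan_pickle(malicious)
flags_plain = scan_pickle(plain)
print(loaded, flags_malicious, flags_plain)
```

Legitimate checkpoints also use these opcodes to rebuild tensors, so an opcode list alone would produce false positives. A practical scanner would more likely compare the imported names against an allowlist.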

Registry, dependency & MLOps risk

# publish malware one keystroke away from a real package name (typosquatting),
# or under a private package's name with a higher version (dependency confusion):
pip install huggingface-hubs      # squat of huggingface_hub; setup.py in the sdist runs attacker code at install time
# model-hub variant: upload a backdoored fork <org>/llama-3-8b-instruct-v2 with poisoned weights

Typosquatting and slopsquatting (attackers register the plausible but nonexistent package names that LLMs hallucinate) hit AI projects hard. MLOps infrastructure - experiment trackers, orchestrators, notebook servers - is often internet-exposed and under-hardened.
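
One way to catch near-miss names before installation is to compare each requested package against a list of dependencies the project already trusts. The following is a minimal sketch under stated assumptions: `KNOWN_PACKAGES` is a hypothetical allowlist (a real project would derive it from its lock file), `check_package` and the 0.85 similarity cutoff are illustrative choices, and name normalization follows the PEP 503 rule of treating `-`, `_`, and `.` as equivalent.

```python
import difflib
import re

# Hypothetical allowlist; a real project would derive this from its lock file.
KNOWN_PACKAGES = {"huggingface-hub", "transformers", "torch", "numpy", "safetensors"}


def normalize(name: str) -> str:
    """Normalize a distribution name as PEP 503 does (case, '-', '_', '.')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def check_package(name: str, cutoff: float = 0.85) -> str:
    """Classify a requested name as known, suspiciously similar, or unknown."""
    norm = normalize(name)
    if norm in KNOWN_PACKAGES:
        return "known"
    close = difflib.get_close_matches(norm, sorted(KNOWN_PACKAGES), n=1, cutoff=cutoff)
    if close:
        return f"suspicious: looks like {close[0]}"
    return "unknown"


for requested in ("huggingface_hub", "huggingface-hubs", "zzz-internal-tool"):
    print(requested, "->", check_package(requested))
```

Hallucinated (slopsquatted) names usually resemble nothing on the allowlist, so they come back as `unknown` rather than `suspicious`. Under this sketch both results would need human review before installation.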

Infrastructure & deployment

Beneath the model: inference/serving endpoints, vector databases, container/Kubernetes orchestration, cloud configuration. Misconfigurations that look benign turn dangerous once AI workloads sit on them (exposed serving APIs, over-permissive IAM, unsecured vector stores). This is where most real-world breaches actually live, and it maps directly to the CSA advisory (IV.3).
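
As one concrete instance of the over-permissive IAM problem, a policy document can be checked mechanically for wildcard grants. This sketch assumes an AWS-style JSON policy (a `Statement` list whose entries carry `Effect`, `Action`, and `Resource`); `find_wildcard_statements` and the sample policy, including its bucket name, are hypothetical and stand in for whatever cloud provider and policy format is actually in use.

```python
def _as_list(value):
    """Policy fields may hold a single value or a list; always return a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy: dict) -> list[int]:
    """Return indexes of Allow statements that pair a wildcard action with Resource '*'."""
    flagged = []
    for index, statement in enumerate(_as_list(policy.get("Statement"))):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        broad_action = any(a == "*" or a.endswith(":*") for a in actions)
        if broad_action and "*" in resources:
            flagged.append(index)
    return flagged


# Hypothetical policy attached to a model-serving role.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-model-bucket/*"},
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["sagemaker:*"], "Resource": "*"},
    ],
}

flagged = find_wildcard_statements(sample_policy)
print(flagged)
```

The check is intentionally narrow. It ignores conditions, `NotAction`, and resource patterns short of a bare `*`, so a clean result is not evidence that a policy follows least privilege.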

Integrity for the model artifact: signing, ML-BOM & provenance

If a model file can carry code (§5), then “is this the model the author actually built, unmodified?” becomes a load-bearing question - the provenance problem the software world solved for packages, now applied to weights and datasets. The tooling matured quickly across 2025-2026:

Model signing - the OpenSSF Model Signing (OMS) specification reached v1.0 in April 2025 (Google’s open-source security team with NVIDIA and HiddenLayer), built on Sigstore: keyless, identity-based signatures logged in a public transparency log (Rekor), with a detached bundle binding a model to its author and a manifest of file hashes. It is integrated into NVIDIA NGC and Google Kaggle.
Build & provenance levels - SLSA (“salsa”) gives a graded checklist for tamper-resistant build pipelines and verifiable provenance; Sigstore/Cosign supplies the signing and verification primitives.
Bill of materials - an ML-BOM enumerates the model, its datasets, and dependencies; CycloneDX (OWASP; v1.7, Oct 2025) has carried ML-BOM since v1.5, and OWASP’s SCVS guides component verification.
Documentation as metadata - Model Cards (Mitchell et al., 2019) record intended use, training data, and evaluation; the Coalition for Secure AI (CoSAI) is driving this toward tamper-proof, machine-readable metadata signed alongside the weights.
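
The "manifest of file hashes" idea behind model signing can be sketched with the standard library alone. This is a simplified illustration and not the OMS or Sigstore API: `build_manifest` and `verify_manifest` are hypothetical helpers, the file names are placeholders, and the cryptographic signature over the manifest (the step that ties it to an author's identity) is omitted entirely.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(model_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its SHA-256 digest."""
    return {
        path.relative_to(model_dir).as_posix(): sha256_file(path)
        for path in sorted(model_dir.rglob("*"))
        if path.is_file()
    }


def verify_manifest(model_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return a list of discrepancies; an empty list means the files match."""
    current = build_manifest(model_dir)
    problems = []
    for name in sorted(set(manifest) | set(current)):
        if name not in current:
            problems.append(f"missing: {name}")
        elif name not in manifest:
            problems.append(f"unexpected: {name}")
        elif current[name] != manifest[name]:
            problems.append(f"modified: {name}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp)
    (model_dir / "model.safetensors").write_bytes(b"original weights")
    (model_dir / "config.json").write_text("{}")
    manifest = build_manifest(model_dir)
    clean_result = verify_manifest(model_dir, manifest)

    # Simulate tampering after the manifest was recorded.
    (model_dir / "model.safetensors").write_bytes(b"tampered weights")
    tampered_result = verify_manifest(model_dir, manifest)

print(clean_result, tampered_result)
```

A hash manifest only detects changes relative to whatever manifest the verifier holds. Without a signature and a trusted identity behind it, an attacker who can replace the weights can replace the manifest as well, which is the gap the signing tools above are meant to close.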