Capability Overview
Current AI, a non-profit launched at the Paris AI Action Summit in February 2025 with $400 million in committed backing, has published the Open Source AI Gap Map v0.1. The release catalogues 421 open-source AI products — 266 software tools and libraries, 85 models, 50 datasets, and 20 hardware projects — produced by 228 organisations across 14 categories and three stack layers. The underlying data is MIT-licensed and published as 1,184 YAML files in the currentai-org/os-ai-map GitHub repository, with an accompanying CSV of 16,185 tracked GitHub repositories explorable via Datasette Lite.
For defenders, the significance is not the Gap Map’s stated mission of cataloguing open-source AI for public benefit. It is that a well-funded, credible organisation has done the reconnaissance work for the entire ecosystem and published it freely.
Attack Surface Analysis
The Gap Map converts what was previously fragmented, manual research into a structured, machine-readable, version-controlled inventory. This changes the threat landscape in several concrete ways:
Adversary reconnaissance at scale. Nation-state and cybercriminal actors who previously needed to independently enumerate the open-source AI supply chain now have a curated, scored, categorised starting point. The YAML schema imposes structure that makes automated analysis trivial — identifying low-maintainer projects, unmaintained datasets, or hardware with small contributor bases requires only basic scripting against the repo.
Prioritised poisoning targets. The 85 catalogued models and 50 datasets, particularly those rated as foundational or widely depended upon, represent a ranked list of upstream components where a successful poisoning or backdoor insertion would cascade downstream. The map’s own scoring system inadvertently signals which targets yield the highest leverage.
Gap signalling. The project’s explicit purpose is to identify capability gaps in open-source AI. For offensive researchers, a publicly published gap list is an investment thesis: these are the areas where security tooling is absent or immature, making exploitation less likely to be detected.
Dependency graph exposure. The 16,000+ tracked repositories include organisational attribution. Cross-referencing this with public contributor graphs enables targeted social engineering or credential compromise against maintainers of high-impact components.
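To make the "basic scripting" point concrete, here is a minimal sketch of the kind of triage a defender could run on their own dependencies before an adversary does. It assumes the records have already been parsed into Python objects and that each carries a maintainer count. The Component class and the active_maintainers field are hypothetical illustrations, not confirmed parts of the Gap Map YAML schema.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    kind: str                # e.g. "model", "dataset", "software", "hardware"
    active_maintainers: int  # hypothetical field, not confirmed in the Gap Map schema


def low_maintainer_components(components, threshold=3):
    """Return components with fewer than `threshold` active maintainers, fewest first."""
    flagged = [c for c in components if c.active_maintainers < threshold]
    return sorted(flagged, key=lambda c: (c.active_maintainers, c.name))


# Invented sample records, for illustration only.
sample = [
    Component("example-tokenizer", "software", 1),
    Component("example-corpus", "dataset", 2),
    Component("example-llm", "model", 14),
]

for c in low_maintainer_components(sample):
    print(f"{c.name} ({c.kind}): {c.active_maintainers} active maintainer(s)")
```

The filter is a single list comprehension, which is the point: once the inventory is structured, ranking fragile components takes minutes, not weeks.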
Framework Mapping
- MITRE ATLAS AML.T0010 (ML Supply Chain Compromise): The map directly accelerates the reconnaissance phase of supply chain attacks by enumerating components and their relative importance.
- MITRE ATLAS AML.T0019/T0020 (Publish Poisoned Datasets / Poison Training Data): Catalogued datasets with clear upstream provenance are now easier to target; adversaries can identify which poisoned dataset would affect the most downstream models.
- MITRE ATLAS AML.T0044 (Full ML Model Access): Models indexed with open weights and repository links reduce the effort required to study, clone, or backdoor them.
- OWASP LLM05 (Supply Chain Vulnerabilities): The map is a supply chain transparency tool that simultaneously functions as a supply chain attack surface enumeration tool.
- OWASP LLM03 (Training Data Poisoning): High-visibility datasets in the index become priority targets.
Threat Scenarios
Scenario 1 — Targeted repo takeover. A threat actor queries the YAML dataset for models with fewer than three active contributors and high downstream citation counts. They identify two candidate repositories, initiate a maintainer impersonation campaign, and inject a backdoored model weight update.
Scenario 2 — Dataset poisoning via gap exploitation. The Gap Map flags a foundational multilingual dataset as having no security-focused maintainer. An adversary submits subtly poisoned samples through the dataset’s open contribution pipeline, knowing audit tooling in this gap area is absent.
Scenario 3 — Automated dependency enumeration for a targeted enterprise. An attacker cross-references a target organisation’s public GitHub repositories against the Gap Map’s 16,000 tracked repos to build a precise map of which open-source AI components the organisation likely uses, then crafts a spearphishing campaign against the relevant maintainers.
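Defenders can run the cross-reference in Scenario 3 against their own inventory first. The sketch below is one way to do it using only the Python standard library. It assumes the tracked-repository CSV has a column holding either an owner/name string or a GitHub URL. The column name "repo" and all sample rows are hypothetical, so check the real header before use.

```python
import csv
import io


def normalise(repo):
    """Reduce a GitHub URL or 'owner/name' string to lowercase 'owner/name'."""
    repo = repo.strip().lower()
    for prefix in ("https://github.com/", "http://github.com/", "github.com/"):
        if repo.startswith(prefix):
            repo = repo[len(prefix):]
            break
    repo = repo.strip("/")
    if repo.endswith(".git"):
        repo = repo[:-4]
    return repo


def overlapping_repos(tracked_csv, inventory, column="repo"):
    """Return inventory entries that also appear in the tracked-repository CSV."""
    reader = csv.DictReader(tracked_csv)
    tracked = {normalise(row[column]) for row in reader if row.get(column)}
    return sorted({normalise(r) for r in inventory} & tracked)


# Invented stand-in for the published CSV; the real column names may differ.
tracked_csv = io.StringIO(
    "repo,organisation\n"
    "example-org/example-model,Example Org\n"
    "other-org/other-dataset,Other Org\n"
)
inventory = [
    "https://github.com/Example-Org/example-model.git",
    "internal-org/private-tool",
]

matches = overlapping_repos(tracked_csv, inventory)
print(matches)  # ['example-org/example-model']
```

Normalising both sides before intersecting avoids false negatives from casing, trailing slashes, or .git suffixes, which are common in internal inventories assembled from lockfiles and clone URLs.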
Defender Checklist
- Download the YAML dataset and cross-reference against your internal AI component inventory to identify overlapping dependencies
- Flag any dependencies scored as high-importance by the Gap Map for enhanced integrity monitoring (hash pinning, signed releases)
- Review contributor health of your top 10 open-source AI dependencies; apply heightened scrutiny to any with fewer than three active maintainers
- Subscribe to the currentai-org/os-ai-map repository to receive alerts when components you depend on are re-scored or re-categorised
- Use the dataset and model lists to scope your next AI supply chain risk assessment
- Share the gap list with your threat intelligence team as an indicator of where adversarial research investment is likely to flow
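The hash-pinning item in the checklist can be sketched as follows, assuming pinned SHA-256 digests are recorded somewhere trusted (a lockfile or internal registry) at the time an artifact is first vetted. The file name and contents here are invented for illustration. Signed-release verification is a separate control and is not shown.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large model weights need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path, pinned_hex):
    """Return True only if the file's SHA-256 digest equals the pinned value."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.lower())


# Demonstration with a throwaway file standing in for a downloaded artifact.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "weights.bin"
    artifact.write_bytes(b"trusted weights")
    pin = sha256_of(artifact)           # recorded when the artifact was vetted

    ok_before = matches_pin(artifact, pin)
    artifact.write_bytes(b"tampered weights")
    ok_after = matches_pin(artifact, pin)

print(ok_before, ok_after)  # True False
```

A check like this belongs in the download or deployment step, so that a silently replaced weight file or dataset shard fails closed before it is loaded.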
References
- Simon Willison’s Weblog: https://simonwillison.net/2026/Jul/3/open-source-ai-gap-map
- Current AI Gap Map GitHub: https://github.com/currentai-org/os-ai-map
- Datasette Lite exploration: linked via source article