American intelligence is building an AI-driven central hub for purchasing, linking and analysing commercially available personal data
From The Intercept:
The Office of the Director of National Intelligence is working on a system to centralize and "streamline" the use of commercially available information, or CAI, like location data derived from mobile ads, by American spy agencies, according to contract documents reviewed by The Intercept. The data portal will include information deemed by the ODNI as highly sensitive, that which can be "misused to cause substantial harm, embarrassment, and inconvenience to U.S. persons." The documents state spy agencies will use the web portal not just to search through reams of private data, but also run them through artificial intelligence tools for further analysis.
Rather than each agency purchasing CAI individually, as has been the case until now, the "Intelligence Community Data Consortium" will provide a single convenient web-based storefront for searching and accessing this data, along with a "data marketplace" for purchasing "the best data at the best price," faster than ever before, according to the documents. It will be designed for the 18 different federal agencies and offices that make up the U.S. intelligence community, including the National Security Agency, CIA, FBI Intelligence Branch, and Homeland Security's Office of Intelligence and Analysis, though one document suggests the portal will also be used by agencies not directly related to intelligence or defense.
https://theintercept.com/2025/05/22/intel-agencies-buying-data-portal-privacy
What's striking about this is how the language of modernisation ("ODNI is working to streamline a number of inefficient processes") is used to describe a rather terrifying exercise, particularly if the still mostly unspecified use of AI lives up to its dystopian potential. It entrenches the process by which data collection is removed from judicial oversight, becoming a commercial transaction rather than something which requires a warrant.
The suggestion that this will use LLMs to analyse the data is particularly worrying. Even the sub-5% hallucination rates for textual recall in frontier models (vastly higher in the latest reasoning models) could have enormous implications when you consider the possible use cases for surveillance.
I recall reading recently that data brokers are selling personal data exposed in data breaches, which will presumably be included in the collection here. I'm struggling to find a link, though, and I'm hoping I might have dreamed this.
#confidentialData #dataBrokers #hallucination #intelligenceAgencies #LLMs #personalData #power #surveillance