Research Document Archive
Computational analysis of 234,630 declassified U.S. government documents
ML Pipeline Results
Documents: 234.6K
Pages OCR'd: 3.2M
Named Entities: 31.0M
Entity Links: 2.9M
Redactions Found: 59,830
Topic Clusters: 288
Classification Stamps Detected
UNCLASSIFIED: 16,501
SECRET: 13,736
CLASSIFIED: 10,730
EXEMPT: 6,739
CONFIDENTIAL: 5,554
RESTRICTED: 4,722