Research & Documentation

The AI² Framework &
Knowledge Graph Methodology

Scientific publications, formal doctrine, platform architecture, and system methodology for the world's first credibility-scored, cross-linked UAP intelligence platform.

Last updated May 2026
Platform records 361,491
01

Anomaly Intelligence Doctrine (AI²)

FLAGSHIP WHITEPAPER · FORMAL FRAMEWORK DEFINITION

Doctrine

From Anomaly to Intelligence: A Data-Driven, Graph-Centric, and Quantifiable Framework for UAP Analysis

Introduces Anomaly Intelligence (AI²): a rigorous analytical discipline that treats "the unknown" not as a mystery to be debated, but as a data problem to be solved. By leveraging multi-modal data fusion, Bayesian reliability scoring, and graph-based reasoning, AI² transforms disparate observations into structured, actionable intelligence. Establishes the formal ACS-1 classification standard for interoperability between defense, intelligence, and scientific agencies, and defines the autonomous agent loop governing the platform's lifecycle.

Anomaly Intelligence ACS-1 Standard Bayesian Scoring Knowledge Graph Agentic AI DoDAF OV-1/OV-5 Data Fusion
Formal definition: Anomaly Intelligence (AI²) is the systematic process of ingesting, structuring, scoring, and correlating anomalous observations into a unified knowledge graph to produce probabilistic intelligence representations. AI² is not a research tool — it is a category-defining intelligence platform.
Output

Probabilistic Intelligence Data

Replaces narrative UAP reports with mathematically grounded, graph-linked intelligence products queryable by researchers, analysts, and AI systems.

Method

Mathematical & Graph-Centric

Reliability function R, Bayesian belief revision, and graph-based confidence propagation replace anecdotal and descriptive approaches.

Agentic Loop

Autonomous Intelligence Cycle

Scoring Agent (updates R), Correlation Agent (signature search), and Drift Agent (inconsistency detection) operate continuously.

Assurance

Mission-Grade Integrity

Provenance tracking, adversarial defense against data poisoning, NIST AI RMF alignment, and human-in-the-loop validation at C3/C4 levels.

02

Scientific Methodology Paper

PEER-REVIEW SUBMISSION · CREDIBILITY SCORING & CROSS-SOURCE LINKING

Methodology

A Scientific Framework for Credibility Scoring and Cross-Source Linking of Unidentified Anomalous Phenomena Records

A scientific framework for credibility scoring and cross-source linking of UAP records drawn from multiple independent government and civilian databases. Applied to 87,595 records from five sources spanning 1906–2023 across 66 countries. Each record is assigned a credibility score from 1.0 to 9.0 based on five weighted dimensions. A Haversine geospatial algorithm identifies 17,126 cross-source corroborations. The results are publicly verifiable through a live JSON API and an AI-queryable MCP server.

UAP Data Infrastructure Credibility Scoring Geospatial Linking Haversine Algorithm AARO Project Blue Book Cross-Source Corroboration MCP Server
Epistemological position: The credibility score measures the quality of documentation — not the significance of the phenomenon. A score of 9.0 means thorough government documentation with confirmed coordinates, physical evidence, and independent corroboration. It does not mean the phenomenon was extraordinary. This separation is philosophically necessary and scientifically reproducible.
03

ACS-1 Anomaly Classification Standard

FORMAL TAXONOMY · MULTI-DIMENSIONAL CLASSIFICATION MODEL

The Anomaly Classification Standard (ACS-1) enables interoperability between defense, intelligence, and scientific agencies. Every event in the knowledge graph is classified across six orthogonal dimensions, enabling multi-axis filtering and pattern detection across the full 361,491-record corpus.

Dimension | Classes | Description
Source | Human / Sensor / Derived | The origin and nature of the observation.
Observability | Direct / Indirect / Inferred | The clarity and modality of the perception.
Persistence | Transient / Recurring / Persistent | Temporal behavior of the anomaly.
Environment | Atmospheric / Space / Maritime / Terrestrial | The physical domain of the event.
Signal Type | Visual / IR / Radar / Multi-spectral | The technical modality of the data.
Explanation State | Explained / Partial / Unknown | The current resolution level of the inquiry.

CONFIDENCE CLASSIFICATION — C0 THROUGH C4

C0 | Noise / Sensor Artifact | Below threshold. Excluded from scoring pipeline.
C1 | Weak Anomaly | Single source, low signal-to-noise ratio.
C2 | Corroborated Anomaly | Multi-sensor or multi-witness confirmation.
C3 | High-Confidence Unexplained | Verified high-quality data. Human-in-the-loop validation required.
C4 | Persistent Structured Anomaly | Recurring patterns or signatures across multiple events and sources.
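As an illustration of how the six ACS-1 dimensions and the C0 through C4 levels could be carried on a record, here is a minimal Python sketch. The class and method names (ACS1Label, enters_scoring, needs_human_review) are hypothetical and not taken from the platform's code; only the dimension classes and the C0/C3 rules come from the tables above.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum

# The six ACS-1 dimensions, using the class values listed in the table.
Source = Enum("Source", "HUMAN SENSOR DERIVED")
Observability = Enum("Observability", "DIRECT INDIRECT INFERRED")
Persistence = Enum("Persistence", "TRANSIENT RECURRING PERSISTENT")
Environment = Enum("Environment", "ATMOSPHERIC SPACE MARITIME TERRESTRIAL")
SignalType = Enum("SignalType", "VISUAL IR RADAR MULTI_SPECTRAL")
ExplanationState = Enum("ExplanationState", "EXPLAINED PARTIAL UNKNOWN")


class Confidence(IntEnum):
    """C0 through C4; integer values allow ordered comparison."""
    C0 = 0  # Noise / sensor artifact
    C1 = 1  # Weak anomaly
    C2 = 2  # Corroborated anomaly
    C3 = 3  # High-confidence unexplained
    C4 = 4  # Persistent structured anomaly


@dataclass(frozen=True)
class ACS1Label:
    """Hypothetical container for one event's ACS-1 classification."""
    source: Source
    observability: Observability
    persistence: Persistence
    environment: Environment
    signal_type: SignalType
    explanation_state: ExplanationState
    confidence: Confidence

    def enters_scoring(self) -> bool:
        # C0 is described as excluded from the scoring pipeline.
        return self.confidence > Confidence.C0

    def needs_human_review(self) -> bool:
        # Human-in-the-loop validation is described for C3 and C4.
        return self.confidence >= Confidence.C3


label = ACS1Label(
    Source.SENSOR, Observability.DIRECT, Persistence.TRANSIENT,
    Environment.ATMOSPHERIC, SignalType.RADAR,
    ExplanationState.UNKNOWN, Confidence.C3,
)
print(label.needs_human_review())  # True
```

Keeping each dimension as its own enumeration mirrors the "orthogonal dimensions" idea: any axis can be filtered independently of the others.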
04

Mathematical Scoring Model

RELIABILITY FUNCTION · BAYESIAN REVISION · GRAPH PROPAGATION

The core differentiator of AI² is the transition from narrative description to mathematical rigor. Three formulas govern the scoring pipeline: the foundational reliability function, Bayesian belief revision as new evidence arrives, and graph-based confidence propagation through the knowledge graph.

4.1 · RELIABILITY FUNCTION
R = w₁S + w₂Q + w₃C + w₄T + w₅D
S = Source Credibility · Q = Signal Quality · C = Corroboration · T = Temporal Consistency · D = Data Completeness · ∑wᵢ = 1
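The weighted sum in 4.1 can be sketched in a few lines of Python. The weight values below are placeholders chosen only so that they sum to 1 (the one constraint stated above), and the assumption that each factor is normalised to the range 0 to 1 is mine, not the document's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReliabilityInputs:
    source_credibility: float    # S
    signal_quality: float        # Q
    corroboration: float         # C
    temporal_consistency: float  # T
    data_completeness: float     # D


# Hypothetical weights w1..w5; the text only requires that they sum to 1.
WEIGHTS = (0.30, 0.20, 0.25, 0.10, 0.15)


def reliability(x: ReliabilityInputs, weights=WEIGHTS) -> float:
    """R = w1*S + w2*Q + w3*C + w4*T + w5*D, with factors assumed in [0, 1]."""
    if len(weights) != 5 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("expected five weights that sum to 1")
    factors = (
        x.source_credibility, x.signal_quality, x.corroboration,
        x.temporal_consistency, x.data_completeness,
    )
    if any(not 0.0 <= f <= 1.0 for f in factors):
        raise ValueError("each factor is assumed to lie in [0, 1]")
    return sum(w * f for w, f in zip(weights, factors))


r = reliability(ReliabilityInputs(0.8, 0.5, 1.0, 0.6, 0.4))
print(f"{r:.2f}")  # 0.71
```

Because the weights sum to 1 and each factor is bounded, R stays in the same 0 to 1 range as its inputs under these assumptions.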
4.2 · BAYESIAN BELIEF REVISION
P(H|E) = P(E|H) · P(H) / P(E)
H = Anomaly hypothesis · E = New evidence entering system
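A worked sketch of the update in 4.2, in Python. It expands P(E) by the law of total probability, which requires a value for P(E|not H); that likelihood and the numbers used here are illustrative assumptions, since the document does not say how the platform estimates them.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(H|E) = P(E|H) * P(H) / P(E).

    P(E) is expanded as P(E|H) * P(H) + P(E|not H) * (1 - P(H)).
    """
    for p in (prior, p_e_given_h, p_e_given_not_h):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    evidence = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return p_e_given_h * prior / evidence


def revise(prior: float, observations) -> float:
    """Apply the update once per (P(E|H), P(E|not H)) pair, in order."""
    belief = prior
    for p_e_h, p_e_not_h in observations:
        belief = posterior(belief, p_e_h, p_e_not_h)
    return belief


# Illustrative numbers only: a 0.2 prior and evidence three times likelier under H.
print(f"{posterior(0.2, 0.9, 0.3):.3f}")               # 0.429
print(f"{revise(0.2, [(0.9, 0.3), (0.9, 0.3)]):.3f}")  # 0.692
```

Each posterior becomes the prior for the next piece of evidence, which is the sense in which belief is revised "as new evidence arrives".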
4.3 · GRAPH-BASED CONFIDENCE PROPAGATION
R_node = α·R_self + β·∑(R_connected · w_edge)
Node reliability influenced by its connected neighbors
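The propagation rule in 4.3 can be sketched as a single synchronous pass over the graph. The values of α and β, the node identifiers, and the convention that each node's incoming edge weights sum to at most 1 (which keeps the result in the 0 to 1 range when α + β = 1) are assumptions made for this Python example.

```python
from __future__ import annotations


def propagate(
    r_self: dict[str, float],
    edges: dict[str, list[tuple[str, float]]],
    alpha: float = 0.7,
    beta: float = 0.3,
) -> dict[str, float]:
    """One pass of R_node = alpha * R_self + beta * sum(R_connected * w_edge).

    `edges[node]` lists (neighbour, w_edge) pairs. All neighbour values are
    read from the input scores, so the update order does not matter.
    """
    updated = {}
    for node, own in r_self.items():
        neighbour_term = sum(r_self[nbr] * w for nbr, w in edges.get(node, []))
        updated[node] = alpha * own + beta * neighbour_term
    return updated


# Hypothetical two-node graph: a civilian record linked to a government record.
scores = {"civilian_report": 0.5, "government_report": 0.9}
links = {
    "civilian_report": [("government_report", 1.0)],
    "government_report": [("civilian_report", 1.0)],
}
result = propagate(scores, links)
print({k: round(v, 2) for k, v in result.items()})
# {'civilian_report': 0.62, 'government_report': 0.78}
```

In this toy case the weaker record is pulled up by its stronger neighbour and the stronger one is pulled down slightly, which is the expected behaviour of a weighted average over connected nodes.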
Implementation: For government Tier 1 sources, scores range 6.0–9.0 (base 6.0). For civilian Tier 3 sources, scores range 1.0–7.5 (base 1.0). Cross-source corroboration is the most powerful factor — two independent databases documenting the same event within 25 km / 7 days earn a significant bonus. 17,126 verified corroborations exist in the current corpus.
05

System Architecture

AI² FRAMEWORK · 7-LAYER STACK · DEPLOYMENT OVERVIEW

UAP Explorer Deployment & Doctrine Architecture (AI² Framework)
UAP Explorer AI² Deployment Architecture showing 6-layer stack from raw data sources through ingestion pipeline, storage, API, Cloudflare edge, and frontend applications

THE 7-LAYER ENGINE STACK

L1 | Ingestion | Multi-source APIs, Python extractors, Claude Haiku AI extraction for PDFs, IoT sensor feeds
L2 | Normalization | Schema standardization, metadata tagging, geocoding via Nominatim
L3 | Feature Extraction | AI-driven CV and NLP; structured field extraction from unstructured documents
L4 | Scoring Engine | Reliability function R, Haversine cross-source linker (25 km / 7 day threshold)
L5 | Knowledge Graph Layer | Entity resolution, relationship mapping, 17,126 cross-source links in uap_deploy.db
L6 | AI Analytics Layer | Pattern detection, clustering, anomaly signature identification, MCP server
L7 | Visualization | Leaflet maps, D3.js analytics, STRATCOM AI chat dashboard, 3D UAP tracks
OV-1 · Operational Concept
AI² Operational Concept showing multi-domain sensor network feeding into centralized AI² Intelligence Platform
Safe & Trustworthy AI Architecture
Architecture for Safe and Trustworthy AI showing 4-phase lifecycle with GRC framework
06

UAP Explorer Platform

LIVE INTELLIGENCE PLATFORM · PUBLIC ACCESS · AI-QUERYABLE

Platform

UAP Explorer™ — A Data-Driven Intelligence Platform for Anomaly Analysis

Reimagining how anomalous events are analyzed using structured data, AI, and intelligence-grade methodologies. Features multi-modal data ingestion, evidence scoring and reliability engine, knowledge graph and entity correlation, geospatial intelligence and heatmaps, AI-powered analytical engine via MCP, and a canonical evidence repository spanning 26 government and civilian sources across 70+ countries and 225 years of records.

Defense & Intelligence Analysis Scientific Research Aerospace Sensor Correlation AI Model Risk & Trust
07

What is a Knowledge Graph?

DATA CONNECTED, NOT JUST COLLECTED

A knowledge graph is a network where entities are nodes and relationships are edges. Most UAP databases treat records as isolated rows in a table. This platform treats them as a living graph — where a Blue Book Air Force report from 1962 can be linked to a NUFORC civilian call about the same event recorded independently, decades later.

Nodes

Individual Sighting Records

Each of the 361,491 records is a node in the graph, carrying structured attributes: date, location, shape, credibility score, source tier, and evidence types.

Edges

Cross-Source Corroborations

The Haversine algorithm identifies when two independent sources document the same event within 25 km and 7 days. Each confirmed match is an edge; the graph holds 17,126 verified edges.

Credibility

Edge-Weighted Scoring

A civilian report corroborated by an Air Force investigation scores dramatically higher than either source alone. The graph structure elevates quality through connection — not assertion.

Emergent Insights

Patterns Invisible in Flat Data

Clustering coefficients surface geographic hotspots. Path analysis links seemingly unrelated events across decades. The Wright-Patterson Effect — a documentation bias — became visible only through cross-source analysis.
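To make the node-and-edge model concrete, the following Python sketch builds an undirected graph from same_event links and computes a local clustering coefficient, the kind of measure mentioned above. The record identifiers are invented for illustration, and the platform's own graph storage and analytics are not described here, so this is a sketch of the general technique only.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(links):
    """Turn (record_a, record_b) link pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def local_clustering(graph, node) -> float:
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = graph.get(node, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    closed = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * closed / (k * (k - 1))


# Hypothetical record identifiers; each pair stands for one same_event link.
links = [
    ("bluebook:1", "nuforc:7"),
    ("bluebook:1", "hatch:3"),
    ("nuforc:7", "hatch:3"),
    ("bluebook:1", "nicap:9"),
]
graph = build_graph(links)
print(round(local_clustering(graph, "bluebook:1"), 3))  # 0.333
```

A coefficient near 1 means the records around a node corroborate one another as well as the node itself; a value near 0 means the node is a hub for otherwise unrelated records.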

FIRST 01

Unified Cross-Government Database

US (Blue Book, CIA, AARO, DIA, NSA, FBI), UK (MoD), France (GEIPAN), Spain (CEFAE), and Brazil (SIAN) in a single queryable schema — never done before.

FIRST 02

Algorithmic Cross-Source Corroboration

17,126 verified events where two or more independent sources documented the same incident, computed across 800,000+ record pairs using Haversine at 25 km / 7-day thresholds.

FIRST 03

Structured Credibility Scoring Algorithm

Every record receives a 1.0–9.0 score across five independent dimensions. Reproducible, transparent, versioned. Source code publicly available.

FIRST 04

AI-Assisted PDF Extraction at Scale

800+ government FOIA documents — previously unsearchable scanned PDFs — processed using Claude Haiku. 525 CIA records and 13 AARO analytical documents extracted this way.

FIRST 05

Public API with Canned Query Endpoints

A live public JSON API serves 10 pre-defined research queries with no authentication required for read access.

FIRST 06

MCP-Compatible AI Query Layer

An MCP server allows any Claude-compatible AI to query the full database in natural language, making this the first UAP database to be AI-queryable via a standardised protocol.

08

How It Works — The Pipeline

6-STAGE INGESTION · EXTRACTION · SCORING · LINKING · AUDITABILITY

01
Source Acquisition
Each source is acquired from its primary origin: NARA FOIA releases, government open data portals, academic archives, or structured web scraping with rate limiting and robots.txt compliance. No third-party aggregators.
02
Structured Extraction
CSV sources are parsed with column normalisation and shape vocabulary mapping. PDF sources (800+ documents) are extracted page-by-page using pdfplumber, then processed by Claude Haiku with structured JSON prompts tuned to each document format.
03
Geocoding & Normalisation
Locations are resolved to latitude/longitude via OpenStreetMap Nominatim with fallback string matching. Dates are normalised to ISO 8601 with confidence levels (exact / date / month / year). Shapes are mapped to a canonical 15-category vocabulary.
04
Credibility Scoring
Each record receives a score from 1.0–9.0 across five dimensions: source tier (government vs. civilian), physical evidence types, observation duration, geospatial precision, and cross-source corroboration bonus. Algorithm is versioned and logged per ingestion run.
05
Geospatial Cross-Linking
The Haversine linker runs against all geolocated events with confirmed dates. For every source pair, it identifies events within 25 km and ±7 days. Confidence is scored 60% geographic / 40% temporal. Pairs above 0.7 confidence are stored as same_event links.
06
Auditability & Versioning
Every ingestion run is logged with script version, source file hash, record count, and error summary. The database schema and scoring algorithm are versioned. The full pipeline is reproducible from raw source files.
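The cross-linking stage (step 05) can be sketched as follows in Python. The 25 km and 7-day windows, the 60/40 weighting, and the 0.7 threshold come from the description above; the linear decay of each proximity term inside its window is an assumption, since the exact shape of the platform's proximity functions is not stated, and the function names are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_KM = 6371.0088   # mean Earth radius
MAX_KM, MAX_DAYS = 25.0, 7    # matching window from the pipeline description
GEO_WEIGHT, TIME_WEIGHT = 0.6, 0.4
LINK_THRESHOLD = 0.7


def haversine_km(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def link_confidence(dist_km: float, day_gap: int):
    """Return a 0-1 confidence, or None when the pair is outside the window."""
    if dist_km > MAX_KM or day_gap > MAX_DAYS:
        return None
    geo = 1.0 - dist_km / MAX_KM        # assumed linear decay
    temporal = 1.0 - day_gap / MAX_DAYS  # assumed linear decay
    return GEO_WEIGHT * geo + TIME_WEIGHT * temporal


def is_same_event(rec_a, rec_b) -> bool:
    """Each record is a (lat, lon, date) tuple from a different source."""
    dist = haversine_km(rec_a[0], rec_a[1], rec_b[0], rec_b[1])
    gap = abs((rec_a[2] - rec_b[2]).days)
    conf = link_confidence(dist, gap)
    return conf is not None and conf >= LINK_THRESHOLD


# Hypothetical records about 5.6 km and one day apart.
a = (40.00, -105.00, date(1966, 3, 20))
b = (40.05, -105.00, date(1966, 3, 21))
print(is_same_event(a, b))  # True
```

A production linker would also need to avoid comparing every pair of records, for example by bucketing records by date and by coarse grid cell before running the distance check.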

CREDIBILITY SCORE DIMENSION BREAKDOWN

Dimension | Range | Notes
Source Authority | 1.0 – 6.0 | Gov Tier 1 base 6.0 · Civilian Tier 3 base 1.0
Physical Evidence | +0.5 – 1.5 | Radar, photography, video, material recovery
Duration | +0.5 – 1.0 | Longer observation = stronger documentation
Geospatial Precision | +0.5 | Confirmed lat/lng vs. approximate location
Cross-Source Corroboration | Up to +4.0 | Most powerful factor: independent confirmation
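One way the dimension breakdown could combine into a single score is sketched below in Python. The tier bases, bonus ceilings, and the Tier 1 and Tier 3 score ranges are taken from this page; the assumptions are that bonuses are simply added to the base, that each bonus is clipped to its listed maximum, and that Tier 2 (whose ceiling is not stated) shares the overall 9.0 limit. The function name is hypothetical.

```python
TIER_BASE = {1: 6.0, 2: 3.0, 3: 1.0}  # base scores per source tier
TIER_CAP = {1: 9.0, 3: 7.5}           # stated ranges: 6.0-9.0 and 1.0-7.5
DEFAULT_CAP = 9.0                     # assumed ceiling where none is stated


def _clip(value: float, maximum: float) -> float:
    """Keep a bonus between 0 and its documented maximum."""
    return max(0.0, min(value, maximum))


def credibility_score(
    tier: int,
    physical_evidence: float = 0.0,   # up to +1.5
    duration: float = 0.0,            # up to +1.0
    precise_coords: bool = False,     # +0.5 when lat/lng are confirmed
    corroboration: float = 0.0,       # up to +4.0
) -> float:
    """Additive sketch: tier base plus clipped bonuses, capped per tier."""
    if tier not in TIER_BASE:
        raise ValueError("tier must be 1, 2, or 3")
    bonus = (
        _clip(physical_evidence, 1.5)
        + _clip(duration, 1.0)
        + (0.5 if precise_coords else 0.0)
        + _clip(corroboration, 4.0)
    )
    cap = TIER_CAP.get(tier, DEFAULT_CAP)
    return round(min(TIER_BASE[tier] + bonus, cap), 1)


print(credibility_score(3))                               # 1.0
print(credibility_score(3, 1.5, 1.0, True, 4.0))          # 7.5
print(credibility_score(1, 1.5, 1.0, True, 4.0))          # 9.0
```

Under these assumptions an uncorroborated civilian report stays at 1.0, while the same report with every bonus applied still stops at the Tier 3 ceiling of 7.5.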
09

Data Sources

38 INDEPENDENT SOURCES · ONE SCHEMA · ONE SCORE

Each source was selected for its provenance, accessibility, and research value. Every record is traced to its primary origin, processed through the same ingestion pipeline, and scored against the same credibility algorithm.

TIER 1 Official Government Records — Declassified military and intelligence files Base score 6.0
🇺🇸 Project Blue Book — US Air Force 1947–1969
Public Domain
6,636
The USAF's official UFO investigation program. 6,636 digitised records from 12,000+ investigated reports. 701 cases officially classified "unidentified" — never explained. Primary anchor for cross-source corroboration.
Parsed from structured CSV with case number, date, location, shape, and USAF classification
Air Force "Unidentified" classification adds +0.5 to credibility score
Source: US National Archives, digitised by Fold3
🇺🇸 CIA FOIA Collection 1908–2001
Public Domain
525
Declassified intelligence documents from the CIA FOIA reading room — overseas cables, internal analyses, and assessments of Soviet-era sighting waves. Dalnegorsk USSR (1989) — the strongest physical evidence record in the entire database — comes from this collection.
800+ PDFs extracted page-by-page using pdfplumber, then processed by Claude Haiku AI
Each extraction logged with source document hash for full auditability
Intelligence cables involving physical evidence receive +1.0–1.5 score bonus
🇺🇸 AARO — All-domain Anomaly Resolution Office 1942–2024
Public Domain
30
Current US DoD UAP investigation office, established by Congress in 2022. Includes Go Fast (2015), Puerto Rico UAP (2013), and the ORNL metallic specimen analysis. Its 8 official case resolution reports are the highest-credibility records in the database.
Modern cases carry the highest base scores — government-vetted, multi-sensor confirmed
Video evidence (Go Fast, Gimbal, FLIR1) adds +1.0 photographic evidence bonus
🇬🇧 UK Ministry of Defence 1909–2009
Open Govt Licence v3
80
Declassified UAP case files released to the UK National Archives 2008–2013. Includes Rendlesham Forest (1980) and multiple radar-confirmed contacts over UK airspace.
UK grid references converted to WGS84 lat/lon for map display
Cases with military witness + radar receive maximum Tier 1 bonuses
🇪🇸 CEFAE — Spanish Air Force 1962–1995
CC BY 4.0
82
80 declassified expedientes from Spain's Air Force UAP commission. 1,900 pages spanning 1962–1995. Includes the Manises Incident (1979) — military pilot forced to land due to UAP proximity — and multiple radar-confirmed contacts.
Spanish text handled with bilingual extraction prompts (ES/EN) via Claude Haiku
CEFAE A/B/C/D classification maps directly to credibility bonus
🇺🇸 NARA-FAA — Federal Aviation Administration 2007–2024
Public Domain
651
Aviation safety UAP reports from the FAA SKYWATCH system, transferred to NARA under the 2024 NDAA. Filed by pilots and air traffic controllers with precise flight instrument position data. NARA-FAA is the dominant source in cross-source corroboration — 1,526 verified links with NUFORC civilian records.
Aviation professional witness adds +1.0; radar contact adds +1.5
Precise flight coordinates enable the highest-confidence Haversine matching in the dataset
🇺🇸 FBI Vault — UFO Files 1947–1954
Public Domain
1,609
Declassified FBI field office reports and teletype messages covering UFO sightings 1947–1954. Released via FOIA through the FBI Vault in 16 parts (1,632 scanned pages). Covers the 1947 wave, Roswell, and the Kenneth Arnold sighting.
Extracted via Claude Haiku vision from 1,632 scanned pages across 16 PDF volumes
708 cross-source links with Hatch-UDB — strongest corroboration pair for the 1947–1954 era
🇺🇸 NARA-NSA — National Security Agency 1955–1995
Public Domain
38
Declassified NSA signals intelligence records relating to UAP, transferred to NARA under the 2024 NDAA. Among the most restricted records ever made publicly available. Small record count (38) reflects genuine rarity, not incomplete ingestion.
🇺🇸 NARA-State — US State Department Cables 1965–1989
Public Domain
8
Diplomatic cables from US embassies reporting UAP incidents in host countries — Barbados, Argentina, Pakistan, Kuwait, Togo. These represent foreign government UAP incidents that crossed the US diplomatic reporting threshold — an extremely high bar for inclusion.
🇺🇸 DIA · Army Intelligence · DoD · Sign / Grudge 1947–2024
Public Domain
217
Combined: DIA (67 records, foreign military UAP), Army Intelligence (9 records), DoD analytical (13 records including Go Fast analysis and ORNL metallic specimen study), Sign/Grudge (5 records, pre-Blue Book 1947–1951 including the classified "Estimate of the Situation").
TIER 2 Expert Civilian / Institutional — Scientifically-organised civilian research Base score 3.0
🇫🇷 GEIPAN — French Space Agency (CNES) 1954–present
CC BY-SA
3,274
France's official UAP investigation unit, operated by the national space agency CNES since 1977. The only currently active government UAP program that publishes openly in real time with its own A/B/C/D classification.
Data ingested directly from GEIPAN's public API and open database export
GEIPAN classification A (no explanation found) receives +2.0 credibility bonus
🇺🇸 NICAP — National Investigations Committee 1860–2000s
Public Domain
5,489
The most methodologically rigorous US civilian UAP research organisation (1956–1980). NICAP's Sighting Information Database applies systematic categorisation across 11 evidence categories: radar, photographic, EM effects, physical trace, animal effects, and more.
Historical depth (1860–) adds pre-aviation era records absent from all government sources
Treated as Tier 2 due to systematic investigator review process
🌍 Hatch *U* UFO Database ~1400–2003
Family permission
18,077
Larry Hatch's lifetime compilation — 18,000+ UAP events from primary sources worldwide, pre-geocoded with lat/lon. Hatch's own credibility (0–9) and strangeness (0–9) ratings, plus rich attribute codes covering observer type, evidence category, and shape. Exceptional historical depth across 100+ countries.
🇺🇸 NARCAP — National Aviation Reporting Center 1995–present
Research licence
3 (sample)
UAP reports exclusively from aviation professionals — pilots, air traffic controllers, and flight crew — reviewed by expert investigators before publication. Full dataset integration pending formal data agreement.
🇧🇷 Brazil SIAN / Brazil-Wiki 1954–1986
CC BY / Public
15
Landmark Brazilian UAP cases from Operation Sky Open — the Brazilian Air Force's 1977 declassification. Includes the Trindade Island photographs (1958) and multi-witness military events with radar confirmation. Hand-curated for highest-evidential-quality cases.
TIER 3 Civilian Self-Reported — Large-scale databases elevated through corroboration Base score 1.0
🇺🇸 NUFORC — National UFO Reporting Center 1906–2014
Research use
134,979
The largest single source: 134,979 records spanning 108 years. NUFORC has operated a public UAP reporting hotline since 1974. Individual records score low in isolation, but 1,526 NARA-FAA ↔ NUFORC cross-source links are the platform's primary research finding: civilian phone reports and professional aviation reports independently documenting the same events.
A NUFORC record corroborated by Blue Book or NARA-FAA reaches 7.0+ — high credibility
Geographic density makes NUFORC the primary corroboration target for all geolocated government records
🇺🇸 MUFON — Mutual UFO Network 2019–2021
Research licence
3 (sample)
One of the longest-running civilian UAP organisations with 150,000+ cases globally. Field investigators follow up on significant reports — treated as borderline Tier 2/3. Full dataset integration pending formal data licensing.
Additional sources in the corpus: RAAF (Australian Defence), SETKA-MO (Russian military), Lissoni UFO Archives (Italian), NARA-ODNI, AARO OSD resolution reports, IUP-Press academic records, and others — 29 total active sources in the current deployment.
10

Frequently Asked Questions

COMMON QUESTIONS · METHODOLOGY · SCOPE · ACCESS

What does a credibility score actually mean?
The credibility score measures the quality of documentation — not the significance of the phenomenon. A score of 9.0 means a case was thoroughly documented by a government source with confirmed coordinates, physical evidence, and independent corroboration from another source. It does not mean the phenomenon was extraordinary. A score of 1.0 means a single anonymous civilian report with no corroboration. These are epistemically different claims, and the scoring system keeps them separate.
How are cross-source links computed?
The Haversine geospatial algorithm identifies when two records from different independent sources are within 25 km and ±7 days of each other. Each candidate pair receives a confidence score: 60% weighted on geographic proximity, 40% on temporal proximity. Pairs above 0.7 confidence are classified as same_event links. The 60/40 weighting reflects that location descriptions in historical records are often approximate, while temporal records are typically more precise. At 0.7+ confidence, a link implies the records are within ~18 km and ~4 days — a tight constraint.
Does this platform make claims about what UAP are?
No. The platform explicitly does not claim to resolve what UAP are. It does not score cases on strangeness, implausibility, or potential non-human origin. It treats UAP research as an infrastructure problem, not a mystery problem: what do we know, how well is it documented, and where do independent sources agree? The most significant UAP cases are likely classified and therefore absent from this dataset — this is acknowledged openly as a core limitation.
How accurate is the AI-assisted PDF extraction?
The CIA and AARO PDF extraction via Claude Haiku has not yet been formally validated against independent human extraction of the same documents — this is explicitly acknowledged as a current limitation. What can be stated: every extraction is logged with the source document hash, script version, and a structured output log. Provenance is immutable. Future work includes an inter-rater agreement study where human reviewers independently extract a sample of the same documents for comparison. Until that is published, AI-extracted records carry a slight uncertainty adjustment in their base scores.
Can I use this data for my own research?
Yes. The public API at uap-knowledge-graph.onrender.com accepts standard SQL SELECT queries and returns JSON with no authentication required for read access. The MCP server at uap-mcp-server.onrender.com/mcp enables natural language queries via any Claude-compatible AI. For bulk data access or academic citation requests, use the contact form on the platform. The platform is designed as open infrastructure for the research community.
What are the known limitations?
No external validation metrics: Precision/recall for cross-source linking has not been formally computed against a human-reviewed ground truth. The 17,126 link count should therefore be read as an upper bound on true matches.

Small Tier 1 samples: Some government sources have small record counts (30 AARO, 80 UK MoD) reflecting the current state of public disclosure, not methodology gaps.

Parameter subjectivity: Scoring bonus values are reasonable but somewhat arbitrary in the absence of ground truth labels. Alternative parameterisations have not been systematically evaluated.

Public data only: The most significant UAP cases almost certainly remain classified. This dataset represents the declassified fraction of a substantially larger body of investigation.
How often is the database updated?
The ingestion pipeline is designed to process new government releases within days of publication. Priority targets include the DOW/PURSUE tranche (war.gov/UFO, released May 2026, rolling ~30-day releases), NARA RG-615 rolling releases from ODNI/OSD/FAA/NRC, and new AARO annual reports. The database currently holds 361,491 records from 38 sources. Routine pipeline automation is under active development.
Who operates this platform?
The UAP Knowledge Graph is an independent, non-commercial public service. Research is published under the pseudonym Redefine Zero. The platform operates on a minimal infrastructure budget using Render, Cloudflare R2, and static hosting. No advertising. No data selling. The goal is open, citable UAP research infrastructure for the public and research community.
11

How to Cite

CITATION FORMATS · ZENODO PENDING · REPRODUCIBLE RESEARCH

APA Format — Methodology Paper
Redefine Zero. (2026). A scientific framework for credibility scoring and cross-source linking of
  unidentified anomalous phenomena records. UAP Knowledge Graph.
Platform: https://ai-2.io/uap/ · API: https://uap-knowledge-graph.onrender.com
APA Format — AI² Doctrine
Redefine Zero. (2026). Anomaly Intelligence (AI²): A data-driven, graph-centric, and quantifiable
  framework for UAP analysis. UAP Knowledge Graph — Redefine Zero Research.
Platform: https://ai-2.io/uap/ · Author: Redefine Zero
Zenodo publication pending. A DOI-citable version of the methodology paper will be published on Zenodo under the pseudonym "Redefine Zero" for full citeability in academic literature. Subscribe to platform updates for the DOI when issued.