How Flat9 Intelligence works

A working explanation of what this platform does, how it produces its sentences, what the confidence numbers mean, and where it can be trusted.

1. Overview

Intelligence reads the world's news as a stream of structured events, builds a graph of relationships from it, and surfaces a daily feed of AI-written sentences describing where things are going. Each sentence carries a numeric confidence and an honest calibration tier.

What it is not: a news aggregator, a sentiment dashboard, a Polymarket viewer, or a chatbot. It is a calibrated reasoning engine over a news entity-graph, with prose generated by a large language model under tight constraints.

The core question this system tries to answer: given everything that just happened in the world, what shifted, and how confident should we be that those shifts mean something?

2. Architecture

Three input streams feed a graph database. Pattern detectors read the graph state and produce candidate claims. Each claim's confidence is computed numerically; its sentence is written by an LLM. The two paths never cross.

[Architecture diagram] Sources: GDELT 15-min events; RSS feeds (tech, defense, …); FRED macro indicators. Enrichment: Wikidata canonical IDs (QIDs); Polymarket resolved markets (model labels). Link: events → entities via Wikidata QID. Weight: co-mention edges weighted by tier × tone × time decay. Aggregate: graph + gravity (undirected, recursive PageRank). Detect: pattern detectors (rising_edge, gravity_surge, …). Number: confidence from signal magnitude + Polymarket-trained model. Prose: LLM sees pattern + signals, never the number. Output: claim sentence + confidence % + calibration tier.
Three sources push events into the system: GDELT (15-min global feed), RSS (think-tank, defense, tech press, Reddit), and FRED (macro indicators). Two enrichment layers supplement specific stages without producing events: Wikidata gives every entity a canonical QID at link time, and Polymarket resolved-market outcomes train the logistic model that converts signal magnitude into a calibrated probability. The chain itself runs vertically. Events are linked to seeded entities; each shared event yields a co-mention edge weighted by the article's publisher tier × tone × time decay; recursive PageRank over the resulting undirected graph produces per-node gravity. Pattern detectors fire on graph state. The system computes a confidence number from signal magnitude (refined by the Polymarket-trained mapping when the pattern shape has resolved-market history), while the LLM writes prose given the pattern and signals, never the number. Both outputs land in the claims table together.

3. Data pipeline

Three event sources (GDELT, RSS, and FRED) land in the same events table, distinguished by the source column.

Every event row carries a publisher_tier (1=global wire, 2=national press / think tank, 3=regional / state-aligned, 4=aggregator / content farm, 5=unknown), a derived publisher_weight, an alert flag if the title matches the escalation keyword list, and a list of topic tags from a curated catalog (tariffs, fed-rates, iran-nuclear, election, …). All four are precomputed at ingest so downstream pattern detection is a fast SQL join.
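To make that concrete, here is a minimal sketch of the ingest-time precompute in Python. The function name, the keyword examples, and the catalog entries are illustrative assumptions, not the production code; the real lists are curated.

ESCALATION_KEYWORDS = {"strike", "mobilize", "ultimatum", "incursion"}   # assumed examples
TOPIC_CATALOG = {
    "tariffs": {"tariff", "trade war"},
    "iran-nuclear": {"enrichment", "iaea"},
}

def precompute_flags(title: str) -> dict:
    """Flag escalation language and tag topics at ingest, so pattern
    detection later is a plain SQL join rather than a text scan."""
    lowered = title.lower()
    return {
        "alert": any(k in lowered for k in ESCALATION_KEYWORDS),
        "topics": [t for t, kws in TOPIC_CATALOG.items()
                   if any(k in lowered for k in kws)],
    }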

A real GDELT event

To make this concrete, here is one event from the live database (a real row, ingested 2026-03-10), shown in three views: as it lands in the raw GDELT export, as it ends up in our events table, and as it links into the entity graph.

1. Raw GDELT row (key columns of 61)

GLOBALEVENTID         1293508008
SQLDATE               20260310
Actor1Name            ISRAELI MILITARY
Actor1CountryCode     ISR              (CAMEO)
Actor1Geo_CountryCode IS               (FIPS, geo-confirmed)
Actor2Name            HAMAS
Actor2CountryCode     PSE
Actor2Geo_CountryCode GZ
EventCode             112              (ACCUSE, within DISAPPROVE root)
EventRootCode         11
QuadClass              3                (verbal conflict; roots 09–13)
GoldsteinScale        -2.0
AvgTone               -7.434           (average tone of source documents)
ActionGeo_CountryCode GZ
ActionGeo_FullName    Gaza Strip, Gaza, GZ
ActionGeo_Lat         31.500
ActionGeo_Long        34.750
DATEADDED             20260310203000
SOURCEURL             https://www.newcastleherald.com.au/story/9195158/
                      israeli-army-kills-three-in-southern-gaza-strip-tunnel/

2. After ingestion: row in our events table

id               659907
source           gdelt                  (the data provider, not the publisher)
source_id        1293508008             (the GDELT GLOBALEVENTID)
occurred_at      2026-03-10 20:30:00
cameo_code       112
tone             -7.434
location_lat     31.500
location_lon     34.750
url              https://www.newcastleherald.com.au/story/9195158/...
publisher_tier   3                      (regional Australian press)
publisher_weight 0.600                  (tier 3 → 0.6)

The actor codes, geo-confirmation, and Goldstein scale are read at parse time for filtering and entity-resolution decisions, but only the columns above are persisted: enough to reason about the event, recover the source, and join to entities. source here means the data provider (gdelt, treendly, polymarket), not the publisher of the article. The publisher is captured separately as publisher_tier (the five-tier scale defined above), which maps to a publisher_weight used to weight the event's contribution to graph edges. The Newcastle Herald is a regional Australian daily, so this row carries weight 0.60 rather than the 1.00 a Reuters or AP wire would carry.

Why have a tier at all: a story that was simmering in regional outlets and then breaks into Reuters/BBC/NYT is qualitatively different from one that has been in the wires all along. The mainstream pickup is an early signal that editors who cover the region seriously have decided this matters, and it is exactly what the mainstream_crossing pattern detector looks for. Tier 5 (unknown) defaults to 0.50 so unfamiliar publishers participate in edges without pulling the model in either direction.
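A sketch of the tier-to-weight mapping, for concreteness. The 1.00, 0.60 and 0.50 values come straight from the text above; the tier-2 and tier-4 values here are assumed interpolations, not documented numbers.

PUBLISHER_WEIGHT = {
    1: 1.00,   # global wire (from the text)
    2: 0.80,   # assumed interpolation
    3: 0.60,   # regional (from the text)
    4: 0.40,   # assumed interpolation
    5: 0.50,   # unknown: neutral, per the text
}

def publisher_weight(tier: int) -> float:
    # Unrecognized tiers stay neutral at 0.50, like tier 5.
    return PUBLISHER_WEIGHT.get(tier, 0.50)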

3. After entity linking: rows in entity_event

entity_event:
  (entity=Israel,    event=659907, role=actor1)     ← Actor1CountryCode=ISR, geo IS confirmed
  (entity=Palestine, event=659907, role=actor2)     ← Actor2CountryCode=PSE, geo GZ confirmed
  (entity=Hamas,     event=659907, role=actor2)     ← Actor2Name="HAMAS" matched org alias
  (entity=Gaza Strip,event=659907, role=proximity)  ← Action lat/lon within Gaza radius

Three of those links come from GDELT's own data (actor country codes, actor name matching). The fourth, Gaza Strip, was added by the proximity-linking pass: the event's lat/lon falls within Gaza's seeded radius, so the place entity is attached even though GDELT didn't name it. From there the event becomes one edge in the recursive PageRank graph.
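The proximity pass reduces to a point-in-radius test. A minimal sketch, assuming each seeded place carries a center and radius (the Gaza values below are illustrative, not the seeded config):

import math

# Seeded place entities with an assumed center and radius (illustrative values).
PLACES = [{"name": "Gaza Strip", "lat": 31.42, "lon": 34.38, "radius_km": 40}]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def proximity_links(lat, lon):
    """Attach every seeded place whose radius contains the event's point."""
    return [p["name"] for p in PLACES
            if haversine_km(lat, lon, p["lat"], p["lon"]) <= p["radius_km"]]

With these values, the example event at (31.500, 34.750) falls roughly 36 km from the assumed center, inside the 40 km radius, which is why the Gaza Strip link attaches.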

Pipeline steps

  1. The ingest step downloads the latest GDELT export, filters rows to Middle East scope using a CAMEO/FIPS country-code cross-check (it skips name-based false positives, e.g. a Texas county geocoded as "Jordan" because the actor's name happens to be Jordan), and writes event records.
  2. For each event, actor names and country codes are linked to canonical entities (people, organizations, places, topics). Country actors link via CAMEO codes; named persons and orgs link via name + alias substring match against the seeded entity dictionary.
  3. The edge build step emits a co-mention edge between every pair of entities that appear in the same event. Edges carry a weight (event source weight × time decay) and a timestamp (see the sketch after this list).
  4. The gravity recompute step runs a recursive PageRank iteration over the edge set and persists each entity's gravity and momentum.
  5. The pattern detection step scans the current graph state for triggered patterns; the claim generation step turns triggered patterns into claims.
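Step 3 reduces to a few lines. A sketch under stated assumptions: events arrive as dicts carrying their linked entities, publisher weight, and age in days; keying on the sorted pair is what makes the edge undirected (§4), and the half-life matches the gravity formula in §5.

from collections import defaultdict
from itertools import combinations
import math

HALFLIFE_DAYS = 14.0

def build_edges(events):
    """events: dicts with 'entities' (linked entity ids), 'weight'
    (publisher weight) and 'age_days'. Keyed on the sorted pair,
    so an edge is undirected co-mention, not actor → target."""
    edges = defaultdict(float)
    for ev in events:
        decay = math.exp(-ev["age_days"] * math.log(2) / HALFLIFE_DAYS)
        for a, b in combinations(sorted(set(ev["entities"])), 2):
            edges[(a, b)] += ev["weight"] * decay
    return dict(edges)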

Articles are not re-scraped; the seed entity catalogue is partly imported from hipcityreg/situation-monitor [3], which has done the curation work. Polymarket markets [4] are pulled from the public Gamma API and matched to entities by question-text substring against the entity name dictionary.

4. The entity graph

Nodes are entities with stable identity: every entity carries a Wikidata QID [5] when one exists. Edges are co-mentions: pairs of entities that appeared in the same news event within a recent window. The graph is undirected — an edge means "these two were talked about together," not "X did something to Y."

GDELT carries direction at the event level (Actor1 → Actor2 with a CAMEO action code), but we deliberately flatten that into an undirected co-mention edge for the graph layer. News direction is noisy: the same conflict produces paired rows like "Iran ACCUSED Israel" and "Israel STRUCK Iran" with opposite directions, and naïvely summing them washes out the underlying signal that both entities are entangled. Direction itself isn't lost: each event row keeps its CAMEO code and tone, which feed the tone_shift pattern and the publisher-tier weighting; the graph's job is to answer "who is entangled with whom," not "who is acting on whom." If a future pattern needs explicit directionality we'll add a directed edge type alongside co_mention rather than rebuild the existing one.

[Graph diagram] Nodes: Israel (Q801, gravity 0.121), Iran (Q794, 0.118), IRGC (Q260354, 0.067), Hezbollah (Q170424, 0.054), IDF.
A toy fragment of the Middle East subgraph. Edge thickness is proportional to co-mention frequency in the last 30 days. Each node carries a Wikidata QID and a recursive-PageRank score (gravity).

Wikidata QIDs are non-negotiable. Following Sahu et al. [6], who showed that LLM-generated knowledge graphs from GDELT suffer from entity inconsistency (DALI and THE DALI as separate nodes; 435/968 isolated vertices in GraphRAG), the system uses an ontology-grounded canonical ID for every entity. The LLM never assigns identity.

5. Recursive PageRank

Influence flows through the graph the way the original PageRank algorithm [7] proposed: an entity is important when other important entities point to it, recursively. The same idea was popularised for AI account influence ranking by the Digg AI 1000 system [8], which directly inspired this design.

The iteration:

gravity(t+1)[i]  =  (1 − d) / N
                  +  d · Σⱼ ( gravity(t)[j] · w(j → i) )
                  +  d · ( dangling_mass / N )

w(j → i)  =  weight(j↔i) · exp(−Δt · ln 2 / halflife)
              ─────────────────────────────────────────
              Σ_k weight(j↔k) · exp(−Δt · ln 2 / halflife)

Where d = 0.85 (standard damping), halflife = 14 days (so a 30-day-old edge contributes ~23% of a fresh edge), N is the entity count, and dangling mass is redistributed uniformly so isolated entities don't bleed mass to themselves. Convergence is reached in 20–30 iterations on a graph of our size; the entire daily recompute runs in well under a second in pure PHP.
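For concreteness, here is a Python transcription of the iteration (the production recompute is pure PHP; the time decay is assumed to be folded into the edge weights already, as in the edge-build sketch in §3):

def gravity(edges, nodes, d=0.85, tol=1e-9, max_iter=100):
    """edges: {(a, b): decayed_weight}, undirected; nodes: list of entity ids.
    One damped pass per iteration, dangling mass redistributed uniformly."""
    n = len(nodes)
    g = {v: 1.0 / n for v in nodes}
    out = {v: 0.0 for v in nodes}           # total edge weight per node
    nbrs = {v: [] for v in nodes}
    for (a, b), w in edges.items():
        out[a] += w; out[b] += w            # undirected: both directions
        nbrs[a].append((b, w)); nbrs[b].append((a, w))
    for _ in range(max_iter):
        dangling = sum(g[v] for v in nodes if out[v] == 0.0)
        nxt = {v: (1 - d) / n + d * dangling / n for v in nodes}
        for v in nodes:
            if out[v]:
                share = d * g[v] / out[v]   # g[j] · w(j→i) / Σ_k w(j→k), damped
                for u, w in nbrs[v]:
                    nxt[u] += share * w
        if sum(abs(nxt[v] - g[v]) for v in nodes) < tol:
            return nxt                      # converges in ~20–30 passes here
        g = nxt
    return g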

Why time decay matters. Without it, the graph fossilizes: yesterday's big news stays at the same gravity forever. The 14-day half-life is an opinion, not a measured optimum: long enough that a brief news cycle doesn't immediately outweigh structural relationships, short enough that this week's events dominate. Tunable.

6. Pattern detection

Patterns are finite, pre-defined graph phenomena. The LLM does not choose what to write about; that would create variance and unaccountable output. Each pattern type defines a numeric detector, an LLM prompt template, and a calibration tier.

The catalogue (trigger condition, sentence shape, calibration tier):

rising_edge: edge weight A↔B in last 30d ≥ 2× prior 30d, with ≥3 corroborating events. Sentence: "{A}–{B} cooperation/tension is intensifying." Tier: medium.
falling_edge: edge weight A↔B in last 30d ≤ 0.5× prior 30d, with ≥3 prior events. Sentence: "{A}–{B} relationship is cooling / backchannel going quiet." Tier: medium.
gravity_surge: entity gravity momentum > +20% and gravity > 1.5× baseline. Sentence: "{Entity} is becoming a focal point." Tier: medium.
gravity_collapse: entity gravity momentum < −20% and gravity > 1.5× baseline. Sentence: "{Entity}'s influence is declining." Tier: medium.
tone_shift: mean GDELT tone for entity shifts ≥ 2.0 between 14d windows. Sentence: "{Topic} sentiment is hardening / softening." Tier: medium.
cluster_formation: event count for entity in last 14d ≥ 3× prior 14d, with ≥5 events. Sentence: "{Entity} is becoming a flashpoint." Tier: low.
triangular_tightening: three entities A, B, C with all three pairwise edges rising ≥1.5×. Sentence: "{A}, {B}, {C} are converging." Tier: low.

The 1.5× baseline gate on gravity surge/collapse is load-bearing. With N entities the uniform gravity is 1/N; in a sparse graph, PageRank redistributes mass onto the few nodes with edges, mathematically squeezing everyone else by ~20% as a regression-to-mean artifact. Without the 1.5/N floor, every other entity would fire as a "collapse" the moment two of them surged. The threshold scales with graph size automatically.
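The gate itself is a few lines. A sketch with assumed argument names, where momentum is the fractional change in gravity:

def gravity_gate(gravity_now, momentum, n_entities):
    """The floor is 1.5× the uniform gravity 1/N, so it scales with
    graph size automatically; below it, neither pattern can fire."""
    if gravity_now <= 1.5 / n_entities:    # the load-bearing baseline gate
        return None
    if momentum > 0.20:
        return "gravity_surge"
    if momentum < -0.20:
        return "gravity_collapse"
    return None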

Two further pattern types exist in the catalogue but are dormant until their inputs are wired.

7. Prose vs. number, the strict separation

This is the hard architectural rule that distinguishes this system from a generic "LLM reads news and writes commentary" tool: the LLM never produces, sees, or influences the confidence number.

The order of operations matters. When a pattern fires, the system first computes the numeric confidence from the signal magnitude (and, for tiers that have it, the Polymarket-trained mapping in §8). Then the LLM is handed the pattern type, the entities, and the structured signals — but never the resulting number — and asked to write a sentence. Two outputs come back: a sentence (from the LLM) and a confidence (from the formula). They land in the same claims row. The LLM doesn't know what confidence its sentence will be paired with, and the confidence math doesn't know what sentence will be paired with it.
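Structurally, the rule is this small. A sketch with placeholder paths: compute_confidence and llm_sentence are assumed names, the sigmoid stands in for the real §8 mapping, and the f-string stands in for the real LLM call.

import math

def compute_confidence(signals):
    # Numeric path (placeholder): squash raw signal magnitude into (0, 1).
    return 1.0 / (1.0 + math.exp(-signals["magnitude"]))

def llm_sentence(pattern, entities, signals):
    # Prose path (placeholder): the real call hands the LLM the pattern
    # and structured signals; the prompt never contains a probability.
    return f"{', '.join(entities)}: {pattern} detected"

def generate_claim(pattern, entities, signals):
    confidence = compute_confidence(signals)             # number first
    sentence = llm_sentence(pattern, entities, signals)  # prose never sees it
    return {"sentence": sentence, "confidence": confidence}  # joined only here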

[Diagram] A triggered pattern plus structured signals feeds two parallel paths: the LLM (structured input → prose) yields the sentence, with no numbers and no probability; the confidence formula plus Polymarket calibration yields the empirically grounded confidence %.
The two paths run in parallel from a triggered pattern and never cross. Mixing them produces hallucinated confidence dressed as analysis, the failure mode this design is built to prevent.


8. Polymarket calibration

Polymarket [4] is the calibration substrate, never the user-facing primitive. Users never see a market or a question. Polymarket's role is purely upstream of the confidence model: it provides ground-truth outcomes against which the system's signals can be checked.

The mechanic, in three steps:

  1. Polymarket has thousands of resolved binary markets ("Will X happen by Y date — YES/NO") with a known, money-backed outcome.
  2. For each resolved market whose entities/topics overlap our graph, we replay what our graph signals were saying in the days before resolution. The market_features table accumulates that time series per market: gravity, momentum, edge weight, event count.
  3. A logistic regression learns the mapping graph state → resolution:
P(resolution = YES)  =  f(graph state at time t,
                          for the entities mentioned in the question)

That trained f is what produces the confidence number for live claims whose pattern shape is well-represented in the training data. For shapes with little or no Polymarket coverage, the confidence falls back to a function of raw signal magnitude — and the calibration tier reflects that.
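A minimal sketch of the training step, using scikit-learn and placeholder rows (the real feature rows are replayed from market_features; these numbers are illustrative only):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss

# One row per resolved market: gravity, momentum, edge weight, event count
# in the days before resolution. y: 1 = resolved YES. Placeholder values.
X = np.array([[0.12,  0.30, 2.1, 14],
              [0.02, -0.10, 0.4,  3],
              [0.08,  0.50, 1.7,  9]])
y = np.array([1, 0, 1])

f = LogisticRegression().fit(X, y)        # the trained mapping graph state → P(YES)
p = f.predict_proba(X)[:, 1]              # calibrated confidence for live claims
print("Brier:", brier_score_loss(y, p))   # mean squared deviation from outcomes

brier_score_loss here is the same Brier score the calibration curve below quantifies.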

[Chart] Axes: predicted probability (x) vs. empirical frequency (y); points below the diagonal are overconfident, above it underconfident, and the dashed diagonal is perfectly calibrated.
A calibration curve plots predicted probability (x) against the empirical frequency of YES resolutions (y). A perfectly calibrated model lies on the dashed diagonal. The Brier score quantifies the mean squared deviation. This page will populate with the live curve once enough resolved markets accumulate; until then, claims are tagged with calibration tier rather than precise probability.

Every claim carries a calibration tier badge (HIGH / MEDIUM / LOW). The tier reflects how the pattern shape behind the claim relates to Polymarket history, not whether any specific bet was validated TRUE for these entities: HIGH means the shape is well-represented among resolved markets, LOW means no resolved market maps to it and the number falls back to raw signal magnitude, with MEDIUM in between.

So a claim like The Strait of Hormuz is experiencing a sharp escalation in military and geopolitical activity at 70% LOW means "the underlying signal is strong (a clear geographic event cluster), but no resolved market has a shape we can map this to, so we won't pretend the number is a probability." That distinction is the antidote to the "confident-sounding sentence" failure mode — the system's job includes telling the user when it is guessing.

9. Three independent witnesses

Every claim's confidence integrates three sources, computed numerically:

  1. News graph (GDELT 15-min events + RSS feeds from think tanks and OSINT outlets) — is this entity gaining centrality, is co-occurrence tightening, are the headlines using escalation language?
  2. Polymarket calibration — does the resulting feature pattern resemble historical patterns that resolved YES on money-backed binary markets?
  3. FRED economic indicators — Fed funds rate, CPI, unemployment, oil, FX, VIX. The macro substrate against which rate-, inflation-, and energy-flavoured claims can be sanity-checked rather than left to news framing alone.

Where they agree, the signal is trusted. Where they disagree, the divergence is itself the interesting datum. A graph signal that's invisible in the macro indicators may be a real geopolitical story not yet priced in; a Polymarket move uncorroborated by news flow may be informed traders front-running a leak; an indicator shift unaccompanied by news is the market expecting something coverage hasn't caught up to. All three are surfaced rather than collapsed.

Treendly's attention-pulse layer (search-and-social rising-trend signal) is in the architecture but not yet wired in — planned as a fourth corroborating witness for the claims that depend on whether the public is paying attention, distinct from whether elites or markets are.

10. User-facing surfaces

The system has four surfaces a reader actually opens, and each one is meant to answer a different question.

The feed

The homepage is a list of AI-written sentences with a confidence percentage, a calibration tier badge, a 7-day delta, and a 14-day inline sparkline. The sparkline reads from the claim_history table that records each claim's confidence on every generation pass; this is what turns "78% right now" into "78% and rising for the past week" at a glance. Above the feed sits a heatmap that aggregates ~500K events into ~2K geographic cells server-side, so the density is real (every ingested event contributes) without shipping a megabyte-scale payload.
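The heatmap aggregation is plain grid binning. A sketch with an assumed cell size (the real grid is whatever yields ~2K cells over the covered region):

from collections import Counter

CELL_DEG = 2.0   # assumed cell size in degrees; illustrative, not the production value

def heatmap_cells(events):
    """Bin raw event coordinates into a coarse grid server-side: every
    ingested event contributes, but the client receives one count per cell."""
    counts = Counter(
        (round(ev["lat"] / CELL_DEG) * CELL_DEG,
         round(ev["lon"] / CELL_DEG) * CELL_DEG)
        for ev in events
    )
    return [{"lat": lat, "lon": lon, "n": n} for (lat, lon), n in counts.items()]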

The claim detail

Clicking any sentence opens a fuller reasoning view. At the top is a cached AI-written 2-3 paragraph summary that frames what is happening, why it matters, and what to watch next. Below it sits the structured evidence: the entities involved (each linkable to its own dossier), the numeric signals that produced the confidence, the contributing events with reconstructed-article titles when available, the calibration tier with an explanation, and a confidence-over-time chart. The summary is regenerated only when the underlying signals change (we hash the signals JSON and compare), so visiting a claim does not burn LLM credits.
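The cache check is a canonical-JSON hash compare. A sketch; SHA-256 is our assumption, since the text only says the signals JSON is hashed:

import hashlib, json

def signals_hash(signals: dict) -> str:
    # Canonical form (sorted keys, no whitespace) so logically equal
    # signals always hash to the same digest.
    blob = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()

def summary_is_stale(signals: dict, cached_hash: str) -> bool:
    return signals_hash(signals) != cached_hash   # regenerate only on real change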

The entity dossier

For any entity in the graph, the dossier shows its current gravity and momentum, its top connected entities (time-decayed edge weights), the recent events it appears in, and the active claims it features in. It is the Digg-AI-1000 [8] view applied to news, with geography on top: a markers map shows where this entity's events happen spatially.

The /review page

An admin-only QA view listing every claim generated in a recent window, with the structured signals expandable per claim. Used to spot prose drift, mismatched signals, or "confident-sounding nonsense" failures before a broader audience does. Filters by pattern type and time window.

Article-text reconstruction

Earlier in the data pipeline, each event lands with only a URL and metadata, never a real article body. To ground LLM prose with actual headlines and specifics, the system runs Bertaglia et al.'s gdeltnews package [10] as a Python sidecar: every hour it groups recent events by their 15-min GDELT publication window, downloads the matching Web News NGrams 3.0 files, and reconstructs each article's full text by merging overlapping n-grams (~95% similarity vs. the original). The reconstructed text feeds the prose layer and lives in the articles table; it is the difference between "Iran is intensifying nuclear focus" and "Iran is intensifying nuclear focus following the Bushehr drill on March 8."
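For intuition, here is a toy version of the overlap merge; the real gdeltnews reconstruction is considerably more robust, so this only shows the idea of chaining n-grams on shared words:

def merge_ngrams(ngrams, overlap=3):
    """Greedily chain word n-grams whose first `overlap` words continue
    the text built so far, e.g. "brown fox jumps over" followed by
    "fox jumps over the lazy dog" extends to "... over the lazy dog"."""
    if not ngrams:
        return ""
    pieces = [g.split() for g in ngrams]
    text, rest = pieces[0][:], pieces[1:]
    while rest:
        for i, p in enumerate(rest):
            if text[-overlap:] == p[:overlap]:
                text += p[overlap:]
                rest.pop(i)
                break
        else:
            break          # no continuation found; stop rather than guess
    return " ".join(text)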

11. Limitations

The honest list of where this system is weak:

12. References

  1. Leetaru, K. and Schrodt, P. A. (2013). GDELT: Global Data on Events, Location and Tone, 1979–2012. International Studies Association Annual Convention. gdeltproject.org
  2. Schrodt, P. A. (2012). CAMEO: Conflict and Mediation Event Observations Event and Actor Codebook. Pennsylvania State University.
  3. hipcityreg (2026). situation-monitor, real-time dashboard for global news, markets, and geopolitical events. github.com/hipcityreg/situation-monitor. Source-tier classification, leader catalogues, and geographic seed data are imported from this project's open configs.
  4. Polymarket. Polymarket Gamma API. docs.polymarket.com
  5. Vrandečić, D. and Krötzsch, M. (2014). Wikidata: A free collaborative knowledgebase. Communications of the ACM 57(10): 78–85. wikidata.org
  6. Sahu, A. et al. (2025). Talking to GDELT through Knowledge Graphs. arXiv:2503.07584. arxiv.org/abs/2503.07584
  7. Page, L., Brin, S., Motwani, R. and Winograd, T. (1998). The PageRank Citation Ranking: Bringing Order to the Web. Stanford InfoLab Technical Report.
  8. Digg AI 1000 (2026). Recursive PageRank ranking of AI accounts on X. digg.com/ai-1000. The recursive-influence framing in this system is directly inspired by Digg's approach.
  9. Sun, C. et al. (2024–2025). MIRAI: Evaluating LLM Agents for International Event Forecasting. NeurIPS Datasets & Benchmarks 2025, arXiv:2407.01231. arxiv.org/abs/2407.01231
  10. Bertaglia, T. et al. (2026). Free Access to World News: Reconstructing Full-Text Articles from GDELT. Big Data and Cognitive Computing 10(2): 45. mdpi.com/2504-2289/10/2/45. The gdeltnews Python package referenced as the v1.5 ingestion path.
  11. Federal Reserve Bank of St. Louis. FRED Economic Data API. fred.stlouisfed.org/docs/api. Free programmatic access to 800,000+ macroeconomic time-series. We pull ~10 series daily as the third independent witness alongside the news graph and Polymarket calibration.