ONTOLOGY
The Graph
What lives inside the knowledge graph. Fifteen domains, thirteen node kinds, eight epistemic statuses, twelve edge relations. A model of what we believe about billing and how strongly we believe it.
Domains are conceptual territories. Each contains the nodes relevant to one aspect of billing — its mechanisms, known facts, open questions, and tracked metrics. Domains overlap at the edges through cross-domain connections.
billing-lifecycle — the overarching domain. Subscription state machines, plan changes, the seven-domain vision, workstream coordination. The hub that other domains connect through.
collection — payment execution, Stripe integration, checkout flows, on-session vs off-session mechanics.
credits — credit notes, proration, balance adjustments.
usage — metered billing, threshold alerts, usage-based charges.
dunning — retry schedules, escalation logic, the dunning brain rewrite.
fraud — bad-debt flags, abuse detection, inconsistent state.
honest-invoicing — invoice accuracy, credit note mechanisms, the 906 migration.
reconciliation — Stripe-to-internal state agreement.
migration — payment method transitions, registrar migrations.
access — entitlements, provisioning, what the customer actually gets.
customer — the experience layer, support ticket patterns, customer-facing pain points.
reference — cross-cutting concepts shared across domains.
Plus two observation domains — bq-observations and digest-observations — that hold raw data points from BigQuery queries and ticket digests. These feed the evidence chains that ground the knowledge nodes.
Thirteen kinds, organized into three layers. The data layer captures what we've observed. The knowledge layer captures what we believe. The governance layer captures what we've decided, what we're worried about, and what we don't know.
datasource — where data comes from (BigQuery projects, Jira pipelines). query — how we extract it (SQL queries that produce metrics). observation — point-in-time measurements. signal — classified patterns from ticket analysis. metric — tracked quantities with freshness and thresholds.
fact — a specific claim about the system ("dunning volume stabilized at 7-9K/month"). thesis — an argument about why something is true ("the vision is incremental, not heroic"). mechanism — how something works (code paths, state machines, processing pipelines). entity — a named thing in the system (a table, a service, a Stripe object).
decision — a choice that was made and why. risk — something that could go wrong. question — something we don't know yet. group — organizational containers that structure the other nodes.
The layering matters. Data nodes are ephemeral — they get refreshed. Knowledge nodes are durable — they accumulate. Governance nodes are intentional — they record human judgment.
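The layering can be made concrete with a small lookup. This is a minimal sketch, not the graph's actual schema: the kind names come from the text, while the `Layer` enum, the `KIND_TO_LAYER` table, and the `layer_of` helper are hypothetical. It also assumes the three groupings of kinds listed above correspond to the data, knowledge, and governance layers in that order.

```python
from enum import Enum


class Layer(Enum):
    DATA = "data"              # refreshed, ephemeral
    KNOWLEDGE = "knowledge"    # accumulates, durable
    GOVERNANCE = "governance"  # records human judgment


# Kind names are from the text; the table itself is an illustrative assumption.
KIND_TO_LAYER = {
    "datasource": Layer.DATA,
    "query": Layer.DATA,
    "observation": Layer.DATA,
    "signal": Layer.DATA,
    "metric": Layer.DATA,
    "fact": Layer.KNOWLEDGE,
    "thesis": Layer.KNOWLEDGE,
    "mechanism": Layer.KNOWLEDGE,
    "entity": Layer.KNOWLEDGE,
    "decision": Layer.GOVERNANCE,
    "risk": Layer.GOVERNANCE,
    "question": Layer.GOVERNANCE,
    "group": Layer.GOVERNANCE,
}


def layer_of(kind: str) -> Layer:
    """Return the layer for a node kind, rejecting kinds outside the ontology."""
    try:
        return KIND_TO_LAYER[kind]
    except KeyError:
        raise ValueError(f"unknown node kind: {kind!r}") from None
```

A closed table like this lets a loader reject a node whose kind is misspelled before it enters the graph.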
Every node carries an epistemic status — a declaration of how well-supported it is. This is the mechanism that makes the graph honest about its own limitations.
measured: directly observed from data. A BigQuery result, a metric reading, an observation node with a timestamp. The strongest epistemic status.
grounded: supported by evidence chains, that is, incoming grounds edges from measured or other grounded nodes. Not directly observed, but traceable to things that were.
Some evidence exists but the chain is incomplete. Partially grounded. Treat with appropriate caution.
Organizational nodes — groups, entity definitions, reference material. Not claims about the world, just scaffolding for the graph.
asserted: claimed based on code reading, architecture analysis, or domain expertise, but not backed by data chains. The bulk of knowledge debt lives here.
speculative: a hypothesis. Might be true. Needs testing. Theses often start here and graduate to grounded as evidence accumulates.
Was grounded, but the supporting data has aged past its freshness threshold. The claim might still be true but can't be confidently cited.
Contradicted by newer evidence. Kept in the graph for historical context but marked as no longer believed.
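The freshness rule for grounded nodes could be checked along these lines. This is a sketch under stated assumptions: the function name, its parameters, and the `"stale"` label are hypothetical, since the text describes the aged-out status without showing how it is stored or computed.

```python
from datetime import datetime, timedelta


def effective_status(
    status: str,
    last_supported_at: datetime,
    freshness: timedelta,
    now: datetime,
) -> str:
    """Demote a grounded node whose supporting data is older than its threshold.

    "stale" is a hypothetical label for the aged-out status described above.
    Other statuses pass through unchanged.
    """
    if status == "grounded" and now - last_supported_at > freshness:
        return "stale"
    return status
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.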
The maturity path runs speculative → asserted → grounded → measured. Knowledge debt is gauged by the number of highly connected nodes still stuck at "asserted": influential claims without evidence.
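One way to surface knowledge debt is to rank asserted nodes by how many edges touch them. The sketch below assumes a node table mapping ids to status strings and a list of `(source, target)` pairs; the function name, the data shapes, and the `min_degree` cutoff of 3 are illustrative assumptions, not the graph's actual scoring.

```python
from __future__ import annotations

from collections import Counter


def knowledge_debt(
    nodes: dict[str, str],
    edges: list[tuple[str, str]],
    min_degree: int = 3,  # hypothetical cutoff for "high connectivity"
) -> list[str]:
    """Return asserted node ids with degree >= min_degree, most connected first."""
    degree: Counter[str] = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    debt = [
        node_id
        for node_id, status in nodes.items()
        if status == "asserted" and degree[node_id] >= min_degree
    ]
    # Sort by descending degree, then by id so ties are stable.
    return sorted(debt, key=lambda node_id: (-degree[node_id], node_id))
```

Degree here counts both incoming and outgoing edges; a stricter variant might count only outgoing edges, on the theory that debt matters most where other claims depend on the asserted one.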
Twelve relation types define how nodes connect. Each edge has a type, a weight, and an optional note. The type determines what the connection means.
contains — group membership. A domain contains facts. A group contains mechanisms.
grounds — epistemic support. An observation grounds a fact. A metric grounds a thesis. This is the edge that evidence chains follow.
triggers — one thing causes another. enables — one thing makes another possible. complicates — one thing makes another harder.
implements — code that realizes a mechanism. reads / writes — data access patterns. tracks — a metric that monitors a fact.
predicts — a thesis makes a testable prediction. The edge that connects hypotheses to their validation criteria.
replaces — one thing supersedes another. contradicts — two things disagree. The edges that surface conflict rather than hiding it.
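An edge record with a type, a weight, and an optional note might look like the following. The twelve relation names are taken from the list above; the `Edge` class, its field names, and the default weight of 1.0 are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# The twelve relation types named in the text.
RELATIONS = frozenset({
    "contains", "grounds", "triggers", "enables", "complicates",
    "implements", "reads", "writes", "tracks", "predicts",
    "replaces", "contradicts",
})


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: str
    weight: float = 1.0         # default weight is an assumption
    note: Optional[str] = None  # optional free-text annotation

    def __post_init__(self) -> None:
        # Reject relation types that are not part of the ontology.
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation!r}")
```

For example, `Edge("obs-1", "fact-1", "grounds", note="monthly dunning volume")` records that an observation supports a fact, while a misspelled relation raises a `ValueError` at construction time.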
The edge types aren't arbitrary. They encode the kind of reasoning the graph supports. The grounds edge enables evidence tracing. The contradicts edge enables conflict detection. The predicts edge enables thesis testing. The ontology is designed for epistemic operations, not just storage.
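Two of these epistemic operations can be sketched over a plain edge list. The code below assumes edges are `(source, relation, target)` triples and statuses live in a dict keyed by node id; both shapes and both function names are hypothetical. Evidence tracing walks grounds edges backwards until it reaches measured nodes, and conflict detection simply filters for contradicts edges.

```python
from __future__ import annotations


def evidence_roots(
    node: str,
    status: dict[str, str],
    edges: list[tuple[str, str, str]],
) -> set[str]:
    """Return the measured nodes reachable by following grounds edges backwards."""
    incoming: dict[str, list[str]] = {}
    for source, relation, target in edges:
        if relation == "grounds":
            incoming.setdefault(target, []).append(source)

    roots: set[str] = set()
    seen = {node}
    stack = [node]
    while stack:
        current = stack.pop()
        if status.get(current) == "measured":
            roots.add(current)  # a measured node ends the chain
            continue
        for source in incoming.get(current, []):
            if source not in seen:  # guards against cycles
                seen.add(source)
                stack.append(source)
    return roots


def conflicts(edges: list[tuple[str, str, str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs joined by a contradicts edge."""
    return [(s, t) for s, relation, t in edges if relation == "contradicts"]
```

An empty result from `evidence_roots` means no chain of grounds edges leads back to observed data, which is one signal that a node's claimed status deserves a second look.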
The graph is a model, not a database. It captures belief, not just data.