Core Concepts
How Nocturnus.AI represents knowledge, reasons about it, and optimizes context to cut your token costs by 97%.
Facts
A fact is the fundamental unit of knowledge: a predicate applied to arguments.
parent(alice, bob) # alice is bob's parent
likes(bob, pizza) # bob likes pizza
NOT allergic(alice, cats) # alice is not allergic to cats
Facts are stored as typed Atom objects in a Hexastore — a 6-way indexed data structure that makes any query pattern fast regardless of which arguments you're searching by.
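As a sketch of why six indexes make every pattern fast: a minimal hexastore keeps one ordering of (subject, predicate, object) per permutation, so any partially bound query becomes a prefix lookup instead of a full scan. This is illustrative Python, not Nocturnus internals; the class and method names are hypothetical.

```python
from itertools import permutations

class Hexastore:
    # One index per ordering of (subject, predicate, object): 6 total.
    ORDERS = list(permutations(("s", "p", "o")))

    def __init__(self):
        self.indexes = {order: set() for order in self.ORDERS}

    def add(self, s, p, o):
        triple = {"s": s, "p": p, "o": o}
        for order in self.ORDERS:
            self.indexes[order].add(tuple(triple[k] for k in order))

    def match(self, s=None, p=None, o=None):
        bound = {k: v for k, v in (("s", s), ("p", p), ("o", o)) if v is not None}
        # Pick an ordering whose prefix covers every bound position,
        # so the lookup is a prefix scan over a sorted-style index.
        order = next(ord_ for ord_ in self.ORDERS
                     if set(ord_[: len(bound)]) == set(bound))
        prefix = tuple(bound[k] for k in order[: len(bound)])
        for entry in self.indexes[order]:
            if entry[: len(prefix)] == prefix:
                yield dict(zip(order, entry))

store = Hexastore()
store.add("alice", "parent", "bob")
store.add("bob", "likes", "pizza")
print(sorted(m["o"] for m in store.match(s="bob", p="likes")))  # ['pizza']
```

Whatever combination of positions is bound (subject only, predicate + object, all three), some permutation's prefix matches it, which is the point of keeping all six.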
Temporal Metadata
Every fact can carry time-related fields:
- validFrom — epoch ms when this fact becomes valid
- validUntil — epoch ms when this fact expires
- ttl — auto-expire after this many milliseconds
- createdAt — when the fact was stored (set automatically)
These enable temporal queries — ask "what was true at 3pm yesterday?" via POST /memory/query/temporal.
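A point-in-time validity check over these fields might look like the following (illustrative only, assuming the epoch-millisecond semantics described above):

```python
def valid_at(fact, t_ms):
    """Return True if the fact holds at time t_ms (epoch milliseconds)."""
    if fact.get("validFrom") is not None and t_ms < fact["validFrom"]:
        return False  # not yet valid
    if fact.get("validUntil") is not None and t_ms >= fact["validUntil"]:
        return False  # already expired
    if fact.get("ttl") is not None and t_ms >= fact["createdAt"] + fact["ttl"]:
        return False  # past its time-to-live
    return True

fact = {"predicate": "on_call", "args": ["alice"],
        "createdAt": 1_000, "validFrom": 1_000, "validUntil": 5_000}
print(valid_at(fact, 3_000))  # True
print(valid_at(fact, 6_000))  # False
```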
Negation
Set truthVal: false (or negated: true via MCP) to store explicit negation. This is different from absence — "we don't know if X" vs. "we know X is false."
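The distinction amounts to a three-valued lookup. The storage shape below is a hypothetical sketch, not Nocturnus's actual representation:

```python
# truthVal maps to True/False; a missing key means "no knowledge either way".
facts = {
    ("likes", ("bob", "pizza")): True,
    ("allergic", ("alice", "cats")): False,  # stored with truthVal: false
}

def lookup(predicate, args):
    return facts.get((predicate, tuple(args)), "unknown")

print(lookup("allergic", ["alice", "cats"]))  # False (known to be false)
print(lookup("allergic", ["bob", "cats"]))    # unknown (simply absent)
```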
Rules
A rule is a Horn clause: a head (conclusion) that is derived when all body conditions are satisfied.
grandparent(?x, ?z) :- parent(?x, ?y), parent(?y, ?z)
Variables use the ? prefix: ?x, ?who, ?name. When Nocturnus evaluates a rule, it finds all variable bindings that satisfy every body atom, then derives the head with those bindings.
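Operationally, evaluating the grandparent rule means joining the parent facts on the shared variable ?y. A minimal illustration of that binding search (not the engine's actual algorithm):

```python
from itertools import product

facts = {("parent", "alice", "bob"), ("parent", "bob", "charlie")}

def derive_grandparents(facts):
    derived = set()
    parents = [f[1:] for f in facts if f[0] == "parent"]
    # grandparent(?x, ?z) :- parent(?x, ?y), parent(?y, ?z)
    for (x, y1), (y2, z) in product(parents, parents):
        if y1 == y2:  # the shared variable ?y must bind consistently
            derived.add(("grandparent", x, z))
    return derived

print(derive_grandparents(facts))  # {('grandparent', 'alice', 'charlie')}
```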
Multi-Body Rules
Rules can have multiple conditions:
eligible_discount(?customer) :-
subscription_tier(?customer, enterprise),
location(?customer, ?city),
partner_city(?city)
All three conditions must be satisfied for the discount to apply.
Inference
Nocturnus.AI has two inference engines:
Backward Chaining (Primary)
Prolog-style SLD resolution. Given a goal like grandparent(?who, charlie), the engine works backward: it finds rules whose heads match the goal, then recursively proves each body atom. This is efficient for answering specific questions.
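The goal-directed search can be sketched as a toy SLD-style prover. This is illustrative only; it omits the variable renaming a production engine needs when a rule appears more than once in a proof:

```python
facts = {("parent", "alice", "bob"), ("parent", "bob", "charlie")}
# grandparent(?x, ?z) :- parent(?x, ?y), parent(?y, ?z)
rules = [(("grandparent", "?x", "?z"),
          [("parent", "?x", "?y"), ("parent", "?y", "?z")])]

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, env):
    # Follow a chain of variable bindings to its final value.
    while is_var(t) and t in env:
        t = env[t]
    return t

def unify(a, b, env):
    if a[0] != b[0] or len(a) != len(b):
        return None
    env = dict(env)
    for t1, t2 in zip(a[1:], b[1:]):
        t1, t2 = walk(t1, env), walk(t2, env)
        if t1 == t2:
            continue
        if is_var(t1):
            env[t1] = t2
        elif is_var(t2):
            env[t2] = t1
        else:
            return None
    return env

def prove(goals, env):
    if not goals:
        yield env
        return
    goal, rest = goals[0], goals[1:]
    for fact in facts:           # satisfy the goal with a stored fact...
        e = unify(goal, fact, env)
        if e is not None:
            yield from prove(rest, e)
    for head, body in rules:     # ...or via a rule: prove its body first
        e = unify(goal, head, env)
        if e is not None:
            yield from prove(list(body) + rest, e)

answers = {walk("?who", e) for e in prove([("grandparent", "?who", "charlie")], {})}
print(answers)  # {'alice'}
```

Note how the engine never enumerates all grandparents: it only explores bindings that can possibly satisfy the specific goal.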
Forward Chaining (Rete Engine)
When a new fact is asserted, the Rete engine automatically fires any matching rules and derives new conclusions. This is useful for reactive patterns — "whenever X happens, derive Y."
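A naive version of this reactive pattern, assuming single-variable rules for brevity (a real Rete network caches partial matches between assertions instead of re-scanning):

```python
facts = set()
# alert(?svc) :- error(?svc), critical(?svc)   (hypothetical rule)
rules = [(("alert", "?svc"), [("error", "?svc"), ("critical", "?svc")])]

def assert_fact(fact):
    """Add a fact, then fire any rule whose body is now fully satisfied."""
    if fact in facts:
        return
    facts.add(fact)
    for head, body in rules:
        binding = fact[1]  # bind the rule's single variable to this argument
        if all((pred, binding) in facts for pred, _ in body):
            assert_fact((head[0], binding))  # derived facts can chain further

assert_fact(("error", "payments"))
assert_fact(("critical", "payments"))   # completes the rule body
print(("alert", "payments") in facts)   # True
```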
Scopes
Facts and rules can be isolated into scopes using a string label. A query in a scope sees both scoped facts and unscoped (global) facts. This is different from multi-tenancy — scopes operate within a tenant for logical isolation.
Common use cases:
- Per-document knowledge — scope facts to a specific document ID
- External system linking — scope facts to an external record ID
- Hypothetical reasoning — test rule outcomes without affecting production data
- Session isolation — each user session gets its own scope
- A/B testing — different scopes hold different rule sets
Example: Scope as a Document ID
Store facts about a specific document and query them in isolation:
# Store facts scoped to a specific invoice
curl -X POST http://localhost:9300/tell \
-H "Content-Type: application/json" \
-d '{
"predicate": "line_item",
"args": ["widget_a", "500"],
"scope": "doc_invoice_2024_001"
}'
curl -X POST http://localhost:9300/tell \
-H "Content-Type: application/json" \
-d '{
"predicate": "vendor",
"args": ["acme_corp"],
"scope": "doc_invoice_2024_001"
}'
# Query only facts about this invoice
curl -X POST http://localhost:9300/ask \
-H "Content-Type: application/json" \
-d '{
"predicate": "vendor",
"args": ["?who"],
"scope": "doc_invoice_2024_001"
}'
# Returns: { "results": ["vendor(acme_corp)"] }
Example: Scope as an External ID
Link knowledge to records in external systems (CRM, JIRA, Salesforce):
# Store facts linked to a Salesforce lead
curl -X POST http://localhost:9300/tell \
-H "Content-Type: application/json" \
-d '{
"predicate": "lead_status",
"args": ["qualified"],
"scope": "ext_salesforce_lead_42"
}'
curl -X POST http://localhost:9300/tell \
-H "Content-Type: application/json" \
-d '{
"predicate": "contact_name",
"args": ["jane_doe"],
"scope": "ext_salesforce_lead_42"
}'
# Query all facts about this lead
curl -X POST http://localhost:9300/ask \
-H "Content-Type: application/json" \
-d '{
"predicate": "lead_status",
"args": ["?status"],
"scope": "ext_salesforce_lead_42"
}'
Truth Maintenance
When a fact is retracted, the Truth Maintenance System (TMS) automatically cascade-retracts any facts that were derived from it. The ProvenanceTracker maintains a dependency graph of which facts support which derived conclusions.
Example: if you retract parent(alice, bob), then grandparent(alice, charlie) — which was derived from it — is also automatically retracted. No manual cleanup needed.
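The cascade amounts to a walk over a dependency graph. A minimal sketch (fact keys here are plain strings for readability; the real ProvenanceTracker stores structured atoms):

```python
facts = {"parent(alice,bob)", "parent(bob,charlie)", "grandparent(alice,charlie)"}
supports = {  # base fact -> conclusions derived from it
    "parent(alice,bob)": {"grandparent(alice,charlie)"},
    "parent(bob,charlie)": {"grandparent(alice,charlie)"},
}

def retract(fact):
    if fact not in facts:
        return
    facts.discard(fact)
    for dependent in supports.get(fact, ()):
        retract(dependent)  # cascade to everything this fact supported

retract("parent(alice,bob)")
print("grandparent(alice,charlie)" in facts)  # False: retracted automatically
```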
Salience & Memory
In long-running agent sessions, facts accumulate fast. Nocturnus scores every fact with a salience value, combining three signals:
- Recency — how recently the fact was accessed
- Frequency — how often it's been queried
- Priority — explicit importance weight
The /memory/context endpoint returns the top-K most salient facts — this is your agent's "working memory." Low-salience facts can be automatically evicted via /memory/decay, and repetitive patterns can be compressed into summaries via /memory/consolidate. For goal-driven optimization that goes further — backward chaining, contradiction detection, and 97% token reduction — see Context Optimization below.
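The three signals can be combined multiplicatively. The exponential half-life and log weighting below are illustrative assumptions, not Nocturnus's documented formula:

```python
import math

def salience(last_access_ms, access_count, priority, now_ms,
             half_life_ms=3_600_000):
    recency = 0.5 ** ((now_ms - last_access_ms) / half_life_ms)  # decays over time
    frequency = math.log1p(access_count)  # diminishing returns on repeat access
    return recency * frequency * priority

now = 1_000_000_000
fresh = salience(now - 60_000, access_count=10, priority=1.0, now_ms=now)
stale = salience(now - 7_200_000, access_count=10, priority=1.0, now_ms=now)
print(fresh > stale)  # True: recently touched facts outrank stale ones
```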
Memory Lifecycle
For long-running agents, Nocturnus manages memory automatically through four stages:
1. Salience Scoring
Every fact gets a composite salience score: recency × frequency × priority. Facts that are accessed often and recently score higher.
2. Context Window
The /memory/context endpoint returns the top-K most salient facts — effectively your agent's "working memory."
3. Consolidation
After many interactions, call /memory/consolidate to detect episodic patterns (e.g., "user asked about X five times") and compress them into semantic summaries.
4. Decay
Call /memory/decay to evict expired facts (past TTL) and low-salience facts (below threshold). Keeps memory lean without manual curation.
# Get agent's context window (top 50 most relevant facts)
curl "http://localhost:9300/memory/context?maxFacts=50"
# Compress episodic patterns into semantic summaries
curl -X POST http://localhost:9300/memory/consolidate
# Evict stale facts
curl -X POST http://localhost:9300/memory/decay \
-H "Content-Type: application/json" \
-d '{"threshold": 0.05}'
Context Optimization
The Context Management Engine is what makes Nocturnus.AI a cost-optimization tool, not just a knowledge base. Instead of stuffing your entire KB into every LLM prompt, the engine delivers only the facts that matter.
How It Works
- Goal resolution — you specify what your agent needs to know (e.g., "is this contract enforceable?"). Backward chaining traces exactly which facts are relevant.
- Salience ranking — relevant facts are scored by recency, frequency, and priority. The most important facts bubble to the top.
- Contradiction detection — conflicting facts are flagged or auto-resolved before reaching the LLM.
- Deduplication — cross-source duplicates are merged to avoid wasting tokens.
- Token-optimized output — the final context window contains only goal-relevant, deduplicated, consistent facts with full provenance.
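The last two stages (deduplication and token budgeting) can be sketched as follows; the per-fact token estimate and field names are assumptions for illustration, not the engine's real accounting:

```python
def optimize(facts, max_tokens):
    # 1. Deduplicate: keep the highest-salience copy of each fact.
    seen, ordered = set(), []
    for fact in sorted(facts, key=lambda f: f["salience"], reverse=True):
        key = (fact["predicate"], tuple(fact["args"]))
        if key in seen:
            continue
        seen.add(key)
        ordered.append(fact)
    # 2. Fill the context window until the token budget is exhausted.
    window, used = [], 0
    for fact in ordered:
        cost = 1 + len(fact["args"])  # crude per-fact token estimate
        if used + cost > max_tokens:
            break
        window.append(fact)
        used += cost
    return window, used

kb = [
    {"predicate": "sla_tier", "args": ["acme_corp", "gold"], "salience": 0.9},
    {"predicate": "sla_tier", "args": ["acme_corp", "gold"], "salience": 0.4},  # duplicate
    {"predicate": "founded", "args": ["acme_corp", "1999"], "salience": 0.1},
]
window, used = optimize(kb, max_tokens=6)
print([f["salience"] for f in window], used)  # [0.9, 0.1] 6
```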
The Cost Impact
A typical knowledge base with 500 facts produces a 150K-token context window when stuffed into a prompt. After optimization, the same query returns 15 goal-relevant facts in ~820 tokens — a reduction of over 97% in tokens billed to OpenAI, Anthropic, or any other LLM provider.
# Optimize context for a specific goal
curl -X POST http://localhost:9300/context/optimize \
-H "Content-Type: application/json" \
-d '{
"goals": [{"predicate": "eligible_for_sla", "args": ["acme_corp"]}],
"maxFacts": 25
}'
# Response: 15 facts, 820 tokens, full provenance
# vs. 150K tokens without optimization
Incremental Diffs
For multi-turn conversations, use POST /context/diff with a sessionId to get only what changed since the last call — further reducing token spend on subsequent requests.