LLM Integration

Use LLMs to extract facts from natural language, synthesize grounded answers, and cut context costs by 97% with goal-driven optimization.

How it works: The LLM layer is optional. Core fact storage, rules, and inference work without any LLM. When enabled, the LLM handles translation between natural language and structured predicates — but all retrieval and reasoning remains deterministic. Combined with the Context Management Engine, this means your LLM calls use 97% fewer tokens — the extraction is smart, and the context delivery is optimized.

Supported Providers

Nocturnus supports 5 LLM providers. Set one environment variable to enable:

| Provider | Environment Variable | Models |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | Claude Sonnet, Haiku, Opus |
| OpenAI | OPENAI_API_KEY | GPT-4o, GPT-4, GPT-3.5 |
| Google | GOOGLE_API_KEY | Gemini Pro, Flash |
| Ollama | LLM_BASE_URL | Any local model (Llama, Mistral, etc.); set to http://localhost:11434/v1 |
| OpenAI-Compatible | LLM_BASE_URL + LLM_API_KEY | Groq, Mistral, DeepSeek, etc. |

Nocturnus checks for keys in the order listed above and uses the first one found.
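
The selection order above can be sketched as a simple first-match scan over environment variables. This is an illustrative sketch, not Nocturnus source code; the variable names come from the table, but the resolution logic here is an assumption about how first-match detection works.

```python
import os

# Detection order mirrors the provider table above.
PROVIDER_ENV_ORDER = [
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("google", "GOOGLE_API_KEY"),
    ("ollama/openai-compatible", "LLM_BASE_URL"),
]

def detect_provider(env=None):
    """Return the first provider whose environment variable is set."""
    env = os.environ if env is None else env
    for provider, var in PROVIDER_ENV_ORDER:
        if env.get(var):
            return provider
    return None  # LLM layer stays disabled; core storage and rules still work

# Example: if both keys are present, Anthropic wins because it is checked first.
winner = detect_provider({"ANTHROPIC_API_KEY": "a", "OPENAI_API_KEY": "b"})
```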


Fact Extraction — POST /extract

Feed Nocturnus plain English text and the LLM extracts structured facts (and optionally rules). With assert: true, extracted facts are automatically stored in the knowledge base.

Request

{
  "text": "Acme Corp is on the enterprise plan and has been a customer since 2019. They are based in Austin, Texas.",
  "assert": true,
  "rules": false,
  "scope": null,
  "context": null
}
| Field | Type | Default | Description |
|---|---|---|---|
| text | string | (required) | Plain text to extract facts from |
| assert | boolean | false | Auto-store extracted facts in the KB |
| rules | boolean | false | Also extract logical rules from the text |
| scope | string? | null | Scope for asserted facts |
| context | string? | null | Additional context to help the LLM interpret the text |
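
A minimal client sketch for this endpoint, using only the standard library. The field names follow the table above; the base URL and port are assumptions, so the network call is left commented out.

```python
import json
import urllib.request

def build_extract_request(text, assert_facts=False, rules=False, scope=None, context=None):
    """Build the JSON body for POST /extract (fields per the table above)."""
    return {
        "text": text,
        "assert": assert_facts,  # "assert" is a Python keyword, hence the rename
        "rules": rules,
        "scope": scope,
        "context": context,
    }

def extract(base_url, body):
    """POST the body to /extract and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/extract",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = build_extract_request(
    "Acme Corp is on the enterprise plan and has been a customer since 2019.",
    assert_facts=True,
)
# extract("http://localhost:8080", body)  # hypothetical base URL
```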

Response

{
  "facts": [
    { "predicate": "subscription_tier", "args": ["acme_corp", "enterprise"] },
    { "predicate": "customer_since", "args": ["acme_corp", "2019"] },
    { "predicate": "location", "args": ["acme_corp", "austin_texas"] }
  ],
  "rules": [],
  "asserted": true,
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514"
}

Batch Extraction

Process multiple texts in one call with POST /extract/batch:

{
  "texts": [
    "Alice is a senior engineer at Acme.",
    "Bob manages the sales team."
  ],
  "assert": true
}
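
One common way to feed the batch endpoint is to split a longer document into paragraphs, one batch item each. A small sketch (the splitting heuristic is an illustrative choice, not part of the API):

```python
def paragraphs(document):
    """Split a document into non-empty paragraphs for batch extraction."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]

def build_batch(texts, assert_facts=True):
    """Body for POST /extract/batch, matching the example above."""
    return {"texts": list(texts), "assert": assert_facts}

doc = "Alice is a senior engineer at Acme.\n\nBob manages the sales team."
payload = build_batch(paragraphs(doc))
```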

Answer Synthesis — POST /synthesize

Ask a natural language question. Nocturnus translates it into logic queries, finds matching facts, and uses the LLM to synthesize a grounded answer — with full derivation showing which facts were used.

Request

{
  "question": "What plan is Acme Corp on and where are they located?",
  "scope": null
}

Response

{
  "answer": "Acme Corp is on the enterprise plan and is based in Austin, Texas.",
  "derivation": [
    { "fact": "subscription_tier(acme_corp, enterprise)", "type": "fact_match" },
    { "fact": "location(acme_corp, austin_texas)", "type": "fact_match" }
  ],
  "missingContext": "",
  "confidence": 0.95,
  "queriesExecuted": [
    "subscription_tier(acme_corp, ?tier)",
    "location(acme_corp, ?loc)"
  ],
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514"
}

Grounded, not hallucinated. The derivation field shows exactly which stored facts supported the answer. If no facts match, the missingContext field explains what's missing — the LLM won't make up an answer.
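
In client code, the derivation and confidence fields can gate whether an answer is trusted. A sketch of such a check; the 0.8 threshold is an illustrative choice, not a documented default:

```python
def grounded(response, min_confidence=0.8):
    """Accept an answer only if it cites at least one stored fact and
    clears a confidence floor; otherwise fall back to missingContext."""
    return bool(response["derivation"]) and response["confidence"] >= min_confidence

sample = {
    "answer": "Acme Corp is on the enterprise plan.",
    "derivation": [
        {"fact": "subscription_tier(acme_corp, enterprise)", "type": "fact_match"}
    ],
    "missingContext": "",
    "confidence": 0.95,
}
# grounded(sample) -> True; an empty derivation would return False
```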

How Synthesis Works

  1. Schema discovery — Nocturnus finds all predicates in the KB
  2. Query translation — the LLM maps the question to logical query patterns
  3. Fact retrieval — queries are executed via direct lookup and backward chaining
  4. Answer synthesis — the LLM composes a natural language answer from the matched facts
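
Steps 1 and 3 are deterministic and can be illustrated in miniature with a toy fact store and pattern matcher. This is a simplified model of the behavior described above, not Nocturnus internals, and it omits backward chaining:

```python
# Toy knowledge base: (predicate, args) tuples.
FACTS = [
    ("subscription_tier", ("acme_corp", "enterprise")),
    ("location", ("acme_corp", "austin_texas")),
]

def schema():
    """Step 1: discover which predicates exist in the KB."""
    return sorted({pred for pred, _ in FACTS})

def query(pattern):
    """Step 3 (direct lookup only): match a pattern such as
    ('location', ('acme_corp', '?loc')). Terms starting with '?' are
    variables; all other terms must match exactly."""
    pred, args = pattern
    results = []
    for fpred, fargs in FACTS:
        if fpred != pred or len(fargs) != len(args):
            continue
        binding = {}
        for q, f in zip(args, fargs):
            if q.startswith("?"):
                binding[q] = f
            elif q != f:
                break
        else:
            results.append(binding)
    return results
```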

LLM + Context Optimization Loop

For multi-turn conversations, combine extraction and context APIs:

  1. Use /extract (or /context/ingest) to convert raw language into facts.
  2. Use /context/optimize with a stable sessionId to build the prompt window.
  3. Use /context/diff on subsequent turns so only changes are sent to the model.
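
The loop above can be sketched as payload construction across turns. Only sessionId is confirmed by this page; the goal field name and the body shapes are assumptions for illustration:

```python
def optimize_body(session_id, goal):
    """Sketch of a POST /context/optimize body. The 'goal' field name
    is an assumption; sessionId comes from the steps above."""
    return {"sessionId": session_id, "goal": goal}

def diff_body(session_id):
    """Sketch of a POST /context/diff body for subsequent turns."""
    return {"sessionId": session_id}

# The same stable sessionId ties the turns together.
turn1 = optimize_body("support-chat-42", "What plan is Acme on?")
turn2 = diff_body("support-chat-42")
```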

Configuration

Additional env vars for tuning LLM behavior:


Cost Optimization

Every LLM call costs money: frontier models such as GPT-4o and Claude Sonnet charge on the order of $10-15 per million output tokens. NocturnusAI reduces your bill in two ways:

  1. Smart extraction: /extract converts unstructured text into compact structured facts. A 500-word paragraph becomes 3-5 predicate/args pairs.
  2. Goal-driven context: /context/optimize delivers only the facts your agent needs for a specific question. 500 facts become 15, so roughly 820 tokens go out instead of 150K.
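
The arithmetic behind these figures, using the numbers stated above (the per-million rate is an illustrative frontier-model price):

```python
# Fact-level filtering: 15 of 500 facts survive optimization.
facts_total, facts_needed = 500, 15
fact_reduction = 1 - facts_needed / facts_total  # 0.97, i.e. "97% fewer"

# Token-level effect for a single request.
tokens_raw, tokens_optimized = 150_000, 820
price_per_million = 15.0  # USD per million tokens, illustrative
cost_raw = tokens_raw / 1_000_000 * price_per_million        # $2.25 per call
cost_optimized = tokens_optimized / 1_000_000 * price_per_million
```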

At scale (1,000 requests/hour), this is the difference between $54,000/month and $240/month in LLM costs.

Works with any provider. Context optimization happens before the LLM call. Whether you use OpenAI, Anthropic, Google, or Ollama, you save the same 97% on context tokens.

What's Next?

Quickstart →

Use natural language mode with LLM extraction

API Reference →

Extract and synthesize endpoint details

Integrations →

Connect to LangChain, CrewAI, and more