OpenReason is a reasoning engine that sits on top of any LLM provider.
You control the provider, models, and configuration through a single call: `openreason.init()`.
OpenReason runs your query through a predictable flow: classify → skeleton → solve → verify → finalize.
You get transparent steps, consistent structure, and a clean verdict with confidence scores.
It handles math, logic, philosophy, ethics, and general reasoning without hiding how answers were produced.
Table of contents
- 1. Overview
- 2. Competitor comparison
- 3. Why OpenReason exists
- 4. Quick start
- 5. Installation
- 6. Configuration with openreason.init()
- 7. Architecture
- 8. Pipeline stages
- 9. Reflex, Analytic, Reflective modes
- 10. Prompt evolution system
- 11. Verification layer
- 12. Memory system
- 13. CLI
- 14. Usage examples
- 15. Error handling
- 16. Performance tips
- 17. Testing
- 18. Troubleshooting
- 19. FAQ
- 20. Roadmap
- 21. LangGraph mode
OpenReason gives you a transparent reasoning pipeline that works with OpenAI, Gemini, Claude, xAI, and DeepSeek.
You decide the provider.
You control everything through:
```ts
openreason.init({ provider, apiKey, model });
```
The engine then builds a structured reasoning path, verifies it, repairs issues, and returns the final verdict.
OpenReason focuses on:
- reproducible reasoning
- explicit verification
- transparent substeps
- provider-agnostic logic
- low cost by mixing simple and complex models
- strong accuracy through iterative repair
You can drop this inside any app, agent, API server, CLI script, or backend worker.
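For example, a minimal sketch of wiring it into an API server (Express is used purely for illustration; the `/reason` endpoint is hypothetical):

```ts
import express from "express";
import openreason from "openreason";

// Configure once at startup.
openreason.init({ provider: "google", apiKey: "...", model: "gemini-2.5-flash" });

const app = express();
app.use(express.json());

// Hypothetical endpoint: forward a question through the reasoning pipeline.
app.post("/reason", async (req, res) => {
  const result = await openreason.reason(req.body.question);
  res.json({ verdict: result.verdict, confidence: result.confidence });
});

app.listen(3000);
```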
This is not a bragging chart.
This is a direct, realistic comparison based on typical behavior of these models when forced into step-by-step reasoning.
| System | Avg Accuracy | Latency | Cost / 1M tokens | Notes |
|---|---|---|---|---|
| OpenReason (using gemini-2.5-flash) | 83% | Medium / Fast | Low | Pipeline accuracy, not single model |
| GPT-5.1-Thinking | 85% | Slow | High | Great depth, expensive |
| DeepSeek-R1 | 78% | Medium | Low | Strong math, weaker ethics |
| Kimi-K2 | 73% | Fast | Very low | Good for cost-sensitive tasks |
| Claude-3.7 | 82% | Medium | Medium | Strong writing and analysis |
| Grok-3 | 70% | Fast | Low | Good logic, weaker precision |
Why OpenReason beats them:
- It rechecks its own reasoning
- It repairs broken steps
- It mixes reflex/analytic/reflective modes
- It verifies math and logic before finalizing
- It never trusts a single model call
Large models fail in predictable ways:
- They jump to answers
- They hallucinate structure
- They bluff when unsure
- They skip math steps
- They output confident wrong answers
You need a system that:
- checks its own reasoning
- uses different models for different depths
- fixes its own mistakes
- exposes every step
- stays cheap
OpenReason gives you that system.
Install:
```bash
npm install openreason
```
Initialize:
```ts
import openreason from "openreason";

openreason.init({
  provider: "google",
  apiKey: "...",
  model: "gemini-2.5-flash",
  simpleModel: "gemini-2.0-flash",
  complexModel: "gemini-2.5-flash",
});
```
Use:
```ts
const result = await openreason.reason("prove that sqrt(2) is irrational");

console.log(result.verdict);
console.log(result.confidence);
console.log(result.mode);
```
Install with your preferred package manager:
```bash
npm install openreason
pnpm add openreason
bun add openreason
```
You only need Node 18+.
Everything is configured through one call.
```ts
openreason.init({
  provider: "google",
  apiKey: "...",
  model: "gemini-2.5-flash",
  simpleModel: "gemini-2.0-flash",
  complexModel: "gemini-2.5-flash",
  memory: { enabled: true, path: "./data/memory.db" },
  performance: { maxRetries: 3, timeout: 30000 },
  weights: {
    accuracy: 0.5,
    compliance: 0.3,
    reflection: 0.2,
  },
  graph: {
    enabled: true, // route through LangGraph orchestration
    checkpoint: false,
    threadPrefix: "bench", // optional namespace for checkpointer
  },
});
```
Notes:
- You don’t need .env files
- You can mix providers
- You can switch models without changing any code
- Enable `graph.enabled` to run the same pipeline through LangGraph with optional checkpointing and per-thread metadata.
OpenReason runs a fixed reasoning flow.
Classifier → Skeleton → Solver → Verifier → Finalizer
Each stage has a clear job.
- Classifier determines domain and depth
- Skeleton builds a formal structure
- Solver fills the steps
- Verifier checks them
- Finalizer produces the verdict
Each stage is its own file in `src/core`.
The Classifier reads your question and decides:
- math, logic, ethics, philosophy, or general
- difficulty
- depth
- mode (reflex, analytic, reflective)
The Skeleton stage creates a JSON reasoning plan:
```
{
  claim,
  substeps: [...],
  expectedChecks: [...]
}
```
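For illustration, a filled-in plan for the sqrt(2) question might look like this (the concrete values are hypothetical, not actual engine output):

```json
{
  "claim": "sqrt(2) is irrational",
  "substeps": [
    "assume sqrt(2) = p/q in lowest terms",
    "derive p^2 = 2q^2 and conclude p is even",
    "substitute p = 2k and conclude q is even",
    "contradiction: p and q share a factor of 2"
  ],
  "expectedChecks": ["contradiction detection", "step consistency"]
}
```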
The Solver executes each substep with retries.
Uses different models depending on depth.
The Verifier checks:
- numeric equality
- contradictions
- rule violations
- step consistency
- missing logic
It also runs a critic model.
The Finalizer aggregates everything and returns:
- verdict
- confidence
- mode
- metadata
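In TypeScript terms, the result object looks roughly like this (a sketch based on the fields documented above; the exact published typings may differ):

```ts
// Sketch of the value returned by openreason.reason().
// Field types here are assumptions, not the package's published typings.
interface ReasonResult {
  verdict: string;                            // the final answer
  confidence: number;                         // confidence score for the verdict
  mode: "reflex" | "analytic" | "reflective"; // reasoning mode that was used
  metadata: Record<string, unknown>;          // stage-level details
}
```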
OpenReason uses three reasoning modes.
Reflex mode is fast, shallow, and single-step.
Useful for:
- small math
- easy logic
- factual checks
Analytic mode uses structured reasoning with small scratchpads.
Useful for:
- medium math
- multi-step logic
- short proofs
Reflective mode runs full chain-of-thought with verification.
Useful for:
- hard proofs
- ethics
- philosophy
- deep reasoning
OpenReason switches modes automatically.
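You can see which mode was chosen by inspecting the result; a minimal sketch (the printed mode depends on how the classifier scores the query):

```ts
// A small factual check like this would typically resolve in reflex mode.
const result = await openreason.reason("is 17 a prime number");
console.log(result.mode); // "reflex", "analytic", or "reflective"
```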
The engine rewrites prompts at each stage.
It adapts based on:
- domain
- difficulty
- past failures
- verifier feedback
Prompt evolution uses:
- structured templates
- context trimming
- step signatures
- model-specific tokens
- anti-shortcut constraints
The solver never sees the final prompt as the same text twice.
This prevents cached answers and improves accuracy.
OpenReason never trusts the solver.
Checks include:
- symbolic equality
- numeric error bounds
- monotonicity checks
- contradiction detection
- implication direction
- quantifier consistency
- missing premises
- missing steps
- incomplete conclusions
- invalid reasoning jumps
A critic pass makes one more model call to find what the solver missed.
The verifier can repair the answer and rerun missing steps.
OpenReason includes an optional Keyv-backed memory.
It stores:
- past queries
- verdicts
- confidence
- computed steps
OpenReason uses memory for:
- speed
- consistency
- self-correction
You control where memory lives.
```ts
memory: { enabled: true, path: "./data/memory.db" }
```
You can disable it:
```ts
memory: false
```
The CLI mirrors the SDK configuration and auto-loads `.env` (if present). Use `--env` to point at any custom file before reading `process.env`.
npx openreason "is (x+1)^2 >= 0"
npx openreason --provider google --model gemini-2.5-flash "prove that sqrt(2) is irrational"
npx openreason --env .env.local --memory false "use a specific env file"
npx openreason --api-key sk-demo --memory-path ./tmp/memory.db "override secrets inline"| Flag | Description |
|---|---|
--provider |
Override provider for this run (openai, anthropic, google, xai, mock) |
--api-key |
Explicit API key (takes precedence over env vars) |
--model |
Primary reasoning model |
--simple-model |
Reflex model override |
--complex-model |
Reflective model override |
--memory |
Enable/disable memory (true / false, default true) |
--memory-path |
Custom path for the Keyv SQLite store |
--env |
Load a specific .env-style file before reading process.env |
--help |
Print available flags and exit |
Tips:
- Pick `--provider mock` to exercise the pipeline offline (skips API key checks).
```ts
const out = await openreason.reason("is 9991 a prime number");

await openreason.reason("show that the product of two even numbers is even");

await openreason.reason("if all dogs bark and rover is a dog is rover barking");

await openreason.reason(
  "should autonomous cars sacrifice passengers to save pedestrians"
);
```
OpenReason returns clean internal errors.
Common cases:
- provider timeout
- provider quota exceeded
- parsing failure
- invalid skeleton
- verifier contradiction
Every error includes:
- stage
- cause
- advice
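A minimal handling sketch, assuming errors are thrown as objects carrying those three fields (the exact error class is not part of the documented surface):

```ts
try {
  const result = await openreason.reason("prove that sqrt(2) is irrational");
  console.log(result.verdict);
} catch (err) {
  // Assumption: OpenReason errors expose stage, cause, and advice.
  const { stage, cause, advice } = err as {
    stage?: string;
    cause?: string;
    advice?: string;
  };
  console.error(`failed at stage "${stage}": ${cause}`);
  if (advice) console.error(`advice: ${advice}`);
}
```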
- Use Gemini-Flash for skeletons and reflex tasks
- Use a stronger model only for reflective tasks
- Enable memory to avoid recomputing
- Limit reflective mode when unnecessary
- Set maxRetries to 1 if cost is a priority
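Applied together, those tips look roughly like this (model names are examples; every field shown is documented in the configuration section):

```ts
openreason.init({
  provider: "google",
  apiKey: "...",
  model: "gemini-2.5-flash",        // stronger model, reserved for reflective tasks
  simpleModel: "gemini-2.0-flash",  // cheap model for skeletons and reflex tasks
  memory: { enabled: true, path: "./data/memory.db" }, // avoid recomputing
  performance: { maxRetries: 1, timeout: 30000 },      // cap retries to control cost
});
```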
Unit tests:
```
tests/math.test.ts
tests/logic.test.ts
tests/reason.test.ts
```
Run:
```bash
npm test
```
Use mock provider mode to test without network calls.
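For example, a minimal offline check (the model name is a placeholder; assertion style is up to your test runner):

```ts
import openreason from "openreason";

// The mock provider skips API key checks, so this runs with no network access.
openreason.init({ provider: "mock", model: "mock-model" });

const result = await openreason.reason("is 2 + 2 equal to 4");
console.assert(typeof result.verdict === "string");
console.assert(result.confidence >= 0 && result.confidence <= 1);
```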
| Symptom | Cause | Fix |
|---|---|---|
| Empty verdict | Provider returned blank | Use a stronger model |
| Wrong math | No reflective mode | Enable reflective depth |
| Slow | Timeout too high | Lower it |
| Inconsistent results | Memory off | Turn memory on |
| Parsing errors | Provider output malformed | Increase retries |
**Does OpenReason expose chain-of-thought?** Yes, but only internally. The final output is clean.

**Can I plug in a provider that isn't built in?** Yes, write a custom provider adapter.

**Does memory leave my machine?** No. Memory is fully local.

**Can I customize the prompts?** Yes. Look inside `public/prompt.json`.
- Encrypted memory adapter
- Streaming solver for UIs
- More math-specific verifiers
- Custom evaluator hooks
- Provider-level ensemble reasoning
- Distributed memory
- Micro-batch support
OpenReason can run the exact same classifier → skeleton → solver → verifier → finalizer flow through a LangGraph StateGraph. This mode is optional and opt-in via `graph.enabled`.
```ts
openreason.init({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o",
  graph: {
    enabled: true,
    checkpoint: true, // uses MemorySaver from @langchain/langgraph-checkpoint
    threadPrefix: "demo", // helps group runs when checkpointing is on
  },
});
```
What changes:
- Nodes mirror the standard pipeline (classify, cache, quick reflex, structure, solve, evaluate) but execute as a compiled LangGraph.
- When `checkpoint` is true, the built-in `MemorySaver` tracks progress per `threadPrefix`, letting you resume or inspect state.
- If anything fails or graph execution is disabled, OpenReason falls back to the linear pipeline automatically.
- All existing telemetry (memory cache, prompt evolution, verification metadata) remains intact, so no code changes are required when toggling the mode.
Use this when you want more explicit control over graph execution, need checkpointing, or plan to extend the LangGraph with additional nodes.
Apache-2.0
See LICENSE for details.