
MIMER Is Mythical Enhanced Reasoning
Contents
- 1 A Teachable, Agentic AI Knowledge System
- 2 The Core: A RAG Pipeline, Not a Chat Wrapper
- 3 Teachable: Feed It Anything
- 4 Reasoning Strategies & Chain of Thought
- 5 Multi-Provider LLM Support
- 6 Server & Client Modes
- 7 Skills: Pluggable Specialist Modules
- 8 Agentic Capabilities
- 9 The Self-Improvement System
- 10 Vision: See and Understand Images
- 11 Image Generation
- 12 Chart & Diagram Generation
- 13 Structured Reports & Comparisons
- 14 The Web Chat Interface
- 15 It Knows What It Is
- 16 What’s Under the Hood
- 17 Why I Built This
- 18 Release and price?
A Teachable, Agentic AI Knowledge System
I’ve been building something for a while now, and it’s time to show it.
theMIMER is a self-contained AI knowledge system. Not a chatbot wrapper. Not a thin skin on top of an API. It’s a full agentic RAG pipeline with multi-provider LLM support, pluggable skills, vision, image generation, diagram generation, reasoning transparency, self-improvement capabilities, and a web-based chat interface that ties it all together.
The name comes from Norse mythology. Mímir (in Danish Mímer) was the wisest of the Æsir — the guardian of the Well of Wisdom beneath the world tree Yggdrasil. Odin himself sacrificed an eye for a single drink from Mímir’s well. After Mímir was beheaded in the Æsir-Vanir war, Odin preserved the head with herbs and magic, and it continued to counsel him with secret knowledge and wisdom. The logo reflects that: one glowing cyan eye (knowledge), one amber eye (Odin’s sacrifice), the flowing beard, and the neural network sparks around the neck — ancient wisdom fused with modern AI.
theMIMER stands for MIMER Is Mythical Enhanced Reasoning, and the idea is exactly that: a system that you teach with your own knowledge, and that reasons over what it actually knows rather than what some foundation model was trained on two years ago.
Let me walk you through what it does.
But first — since we live in a world of buzzwords and I know you’re wondering — here’s what theMIMER touches:

Now let’s get into the real stuff.
The Core: A RAG Pipeline, Not a Chat Wrapper
At the heart of theMIMER is a Retrieval-Augmented Generation pipeline. When you ask a question, it doesn’t just fire off a prompt to an LLM and hope for the best. It goes through a multi-stage process:
1. Router — Your question is classified. The router determines the intent, the complexity, and which approach is most likely to produce a good answer. This classification runs on a local or remote model (your choice), and the raw router output is visible in the pipeline inspector.
2. Skill Selection — Based on the router’s classification, theMIMER selects a specialist skill. There are skills for general Q&A, code analysis, diagram generation, comparison reports, vision tasks, and more. Skills are pluggable — you can add your own.
3. Planning — The system builds an execution plan. This includes determining the intent refinement, whether a code agent is needed, which context filters to apply, and even a “mood” analysis of the conversation tone. The plan is fully visible.
4. Context Retrieval — This is the RAG step. theMIMER searches its vector store for chunks relevant to your question — source code, documentation, PDFs, markdown, whatever you’ve fed it. The retrieved chunks (with their relevance scores) form the grounding context for the answer.
5. Code Agent — For questions that involve source code, an agent analyzes the retrieved files in detail, tracing through logic, identifying patterns, and understanding relationships between classes and methods.
6. Answer Generation — The LLM generates the answer, but now it has real context — not internet hearsay, but your actual data. This is where chain-of-thought reasoning happens, and you can inspect the reasoning process via the “Show reasoning” panel.
7. Post-Processing — The output is corrected for format and skill-specific rules. Mermaid diagrams get validated, code blocks get formatted, reports get structured.

The pipeline inspector expanded — every stage is visible: Router, RouterRaw, Skills, Plan, Execute, Answer, CodeAgent, and PostProcess. Each shows what model was used, what was decided, and what was produced.
This transparency is a core design principle. You should never have to wonder “where did this answer come from?” You can trace it back from the final response through every pipeline stage to the exact chunks that were retrieved.
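To make the staged design concrete, here is a minimal sketch in Python (theMIMER itself is Delphi, and the stage names and trace format here are invented for illustration): each stage receives the running state, records what it decided into an inspector-style trace, and hands the state on.

```python
# Illustrative sketch of a staged, traceable pipeline. Not theMIMER's
# actual code; stage names and trace entries are invented.
from dataclasses import dataclass, field

@dataclass
class PipelineState:
    question: str
    trace: list = field(default_factory=list)  # (stage, decision) pairs
    answer: str = ""

def router(state):
    # Classify intent/complexity; the raw output stays visible in the trace.
    state.trace.append(("Router", "intent=qa, complexity=low"))
    return state

def retrieve(state):
    # RAG step: fetch grounding chunks with their relevance scores.
    state.trace.append(("Retrieve", "3 chunks, scores [0.91, 0.87, 0.80]"))
    return state

def generate(state):
    # Answer generation over the retrieved context.
    state.answer = f"Grounded answer to: {state.question}"
    state.trace.append(("Answer", "model=local-9b"))
    return state

def run_pipeline(question, stages):
    state = PipelineState(question)
    for stage in stages:
        state = stage(state)
    return state

result = run_pipeline("How does hit-testing work?", [router, retrieve, generate])
# result.trace now lets you walk every stage back from the final answer.
```

The point of the trace list is exactly the transparency principle above: the answer and the decision trail travel together through every stage.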
Teachable: Feed It Anything
theMIMER is not hardwired to any specific domain. It’s a teachable system. You feed it your knowledge — whatever that is — and it learns it.
Source code? Sure. API documentation? Yes. Internal wiki pages, PDF manuals, research papers, product specs, legal documents, training materials? All of it. theMIMER chunks your content, generates vector embeddings, and stores them in its retrieval store. When you ask a question, it retrieves the relevant pieces and reasons over them.
This means you can point theMIMER at a Delphi codebase and ask it framework-level architecture questions — or you can point it at a medical knowledge base and ask it about drug interactions. The same pipeline, the same reasoning, the same grounding — different knowledge.
The key insight: theMIMER doesn’t “know” things because a foundation model was trained on them. It knows things because you explicitly gave it the knowledge, and it retrieves that knowledge at query time. That’s what makes RAG fundamentally different from pure LLM inference — and it’s what turns hallucinations from a constant nuisance into a rare edge case, depending on the choice of LLM behind it.
However, all these examples were produced with a 9B Qwen3 LLM and a 1.5B Qwen 2.5 LLM handling different tasks. For even better results, you can ask it to send complex questions to an even more capable LLM, like Claude, Gemini, OpenAI models, or others.
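The chunk-embed-retrieve mechanism behind this can be sketched in a few lines of Python. This is a toy, not theMIMER's real embedding model: a bag-of-letters vector stands in for a neural embedding, but the shape of the process — embed the chunks once, embed the query at ask time, rank by cosine similarity — is the same.

```python
# Toy retrieve-then-generate sketch. The embedding is a deliberately crude
# bag-of-letters vector; a real system uses a neural embedding model.
import math

def embed(text):
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Invented example chunks standing in for ingested docs and source code.
chunks = [
    "TViewport maps mouse coordinates to text positions",
    "The ingestion queue chunks PDFs and markdown",
    "Stable Diffusion handles image generation",
]
store = [(c, embed(c)) for c in chunks]  # embed once, at ingestion time

def retrieve(query, k=2):
    # Embed the query and return the k most similar chunks as context.
    qv = embed(query)
    ranked = sorted(store, key=lambda cv: cosine(qv, cv[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

context = retrieve("How are mouse coordinates mapped?")
```

The retrieved `context` is what gets handed to the LLM — which is why the answer is grounded in your data rather than in training-set memory.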
Reasoning Strategies & Chain of Thought
Not every question needs the same thinking approach. A simple factual lookup is different from a multi-step code architecture question, which is different from a creative comparison report. theMIMER supports multiple reasoning strategies and selects them dynamically based on what the router classifies.
The “Show reasoning” panel at the top of every answer lets you see exactly how the model thought through its response. This isn’t just a debug feature — it’s a trust feature. You can see the chain of thought, verify that the reasoning makes sense, and catch cases where the model might be going off track.

The reasoning panel (collapsed here, expandable) sits above every answer. Below it, a Mermaid pie chart generated from retrieved context — with a “Copy source” button to grab the raw Mermaid markup.
Multi-Provider LLM Support
theMIMER is not locked to a single LLM provider. It supports multiple backends — and different stages of the pipeline can use different models. You might run the router on a fast local model for speed, but use a larger cloud model for the final answer generation where quality matters most.
The provider layer not only loads GGUF models itself, it also speaks the OpenAI-compatible API protocol, which means it works with a wide range of backends out of the box: OpenAI, Anthropic, local models via Ollama or LM Studio, or anything else that implements the standard chat completions endpoint.
This is a practical choice. Model capabilities and pricing change constantly. You shouldn’t have to rebuild your entire system every time a new model drops. With theMIMER, you swap the provider config and keep going.
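A per-stage provider map might look like the sketch below. The configuration keys, provider names, and model names are all invented for illustration; theMIMER's actual configuration format may differ.

```python
# Hedged sketch of per-stage provider selection. All names here are
# assumptions for illustration, not theMIMER's real config schema.
PROVIDERS = {
    "local-fast": {"base_url": "http://localhost:11434/v1", "model": "qwen2.5-1.5b"},
    "local-big":  {"base_url": "http://localhost:11434/v1", "model": "qwen3-9b"},
    "cloud":      {"base_url": "https://api.example.com/v1", "model": "big-cloud-model"},
}

STAGE_PROVIDER = {
    "router":   "local-fast",  # classification: cheap and quick
    "planning": "local-fast",
    "answer":   "local-big",   # final generation: quality matters most
    "escalate": "cloud",       # optional hand-off for complex questions
}

def provider_for(stage):
    # Swapping a model is a config change, not a rebuild.
    return PROVIDERS[STAGE_PROVIDER[stage]]
```

When a new model drops, only the `PROVIDERS` table changes; the pipeline stages never need to know.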
Server & Client Modes
theMIMER operates in multiple modes, and this is where it gets architecturally interesting.
As a self-contained LLM provider with a command-line interface: theMIMER understands GGUF models and can run them on your CPU while taking advantage of your GPUs. This way it can run completely offline, as long as you have chosen some good GGUF models and have the hardware to run them. The command line provides many ways to check, control, and reconfigure the system on the fly. It also provides textual access for querying the LLM, inspecting the knowledge store, and more.
As an OpenAI-compatible client: In addition, theMIMER can talk to LLM providers using the standard OpenAI API protocol. This is how it reaches out to cloud models, local third-party inference servers, or any compatible endpoint.
As an OpenAI-compatible server: theMIMER itself also exposes an OpenAI-compatible API. This means other tools, applications, and scripts can talk to theMIMER as if it were an LLM endpoint — but behind the scenes they get the full RAG pipeline, skill routing, context retrieval, and everything else. You can integrate theMIMER into existing workflows, CI/CD pipelines, IDE extensions, or custom applications without those tools needing to know anything about theMIMER’s internals.
As a web server with a built-in advanced AI chat interface: And finally, theMIMER can expose a web-based chat interface, which is the one you see being used in all the screenshots and videos in this post.
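Because the server mode speaks the standard chat completions protocol, any OpenAI-style client can call it. Here is a minimal Python sketch; the base URL and the model name `"themimer"` are assumptions, not documented defaults.

```python
# Calling an OpenAI-compatible server (such as theMIMER's server mode)
# using only the standard library. URL and model name are assumptions.
import json
import urllib.request

def build_chat_request(question):
    # A standard OpenAI chat-completions body - nothing theMIMER-specific,
    # which is exactly the point of the compatible server mode.
    return {
        "model": "themimer",
        "messages": [{"role": "user", "content": question}],
        "stream": False,
    }

def ask(base_url, question):
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_request(question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Example (requires a running server):
# print(ask("http://localhost:8080", "What does the code agent do?"))
```

Your IDE plugin or CI script sends an ordinary chat request; the full RAG pipeline answers it.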
Skills: Pluggable Specialist Modules
Skills are theMIMER’s way of specializing its behavior for different types of tasks. Each skill defines how a certain category of question should be handled — what context to retrieve, how to structure the prompt, what output format to produce, and what post-processing to apply.
Out of the box, theMIMER ships with skills for general Q&A and conversation, deep code analysis and source code reasoning, diagram and chart generation (Mermaid syntax), structured comparison reports, vision and image analysis, image generation, and more. But the skill system is designed to be extended. You can create custom skills tailored to your specific domain — a skill for reviewing pull requests, a skill for generating test cases, a skill for writing documentation in a specific format. The router learns to classify questions to your custom skills just like the built-in ones.
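The plug-in idea can be sketched as a simple registry that the router dispatches into. The skill names and signatures below are invented for illustration; theMIMER's real skill interface (being Delphi) looks different, but the shape is the same: register a handler per category, dispatch by the router's classification.

```python
# Sketch of a pluggable skill registry. Names and signatures are
# illustrative assumptions, not theMIMER's actual Delphi interface.
SKILLS = {}

def skill(name):
    # Decorator that registers a handler under a routable category name.
    def register(fn):
        SKILLS[name] = fn
        return fn
    return register

@skill("qa")
def general_qa(question, context):
    return f"Answer using {len(context)} chunks."

@skill("diagram")
def diagram(question, context):
    # A diagram skill emits Mermaid markup instead of prose.
    return 'pie\n    "A" : 60\n    "B" : 40'

def dispatch(route, question, context):
    # The router's classification picks the specialist handler.
    return SKILLS[route](question, context)
```

Adding a custom skill — say, a pull-request reviewer — is then just one more registration; the router classifies into it like any built-in.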
Agentic Capabilities
theMIMER doesn’t just answer questions. It has agentic tools that let it perform actions — things that go beyond simple text generation.
The code agent, for example, doesn’t just summarize source code. It actively navigates through retrieved files, traces call chains, identifies design patterns, understands class hierarchies, and produces structured analyses. It’s not a grep-and-paste job — it’s actual reasoning over code structure.
The code agent in action — asked about character hit-testing at mouse coordinates, it retrieves the relevant source, traces the coordinate transformation pipeline, identifies the key classes (TViewport, TCanvas, TTextLayout), and explains the full flow with actual code excerpts.

The agentic tools also enable theMIMER to generate files, produce structured output, interact with external systems through its API layer, and chain multiple operations together in a single response. This is what turns it from a knowledge retrieval system into an actual AI assistant — one that can do things, not just say things.
The Self-Improvement System
theMIMER has a built-in self-improvement mechanism. The system can analyze its own performance — looking at which answers were good, where retrieval fell short, which skills got selected incorrectly — and adjust its behavior accordingly.
Think of it as a feedback loop: the pipeline produces an answer, the quality signals (explicit feedback, implicit signals from follow-up questions, retrieval relevance scores) feed back into the system, and theMIMER tunes its routing, retrieval, and generation parameters over time. It gets better the more you use it — not because a foundation model is being fine-tuned, but because the RAG pipeline, the skill routing, and the context management are being refined.
This is a practical, engineering-driven approach to AI improvement.
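As a toy illustration of that feedback loop — simplified far beyond theMIMER's real mechanism, with the weight scheme entirely invented — explicit thumbs-up/down signals could nudge per-skill routing weights over time:

```python
# Toy feedback loop: explicit feedback nudges per-skill weights.
# The scheme and numbers are invented; theMIMER's mechanism is richer.
weights = {"qa": 1.0, "code": 1.0}

def record_feedback(skill_name, positive, lr=0.1):
    # Positive feedback raises a skill's weight, negative lowers it,
    # floored so a skill is never routed out entirely.
    delta = lr if positive else -lr
    weights[skill_name] = max(0.1, weights[skill_name] + delta)

record_feedback("code", positive=True)   # good code answer
record_feedback("qa", positive=False)    # weak general answer
```

The key property is that improvement happens in the pipeline's parameters, not in the foundation model's weights.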
Vision: See and Understand Images
theMIMER is multimodal. You can upload images and ask questions about them, and the system handles vision tasks through the same pipeline as text queries — routing, skill selection, reasoning, and all.
Here’s an example I like. I uploaded a street photograph and asked “What’s in this picture?” — theMIMER identified the city (Copenhagen) from the architecture and bus route numbers, read the license plate on a car, identified the vehicle make and model, and then systematically broke the scene down into foreground, midground, and background sections:
Image analysis — theMIMER identified the city from visual cues, read text from signs and license plates, and structured the analysis by depth.

The full breakdown: specific shop names (“SINGLER”, “Café La Vallée”), building features (“RUNDLEDEN” pediment with clock), graffiti (“BKO” in white), and transportation modes.
Then I asked “How many people are in this image?” — and it counted 23 individuals, organized by location in the frame and described by distinguishing clothing and actions:
23 people identified and catalogued — cyclists, pedestrians, a person with a stroller, grouped by position in the scene.
This isn’t just a party trick. Vision support means theMIMER can analyze screenshots, UI mockups, architectural diagrams, whiteboard photos, scanned documents — anything visual that’s part of your knowledge workflow.
Image Generation
On the flip side of vision, theMIMER can also generate images. When the task calls for it — illustrations, diagrams, concept art, visual aids — the system can produce images as part of its response. This runs through the same skill routing system, so it’s not a separate bolted-on feature. It’s part of the pipeline.
The file management panel (visible in the right sidebar) shows generated images alongside uploaded files, and you can preview, download, or use them in follow-up conversations.
Chart & Diagram Generation
Ask theMIMER to visualize something and it generates Mermaid diagrams — pie charts, flowcharts, sequence diagrams, Gantt charts, class diagrams — rendered right in the conversation. A “Copy source” button lets you grab the raw Mermaid markup for use in your own docs, READMEs, or presentations.

A pie chart generated from context — Mermaid syntax rendered inline, with copy-source functionality built in.
The diagram generation skill is context-aware. It doesn’t just produce generic sample charts — it pulls data from the retrieved context and builds visualizations that reflect your actual knowledge base. Ask it to diagram the architecture of a system it knows about, and you get an accurate architecture diagram, not a textbook illustration.
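The post-processing half of this is mechanical: turn structured data pulled from context into Mermaid markup. A sketch, with the data values invented:

```python
# Sketch: rendering retrieved data as Mermaid pie-chart markup, the kind
# of output the diagram skill produces. The data values are invented.
def mermaid_pie(title, data):
    lines = ["pie title " + title]
    for label, value in data.items():
        lines.append(f'    "{label}" : {value}')
    return "\n".join(lines)

chart = mermaid_pie("Unit categories", {"Commands": 14, "RAG": 7, "Thinking": 6})
# `chart` is what the "Copy source" button would hand you.
```

The interesting part is upstream: the skill fills `data` from the retrieved context, which is why the chart reflects your knowledge base rather than a generic sample.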
Structured Reports & Comparisons
One of the built-in skills generates structured comparison reports. Give theMIMER two things to compare — frameworks, technologies, approaches, products — and it produces a proper report with an executive summary, categorized feature comparisons, and architectural analysis.

A structured comparison of kbmMW and J2EE for server-side web development — executive summary, feature breakdown, and architectural differences, all grounded in retrieved documentation.
Again: this is grounded in what theMIMER actually knows. The comparison isn’t pulled from generic internet knowledge — it’s built from the documentation and source code you’ve fed into the system. That’s a meaningful difference when you’re comparing something like an in-house framework against a well-known standard.
The Web Chat Interface
I spent real time on this, and I think it shows. The web chat is a dark-themed, responsive interface designed for actual use.
Connection status — The header shows the live connection state (“Connected” with a green dot), so you always know if the backend is reachable.
Show reasoning — Every response has a collapsible reasoning panel that shows the model’s chain of thought before the answer. Open it to see exactly how the model approached your question.
Pipeline inspector — At the bottom of every response, an expandable section shows every pipeline stage: Router, RouterRaw, Skills, Plan, Execute, Answer, CodeAgent, PostProcess. Each shows which model was used, what classification was made, what plan was built. Full transparency.
Filters — Control how context is retrieved. Adjust what gets included, tune the retrieval scope, filter by document type or source.
File management — A right sidebar panel lets you browse, preview, sort, and manage your workspace files. Upload images, PDFs, or any files. Click to preview — images open in a modal overlay. The panel shows file sizes, thumbnails, and timestamps.
Settings — Configure providers, models, behavior, and pipeline parameters right from the interface.

The web chat interface — dark theme, file panel with 65 workspace files, image preview modal, and the full conversation visible underneath.
The interface also handles response streaming, so you see answers appear in real-time as they’re generated. Send and Stop buttons give you control over generation. The whole thing is responsive and works on various screen sizes.
You can also give a thumbs up or down on any answer, and add notes about what was good or bad, which theMIMER uses to improve.
It Knows What It Is
Here’s a fun one. I asked theMIMER: “Tell me what theMIMER actually is.”

theMIMER’s self-description
It correctly identified itself as a RAG-based system, described its multimodal capabilities (text, vision, image generation, agentic tools), mentioned the OpenAI-compatible API layer, and even named its creator and company. All retrieved from its own knowledge base. That’s the RAG pipeline doing exactly what it’s supposed to do — and it needed 13 pipeline steps to get there.
What’s Under the Hood
A few technical details for those who want to know what’s actually running:
The system uses vector embeddings for retrieval — your content is chunked, embedded, and stored in a vector store. At query time, the question is embedded and matched against the stored chunks by semantic similarity. This is the core RAG mechanism — retrieve first, generate second.
The pipeline is modular. Each stage (routing, skill selection, planning, retrieval, generation, post-processing) is a separate component that can be configured, replaced, or extended independently. The skill system is plug-and-play.
Provider abstraction means the LLM layer is decoupled from the pipeline. Switch between OpenAI, Anthropic, local models, or anything OpenAI-compatible. Different pipeline stages can use different providers — fast cheap models for routing, powerful models for generation.
The OpenAI-compatible server mode means theMIMER can be integrated into any tool or workflow that speaks the standard chat completions API. Your IDE, your scripts, your CI pipeline — they can all use theMIMER as a backend.
The web interface communicates with the backend in real-time, with streaming support, file upload and management, and full pipeline introspection.
Why I Built This
I wanted an AI built in Delphi! One that could run without the need for external servers, that I could trust with my own source code, and that I could build on, following the latest ideas in AI.
It started out as a “can I do this?” experiment, and it quickly engulfed all my spare time and nights, because there was always something more, something interesting, I could do.
And yes… it was built as a cooperation between flesh and AI, over thousands of iterative steps. It draws on ideas I started coding years ago, but theMIMER itself is made from scratch. It supports the latest LlamaCpp release with multimedia and vision support, and the latest Stable Diffusion for image generation.
The rest of it is pure Delphi code organized in dozens of units:
kbmLlamaCpp.Agent.pas
kbmLlamaCpp.Api.pas
kbmLlamaCpp.Bootstrap.pas
kbmLlamaCpp.ChatServer.ImageGen.pas
kbmLlamaCpp.ChatServer.pas
kbmLlamaCpp.ChatServer.Vision.pas
kbmLlamaCpp.CodeTools.pas
kbmLlamaCpp.CodeWorkspace.pas
kbmLlamaCpp.CommandHandler.pas
kbmLlamaCpp.Commands.Display.pas
kbmLlamaCpp.Commands.Experience.pas
kbmLlamaCpp.Commands.Ingest.pas
kbmLlamaCpp.Commands.KnowledgeBase.pas
kbmLlamaCpp.Commands.Media.pas
kbmLlamaCpp.Commands.Memory.pas
kbmLlamaCpp.Commands.Query.pas
kbmLlamaCpp.Commands.Router.pas
kbmLlamaCpp.Commands.Server.pas
kbmLlamaCpp.Commands.Settings.pas
kbmLlamaCpp.Commands.Thinking.pas
kbmLlamaCpp.Commands.Tools.pas
kbmLlamaCpp.Commands.Workspace.pas
kbmLlamaCpp.Configuration.pas
kbmLlamaCpp.ExternalTools.pas
kbmLlamaCpp.FileVersioning.pas
kbmLlamaCpp.ImageGen.Api.pas
kbmLlamaCpp.ImageGen.pas
kbmLlamaCpp.IngestionQueue.pas
kbmLlamaCpp.LLMProvider.pas
kbmLlamaCpp.LLMRegistry.pas
kbmLlamaCpp.Loader.Image.pas
kbmLlamaCpp.Loader.PDF.pas
kbmLlamaCpp.Loader.Text.pas
kbmLlamaCpp.Loader.Web.pas
kbmLlamaCpp.Multimedia.Api.pas
kbmLlamaCpp.OpenAI.Client.pas
kbmLlamaCpp.OpenAI.Server.ImageGen.pas
kbmLlamaCpp.OpenAI.Server.pas
kbmLlamaCpp.Orchestrator.pas
kbmLlamaCpp.Output.pas
kbmLlamaCpp.Pipeline.pas
kbmLlamaCpp.Pipeline.Tools.pas
kbmLlamaCpp.PromptComposer.pas
kbmLlamaCpp.PromptInstructions.pas
kbmLlamaCpp.QueryRouter.pas
kbmLlamaCpp.QueryRouting.pas
kbmLlamaCpp.QueryRules.pas
kbmLlamaCpp.RAG.Chat.pas
kbmLlamaCpp.RAG.Ingestor.pas
kbmLlamaCpp.RAG.Memory.pas
kbmLlamaCpp.RAG.pas
kbmLlamaCpp.RAG.Pool.pas
kbmLlamaCpp.RAG.Store.pas
kbmLlamaCpp.RAG.Types.pas
kbmLlamaCpp.Request.pas
kbmLlamaCpp.SelfImprove.pas
kbmLlamaCpp.SettingsManager.pas
kbmLlamaCpp.Skill.Thinking.pas
kbmLlamaCpp.Skills.pas
kbmLlamaCpp.Splitter.Delphi.pas
kbmLlamaCpp.Thinking.Auto.pas
kbmLlamaCpp.Thinking.Code.pas
kbmLlamaCpp.Thinking.Domains.pas
kbmLlamaCpp.Thinking.Meta.pas
kbmLlamaCpp.Thinking.pas
kbmLlamaCpp.ToolCalling.pas
kbmLlamaCpp.UploadTools.pas
kbmLlamaCpp.UserStore.pas
kbmLlamaCpp.Vision.pas
kbmLlamaCpp.Wrapper.pas
Release and price?
That’s theMIMER. Like its mythological namesake, it draws from a well of actual knowledge — your knowledge — and when it speaks, it speaks from what it has seen, not from what it imagines.
Version 1.0.0 will soon be released as a FREE download of the executable, so you can create your own top-of-the-range AI setup!
theMIMER v1.0.0 · © Components4Developers · Kim Bo Madsen