The AI Interrogator Bot: Turning Tribal Knowledge Into Structured Systems

ciciodonnell
Jun 24

I once spent several months trying to understand a client's fulfillment process. On paper, it was one process. In practice, there were several distinct processes. The process had originally been designed to work well in sprawling suburban big-box stores that dominated sales across the chain.

But it had also been adapted to small-format stores tucked into high-rise buildings in the inner city (a critical component of unlocking inventory and of the company’s “save the sale” strategy). Staffing models, walking distances, and peak-hour bottlenecks were all different, because the physical and operational realities were different.

Getting to that understanding took dozens of one-on-one meetings with headquarters staff, field leaders, and software engineers. I also personally traveled to different stores and shadowed fulfillment employees.

Getting to the real pain point and understanding the important variables in the domain is the most time-consuming and expensive part of every consulting engagement I've worked on. It's also, I'd argue, one of the biggest unsolved problems in applied AI right now.

The Real Problem: Knowledge Elicitation

Ask a domain expert "tell me everything you know about X" and you'll get a vague answer. Ask the same expert "what would you change about this specific example" and you'll get something rich, specific, and immediately useful.

This is a well-documented challenge called the knowledge elicitation problem: expertise is often tacit, procedural, and example-bound. People are good at recognizing the right answer in context and bad at generating abstract rules from scratch.

Consultants have built an entire profession around closing this gap, usually through structured interviews, workshops, on-site visits, shadowing, and working iteratively with clients across many one-on-one conversations.

But what if instead of scheduling another round of one-on-one meetings, you could send the client a link to a chatbot that asked the questions for you?

Why This Matters Now

This problem has gotten more urgent with the rise of LLMs. There's a common assumption that you solve an organization's AI gap by pointing a RAG system at its internal documents (wikis, reports, Slack history) and calling it done. But tribal knowledge was either never written down or never written down in a cohesive, structured way.

This is a major bottleneck in enterprise AI adoption. The documents are the easy part. The tacit knowledge is the part that actually moves the needle, and it's the part most traditional RAG implementations never touch.

The Solution: Send a Link Instead of a Calendar Invite

Here's the idea I kept coming back to: replace the next round of one-on-one meetings with a link to a chatbot. Clients open it on their own schedule and work through a structured set of questions at their own pace. Best of all, this can be done in parallel across many stakeholders.

That's the premise behind the Interrogator: an asynchronous, example-driven knowledge extraction tool that does the slow, expensive part of consulting without requiring the consultant to be in the room for every minute of it.

The Two-Process Model

The Interrogator isn't a single conversation. It's two distinct processes with two distinct owners.

Process 1: Coverage Matrix Generation (Consultant-led). Before the client ever sees a question, the consultant works through what the full space of relevant variables actually looks like. The output is a coverage matrix: a concrete, finite set of scenarios that, taken together, span the space the consultant needs to understand.
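As a rough illustration of what a coverage matrix could look like in code, here is a minimal sketch that enumerates every combination of a few variables. The variable names and levels are hypothetical, loosely borrowed from the store-format story above, and a full cross product is only one way to build the matrix. In practice a consultant would likely prune it down to the combinations that actually matter.

```python
from itertools import product

# Hypothetical variables and levels, for illustration only.
# A real engagement would define its own.
VARIABLES = {
    "store_format": ["suburban_big_box", "urban_small_format"],
    "time_of_day": ["off_peak", "peak"],
    "order_size": ["single_item", "multi_item"],
}


def build_coverage_matrix(variables):
    """Return one scenario dict per combination of variable levels."""
    names = list(variables)
    level_lists = [variables[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


matrix = build_coverage_matrix(VARIABLES)
for scenario in matrix:
    print(scenario)
```

With two levels for each of three variables, this yields eight scenarios, which is small enough for a client to work through one at a time.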

Process 2: Knowledge Extraction (Client-led). The client gets a link. The interaction is asynchronous, resumable, and example-driven. Each answer fills in a cell of the coverage matrix. The client never has to articulate an abstract rule; they just react to specifics, which is the thing people are actually good at.
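To make "asynchronous and resumable" a bit more tangible, here is a minimal sketch of the session state such a tool might keep. The class name, method names, and scenario ids are all assumptions for illustration, not the Interrogator's actual design. The state is serialized to a JSON string so it could be stored anywhere and picked up later.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ExtractionSession:
    """Hypothetical session state: which scenarios exist and which are answered."""
    scenarios: list
    answers: dict = field(default_factory=dict)

    def next_unanswered(self):
        """Return the first scenario id without an answer, or None when done."""
        for scenario_id in self.scenarios:
            if scenario_id not in self.answers:
                return scenario_id
        return None

    def record(self, scenario_id, answer):
        """Fill in one cell of the coverage matrix."""
        if scenario_id not in self.scenarios:
            raise KeyError(f"unknown scenario: {scenario_id}")
        self.answers[scenario_id] = answer

    def to_json(self):
        return json.dumps({"scenarios": self.scenarios, "answers": self.answers})

    @classmethod
    def from_json(cls, text):
        data = json.loads(text)
        return cls(scenarios=data["scenarios"], answers=data["answers"])


# The client answers one scenario, closes the tab, and comes back later.
session = ExtractionSession(["urban_peak", "urban_off_peak", "suburban_peak"])
session.record("urban_peak", "placeholder answer text")
saved = session.to_json()

resumed = ExtractionSession.from_json(saved)
print(resumed.next_unanswered())
```

The resumed session picks up at the first scenario that still has no answer, so the client never has to repeat work.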

Making It Concrete: The Diet Bot as First Test Case

I'm currently building toward the Diet Bot's prototype using exactly this logic.

For Process 1, I defined a coverage matrix bounded by a small prototype: three pie recipes, three cakes, and three bar-style desserts (nine recipes total), chosen specifically to span a range of ingredients and cooking techniques rather than to be a random sample. That exercise alone surfaced some genuinely interesting wrinkles in how substitution coverage should be designed, which I'll get into in a follow-up post (Article 3B).

For Process 2, I walked through each conventional recipe myself and annotated it: which ingredients are forbidden, what the viable substitutes are, and which one I'd actually use and why. The final output of that annotation is a structured YAML file in the same rule-engine format described in the Diet Bot's architecture (see Article 1A), and it becomes the direct input to the Diet Bot's rule engine. If the AI Interrogator Bot works as intended, that YAML file is what it will eventually generate on its own.
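To give a feel for the shape of that annotation, here is a minimal sketch that turns one annotated ingredient into YAML. The field names (forbidden, substitutes, preferred, reason) simply mirror the questions listed above, and the sample values are made up. The real schema is the one in Article 1A, which this sketch does not attempt to reproduce. The YAML is emitted by hand to keep the example dependency-free; a library such as PyYAML would be the more usual choice.

```python
import json
from dataclasses import dataclass


@dataclass
class SubstitutionRule:
    """Hypothetical annotation for one forbidden ingredient."""
    forbidden: str
    substitutes: list
    preferred: str
    reason: str


def to_yaml(recipe, rules):
    """Render annotations as YAML text for this one fixed shape."""
    # JSON string literals are valid YAML double-quoted scalars,
    # so json.dumps handles quoting and escaping.
    q = json.dumps
    lines = [f"recipe: {q(recipe)}", "rules:"]
    for rule in rules:
        lines.append(f"  - forbidden: {q(rule.forbidden)}")
        lines.append("    substitutes:")
        lines.extend(f"      - {q(s)}" for s in rule.substitutes)
        lines.append(f"    preferred: {q(rule.preferred)}")
        lines.append(f"    reason: {q(rule.reason)}")
    return "\n".join(lines) + "\n"


# Made-up sample values, for illustration only.
yaml_text = to_yaml(
    "example pie",
    [SubstitutionRule("butter", ["coconut oil", "olive oil"],
                      "coconut oil", "placeholder reason")],
)
print(yaml_text)
```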

Extending the Use Case

Going back to the store-format example: the consultants build a coverage matrix for an engagement using the exact same structure. They create a finite, deliberately chosen set of concrete scenarios (small urban format, large suburban format, and whatever else turns out to matter) that span the operational variation a client actually has. This still requires some domain knowledge, but at a level that I think is manageable for a consultant with some experience in the field. I personally find that my own experiences can be ported easily across clients in the same industry and even across different industries.

The client works through the examples, one by one, providing contextual knowledge through annotation and answers. This will likely be an iterative process where the consultant reviews the client outputs, drafts new questions, waits for another round of inputs, and repeats until the problem space is clear. Even with several rounds, this process still has the advantage of being asynchronous and done in parallel across many stakeholders.

Eventually this becomes distilled into output (e.g., markdown files on a particular topic) that can be added as documents to a RAG system and then fed into a purpose-built AI bot (as I mention in my previous article).
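One way to picture the consultant's review step is a small function that scans everyone's answers and flags which scenarios need another round. This is only a sketch under strong assumptions: the stakeholder names and answer labels are invented, and answers are treated as short labels that can be compared directly, which free-text responses would not allow without further processing.

```python
def plan_next_round(scenarios, responses):
    """Flag scenarios that need follow-up questions.

    responses maps stakeholder -> {scenario_id: answer_label}.
    Returns {scenario_id: reason} for scenarios with gaps or disagreement.
    """
    follow_ups = {}
    for scenario_id in scenarios:
        answers = [
            given[scenario_id]
            for given in responses.values()
            if scenario_id in given
        ]
        if not answers:
            follow_ups[scenario_id] = "no answers yet"
        elif len(set(answers)) > 1:
            follow_ups[scenario_id] = "stakeholders disagree"
    return follow_ups


# Invented stakeholders and answer labels, for illustration only.
scenarios = ["urban_peak", "urban_off_peak", "suburban_peak"]
responses = {
    "store_manager": {"urban_peak": "add_staff", "suburban_peak": "no_change"},
    "field_lead": {"urban_peak": "stagger_pickups", "suburban_peak": "no_change"},
}
follow_ups = plan_next_round(scenarios, responses)
print(follow_ups)
```

Scenarios where everyone agrees drop out of the next round, so each iteration should get shorter.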

Where the AI Arbiter Bot Fits

Knowledge extraction solves one problem. It doesn't solve all of them. Sometimes there isn't a single right answer. Multiple valid substitutions exist for the same ingredient, and the best choice depends on context and personal preference. That's where the second tool in this series, the AI Arbiter Bot, picks up: ranking the viable options and learning a user's preferences over time, rather than forcing the AI Interrogator Bot to extract a single "correct" answer where none exists.
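As a deliberately simple illustration of "ranking the viable options and learning a user's preferences over time," the sketch below counts how often a user has picked each substitute and ranks by that count. The class and its methods are hypothetical and the ingredients are made up. Nothing here is meant to describe how the AI Arbiter Bot is actually built.

```python
from collections import Counter


class PreferenceRanker:
    """Hypothetical ranker: orders options by how often the user chose them."""

    def __init__(self):
        self.picks = Counter()

    def record_choice(self, ingredient, chosen):
        """Remember that the user picked `chosen` as a substitute for `ingredient`."""
        self.picks[(ingredient, chosen)] += 1

    def rank(self, ingredient, options):
        """Return options sorted by past picks, most chosen first.

        sorted() is stable, so options with equal counts keep their input order.
        """
        return sorted(options, key=lambda option: -self.picks[(ingredient, option)])


ranker = PreferenceRanker()
ranker.record_choice("butter", "olive oil")
ranker.record_choice("butter", "olive oil")
ranked = ranker.rank("butter", ["coconut oil", "olive oil", "applesauce"])
print(ranked)  # ['olive oil', 'coconut oil', 'applesauce']
```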

I see the AI Arbiter Bot as being especially useful in data QA processes, where an ML model identifies a boundary case and sends it to a human for manual processing. That manual step should be paired with a knowledge extraction tool: in other words, the AI Arbiter Bot.

Closing

This is a working model, not a finished one. I expect the two-process structure to hold up reasonably well, since it mirrors how I already work with clients. But the details of how the coverage matrix gets built and how cleanly the client-facing extraction can be turned into generalizable summaries are still up in the air. I'll report back on what actually worked (and what didn’t) in the next article.