Building Products You Can Trust in a Probabilistic World

For decades, software development relied on a simple promise: determinism. You wrote the rules, and for a given input, you got a predictable output. If something broke, you found the bug and fixed it. Reliability was essentially a byproduct of control.

AI has changed that. We’ve moved into a world where the same user input can yield different results every time. At Forum One, we’ve seen this shift firsthand. For the government agencies, foundations, and other mission-driven organizations we serve, this isn’t a technical quirk. It’s a pivotal opportunity to rethink how we maintain public trust while scaling our impact through emerging technology and AI-enabled software.

Moving from Instructions to Behavior

With probabilistic systems like AI, we aren’t dictating outcomes so much as shaping behavior. Failure in this new world rarely looks like a system crash or a 404 error. Instead, it manifests as model drift, institutional bias, or “hallucinations.”

A response might look polished and credible, but simply be untrue. If we treat AI development like traditional software, we risk shipping systems that look great in a demo but quietly fail, mislead, or cause harm in production. We are no longer just writing instructions; we’re managing predictions, and that requires a new kind of discipline.

Reliability Through Orchestration

There is a natural tension in AI design between flexibility and accountability. “Agent-driven” systems are powerful because they reason dynamically, but that autonomy often makes them unpredictable. In high-stakes environments, we are biased toward structure.

We achieve this through orchestration. Rather than sending a single, open-ended prompt to a model, we use tools like n8n or Activepieces to build a sequence of controlled, deterministic steps. This approach treats the AI model as one component of a larger system rather than the system itself. A typical orchestrated workflow follows a rigorous three-step logic, sketched in code below the list:

  • Pre-Processing: Before the AI sees a prompt, the system runs a deterministic check. It asks whether the request violates policy or sits outside the data’s scope, stopping irrelevant queries before they consume tokens.
  • Narrow Tasks: We keep the AI’s job small and highly defined. Instead of asking it to “be a subject matter expert,” we feed it specific context and ask it to “summarize this paragraph” or “extract these fields.” This drastically reduces the space where a model might hallucinate.
  • Post-Processing: Finally, we verify the result programmatically. We check whether the output is valid JSON, free of prohibited terms, and supported by the source text. If the answer fails any check, the orchestrator triggers a retry or flags it for human review.
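To make those three steps concrete, here is a minimal Python sketch of one orchestrated call. It is illustrative only: call_model stands in for a real model API, and the prohibited terms, scope keywords, and grounding check are hypothetical placeholders, far simpler than what a production workflow built in a tool like n8n would encode.

```python
import json

# Hypothetical policy list; a production system would load this from
# configuration reviewed with the client.
PROHIBITED_TERMS = {"guaranteed", "medical advice"}

def call_model(prompt: str) -> str:
    """Stand-in for a real model API call. It returns fixed JSON here so
    the pipeline can run end to end without network access."""
    return json.dumps({"summary": "The program served 12,000 residents."})

def pre_process(request: str, scope_keywords: set) -> bool:
    """Deterministic gate: stop out-of-scope requests before they consume tokens."""
    return any(keyword in request.lower() for keyword in scope_keywords)

def post_process(raw_output: str, source_text: str):
    """Programmatic checks: valid JSON, no prohibited terms, and a naive
    grounding test that the summary shares vocabulary with the source."""
    try:
        output = json.loads(raw_output)
    except json.JSONDecodeError:
        return None
    summary = output.get("summary", "").lower()
    if any(term in summary for term in PROHIBITED_TERMS):
        return None
    if not set(summary.split()) & set(source_text.lower().split()):
        return None  # nothing in common with the source: treat as ungrounded
    return output

def orchestrate(request: str, source_text: str):
    if not pre_process(request, scope_keywords={"summarize", "summary"}):
        return "Out of scope."  # stopped before the model is ever called
    # Narrow task: specific context plus a tightly defined instruction.
    prompt = f'Summarize this paragraph as JSON with a "summary" field:\n{source_text}'
    for _ in range(3):  # bounded retries
        result = post_process(call_model(prompt), source_text)
        if result is not None:
            return result
    return "Flagged for human review."  # retries exhausted

print(orchestrate("Please summarize this report.", "The program served 12,000 residents."))
```

The structural point is that the model call is the only non-deterministic step; everything around it is ordinary, testable code.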

You can’t make a probabilistic model 100% accurate, but through orchestration, you can contain the uncertainty. This approach ensures that we aren’t just innovating for the sake of innovation; we are providing the strategic rigor required for high-stakes digital environments.

Measurement Is the Only Antidote to Guessing

In the past, infrastructure monitoring told us if a system was up or down. Now, we must ask if the answer is correct. Traditional uptime metrics are no longer enough.

With AI observability in place, we must track metrics such as grounding (ensuring answers are supported by verified sources), task success rates, and toxicity. We have to measure hallucination rates as rigorously as we once measured page load times. Without these signals, we’re just guessing, and guessing is not a delivery model.
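As a rough sketch of what these signals can look like in code, the example below aggregates per-response evaluation records into the rates named above. The record fields and how each flag gets set are assumptions for illustration; real pipelines would derive them from automated evaluators and human review over a curated test dataset.

```python
from dataclasses import dataclass

@dataclass
class EvalRecord:
    """One evaluated AI response; the fields are illustrative, not a standard schema."""
    grounded: bool        # answer supported by a verified source
    task_succeeded: bool  # output passed the task's acceptance check
    hallucinated: bool    # contained claims absent from the source
    toxic: bool           # flagged by a content filter

def observability_report(records):
    """Aggregate per-response checks into dashboard-style rates."""
    n = len(records)
    if n == 0:
        return {}
    return {
        "grounding_rate": sum(r.grounded for r in records) / n,
        "task_success_rate": sum(r.task_succeeded for r in records) / n,
        "hallucination_rate": sum(r.hallucinated for r in records) / n,
        "toxicity_rate": sum(r.toxic for r in records) / n,
    }

# A tiny, made-up evaluation run for demonstration purposes.
records = [
    EvalRecord(grounded=True, task_succeeded=True, hallucinated=False, toxic=False),
    EvalRecord(grounded=True, task_succeeded=False, hallucinated=False, toxic=False),
    EvalRecord(grounded=False, task_succeeded=False, hallucinated=True, toxic=False),
]
print(observability_report(records))  # e.g., {'grounding_rate': 0.66..., ...}
```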

What This Looks Like in Practice

When we apply this orchestration and measurement mindset, the results move from impressive experiment to reliable tool. Here are some ways we’re applying these systems for our clients:

  • Growing a Government Workforce: We developed an AI-powered career pathing tool for a federal agency to help employees navigate complex roles. Orchestration ensures the AI matches skills to actual job classifications rather than hallucinating roles that don’t exist.
  • AI-Powered Lesson Plans: For a major museum, we’re discussing ways to use AI to generate lesson plans based on digitized collections. Orchestration and measurement will ensure the AI only uses verified archival data, preventing historical inaccuracies.
  • CMS Integration and Brand Voice: We’re integrating AI into Content Management Systems to help teams generate content that adheres to brand voice. The orchestrator acts as a digital editor, ensuring text meets accessibility and style guidelines before it reaches a draft stage.

Questions That Change the Conversation

Before advancing any AI project, we stop asking if the model is smart enough and start asking if the architecture is resilient enough. We use these five questions to pressure-test the orchestration:

  • What error rate is acceptable, and to whom?
  • What is the specific dataset we’ll test against?
  • How do we audit the system’s responses so we can improve them?
  • When does a real person need to step in and review a result?
  • How does this system reinforce, rather than risk, the organization’s mission?

If those questions are difficult to answer, the system isn’t ready for production.

From Experiment to Infrastructure

We operate in a world where trust is fragile. Our goal isn’t just to make AI impressive; it’s to make it dependable. When we get this right, institutions move faster, and communities get the clear, accurate information they deserve. Reaching that bar takes more discipline, not less. But for the missions we support, it’s the only way to truly turn ideas into lasting impact. Explore Forum One’s AI Strategy & Enablement services to learn more.
