Knowledge workers, including clinical staff making consequential decisions, often do this: open ChatGPT, open Claude, open Gemini, open Perplexity, paste the same question into all four, read the answers, pick the one that feels most right, and move on. No record of what was asked. No record of what was considered. No way to reconstruct the reasoning later.
For most contexts that's inefficient. For a regulated healthcare environment where a decision needs to be defensible, it's a problem.
A panel of experts, not a lucky guess.
The Referee is a platform that turns the "open five tabs" workflow into something structured, repeatable, and auditable.
The user builds a Model Set, a named configuration that specifies which AI models run in parallel, each with its own system prompt and temperature settings tuned for the specific use case. Clinical triage gets a different panel to treatment protocol review. Each Model Set is built once and reused consistently.
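To make the idea concrete, here is a minimal sketch of what a Model Set configuration could look like. Every field name and model ID below is an illustrative assumption, not The Referee's actual schema.

```python
from dataclasses import dataclass

# Hypothetical shape of a Model Set; names are illustrative only.
@dataclass
class PanelMember:
    model_id: str       # e.g. an OpenRouter-style model ID
    system_prompt: str  # tuned per use case
    temperature: float

@dataclass
class ModelSet:
    name: str
    panel: list[PanelMember]
    referee_model: str
    strategy: str       # "summarise" | "rank" | "reconcile" | "pick_best"

# A triage panel and a protocol-review panel would differ only in config:
triage = ModelSet(
    name="clinical-triage",
    panel=[
        PanelMember("openai/gpt-4o", "You are a cautious triage assistant.", 0.2),
        PanelMember("anthropic/claude-3.5-sonnet", "You are a cautious triage assistant.", 0.2),
    ],
    referee_model="openai/gpt-4o",
    strategy="reconcile",
)
```

Building the panel as named, reusable configuration is what makes the workflow repeatable rather than ad hoc.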
They submit one prompt. All models in the panel respond simultaneously, streamed side-by-side. Then a designated Referee model synthesises a single consolidated answer, using a configurable strategy: summarise, rank by confidence, reconcile differences, or simply pick the strongest response.
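The fan-out-then-referee flow can be sketched as below, with a stub standing in for the real streaming API calls; the function names and prompt wording are hypothetical.

```python
import asyncio

# Stub model call; a real implementation would stream from each provider.
async def ask_model(model_id: str, prompt: str) -> str:
    await asyncio.sleep(0)  # stand-in for network latency
    return f"[{model_id}] answer to: {prompt}"

async def run_panel(panel: list[str], referee: str, prompt: str) -> str:
    # Fan out: every panel model answers the same prompt concurrently.
    answers = await asyncio.gather(*(ask_model(m, prompt) for m in panel))
    # Synthesis: the referee model reconciles the panel's answers
    # into one consolidated verdict.
    brief = "\n".join(answers)
    return await ask_model(referee, f"Reconcile these answers:\n{brief}")

verdict = asyncio.run(run_panel(["model-a", "model-b"], "referee-model", "Is X indicated?"))
```

The "configurable strategy" would change only the referee's instruction string (summarise, rank, reconcile, or pick the strongest), not the fan-out structure.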
Every session is saved. Every model's response is stored. The Referee's verdict, the timestamp, the token counts, the costs: all of it persists and is searchable. A clinician can pull up any past decision and show exactly how it was reached.
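A persisted session record might be shaped something like this; the field names are assumptions made for illustration, not the platform's actual schema.

```python
from datetime import datetime, timezone

# Illustrative session record: one prompt, every panel response with its
# token count and cost, plus the referee's verdict and a UTC timestamp.
session = {
    "prompt": "Is protocol X appropriate for cohort Y?",
    "responses": [
        {"model": "model-a", "text": "...", "tokens": 412, "cost_usd": 0.0031},
        {"model": "model-b", "text": "...", "tokens": 388, "cost_usd": 0.0046},
    ],
    "referee": {"model": "referee-model", "strategy": "reconcile", "verdict": "..."},
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
```

Because every response and the verdict live in one record, reconstructing a past decision is a lookup, not a memory exercise.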
Built for control.
Two design decisions shaped everything.
The first was BYOK, bring your own key. The client supplies their own API keys for each AI provider, stored encrypted at rest. That means they control their data residency and their spend. Nothing passes through Hephon's infrastructure once the platform is deployed.
The second was OpenRouter as the integration layer. Rather than building separate integrations for every AI provider, the platform routes through a single proxy that gives access to over 100 models by ID. When a new model is released, it's available in the platform the same day, no engineering required.
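The practical upside of a single proxy is that every model is addressed through the same request shape, and only the model ID string changes. A minimal sketch, assuming OpenRouter's OpenAI-compatible chat-completions format:

```python
# Build the same request payload for any model behind the proxy.
# Payload shape follows the standard chat-completions format,
# trimmed for illustration.
def build_request(model_id: str, prompt: str, temperature: float) -> dict:
    return {
        "model": model_id,  # e.g. "anthropic/claude-3.5-sonnet"
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

# Swapping providers is a one-string change, no new integration code:
req_a = build_request("openai/gpt-4o", "Summarise this note.", 0.2)
req_b = build_request("google/gemini-pro-1.5", "Summarise this note.", 0.2)
```

This is why a newly released model can appear in the platform the same day: it arrives as a new ID on an existing endpoint, not as a new integration.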
Where it stands.
The Referee is in active build. The client approved the design in the kickoff workshop and signed a long-term agreement before a single line of backend code was written. That level of buy-in before delivery is rare, and it's a direct result of how directly the design addressed a problem the client already lived with every day.
Hard adoption metrics will be published once the pilot deployment is complete.
"Replacing 'open five tabs and guess' with a structured, auditable decision process, built for a healthcare context where the reasoning has to be reconstructable."
- Multiple AI models run in parallel on a single prompt, responses streamed side-by-side
- A Referee model synthesises the final answer using a configurable strategy
- Every session saved with full provenance: model responses, verdict, timestamp, cost
- 100+ models available through a single integration layer
- Client-controlled API keys, full data residency and spend control
- Built for regulated environments where decisions need to be defensible





