The Problem: Scheduling is Harder Than It Looks
Booking a medical appointment online seems simple — but behind the scenes it's one of the most friction-laden workflows in digital health. A typical scheduling flow spans more than 5 pages, requires over 15 clicks, and asks the user to fill out 25 to 40 text fields covering patient demographics, insurance details, provider preferences, and appointment reasons.
This is a long-horizon task: many sequential, dependent steps where any single misstep — a mistyped field, an unrecognized insurance code, a timeout — causes the entire flow to fail. And unlike consumer apps, healthcare portals are fragmented across dozens of incompatible systems with no consistent structure, making automation extraordinarily difficult.
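A rough back-of-the-envelope sketch shows why long flows are so fragile. It assumes, purely for illustration, that every step succeeds independently with the same probability. The independence assumption, the per-step rates, and the function name are all hypothetical and are not measured data. Only the step count comes from the figures above (15 clicks plus 25 text fields, the lower end of each range).

```python
def flow_success_probability(per_step_success: float, steps: int) -> float:
    """Chance that every step succeeds, assuming independent, identical steps."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


# Illustrative step count: 15 clicks + 25 text fields.
STEPS = 15 + 25

# Hypothetical per-step success rates, chosen only to show the compounding effect.
for p in (0.99, 0.95):
    total = flow_success_probability(p, STEPS)
    print(f"per-step success {p:.2f} -> whole flow {total:.2f}")
```

Under these assumptions, a 99% per-step success rate still leaves only about a two-in-three chance of completing all 40 steps, and a 95% rate leaves roughly one in eight. Small per-step error rates compound quickly, which is why a long flow needs a way to recover from individual failures.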
The Solution: A Multi-Agent Architecture with Built-in Error Recovery
Our system addresses this with a multi-agent architecture in which specialized agents handle different parts of the scheduling workflow. Each agent can navigate and understand a website the way a human does: it perceives the current page state, decides what action to take, and executes it (clicking, typing, submitting).
But what makes this system truly robust is what happens when things go wrong. Rather than letting an agent fail silently or crash, our architecture includes a built-in actor-critic error recovery mechanism that continuously monitors and corrects the agent's behavior in real time.
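The perceive, decide, execute cycle that each agent runs can be sketched in a few lines. This is a minimal illustration and not the production implementation. `FakePortal`, `Action`, `decide`, and `run_agent` are hypothetical names, the portal is an in-memory stand-in for a real browser session, and the rule-based `decide` function is a placeholder for whatever model actually chooses the next action.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    kind: str        # "type" or "click"
    target: str      # field name or button name
    value: str = ""  # text to enter, used only for "type"


@dataclass
class FakePortal:
    """Toy in-memory page (hypothetical; stands in for a real browser session)."""
    fields: dict = field(
        default_factory=lambda: {"first_name": "", "insurance_id": ""}
    )
    submitted: bool = False

    def observe(self) -> dict:
        """Perceive: return a snapshot of the current page state."""
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def execute(self, action: Action) -> None:
        """Execute: apply a typing or clicking action to the page."""
        if action.kind == "type":
            self.fields[action.target] = action.value
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def decide(state: dict, patient: dict) -> Optional[Action]:
    """Decide: fill the first empty field, then submit; None means done."""
    if state["submitted"]:
        return None
    for name, value in state["fields"].items():
        if not value:
            return Action("type", name, patient[name])
    return Action("click", "submit")


def run_agent(portal: FakePortal, patient: dict, max_steps: int = 10) -> list:
    """Loop perceive -> decide -> execute until done or out of steps."""
    history = []
    for _ in range(max_steps):
        action = decide(portal.observe(), patient)
        if action is None:
            break
        portal.execute(action)
        history.append(action)
    return history


portal = FakePortal()
steps = run_agent(portal, {"first_name": "Ada", "insurance_id": "X123"})
print([f"{a.kind}:{a.target}" for a in steps], portal.submitted)
```

The `max_steps` cap is a deliberate design choice in this sketch: an agent that re-reads the page on every iteration needs a hard bound so that an unexpected page state cannot trap it in an endless loop.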
Actor-Critic: Borrowed from Reinforcement Learning
The actor-critic pattern is a concept from reinforcement learning, adapted here for real-world web navigation. In the original RL formulation, an actor policy proposes actions and a critic evaluates their expected value. We apply the same principle to agentic task execution:
[Figure: the actor-critic loop. Actor: clicks a button, fills a field, navigates a page. Web environment: page changes, errors appear, confirmations load. Critic: was this step successful? What's next?]
The Actor is the execution agent — it takes concrete actions in the web environment. When told to book an appointment, it navigates pages, reads form fields, fills in patient information, selects a physician, and submits confirmations.
The Critic watches each action and evaluates whether it succeeded. If the portal shows an unexpected error page, the critic catches it. If a required field was skipped or failed validation, the critic flags it. The critic does more than detect failure: it instructs the actor to retry with a different strategy, such as a different input format, a different navigation path, or a fallback approach.
This closed feedback loop transforms a brittle sequential script into a self-correcting system — one that can handle the messy, inconsistent reality of healthcare web portals at scale.
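The actor-critic feedback loop described above can be illustrated with a toy example. This is a simplified sketch under stated assumptions, not the patented scheme or the production system. The portal, the date-of-birth field, the list of fallback formats, and every name in the code (`Observation`, `Verdict`, `toy_portal_submit`, `critic`, `actor_critic_step`) are hypothetical. There is also no learned value function here: the critic is a plain verifier, and a failed verdict simply moves the actor on to its next fallback strategy.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Observation:
    """What the page shows after an action (hypothetical structure)."""
    error: Optional[str] = None
    confirmed: bool = False


@dataclass
class Verdict:
    success: bool
    reason: str = ""


def toy_portal_submit(dob_text: str) -> Observation:
    """Toy portal that only accepts dates written as MM/DD/YYYY."""
    parts = dob_text.split("/")
    well_formed = (
        len(parts) == 3
        and [len(p) for p in parts] == [2, 2, 4]
        and all(p.isdigit() for p in parts)
    )
    if well_formed:
        return Observation(confirmed=True)
    return Observation(error="Invalid date format")


# Input formats the actor can fall back to, in order of preference.
STRATEGIES = [
    lambda d: d.isoformat(),           # e.g. 1990-04-07
    lambda d: d.strftime("%m/%d/%Y"),  # e.g. 04/07/1990
]


def critic(obs: Observation) -> Verdict:
    """Judge the last action from the page state it produced."""
    if obs.error:
        return Verdict(False, obs.error)
    if not obs.confirmed:
        return Verdict(False, "no confirmation shown")
    return Verdict(True)


def actor_critic_step(dob: date, submit=toy_portal_submit):
    """Try each strategy until the critic accepts one; return (success, log)."""
    log = []
    for attempt, fmt in enumerate(STRATEGIES, start=1):
        text = fmt(dob)                 # actor: choose and perform the action
        verdict = critic(submit(text))  # critic: evaluate the outcome
        outcome = "ok" if verdict.success else verdict.reason
        log.append(f"attempt {attempt}: {text!r} -> {outcome}")
        if verdict.success:
            return True, log
    return False, log


ok, log = actor_critic_step(date(1990, 4, 7))
print(ok)
print("\n".join(log))
```

In this toy run the first attempt (the ISO format) is rejected by the portal, the critic reports the error, and the second attempt with the fallback format is confirmed. Note that the critic also treats a page with no error and no confirmation as a failure, which is how the silent failures mentioned earlier would be caught in this sketch.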
Why This Matters
Traditional RPA (Robotic Process Automation) solutions break the moment a portal changes its layout. LLM-only approaches hallucinate or lose track of state across long sequences. Our multi-agent actor-critic system combines the adaptability of large language models with the reliability of continuous verification — making it robust enough to deploy in a production healthcare context where errors have real consequences.
This work has resulted in 5 pending patents covering novel agentic architectures for fragmented web domains, including the actor-critic input scheme specifically.