It's the Dependency Graph, Stupid: A Guide to Agent Architecture

The "single-agent versus multi-agent" showdown is a distraction. It misses the deeper point: an agent architecture must mirror the task's dependency graph. Multi-agent designs shine when subtasks are independent, while single-threaded flows dominate when state is tightly coupled. The choice isn't ideological; it's a classic engineering trade-off between latency and coherence.

The path forward isn't choosing a static architecture, but building adaptive systems smart enough to choose their own architecture on the fly, and to learn from the experience.

The Core Tension: Parallel vs. Coupled

The debate is perfectly framed by two recent posts. Anthropic details how its multi-agent system outperformed a single-agent baseline by 90.2% on breadth-first research queries while cutting research time by up to 90%. Conversely, a post from Cognition AI argues against multi-agent systems for state-coupled tasks, like generating the art assets for a Flappy Bird clone, where parallel agents lacking shared context produce stylistic mismatches and incoherent results.

Both are correct. The success of an architecture is dictated by whether subtasks can finish without reading or mutating one another's state. The architecture must fit the graph, not the other way around. But if the concept is so simple, why hasn't it been solved?
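
To make the principle concrete, here is a minimal sketch (in Python, with invented task names) of how a dependency graph dictates the architecture: subtasks whose dependencies are all satisfied form a layer that can run in parallel, and the layers themselves run in sequence. A wide, shallow graph rewards multi-agent fan-out; a deep, narrow one collapses to a single-threaded flow.

```python
def execution_layers(deps: dict[str, set[str]]) -> list[set[str]]:
    """Group subtasks into layers: tasks within a layer share no
    dependencies and may run in parallel; layers run in order."""
    remaining = {task: set(d) for task, d in deps.items()}
    done: set[str] = set()
    layers: list[set[str]] = []
    while remaining:
        # A subtask is ready once everything it depends on is done.
        ready = {t for t, d in remaining.items() if d <= done}
        if not ready:
            raise ValueError("cycle in dependency graph")
        layers.append(ready)
        done |= ready
        for t in ready:
            del remaining[t]
    return layers

# Breadth-first research: independent fetches, then one synthesis step.
research = {
    "fetch_a": set(), "fetch_b": set(), "fetch_c": set(),
    "synthesize": {"fetch_a", "fetch_b", "fetch_c"},
}
print(execution_layers(research))   # fan out, then join

# State-coupled asset generation: each step reads the previous output.
assets = {"style_guide": set(), "bird": {"style_guide"}, "pipes": {"bird"}}
print(execution_layers(assets))     # three layers: no safe parallelism
```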

The Three Hurdles of Modern Orchestration

Applying this old idea to probabilistic, natural-language systems presents three formidable hurdles.

LLMs Are Bad at Dependency Analysis: Ask a state-of-the-art model to identify which subtasks in a complex project can run in parallel. It will give you a plausible-sounding answer that is wrong often enough to be dangerous. Our internal tests show that without significant scaffolding, even top models fail on non-trivial dependency analysis over 50% of the time. While emerging methods like SELT (Self-Evaluation Tree Search) are tackling this, it remains a frontier problem.

Context-as-State is a Nightmare: Parallel agents need just enough shared context to maintain coherence (like the art style in the Flappy Bird example) but not so much that they interfere with each other. It's the classic shared memory problem from distributed computing, except the memory is unstructured natural language and the processors are non-deterministic. This has led to patterns like Graph-RAG, which uses knowledge graphs to provide a shared, structured understanding of the world for all agents.
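
One workable pattern, sketched below with hypothetical types, is to split context explicitly: a frozen global context that every worker receives verbatim, and a private per-worker scratchpad that nothing else reads. The names (GlobalContext, WorkerContext, build_prompt) are illustrative, not an established API.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class GlobalContext:
    """Shared constraints every worker sees; frozen so no worker mutates it."""
    style_guide: str
    constraints: tuple[str, ...]

@dataclass
class WorkerContext:
    """Private scratchpad: only its owner reads or writes it."""
    task: str
    notes: list[str] = field(default_factory=list)

def build_prompt(shared: GlobalContext, local: WorkerContext) -> str:
    # Restating the shared constraints in every prompt keeps parallel
    # workers coherent without letting them read each other's state.
    return "\n".join([
        f"Style guide: {shared.style_guide}",
        *(f"Constraint: {c}" for c in shared.constraints),
        f"Task: {local.task}",
        *(f"Note: {n}" for n in local.notes),
    ])

shared = GlobalContext("flat pastel pixel art", ("256x256 sprites",))
for worker in [WorkerContext("draw the bird"), WorkerContext("draw the pipes")]:
    print(build_prompt(shared, worker), "\n---")
```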

The Fragility of Static Plans: Agentic systems are prone to compounding errors. A single sub-agent failure or a misinterpreted instruction can derail an entire workflow. A static plan, whether parallel or sequential, cannot gracefully recover from the unexpected realities of execution.

The Solution: From Static Plans to Adaptive Orchestrators

The solution isn't a better agent, but a better orchestrator. The next breakthrough will come from systems with three key capabilities, drawing inspiration from decades of research in algorithmic search.

Execution Strategy Search: Today's agents have amnesia. An intelligent system should treat task execution as a search problem. For any complex request, there exists a vast "strategy space" of possible execution graphs (sequential, parallel, or hybrid). The orchestrator's first job is to find an optimal path through this space. Its "memory" is a learned heuristic that guides this search. It should remember: "For competitor analysis, a parallel-fetch-then-sequential-synthesis strategy has a 95% success rate and is 4x faster." This isn't just logging; it's building a heuristic function that maps task patterns to promising execution strategies, pruning the search space and dramatically accelerating planning.
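
As a rough illustration, the heuristic can be as simple as a smoothed success-rate table keyed by (task pattern, strategy). Everything below (the strategy names, the scoring rule) is an assumption for the sketch, not a description of any production system.

```python
import random

STRATEGIES = ["sequential", "parallel", "parallel_fetch_then_synthesize"]

# heuristic[(pattern, strategy)] = (successes, attempts)
heuristic: dict[tuple[str, str], tuple[int, int]] = {}

def score(pattern: str, strategy: str) -> float:
    s, n = heuristic.get((pattern, strategy), (0, 0))
    # Laplace smoothing: untried strategies score 0.5, so they still get tried.
    return (s + 1) / (n + 2)

def choose_strategy(pattern: str) -> str:
    # The learned scores prune the search: strong strategies win outright,
    # weak ones effectively drop out of consideration.
    return max(STRATEGIES, key=lambda st: score(pattern, st))

def record_outcome(pattern: str, strategy: str, success: bool) -> None:
    s, n = heuristic.get((pattern, strategy), (0, 0))
    heuristic[(pattern, strategy)] = (s + int(success), n + 1)

# After enough traces, "competitor analysis" maps to fan-out-then-synthesize.
random.seed(0)
for _ in range(50):
    record_outcome("competitor_analysis", "parallel_fetch_then_synthesize",
                   random.random() < 0.95)
    record_outcome("competitor_analysis", "sequential", random.random() < 0.80)
print(choose_strategy("competitor_analysis"))
```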

Meta-Learning to Refine the Heuristic: A system must perform a post-mortem on every complex task to improve its own planning model. Which decomposition decisions led to rework? Where did we assume independence but find a hidden coupling? This meta-learning loop turns execution traces into a better search heuristic. It's like a chess engine analyzing its past games to improve its evaluation function.
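
The post-mortem can also mine execution traces for hidden couplings directly. The sketch below assumes each subtask's reads and writes were logged at runtime (TraceEntry and its field names are invented): any pair we assumed independent that actually touched the same state gets a dependency edge in the next plan.

```python
from dataclasses import dataclass

@dataclass
class TraceEntry:
    task: str
    reads: set[str]
    writes: set[str]

def hidden_couplings(trace: list[TraceEntry],
                     assumed_parallel: set[frozenset[str]]):
    """Flag subtask pairs that ran in parallel but overlapped on state."""
    found = []
    for i, a in enumerate(trace):
        for b in trace[i + 1:]:
            if frozenset({a.task, b.task}) not in assumed_parallel:
                continue
            # A write on one side that the other side read or wrote.
            shared = (a.writes & (b.reads | b.writes)) | (b.writes & a.reads)
            if shared:
                found.append((a.task, b.task, shared))
    return found

trace = [
    TraceEntry("update_accounts", reads={"account"}, writes={"account"}),
    TraceEntry("update_contacts", reads={"account", "contact"},
               writes={"contact"}),
]
print(hidden_couplings(trace,
                       {frozenset({"update_accounts", "update_contacts"})}))
# The 'account' overlap means these two get serialized next time.
```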

Dynamic Runtime Adaptation: An agent system shouldn't follow its initial plan blindly. The initial plan is just the best guess from the strategy search. When parallel branches start returning conflicting data, the system should recognize the hidden dependency and serialize the workflow. When a sequential task reveals an opportunity for parallelization (e.g., a research step uncovers 10 independent reports to summarize), it should spin up new workers. The execution graph must reshape itself based on the reality of the task, not the assumptions of the plan.
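
Here is a minimal sketch of that reshaping, with stubbed-in agent functions (research_step and summarize stand in for real model calls): the plan starts sequential, and the orchestrator fans out only when a step actually surfaces independent work.

```python
from concurrent.futures import ThreadPoolExecutor

def research_step(topic: str) -> list[str]:
    # Stub: pretend the agent discovered ten independent reports.
    return [f"{topic} report {i}" for i in range(10)]

def summarize(report: str) -> str:
    return f"summary of {report}"  # stub for a model call

def run(topic: str) -> str:
    reports = research_step(topic)        # sequential: nothing to fan out yet
    if len(reports) > 1:                  # the plan reshapes itself here
        with ThreadPoolExecutor(max_workers=4) as pool:
            summaries = list(pool.map(summarize, reports))
    else:
        summaries = [summarize(r) for r in reports]
    return "\n".join(summaries)           # the join/synthesis stays sequential

print(run("market"))
```

The reverse move, serializing when parallel branches return conflicting data, would reuse the state-overlap check from the post-mortem sketch above, applied mid-flight instead of after the fact.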

Adaptive Orchestration in Practice: The Riloworks Approach

This is what we've found works in practice across a wide range of knowledge-work tasks. When a user asks Rilo to "process this prospect list," it doesn't default to a fixed strategy. It analyzes the task's structure to inform its initial strategy search.

  • Enriching company data from disparate sources? That's a "read-only" task with no cross-dependencies. Rilo identifies this as a parallelizable graph for maximum speed.
  • Updating Salesforce records with complex dependencies between accounts, contacts, and opportunities? That's a "read-write" task where order matters. Rilo prunes parallel paths from its search and executes sequentially to ensure data integrity.

Crucially, the system learns from each execution, refining its internal heuristics to remember which decomposition strategies lead to the fastest, most reliable outcomes for a given task type, constantly improving its own orchestration logic.

The Practitioner's Guide: What Works Today

Forget the agent-count debate. Focus on the task's underlying structure.

If you're building AI systems:

  • Decompose First, Architect Second: Use cheaper models to analyze task structure and build a dependency graph before spawning agents; a sketch follows this list. Default to sequential unless parallelism is obvious and safe.
  • Track What Works: Log every decomposition and its outcome. Build domain-specific heuristics that guide future planning.
  • Manage Context Explicitly: Differentiate between global context (shared constraints for all agents) and local context (task-specific information), with clear protocols for handoffs.
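
A sketch of the decompose-first step, assuming a small planner model behind a cheap_llm placeholder (the prompt, the JSON shape, and the function names are all illustrative): ask for a dependency graph, validate it, and fall back to a single sequential task when anything looks off.

```python
import json

def cheap_llm(prompt: str) -> str:
    raise NotImplementedError  # wire up your small planner model here

PLAN_PROMPT = """Decompose the task into subtasks. Return JSON:
{"subtasks": {"name": ["names of subtasks it depends on"]}}
Task: %s"""

def plan(task: str) -> dict[str, list[str]]:
    try:
        graph = json.loads(cheap_llm(PLAN_PROMPT % task))["subtasks"]
        # Reject dangling edges: models mis-state dependencies often
        # enough that an unvalidated plan is dangerous.
        if all(set(deps) <= set(graph) for deps in graph.values()):
            return graph
    except Exception:
        pass
    # Safe default: one sequential task, no parallelism assumed.
    return {task: []}
```

Paired with a layering pass like the execution_layers sketch earlier, the validated graph becomes an actual schedule rather than a suggestion.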

If you're using AI systems:

  • Structure Requests to Expose Parallelism: Instead of "Analyze my competitors," say "For competitors A, B, and C, find their pricing, recent funding, and key hires."
  • Specify Dependencies Explicitly: "First, find the top 5 companies in this space, then for each one, analyze their latest earnings report."

The Real Question

Once we stop arguing about agent count, we can ask better questions:

  • How do we teach models to reliably recognize task dependencies?
  • Can we build "task compilers" that search a strategy space to optimize execution plans for LLM workloads?
  • How do we build systems that improve their own planning heuristics with experience?

These questions matter because they are about advancing capability, not defending a dogma. The single-vs-multi-agent debate is a distraction. The real game is building systems smart enough to figure out for themselves when to parallelize and when to focus. We solved this for CPUs. Now it's time to solve it for LLMs.