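Learning Agent Development from 0 to 1, Part 3: Divide and Conquer with Subagents

v3: The Subagent Mechanism

~450 lines of code, +1 tool, divide and conquer.

v2 added planning. But on a large task such as "explore the codebase, then refactor authentication", a single agent runs into the context limit: exploration dumps 20 files into the history, and by the time refactoring starts the agent has lost focus.

v3 adds the Task tool, which spawns subagents with isolated context.

The Problem

Context pollution in a single agent: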

Main agent history:
  [exploring...] cat file1.py -> 500 lines
  [exploring...] cat file2.py -> 300 lines
  ... 15 more files ...
  [now refactoring...] "Wait, what was in file1 again?"

The solution: delegate exploration to a subagent

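Main agent history:
  [Task: explore codebase]
    -> Subagent explores 20 files
    -> Returns: "Auth is in src/auth/, DB is in src/models/"
  [now refactoring with a clean context]

Agent Type Registry

Each agent type defines its capabilities: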

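AGENT_TYPES = {
    "explore": {
        "description": "Read-only, for searching and analysis",
        "tools": ["bash", "read_file"],  # no write tools
        "prompt": "Search and analyze. Do not modify. Return a concise summary."
    },
    "code": {
        "description": "Full agent, for implementation",
        "tools": "*",  # all tools
        "prompt": "Implement the changes efficiently."
    },
    "plan": {
        "description": "Planning and analysis",
        "tools": ["bash", "read_file"],  # read-only
        "prompt": "Analyze and output a numbered plan. Do not change files."
    }
}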
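As a rough illustration of how such a registry can be consumed, the sketch below renders it into the bullet list that the main model sees and checks which types may write. The `describe_agent_types` and `can_write` helpers and the `WRITE_TOOLS` tuple are hypothetical names introduced here; the full source further down uses its own `get_agent_descriptions()`.

```python
# A trimmed copy of the registry, enough for this sketch.
AGENT_TYPES = {
    "explore": {"description": "Read-only, for searching and analysis",
                "tools": ["bash", "read_file"]},
    "code": {"description": "Full agent, for implementation",
             "tools": "*"},
    "plan": {"description": "Planning and analysis",
             "tools": ["bash", "read_file"]},
}

# Assumption: these are the tools that count as "write" tools.
WRITE_TOOLS = ("write_file", "edit_file")


def describe_agent_types(registry):
    """Render one '- name: description' line per registered agent type."""
    return "\n".join(
        f"- {name}: {cfg['description']}" for name, cfg in registry.items()
    )


def can_write(registry, agent_type):
    """True if the agent type's whitelist includes a write tool ('*' = all)."""
    allowed = registry[agent_type]["tools"]
    return allowed == "*" or any(tool in allowed for tool in WRITE_TOOLS)


print(describe_agent_types(AGENT_TYPES))
print({name: can_write(AGENT_TYPES, name) for name in AGENT_TYPES})
```

Keeping the description, whitelist, and prompt in one table means adding a new agent type is a data change rather than a code change.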

The Task Tool

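{
    "name": "Task",
    "description": "Spawn a subagent for a focused subtask",
    "input_schema": {
        "type": "object",
        "properties": {
            "description": {"type": "string", "description": "Short task name (3-5 words)"},
            "prompt": {"type": "string", "description": "Detailed instructions"},
            "agent_type": {"type": "string", "enum": ["explore", "code", "plan"]}
        },
        "required": ["description", "prompt", "agent_type"]
    }
}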

The main agent calls Task → the subagent runs → a summary comes back.
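Before a Task call is dispatched, it can help to reject malformed input early. The following is a minimal sketch under that assumption; `validate_task_input` is a hypothetical helper, not part of the original script, which only checks the agent type inside `run_task`.

```python
# Assumption: these mirror the three required fields and the three
# registered agent types described in the text.
REQUIRED_FIELDS = ("description", "prompt", "agent_type")
AGENT_TYPE_NAMES = ("explore", "code", "plan")


def validate_task_input(args):
    """Return an error string for a bad Task call, or None if it looks fine."""
    missing = [f for f in REQUIRED_FIELDS if not str(args.get(f, "")).strip()]
    if missing:
        return f"Error: missing field(s): {', '.join(missing)}"
    if args["agent_type"] not in AGENT_TYPE_NAMES:
        return f"Error: Unknown agent type '{args['agent_type']}'"
    return None


ok_call = {
    "description": "explore codebase",
    "prompt": "Find all auth-related files and summarize them.",
    "agent_type": "explore",
}
print(validate_task_input(ok_call))
print(validate_task_input({"description": "x", "prompt": "y", "agent_type": "deploy"}))
```

Returning the error as a string lets it travel back to the model as an ordinary tool result, so the model can correct the call on its next turn.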

Subagent Execution

The core of the Task tool:

def run_task(description, prompt, agent_type):
    config = AGENT_TYPES[agent_type]

    # 1. Agent-specific system prompt
    sub_system = f"You are a {agent_type} subagent.\n{config['prompt']}"

    # 2. Filtered tools
    sub_tools = get_tools_for_agent(agent_type)

    # 3. Isolated history (key: no parent context)
    sub_messages = [{"role": "user", "content": prompt}]

    # 4. The same query loop
    while True:
        response = client.messages.create(
            model=MODEL, system=sub_system,
            messages=sub_messages, tools=sub_tools,
            max_tokens=8000,
        )
        if response.stop_reason != "tool_use":
            break
        # execute the tools, append the results...

    # 5. Return only the final text
    return extract_final_text(response)

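Key concepts:

| Concept              | Implementation                                      |
|----------------------|-----------------------------------------------------|
| Context isolation    | A fresh sub_messages list holding only the prompt   |
| Tool filtering       | get_tools_for_agent()                               |
| Specialized behavior | Agent-specific system prompt                        |
| Result abstraction   | Only the final text is returned                     |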
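To see the isolation concretely without calling a real model, the sketch below replays canned responses through the same loop. `ScriptedModel`, `tool_use`, `final_text`, and `run_subagent` are hypothetical stand-ins written for this illustration; they only imitate the `stop_reason` and `content` attributes that the loop reads.

```python
from types import SimpleNamespace


class ScriptedModel:
    """Hypothetical stand-in for the API client: replays canned responses."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.seen = []  # snapshot of the history passed on each call

    def create(self, messages):
        self.seen.append(list(messages))
        return self._responses.pop(0)


def tool_use(tool_id, name, tool_input):
    block = SimpleNamespace(type="tool_use", id=tool_id, name=name, input=tool_input)
    return SimpleNamespace(stop_reason="tool_use", content=[block])


def final_text(text):
    block = SimpleNamespace(type="text", text=text)
    return SimpleNamespace(stop_reason="end_turn", content=[block])


def run_subagent(model, prompt, execute_tool):
    sub_messages = [{"role": "user", "content": prompt}]  # fresh history
    while True:
        response = model.create(sub_messages)
        if response.stop_reason != "tool_use":
            break
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": execute_tool(b.name, b.input)}
            for b in response.content if b.type == "tool_use"
        ]
        sub_messages.append({"role": "assistant", "content": response.content})
        sub_messages.append({"role": "user", "content": results})
    # Only the final text leaves the subagent.
    return next((b.text for b in response.content if b.type == "text"), "(no text)")


parent_history = [{"role": "user", "content": "Refactor auth to JWT"}]
model = ScriptedModel([
    tool_use("t1", "read_file", {"path": "src/auth/login.py"}),
    final_text("Auth lives in src/auth/"),
])
# The fake tool returns a 5000-character blob that never reaches the parent.
summary = run_subagent(model, "Find auth files", lambda name, args: "x" * 5000)
print(summary, len(parent_history))
```

The parent history is never passed in and never grows; the large tool output stays inside `sub_messages`, which is discarded when the function returns.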

Tool Filtering

def get_tools_for_agent(agent_type):
    allowed = AGENT_TYPES[agent_type]["tools"]
    if allowed == "*":
        return BASE_TOOLS  # Task is not included (no recursion in the demo)
    return [t for t in BASE_TOOLS if t["name"] in allowed]

  • explore: only bash + read_file
  • code: all tools
  • plan: only bash + read_file

Subagents never receive the Task tool, which prevents infinite recursion.
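The filter is easy to check in isolation. The sketch below uses name-only tool stubs (an assumption made for brevity; the real entries also carry a description and an input schema) and a hypothetical `tool_names` helper.

```python
# Name-only stubs standing in for the real tool definitions.
BASE_TOOLS = [
    {"name": name}
    for name in ("bash", "read_file", "write_file", "edit_file", "TodoWrite")
]
TASK_TOOL = {"name": "Task"}
ALL_TOOLS = BASE_TOOLS + [TASK_TOOL]  # only the main agent gets this list

AGENT_TYPES = {
    "explore": {"tools": ["bash", "read_file"]},
    "code": {"tools": "*"},
    "plan": {"tools": ["bash", "read_file"]},
}


def get_tools_for_agent(agent_type):
    allowed = AGENT_TYPES[agent_type]["tools"]
    if allowed == "*":
        return list(BASE_TOOLS)  # every base tool, but never Task
    return [t for t in BASE_TOOLS if t["name"] in allowed]


def tool_names(agent_type):
    """Hypothetical helper: just the names, for easy comparison."""
    return [t["name"] for t in get_tools_for_agent(agent_type)]


for agent_type in AGENT_TYPES:
    print(agent_type, tool_names(agent_type))
```

Because the filter always starts from `BASE_TOOLS` rather than `ALL_TOOLS`, no whitelist entry can hand Task to a subagent by accident.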

Progress Display

Subagent output does not pollute the main chat:

You: explore the codebase
> Task: explore codebase
  [explore] explore codebase ... 5 tools, 3.2s
  [explore] explore codebase - done (8 tools, 5.1s)

Here is what I found: ...

Live progress, clean final output.
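The single refreshing line can be produced with a carriage return. The sketch below separates formatting from writing so the format is easy to test; `progress_line` and `show_progress` are hypothetical helper names, while the line format follows the one used in the full source.

```python
import io
import sys


def progress_line(agent_type, description, tool_count, elapsed, done=False):
    """Format the one-line status shown while a subagent runs."""
    if done:
        return f"  [{agent_type}] {description} - done ({tool_count} tools, {elapsed:.1f}s)"
    return f"  [{agent_type}] {description} ... {tool_count} tools, {elapsed:.1f}s"


def show_progress(line, final=False, stream=sys.stdout):
    """Rewrite the current terminal line; add a newline only on the last update."""
    stream.write("\r" + line + ("\n" if final else ""))
    stream.flush()


# Write to an in-memory buffer so the raw characters can be inspected.
buffer = io.StringIO()
show_progress(progress_line("explore", "explore codebase", 5, 3.2), stream=buffer)
show_progress(progress_line("explore", "explore codebase", 8, 5.1, done=True),
              final=True, stream=buffer)
print(repr(buffer.getvalue()))
```

On a terminal the second write lands on top of the first, so only the final "done" line remains visible.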

Typical Flow

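User: "Refactor auth to use JWT"

Main agent:
  1. Task(explore): "Find all auth-related files"
     -> Subagent reads 10 files
     -> Returns: "Auth is in src/auth/login.py, sessions in..."

  2. Task(plan): "Design the JWT migration"
     -> Subagent analyzes the structure
     -> Returns: "1. Add the jwt library 2. Create token utilities..."

  3. Task(code): "Implement JWT tokens"
     -> Subagent writes code
     -> Returns: "Created jwt_utils.py, updated login.py"

  4. Summarize the changes

Each subagent gets a clean context. The main agent stays focused.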

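Comparison

| Aspect      | v2               | v3                         |
|-------------|------------------|----------------------------|
| Context     | Single, growing  | Isolated per task          |
| Exploration | Pollutes history | Contained in the subagent  |
| Parallelism | No               | Possible (not in the demo) |
| Code size   | ~300 lines       | ~450 lines                 |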
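The demo runs subagents one at a time. As a sketch of what the "possible" parallelism could look like, the code below fans independent tasks out over a thread pool. This is an assumption layered on top of the article, not something the script does; `run_tasks_in_parallel` and `fake_run_task` are hypothetical, and the fake stands in for the real `run_task` so the example needs no API key.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tasks_in_parallel(run_task, tasks, max_workers=4):
    """Run independent subagent tasks concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_task, t["description"], t["prompt"], t["agent_type"])
            for t in tasks
        ]
        return [f.result() for f in futures]


def fake_run_task(description, prompt, agent_type):
    """Stand-in for run_task(): returns a summary without calling a model."""
    return f"[{agent_type}] {description}: summary"


tasks = [
    {"description": "find auth files", "prompt": "List auth modules.", "agent_type": "explore"},
    {"description": "find db models", "prompt": "List DB models.", "agent_type": "explore"},
]
summaries = run_tasks_in_parallel(fake_run_task, tasks)
print(summaries)
```

Threads suit this workload because each subagent spends most of its time waiting on network calls. Two caveats if the real `run_task` were plugged in: its carriage-return progress lines would interleave on the terminal, and concurrent code subagents could edit the same files, so read-only types are the safer candidates.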

The Pattern

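Complex task
  └─ Main agent (coordinator)
       ├─ Subagent A (explore) -> summary
       ├─ Subagent B (plan) -> plan
       └─ Subagent C (code) -> result

Same agent loop, different contexts. That is the whole trick.

Divide and conquer. Context isolation.

Full Source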

#!/usr/bin/env python3
"""
v3_subagent.py - Mini Claude Code: Subagent Mechanism (~450 lines)

Core Philosophy: "Divide and Conquer with Context Isolation"
=============================================================
v2 adds planning. But for large tasks like "explore the codebase then
refactor auth", a single agent hits problems:

The Problem - Context Pollution:
-------------------------------
Single-Agent History:
[exploring...] cat file1.py -> 500 lines
[exploring...] cat file2.py -> 300 lines
... 15 more files ...
[now refactoring...] "Wait, what did file1 contain?"

The model's context fills with exploration details, leaving little room
for the actual task. This is "context pollution".

The Solution - Subagents with Isolated Context:
----------------------------------------------
Main Agent History:
[Task: explore codebase]
-> Subagent explores 20 files (in its own context)
-> Returns ONLY: "Auth in src/auth/, DB in src/models/"
[now refactoring with clean context]

Each subagent has:
1. Its own fresh message history
2. Filtered tools (explore can't write)
3. Specialized system prompt
4. Returns only final summary to parent

The Key Insight:
---------------
Process isolation = Context isolation

By spawning subtasks, we get:
- Clean context for the main agent
- Parallel exploration possible
- Natural task decomposition
- Same agent loop, different contexts

Agent Type Registry:
-------------------
| Type | Tools | Purpose |
|---------|---------------------|---------------------------- |
| explore | bash, read_file | Read-only exploration |
| code | all tools | Full implementation access |
| plan | bash, read_file | Design without modifying |

Typical Flow:
-------------
User: "Refactor auth to use JWT"

Main Agent:
1. Task(explore): "Find all auth-related files"
-> Subagent reads 10 files
-> Returns: "Auth in src/auth/login.py..."

2. Task(plan): "Design JWT migration"
-> Subagent analyzes structure
-> Returns: "1. Add jwt lib 2. Create utils..."

3. Task(code): "Implement JWT tokens"
-> Subagent writes code
-> Returns: "Created jwt_utils.py, updated login.py"

4. Summarize changes to user

Usage:
python v3_subagent.py
"""

import os
import subprocess
import sys
import time
from pathlib import Path

try:
    from anthropic import Anthropic
    from dotenv import load_dotenv
except ImportError:
    # Exit immediately if a dependency is missing. This script needs:
    # - anthropic: to call the model and its tools
    # - python-dotenv: to read API_KEY / BASE_URL etc. from a local .env file
    sys.exit("Please install: pip install anthropic python-dotenv")

load_dotenv()


# =============================================================================
# Configuration
# =============================================================================

API_KEY = os.getenv("ANTHROPIC_API_KEY")
BASE_URL = os.getenv("ANTHROPIC_BASE_URL")
MODEL = os.getenv("MODEL_NAME", "claude-sonnet-4-20250514")
WORKDIR = Path.cwd()

# Initialize the Anthropic client:
# - BASE_URL empty: use the official service
# - BASE_URL set: go through an Anthropic-API-compatible gateway or proxy
#   (for example, a corporate intranet relay)
client = Anthropic(api_key=API_KEY, base_url=BASE_URL) if BASE_URL else Anthropic(api_key=API_KEY)


# =============================================================================
# Agent Type Registry - The core of subagent mechanism
# =============================================================================

# Subagent type registry: the core abstraction of v3.
# The table maps the intent of a subtask to:
# - a whitelist of available tools (read-only / everything)
# - the subagent's system prompt (steers output style and boundaries)
# - a purpose description (tells the main model which subagents exist)
#
# Key point: a subagent's capability boundary is enforced by tool filtering,
# not left to the model's self-restraint.
AGENT_TYPES = {
    # Explore: Read-only agent for searching and analyzing
    # Cannot modify files - safe for broad exploration
    "explore": {
        "description": "Read-only agent for exploring code, finding files, searching",
        "tools": ["bash", "read_file"],  # No write access
        "prompt": "You are an exploration agent. Search and analyze, but never modify files. Return a concise summary.",
    },

    # Code: Full-powered agent for implementation
    # Has all tools - use for actual coding work
    "code": {
        "description": "Full agent for implementing features and fixing bugs",
        "tools": "*",  # All tools
        "prompt": "You are a coding agent. Implement the requested changes efficiently.",
    },

    # Plan: Analysis agent for design work
    # Read-only, focused on producing plans and strategies
    "plan": {
        "description": "Planning agent for designing implementation strategies",
        "tools": ["bash", "read_file"],  # Read-only
        "prompt": "You are a planning agent. Analyze the codebase and output a numbered implementation plan. Do NOT make changes.",
    },
}


def get_agent_descriptions() -> str:
    """
    Build the agent type description string shown in the system prompt
    and in the Task tool documentation.

    Purpose:
    - Tell the main model which agent_type values it can choose from
    - Say what each type is for
    """
    return "\n".join(
        f"- {name}: {cfg['description']}"
        for name, cfg in AGENT_TYPES.items()
    )


# =============================================================================
# TodoManager (from v2, unchanged)
# =============================================================================

class TodoManager:
    """
    Task list manager (introduced in v2, reused in v3).

    Notes:
    - TodoWrite externalizes the plan from the model's head, so that both
      people and the model can track progress.
    - Only minimal constraints are kept here: at most one item in_progress
      at a time, and at most 20 items.
    """

    def __init__(self):
        self.items = []

    def update(self, items: list) -> str:
        validated = []
        in_progress = 0

        for i, item in enumerate(items):
            content = str(item.get("content", "")).strip()
            status = str(item.get("status", "pending")).lower()
            active = str(item.get("activeForm", "")).strip()

            if not content or not active:
                raise ValueError(f"Item {i}: content and activeForm required")
            if status not in ("pending", "in_progress", "completed"):
                raise ValueError(f"Item {i}: invalid status")
            if status == "in_progress":
                in_progress += 1

            validated.append({
                "content": content,
                "status": status,
                "activeForm": active
            })

        if in_progress > 1:
            raise ValueError("Only one task can be in_progress")

        self.items = validated[:20]
        return self.render()

    def render(self) -> str:
        if not self.items:
            return "No todos."
        lines = []
        for t in self.items:
            mark = "[x]" if t["status"] == "completed" else \
                   "[>]" if t["status"] == "in_progress" else "[ ]"
            lines.append(f"{mark} {t['content']}")
        done = sum(1 for t in self.items if t["status"] == "completed")
        return "\n".join(lines) + f"\n({done}/{len(self.items)} done)"


TODO = TodoManager()


# =============================================================================
# System Prompt
# =============================================================================

SYSTEM = f"""You are a coding agent at {WORKDIR}.

Loop: plan -> act with tools -> report.

You can spawn subagents for complex subtasks:
{get_agent_descriptions()}

Rules:
- Use Task tool for subtasks that need focused exploration or implementation
- Use TodoWrite to track multi-step work
- Prefer tools over prose. Act, don't just explain.
- After finishing, summarize what changed."""


# =============================================================================
# Base Tool Definitions
# =============================================================================

# BASE_TOOLS is the base tool set shared by every agent (main and sub).
# What v3 adds is not in this list: TASK_TOOL is defined separately and is
# given only to the main agent.
BASE_TOOLS = [
    {
        "name": "bash",
        "description": "Run shell command.",
        "input_schema": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
    {
        "name": "read_file",
        "description": "Read file contents.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "limit": {"type": "integer"}
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Write to file.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "content": {"type": "string"}
            },
            "required": ["path", "content"],
        },
    },
    {
        "name": "edit_file",
        "description": "Replace text in file.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "old_text": {"type": "string"},
                "new_text": {"type": "string"},
            },
            "required": ["path", "old_text", "new_text"],
        },
    },
    {
        "name": "TodoWrite",
        "description": "Update task list.",
        "input_schema": {
            "type": "object",
            "properties": {
                "items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "content": {"type": "string"},
                            "status": {
                                "type": "string",
                                "enum": ["pending", "in_progress", "completed"]
                            },
                            "activeForm": {"type": "string"},
                        },
                        "required": ["content", "status", "activeForm"],
                    },
                }
            },
            "required": ["items"],
        },
    },
]


# =============================================================================
# Task Tool - The core addition in v3
# =============================================================================

# The Task tool is the external interface of the subagent mechanism:
# - the model calls Task(...) to trigger one subagent run
# - the subagent performs its tool calls in its own separate messages list
# - the subagent returns only a summary text to the parent, which reduces
#   context pollution
#
# The agent_type enum is written into the schema so the model can only pick
# a registered type.
TASK_TOOL = {
    "name": "Task",
    "description": f"""Spawn a subagent for a focused subtask.

Subagents run in ISOLATED context - they don't see parent's history.
Use this to keep the main conversation clean.

Agent types:
{get_agent_descriptions()}

Example uses:
- Task(explore): "Find all files using the auth module"
- Task(plan): "Design a migration strategy for the database"
- Task(code): "Implement the user registration form"
""",
    "input_schema": {
        "type": "object",
        "properties": {
            "description": {
                "type": "string",
                "description": "Short task name (3-5 words) for progress display"
            },
            "prompt": {
                "type": "string",
                "description": "Detailed instructions for the subagent"
            },
            "agent_type": {
                "type": "string",
                "enum": list(AGENT_TYPES.keys()),
                "description": "Type of agent to spawn"
            },
        },
        "required": ["description", "prompt", "agent_type"],
    },
}

# Main agent gets all tools including Task
ALL_TOOLS = BASE_TOOLS + [TASK_TOOL]


def get_tools_for_agent(agent_type: str) -> list:
    """
    Filter tools based on agent type.

    Each agent type has a whitelist of allowed tools.
    '*' means all tools (but subagents don't get Task to prevent infinite recursion).

    Notes:
    - This is where capabilities are trimmed: each agent_type sees a
      different set of tools.
    - Even the code subagent with tools="*" does not get the Task tool in
      this demo. The goal is to avoid a chain of subagents spawning
      subagents (a real system could allow it, but needs extra controls).
    """
    allowed = AGENT_TYPES.get(agent_type, {}).get("tools", "*")

    if allowed == "*":
        return BASE_TOOLS  # All base tools, but NOT Task (no recursion in demo)

    return [t for t in BASE_TOOLS if t["name"] in allowed]


# =============================================================================
# Tool Implementations
# =============================================================================

def safe_path(p: str) -> Path:
    """Ensure path stays within workspace."""
    # File-system sandbox: resolve the relative path to an absolute one,
    # then check that it is still inside WORKDIR.
    path = (WORKDIR / p).resolve()
    if not path.is_relative_to(WORKDIR):
        raise ValueError(f"Path escapes workspace: {p}")
    return path


def run_bash(cmd: str) -> str:
    """Execute shell command with safety checks."""
    # This is the most dangerous tool: it runs shell commands written by the model.
    # A very crude blacklist plus output truncation lowers the risk and
    # limits context growth.
    if any(d in cmd for d in ["rm -rf /", "sudo", "shutdown"]):
        return "Error: Dangerous command"
    try:
        r = subprocess.run(
            cmd, shell=True, cwd=WORKDIR,
            capture_output=True, text=True, timeout=60
        )
        return ((r.stdout + r.stderr).strip() or "(no output)")[:50000]
    except Exception as e:
        return f"Error: {e}"


def run_read(path: str, limit: int = None) -> str:
    """Read file contents."""
    try:
        lines = safe_path(path).read_text().splitlines()
        if limit:
            lines = lines[:limit]
        return "\n".join(lines)[:50000]
    except Exception as e:
        return f"Error: {e}"


def run_write(path: str, content: str) -> str:
    """Write content to file."""
    try:
        fp = safe_path(path)
        fp.parent.mkdir(parents=True, exist_ok=True)
        fp.write_text(content)
        return f"Wrote {len(content)} bytes to {path}"
    except Exception as e:
        return f"Error: {e}"


def run_edit(path: str, old_text: str, new_text: str) -> str:
    """Replace exact text in file."""
    try:
        fp = safe_path(path)
        text = fp.read_text()
        if old_text not in text:
            return f"Error: Text not found in {path}"
        fp.write_text(text.replace(old_text, new_text, 1))
        return f"Edited {path}"
    except Exception as e:
        return f"Error: {e}"


def run_todo(items: list) -> str:
    """Update the todo list."""
    # Execution side of TodoWrite: update TODO.items and return the rendered text.
    try:
        return TODO.update(items)
    except Exception as e:
        return f"Error: {e}"


# =============================================================================
# Subagent Execution - The heart of v3
# =============================================================================

def run_task(description: str, prompt: str, agent_type: str) -> str:
    """
    Execute a subagent task with isolated context.

    This is the core of the subagent mechanism:

    1. Create isolated message history (KEY: no parent context!)
    2. Use agent-specific system prompt
    3. Filter available tools based on agent type
    4. Run the same query loop as main agent
    5. Return ONLY the final text (not intermediate details)

    The parent agent sees just the summary, keeping its context clean.

    Progress Display:
    ----------------
    While running, we show:
      [explore] find auth files ... 5 tools, 3.2s

    This gives visibility without polluting the main conversation.

    Why v3 can isolate context:
    - The subagent does not keep chatting on the same messages list; it
      starts a new sub_messages = [{"role": "user", "content": prompt}].
    - So it cannot see the parent's history, and the detailed output of its
      exploration never enters the parent's context.
    - The parent receives only the final summary text (one short block.text),
      which saves tokens for the real work (writing code, giving conclusions).
    """
    if agent_type not in AGENT_TYPES:
        return f"Error: Unknown agent type '{agent_type}'"

    config = AGENT_TYPES[agent_type]

    # Subagent system prompt: the type, the working directory, and the
    # constraints and task style specific to that type. For example, explore
    # stresses "read-only + summarize"; plan stresses "numbered plan, no changes".
    sub_system = f"""You are a {agent_type} subagent at {WORKDIR}.

{config["prompt"]}

Complete the task and return a clear, concise summary."""

    # Key: filter the tools by agent_type.
    # explore/plan get only bash + read_file; code gets all of BASE_TOOLS
    # (but not Task).
    sub_tools = get_tools_for_agent(agent_type)

    # ISOLATED message history - this is the key!
    # The subagent starts fresh, doesn't see parent's conversation
    # Note: prompt is the "user question" the subagent sees; the parent's
    # messages are not inherited.
    sub_messages = [{"role": "user", "content": prompt}]

    # Progress tracking
    print(f"  [{agent_type}] {description}")
    start = time.time()
    tool_count = 0

    # The subagent runs the same tool-call loop:
    # - when response.stop_reason == "tool_use", execute the tools and send
    #   back tool_result blocks
    # - otherwise stop and extract the final text as the summary for the parent
    #
    # The model's natural-language output is not printed to the main session;
    # only a one-line progress indicator is shown, to avoid polluting it.
    while True:
        response = client.messages.create(
            model=MODEL,
            system=sub_system,
            messages=sub_messages,
            tools=sub_tools,
            max_tokens=8000,
        )

        if response.stop_reason != "tool_use":
            break

        tool_calls = [b for b in response.content if b.type == "tool_use"]
        results = []

        for tc in tool_calls:
            tool_count += 1
            output = execute_tool(tc.name, tc.input)
            results.append({
                "type": "tool_result",
                "tool_use_id": tc.id,
                "content": output
            })

            # Update progress line (in-place)
            # \r overwrites the current line, giving a single refreshing
            # progress line: the user sees that the subtask is running, but
            # individual tool outputs are never printed.
            elapsed = time.time() - start
            sys.stdout.write(
                f"\r  [{agent_type}] {description} ... {tool_count} tools, {elapsed:.1f}s"
            )
            sys.stdout.flush()

        sub_messages.append({"role": "assistant", "content": response.content})
        sub_messages.append({"role": "user", "content": results})

    # Final progress update
    elapsed = time.time() - start
    sys.stdout.write(
        f"\r  [{agent_type}] {description} - done ({tool_count} tools, {elapsed:.1f}s)\n"
    )

    # Return only the final text block as the subagent's result.
    # The parent puts it into its own context as tool_result.content:
    # the summary travels, the process does not.
    for block in response.content:
        if hasattr(block, "text"):
            return block.text

    return "(subagent returned no text)"


def execute_tool(name: str, args: dict) -> str:
    """Dispatch tool call to implementation."""
    # Tool dispatcher: maps the model's tool_use.name to the function that runs it.
    # Task is new in v3; it calls run_task(...) to spawn a subagent and
    # returns that subagent's summary.
    if name == "bash":
        return run_bash(args["command"])
    if name == "read_file":
        return run_read(args["path"], args.get("limit"))
    if name == "write_file":
        return run_write(args["path"], args["content"])
    if name == "edit_file":
        return run_edit(args["path"], args["old_text"], args["new_text"])
    if name == "TodoWrite":
        return run_todo(args["items"])
    if name == "Task":
        return run_task(args["description"], args["prompt"], args["agent_type"])
    return f"Unknown tool: {name}"


# =============================================================================
# Main Agent Loop
# =============================================================================

def agent_loop(messages: list) -> list:
    """
    Main agent loop with subagent support.

    Same pattern as v1/v2, but now includes the Task tool.
    When model calls Task, it spawns a subagent with isolated context.

    How the main agent and subagents cooperate:
    - The main agent has ALL_TOOLS (including Task), so it can start subtasks.
    - Subagents have no Task tool (see get_tools_for_agent), so there is no
      infinite recursion.
    - The main agent stores a subagent's summary in messages as an ordinary
      tool_result, and the model continues the main task from that summary.
    """
    while True:
        response = client.messages.create(
            model=MODEL,
            system=SYSTEM,
            messages=messages,
            tools=ALL_TOOLS,
            max_tokens=8000,
        )

        tool_calls = []
        for block in response.content:
            if hasattr(block, "text"):
                print(block.text)
            if block.type == "tool_use":
                tool_calls.append(block)

        if response.stop_reason != "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            return messages

        results = []
        for tc in tool_calls:
            # Task tool has special display handling
            if tc.name == "Task":
                # The subagent prints its own progress line while running;
                # the main agent only prints a short title line here.
                print(f"\n> Task: {tc.input.get('description', 'subtask')}")
            else:
                print(f"\n> {tc.name}")

            output = execute_tool(tc.name, tc.input)

            # Don't print full Task output (it manages its own display)
            # Task output is the subagent's summary text. Printing a preview
            # as for ordinary tools would duplicate what the subagent's
            # progress and done lines already show.
            if tc.name != "Task":
                preview = output[:200] + "..." if len(output) > 200 else output
                print(f"  {preview}")

            results.append({
                "type": "tool_result",
                "tool_use_id": tc.id,
                "content": output
            })

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": results})


# =============================================================================
# Main REPL
# =============================================================================

def main():
    print(f"Mini Claude Code v3 (with Subagents) - {WORKDIR}")
    print(f"Agent types: {', '.join(AGENT_TYPES.keys())}")
    print("Type 'exit' to quit.\n")

    history = []

    while True:
        try:
            user_input = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break

        if not user_input or user_input.lower() in ("exit", "quit", "q"):
            break

        history.append({"role": "user", "content": user_input})

        try:
            agent_loop(history)
        except Exception as e:
            print(f"Error: {e}")

        print()


if __name__ == "__main__":
    main()
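The workspace check in safe_path can be tried on its own. The sketch below wraps the same resolve-then-compare logic in a hypothetical `make_safe_path` factory so it can be pointed at a temporary directory; unlike the listing, it also resolves the root itself, which is an assumption added here to cope with symlinked temp directories. `Path.is_relative_to` requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def make_safe_path(workdir):
    """Return a safe_path function bound to the given workspace root."""
    root = Path(workdir).resolve()

    def safe_path(p):
        # Resolve ".." segments and symlinks first, then compare with the root.
        path = (root / p).resolve()
        if not path.is_relative_to(root):
            raise ValueError(f"Path escapes workspace: {p}")
        return path

    return safe_path


with tempfile.TemporaryDirectory() as workdir:
    safe = make_safe_path(workdir)
    print(safe("src/auth/login.py"))      # stays inside the workspace
    for attempt in ("../outside.txt", "/etc/passwd"):
        try:
            safe(attempt)
        except ValueError as err:
            print(err)                    # both attempts are rejected
```

Joining an absolute path onto the root discards the root entirely, which is why the check has to run on the resolved result rather than on the input string.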

Source with Debug Printing of Intermediate Results

#!/usr/bin/env python3
"""
v3_subagent.py - Mini Claude Code: Subagent Mechanism (~450 lines)

Core Philosophy: "Divide and Conquer with Context Isolation"
=============================================================
v2 adds planning. But for large tasks like "explore the codebase then
refactor auth", a single agent hits problems:

The Problem - Context Pollution:
-------------------------------
Single-Agent History:
[exploring...] cat file1.py -> 500 lines
[exploring...] cat file2.py -> 300 lines
... 15 more files ...
[now refactoring...] "Wait, what did file1 contain?"

The model's context fills with exploration details, leaving little room
for the actual task. This is "context pollution".

The Solution - Subagents with Isolated Context:
----------------------------------------------
Main Agent History:
[Task: explore codebase]
-> Subagent explores 20 files (in its own context)
-> Returns ONLY: "Auth in src/auth/, DB in src/models/"
[now refactoring with clean context]

Each subagent has:
1. Its own fresh message history
2. Filtered tools (explore can't write)
3. Specialized system prompt
4. Returns only final summary to parent

The Key Insight:
---------------
Process isolation = Context isolation

By spawning subtasks, we get:
- Clean context for the main agent
- Parallel exploration possible
- Natural task decomposition
- Same agent loop, different contexts

Agent Type Registry:
-------------------
| Type | Tools | Purpose |
|---------|---------------------|---------------------------- |
| explore | bash, read_file | Read-only exploration |
| code | all tools | Full implementation access |
| plan | bash, read_file | Design without modifying |

Typical Flow:
-------------
User: "Refactor auth to use JWT"

Main Agent:
1. Task(explore): "Find all auth-related files"
-> Subagent reads 10 files
-> Returns: "Auth in src/auth/login.py..."

2. Task(plan): "Design JWT migration"
-> Subagent analyzes structure
-> Returns: "1. Add jwt lib 2. Create utils..."

3. Task(code): "Implement JWT tokens"
-> Subagent writes code
-> Returns: "Created jwt_utils.py, updated login.py"

4. Summarize changes to user

Usage:
python v3_subagent.py
"""

import os
import subprocess
import sys
import time
from pathlib import Path

try:
    from anthropic import Anthropic
    from dotenv import load_dotenv
except ImportError:
    sys.exit("Please install: pip install anthropic python-dotenv")

load_dotenv()


# =============================================================================
# Configuration
# =============================================================================

API_KEY = os.getenv("ANTHROPIC_API_KEY")
BASE_URL = os.getenv("ANTHROPIC_BASE_URL")
MODEL = os.getenv("MODEL_NAME", "claude-sonnet-4-20250514")
WORKDIR = Path.cwd()

client = Anthropic(api_key=API_KEY, base_url=BASE_URL) if BASE_URL else Anthropic(api_key=API_KEY)


def _debug_enabled():
    # Debug output is on when DEBUG_AGENT is set to a truthy value.
    return os.getenv("DEBUG_AGENT", "").strip().lower() in ("1", "true", "yes", "y", "on")


def _dbg(label, *values):
    # Print a labeled debug section to stderr, keeping stdout clean.
    if not _debug_enabled():
        return
    print(f"\n[debug] {label}", file=sys.stderr)
    for v in values:
        print(v, file=sys.stderr)
    print("[debug] ---", file=sys.stderr)


# =============================================================================
# Agent Type Registry - The core of subagent mechanism
# =============================================================================

AGENT_TYPES = {
    # Explore: Read-only agent for searching and analyzing
    # Cannot modify files - safe for broad exploration
    "explore": {
        "description": "Read-only agent for exploring code, finding files, searching",
        "tools": ["bash", "read_file"],  # No write access
        "prompt": "You are an exploration agent. Search and analyze, but never modify files. Return a concise summary.",
    },

    # Code: Full-powered agent for implementation
    # Has all tools - use for actual coding work
    "code": {
        "description": "Full agent for implementing features and fixing bugs",
        "tools": "*",  # All tools
        "prompt": "You are a coding agent. Implement the requested changes efficiently.",
    },

    # Plan: Analysis agent for design work
    # Read-only, focused on producing plans and strategies
    "plan": {
        "description": "Planning agent for designing implementation strategies",
        "tools": ["bash", "read_file"],  # Read-only
        "prompt": "You are a planning agent. Analyze the codebase and output a numbered implementation plan. Do NOT make changes.",
    },
}


def get_agent_descriptions() -> str:
    """Generate agent type descriptions for the Task tool."""
    return "\n".join(
        f"- {name}: {cfg['description']}"
        for name, cfg in AGENT_TYPES.items()
    )


# =============================================================================
# TodoManager (from v2, unchanged)
# =============================================================================

class TodoManager:
    """Task list manager with constraints. See v2 for details."""

    def __init__(self):
        self.items = []

    def update(self, items: list) -> str:
        validated = []
        in_progress = 0

        for i, item in enumerate(items):
            content = str(item.get("content", "")).strip()
            status = str(item.get("status", "pending")).lower()
            active = str(item.get("activeForm", "")).strip()

            if not content or not active:
                raise ValueError(f"Item {i}: content and activeForm required")
            if status not in ("pending", "in_progress", "completed"):
                raise ValueError(f"Item {i}: invalid status")
            if status == "in_progress":
                in_progress += 1

            validated.append({
                "content": content,
                "status": status,
                "activeForm": active
            })

        if in_progress > 1:
            raise ValueError("Only one task can be in_progress")

        self.items = validated[:20]
        return self.render()

    def render(self) -> str:
        if not self.items:
            return "No todos."
        lines = []
        for t in self.items:
            mark = "[x]" if t["status"] == "completed" else \
                   "[>]" if t["status"] == "in_progress" else "[ ]"
            lines.append(f"{mark} {t['content']}")
        done = sum(1 for t in self.items if t["status"] == "completed")
        return "\n".join(lines) + f"\n({done}/{len(self.items)} done)"


TODO = TodoManager()


# =============================================================================
# System Prompt
# =============================================================================

SYSTEM = f"""You are a coding agent at {WORKDIR}.

Loop: plan -> act with tools -> report.

You can spawn subagents for complex subtasks:
{get_agent_descriptions()}

Rules:
- Use Task tool for subtasks that need focused exploration or implementation
- Use TodoWrite to track multi-step work
- Prefer tools over prose. Act, don't just explain.
- After finishing, summarize what changed."""


# =============================================================================
# Base Tool Definitions
# =============================================================================

BASE_TOOLS = [
    {
        "name": "bash",
        "description": "Run shell command.",
        "input_schema": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
    {
        "name": "read_file",
        "description": "Read file contents.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "limit": {"type": "integer"}
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Write to file.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "content": {"type": "string"}
            },
            "required": ["path", "content"],
        },
    },
    {
        "name": "edit_file",
        "description": "Replace text in file.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "old_text": {"type": "string"},
                "new_text": {"type": "string"},
            },
            "required": ["path", "old_text", "new_text"],
        },
    },
    {
        "name": "TodoWrite",
        "description": "Update task list.",
        "input_schema": {
            "type": "object",
            "properties": {
                "items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "content": {"type": "string"},
                            "status": {
                                "type": "string",
                                "enum": ["pending", "in_progress", "completed"]
                            },
                            "activeForm": {"type": "string"},
                        },
                        "required": ["content", "status", "activeForm"],
                    },
                }
            },
            "required": ["items"],
        },
    },
]


# =============================================================================
# Task Tool - The core addition in v3
# =============================================================================

TASK_TOOL = {
    "name": "Task",
    "description": f"""Spawn a subagent for a focused subtask.

Subagents run in ISOLATED context - they don't see parent's history.
Use this to keep the main conversation clean.

Agent types:
{get_agent_descriptions()}

Example uses:
- Task(explore): "Find all files using the auth module"
- Task(plan): "Design a migration strategy for the database"
- Task(code): "Implement the user registration form"
""",
    "input_schema": {
        "type": "object",
        "properties": {
            "description": {
                "type": "string",
                "description": "Short task name (3-5 words) for progress display"
            },
            "prompt": {
                "type": "string",
                "description": "Detailed instructions for the subagent"
            },
            "agent_type": {
                "type": "string",
                "enum": list(AGENT_TYPES.keys()),
                "description": "Type of agent to spawn"
            },
        },
        "required": ["description", "prompt", "agent_type"],
    },
}

# Main agent gets all tools including Task
ALL_TOOLS = BASE_TOOLS + [TASK_TOOL]


def get_tools_for_agent(agent_type: str) -> list:
    """
    Filter tools based on agent type.

    Each agent type has a whitelist of allowed tools.
    '*' means all tools (but subagents don't get Task to prevent infinite recursion).
    """
    allowed = AGENT_TYPES.get(agent_type, {}).get("tools", "*")

    if allowed == "*":
        return BASE_TOOLS  # All base tools, but NOT Task (no recursion in demo)

    return [t for t in BASE_TOOLS if t["name"] in allowed]


# =============================================================================
# Tool Implementations
# =============================================================================

def safe_path(p: str) -> Path:
    """Ensure path stays within workspace."""
    path = (WORKDIR / p).resolve()
    if not path.is_relative_to(WORKDIR):
        raise ValueError(f"Path escapes workspace: {p}")
    return path


def run_bash(cmd: str) -> str:
    """Execute shell command with safety checks."""
    if any(d in cmd for d in ["rm -rf /", "sudo", "shutdown"]):
        return "Error: Dangerous command"
    try:
        r = subprocess.run(
            cmd, shell=True, cwd=WORKDIR,
            capture_output=True, text=True, timeout=60
        )
        return ((r.stdout + r.stderr).strip() or "(no output)")[:50000]
    except Exception as e:
        return f"Error: {e}"


def run_read(path: str, limit: int = None) -> str:
    """Read file contents."""
    try:
        lines = safe_path(path).read_text().splitlines()
        if limit:
            lines = lines[:limit]
        return "\n".join(lines)[:50000]
    except Exception as e:
        return f"Error: {e}"


def run_write(path: str, content: str) -> str:
    """Write content to file."""
    try:
        fp = safe_path(path)
        fp.parent.mkdir(parents=True, exist_ok=True)
        fp.write_text(content)
        return f"Wrote {len(content)} bytes to {path}"
    except Exception as e:
        return f"Error: {e}"


def run_edit(path: str, old_text: str, new_text: str) -> str:
    """Replace exact text in file."""
    try:
        fp = safe_path(path)
        text = fp.read_text()
        if old_text not in text:
            return f"Error: Text not found in {path}"
        fp.write_text(text.replace(old_text, new_text, 1))
        return f"Edited {path}"
    except Exception as e:
        return f"Error: {e}"


def run_todo(items: list) -> str:
    """Update the todo list."""
    try:
        return TODO.update(items)
    except Exception as e:
        return f"Error: {e}"


# =============================================================================
# Subagent Execution - The heart of v3
# =============================================================================

def run_task(description: str, prompt: str, agent_type: str) -> str:
    """
    Execute a subagent task with isolated context.

    This is the core of the subagent mechanism:

    1. Create isolated message history (KEY: no parent context!)
    2. Use agent-specific system prompt
    3. Filter available tools based on agent type
    4. Run the same query loop as main agent
    5. Return ONLY the final text (not intermediate details)

    The parent agent sees just the summary, keeping its context clean.

    Progress Display:
    ----------------
    While running, we show:
    [explore] find auth files ... 5 tools, 3.2s

    This gives visibility without polluting the main conversation.
    """
    if agent_type not in AGENT_TYPES:
        return f"Error: Unknown agent type '{agent_type}'"

    config = AGENT_TYPES[agent_type]

    # Agent-specific system prompt
    sub_system = f"""You are a {agent_type} subagent at {WORKDIR}.

{config["prompt"]}

Complete the task and return a clear, concise summary."""

    # Filtered tools for this agent type
    sub_tools = get_tools_for_agent(agent_type)

    # ISOLATED message history - this is the key!
    # The subagent starts fresh, doesn't see parent's conversation
    sub_messages = [{"role": "user", "content": prompt}]
    _dbg("sub_messages (start)", sub_messages)

    # Progress tracking
    print(f" [{agent_type}] {description}")
    start = time.time()
    tool_count = 0

    # Run the same agent loop (silently - don't print to main chat)
    while True:
        _dbg("sub_messages (loop start)", sub_messages)
        response = client.messages.create(
            model=MODEL,
            system=sub_system,
            messages=sub_messages,
            tools=sub_tools,
            max_tokens=8000,
        )
        _dbg("sub_response.stop_reason", getattr(response, "stop_reason", None))
        _dbg("sub_response.content", getattr(response, "content", None))

        if response.stop_reason != "tool_use":
            break

        tool_calls = [b for b in response.content if b.type == "tool_use"]
        _dbg("sub_tool_calls", tool_calls)
        results = []

        for tc in tool_calls:
            tool_count += 1
            _dbg("sub_tool_use", {"id": tc.id, "name": tc.name, "input": tc.input})
            output = execute_tool(tc.name, tc.input)
            _dbg("sub_tool_output", output)
            results.append({
                "type": "tool_result",
                "tool_use_id": tc.id,
                "content": output
            })
            _dbg("sub_results (after append)", results)

            # Update progress line (in-place)
            elapsed = time.time() - start
            sys.stdout.write(
                f"\r [{agent_type}] {description} ... {tool_count} tools, {elapsed:.1f}s"
            )
            sys.stdout.flush()

        sub_messages.append({"role": "assistant", "content": response.content})
        sub_messages.append({"role": "user", "content": results})
        _dbg("sub_messages (after tool results append)", sub_messages)

    # Final progress update
    elapsed = time.time() - start
    sys.stdout.write(
        f"\r [{agent_type}] {description} - done ({tool_count} tools, {elapsed:.1f}s)\n"
    )

    # Extract and return only the final text
    # This is what the parent agent sees - a clean summary
    for block in response.content:
        if hasattr(block, "text"):
            return block.text

    return "(subagent returned no text)"


def execute_tool(name: str, args: dict) -> str:
    """Dispatch tool call to implementation."""
    if name == "bash":
        return run_bash(args["command"])
    if name == "read_file":
        return run_read(args["path"], args.get("limit"))
    if name == "write_file":
        return run_write(args["path"], args["content"])
    if name == "edit_file":
        return run_edit(args["path"], args["old_text"], args["new_text"])
    if name == "TodoWrite":
        return run_todo(args["items"])
    if name == "Task":
        return run_task(args["description"], args["prompt"], args["agent_type"])
    return f"Unknown tool: {name}"


# =============================================================================
# Main Agent Loop
# =============================================================================

def agent_loop(messages: list) -> list:
    """
    Main agent loop with subagent support.

    Same pattern as v1/v2, but now includes the Task tool.
    When model calls Task, it spawns a subagent with isolated context.
    """
    while True:
        _dbg("messages (loop start)", messages)
        response = client.messages.create(
            model=MODEL,
            system=SYSTEM,
            messages=messages,
            tools=ALL_TOOLS,
            max_tokens=8000,
        )
        _dbg("response.stop_reason", getattr(response, "stop_reason", None))
        _dbg("response.content", getattr(response, "content", None))

        tool_calls = []
        for block in response.content:
            if hasattr(block, "text"):
                print(block.text)
            if block.type == "tool_use":
                tool_calls.append(block)
        _dbg("tool_calls", tool_calls)

        if response.stop_reason != "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            _dbg("messages (after assistant append)", messages)
            return messages

        results = []
        for tc in tool_calls:
            # Task tool has special display handling
            if tc.name == "Task":
                print(f"\n> Task: {tc.input.get('description', 'subtask')}")
            else:
                print(f"\n> {tc.name}")

            output = execute_tool(tc.name, tc.input)
            _dbg("tool_use", {"id": tc.id, "name": tc.name, "input": tc.input})
            _dbg("tool_output", output)

            # Don't print full Task output (it manages its own display)
            if tc.name != "Task":
                preview = output[:200] + "..." if len(output) > 200 else output
                print(f" {preview}")

            results.append({
                "type": "tool_result",
                "tool_use_id": tc.id,
                "content": output
            })
            _dbg("results (after append)", results)

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": results})
        _dbg("messages (after tool results append)", messages)


# =============================================================================
# Main REPL
# =============================================================================

def main():
    print(f"Mini Claude Code v3 (with Subagents) - {WORKDIR}")
    print(f"Agent types: {', '.join(AGENT_TYPES.keys())}")
    print("Type 'exit' to quit.\n")

    history = []

    while True:
        try:
            user_input = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break

        if not user_input or user_input.lower() in ("exit", "quit", "q"):
            break

        history.append({"role": "user", "content": user_input})
        _dbg("history (after user append)", history)

        try:
            agent_loop(history)
            _dbg("history (after agent_loop)", history)
        except Exception as e:
            print(f"Error: {e}")

        print()


if __name__ == "__main__":
    main()
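
The isolation that `run_task` relies on can be checked without calling a real model. The following is a minimal offline sketch, not part of the listing: `FakeClient`, `run_subagent`, and the trimmed-down `AGENT_TYPES` / `BASE_TOOLS` are hypothetical stand-ins introduced only for illustration, and the fake client simply requests one tool call before answering with text. It is meant to show that the subagent's first request contains only its own prompt, and that the parent history grows by a single summary entry no matter how many tool calls the subagent makes.

```python
from types import SimpleNamespace

# Hypothetical, trimmed-down copies of the listing's registries.
AGENT_TYPES = {
    "explore": {"tools": ["bash", "read_file"]},
    "code": {"tools": "*"},
}
BASE_TOOLS = [{"name": n} for n in ("bash", "read_file", "write_file", "edit_file")]


def get_tools_for_agent(agent_type):
    """Same filtering rule as the listing: '*' means every base tool."""
    allowed = AGENT_TYPES[agent_type]["tools"]
    if allowed == "*":
        return BASE_TOOLS
    return [t for t in BASE_TOOLS if t["name"] in allowed]


class FakeClient:
    """Stand-in for the model API (an assumption, not the real SDK).

    It asks for one tool call, then replies with plain text.
    """

    def __init__(self):
        self.seen = []  # snapshot of the messages passed on each call

    def create(self, messages, tools):
        self.seen.append(list(messages))
        if len(messages) == 1:
            block = SimpleNamespace(type="tool_use", id="t1", name="read_file",
                                    input={"path": "auth.py"})
            return SimpleNamespace(stop_reason="tool_use", content=[block])
        block = SimpleNamespace(type="text", text="Auth lives in auth.py")
        return SimpleNamespace(stop_reason="end_turn", content=[block])


def run_subagent(client, prompt, agent_type):
    """Run the loop on a fresh history and return only the final text."""
    sub_messages = [{"role": "user", "content": prompt}]  # no parent context
    tools = get_tools_for_agent(agent_type)
    while True:
        response = client.create(messages=sub_messages, tools=tools)
        if response.stop_reason != "tool_use":
            break
        # Tool execution is stubbed out; only the message flow matters here.
        results = [
            {"type": "tool_result", "tool_use_id": b.id, "content": "(stub output)"}
            for b in response.content if b.type == "tool_use"
        ]
        sub_messages.append({"role": "assistant", "content": response.content})
        sub_messages.append({"role": "user", "content": results})
    return "".join(b.text for b in response.content if b.type == "text")


parent_history = [{"role": "user", "content": "Refactor auth"}]
client = FakeClient()
summary = run_subagent(client, "Find the auth module", "explore")
# In the listing the summary is wrapped in a tool_result; simplified here.
parent_history.append({"role": "user", "content": summary})

print(summary)
print(len(parent_history))
print([t["name"] for t in get_tools_for_agent("explore")])
```

The intermediate tool results stay in `sub_messages`, which is discarded when `run_subagent` returns, so the parent pays only for the summary string.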