You've read a dozen "AI agents are here!" posts this month. Anthropic shipped Managed Agents on April 8. Google launched ADK. OpenAI pushed their Agents SDK. Every pitch deck says "autonomous agents" like it's a magic spell. But you've never built the actual loop that powers all of them — the 50 lines of Python where a model calls a function, reads the result, and decides what to do next. Without that, every vendor demo looks equally impressive, and you can't tell a real feature from expensive gift-wrapping.
Why "Just Use a Platform" Fails You
The obvious move: pick a managed platform and let it handle everything. But here's the problem — you're outsourcing understanding. Anthropic's Managed Agents charges $0.08 per session-hour. OpenAI's Agents SDK has its own abstractions. Google ADK uses graph-based workflows. Each platform wraps the same core pattern in different opinions about memory, permissions, and billing — but if you don't know the pattern itself, you can't evaluate which opinions matter for YOUR use case. A 2024 MIT Sloan Management Review analysis of enterprise AI projects puts the pilot-to-production success rate for agentic AI at roughly 5%. The failure mode isn't the framework. It's teams who don't understand what's underneath.
So let's build the thing.
The Recipe: 50 Lines to X-Ray Vision
Step 0: Install the SDK
As of April 27, 2026, the latest Anthropic Python SDK — a toolkit that lets your Python code talk to Claude's brain — is v0.97.0. One command:
pip install anthropic
Set your API key — a secret password that identifies your account to Anthropic's servers:
export ANTHROPIC_API_KEY="sk-ant-..."
Step 1: Define Your Tools
Tools are functions Claude can ask you to run. You describe them using JSON Schema — a standardized format for saying "this function takes these inputs in these shapes." Claude reads these descriptions like a restaurant menu and picks what to order.
Critical insight: descriptions matter more than names. Claude decides when to call a tool based on what the description says. Vague description = unpredictable behavior.
/faion is the faion-network umbrella skill — it pulls relevant methodology from 12 specialist domains (dev, product, marketing, ops, ...) into the conversation, grounded in your current session.
/faion
What's the right way to design parameter schemas for an MCP tool that an
LLM will call autonomously? Focus on description quality, required fields,
and enum constraints.
Here's a real tool definition — a weather lookup and a calendar creator:
tools = [
{
"name": "get_weather",
"description": "Get current weather for a city. Use when the user asks about weather, temperature, or outdoor conditions.",
"input_schema": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name, e.g. 'New York'"},
"units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
},
"required": ["city"]
}
},
{
"name": "create_event",
"description": "Create a calendar event. Use when the user wants to schedule, book, or plan something.",
"input_schema": {
"type": "object",
"properties": {
"title": {"type": "string"},
"start": {"type": "string", "format": "date-time"},
"end": {"type": "string", "format": "date-time"}
},
"required": ["title", "start", "end"]
}
}
]
Notice: required controls behavior. Skip a required field and watch Claude hallucinate values. Over-require fields and valid calls get blocked.
Step 2: Build the Execution Layer
This is where YOUR code runs. Claude never executes anything itself — it sends a polite request like "please run get_weather with {city: 'Chicago'}", and your code does the actual work:
def run_tool(name, input_data):
if name == "get_weather":
# In production, you'd call a weather API here
return {"temp": 72, "condition": "sunny", "city": input_data["city"]}
if name == "create_event":
return {"event_id": "evt_456", "status": "created", "title": input_data["title"]}
return {"error": f"Unknown tool: {name}"}
Step 3: The Loop — Where Agency Happens
This is the part every platform wraps in thousands of lines of abstraction. The agentic loop — the cycle where Claude calls tools, reads results, and decides its next move — is roughly 20 lines:
import json
import anthropic
client = anthropic.Anthropic()
messages = [{"role": "user", "content": "What's the weather in Chicago? If it's nice, schedule a picnic tomorrow at noon."}]
response = client.messages.create(
model="claude-sonnet-4-20250514", max_tokens=1024,
tools=tools, messages=messages
)
while response.stop_reason == "tool_use":
# Claude may request MULTIPLE tools at once — process ALL of them
tool_results = []
for block in response.content:
if block.type == "tool_use":
result = run_tool(block.name, block.input)
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": json.dumps(result)
})
messages.append({"role": "assistant", "content": response.content})
messages.append({"role": "user", "content": tool_results})
response = client.messages.create(
model="claude-sonnet-4-20250514", max_tokens=1024,
tools=tools, messages=messages
)
print(response.content[0].text)
That prompt triggers TWO autonomous tool calls: first get_weather (to check Chicago), then create_event (to schedule the picnic) — and Claude chains them without you telling it to. This is the "aha" moment: the model reads the weather result, decides the weather is nice, and calls the next tool on its own.
Every agent platform is this loop plus opinions about memory, permissions, billing, and deployment layered on top.
Step 4: Error Handling — Don't Crash, Teach
When a tool fails, don't crash your loop. Send the error back with is_error: True so Claude can adapt — retry with different input, explain the problem, or try a different approach:
/faion
What error handling patterns work best in agentic tool-use loops —
specifically for recoverable errors vs fatal ones? When should the
agent retry vs bail out?
def run_tool_safe(name, input_data):
try:
return run_tool(name, input_data), False
except Exception as e:
return {"error": str(e)}, True
# Inside the loop, replace the tool_results construction:
result, is_err = run_tool_safe(block.name, block.input)
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": json.dumps(result),
"is_error": is_err
})
Claude reads the error and responds intelligently: "I tried to schedule the picnic but the calendar only allows events during business hours. Want me to try Saturday morning instead?"
Step 5: Add a Safety Fuse
A runaway loop burns tokens — tiny word-chunks that Claude processes, roughly ¾ of an English word each — until you kill it. Add a counter:
MAX_ITERATIONS = 10
iteration = 0
while response.stop_reason == "tool_use" and iteration < MAX_ITERATIONS:
iteration += 1
# ... same loop body ...
if iteration >= MAX_ITERATIONS:
print("Loop hit safety limit. Something's probably wrong.")
No managed platform will tell you this is the first thing to add. They handle it silently — and charge you for the privilege.
Gotchas That Bite in Production
1. The "first tool only" bug. Claude can return multiple tool_use blocks in one response — checking weather in three cities simultaneously, for instance. If you only process the first block (common in tutorials), the model gets confused by missing results. Symptom: Claude keeps repeating the same request. Fix: iterate ALL blocks, return ALL results in one message. The official tutorial calls this the Ring 2 → Ring 3 progression.
2. Context window bloat. The context window — how much text Claude can "see" at once, like working memory — fills up fast. Every tool call and result accumulates. A verbose command output can eat thousands of tokens in one turn. The Agent SDK handles compaction automatically; the raw SDK does not. Fix: summarize large outputs before feeding them back, or truncate to the relevant parts.
3. No sandboxing. Your run_tool function runs with YOUR permissions. If Claude asks to delete a file and your tool obeys, the file is gone. According to Momentic's analysis, their team learned this the hard way over 3 months of autonomous agent runs: "sandboxing isn't optional... things get out of hand faster than you think."
4. Missing required fields. If your schema says required: ["city"] but a user says "what's the weather?", Claude will hallucinate a city rather than ask. Fix: make the description explicit — "If the user doesn't specify a city, ask them instead of guessing."
What You Can Do Now
You have a working agentic loop. Every vendor announcement from here maps directly to it. Managed Agents? That's your loop plus checkpointing, monitoring, and crash recovery — for $0.08/session-hour. OpenAI's Agents SDK vs Google ADK? Same loop, different handoff and graph-based opinions.
The next platform pitch lands on your desk. Instead of trusting the demo, you ask: which layer of MY loop does this replace — and is that layer actually my bottleneck?





