How to Build Your First AI Agent in One Evening

You use ChatGPT every day. Paste in some text, get an answer, feel productive. But at every tech conference in 2026, one word keeps coming up: agents. Your PM says, "We need an agent." Your CTO says, "Agents are the future." LinkedIn is flooded with agent thinkpieces. And you are sitting there thinking: "Honestly, I have no idea what this thing actually is."

Here is the gap. A chatbot waits for you to type and then replies, like messaging a smart friend on WhatsApp. An AI agent is a different thing. An agent has a goal, picks its own tools, handles errors, and keeps working until the job is done or it decides the job cannot be done. Nobody holds its hand. It is the difference between asking someone a question and hiring someone to do the work.

Tonight, you close that gap. You will build a real agent in Python, from scratch, with no framework, that searches the web, analyzes information, makes decisions, and saves a research report to disk. By bedtime you will understand the exact pattern that powers Claude Code, Codex, Devin, and every other agent product charging ₹17,000/month. The pattern itself is about 30 lines.

What we're building

A Research Agent that:

Takes a topic from you
Searches the web for relevant information
Reads and analyzes what it finds
Writes a structured research summary
Saves the result to a file

This is not a toy demo. The same architecture is used in production agents: tool use, reasoning loops, structured output. The only difference between this and a "production agent" is error handling and scale.

Step 1: Set up the project (10 minutes)

mkdir research-agent && cd research-agent
python3 -m venv venv
source venv/bin/activate

pip install anthropic httpx

Two dependencies:

anthropic: the Python SDK (software development kit, a pre-built library for talking to Claude's API)
httpx: for sending web requests from Python

You also need an Anthropic API key, which is basically a password that lets your code talk to Claude. Get one from console.anthropic.com. New accounts get $5 in free credits, which is enough to run this agent hundreds of times.

export ANTHROPIC_API_KEY=sk-ant-...
touch agent.py
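
A missing key is a common first-run failure, so it can help to check for it before making any API call. The sketch below is one way to do that. `require_api_key` is a hypothetical helper, not part of the Anthropic SDK; it only reads the `ANTHROPIC_API_KEY` environment variable set by the export command above.

```python
import os


def require_api_key(env=None) -> str:
    """Return the API key from the environment, or raise a readable error.

    Hypothetical helper for this tutorial; `env` can be any dict-like object,
    which makes the function easy to try without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set. Run: export ANTHROPIC_API_KEY=..."
        )
    return key
```

Calling `require_api_key()` at the top of the entry point would turn a confusing authentication error into a one-line message.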

Step 2: Define the tools (15 minutes)

Without tools, an agent is just a chatbot. Tools are the functions that let an agent interact with the real world: search the web, read files, call APIs (APIs are how programs talk to each other over the internet; think machine-to-machine messaging).

We will give our agent two tools:

# agent.py

import anthropic
import httpx
import json
import os
from datetime import datetime

client = anthropic.Anthropic()
MODEL = "claude-haiku-4-5"  # model alias; check Anthropic's model list for current IDs

# Tool definitions: these tell Claude what is available
tools = [
    {
        "name": "web_search",
        "description": "Search the web for information on a topic. Returns results with titles, URLs, and snippets.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query"
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "save_file",
        "description": "Save text content to a file on disk.",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {
                    "type": "string",
                    "description": "Name of the file to save"
                },
                "content": {
                    "type": "string",
                    "description": "Content to write to the file"
                }
            },
            "required": ["filename", "content"]
        }
    }
]

These definitions work like a restaurant menu. Claude reads the descriptions and decides when to use which tool; you do not hard-code the order. The input_schema part uses JSON Schema, a standard format for describing data, so Claude knows exactly which parameters each tool needs. Yes, you use one data format to describe another data format. Welcome to programming.
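
To make the role of the `required` list concrete, here is a minimal sketch that checks a tool input against a schema before running the tool. `missing_required` is a hypothetical helper, and it looks only at the `required` key; it is not a full JSON Schema validator.

```python
def missing_required(schema: dict, tool_input: dict) -> list:
    """List the required parameters that are absent from a tool input."""
    return [name for name in schema.get("required", []) if name not in tool_input]


# Same shape as the save_file input_schema defined above.
save_file_schema = {
    "type": "object",
    "properties": {
        "filename": {"type": "string"},
        "content": {"type": "string"},
    },
    "required": ["filename", "content"],
}

# An input that forgot the "content" parameter.
print(missing_required(save_file_schema, {"filename": "notes.md"}))  # ['content']
```

A check like this lets the executor hand a clear "missing parameter" message back to the model instead of failing with a KeyError.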

Step 3: Implement the tools (15 minutes)

The definitions tell Claude what exists. Now we write the code that actually runs when Claude calls a tool. This is where theory meets practice, or more accurately, where your elegant abstractions meet HTML parsing like it's 2003:

def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Execute a tool and return the result as a string."""
    if tool_name == "web_search":
        return do_web_search(tool_input["query"])
    elif tool_name == "save_file":
        return do_save_file(tool_input["filename"], tool_input["content"])
    else:
        return f"Error: Unknown tool '{tool_name}'"


def do_web_search(query: str) -> str:
    """Search using DuckDuckGo's HTML endpoint. No API key needed."""
    try:
        response = httpx.get(
            "https://html.duckduckgo.com/html/",
            params={"q": query},
            headers={"User-Agent": "ResearchAgent/1.0"},
            timeout=10.0,
        )
        response.raise_for_status()

        text = response.text
        results = []
        parts = text.split('class="result__snippet"')
        for part in parts[1:6]:  # take up to 5 results
            snippet_end = part.find("</a>")
            if snippet_end > 0:
                snippet = part[:snippet_end]
                clean = snippet.replace("<b>", "").replace("</b>", "")
                clean = clean.split(">")[-1] if ">" in clean else clean
                if clean.strip():
                    results.append(clean.strip())

        if results:
            return "Search results:\n" + "\n".join(
                f"- {r}" for r in results
            )
        return f"Search completed but no clear results for: {query}"

    except Exception as e:
        return f"Search error: {str(e)}"


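def do_save_file(filename: str, content: str) -> str:
    """Save content to the output directory."""
    # Keep only the final path component so a model-supplied name such as
    # "../agent.py" cannot write outside output/.
    safe_name = os.path.basename(filename) or "report.md"
    filepath = os.path.join("output", safe_name)
    try:
        os.makedirs("output", exist_ok=True)
        # Explicit encoding so non-ASCII text saves the same way on every OS.
        with open(filepath, "w", encoding="utf-8") as f:
            f.write(content)
        return f"File saved successfully: {filepath}"
    except OSError as e:
        return f"Error saving file: {e}"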

The web search uses DuckDuckGo's HTML endpoint: no API key, no signup, no cost. The HTML parsing is a hack (we are scraping raw page markup instead of using a proper data feed) and it will break whenever that markup changes, but it gets the job done. For production, switch to the Brave Search API (2,000 free queries/month) or a self-hosted SearXNG.
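
If the string splitting feels too fragile, one possible alternative is the standard library's `html.parser`. The sketch below collects the text of elements carrying the `result__snippet` class, which is the same marker the code above searches for. `SnippetParser` and `extract_snippets` are hypothetical names, and the sample markup is invented for illustration, so treat this as a starting point rather than a tested scraper for the live page.

```python
from html.parser import HTMLParser


class SnippetParser(HTMLParser):
    """Collect the text inside elements whose class list includes a target class."""

    def __init__(self, target_class: str = "result__snippet", limit: int = 5):
        super().__init__(convert_charrefs=True)  # decodes entities such as &amp;
        self.target_class = target_class
        self.limit = limit
        self.snippets = []
        self._tag = None    # tag name of the snippet element currently being read
        self._depth = 0     # nesting depth of that same tag name
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._tag is None:
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes and len(self.snippets) < self.limit:
                self._tag, self._depth, self._buffer = tag, 1, []
        elif tag == self._tag:
            self._depth += 1

    def handle_endtag(self, tag):
        if self._tag is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace left over from the markup.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.snippets.append(text)
                self._tag = None

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)


def extract_snippets(page: str, limit: int = 5) -> list:
    """Return up to `limit` snippet strings found in an HTML page."""
    parser = SnippetParser(limit=limit)
    parser.feed(page)
    parser.close()
    return parser.snippets


# Invented sample markup, only to show the behaviour.
sample = (
    '<a class="result__snippet" href="#">Agents <b>use</b> tools &amp; loops</a>'
    '<div>ignored</div>'
    '<a class="result__snippet" href="#">Second result</a>'
)
print(extract_snippets(sample))
```

Inside do_web_search, `extract_snippets(response.text)` could replace the split-and-slice block; nested tags such as the bold markers and HTML entities are then handled by the parser.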

Step 4: Build the agent loop (20 minutes)

This is the heart of the whole thing. Every agent product with a flashy landing page and a $50M valuation is running some version of these 30 lines:

def run_agent(topic: str, max_turns: int = 10) -> str:
    """Run the research agent on a topic."""
    print(f"\n{'='*60}")
    print(f"Research Agent — Topic: {topic}")
    print(f"{'='*60}\n")

    system_prompt = """You are a research agent. Your job is to research a topic
thoroughly and produce a well-structured summary.

Your process:
1. Search for information on the topic (multiple searches with different angles)
2. Analyze what you find
3. Write a comprehensive research summary
4. Save the summary to a file

Be thorough — do at least 3 different searches to cover the topic well.
Be critical — evaluate sources and note conflicting information.
When done, save the final summary as a markdown file.

Current date: """ + datetime.now().strftime("%Y-%m-%d")

    messages = [
        {
            "role": "user",
            "content": f"Research this topic and produce a detailed summary: {topic}"
        }
    ]

    # The agent loop
    for turn in range(max_turns):
        print(f"--- Turn {turn + 1} ---")

        response = client.messages.create(
            model=MODEL,
            max_tokens=4096,
            system=system_prompt,
            tools=tools,
            messages=messages,
        )

        print(f"Stop reason: {response.stop_reason}")

        if response.stop_reason == "tool_use":
            tool_results = []

            for block in response.content:
                if block.type == "tool_use":
                    tool_name = block.name
                    tool_input = block.input
                    tool_id = block.id

                    print(f"  Tool: {tool_name}")
                    print(f"  Input: {json.dumps(tool_input)[:200]}")

                    result = execute_tool(tool_name, tool_input)
                    print(f"  Result: {result[:200]}...")

                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": tool_id,
                        "content": result,
                    })

            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})

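        elif response.stop_reason == "end_turn":
            final_text = ""
            for block in response.content:
                if hasattr(block, "text"):
                    final_text += block.text

            print(f"\nAgent completed in {turn + 1} turns.")
            return final_text

        else:
            # Any other stop reason (for example "max_tokens") adds nothing to
            # messages, so looping again would resend the identical request.
            return f"Agent stopped early (stop_reason: {response.stop_reason})."

    return "Agent did not complete within turn limit."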

Take this loop apart and understand it, because this is the whole trick:

Send: the task goes to Claude along with the tool definitions
Claude thinks: it decides whether to use a tool or return final text
If tool_use: we execute the tool and send the result back as a new message
Claude sees the result: it decides its next move
Repeat: until Claude says end_turn, meaning the work is done

Critical insight: you did not hard-code "first search, then analyze, then write." Claude figures out the workflow itself based on the task. That is what separates an agent from a script. A script follows your instructions. An agent follows its own.

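The stop_reason field is the key. Two values matter for this loop: "tool_use" (I need to call a tool) and "end_turn" (I'm done). The API can return other values too, such as "max_tokens" when a reply is cut off at the token limit, so anything else is best treated as a reason to stop rather than to loop again. Your loop just checks which one came back and acts accordingly.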
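
The mechanic is easier to see with the network taken out. The sketch below runs the same send, check, execute, repeat cycle against a scripted stand-in for the model. The dict shapes (`tool_calls`, `text`) and the names `run_loop`, `scripted_model`, and `fake_tool` are simplifications invented for this example; they are not the Anthropic SDK's response objects.

```python
def run_loop(call_model, execute_tool, task: str, max_turns: int = 10) -> str:
    """Generic agent loop: call the model, run requested tools, feed results back."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        reply = call_model(messages)
        if reply["stop_reason"] == "tool_use":
            results = [
                {
                    "tool_use_id": call["id"],
                    "content": execute_tool(call["name"], call["input"]),
                }
                for call in reply["tool_calls"]
            ]
            # Record what the model asked for, then what the tools returned.
            messages.append({"role": "assistant", "content": reply})
            messages.append({"role": "user", "content": results})
        else:
            return reply["text"]
    return "Agent did not complete within turn limit."


def scripted_model(messages):
    """Stand-in for the real API: ask for one search, then finish."""
    if len(messages) == 1:
        return {
            "stop_reason": "tool_use",
            "tool_calls": [
                {"id": "call_1", "name": "web_search", "input": {"query": "agents"}}
            ],
        }
    tool_output = messages[-1]["content"][0]["content"]
    return {"stop_reason": "end_turn", "text": f"Summary based on: {tool_output}"}


def fake_tool(name, tool_input):
    """Pretend tool so the example runs offline."""
    return f"{name} ran with {tool_input['query']}"


print(run_loop(scripted_model, fake_tool, "Research agents"))
# Summary based on: web_search ran with agents
```

Swapping `scripted_model` for a function that wraps the real API call gives back the agent from this step; the control flow does not change.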

Step 5: Add the entry point (5 minutes)

The boring part. But boring parts need to exist too, otherwise nothing runs, a lesson that half the AI demo repos on GitHub still have not learned:

if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        topic = " ".join(sys.argv[1:])
    else:
        topic = input("Enter research topic: ")

    result = run_agent(topic)

    print(f"\n{'='*60}")
    print("Research complete. Check the output/ directory.")
    print(f"{'='*60}")

Step 6: Run it (5 minutes)

python agent.py "current state of MCP protocol adoption in 2026"

Watch the terminal. The agent will think through the problem on its own:

============================================================
Research Agent — Topic: current state of MCP protocol adoption in 2026
============================================================

--- Turn 1 ---
Stop reason: tool_use
  Tool: web_search
  Input: {"query": "MCP model context protocol adoption 2026"}
  Result: Search results: - The MCP ecosystem has grown...

--- Turn 2 ---
Stop reason: tool_use
  Tool: web_search
  Input: {"query": "MCP servers enterprise production 2026"}
  Result: Search results: - Amazon Bedrock AgentCore...

--- Turn 3 ---
Stop reason: tool_use
  Tool: web_search
  Input: {"query": "MCP protocol limitations challenges 2026"}
  Result: Search results: - Stateful sessions fight with...

--- Turn 4 ---
Stop reason: tool_use
  Tool: save_file
  Input: {"filename": "mcp-research-2026.md", "content": "# MCP Protocol..."}
  Result: File saved successfully: output/mcp-research-2026.md

--- Turn 5 ---
Stop reason: end_turn

Agent completed in 5 turns.

Five turns. Three searches, one file save, one final summary. The system prompt asked for several searches from different angles, but nobody told it which queries to run, in what order, or when it had enough; it decided those itself. That is agency, not scripting.

Step 7: Make it smart (30 minutes)

The basic agent works. Now three upgrades that turn it from a demo into something you will actually keep using.

Memory between sessions

Right now every run starts from zero. Give the agent a simple memory: a JSON file (a structured text format that programs can easily read and write) that stores what was researched before:

from pathlib import Path

MEMORY_FILE = "memory.json"

def load_memory() -> list:
    """Load previous research topics and findings."""
    if Path(MEMORY_FILE).exists():
        with open(MEMORY_FILE) as f:
            return json.load(f)
    return []

def save_memory(topic: str, summary: str):
    """Save this research session to memory."""
    memory = load_memory()
    memory.append({
        "date": datetime.now().isoformat(),
        "topic": topic,
        "summary": summary[:500],
    })
    memory = memory[-20:]  # keep the last 20 entries
    with open(MEMORY_FILE, "w") as f:
        json.dump(memory, f, indent=2)

Inject the memory into the system prompt, the instruction text that shapes Claude's behavior. This goes inside run_agent, right after system_prompt is defined:

memory = load_memory()
if memory:
    memory_context = "\n\nPrevious research sessions:\n"
    for m in memory[-5:]:
        memory_context += f"- [{m['date'][:10]}] {m['topic']}: {m['summary'][:100]}...\n"
    system_prompt += memory_context

Now the agent knows what was researched before. It can reference past findings, avoid duplicate searches, and build on previous work. For there to be anything to load, call save_memory(topic, result) in the entry point after run_agent returns.
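
One practical wrinkle: a half-written or hand-edited memory file would crash load_memory at startup. The sketch below is a more forgiving loader, under the assumption that losing memory is preferable to refusing to run. `load_memory_safe` is a hypothetical name, and the entry keys it checks are the ones save_memory writes above.

```python
import json
from pathlib import Path


def load_memory_safe(path: str = "memory.json") -> list:
    """Load saved sessions, treating a missing or unreadable file as empty memory."""
    file = Path(path)
    if not file.exists():
        return []
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except (ValueError, OSError):
        # ValueError covers both invalid JSON and undecodable bytes.
        return []
    if not isinstance(data, list):
        return []
    # Keep only well-formed entries so the prompt-building code can index them safely.
    required = {"date", "topic", "summary"}
    return [entry for entry in data if isinstance(entry, dict) and required <= entry.keys()]
```

It has the same return type as load_memory, so it could be swapped in without touching the code that builds the prompt.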

Thinking tool

Here is a trick. Add a tool that literally does nothing:

tools.append({
    "name": "think",
    "description": "Use this tool to think through your approach before acting. Write out your reasoning and what you need to find out next.",
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {
                "type": "string",
                "description": "Your reasoning and plan"
            }
        },
        "required": ["thought"]
    }
})

In the tool executor, it just returns a confirmation (add this branch before the final else in execute_tool):

elif tool_name == "think":
    print(f"  Thinking: {tool_input['thought'][:300]}")
    return "Thought recorded. Continue with your plan."

Why add a tool that does nothing? Because it gives the agent a structured place to think before it acts. Without it, Claude tends to jump straight to tool calls. With it, Claude pauses, plans, then executes, and often gives noticeably better results. Anthropic has written about this "think" tool technique, and production agents rely on it.
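
With a third tool, the if/elif chain in execute_tool starts to grow. One possible alternative is a lookup table from tool name to handler function, sketched below. `HANDLERS`, `dispatch`, and `echo_search` are hypothetical names, and `echo_search` is a placeholder standing in for do_web_search so the example runs offline.

```python
def think(thought: str) -> str:
    # No side effects; the value is that the model writes its plan down.
    return "Thought recorded. Continue with your plan."


def echo_search(query: str) -> str:
    # Placeholder for do_web_search, used only to keep this sketch offline.
    return f"(pretend results for {query!r})"


# Tool name -> function. Adding a tool means adding one entry here.
HANDLERS = {
    "think": think,
    "web_search": echo_search,
}


def dispatch(tool_name: str, tool_input: dict) -> str:
    """Look up the handler for a tool and call it with the model's parameters."""
    handler = HANDLERS.get(tool_name)
    if handler is None:
        return f"Error: Unknown tool '{tool_name}'"
    try:
        return handler(**tool_input)
    except TypeError as exc:
        # Typically raised when the model sends missing or unexpected parameters.
        return f"Error: bad input for '{tool_name}': {exc}"
```

The parameter names of each handler need to match the property names in that tool's input_schema, since the input dict is unpacked as keyword arguments.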

Error recovery

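def execute_tool_safe(tool_name: str, tool_input: dict) -> str:
    """Execute a tool, retrying up to 3 times on transient failures."""
    # Retry only on the prefix do_web_search uses for failures. Matching the
    # word "error" anywhere would also retry good results that mention errors.
    retryable_prefixes = ("Search error:",)
    result = ""
    for attempt in range(3):
        try:
            result = execute_tool(tool_name, tool_input)
            if not result.startswith(retryable_prefixes):
                return result
            print(f"  Attempt {attempt + 1} failed: {result[:100]}")
        except Exception as e:
            result = f"Tool failed after 3 attempts: {e}"
            print(f"  Attempt {attempt + 1} raised: {e}")
    # All attempts failed; return the last error so Claude can react to it.
    return result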

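Web requests fail. APIs go down. Timeouts happen. A production agent retries before giving up. Three attempts, then hand the error message back to Claude so it can try a different approach: that is the baseline. To use it, call execute_tool_safe instead of execute_tool inside the agent loop.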
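
Retrying immediately often hits the same outage again, so a common refinement is to wait a little longer after each failure. The sketch below shows that idea in isolation. `retry_with_backoff` and its parameters are hypothetical, and the delays are illustrative values rather than recommendations.

```python
import time


def retry_with_backoff(func, attempts: int = 3, base_delay: float = 0.5, sleep=time.sleep):
    """Call func() up to `attempts` times, doubling the wait after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Waits of 0.5s, 1s, 2s, ... between attempts.
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"Failed after {attempts} attempts: {last_error}") from last_error


# Demo: a function that fails twice, then succeeds. Delays are recorded
# in a list instead of actually sleeping.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporarily down")
    return "ok"


delays = []
print(retry_with_backoff(flaky, sleep=delays.append), delays)  # ok [0.5, 1.0]
```

The `sleep` parameter exists only so the waiting can be swapped out, which keeps the demo instant.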

Final structure

research-agent/
├── agent.py          # ~150 lines Python
├── memory.json       # Auto-created, stores session history
├── output/           # Auto-created, stores research reports
│   └── *.md
└── requirements.txt  # anthropic, httpx

Under 200 lines. Two dependencies. Zero frameworks.

Why no framework?

You might be thinking: "Why not use LangChain or LlamaIndex?" (Both are popular Python frameworks that add pre-built abstractions on top of LLM calls.)

Because the agent loop above is 30 lines. LangChain would add 15 dependencies and three layers of abstraction for the same result.

Use a framework when:

You need 10+ tools with complex routing logic
You need conversation memory that scales across thousands of users
You need multiple agents coordinating on a single task
You have outgrown simple Python and want someone else's architecture

Skip the framework when:

You are building your first agent
Your agent has 2-5 tools
You want to understand every line that runs
"It works and I understand it" beats "it works and I trust the abstraction"

As of March 2026, the Anthropic SDK documentation shows the same bare-bones loop pattern we just built. The official recommendation is to start without a framework.

What you built tonight

Take inventory:

A working AI agent: takes a goal and pursues it autonomously
Tool use: the agent calls external systems (web search, file I/O)
A reasoning loop: Claude decides the next action based on results, not a hard-coded script
Memory: the agent remembers past sessions and builds on them
Error handling: retries before failing
Output persistence: results land in actual files on your disk

This is the same core architecture as Claude Code, Devin, OpenAI Codex, and every other agent product. They have better tools, more error handling, and larger context windows (how much text the model can "see" at once, like its working memory). But the loop is the one you just wrote.

Where to go next

You now understand the fundamental pattern. Everything else is engineering on top of it:

More tools: calculator, web scraper, database connector, code executor
Better memory: vector databases (systems that store text by meaning, not just keywords) for semantic search over past sessions
Parallel tool calls: run multiple searches at once instead of one after another
Multi-agent systems: a second agent that reviews the first agent's work, like code review
MCP integration: Model Context Protocol, a standard for connecting AI agents to external tools, like USB but for data sources
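
As a taste of the parallel tool calls item: when one response contains several tool_use blocks, the loop from Step 4 runs them one after another. A thread pool is one way to run them at the same time, sketched below. `run_searches_in_parallel` is a hypothetical helper, and the lambda is a stand-in for do_web_search so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def run_searches_in_parallel(search, queries, max_workers: int = 4) -> list:
    """Run search(query) for every query concurrently, preserving input order."""
    if not queries:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(queries))) as pool:
        # pool.map returns results in the order of `queries`, not completion order.
        return list(pool.map(search, queries))


results = run_searches_in_parallel(
    lambda q: f"results for {q}",  # stand-in for do_web_search
    ["mcp adoption", "mcp limitations"],
)
print(results)  # ['results for mcp adoption', 'results for mcp limitations']
```

Threads suit this case because the work is mostly waiting on the network. Each result still has to go back to the model paired with the tool_use_id of the call that produced it.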

You are not learning someone's framework that could be dead in six months. You are learning the pattern. The same pattern that worked in 2024 works in 2026 and will work in 2028, because the underlying mechanic (model decides → tool executes → result goes back) is how all agent systems work, whatever marketing calls it.

The "AI agent" industry wants you to believe that building agents takes a PhD and a $100M Series B. In reality there is one loop to understand: call the model, check whether it needs a tool, execute the tool, send the result back, repeat. That's it. Everything else is engineering on top of that loop, and you now know enough to do that engineering yourself.