हर Agent Platform usage-based billing करता है। किसी ने Kill Switch नहीं दिया।

तुमने एक autonomous agent deploy किया — एक AI जो बिना तुम्हारे बटन दबाए खुद टास्क चलाता रहता है — रात भर support tickets प्रोसेस करने के लिए। तुम सो गए। Agent काम करता रहा। मीटर चलता रहा। कोई देख नहीं रहा था।

यही सेटअप है। अब प्रॉब्लम सुनो: "autonomous" और "metered" का कॉम्बो बराबर है एक ऐसी credit line के जिसकी कोई लिमिट नहीं, और तुम्हारी finance टीम को तो किसी ने बताया भी नहीं।

हर Vendor में एक ही छेद है

एक ही हफ्ते में — 8 से 15 अप्रैल, 2026 — Anthropic और OpenAI दोनों ने production-grade agent runtimes लॉन्च कर दिए, ऐसे environments जहाँ AI agents अपने आप चलते हैं, consumption-based billing के साथ और per-session spending cap बिल्कुल ज़ीरो। Google के पास तो महीनों से यही gap था। तीन vendors, एक ही अंधा spot:

Anthropic ने 8 अप्रैल को Managed Agents लॉन्च किए — $0.08 प्रति session-hour प्लस token costs (tokens — वो word-chunks जो AI पढ़ता है, लगभग एक English शब्द का ¾)। 14 अप्रैल को Claude Code Routines आया daily run limits के साथ (Pro के लिए 5, Max के लिए 15, Teams के लिए 25) — लेकिन प्रति run कोई dollar ceiling नहीं।
OpenAI ने 15 अप्रैल को अपना Agents SDK अपडेट किया नई safety features के साथ। SDK में token counters तो हैं, लेकिन कोई max_cost_usd parameter नहीं। बस एक spending cap है — organization-wide monthly limit — एक नंबर जो सब users और products में शेयर होता है।
Google अपने Vertex AI Agent Engine की pricing रखता है — जो दिसंबर 2025 में GA हुआ और फरवरी 2026 से billing शुरू हुई — $0.0864 प्रति vCPU-hour (vCPU — cloud में एक virtual processor slice) और कोई session-level cutoff documented नहीं। ये बाकी दोनों से पहले से बिना spending guardrail के चल रहा है।

हर platform request rate cap करता है अपने infrastructure को बचाने के लिए। कोई भी spending cap नहीं करता तुम्हारा पैसा बचाने के लिए।

वो Structural Incentive जिसकी कोई बात नहीं करता

Usage-based billing में, एक अटका हुआ agent जो तीन घंटे एक ही failed API call retry करता रहे — वो उतना ही revenue generate करता है जितना एक productive agent। Native kill switch बनाना — यानी एक circuit breaker (ऐसा mechanism जो threshold hit होने पर execution अपने आप रोक दे) — मतलब अपना revenue खुद cap करना। Incentive का गणित बेरहम है।

ये theoretical नहीं है। DEV Community की 23 मार्च की एक रिपोर्ट में चार LangChain agents (LangChain — AI agent chains बनाने का एक popular framework) एक recursive feedback loop में फँस गए, 11 दिन तक। बिल: $47,000। Detection method: एक इंसान ने invoice पढ़ा। कोई alert नहीं। Invoice। हाथ से।

एक अलग RunCycles analysis — 18 मार्च में एक GPT-4o research agent retry loop में घुस गया — एक घंटे में 200+ calls, एक single run का $1,400।

DIY Tax

Workarounds हैं। Python में एक bare-minimum cost guardrail कुछ ऐसा दिखता है:

import time

class AgentBudget:
    def __init__(self, max_usd: float = 5.0, cost_per_1k_tokens: float = 0.005):
        self.max_usd = max_usd
        self.cost_per_1k = cost_per_1k_tokens
        self.total_tokens = 0

    def track(self, tokens_used: int):
        self.total_tokens += tokens_used
        spent = (self.total_tokens / 1000) * self.cost_per_1k
        if spent >= self.max_usd:
            raise RuntimeError(f"Budget exceeded: ${spent:.2f} >= ${self.max_usd}")
        return spent

budget = AgentBudget(max_usd=10.0)
# हर LLM call को wrap करो:
spent = budget.track(tokens_used=3200)

Third-party proxies जैसे Helicone और Portkey dashboards और virtual keys offer करते हैं budget limits के साथ। लेकिन हर workaround वही oversight layer add करता है जिसे autonomous agents ने खत्म करना था।

जैसा कि PYMNTS ने 15 अप्रैल को रिपोर्ट किया, Anthropic ने enterprise billing flat-rate से usage-based में बदल दी। Redress Compliance के co-founder Fredrik Filipsson का अनुमान है कि इससे "heavy users की cost दोगुनी या तिगुनी" हो जाएगी। और ज़्यादा usage-based billing, लेकिन per-session budget का कोई बटन अभी भी नहीं।

तुम्हारे लिए इसका मतलब क्या है

आज तुम जो भी autonomous agent deploy करते हो, वो एक ऐसा process है जिसके पास तुम्हारे billing account का root access है और कोई sudo equivalent नहीं। Architecture decision साफ़ है: कभी भी बिना अपने code में cost wrapper लगाए agent deploy मत करो। SDK में max_cost_usd आने का इंतज़ार मत करो — वो parameter उसी दिन ship होगा जिस दिन किसी का पाँच अंकों वाला invoice X पर viral होगा, उससे पहले नहीं।

Cloud billing horror story जो इस feature को force करेगी — वो hypothetical नहीं है। बस कब का सवाल है। सिर्फ ये decide होना बाकी है कि किसका credit card ये सबक फंड करेगा।

हर Agent Platform usage-based billing करता है। किसी ने Kill Switch नहीं दिया।

हर Vendor में एक ही छेद है

वो Structural Incentive जिसकी कोई बात नहीं करता

DIY Tax

तुम्हारे लिए इसका मतलब क्या है

Keep reading

Agent Paradox: कम Autonomy, ज़्यादा Value

तीन AI Memory Systems, इनके काम करने का कोई सबूत नहीं

आठ Sandboxes और वो Lock-In जिसके बारे में किसी ने नहीं बताया

तुम्हारे Agent का Permission Dialog एक Placebo है