The Agent Paradox: Less Autonomy, More Value

Every AI vendor sold you the same dream in April: autonomous agents, little digital employees that write code, close tickets, and answer customer emails while you just focus on "strategy." Anthropic, OpenAI, Google: all three shipped agent platforms within two weeks of each other. Your LinkedIn feed now looks like a job fair for robots.

But here is the part nobody mentions in the keynote. Your boss just asked on Slack when your team is deploying agents. The demos are spectacular: an agent goes from bug report to pull request in minutes, and the audience claps as if they had never seen a bash script. But what actually happens after the demo ends and the cameras switch off?

Three launches landed back to back. Anthropic shipped Managed Agents on April 8: cloud-hosted agent APIs (meaning your software can spin up and control agents remotely) at $0.08 per session hour. OpenAI updated its Agents SDK on April 15 with native sandbox execution, so agents run code in a sealed box where they cannot break anything outside it. And Google Cloud Next opened on April 22 with a keynote titled "The Agentic Cloud," showcasing ADK (Agent Development Kit), which launched this same month. Google made human-in-the-loop a first-class feature from day one: pause the agent mid-task, get approval from a human, then continue.

Early adopters jumped in fast. Rakuten deployed specialist agents across five departments (product, sales, marketing, finance, HR), each one live in under a week. Asana's CTO said features are shipping "dramatically faster." Notion plugged Claude directly into workspaces for parallel task handling. And Sentry? Sentry went all in: its agent goes from flagged bug to opened pull request with zero human intervention. Fully autonomous. The vendor dream come true.

But here comes the uncomfortable part. If you have followed the independent research this month (and this channel has cited it so often that regular readers probably know the numbers by heart), the pattern never changes. AI code ships 1.7 times more defects. PRs rose 20% while incidents rose 23.5%. Developers delete a fifth of AI code and heavily rewrite another 7%. Gartner predicts that 40% of agentic projects will die by 2027. More output, worse outcomes. In every single study.
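
To make the PR and incident figures concrete, here is a quick back-of-the-envelope check. It is only a sketch: it assumes both percentages were measured against the same baseline period, which the cited numbers do not confirm, and the variable names are purely illustrative.

```python
# Assumption: both growth figures share the same baseline period.
pr_growth = 0.20         # PRs rose 20%
incident_growth = 0.235  # incidents rose 23.5%

# Incidents per PR after the change, relative to before the change.
incidents_per_pr_ratio = (1 + incident_growth) / (1 + pr_growth)
relative_change_pct = (incidents_per_pr_ratio - 1) * 100

print(f"Incidents per PR changed by {relative_change_pct:.1f}%")
```

Under that assumption, incidents grew faster than PRs, so each individual PR became roughly 3% more likely to be tied to an incident, on top of there being more PRs overall.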

Andrej Karpathy said it back on April 3, before any of these three platforms shipped: "The industry is making too big a leap and pretending this is amazing, when it is not." Three weeks and three launches later, nobody has proven him wrong.

This creates a structural gap between marketing and reality. Vendors compete on maximum autonomy because it demos so well on stage. But the production data says the opposite: narrow scope beats broad capability. Read-heavy workflows (where agents analyze but rarely modify) beat write-heavy ones. Human checkpoints before any consequential action beat full autopilot. Even Sentry's "fully autonomous" success works because bug-triage-to-PR is an inherently constrained domain, not because autonomy wins on its own merits.

Google probably gets this. Its ADK ships with human-in-the-loop as the default path, not an afterthought. As SiliconANGLE's John Furrier wrote on April 20: "Features sit on top of platforms. Operating systems define the platform." The real competition is not over who builds the most autonomous agent, but over who builds the best control plane.

So when your boss asks about agents, don't forward the keynote clip. Ask one question about any platform: how easy is it to build a tightly constrained agent, with explicit scope boundaries, read-only defaults, and mandatory human approval before any consequential action? If the answer is "well, you can configure that..." then run. If it is the default, you may have found something useful.
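
What "constrained by default" could look like in practice is sketched below. This is not any vendor's API: `AgentPolicy`, `Action`, `authorize`, and the `approve` callback are all hypothetical names invented for illustration, and the path check is a deliberately naive prefix match.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy object: nothing is allowed unless stated here."""
    allowed_paths: tuple[str, ...] = ()   # explicit scope boundary, empty by default
    writable: bool = False                # read-only unless someone opts in
    consequential: frozenset[str] = frozenset({"write", "delete", "deploy"})


@dataclass(frozen=True)
class Action:
    kind: str    # e.g. "read", "write", "deploy"
    target: str  # e.g. a file path


def authorize(policy: AgentPolicy, action: Action,
              approve: Callable[[Action], bool]) -> bool:
    """Return True only if the action passes all three gates."""
    # Gate 1: scope. The target must sit under an explicitly allowed prefix.
    if not any(action.target.startswith(p) for p in policy.allowed_paths):
        return False
    # Gate 2: read-only default. Anything except "read" needs writable=True.
    if action.kind != "read" and not policy.writable:
        return False
    # Gate 3: a human must approve every consequential action.
    if action.kind in policy.consequential:
        return approve(action)
    return True


if __name__ == "__main__":
    policy = AgentPolicy(allowed_paths=("src/",))
    deny = lambda action: False  # stand-in for a human who says no
    print(authorize(policy, Action("read", "src/app.py"), deny))
    print(authorize(policy, Action("write", "src/app.py"), deny))
```

The point of the sketch is the direction of the defaults: an empty `AgentPolicy()` permits nothing, and every capability has to be switched on deliberately. A platform where you start from full access and subtract is the "well, you can configure that..." answer.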

The smartest agent platform will not win the war. The most controllable one will. And that inverts every vendor's current roadmap priority.
