तुम्हारे AI Agent के पास Backspace Key नहीं है

तुमने इस महीने autonomous agents deploy किए। वो pull requests बनाते हैं, project tickets update करते हैं, production servers पर configurations push करते हैं, और Slack notifications भेजते हैं — सब कुछ तब जब तुम सो रहे हो। Pitch ये है: boring काम delegate करो, सुबह उठो तो done list तैयार मिले।

प्रॉब्लम: agents गलतियाँ करते हैं। कभी-कभार नहीं — UC Berkeley की MAST study, जो March 2025 में publish हुई, ने सात state-of-the-art multi-agent systems में 41% से 86.7% failure rates measure किए। और chatbot hallucination जैसा नहीं है ये जहाँ तुम regenerate मार दो — agent की गलती एक merged commit है, एक बना हुआ Jira ticket है, एक भेजा हुआ message है। Real systems में real actions। तुम deployed config को "regenerate" नहीं कर सकते।

8 April से 17 April के बीच, तीनों major platforms ने autonomous runtimes ship किए। 8 April को Anthropic ने Managed Agents launch किए — sandboxing, state persistence, error recovery (मतलब: crash के बाद resume)। 14 April को Anthropic ने Routines add किए — उनके cloud पर run होने वाले agents, schedules या webhooks से trigger होते हैं। 15 April को OpenAI ने Agents SDK v0.14 release किया sandboxed execution और "snapshotting" — failures के बाद container state recovery — के साथ। 17 April को Google ने Agent Development Kit (ADK) ship किया session-level state management और multi-agent orchestration के साथ। तीन platforms, zero rollback primitives — ऐसे mechanisms जो agent के गलत काम पूरा करने के बाद उसे reverse कर सकें।

मैंने पिछले हफ्ते checkpoint gap के बारे में लिखा था — platforms crash recovery solve कर रहे हैं mid-run। ये easy problem है। तुम्हारा agent task के बीच में die हो गया, platform उसकी state restore करता है, agent फिर try करता है। ठीक है। लेकिन ये scenario कोई solve नहीं कर रहा: तुम्हारा agent successfully complete हुआ। End तक run हुआ, green checkmarks report किए, और result कचरा है। PR broken logic merge करता है। Jira ticket existing ticket को duplicate करता है। Notion page correct data को hallucinated data से overwrite करता है। Agent crash नहीं हुआ — उसने confidently गलत काम finish किया।

जब एक agent bad pull request merge करता है, Asana में duplicate tasks बनाता है, या broken Notion page push करता है, तो ये होता है: तुम — human — को manually हर action identify करना पड़ता है जो agent ने लिया, उसके downstream effects trace करने पड़ते हैं (क्या किसी दूसरे agent ने bad PR पर react किया? क्या कोई webhook fire हुआ?), और एक-एक करके reverse करना पड़ता है। ये cleanup linearly scale करता है actions की संख्या के साथ। ज्यादा autonomy मतलब ज्यादा सफाई।

Rollback natively exist क्यों नहीं करता? दो कारण। पहला, reversibility के लिए transactional semantics चाहिए — compensating actions, idempotency keys, action journals। Underlying tools — GitHub, Linear, Slack, Notion — ये primitives agents को expose नहीं करते। तुम्हारा agent platform "undo" नहीं बोलता क्योंकि जिन tools को वो call करता है वो भी "undo" नहीं बोलते। दूसरा — और ये वो part है जो कोई openly नहीं कहता — business incentive नहीं है। हर agent action एक billable API call है। हर re-run failed rollback के बाद एक और billable session है। Platform vendors append-only execution से profit कमाते हैं। Undo build करने का मतलब है customers को कम compute cycles use करने की वजह देना। कोई voluntarily ऐसा revenue model नहीं चुनता।

Enter backup vendors, खुशी-खुशी वो gap fill कर रहे हैं जो agent platforms भरने से मना कर रहे हैं। 14 April को Commvault ने AI Protect launch किया — literally "Ctrl+Z for rogue AI agents" के नाम से market किया। ये agent session का blast radius map करता है, agent-caused changes को human changes से isolate करता है, और selective reversal enable करता है। जैसा Commvault CTO Pranay Ahlawat ने कहा: "Agentic environments में, agents data, systems, और configurations में ऐसे state mutate करते हैं जो तेज़ी से compound होता है और trace करना मुश्किल है।" Irony इतनी thick है कि चाकू से काट सकते हो: तुम्हारा AI platform builder undo नहीं बनाएगा क्योंकि उसके margins hurt होते हैं; तुम्हारा backup vendor बनाएगा, क्योंकि तुम्हारे agent की incompetence उनका addressable market है। दो blind spots, एक extremely profitable disaster।

Agent productivity equation को update चाहिए। अगर 30% autonomous runs को भी manual reversal चाहिए — और reversal original task से ज्यादा time लेता है — तो उस workflow का net ROI negative हो जाता है। तुमने happy path पर 10 minutes बचाए और sad path cleanup में 40 minutes खर्च किए।

जो platform पहले agent.rollback(session_id) ship करेगा वो enterprise trust जीतेगा। इसलिए नहीं कि enterprises को ऐसे agents चाहिए जो कभी fail न हों — सब कुछ fail होता है — बल्कि इसलिए कि उन्हें ऐसे agents चाहिए जिनकी failures की cost उनकी successes की savings से कम हो। तब तक, हर agent platform append-only है: वो काम कर सकता है, लेकिन undo नहीं कर सकता। तुम्हारे autonomous assistant के पास backspace key नहीं है।

तुम्हारे AI Agent के पास Backspace Key नहीं है

Keep reading

Agent Paradox: कम Autonomy, ज़्यादा Value

तीन Agent Platforms, तीन अलग Species

Anthropic ने उन्हीं platforms के ऊपर अपना platform बना दिया जो उसे fund करते हैं। मकान मालिकों को अभी पता चला।

तुम्हारे Agent का Permission Dialog एक Placebo है