You install apps from the App Store. You pull packages from npm. You deploy cloud images from AWS Marketplace. Every piece of software you run passed someone's quality gate — code signatures, permission audits, CVE scans. You don't think about it because the system works. Mostly.

Now your team needs to deploy a pre-built AI agent — a program that doesn't just sit there waiting for clicks, but acts autonomously on your infrastructure, sends emails, queries databases, makes decisions. You open the marketplace listing and find: a vendor logo, a paragraph of marketing copy, and an "Install" button. That's it.

The Agentic Cloud Arrives

Google Cloud Next 2026 opened today in Las Vegas, and CEO Thomas Kurian had one word for the keynote: "The Agentic Cloud." Translation: Google wants agents everywhere, and they want you to deploy them from their storefront. Agent Garden — a curated agent sample collection with one-click deploy. An expanded AI Agents section in Cloud Marketplace with A2A compatibility filters. ADK hit v1.0 for Python. Cloud revenue: $17.7B last quarter, up 48%. Revenue backlog doubled to $240B. Google isn't pitching a vision — they're building the mall and printing the lease agreements.

But here's what nobody on stage mentioned: Google Cloud Marketplace's review process checks integration completeness and pricing model. Not what the agent actually does when it holds your credentials and faces an ambiguous instruction.

Static vs. Behavioral: The Verification Gap

App stores verify static properties — permissions, code signatures, known vulnerabilities (CVEs — publicly cataloged security bugs). That works when software waits for your input. Agents don't wait. They reason, plan, and execute. Verifying what software is (safe, signed, compliant) is a solved problem. Verifying what software does across unpredictable runtime conditions — that's a fundamentally different challenge.

As ReversingLabs observed on April 15: "While an LLM's actions may be auditable, the reasoning behind those actions can be unknowable." That's not a philosophical distinction. It means marketplace scanners can verify an agent's code is clean while remaining structurally unable to predict its runtime behavior.

The Damage Is Already Documented

This isn't theoretical. Back in late January, the ClawHavoc attack demonstrated exactly how the gap gets exploited. Between January 27 and February 5, attackers planted 1,184 malicious skills on ClawHub — roughly one in five packages in the ecosystem. A single author account uploaded 677 of them. Nine CVEs. Skills inherit the full permissions of whatever agent runs them — private data access, API keys, the works. The marketplace had no behavioral verification to catch any of it.

Manifold Security launched its Manifest platform on April 14 to address exactly this problem — indexing 238,000+ skills across agent registries with execution graph analysis, mapping what an agent actually does at runtime rather than what it declares in metadata. Microsoft shipped an Agent Governance Toolkit on April 2 with Ed25519 plugin signing and dynamic trust scoring. These are meaningful steps. But they're governance toolkits and independent platforms — not marketplace-wide certification standards baked into the "Install" button.

You Are the QA Department

Until scalable behavioral certification exists — a way to verify what an agent does, not just what it claims — every agent you install from any marketplace is an unaudited autonomous actor running under your identity, with your credentials, on your infrastructure. First-party agents from Google or Microsoft come with brand-reputation trust. But marketplace economics demand third-party listings, community agents, long-tail integrations. That's where app stores live or die. And that's where nobody's checking.

Remember when you didn't think about app store verification because the system just worked? For agents, that system doesn't exist yet. The vendor that builds it won't just win a product feature — it'll own the trust layer that sits above every competing agent runtime. Google, Anthropic, OpenAI — all of them will need someone to answer the question their marketplaces currently dodge: what does this agent actually do?