It's April 2026, and choosing an AI subscription looks a lot like choosing a phone plan. You pull up a spreadsheet, compare Claude, ChatGPT, Gemini — each one publishes benchmarks (standardized tests that measure how well an AI model performs), safety reports, and customer case studies. You read the numbers, compare the prices, pick. Rational process. Adult behavior.
Then there's xAI.
$300 for Vibes
On April 17, xAI quietly dropped Grok 4.3 Beta into the model selector on grok.com. No blog post. No model card — the technical spec sheet every other AI lab publishes to explain what a model can and can't do. No independent benchmarks. No press tour. Just an Elon Musk tweet and a price tag: $300 per month for the "SuperGrok Heavy" tier.
That's $100 more than ChatGPT Pro. $50 more than Google AI Ultra. $100 more than Claude Max. The most expensive consumer AI subscription in the industry — and the only one with zero independent evidence it deserves to be.
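To put those gaps in proportion, here is a minimal Python sketch. It assumes the competitor prices implied by the differences quoted above ($200, $250, and $200 per month); those figures are back-calculated, not independently checked, and the `premium` helper is just an illustrative name.

```python
# Monthly prices in USD. The $300 figure is the SuperGrok Heavy price stated
# in the article; the competitor prices are ASSUMED, derived only from the
# "$100 more / $50 more / $100 more" differences quoted in the text.
GROK_HEAVY = 300
competitors = {
    "ChatGPT Pro": GROK_HEAVY - 100,
    "Google AI Ultra": GROK_HEAVY - 50,
    "Claude Max": GROK_HEAVY - 100,
}


def premium(price, baseline):
    """Return (absolute premium, percent premium) of price over baseline."""
    diff = price - baseline
    return diff, 100 * diff / baseline


# Premium of SuperGrok Heavy over each competitor.
premiums = {name: premium(GROK_HEAVY, p) for name, p in competitors.items()}

for name, (diff, pct) in premiums.items():
    print(
        f"vs {name} (${competitors[name]}/mo): "
        f"+${diff}/mo, +{pct:.0f}%, +${diff * 12}/yr"
    )
```

Under those assumed prices, the premium works out to 50% over the $200 plans and 20% over the $250 plan, or $1,200 and $600 a year respectively.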
The Evidence Crater
The gap isn't a gap. It's a crater.
Anthropic publishes system cards for every Claude release. OpenAI ships benchmark disclosures with each GPT update. Google maintains public evaluation dashboards. xAI? Its last model card was for Grok 4, published on August 20, 2025. Since then, across Grok 4.1, 4.20, and now 4.3: nothing. No third-party evaluations from LMSYS or Hugging Face. No red-team reports (independent security audits where researchers deliberately try to break the model). As TechSifted noted on April 17, the launch "came with no official xAI blog post, no published model card, no third-party benchmarks, and no tier-1 outlet coverage."
What it did come with: native PDF generation, slide creation, and spreadsheet output — features Claude, Gemini, and ChatGPT shipped over a year ago. And still no persistent memory between sessions. As BuildFastWithAI's review put it on April 19: "At $300/month, its absence is genuinely hard to defend."
So What Are You Paying For?
Scale. Pure, unverified scale.
xAI's Colossus data center runs 555,000 NVIDIA GPUs, with over 700,000 active across all training runs. On April 8, EONMSK reported that xAI is training seven models simultaneously, including two at one trillion parameters (a parameter is one of the adjustable knobs inside a neural network that shapes how it responds). More parameters can mean more capability. Or they can mean a bigger electricity bill. Without benchmarks, you genuinely cannot tell which.
The timing makes it worse. Three days before the Grok 4.3 launch, on April 14, NBC News revealed that Apple had privately threatened to remove Grok from the App Store back in January over nonconsensual deepfakes generated by the model. The trust deficit isn't theoretical — it's documented, stamped, and filed with Congress.
The Defense (Because Fairness Matters)
The counter-arguments deserve air. The compute is real and unprecedented. Beta pricing naturally self-selects enthusiasts who accept risk. And Grok's deep integration with X (formerly Twitter) gives it real-time social data access that no competitor matches — if you need to analyze what's trending right now, Grok has a genuine edge.
These are legitimate advantages. But "trust us, we have a lot of GPUs" is not a procurement justification. It's a vibe.
What This Means for You
For anyone evaluating AI tools today — whether you're a solo developer, a team lead, or someone trying to expense this to the company — the lesson is straightforward: price does not signal quality when the evidence layer is missing. A model you can't benchmark is a model you can't budget for. Nobody in finance approves "it feels really fast" as a line item.
Two Religions of Pricing
The AI market now runs on two pricing philosophies. Evidence-based: here's what it scores, here's who uses it, here's what broke. And scale-based: look how big it is.
Only one of those survives a finance team asking "why do we need this?"
xAI chose the other one.



