When Your AI Chatbot Becomes Someone Else’s Free Compute: The Pepper Lesson

Jamie Taylor
Friday, 5, 2026

ℹ️ Note
In short, for readers without a technical background: Earlier this year, Chipotle’s customer-service chatbot was repurposed by members of the public as a free coding assistant; the bill went to Chipotle. The exploit has since been patched, but the technique is now public and is being applied to other retailers. If your organisation is planning to add an AI chatbot to its website or app, this post explains why that matters and what to ask before launch. The middle sections describe the technical mechanics; “The Governance Gap” and “Three Questions Before You Ship Public-Facing AI” are the practical takeaways.

A customer support chatbot for a burrito chain spent several days in March answering LeetCode questions for anyone who asked. This is a true story, and it ends with a lesson that every leader deploying public-facing Artificial Intelligence (AI) should sit with for a few minutes.

Chipotle’s customer support chatbot, called Pepper, went viral in March 2026 after users discovered it would happily solve programming problems on request. The fix has since been deployed; the proof-of-concept that built on the loophole has not gone away. The story is closed at Chipotle. The pattern is now part of the public toolkit, and a community project is actively soliciting reverse-engineering work against several more retailers.

This post is not about Chipotle. It is about the gap that made the exploit possible, and about a class of attack that customer-facing AI deployments are now structurally exposed to.

What Happened With Pepper

Pepper is built on IPsoft’s Amelia platform. It was deployed to help customers with their orders, locations, and the usual run of customer-support topics. What it was not designed for, and yet would cheerfully do, was answer general programming questions for anyone who happened to ask.

That detail made it onto social media on the 12th and 13th of March 2026. Within days, a developer named Gonzih had reverse-engineered the backend and released a small OpenAI-compatible proxy. The proxy ran locally, exposed an endpoint that looked like any other Large Language Model (LLM) Application Programming Interface (API), and needed no authentication beyond an anonymous session against Chipotle’s own infrastructure.

A few weeks later, someone forked the open-source OpenCode project, hardcoded Pepper as the default model, and shipped it as "chipotlai-max". The hardcoded API key in the configuration is "burrito-2026". The cost field reads "$0.00 (powered by Chipotle’s cloud budget)".

Chipotle patched the loophole. The repository’s status table now marks Pepper as patched, and lists Home Depot, Lowe’s, Target, Starbucks, Walmart, and McDonald’s as candidate targets for the same approach. The Hacker News thread is the most useful single page for running commentary; the project repository and the original proxy are linked for completeness.

Denial of Wallet

The traditional shape of a malicious traffic attack against a public service is the Distributed Denial of Service (DDoS): many machines flooding a target with enough requests to deny availability to legitimate users. Denial of Wallet (DoW) is the same family of attack against a different target. Instead of denying availability, it denies viability: the attacker burns through the target’s metered cost budget faster than the target can absorb it.

DoW is a known category in cloud security circles, where it has been discussed for years in the context of serverless functions and any other API where the bill scales with usage. Public-facing AI is the latest, and arguably the most exposed, surface for it. Every token consumed by a chatbot is a line item on a vendor invoice. Every off-domain query is paid for by someone, and that someone is the deployer, not the user.

The Chipotle case is a relatively benign worked example. The proxy pools its requests through a small number of anonymous sessions, capped at five concurrent sessions at the time of writing. That ceiling reads more like a meme than a serious attack. Strip the rate limit, scale the session pool, target a less-prepared retailer, and the same architecture becomes a cost-amplification attack against any customer-facing AI endpoint that has not thought about its exposure.
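To make the scaling concrete, here is a rough back-of-the-envelope sketch in Python. The function name, the request rate, the token count, and the price per million tokens are all illustrative assumptions, not figures from the Pepper incident; only the five-session cap comes from the proxy described above.

```python
def worst_case_daily_spend(
    concurrent_sessions: int,
    requests_per_session_per_minute: float,
    tokens_per_request: int,
    usd_per_million_tokens: float,
) -> float:
    """Upper-bound daily spend if every session runs flat out for 24 hours."""
    requests_per_day = concurrent_sessions * requests_per_session_per_minute * 60 * 24
    tokens_per_day = requests_per_day * tokens_per_request
    return tokens_per_day / 1_000_000 * usd_per_million_tokens


# Assumed inputs: 6 requests a minute, 2,000 tokens per request, $10 per million tokens.
capped = worst_case_daily_spend(5, 6, 2_000, 10.0)    # the five-session ceiling
scaled = worst_case_daily_spend(500, 6, 2_000, 10.0)  # same proxy, larger session pool

print(f"5 sessions:   ${capped:,.2f} per day")
print(f"500 sessions: ${scaled:,.2f} per day")
```

The spend is linear in every input, which is the point: nothing in the architecture changes between the meme and the attack except the size of the session pool.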

The patch closes one door. The pattern walks straight to the next.

The Governance Gap

There is a technical question and a strategic question inside this story, and they are worth separating.

The technical question is what should have stopped the Pepper exploit at the system level. A short list of controls would have caught it: scope enforcement at the system prompt and intent classifier, so that an off-domain query is refused rather than answered; per-session and per-IP (Internet Protocol) rate limits, with budget alarms that fire on token spend before they fire on the monthly bill; an off-topic query ratio tracked as a metric, so that the dashboard tells you the bot has been repurposed before Hacker News does; and authentication that does something, because anonymous session pools are the front door of every cost-amplification attack against a public API.
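Two of those controls, the per-session rate limit and the token budget alarm, can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not a description of how Pepper or the Amelia platform works: the SessionGuard class, its parameters, and the 80% alarm threshold are hypothetical, and a production version would need shared storage and a daily reset. A per-IP limit would follow the same shape with the client address as the key.

```python
import time
from collections import defaultdict, deque


class SessionGuard:
    """Sliding-window request limit per session, plus a global token budget alarm."""

    def __init__(self, max_requests, window_seconds, daily_token_budget, alarm_fraction=0.8):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.daily_token_budget = daily_token_budget
        self.alarm_fraction = alarm_fraction  # assumed threshold: alarm at 80% of budget
        self.tokens_used = 0
        self._hits = defaultdict(deque)  # session_id -> timestamps of recent requests

    def allow(self, session_id, now=None):
        """Return True if this session may make another request right now."""
        now = time.monotonic() if now is None else now
        hits = self._hits[session_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True

    def record_tokens(self, count):
        """Add to the running token total; return True once the alarm threshold is crossed."""
        self.tokens_used += count
        return self.tokens_used >= self.alarm_fraction * self.daily_token_budget


guard = SessionGuard(max_requests=3, window_seconds=60, daily_token_budget=10_000)
decisions = [guard.allow("anon-1", now=t) for t in (0, 1, 2, 3)]
early = guard.record_tokens(1_000)   # well under the threshold
late = guard.record_tokens(7_500)    # running total is now 8,500 of 10,000
print(decisions, early, late)
```

The alarm is driven by tokens rather than request counts because tokens are what the invoice measures; a handful of long conversations can cost more than many short ones.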

The strategic question is harder. Why was a deployment without these controls put live in the first place? It would be tempting to file it as a one-off lapse. The fact that a forkable, ship-ready proxy now exists, and that several more retailers are listed as next-target candidates, suggests this is not an outlier. It is a category.

This is the same family of mistake I argued against recently in the AI moonshot post. In that piece the failure mode was a workflow problem amplified by AI speed; McDonald’s drive-thru ordering made the operational fragility legible in public. Pepper made the financial fragility legible in public. Different surface, same root cause: AI deployed before the scope, the workflow, and the threat model had been thought through.

In my work helping organisations adopt AI tooling, including for a globally recognised logistics business, the deployments that survive contact with reality are the ones where someone sat down and answered the next three questions before launch.

ℹ️ Note
The patch is in for one retailer. The pattern is in the wild for everyone else. Governance is the cheaper place to spend.

Three Questions Before You Ship Public-Facing AI

If you operate, or are about to launch, a customer-facing chatbot, these are the questions worth being able to answer in writing before the next product review.

✅ What conversations is this bot allowed to have? Scope, in writing, that an auditor could check against the actual logs. “Customer support topics” is not a scope; “order status, store locations, menu queries, and dietary information, with explicit refusal for any other request” is a scope.

✅ What conversations is this bot actually having? Monitoring, with an off-topic query ratio tracked as a metric. If you cannot tell the difference between a customer asking about guacamole and a developer asking about graph traversal, the bot is already mis-scoped and you do not know it yet.

✅ What is the worst case if a sufficiently motivated user treats this bot as a general-purpose LLM endpoint? A threat model with a real number on the worst-case spend. If that number embarrasses you, the controls you have today are not the controls you need.
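The first two questions can be wired together in code. The Python sketch below assumes a written scope expressed as an allow-list and uses a toy keyword matcher as a stand-in for a real intent classifier; the topic names, keywords, refusal text, and ScopeMonitor class are all hypothetical. Keyword matching is easy to fool, so treat this as an illustration of the shape of the control rather than a recommended classifier.

```python
import re
from dataclasses import dataclass

# Hypothetical written scope, modelled on the example in the first question.
ALLOWED_TOPICS = {
    "order_status": {"order", "delivery", "refund"},
    "store_locations": {"store", "location", "hours", "nearest"},
    "menu": {"menu", "burrito", "guacamole", "bowl"},
    "dietary": {"vegan", "allergen", "gluten", "calories"},
}
REFUSAL = "Sorry, I can only help with orders, stores, the menu, and dietary questions."


def classify(message):
    """Toy keyword matcher standing in for a real intent classifier."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    for topic, keywords in ALLOWED_TOPICS.items():
        if words & keywords:
            return topic
    return None  # nothing in the written scope matched


@dataclass
class ScopeMonitor:
    total: int = 0
    off_topic: int = 0

    def handle(self, message):
        """Return the matched topic, or the refusal text for out-of-scope messages."""
        self.total += 1
        topic = classify(message)
        if topic is None:
            self.off_topic += 1
            return REFUSAL
        return topic

    @property
    def off_topic_ratio(self):
        return self.off_topic / self.total if self.total else 0.0


monitor = ScopeMonitor()
for msg in [
    "Where is my order?",
    "Is the guacamole vegan?",
    "Write a Python function to reverse a linked list",
    "Explain breadth-first graph traversal",
]:
    monitor.handle(msg)

print(f"off-topic ratio: {monitor.off_topic_ratio:.0%}")
```

The design choice worth copying is that refusal and measurement happen in the same place: every out-of-scope message is both declined and counted, so the ratio on the dashboard is derived from the same decision an auditor would check in the logs.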

If you cannot answer all three for the chatbot already in your product, that gap is the entire post.

Closing Thought

Pepper is patched. The proxy still exists. The fork still builds. The candidate-target list reads less like a prank and more like a reading list of retailers who have not yet had this happen to them.

The AI moonshot post argued that the biggest gains from AI adoption come from a discipline of small, deliberate wins rather than a single flagship bet. The same logic applies to defence. The biggest reductions in exposure for public-facing AI come from a discipline of small, deliberate governance moves: scope, monitoring, rate limits, refusal tests, budget alarms. None of those are glamorous. All of them are cheaper than the post-incident response, and considerably cheaper than the second incident.

If your organisation is planning a customer-facing AI deployment and these questions feel uncomfortable to answer, a consultation might be a useful starting point.