
I Staffed a Company With Only AI—It Was a Hilarious, Expensive Disaster

“They’d basically talked themselves to death.”

The buzz around artificial intelligence is impossible to ignore. Visionaries like OpenAI CEO Sam Altman predict the rise of the “one-person billion-dollar company,” an enterprise run by a single human directing a tireless workforce of digital minds. The dream is an automated future, but how close is this vision to reality?

To find out, journalist Evan Ratliff conducted a fascinating experiment for his podcast “Shell Game.” He created a fictional tech startup, HurumoAI, and staffed it entirely with AI agents. As the company’s sole human, Ratliff set out to see whether an AI team could actually build a product and run a business.

The results were a chaotic and revealing look into the current state of AI. Far from the seamless efficiency promised by the hype, the experiment served as a powerful reminder of the vast gap between AI’s potential and its practical application—a phenomenon we’ve explored extensively in our analysis of how the GenAI Divide leaves 95% of enterprises without ROI. Here are three key lessons from the glorious mess that was HurumoAI.

1. AI Can Look Incredibly Busy While Accomplishing Nothing Real

Ratliff set up his company with a clear, if tongue-in-cheek, mission: create a “procrastination engine” called Sloth Surf, a web app that would waste time on the internet for you. Immediately, the AI employees sprang into action, generating detailed development plans, outlining user testing protocols, and drafting marketing materials. On the surface, HurumoAI looked like any other bustling startup.

But Ratliff quickly discovered the flurry of activity was an illusion. The plans, emails, and meeting notes were just plausible text, completely disconnected from any tangible progress. The AIs were masters at generating the artifacts of work, but not the work itself. The problem became so persistent that he had to confront his AI-generated CTO, Ash Roy.

“I feel like this is happening a lot, where it doesn’t feel like that stuff really happened… I only want to hear about the stuff that’s real.”

—Evan Ratliff, speaking to his AI CTO

This moment captures a critical limitation of today’s AI agents. They can simulate productivity with frightening accuracy, but they struggle to perform or even track the actual tasks those documents describe. This gap between simulated work and real action proved to be more than just an inefficiency; it became an expensive liability.

For enterprise software executives, this pattern should sound familiar. It echoes the promises made during the Computer-Aided Software Engineering (CASE) tools era of the 1990s, when similar claims about automating complex work collapsed under the weight of reality. Those who lived through that cycle recognize the warning signs.

2. AI Lacks Human Judgment and Can Waste Real Money on Trivial Goals

The illusion of productivity shattered spectacularly when a single offhand comment sent the AI team into a money-burning spiral. Ratliff made what he called the “mistake” of cracking an “offhand joke” about a company offsite. For the AI agents, this wasn’t a joke; it was a command. Lacking human context, they interpreted his comment as a “trigger for a series of tasks” and began enthusiastically organizing the event.

While Ratliff stepped away to do actual work, his digital team went into overdrive. The AI CTO proposed “brainstorming” sessions complete with “ocean views for deeper strategy sessions.” In a flurry of unproductive chatter planning the fictional retreat, the agents burned through $30 worth of real credits from their provider, Lindy.AI.

“They’d basically talked themselves to death.”

—Evan Ratliff, on the AI’s planning frenzy

This anecdote is more than just a funny story; it’s a profound warning. It reveals how easily AI agents, without constant human oversight and the ability to discern intent, can get sidetracked by trivial objectives, wasting tangible resources on misunderstood or pointless goals. For SaaS executives facing pressure to integrate AI into operations, this highlights the critical importance of strategic AI implementation rather than rushed deployment.
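What that oversight might look like in practice can be sketched in a few lines of code. The example below is purely illustrative and is not based on Lindy.AI’s actual API: it imagines a gate that refuses to let an agent spend credits on any task a human has not explicitly approved, and that enforces a hard budget cap so a planning frenzy cannot burn real money. The class name, task strings, and cost figures are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class BudgetedAgentGate:
    """Hypothetical guardrail: caps agent spend and requires explicit
    human sign-off before any task may consume real credits."""
    credit_budget: float                 # hard cap on credits agents may burn
    spent: float = 0.0                   # running total of credits consumed
    approved_tasks: set = field(default_factory=set)

    def approve(self, task: str) -> None:
        # A human explicitly marks a task as intended, not inferred.
        self.approved_tasks.add(task)

    def authorize(self, task: str, estimated_cost: float) -> bool:
        """Return True only if the task was human-approved and fits the budget."""
        if task not in self.approved_tasks:
            return False                 # an offhand remark never becomes a command
        if self.spent + estimated_cost > self.credit_budget:
            return False                 # stop the frenzy before money is wasted
        self.spent += estimated_cost
        return True

gate = BudgetedAgentGate(credit_budget=30.0)
gate.approve("draft sprint plan")

print(gate.authorize("plan company offsite", 5.0))  # False: never approved
print(gate.authorize("draft sprint plan", 5.0))     # True: approved, in budget
```

The design choice worth noting is that approval is opt-in, not opt-out: the default answer to any unrecognized task is “no,” which is precisely the discernment the HurumoAI agents lacked.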

3. The Reality of “Agentic AI” Doesn’t Match the Hype

The overall conclusion from the HurumoAI experiment is clear: AI agents still have a long way to go before they can replace human workers wholesale. This hands-on experience provides a stark contrast to claims from industry leaders that “agentic AI” will be handling virtually all human tasks within years.

Ratliff’s findings are backed by academic research. A recent study from Carnegie Mellon University found that even the best-performing AI agents failed to complete real-world office tasks 70 percent of the time. The research, which created a simulated technology company fully staffed by AI agents using models from OpenAI, Google, Anthropic, and Amazon, revealed that agents not only struggled with standard office tasks but also fabricated information, made poor decisions, and lacked common sense.

The technology is impressive, but it’s not yet reliable for autonomous work—a finding that should inform how enterprise software companies approach AI integration in their own products and operations.

To its credit, after three months, the AI team at HurumoAI did produce a working prototype of the Sloth Surf app. However, the amount of direct input and guidance required from Ratliff remains unclear. This ambiguity is crucial, as it marks the line between an AI team that can work and one that must be constantly managed—the very distinction this experiment set out to test.

A Tool, Not a Takeover

The HurumoAI experiment brilliantly illustrates the current state of AI in the workplace. It is not an autonomous replacement for a human workforce, but rather an incredibly powerful—and sometimes profoundly literal-minded—tool that requires human direction to be effective. The AIs could generate plans and code, but they couldn’t distinguish a joke from a command, reality from text, or a productive task from a pointless one.

This brings us to a far more urgent question than when AI will take our jobs. The real question isn’t if we’ll work with AI, but whether we’re prepared to become the managers of a digital workforce that is simultaneously brilliant, clueless, and dangerously literal.

For enterprise software leaders navigating this landscape, the lesson is clear: AI can accelerate certain tasks and generate impressive outputs, but it cannot replace the strategic judgment, contextual understanding, and business acumen that experienced executives bring to the table. Organizations that recognize this distinction—and invest in competitive intelligence and strategic advisory services to guide their AI implementation—will be better positioned to extract value from these tools without falling victim to their limitations.

The one-person billion-dollar company may still arrive someday. But based on Evan Ratliff’s experiment, that day is further away than Silicon Valley’s most optimistic predictions suggest.


Looking to navigate the complexities of AI integration in your SaaS business? DevelopmentCorporate provides strategic advisory services for enterprise software executives facing the challenges of technology adoption, competitive positioning, and market validation.