Recently, I had a conversation with our engineering lead. He raised a concern many technical teams share: as AI coding capabilities improve and delivery speed accelerates, will we eventually run out of things to do?

My answer: You don’t need to worry. This is a classic Jevons Paradox.

In the 19th century, economist William Stanley Jevons observed something counterintuitive: when steam engines became more efficient and coal costs dropped, coal consumption increased rather than decreased. The reason was simple—lower costs made previously unworthy tasks suddenly worthwhile.

AI works the same way. When delivery costs fall, demand doesn’t shrink; it explodes. The real question isn’t “will we run out of work?” It’s that the game has changed. Are you ready?

The Death Spiral of Debate-Driven Development

In traditional product development, enormous amounts of time are consumed by discussion. In product meetings, designers argue the button should be bigger, PMs say it should be smaller, engineers say both are feasible, and the debate goes on…

When should we trigger the paywall popup? What should the pricing be? How should the interaction work? Questions at this level of detail easily devolve into “he said, she said” territory.

You can make an argument, and someone else can make the opposite argument. Human rationality, or more precisely theory and reasoning, carries little weight here. You can say anything: “I think users need this,” “my experience tells me this,” “I feel it should be designed this way”…

The result is endless meetings, hours of discussion with no conclusion.

Worse still, the vast majority of our hypotheses are actually wrong. According to experiment data from companies like Google and Microsoft, only 10-33% of product hypotheses ultimately prove effective—meaning 67-90% fail. These assumptions based on imagination and reasoning often don’t match real user needs.

The fundamental problem: we’re using reasoning to solve empirical problems. And the only answer to empirical problems lies with users.

0-to-1 vs 1-to-100: The Methodological Divide

But this doesn’t mean all stages should be experiment-driven. Different product development phases require completely different methodologies.

The 0-to-1 Phase: You’re seeking Product-Market Fit, still figuring out what users truly need. This stage requires:

  • Intuition and gut instinct
  • Rapid iteration
  • Deep user interviews
  • Doing things that don’t scale

As I discussed in a previous article, prematurely pursuing A/B testing and growth hacking before PMF is futile. Airbnb’s success didn’t come from optimizing button colors—it came from founders Brian Chesky and Joe Gebbia personally photographing NYC listings and manually crafting 10-star experiences.

At this stage, product direction requires taste—insight into human nature. If you keep A/B testing better horses, you’ll never invent the car.

The 1-to-100 Phase: You’ve found PMF and have stable user traffic. This stage requires:

  • Experiment-driven approach
  • Data validation
  • Rapid iteration
  • Scaled optimization

As Marshall Goldsmith’s widely quoted saying goes: “What got you here won’t get you there.”

These two phases have fundamentally different methodologies. Confuse them, and you’ll either over-optimize during 0-to-1 or keep guessing during 1-to-100.

Prerequisites and Traps of Experiment-Driven Development

Experiment-driven development isn’t a panacea. It has clear prerequisites and pitfalls to watch for.

Prerequisites:

  1. PMF established: Product direction is clear, no longer fundamental exploration
  2. Sufficient traffic: Can run statistically significant experiments (see the sketch below)
  3. Clear metrics definition: Know what matters and what doesn’t
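
How much traffic counts as “sufficient” can be estimated before you run anything. Below is a minimal sketch in Python (it assumes scipy is available; the baseline conversion rate and detectable lift are made-up illustrative numbers) of the standard two-proportion sample-size estimate:

```python
# Rough per-variant sample size for an A/B test on a conversion rate.
# Assumptions: two-sided test at alpha = 0.05 with 80% power; the baseline
# rate and minimum detectable effect below are illustrative, not benchmarks.
from scipy.stats import norm

def sample_size_per_variant(p_baseline: float, mde: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per variant to detect an absolute lift of `mde`."""
    p_treated = p_baseline + mde
    z_alpha = norm.ppf(1 - alpha / 2)   # controls the false-positive rate
    z_beta = norm.ppf(power)            # controls the chance of missing a real lift
    p_pooled = (p_baseline + p_treated) / 2
    numerator = (z_alpha * (2 * p_pooled * (1 - p_pooled)) ** 0.5
                 + z_beta * (p_baseline * (1 - p_baseline)
                             + p_treated * (1 - p_treated)) ** 0.5) ** 2
    return int(numerator / mde ** 2) + 1

# Detecting a 5% -> 6% conversion lift needs on the order of 8,000 users per variant.
print(sample_size_per_variant(p_baseline=0.05, mde=0.01))
```

If the answer is thousands of users per variant and you see a few hundred a week, you are not in experiment-driven territory yet.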

Trap One: Data-Centric Instead of User-Centric

Many people misunderstand what data-driven means. They become data-centric: optimizing metrics for their own sake, chasing vanity metrics, and ultimately harming the user experience.

The correct approach: Data-driven but user-centric.

Data is the means; user value is the end. You can’t sacrifice the overall experience just to boost conversion by 5%, say by showing a popup the moment users arrive. Brian Chesky is clear on this: optimize short-term conversion at the expense of the 10-star experience and you’re killing the golden goose.

Trap Two: Blindly Pursuing Metrics

If you define the wrong metric or over-optimize a single metric, it might be worse than ignoring data entirely.

Metrics should reflect genuine user value. If your metric is “number of videos generated,” you might optimize for a pile of low-quality videos. If your metric is “average video quality users can achieve,” you’re actually solving the real problem.
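
As a toy illustration (the event fields and quality scores are hypothetical), the two metrics can point in opposite directions on exactly the same data:

```python
# Toy illustration: the same events, two very different metrics.
# Fields and quality scores are made up for illustration only.
videos = [
    {"user": "a", "quality": 0.9},
    {"user": "b", "quality": 0.2},
    {"user": "b", "quality": 0.3},
    {"user": "b", "quality": 0.2},
]

vanity_metric = len(videos)                                      # "number of videos generated"
value_metric = sum(v["quality"] for v in videos) / len(videos)   # "average video quality"

print(vanity_metric)  # 4   -- rises even when users churn out low-quality clips
print(value_metric)   # 0.4 -- rises only when users actually get better results
```

A feature that spams out drafts moves the first number up and the second one down; only the second tells you whether you solved the real problem.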

Trap Three: Killing Disruptive Innovation

Purely data-driven approaches fall into a classic trap: local optimization at the expense of global innovation.

When the iPhone launched, critics argued users needed physical keyboards. In Airbnb’s early days, conventional wisdom said no one would stay in strangers’ homes. When Netflix shifted from DVDs to streaming, the short-term data was indeed negative.

The key: during paradigm shifts, rely on human judgment and taste. During optimization phases, rely on experiments and data.

The Experiment Cycle: From Hypothesis to Validation

With experiment-driven principles established, the product development process fundamentally transforms.

As I discussed in a previous piece on experiment culture, AI’s true value isn’t just boosting productivity—it’s enabling us to transform “I think” into “I tried.” When experiment costs drop dramatically, the necessity of debate vanishes.

A typical experiment cycle:

  1. Define hypothesis: Clear hypothesis and success metrics
  2. Rapid build: Use AI assistance to create an MVP in days
  3. Gradual rollout: Not 100% release, but first to a subset of users
  4. Data validation: See the data change the next day
  5. Fast decision: Continue, adjust, or kill

No need to argue for hours in conference rooms—you can ship the feature and let data speak. The premise: sufficient traffic and confirmed PMF.
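
Here is a minimal sketch of steps 4 and 5: a plain two-proportion z-test on conversion plus a simple ship/kill call. The numbers, names, and thresholds are illustrative assumptions, not a prescribed framework (and it assumes scipy):

```python
# Minimal sketch of "data validation -> fast decision" for one experiment.
# All names, numbers, and thresholds are illustrative assumptions.
from math import sqrt
from scipy.stats import norm

def decide(control_users: int, control_conversions: int,
           treated_users: int, treated_conversions: int,
           alpha: float = 0.05) -> str:
    """Two-proportion z-test on conversion, then continue / kill / keep running."""
    p_c = control_conversions / control_users
    p_t = treated_conversions / treated_users
    p_pool = (control_conversions + treated_conversions) / (control_users + treated_users)
    se = sqrt(p_pool * (1 - p_pool) * (1 / control_users + 1 / treated_users))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - norm.cdf(abs(z)))

    if p_value >= alpha:
        return "keep running (not significant yet)"
    return "continue / expand rollout" if p_t > p_c else "kill / roll back"

# Day-after check on a 10% rollout (made-up numbers):
print(decide(control_users=9000, control_conversions=450,
             treated_users=1000, treated_conversions=72))
```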

The key to this cycle is speed. The faster you go from hypothesis to validation, the better. This is why momentum matters so much in the AI era.

Momentum as Moat

Now we can understand why momentum matters so much in the AI era.

If you can ship and test five product features a day, or a week, letting data validate what works and what doesn’t, while other teams are still debating in meetings and writing PRDs, that is a generational gap.

Essentially, you’re wielding nuclear weapons while they’re still fighting with swords and spears. This growth battle isn’t even close to fair.

Momentum itself is the moat.

HeyGen’s product development handbook captures this beautifully: “Competitors ship one feature per month; we ship five experiments. We learn five times faster. That learning compounds into superior products.”

It’s not about “speed” itself, but about the velocity of the learning loop. Speed’s true value isn’t delivering features quickly; it’s delivering learning quickly.

Some will ask: with such rapid iteration, what about technical debt? What about user experience consistency?

My view: Technical debt is much like financial debt. It’s not something to avoid, but a form of leverage.

You trade technical debt for product growth velocity. In this momentum-defined AI era, that is a reasonable trade-off. When the accumulating debt starts to drag on growth velocity, that is when you pause to pay it down.

As for user experience consistency, rapid experimentation does create challenges—if you run multiple experiments simultaneously, different users might see completely different product interfaces, causing confusion and increased support costs.

But this is a feature gate and gradual rollout problem. You don’t need to 100% rollout every experiment to all users. You can test on new users first, on only 10% of users, by region, by scenario. The key is ensuring each individual user sees a consistent experience, not a button on the left today and on the right tomorrow.
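
One simple way to keep each user’s experience consistent is deterministic, hash-based assignment. The sketch below shows the idea only and assumes no particular feature-flag product; the experiment name and rollout percentage are hypothetical:

```python
# Sketch of "sticky" experiment assignment: the same user always lands in the
# same bucket, so their interface never flips between sessions.
import hashlib

def in_treatment(user_id: str, experiment: str, rollout_percent: int) -> bool:
    """Return True if this user sees the experimental variant of this experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < rollout_percent

# A given user either sees the new paywall or doesn't, and the answer only
# changes when you deliberately raise the rollout percentage.
print(in_treatment("user-42", "paywall-timing-v2", rollout_percent=10))
```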

This requires more flexible thinking, not rigidly believing “experiment = full rollout.”

What AI Native Really Means: Ride the Wave

So what is an AI Native company?

Not just “using AI tools to accelerate development,” but designing the entire product development process from scratch as AI-first.

HeyGen systematically articulates this philosophy in their product development handbook. Their core insight: In the AI era, we operate without a stable technology foundation. Every few months, AI technology evolves dramatically. Model capabilities are unknown and changing rapidly.

Traditional software development assumes stable foundations. But in the AI era, this foundation changes every 2-3 months.

This isn’t a bug—it’s an opportunity. The key: Ride the wave, don’t fight the current.

From Stable Foundation to Surfing:

Traditional era thinking:

  • Build on stable foundations
  • Optimize for longevity
  • Plan 12-18 months ahead
  • Polish, then ship

AI era thinking:

  • Surf the technology wave
  • Build products that automatically improve
  • 2-month realistic planning cycles (aligned with model upgrade cycles)
  • Ship to learn
  • Parallel experimentation

Core AI Native Principles:

  1. Distinguish what changes vs. what stays constant: Models change, capabilities change, but users’ core problems and workflows don’t. Build systems around what doesn’t change while surfing model improvements.

  2. Design self-improving products: When GPT-5 arrives, your product should automatically get better, not require refactoring. Build abstraction layers that let product experience ride on top of AI advancement (see the sketch after this list).

  3. Flexible architecture: Expect change. Version everything aggressively. Build replaceable systems.

  4. 2-month planning cycles: Long enough to build meaningful things, short enough to adapt when the landscape shifts. Synchronized with AI model upgrade cycles.

  5. 6-12 month strategic bets: While realistic planning is 2 months, predict capabilities 6-12 months out and position early.
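
As a sketch of principle 2, here is one way such an abstraction layer can look. The interface and class names are invented for illustration and don’t refer to any real SDK; the point is that product code depends on a narrow interface, so a model upgrade becomes an adapter swap rather than a refactor:

```python
# Sketch of a model-agnostic layer: features talk to an interface, and
# swapping in a stronger model touches one adapter, not the product code.
# Class names and behavior are illustrative only.
from typing import Protocol

class ScriptModel(Protocol):
    def generate_script(self, brief: str) -> str: ...

class CurrentModel:
    def generate_script(self, brief: str) -> str:
        # would call today's best model behind the scenes
        return f"[current model's script for: {brief}]"

class NextGenModel:
    def generate_script(self, brief: str) -> str:
        # when a stronger model ships, only this adapter changes
        return f"[next-gen model's script for: {brief}]"

def draft_video(model: ScriptModel, brief: str) -> str:
    """Product feature code: depends on the interface, never on a vendor API."""
    return model.generate_script(brief)

# Upgrading the whole product becomes a one-line swap:
print(draft_video(CurrentModel(), "30-second product teaser"))
print(draft_video(NextGenModel(), "30-second product teaser"))
```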

This isn’t a difference in tools, but a fundamental difference in organizational structure and mindset.

Build the Machine that Builds the Machine

Elon Musk once said: “It’s important to build the machine that builds the machine.”

I now have a deeper understanding of this statement.

Pursuing extreme attention to product detail is reasonable, even necessary. But that pursuit can’t rest on a subjective “I think it’s good.” It needs an objective standard of measurement, and that standard is whether users love it.

Whether users love it shows up largely in the metrics you define and the data you collect.

Therefore, building your product optimization workflow well matters more than building any single product. This is what I mean by “build the machine.”

What is this machine? It’s a complete, self-evolving product development system with three layers:

Technical Infrastructure Layer:

  • Feature flag system: Enables rapid feature toggles and gradual rollout
  • A/B testing platform: Supports multivariate experiments and statistical analysis
  • Real-time monitoring and analytics tools: Quickly identifies issues and opportunities
  • Abstraction layers designed for AI model upgrades: Products automatically improve as models evolve

Organizational Capability Layer:

  • Experiment-driven culture: Let data, not opinions, do the talking
  • Rapid decision mechanisms: Make two-way door decisions the same day; avoid consensus traps
  • 2-month planning cycles synchronized with AI model upgrades
  • Disagree and commit principle: Prioritize speed, correct quickly if wrong

Strategic Cognitive Layer:

  • Correct metrics definition: Reflects genuine user value, avoids vanity metrics
  • Data-driven but user-centric: Data is the means, users are the end
  • Distinguish 0-to-1 vs 1-to-100 methodologies: Know when to rely on taste, when on data
  • Understand what changes (AI capabilities) vs. what doesn’t (user needs)

This is a meta-capability—you’re not optimizing individual products, but optimizing the “optimization capability” itself.

All the elements discussed earlier—experiment cycles, momentum, AI Native, Ride the Wave—converge in this machine. You’re not building a product; you’re building a system that continuously produces better products.

Most startups lack the capability to build this machine, which is why 0-to-1 and 1-to-100 are completely different problems.

But in the AI era, this is no longer optional—it’s required coursework.

The Quality Paradox

Some will ask: doesn’t moving fast contradict the pursuit of excellence?

Answer: No contradiction. In fact, moving fast is the prerequisite for building better products long-term.

When competitors ship one feature per month, you ship five experiments. You learn five times faster. This learning compounds into superior products.

Moving fast doesn’t mean shipping features quickly—it means delivering customer value quickly (and learning quickly). Speed serves the ultimate goal: being the absolute best.

HeyGen’s quality bar is clear: for video content and creative tools especially, quality is non-negotiable. Users don’t love products because of polished UI—they love products that solve their problems with exceptional quality. The success metric: the average video quality any user can achieve on the platform.

This is the right north star.

The Game Has Changed

Back to the engineering lead’s concern: will AI leave us with nothing to do?

Answer: It won’t leave you with nothing to do, but it will fundamentally change what you do.

Delivery costs drop and demand explodes. But the real question is: are you meeting the new rules with old methods, or are you building the machine?

Product development in the AI era isn’t about building better products—it’s about building better assembly lines.

Stop debating. Start experimenting. Build your experiment machine.

This is the only way to compete in the AI era.