Our latest update makes three foundational improvements to how Squoosh identifies whether a UX change is likely to increase revenue.
We validated these changes against a benchmark of real e-commerce A/B tests with known outcomes. The results: accuracy at predicting which variant won the real-world test rose from 71% to 81%, and the share of experiments reaching our highest significance tier jumped from 30% to over 43%.
An improved behavioral model
We fine-tuned our synthetic shoppers using a proprietary dataset of UX A/B tests and real shopper behavior, calibrating how they weigh trust signals, navigation quality, visual design, and friction against patterns that actually drive purchase decisions.
The result: the differences Squoosh surfaces between variants are more predictive of what happens when you ship. In our benchmark, Squoosh correctly predicted the winning variant 81% of the time, up from 71%. That means the recommendations you make are better supported by a model that more closely mirrors how real shoppers evaluate and decide.
Improved statistical testing for paired data
Squoosh has always used paired testing: the same synthetic shopper visits both original and variant, so differences reflect your changes rather than persona-to-persona variation. We improved how we analyze those pairs with a significance method purpose-built for this structure. The result: experiments reach our highest significance tier over 43% of the time, up from 30%, with no increase in experiment size or run time.
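The post doesn't name the method, but a standard significance test for paired binary outcomes (the same shopper converts or not on each of two variants) is McNemar's exact test, which looks only at discordant pairs. Treat this as an illustrative stand-in for the kind of paired analysis described, not Squoosh's actual implementation:

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Exact McNemar test on paired binary outcomes.

    b: pairs where the shopper converted on the variant only
    c: pairs where the shopper converted on the original only
    Concordant pairs (same outcome on both pages) say nothing about
    the change, so they are excluded by construction.
    Returns the two-sided p-value of H0: the change has no effect,
    i.e. a discordant pair is equally likely to go either way.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence either way
    k = min(b, c)
    # Under H0, b ~ Binomial(n, 0.5); exact two-sided tail probability.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 20 shoppers converted only on the variant, 5 only on the original:
p = mcnemar_exact(20, 5)
```

Because concordant pairs drop out, the test spends all of its statistical power on the shoppers whose behavior actually differed between variants, which is why paired designs reach significance faster than unpaired ones at the same sample size.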
Focusing on persuadable visitors
Not every visitor is persuadable. Determined buyers will convert regardless of what your site looks like; window shoppers were never going to convert. Neither group tells you much about whether a change worked: it's your ability to persuade shoppers on the fence that moves the needle.
Our synthetic shoppers therefore concentrate on these fence-sitters, the visitors with real potential to lift overall conversion. This cohort typically represents 5–10% of real traffic. A clear layout, sharp pricing, and smooth navigation tip them toward buying; friction and confusion tip them toward leaving.
Because of this, Squoosh results will better predict whether your change will increase conversion, but they won't mirror your overall Add to Cart rate: rates in Squoosh experiments typically run higher than on your live site. That's by design, not a bug.
What this means for you
Together, these changes improve the two things that matter most when presenting results: accuracy (Squoosh predicted the winning variant in 81% of benchmarked tests) and predictive validity, meaning the winners Squoosh identifies are more likely to win with real customers too.
No action needed on your end. All improvements apply automatically to new experiments. Questions about how these changes affect how you interpret results? Get in touch.