Our latest update makes three foundational improvements to how Squoosh identifies whether a UX change is likely to increase revenue.
We validated these changes against a benchmark of real e-commerce A/B tests with known outcomes. The results: accuracy at predicting which variant won the real-world test rose from 71% to 81%, and the share of experiments reaching our highest significance tier jumped from 30% to over 43%.
An improved behavioral model
We fine-tuned our synthetic shoppers using a proprietary dataset of UX A/B tests and real shopper behavior, calibrating how they weigh trust signals, navigation quality, visual design, and friction against patterns that actually drive purchase decisions.
The result: the differences Squoosh surfaces between variants are more predictive of what happens when you ship. In our benchmark, Squoosh correctly predicted the winning variant 81% of the time, up from 71%. That means the recommendations you make are better supported by a model that more closely mirrors how real shoppers evaluate and decide.
Improved statistical testing for paired data
Squoosh has always used paired testing: the same synthetic shopper visits both original and variant, so differences reflect your changes rather than persona-to-persona variation. We improved how we analyze those pairs with a significance method purpose-built for this structure. The result: experiments reach our highest significance tier over 43% of the time, up from 30%, with no increase in experiment size or run time.
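The post doesn't name the method, but a standard significance test for paired binary outcomes (the same shopper converts or not on each of two variants) is McNemar's exact test, which looks only at discordant pairs. Treat this as an illustrative stand-in for the kind of paired analysis described, not Squoosh's actual implementation:

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Exact McNemar test on paired binary outcomes.

    b: pairs where the shopper converted on the variant only
    c: pairs where the shopper converted on the original only
    Concordant pairs (same outcome on both pages) say nothing about
    the change, so they are excluded by construction.
    Returns the two-sided p-value of H0: the change has no effect,
    i.e. a discordant pair is equally likely to go either way.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence either way
    k = min(b, c)
    # Under H0, b ~ Binomial(n, 0.5); exact two-sided tail probability.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 20 shoppers converted only on the variant, 5 only on the original:
p = mcnemar_exact(20, 5)
```

Because concordant pairs drop out, the test spends all of its statistical power on the shoppers whose behavior actually differed between variants, which is why paired designs reach significance faster than unpaired ones at the same sample size.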
Focusing on persuadable visitors
Not every visitor is persuadable. Determined buyers will convert regardless of what your site looks like; window shoppers were never going to convert. Neither group tells you much about whether a change worked: it's your ability to persuade shoppers on the fence that moves the needle.
Our synthetic shoppers therefore concentrate on these fence-sitters, the visitors with real potential to lift overall conversion. This cohort typically represents 5–10% of real traffic. A clear layout, sharp pricing, and smooth navigation tip them toward buying; friction and confusion tip them toward leaving.
Because of this, Squoosh results will better predict whether your change will increase conversion, but they won't mirror your overall Add to Cart rate: rates in Squoosh experiments typically run higher than on your live site. That's by design, not a bug.
What this means for you
Together, these changes improve the two things that matter most when presenting results: accuracy (Squoosh predicted the winning variant in 81% of benchmarked tests) and predictive validity, meaning the winners Squoosh identifies are more likely to win with real customers too.
No action needed on your end. All improvements apply automatically to new experiments. Questions about how these changes affect how you interpret results? Get in touch.