Interpreting test results in a Mida report


Mida employs a sequential testing methodology powered by a frequentist statistical engine, a robust approach to A/B testing that offers several key advantages for decision-making.



What is Sequential Testing? 


Unlike traditional fixed-horizon testing, which requires a predetermined sample size, sequential testing allows results to be monitored continuously as data accumulates (a simple illustration follows the list below). This means you can:

  • Check results at any time during the test
  • Stop the test early when clear winners or losers emerge
  • Continue collecting data when results are inconclusive
  • Maintain statistical validity throughout the monitoring process
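To make this concrete, here is a minimal Python sketch of interim monitoring under a sequential design. It is an illustration only: the interim counts are hypothetical, and it uses a textbook Pocock-style boundary as a stand-in for Mida's internal stopping rule, which this article does not specify.

```python
import math

# Pocock critical value for 5 equally spaced looks at an overall
# two-sided alpha of 0.05 (Pocock, 1977). A stand-in for Mida's
# internal boundary, used here purely for illustration.
POCOCK_Z = 2.413

def two_prop_z(conv_a, n_a, conv_b, n_b):
    """z-statistic for the difference of two conversion rates (pooled SE)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

def monitor(looks):
    """Evaluate the test at each interim look; stop early only when the
    adjusted boundary is crossed, otherwise keep collecting data."""
    for i, (conv_a, n_a, conv_b, n_b) in enumerate(looks, start=1):
        z = two_prop_z(conv_a, n_a, conv_b, n_b)
        if abs(z) > POCOCK_Z:
            verdict = "winner" if z > 0 else "loser"
            return f"look {i}: stop early, variant is a {verdict} (z = {z:.2f})"
    return "all looks done: inconclusive, keep collecting data"

# Hypothetical interim snapshots (5 planned looks):
# (control conversions, control visitors, variant conversions, variant visitors)
looks = [
    (12, 5000, 18, 5000),
    (25, 10000, 44, 10000),
    (40, 15000, 78, 15000),
    (55, 20000, 104, 20000),
    (70, 25000, 130, 25000),
]
print(monitor(looks))  # stops early at the third look in this example
```

Checking repeatedly against a plain 95% threshold would inflate the false-positive rate; an adjusted boundary like the one above is what keeps early stopping statistically valid.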



The Frequentist Framework: 


Mida's statistical engine is built on frequentist principles. As illustrated in the sketch after this list, the engine:

  • Calculates the probability of observing the test results if there were truly no difference between variants
  • Controls the false positive rate (Type I error) in line with your chosen confidence level (typically 95%)
  • Provides confidence intervals to show the range of likely true effect sizes
  • Makes no assumptions about prior probabilities, relying purely on observed data
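As a rough numeric illustration of these principles, the sketch below runs a two-sided two-proportion z-test and builds a Wald confidence interval for the difference in conversion rates. The input counts and the mapping from p-value to a "statistical significance" percentage are assumptions for this example, not Mida's exact internal formulas.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function (no SciPy required)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

Z_CRIT = {0.95: 1.960, 0.90: 1.645}  # two-sided critical values

def frequentist_summary(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Two-sided two-proportion z-test plus a Wald confidence interval
    for the difference in conversion rates (variant minus control)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se_pooled
    p_value = 2.0 * (1.0 - norm_cdf(abs(z)))  # chance of a gap this large if none exists
    significance = (1.0 - p_value) * 100.0    # assumed mapping to a significance %
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = Z_CRIT[confidence]
    ci = ((p_b - p_a) - z_crit * se, (p_b - p_a) + z_crit * se)
    return significance, ci

# Hypothetical counts chosen to echo Case 1 below (Control CR of 0.26%)
sig, (lo, hi) = frequentist_summary(26, 10000, 87, 10000)
print(f"significance: {sig:.2f}%  CI of difference: {lo:.4%} to {hi:.4%}")
```

With these hypothetical counts the significance prints as 100.00% after rounding; real reports show values like the 99.71% discussed in Case 1 below.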


This combination of sequential testing and frequentist statistics ensures:

  1. Efficient resource use by enabling early stopping when appropriate
  2. Protection against false conclusions through rigorous statistical controls
  3. Clear, interpretable results based on observed data
  4. Flexible monitoring without compromising statistical validity



Test Result Cases:


Case 1: Clear Winner


Green: This shows a winning variant. Given that the required confidence level is 95%, this test result is considered statistically significant because the statistical significance value of 99.71% surpasses the required threshold of 95%. Statistical significance refers to the probability that the differences observed in the test are not due to chance. In this context, a 99.71% statistical significance means there is less than a 0.3% likelihood that the results occurred by chance. And with an improvement of 233.04%, this is a meaningful lift.
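The arithmetic behind both statements can be checked directly with the numbers above:

```python
significance = 99.71                 # reported statistical significance, %
chance = 100 - significance          # likelihood the difference is pure chance
print(f"{chance:.2f}% likelihood the result is due to chance")   # 0.29%

control_cr = 0.26                    # Control conversion rate, %
improvement = 233.04                 # reported lift, %
variant_cr = control_cr * (1 + improvement / 100)                # derived, for illustration
print(f"implied variant CR: {variant_cr:.2f}%")                  # ~0.87%
```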


Next, look at the confidence interval of the difference of means. The confidence interval gives the range of expected lift values at the 95% confidence level. In other words, the lower bound is the "worst case" scenario of possible lift, and the upper bound is the "best case" scenario. Here, you see a range from 0.44% to 1.32%. Since both bounds sit well above the Control CR (0.26%), you can feel confident about the change.
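Expressed as a simple check (a sketch using the Case 1 numbers), the decision rule is:

```python
control_cr = 0.26                   # Control CR, %
ci_lower, ci_upper = 0.44, 1.32     # reported 95% interval, %

# If even the worst-case bound beats Control, the variant is a safe call
if ci_lower > control_cr:
    print("even the worst case beats Control -> ship the variant")
else:
    print("interval overlaps Control -> not conclusive on its own")
```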



Case 2: Clear Loser



Red: This shows a losing variant. Given that the required confidence level is 90%, the statistical significance of the losing variant should ideally be less than 10% (100% - 90%). The statistical significance value of Variant 1 is 6.71%, which implies that the probability that the observed difference occurred due to randomness is roughly 6.71%, a figure below the required threshold of 10%. In other words, you can be (100 - 6.71) = 93.29% confident that the losing variant, Variant 1, is indeed inferior to the winning variant, Control.


Looking at the confidence interval, we see a range from 4.69% to 5.29% Conversion Rate (CR). In simpler terms, 4.69% represents the "worst case" scenario and 5.29% represents the "best case" scenario of Variant 1's performance. Since both values are below the 5.38% CR of Control, you can be confident that Control is the winner.
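The same decision logic, sketched with the Case 2 numbers plugged in:

```python
required_confidence = 90.0                     # test setting, %
loser_threshold = 100.0 - required_confidence  # 10%
variant_significance = 6.71                    # reported for Variant 1, %

if variant_significance < loser_threshold:
    confidence_in_loss = 100.0 - variant_significance   # 93.29%
    print(f"Variant 1 is a loser at {confidence_in_loss:.2f}% confidence")

# Confidence interval check: both bounds below Control's CR confirm the call
control_cr = 5.38
ci_lower, ci_upper = 4.69, 5.29
print("Control wins" if ci_upper < control_cr else "inconclusive")
```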



Case 3: Inconclusive Result




Gray: This indicates that the test doesn't have definitive results and hasn't reached statistical significance yet. Depending on what you're trying to achieve with your experiment, here are some options you may want to consider:


1. Let It Run Longer: In some instances, you might need to allow the experiment more time to gather a larger sample size and achieve more accurate results.


2. Simplify Variations: If you have too many variations, consider reducing them. For instance, you might bring four variations down to just two or three.


3. Prioritize Brand Consistency: If the results are similar between two variations, choose the one that aligns best with your brand's guidelines.


4. Repeat the Test: Running the same test again can be beneficial for confirming your initial findings. Keep in mind that factors like the time of year or fluctuations in website traffic may affect the end results.


5. Keep It As Is: Occasionally, your original design or strategy may not need any changes and is the most suitable version.
