Sample Size Determination

Notes I've taken on sample size determination, drawn mostly from courses on codecademy.com.

A/B Testing

Rules

  • Don’t keep running the test past the predetermined sample size in the hope that “significant” results will eventually appear
  • Don’t stop a test before reaching the predetermined sample size, just because your results reach significance early (unless there are ethical reasons that require you to stop, like a prescription drug trial)

Parameters

Baseline conversion rate: expected engagement, based on historical data.

$$\frac{\text{number of converted visitors}}{\text{total number of visitors}}$$

Desired lift: smallest difference we care to measure. Also known as minimum detectable effect.

$$100 \times \frac{\text{new} - \text{old}}{\text{old}}$$

The conversion rate the test needs to be able to detect, with the minimum desired lift expressed as a fraction (e.g. 0.2 for 20%):

$$\text{conversion rate} \times \text{minimum desired lift} + \text{conversion rate}$$
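The three formulas above can be tied together in a small Python sketch. The function names and the visitor counts are made up for illustration; they don't come from the course material.

```python
def baseline_conversion_rate(converted_visitors, total_visitors):
    """Fraction of visitors who converted."""
    if total_visitors <= 0:
        raise ValueError("total_visitors must be positive")
    return converted_visitors / total_visitors


def lift_percent(old_rate, new_rate):
    """Relative change from old_rate to new_rate, as a percentage."""
    return 100 * (new_rate - old_rate) / old_rate


def target_conversion_rate(baseline_rate, minimum_desired_lift):
    """Conversion rate to detect; minimum_desired_lift is a fraction (0.2 = 20%)."""
    return baseline_rate * minimum_desired_lift + baseline_rate


# Hypothetical data: 200 of 2,000 visitors converted, and we care about a 20% lift.
baseline = baseline_conversion_rate(200, 2000)
target = target_conversion_rate(baseline, 0.2)

print(f"baseline: {baseline:.2%}")
print(f"target:   {target:.2%}")
print(f"lift:     {lift_percent(baseline, target):.1f}%")
```

Note that the lift formula returns a percentage while the target-rate formula takes a fraction, so a 20% lift goes in as 0.2.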

Sample Size Calculator

  • Margin of error: the furthest we expect the true value to be from what we measure in our survey
  • Population size: the size of the population being sampled; when it is very large or unknown, 100,000 is generally used
  • Likely sample proportion: a guess of what we expect the results to be (from a previous survey or a pilot study; use 50% if no data is available)
  • Confidence level: the probability that the interval given by the margin of error contains the true proportion (e.g. with a confidence level of 99%, we can expect that over many repetitions of the survey, the true value will lie within our specified margin of error 99% of the time)
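To see how these four inputs interact, here is a minimal sketch of one common way to compute a survey sample size: Cochran's formula for a proportion, followed by a finite population correction. This is an assumption on my part; I don't know which formula the calculator used in the course applies, and the function name `survey_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def survey_sample_size(margin_of_error, proportion=0.5, confidence=0.95,
                       population=None):
    """Estimate the number of responses needed to measure a proportion.

    margin_of_error, proportion and confidence are fractions (0.05 = 5%).
    population=None treats the population as effectively infinite.
    """
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    if not 0 < proportion < 1:
        raise ValueError("proportion must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")

    # Two-sided z-score for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Cochran's formula for a large population.
    n = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2

    # Finite population correction (assumed, not taken from the course).
    if population is not None:
        n = n / (1 + (n - 1) / population)

    # Round up: a fractional respondent is not possible.
    return math.ceil(n)


print(survey_sample_size(0.05, proportion=0.5, confidence=0.95,
                         population=100_000))
```

A proportion of 50% maximizes `proportion * (1 - proportion)`, which is why it is the conservative default when no prior data is available. Tightening the margin of error or raising the confidence level both increase the required sample size.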