Sample Size Determination

Notes I've taken on sample size determination, consisting mostly of information from courses on codecademy.com.

A/B Testing

Don’t keep running the test past the predetermined sample size in the hope that “significant” results will eventually appear
Don’t stop a test before reaching the predetermined sample size just because the results reach significance early (unless there are ethical reasons that require you to stop, as in a prescription drug trial)
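To see why both rules matter, here is a minimal simulation sketch. It assumes an A/A test (both variants share the same true conversion rate, so every "significant" result is a false positive) and a pooled two-proportion z-test. The function names, the 10% rate, the 10 looks and the 5% significance level are illustrative assumptions, not values from the course.

```python
import random
from statistics import NormalDist


def p_value(conv_a, conv_b, n):
    """Two-sided p-value of a pooled two-proportion z-test, n visitors per group."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    if se == 0:  # no conversions at all (or everyone converted): nothing to compare
        return 1.0
    z = (conv_b - conv_a) / n / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rates(rate=0.10, sample_size=1000, looks=10,
                         trials=500, alpha=0.05, seed=0):
    """Return (fixed-sample rate, peeking rate) of false positives in an A/A test."""
    rng = random.Random(seed)
    step = sample_size // looks
    fixed_hits = peeking_hits = 0
    for _ in range(trials):
        conv_a = conv_b = 0
        significant_at_a_look = False
        for n in range(1, sample_size + 1):
            # Each visitor converts with the same probability in both groups.
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            # Peeking: test at every interim look and remember any "win".
            if n % step == 0 and p_value(conv_a, conv_b, n) < alpha:
                significant_at_a_look = True
        # Fixed-sample rule: test once, at the predetermined sample size.
        final_significant = p_value(conv_a, conv_b, sample_size) < alpha
        fixed_hits += final_significant
        peeking_hits += significant_at_a_look or final_significant
    return fixed_hits / trials, peeking_hits / trials


fixed, peeking = false_positive_rates()
print(f"false positives, single test at the end: {fixed:.1%}")
print(f"false positives, stopping at any look:   {peeking:.1%}")
```

With identical variants, the single test at the end should produce false positives at roughly the chosen significance level, while declaring a winner at any interim look can only match or exceed that rate.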

Baseline conversion rate: the engagement we expect from the current version, based on historical data.

\frac{\text{number of converted visitors}}{\text{total number of visitors}}

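Desired lift: the smallest difference we care to measure, also known as the minimum detectable effect. As a percentage of the baseline:

100 \times \frac{\text{new} - \text{old}}{\text{old}}

The conversion rate the new version has to reach, with the lift written as a fraction (e.g. 0.2 for 20%):

\text{new conversion rate} = \text{conversion rate} \times \text{minimum desired lift} + \text{conversion rate}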
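The formulas above can be sketched in Python, together with one common way to turn a baseline and a desired lift into a sample size per variant. The sample size function uses the standard normal-approximation formula for comparing two proportions. The 5% significance level and 80% power defaults are my own assumptions (the notes do not state them), and online calculators may use slightly different formulas and give somewhat different numbers.

```python
import math
from statistics import NormalDist


def conversion_rate(converted, total):
    """Converted visitors divided by total visitors."""
    return converted / total


def lift_percent(old, new):
    """Relative change from old to new, as a percentage of old."""
    return 100 * (new - old) / old


def target_rate(baseline, lift_pct):
    """Conversion rate the new version must reach for the given lift (in %)."""
    return baseline * (lift_pct / 100) + baseline


def ab_sample_size(baseline, lift_pct, alpha=0.05, power=0.80):
    """Visitors needed per variant (two-sided test, normal approximation).

    alpha and power are assumed defaults, not values from the notes.
    """
    p1 = baseline
    p2 = target_rate(baseline, lift_pct)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


baseline = conversion_rate(100, 1000)  # hypothetical historical data
print(ab_sample_size(baseline, lift_pct=20))
print(ab_sample_size(baseline, lift_pct=10))
```

Halving the desired lift roughly quadruples the required sample size, which is why the lift should be the smallest difference worth acting on rather than the smallest difference imaginable.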

Margin of error: the furthest we expect the true value to be from what we measure in our survey
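Population size: the number of people in the group being surveyed; for very large populations, 100,000 is generally used, because the required sample size barely changes beyond that
Likely sample proportion: a guess of what we expect the results to be (from a previous survey or a pilot study; use 50% if no data is available, since it is the most conservative choice and gives the largest sample size)
Confidence level: the probability that the true proportion lies within the margin of error of the measured value (e.g. if we choose a confidence level of 99%, we can expect that over many repetitions of the survey, the true value will lie within our specified margin of error 99% of the time)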
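These four inputs can be combined into a sample size. The sketch below assumes Cochran's formula with a finite population correction, which is a standard textbook approach and not necessarily what the course's calculator uses. The function name and default values are illustrative.

```python
import math
from statistics import NormalDist


def survey_sample_size(margin_of_error, confidence_level=0.95,
                       proportion=0.5, population=100_000):
    """Respondents needed to estimate a proportion within the margin of error.

    margin_of_error, confidence_level and proportion are fractions
    (e.g. 0.04 for 4%, 0.95 for 95%, 0.5 for 50%).
    """
    # z-score that leaves (1 - confidence_level) split across both tails.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Cochran's formula for an effectively infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction shrinks n0 for small populations.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(survey_sample_size(0.04))                         # default population
print(survey_sample_size(0.04, population=10_000_000))  # much larger population
print(survey_sample_size(0.04, proportion=0.2))         # prior estimate of 20%
```

The first two calls differ by only a handful of respondents, which illustrates why a population of 100,000 is a reasonable stand-in for anything larger. The third call needs fewer respondents than the first, because 50% is the most conservative guess.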