How to design an experiment in A/B testing.
A unit of diversion defines what counts as an individual subject in the experiment. When we choose a unit of diversion, we assume that every draw of a subject is independent.
Commonly used units of diversion are:
- User identifier (id): Typically the username or email address used on the website. It is stable and rarely changes. If user id is the unit of diversion, a given user is always in either the test group or the control group, never both. A user id is personally identifiable.
- Anonymous id (cookie): An anonymous identifier, usually a cookie. It changes when the user switches browser or device. Users may also clear (refresh) their cookies, in some cases every time they log in. Clearing a cookie is harder in an app or on a phone than on a computer.
- Event: An event is a single action such as a page load, and the assignment can change from one event to the next for the same user. It is typically used for changes that are not user-facing.
Less used units of diversion are:
- Device id: Typically available for mobile devices. It is tied to a specific device and cannot be changed by the user.
- IP address: The IP address is location specific and may change as the user changes location. It is used for cases such as testing an infrastructure change to measure its impact on latency.
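The difference between diverting by user id and diverting by event can be sketched with a hash-based assignment. This is a minimal illustration, not a description of any particular experimentation platform: the function `assign_group`, the experiment name `exp_1`, and the ids `user_42` and `pageload_…` are all hypothetical.

```python
import hashlib

def assign_group(unit_id: str, experiment: str = "exp_1") -> str:
    """Deterministically map a diversion unit to 'test' or 'control'."""
    key = f"{experiment}:{unit_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # An even hash value goes to test, an odd one to control (50/50 split).
    return "test" if int(digest, 16) % 2 == 0 else "control"

# User-id diversion: the same id always hashes to the same group,
# so the user has a consistent experience on every visit.
user_groups = [assign_group("user_42") for _ in range(3)]

# Event diversion: every page load has its own id, so the same user
# can land in different groups from one page load to the next.
event_groups = [assign_group(f"user_42:pageload_{i}") for i in range(20)]

print(user_groups)
print(event_groups)
```

Because the assignment depends only on the id that is hashed, the choice of unit of diversion comes down to which id is passed in: a user id, a cookie value, a device id, or a per-event id.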
Three main considerations in selecting an appropriate unit of diversion are:
- Consistency
- Ethics
- Variability
Variability calculated empirically can be much higher than variability calculated analytically. This happens when the unit of analysis (i.e. the denominator of the metric) is different from the unit of diversion.
A unit of analysis is the denominator of your metric.
E.g. if the unit of diversion is a query, then coverage (= # queries with ads / # queries) will have lower variability than when a cookie is the unit of diversion. This is because when a query is used, the unit of diversion matches the unit of analysis (the denominator of the metric, i.e. the query).
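The coverage example can be checked with a small A/A simulation. The sketch below uses made-up data: 300 hypothetical cookies, each with its own ad-coverage rate and between 1 and 20 queries. It compares the analytical standard error, which assumes independent queries, with the empirical standard deviation of the difference in coverage under query diversion and under cookie diversion. All names and parameters are assumptions chosen for illustration.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical population: each cookie has its own coverage rate and query count.
cookies = []
for _ in range(300):
    rate = random.betavariate(0.5, 0.5)  # per-cookie chance that a query shows an ad
    n_queries = random.randint(1, 20)
    cookies.append([1 if random.random() < rate else 0 for _ in range(n_queries)])

queries = [q for cookie in cookies for q in cookie]
p = sum(queries) / len(queries)  # overall coverage

# Analytical SE of the A/A difference in coverage, assuming independent
# queries and two groups of roughly N/2 queries each.
se_analytical = math.sqrt(p * (1 - p) * 4 / len(queries))

def coverage_diff(group_a, group_b):
    """Difference in coverage between two lists of 0/1 query outcomes."""
    return sum(group_a) / len(group_a) - sum(group_b) / len(group_b)

def empirical_sd(divert_by_cookie, runs=300):
    """Standard deviation of the A/A difference over repeated random splits."""
    diffs = []
    for _ in range(runs):
        groups = ([], [])
        if divert_by_cookie:
            # All queries of a cookie go to the same group.
            for cookie in cookies:
                groups[int(random.random() < 0.5)].extend(cookie)
        else:
            # Each query is assigned on its own.
            for q in queries:
                groups[int(random.random() < 0.5)].append(q)
        diffs.append(coverage_diff(*groups))
    return statistics.stdev(diffs)

sd_query = empirical_sd(divert_by_cookie=False)
sd_cookie = empirical_sd(divert_by_cookie=True)

print(f"analytical SE:               {se_analytical:.4f}")
print(f"empirical SD, query-diverted:  {sd_query:.4f}")
print(f"empirical SD, cookie-diverted: {sd_cookie:.4f}")
```

Under these assumptions the query-diverted result should sit close to the analytical value, while the cookie-diverted result should be clearly larger, because queries from the same cookie are correlated and move between groups together.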