
“Multi-armed bandit” A/B testing optimality proved?

Correct me if I’m wrong, but it seems that this paper proves the optimality of the “multi-armed bandit” approach to A/B testing. The latter was described in this post earlier this year.

For those who do not know what this is about: A/B testing requires an investment in the form of sample size (usually the number of unique users), which means time and money. The “multi-armed bandit” approach is about optimising this investment by shifting traffic towards the better-performing variant as the evidence accumulates.
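To illustrate the idea, here is a minimal epsilon-greedy sketch in Python — one of the simplest bandit strategies, not necessarily the one analysed in the paper; the conversion rates and trial count are made up:

```python
import random

def epsilon_greedy(counts, rewards, epsilon=0.1):
    """Pick a variant: explore a random arm with probability epsilon,
    otherwise exploit the arm with the best observed conversion rate."""
    if random.random() < epsilon:
        return random.randrange(len(counts))
    rates = [r / c if c > 0 else 0.0 for r, c in zip(rewards, counts)]
    return max(range(len(rates)), key=rates.__getitem__)

# Simulated experiment: variant B (index 1) truly converts better.
true_rates = [0.03, 0.05]
counts, rewards = [0, 0], [0, 0]
for _ in range(10000):
    arm = epsilon_greedy(counts, rewards)
    counts[arm] += 1
    rewards[arm] += 1 if random.random() < true_rates[arm] else 0

print(counts)  # most traffic tends to end up on the better variant
```

Unlike a fixed-split A/B test, the sample size "spent" on the losing variant shrinks as the experiment runs, which is exactly the investment being optimised.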

I wouldn’t say you’re ancient if you aren’t doing it already, but it’s interesting to see how abstract science creates new opportunities for business.


Measuring time spent on page

One of the challenges of A/B testing is insufficient observations due to low traffic. In other words, if you measured the conversion rate on your web site, it might take months or even years before you got a conclusive result. What you can try to measure instead are microconversions and microobservations. That’s what I was up to recently. There are a couple of microobservation types I have identified so far: time spent and depth. Time spent is how much time a visitor has spent on the web site, in seconds, and depth is how many clicks they made after seeing the landing page. As you might notice, you always have some time-spent and depth measurements, unless the visitor is a bot.

The other way you can enlarge your data set is by using visits instead of visitors. For the time-spent and depth metrics this makes much more sense.

I used the standard Nginx userid module to identify visitors. When a visitor requests a page, a special action in the C++ application is invoked through a subrequest using the SSI module. This action registers the UID and the experiment in a memory table and assigns a variant (A or B). It then returns the variant in its response, and the result gets stored in an Nginx variable. After that I use the value of this variable to display the proper variant of the page.
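In Python rather than C++, and with made-up names, the assignment step could look roughly like this — the key point being that the (UID, experiment) pair maps to a sticky variant:

```python
import random

# In-memory table: (uid, experiment) -> variant.  A sketch of the C++
# action described above; the names here are illustrative, not the real code.
assignments = {}

def assign_variant(uid, experiment):
    """Register the visitor and return a sticky A/B variant.
    Subsequent requests with the same UID get the same variant."""
    key = (uid, experiment)
    if key not in assignments:
        assignments[key] = random.choice(["A", "B"])
    return assignments[key]

variant = assign_variant("d41d8cd9", "landing-page-test")
assert variant == assign_variant("d41d8cd9", "landing-page-test")  # sticky
```

Stickiness matters: if the same visitor saw variant A on one page view and B on the next, the per-visitor measurements would be meaningless.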

To track time I use a JavaScript snippet that sends periodic updates to the server. Nginx forwards these requests to the C++ application via FastCGI, and the application updates the timestamps in the memory tables. The depth tracker works the same way, but its tracking action is invoked only when a page is loaded. Although periodic updates might produce an intensive load on the server even for medium-sized sites, as you might already know, for Nginx it’s a piece of cake.
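A sketch of that server-side bookkeeping, again in Python rather than C++ and with illustrative names: each ping moves the visitor’s last-seen timestamp forward, and time spent is the span between the first and last ping.

```python
import time

# uid -> {"first": t, "last": t, "depth": n}; stands in for the memory table.
sessions = {}

def on_ping(uid, now=None):
    """Handle a periodic update from the page's JavaScript timer."""
    now = now if now is not None else time.time()
    s = sessions.setdefault(uid, {"first": now, "last": now, "depth": 0})
    s["last"] = now

def on_page_load(uid, now=None):
    """Handle the depth tracker, which fires once per page load."""
    on_ping(uid, now)
    sessions[uid]["depth"] += 1

def time_spent(uid):
    """Seconds between the visitor's first and most recent ping."""
    s = sessions[uid]
    return s["last"] - s["first"]
```

A visitor who stops pinging simply stops advancing their last-seen timestamp, so no explicit "session end" event is needed.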

A separate thread in the C++ application saves the content of memory tables to a file periodically, and that’s how the observations get stored permanently.

Of course this approach requires JavaScript enabled in the client’s browser, but who doesn’t have it nowadays? A positive side effect is that bots get filtered out automatically.

One of the interesting questions is what statistical distributions the time spent and the depth follow. My hypothesis was that they are exponentially distributed, but this is still not completely clear to me. I spent some time implementing code for calculating statistical properties of the exponential distribution; it is not trivial, and the results don’t look very trustworthy, so I haven’t had success with the exponential distribution yet. For now I’m using normal-distribution properties for the time spent and the depth. After removing outliers, these numbers look very trustworthy.
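For the normal-distribution route, the summary statistics are straightforward. Here is a minimal sketch, assuming a simple k-sigma outlier trim — the threshold, data, and function names are my own, not from the actual application:

```python
import math

def remove_outliers(xs, k=3.0):
    """Drop points more than k sample standard deviations from the mean."""
    m = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))
    return [x for x in xs if abs(x - m) <= k * sd]

def normal_summary(xs):
    """Mean and its standard error under a normal model."""
    m = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))
    return m, sd / math.sqrt(len(xs))

def exponential_rate(xs):
    """MLE of the exponential rate parameter: lambda-hat = 1 / sample mean."""
    return len(xs) / sum(xs)

# Time-spent samples in seconds; 300 is a visitor who left a tab open.
times = [5, 7, 9, 11, 13] * 4 + [300]
trimmed = remove_outliers(times)
mean, stderr = normal_summary(trimmed)
```

Note the contrast: the exponential point estimate is trivial, but its confidence intervals involve chi-squared quantiles rather than the familiar normal machinery, which is likely where the "not trivial" part comes in.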