Efficiently Tracking Response Time Percentiles

November 6th, 2009 by  |  Published in Java  |  9 Comments

As we’ve recently started feeling that response times of one of our webapps got worse, we decided to spend some time tweaking the apps’ performance. As a first step, we wanted to get a thorough understanding of current response times. For performance evaluations, using minimum, maximum or average response times is a bad idea: “The ‘average’ is the evil of performance optimization and often as helpful as ‘average patient temperature in the hospital'” (MySQL Performance Blog). Instead, performance tuners should be looking at the percentile: “A percentile is the value of a variable below which a certain percent of observations fall” (Wikipedia). In other words: the 95th percentile is the time in which 95% of requests finished. Therefore, a performance goals related to the percentile could be similar to “The 95th percentile should be lower than 800 ms”. Setting such performance goals is one thing, but efficiently tracking them for a live system is another one.

I’ve spent quite some time looking for existing implementations of percentile calculations (e.g.  1, 2). All of them required storing response times for each and every request and calculate the percentile on demand or adding new response times in order. This was not what I wanted. I was hoping for a solution that would allow memory and CPU efficient live statistics for hundreds of thousands of requests. Storing response times for hundreds of thousands of requests and calculating the percentile on demand does neither sound CPU nor memory efficient.

Such a solution as I was hoping for simply seems not to exist. On second thought, I came up with another idea: For the type of performance evaluation I was looking for, it’s not necessary to get the exact percentile. An approximate answer like “the 95th percentile is between 850ms and 900ms” would totally suffice. Lowering the requirements this way makes an implementation extremely easy, especially if upper and lower borders for the possible results are known. For example, I’m not interested in response times higher than several seconds – they are extremely bad anyway, regardless of being 10 seconds or 15 seconds.

So here is the idea behind the implementation:

  1. Define any random number of response time buckets (e.g. 0ms – 100ms, 100ms – 200ms, 200ms – 400ms, 400ms – 800ms,800ms – 1200ms, …)
  2. Count number of responses and number of response each bucket (For a response time of 360ms, increment the counter for the 200ms – 400ms bucket)
  3. Estimate the n-th percentile by summing counter for buckets until the sum exceeds n percent of the total

It’s that simple. And here is the code:  PercentileCounter.java and Percentile.java

Some highlights:

public void increment(final int millis) {
    final int i = index(millis);
    if (i < _limits.length) {
public int estimatePercentile(final double percentile) {
    if (percentile < 0.0 || percentile > 100.0) {
        throw new IllegalArgumentException("percentile must be between 0.0 and 100.0, was " + percentile);
    for (final Percentile p : this) {
        if (percentile - p.getPercentage() <= 0.0001) {
            return p.getLimit();
    return Integer.MAX_VALUE;

This approach only requires two int values (= 8 byte) per bucket, allowing to track 128 buckets with 1K of memory. More than sufficient for analysing response times of a web application using a granularity of 50ms). Additionally, for the sake of performance, I’ve intentionally implemented this without any synchronization(e.g. using AtomicIntegers), knowing that some increments might get lost.

By the way, using Google Charts and 60 percentile counters, I was able to create a nice graph out of one hour of collected response times:

Response times


  1. Markus Kohler says:

    November 10th, 2009 at 12:39 am (#)

    The graph looks very cool!

  2. Dag Liodden says:

    November 11th, 2009 at 12:20 pm (#)

    Great stuff! You forgot to let PercentileCounter implement Iterable to make it compile though. 🙂 Go a class for generating the GChart url too?

  3. Stefan Fußenegger says:

    November 11th, 2009 at 2:15 pm (#)

    @Markus Thanks! I’ve always believed that data is nothing without (sexy) visualization.

    @Dag Thanks, fixed that. I’veuploaded two additional files: (IntervalPercentileChartImage.java and GoogleChartModel.java) While these files require some effort to make them compile, they should give a very good start to generate Google Chart URLs from PecentileCounters. That’s the extra effort you could have avoided with a small donation – you’d be surprised what people do for beer 😀 (After doing this yourself, I’d be very happy to upload the changed/working version though)

  4. Andy says:

    January 5th, 2010 at 9:10 pm (#)

    Try the simple algorithm defined in the paper “Sequential Procedure for Simultaneous Estimation of Several Percentiles” (Raatikainen). It’s fast, requires 2*m+3 markers (for m percentiles) and tends to an accurate approximation quickly.

  5. Stefan Fußenegger says:

    January 11th, 2010 at 3:56 pm (#)

    Hi Andy, thanks for the hint. I’ll *try* to have a look.

  6. Jakob Eriksson says:

    August 7th, 2012 at 10:18 am (#)

    @Andy, simple I am sure, but to me at least, the algorithm is hidden in mathematical notation. I don’t understand that paper. 🙂

  7. Andre says:

    March 29th, 2013 at 9:53 pm (#)

    Very nice, this helped immensely. I’ve craeted a .net version based on this (http://andrevdm.blogspot.com/2013/03/efficiently-tracking-response-time.html).

    Thanks for the great explanation

  8. Stefan Fußenegger says:

    April 3rd, 2013 at 3:14 pm (#)

    Hi Andre,

    thanks for your comment and giving credit in your blog post. Do you mind telling me what percentiles you are tracking?

    Cheers, Stefan

  9. Andre says:

    April 3rd, 2013 at 5:57 pm (#)

    Hi Stefan,

    Initially performance stats for fetching data for a web service from a SQLite DB. Used to profile & compare in memory vs. on disk. The percentiles and the historic view gave a huge amount of insight over what I was originally getting from averages

    Going forward I’ll be using it to track the performance of many different things, web services, db calls, proprietary endpoints, message queues etc

    Very useful!

    Thanks again

Leave a Response