The Quantile-Quantile Plot
June 6, 2008. Posted by gordonwatts in statistics.
We often need to compare two histograms for similarity – we want to know whether the two histograms come from the same source. Perhaps the most common case for us is comparing our Monte Carlo-based background model to actual data. If we didn't get the background model right, we can't go on and look for any signal!
We often have hundreds, if not low thousands, of these plots to compare. And we can't really do it by eye – that introduces a human bias. So we usually resort to various statistical methods to compare them. The most common that I'm aware of is the so-called "K-S" test (Kolmogorov-Smirnov test). It produces a single number, and the nice thing is that you can sort on that number, look at the worst cases, and use those to guide you in finding something incorrect.
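For reference, the two-sample K-S statistic is just the largest gap between the two empirical cumulative distributions. A minimal NumPy sketch (the samples and names here are my own toy choices, not anything from our analysis code):

```python
import numpy as np

def ks_statistic(sample1, sample2):
    """Two-sample K-S statistic: the maximum absolute difference
    between the two empirical CDFs."""
    s1, s2 = np.sort(sample1), np.sort(sample2)
    grid = np.concatenate([s1, s2])
    cdf1 = np.searchsorted(s1, grid, side="right") / len(s1)
    cdf2 = np.searchsorted(s2, grid, side="right") / len(s2)
    return float(np.max(np.abs(cdf1 - cdf2)))

rng = np.random.default_rng(0)
same = ks_statistic(rng.normal(0, 1, 5000), rng.normal(0, 1, 5000))
shifted = ks_statistic(rng.normal(0, 1, 5000), rng.normal(0.5, 1, 5000))
# Sorting many comparisons by this number flags the worst disagreements first.
```

Same-source samples give a small statistic; the shifted pair gives a much larger one, which is exactly what makes the single number sortable.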
Recently, on an internal discussion list, someone proposed the so-called "q-q" plot, short for quantile-quantile plot. The plot attached to this blog posting is an example. There are two batches of data, #1 along the vertical axis and #2 along the horizontal axis. Say that in sample #2, 20% of the data is below 475, but in sample #1, 20% of the data is below 550 – that gives you the point (475, 550). Do that for every percent fraction ("quantile") to make a plot like the one above. If the two samples were similar, you would expect the points to fall along the 45 degree line (scattered on both sides of it due to statistical fluctuations).
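The construction is just matched percentiles of the two samples; a sketch of the idea (function and variable names are mine):

```python
import numpy as np

def qq_points(sample1, sample2, percents=range(1, 100)):
    """For each percent p, pair the value below which p% of sample #2
    lies (x axis) with the same quantile of sample #1 (y axis)."""
    ps = list(percents)
    return np.percentile(sample2, ps), np.percentile(sample1, ps)

rng = np.random.default_rng(1)
x, y = qq_points(rng.normal(500, 100, 10_000), rng.normal(500, 100, 10_000))
# Same-source samples: the (x, y) points scatter around the line y = x.
```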
I like this a lot better than a single number – you can tell what went wrong. The only disadvantage is that you can't sort by these plots and then start by looking at the worst agreement. I wonder whether something like the sum of the deviations from the 45 degree line would order the plots in a way similar to the K-S test?
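One way to try that ordering idea – just a sketch, and the normalization by the overall spread is my own choice to make the number unitless:

```python
import numpy as np

def qq_badness(sample1, sample2, percents=range(1, 100)):
    """Mean absolute deviation of the q-q points from the 45 degree
    line, scaled by the overall spread, so plots can be ranked."""
    ps = list(percents)
    q1 = np.percentile(sample1, ps)
    q2 = np.percentile(sample2, ps)
    scale = np.std(np.concatenate([sample1, sample2]))
    return float(np.mean(np.abs(q1 - q2)) / scale)

rng = np.random.default_rng(2)
ref = rng.normal(0, 1, 5000)
good = qq_badness(ref, rng.normal(0, 1, 5000))
bad = qq_badness(ref, rng.normal(0.5, 1, 5000))
# Sorting plots by this number should surface the worst agreement first.
```

Whether this orders plots the same way as the K-S test is exactly the open question – it weights all quantiles equally, while K-S looks only at the single worst point of the CDF difference.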
More TFractionFitter
March 30, 2008. Posted by gordonwatts in physics, statistics.
A very technical post on the performance of TFractionFitter.
Looking at the original Barlow and Beeston paper, we found that the approximation breaks down in the case of weighted events. If you want to have some fun, change the initial normalization of the templates that you pass to TFF.
As I mentioned, we used this fit method in some of our current work. For my Friday night entertainment I decided to see if I could test it out a bit.
Constructing a few test cases and then running them thousands of times (ensemble tests) isn't all that hard (well, see below). Recall that TFractionFitter takes several Monte Carlo templates and tries to find components of them in a data histogram. I constructed the templates out of two Gaussians for this test. The data histogram I built was 30% of one of the two Gaussians and 70% of the other.
In a simple test where I create the data histogram once, with 10,000 entries, and then re-create the template histograms over and over (about 150 times) and perform the fit each time, I found that the fitter got the fractions correct – it correctly identified the 30% and the 70%. Further, as is mentioned in the paper, when I looked at how much it got the fraction wrong each time, the errors do indeed look underestimated (the exact amount depends on how I arrange the two Gaussians).
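For anyone who wants to reproduce the setup, here is the kind of ensemble test I mean, with a simple least-squares fraction fit standing in for TFractionFitter (TFF does a proper Barlow-Beeston-style likelihood that also accounts for template statistics – this is only a sketch, and all the numbers, means, and widths below are my toy choices):

```python
import numpy as np

rng = np.random.default_rng(3)
bins = np.linspace(-5, 10, 41)  # both Gaussians fully inside the range

def hist(sample):
    counts, _ = np.histogram(sample, bins=bins)
    return counts.astype(float)

# One "data" histogram: 30% Gaussian A, 70% Gaussian B, 10,000 entries.
data = hist(np.concatenate([rng.normal(0.0, 1.0, 3000),
                            rng.normal(4.0, 1.5, 7000)]))

fracs = []
for _ in range(150):  # re-create the templates each time, as in the post
    t_a = hist(rng.normal(0.0, 1.0, 10_000))
    t_b = hist(rng.normal(4.0, 1.5, 10_000))
    # Fit data = N * (f_a * A + f_b * B) with unit-area templates.
    design = np.column_stack([t_a / t_a.sum(), t_b / t_b.sum()]) * data.sum()
    coeffs, *_ = np.linalg.lstsq(design, data, rcond=None)
    fracs.append(coeffs[0])

mean_frac = float(np.mean(fracs))  # should land near the true 0.30
```

With well-separated, fully-contained Gaussians, the average fitted fraction comes out close to the true 30%, matching what I saw from TFractionFitter itself.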
To test the weighting as described by Mike, I altered each template's normalization to be 50% of its initial value, but left the data histogram the same. I got the same result as before. I tried altering the number of entries in one or the other template and still got the same result. So, thankfully, I can't reproduce this (I was using ROOT 5.18 for this testing).
One thing I discovered – my initial choice of Gaussians put about 20% of the second Gaussian off the end of the histogram I was using. The result? TFractionFitter got the answer wrong, consistently, by about 5%. 5% isn't that big a deal if you get your errors right – but it didn't. The fit was almost always more than one sigma low or high (depending on which fraction you looked at) – a clear bias. The distributions we are using in our work look more like falling exponentials – so I'll have to test that case and see if it shows the same thing. This is another buyer-beware comment, similar to Mike's.
I used this weekend project as an excuse to put my money where my mouth is… sort of (meaning I spent time, not money!). I've long argued, on this blog and elsewhere, that if we are really to take advantage of multi-core processors we are going to have to change the way we write code. So this little project seemed a perfect excuse to try a new programming language – a functional programming language. I used F#, which is based on ML (OCaml is the current standard implementation of ML). If anyone is interested, I can write some posts (also technical, like this one) on what it was like. I used F# instead of OCaml in part because it has access to all the ROOT libraries, due to some other work I've done.