Jumping the Gun

April 4, 2011

Posted by gordonwatts in Uncategorized.

The internet has come to physics. Well, I guess CERN invented the internet, but, when it comes to science, our field usually moves at a reasonable pace – not too fast, but not (I hope) too slow. That is changing, however, and I fear some of the reactions in the field.

The first I heard of this phenomenon was some results presented by the PAMELA experiment. The results were very interesting – perhaps indicating dark matter. The scientists showed a plot at a conference to show where they were, but explicitly didn’t put the plot on any public web page or in any paper, to indicate they weren’t done analyzing the results or understanding their systematic errors. A few days later a paper showed up on arXiv (which I cannot locate) using a picture taken during the conference while the plot was being shown. Of course, the obvious thing to do here is: not talk about results before they are ready. I and most other people in the field looked at that and thought that these guys were getting a crash course in how to release results. The rule is: you don’t show anything until you are ready. You keep it hidden. You don’t talk about it. You don’t even acknowledge the existence of an analysis unless you are actually releasing results that you are ready for the world to get its hands on and play with as it may.

I’m sure something like that has happened since, but I’ve not really noticed it. But a paper out on the archive on April 1 (yes) seems to have done it again. This is a paper on a set of Z’ models that might explain a number of the small discrepancies at the Tevatron. A number of the results they reference are released and endorsed by the collaborations. But there is one source that isn’t – it is a thesis: Measurement of WW+WZ Production Cross Section and Study of the Dijet Mass Spectrum in the l-nu + Jets Final State at CDF (really big download). So here is a group of theorists, basically, announcing a CDF result to the world. That makes me a bit uncomfortable. What is worse, however, is how they reference it:

In particular, the CDF collaboration has very recently reported the observation of a 3.3σ excess in their distribution of events with a leptonically decaying W± and a pair of jets [12].

I’ve not seen any paper released by the CDF collaboration yet – so that above statement is definitely not true. I’ve heard rumors that the result will soon be released, but they are rumors. And I have no idea what the actual plot will look like once it has gone through the full CDF review process. And neither do the theorists.

Large experiments like CDF, D0, ATLAS, CMS, etc. all have strict rules on what you are allowed to show. If I’m working on a new result and it hasn’t been approved, I am not allowed to even show my work to others in my department except under a very constrained set of circumstances*. The point is to prevent this sort of paper from happening. But a thesis, which was the source here, is a different matter. All universities that I know of demand that a thesis be public (as they should). And frequently a thesis will show work that is in progress from the experiment’s point of view – so they are a great way to look and see what is going on inside the experiment. However, now with search engines one can do exactly the above with relative ease.

There is all sorts of potential for over-reaction here.

On the experiment’s side they may want to put restrictions on what can be written in a thesis. This would be punishing the student for someone else’s actions, which we can’t allow.

On the other hand, there has to be a code-of-standards that is followed by people writing papers based on experimental results. If you can’t find the plot on the experiment’s public results pages then you can’t claim that the collaboration backs it. People scouring the theses for results (as you can bet there will be more now) should get a better understanding of the quality level of those results: sometimes they are exactly the plots that will show up in a paper, other times they are an early version of the result.

Personally, I’d be quite happy if results found in theses would stimulate conversation and models – and those could be published or submitted to the archive – but then one would hold off making experimental comparisons until the results were made public by the collaboration.

The internet is here – and this information is now available much more quickly than before. There is much less hiding-thru-obscurity than there has been in the past, so we all have to adjust. 🙂

* Exceptions are made for things like job interviews, students presenting at national conventions, etc.

Update: CDF has released the paper…

Comments»

1. Andy - April 4, 2011: As long as we’re going to dig stuff up with google, look at this earlier talk on the same CDF analysis:

Cavaliere.pdf

Page 20 shows that the excess at 150 GeV is only in the muon channel… that makes it more or less interesting, depending on your perspective. 🙂
And page 22 shows an Mjj plot of the same data where the excess is much less convincing.

Bjoern - April 4, 2011: Interestingly enough the excess in EM is seen in the thesis but not in the RA talk above. I am not able to figure out which is the more recent plot and what the differences are. Would be interesting to know.

2. Mike Miller - April 4, 2011: I think this is a topic worthy of a lot more discussion. I do think that the importance and sophistication of HEP Collider results merit a far more thorough internal “vetting” than some other portions of science. However, I think that in HEP we’ve lost something very, very valuable in moving science along, namely the ability to stand back and say “we don’t fully understand these data.” There used to be a clear-cut boundary where that question could be asked (anywhere within the collaboration) but I personally witnessed that line retreating backwards to the working group and, sometimes, further. Somehow, I think that is very wrong. I’d much prefer a more open discussion of work in progress in the hopes of pushing the field along faster. Witness the Dark Matter bonanza that I’ve gotten involved in. We literally debate findings and detector acceptance on the arXiv. It feels more combative, sure, but I know personally that the public banter has helped move the field forward faster than private debate on internal Ge backgrounds or scintillation efficiency would have.

Second, I think we’ve also lost the ability to be wrong and make mistakes. I’d personally be game for moving the field forward by releasing an incomplete or even flat out mis-interpreted result if I was reasonably certain that someone more clever than me (or my 500 collaborators) could provide a better understanding of the situation, likely by having a broader context in which to interpret and analyze.

Lastly, I feel that in HEP the following is also at play: there is simply a mass shortage of really new results, and we are in a mode where we need to squeeze the last drops of water out of the stone. I believe it was Ting that famously said, “If you design the right experiment, you don’t need statistics.” We are now putting enormous effort in for important, but fleetingly rare, new insights into nature.

Sorry, that came off pretty “ranty”

3. gordonwatts - April 4, 2011: http://arxiv.org/abs/1104.0243 – Bjorn pointed out there are more (and it is going to get worse over the next few days). We need a “trending topics” or “trending papers” feature for the pre-print archive!

4. gordonwatts - April 4, 2011: Thanks, Andy & Bjoern, for the posts. I think that is the point – we need to wait for CDF to actually release the paper and then we can know what everyone settled on as the final result.

5. Gordon Watts - April 4, 2011: Crap. Second re-post. Sometimes the web interface sucks! 🙂

Mike – no worries about the rant. This is a blog not a big physics conference – so you are supposed to rant!

On your first two points, I’d like to see what others say… I’ve got some ill-formed opinions at the moment. So I hope others will chime in. I’d be particularly curious whether opinions differ with age/career stage.

Your last point. “Build the right experiment” – well, if you know something is there you can build the right experiment. Or you just get lucky, and the 100 experiments before were wrong – and despite their failure guiding you to get the right one, they get no credit. You are right that we are putting enormous effort into understanding the way the universe works. And, implied, I think, in your statement is a value judgement: is it worth it?

6. A Discovery At the Tevatron! – Maybe « Collider Blog - April 6, 2011: […] — even before the CDF Collaboration released their results! (I agree completely with Gordon Watts’ comments about this.) You can also read a report in the New York Times. I’ll not comment on […]

7. Tony Smith - April 6, 2011: Gordon,
I will not here talk about how my physics model might be supported by the results of arXiv 1104.0699 (I already did that in a comment over at Tommaso’s blog)
but
here are questions about the interplay of Viviana Cavaliere’s thesis and the big splash of publicity (New York Times etc):

The Cavaliere thesis has been available for public download from Fermilab as FERMILAB-THESIS-2010-51 for some time,
and seems to contain all the interesting results of 1104.0699,
such as the thesis saying: “… the discrepancy found between the best fit to the dijet mass of the known components and the data in the region [120; 160] GeV/c… has a significance of 3.3σ.
The latter results are in the process of being approved by the Collaboration …”.

How is it that the Collaboration allows public access to the results of 1104.0699 prior to approval by the Collaboration
and
then, substantially later,
makes a big press-publicity splash about it as though it had just been released for the first time ?

A second question is: Tommaso Dorigo said in a comment on Peter Woit’s blog:
“… it is nice that CDF is not afraid to publish these kinds of signals any more –
ten years ago they would rather sit on such a thing and die than publish it …”.
What caused that change in attitude by CDF ?

Tony

8. Gordon Watts - April 6, 2011: Fit on a falling distribution… 🙂

So, it *has* been released for the first time. Regardless of the fact that the two plots might be identical (or not), what was in the thesis was not CDF saying it had a new result. That was one of the main points of this post. Until it appears on the experiment’s public page the experiment has not signed off and approved the result. A thesis is, in many respects, independent of the experiment.

As to what has changed in CDF – I don’t know as I’m not in CDF any longer. You should ask them. 🙂 However, as the Tevatron gets towards end-of-run, it is less and less likely that you can add a large amount of data to your result – so this might be the best you can do – so you release it w/out interpretation (this is not the case here as I think one can x2 the data even now – and I’m sure they are very busy doing that). Also, the competition with the LHC means that if you have something you’d better get it out there. You’ll be remembered more for getting it right than making a mistake once in a while (depending on how bad the mistake is, of course). But these are just guesses.

9. Suspicious Bump « Not Even Wrong - April 6, 2011: […] For other blog postings about this well worth reading, try Michael Schmitt, Resonaances, Gordon Watts and Flip […]

10. S. Olsen - April 7, 2011: I think this shows that there are too many theorists, and they have too little to do.

11. Matti Pitkänen - April 7, 2011: TGD suggests two explanations for the possibly discovered new boson. The first explanation is in terms of an octet of exotic weak gauge bosons predicted by the topological explanation of the family replication phenomenon. This explanation fails because the boson prefers to decay to quark pairs.

The second explanation is in terms of a scaled-up copy of hadron physics, a 15-year-old idea, whose proton would have mass around 512 GeV. I talked about this with Tony Smith a long time ago! The pions of this physics would be produced abundantly and the charged pion would decay to W and quark pairs in accordance with the basic observation. The mass of the pion would be by naive scaling 71.7 GeV but p-adic scaling by factor two is possible and produces 143.4 GeV mass: Lubos mentions 145 GeV mass as the most probable estimate.

For details see my blog posting

Matti Pitkanen

12. Phil - April 9, 2011: I entirely agree with Mike Miller: I think we err too much on the side of attempting to only present things when every i is dotted and every t crossed, as if our colleagues are unable to interpret anything more complicated than a single number with an error bar.

As a concrete example, I recently heard someone opine that we couldn’t possibly release the results from two different analyses of the same data without indicating that one is the “preferred” result. This seems to me to imply that people don’t realise that two equally good, equally correct analyses can give different results.

14. gordonwatts - April 11, 2011: Phil – sorry, workshops and traveling have meant I’ve not been able to get back to this.

So, I’ve been part of these discussions. If two analyses measure the same thing and get different answers then there is no way I’d let it out of the door either. You are using the same data in both cases, and the same data is giving you two different answers to the same question.

Perhaps you are referring to something else?

As far as our colleagues interpreting a single number or error bar – at some level, isn’t that true? Imagine if you had to totally understand the jet energy scale determination in order to interpret a result? So – where do we draw the line? What is too experiment-specific and thus should be carefully measured by the collaboration, and what dirty laundry is it ok to let hang out?

15. Phil - April 11, 2011: Gordon, I see your point. I guess I was thinking more of the case where the two results are (statistically) consistent with one another, or (in the case I was referring to) where the difference in the results comes from a different statistical method for coming up with the error bar: one doesn’t have to read much of the PHYSTAT talks and proceedings to realise that error bars are not necessarily unique!

Your question about how much interpretation should be asked of the reader is interesting, and I don’t know exactly where to draw the line, but I think perhaps the discussions within collaborations tend to underestimate how much interpretation readers (especially specialists within the field) can and do do when looking at a result (see: any journal club, or the jet energy scale plots of the W->jj data from CDF that Tommaso Dorigo posted on his blog [ http://www.science20.com/quantum_diaries_survivor/blog/jet_energy_scale_explanation_cdf_signal-77886 ])
