Reproducibility… September 26, 2013 Posted by gordonwatts in Analysis, Data Preservation, Fermilab, reproducible.
I stumbled across an article on reproducibility recently, “Science is in a reproducibility crisis: How do we resolve it?”, with the following quotes which really caught me off guard:
Over the past few years, there has been a growing awareness that many experimentally established "facts" don’t seem to hold up to repeated investigation.
They made a reference to a 2010 alarmist New Yorker article, The Truth Wears Off (there is a link to a PDF of this article on the website, but I don’t know if it is legal, so I won’t link directly here).
Read that quote carefully: many. That means a lot. It would be all over! Searching on the internet, I stumbled on a Nature report. They looked carefully at a database of medical journal publications and retraction rates. Here is an image of the retraction rates they found as a function of time:
First, watch out for the axes here – multiply the numbers on the left by 10 to the 5th (100000), and the numbers on the right by 10 to the –2 (0.01). In short, the peak rate is 0.01%. This is a tiny number. And, as the report points out, there are two ways to interpret the results:
This conclusion, of course, can have two interpretations, each with very different implications for the state of science. The first interpretation implies that increasing competition in science and the pressure to publish is pushing scientists to produce flawed manuscripts at a higher rate, which means that scientific integrity is indeed in decline. The second interpretation is more positive: it suggests that flawed manuscripts are identified more successfully, which means that the self-correction of science is improving.
The truth is probably a mixture of the two. But this rate is still very very small!
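To make the axis arithmetic concrete, here is a small Python sketch. The tick value of 1.0 at the peak is my assumption, inferred from the 0.01% quoted above, since the plot itself is not reproduced here. The function name is made up for illustration.

```python
def scaled_axis_value(tick_label: float, exponent: int) -> float:
    """Convert a tick label on an axis marked 'x 10^exponent' to its real value."""
    return tick_label * 10.0 ** exponent


# Assumption: the right-hand axis is in percent and peaks near a tick of 1.0.
peak_percent = scaled_axis_value(1.0, -2)           # percent of papers retracted
peak_fraction = peak_percent / 100.0                # same rate as a plain fraction
papers_per_retraction = round(1.0 / peak_fraction)  # "one paper in N"

print(f"peak rate: {peak_percent}% (about 1 paper in {papers_per_retraction})")
```

Read this way, a peak of 0.01% is roughly one retraction for every ten thousand papers.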
The reason I harp on this is because I’m currently involved in a project that contains reproducibility as one of its possible uses: preserving the data of the DZERO experiment, one of the two general purpose detectors on the now-defunct Tevatron accelerator. Through this I’ve come to appreciate exactly how difficult and potentially expensive this process might be. Especially in my field.
Let’s take a very simple example. Say you use Excel to process data for a paper you are writing. The final number comes from this spreadsheet and is copied into the conclusions paragraph of your paper. So you can now upload your Excel spreadsheet to the journal along with the draft of the paper. The journal archives it forever. If someone is puzzled by your result, they can go to the journal and download the spreadsheet and see exactly what you did (aka modern economics papers). Win!
Only wait. What if the numbers that you typed into your spreadsheet came from some calculations you ran? Ok. You need to include that. And the inputs to the calculations. And so on and so on. For a medical study you would presumably have to upload the anonymous medical records of each patient, and then everything from there to the conclusion about a drug’s safety or efficacy. Uploading raw data from my field is not feasible – it is petabytes in size. This is all ad-hoc – the tools we use do not track the data as they flow through them.
As an early prof I was involved in a study that was trying to replicate and extend a result from a prior experiment. We couldn’t. The group from the other experiment was forced to resurrect code on a dead operating system, and figure out what they did – reproduce it – so they could ask our questions. The process took almost a year. In the end we found one error in that original paper – but the biggest change was just that modern tools had a better model of the physics, and that was the main reason we could not replicate their results. It delayed the publication of our paper by a long time.
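As a rough illustration of what "tracking the data as they flow" could mean, here is a minimal Python sketch. It is not how any real analysis framework works. The class `ProvenanceLog`, the helper `fingerprint`, and the sample numbers are all invented for this example. The idea is simply to record a hash of every input and output at each step, so the chain from raw numbers to final result can be checked later.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """Return a stable SHA-256 digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ProvenanceLog:
    """Hypothetical helper: runs a step and records what went in and what came out."""

    def __init__(self):
        self.records = []

    def run(self, step_name, func, *inputs):
        output = func(*inputs)
        self.records.append({
            "step": step_name,
            "inputs": [fingerprint(i) for i in inputs],
            "output": fingerprint(output),
        })
        return output


log = ProvenanceLog()
raw = [1.2, 0.8, 1.1, 0.9]  # made-up measurements
cleaned = log.run("drop_low", lambda xs: [x for x in xs if x >= 0.9], raw)
mean = log.run("mean", lambda xs: sum(xs) / len(xs), cleaned)

# The output hash of one step matches the input hash of the next,
# which is what lets someone follow the chain back to the raw data.
print(json.dumps(log.records, indent=2))
```

Even this toy version hints at the cost: every tool in the chain has to cooperate, and the raw inputs still have to be stored somewhere.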
So, clearly, it is useful to have reproducibility. Errors are made. Bias gets involved even with the best of intentions. Sometimes fraud is involved. But these negatives have to be balanced against the cost of making all the analyses reproducible. Our tools just aren’t there yet and it will be both expensive and time consuming to upgrade them. Do we do that? Or measure a new number, rule out a new effect, test a new drug?
Given the rates above, I’d be inclined to select the latter. And have a process of evolution of the tools. No crisis.
So long, and thanks for all the protons! September 29, 2011 Posted by gordonwatts in D0, Fermilab, physics life.
And there were a lot of protons!
This is a picture of the Cockcroft-Walton at Fermilab’s Tevatron. This is where it all starts.
It isn’t that much of an exaggeration to say that my career started here. You are looking through a wire cage at one half of the Cockcroft-Walton – the generator creates a very very very large electric field that ionizes hydrogen gas (two protons and two electrons) by ripping one of the protons off. The gas, now charged, can be accelerated by an electric field. This is how protons start in the Tevatron.
And that is how most of the experimental data that I used for my Ph.D. research, post-doc research, and tenure research started. Basically, my career from graduate student to tenure is based on data from the Tevatron. The Tevatron delivers its last beam this Friday, at 2pm Central time (the 30th).
I’ll miss working at Fermilab. I’ll miss working at DZERO (the most recent Fermilab experiment I’ve been on). I’ll also miss the character of the experiments – CDF and DZERO now seem like such small experiments. Only 500 authors. I feel like I know everyone. It is a community in a way that I’ve not felt at the LHC yet. And I’ll miss directly owning a bit of the experiment – something I joined the LHC too late to do. But most of all I’ll miss the people. True – many of them have made the transition to the LHC – but not all of them. For reasons of travel, or perhaps retirement, these people I’ll probably see a lot less over the next 10 years. And that is too bad.
I’ll remain connected with DZERO for some time to come. I’m helping out with doing some paper reviews and I’m helping out with data preservation – making sure the DZERO data can be accessed long after the experiment has ceased running.
Tevatron. It has been a fantastic run. You have made my career. And I’ve had a wonderful time with the science opportunities you’ve provided.
So long, and thanks for all the (anti-)protons.
Tevatron Saw the Haiti Earthquake January 19, 2010 Posted by gordonwatts in D0, Fermilab, physics, physics life.
What you are looking at there is an ACNET plot. I stare at plots similar to this all the time when I’m on shift. The top two plots – the green and red – are position monitors on the quadrupole magnets just outside CDF and D0. They are quite stable until the earthquake. The Tevatron was running when this happened, and you can see in that lower red plot that some protons were knocked out of the ring by the ground shaking.
Note these movements are so small you never would have been able to detect them unaided. However, as my wife put it, “that is one expensive seismograph!” 🙂
See CERN History December 1, 2009 Posted by gordonwatts in CERN, Fermilab, physics life.
This is a quick note to draw your attention to a small retrospective program that CERN has put together – “From the Proton Synchrotron to the Large Hadron Collider – 50 Years of Nobel Memories in High-Energy Physics” – yeah, yeah, it is like a Microsoft product name, but check out the list of speakers – 13 of them are Nobel prize winners. And these are all “memory” talks – so they should be quite entertaining. The event will be video-broadcast over the internet – a link should appear in that agenda page where you can watch. The time is European central – which is 9 hours ahead of Pacific time in the USA.
The context for this event is the turn-on of the LHC, of course. The accelerator recently took the title of “most powerful accelerator in the world” away from Fermilab – and is on its way to a turn-on and real data. Ironically, I was on shift at Fermilab a few hours before this event happened – my plan was to call up the ATLAS control room if it did happen and congratulate them… but I was asleep by the time it actually happened.
I’m at CERN now – and the atmosphere is electric. This review talk is a perfect stepping stone for the future.
Fizzle! August 4, 2009 Posted by gordonwatts in ATLAS, Fermilab, LHC, Tenure, university.
The biggest, most expensive physics machine in the world is riddled with thousands of bad electrical connections.
So starts a mostly accurate article in the New York Times about the current state of the LHC. There is good news and bad news in this sentence. To paraphrase a famous politician currently sight-seeing north of South Korea, it really depends on your definition of the word bad. To most people, if someone says that the electrical connection between your light and the wall socket is bad, then that means your light won’t work. That is the normal definition of bad. We High Energy Physicists have a different definition of bad. 🙂
For us, bad means that the connection isn’t going to conduct as much current as it could (I had a blog post about this a while back – but this article contains an excellent explanation – well worth registering if you have to in order to read it). And this is the reason behind the timing of this article. As I mentioned in that article, it would not be until the beginning of August that the LHC group of scientists would have finished measuring all those connections – all those splices – and know exactly how bad they were. Tomorrow the LHC and CERN will announce exactly what energy they will run the LHC at initially.
But scientists say it could be years, if ever, before the collider runs at full strength, stretching out the time it should take to achieve the collider’s main goals…
And that is the bad part of the news. The bad connections mean that we can’t run at the full 14 TeV energy – we will run something short of that (I’m betting it will be 7.5 TeV – if I get it right it isn’t because I have inside information from the accelerator group!). The article is correct that running at this reduced energy won’t give us the access to the science we’d all expected and hoped for if we were running at 14 TeV.
But another thing to keep in mind is: we need data. Any data. And not to discover something new – because we need to tune up and commission our detectors! We’ve never run these things in anything but a simulated collider environment or looking for cosmic rays. We would probably be able to keep ourselves busy for almost a year with two months of data.
Peter Limon, a physicist from Fermilab got it right:
“These are baby problems,” said Peter Limon, a physicist at the Fermi National Accelerator Laboratory in Batavia, Ill., who helped build the collider.
Indeed, these are birthing problems – no one has ever run a machine like this before. Which brings me to the one spot in the article that got my hackles up:
“I’ve waited 15 years,” said Nima Arkani-Hamed, a leading particle theorist at the Institute for Advanced Study in Princeton. “I want it to get up running. We can’t tolerate another disaster. It has to run smoothly from now.”
Nima, whom I also know (and like), is a theorist. If an experimentalist said this we would all make them run outside, turn around three times, and spit to the north to cancel the jinx they would have just placed on the machine. I think we can all guarantee that there are going to be other failures and problems that occur. We hope none of them are as bad as this last one. But if they are, we will do exactly what we’ve done up to now: pick up the bits, study them, figure out exactly what we did wrong, and then fix it better than it was originally made, and try again.
There was one last quote in that article I would have liked to have seen more of a back story to:
Some physicists are deserting the European project, at least temporarily, to work at a smaller, rival machine across the ocean.
The story behind this is fascinating because it is where science meets humanity. The machine across the ocean is the Tevatron at Fermilab (I’m on one of the experiments there, DZERO). There is plenty of science still there, and the race for the Higgs is very much alive – more so with each delay in the LHC. So scientifically it is attractive. But, there is also the fact that a graduate student in the USA must use real data in their thesis. Thus the delays in the LHC mean that it will take longer and longer for the graduate students to graduate. In the ATLAS LHC experiment the canonical number of graduate students quoted I hear is about 800. Think of that – 800 Ph.D.’s all getting ready to graduate – about 1/3rd or more of them waiting for the first data (talk about a “big bang”). Unfortunately, you can’t be a graduate student forever – so at some point the LHC is taking long enough and you have to move back to the USA in order to get a timely thesis. Similar pressures exist for post-docs and professors trying to get tenure.
UPDATE: Just announced earlier today: they will start with 3.5×3.5 – that is, 7 TeV center of mass. This is exactly half the design energy of the LHC. The hope is that if all runs well at that energy they can slowly ramp up to 4×4, or 8 TeV. At 8 TeV things start to get interesting, as a decent amount of data at 8 TeV will provide access to things that the Fermilab Tevatron can’t. Fingers crossed all goes well!
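For anyone wondering where the "3.5×3.5 is 7 TeV" arithmetic comes from, here is a short Python sketch. It assumes two beams colliding head-on with particle masses that are negligible next to the beam energy, in which case the center-of-mass energy is 2·√(E1·E2). The function name is just for illustration.

```python
import math


def center_of_mass_energy_tev(beam1_tev: float, beam2_tev: float) -> float:
    """Center-of-mass energy for a head-on collision of two beams.

    Assumes the particle masses are negligible compared with the beam
    energies, so sqrt(s) = 2 * sqrt(E1 * E2). For equal beams this is
    simply twice the beam energy.
    """
    return 2.0 * math.sqrt(beam1_tev * beam2_tev)


design = center_of_mass_energy_tev(7.0, 7.0)    # LHC design: 7 TeV per beam
startup = center_of_mass_energy_tev(3.5, 3.5)   # the announced starting point
next_step = center_of_mass_energy_tev(4.0, 4.0) # 4 TeV per beam gives 8 TeV

print(f"design: {design} TeV, startup: {startup} TeV, next step: {next_step} TeV")
print(f"startup is {startup / design:.0%} of the design energy")
```

The same formula shows why a collider beats a fixed-target setup: with equal beams all of the energy of both beams is available in the collision.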
2009. Ready or not January 2, 2009 Posted by gordonwatts in ATLAS, CERN, D0, Fermilab, LHC, politics, science.
We’ve made it through the first day of 2009. I have mixed feelings about this coming year.
- Federal Science Funding Levels. The economy is crashing down around our ears. Business responds quickly (layoffs :() – government is a bit slower. If things followed their natural course of action that would mean science funding, along with everything else, will take yet another hit. However, the incoming Obama administration seems to be committed to spending the USA’s way out of this recession, so in the end funding might not change very much. I am hopeful that hard sciences funding will remain at least stable.
- Federal Science Funding Directions. Climate change is what the Obama administration is focused on. There is a good chance that if you are researching something connected with climate change you may have access to increased funding opportunities. I would expect a funding profile similar to NIH’s funding during its years of increase. I would like to think that funding will spill over into the physical sciences – it should because there are connections between the physical sciences and clean air technologies. All of this is applied scientific research. I hope that the pure research funding gets an increase as well, as an investment in this country’s future (particle physics is pure research, of course). I’m feeling neutral here.
- Federal Science. Obama’s science team is just a BLAST of fresh air when compared to the current administration’s. After all, his DOE nominee is a Nobel prize winning experimental physicist. Even if the science advisor isn’t elevated to a cabinet position (PDF), there will be someone in the room that knows a great deal about science, research, and how it is done. Even if there are cuts to science funding, I’m very hopeful there will be intelligent cuts rather than unscientifically motivated cuts. I’m very hopeful in this respect.
- State Universities. The economy in states is depressing. Some states, like my own (Washington) that rely on sales tax are being hit hard and very fast. State universities can’t escape that, obviously, and my university is no exception. Unfortunately, this usually translates to reduced raises, inability to counter offers from outside, reduced support for research, etc. In our own department I wouldn’t be surprised if some people left for other universities that, for whatever reason, were able to make good offers in this awful climate. There is, in fact, already evidence this is happening. The only consolation is most universities are in the same boat, and so most of them are having similar problems. I know less about private universities, but I do know the endowments of many of them are also having difficulty. I’m very downbeat about this: it will be a rough two years at least, I think.
- My Science. When it comes to the Tevatron and the LHC… Well, I see no reason that the Tevatron shouldn’t continue to break records in luminosity (they just broke one earlier this week). And the experiments will continue to be flooded with data. While it is possible for one experiment or the other to have a catastrophic failure, I doubt that will happen. And they should continue to produce papers and science at a furious rate. I also am looking forward to real LHC collision data this year. While I hope it will be at the full 14 TeV, I suspect it is more likely to be at 2 TeV, just a hair above the Tevatron’s energy. We’ll hopefully know what the machine scientists think about that sometime in February. I’m really hopeful about this.
- New Year’s Resolutions. Well, I made only one. That way I have a hope of keeping it: make bread more often. 🙂 I think there is a chance that I will keep this one. Especially now that I’ve said it publicly. 🙂
Of course, this should also be a fun year, as noted by the Beacon News:
Frustrated with their failed attempt to destroy the world in 2008, the scientists at Fermilab and their counterparts at Switzerland’s CERN physics lab resolve to perfect their new device, the Large Planet-Sucking Black-Hole-o-Tron.
Here is to another great year of data collection and science at the Tevatron and first collision data at the LHC!
Green is a Relative Thing December 23, 2008 Posted by gordonwatts in D0, Fermilab.
I’m ending a series of 3 owl shifts at DZERO right now. The Tevatron, the accelerator at Fermilab, has been going great guns all week. It finally broke today. You know it is bad when the post to Channel 13, the web page that tells you the status of the machine, reads “Experts working on LRF3; no estimate.” The Linac is busted. That means no data for a while.
Looking at the accelerator’s log book (not accessible from outside Fermilab) we found an interesting entry (“we” means me and the other 3 people here on shift):
An energy-conservation timeline has been loaded
We called to find out what that means. Mike, from the Main Control Room, told us that it is like putting your car at idle. The Main Injector normally is constantly ramping protons up to 150 GeV energy and slamming them into a target. It does this about once every 2 seconds. With the Linac broken, however, there are no protons to accelerate – so why ramp every 2 seconds? It takes energy to ramp… The effective equivalent of putting your car at idle when you are at a stop light rather than keeping it revving at 4000 RPM.
Fermilab uses a lot of power – in 2007 the power consumed was about that required to run 45,000 homes. A lot!! As you can imagine this has impacts both on operating costs and general “greenness” (pollution, etc.). There is a broad effort to reduce power at Fermilab, but this is the first one I have seen in the science program. Very cool.
You might ask – since there is no beam, why run the Linac at all? Why not just shut it off? I will point you to a previous posting of mine:
On Tuesday I decided to shut down my home computer. I’m not sure why I decided to do that – I almost never do. … When I hit the “power” button on the 2.5 year old Dell XPS/200 machine the power light briefly flickered yellow… and that was it.
The accelerator is so large and so complex and there are so many different parts (and computers!) that shutting it down and then turning it on is something that is only done when a very long shutdown is planned. Very long means months. Otherwise things fail and then it takes much longer to get back to doing the science.
For those not familiar with the operation of the Tevatron, the “no estimate” isn’t as bad as it might sound. It just means the experts who have looked at the problem scratched their collective heads and said “Hmmm, I don’t recognize this!”. Usually that means it will take several hours to get things going again. Experiments treat it as an opportunity, actually. The machine has no protons circulating and so we can take special calibrations. Or sometimes we can get access to the detector and fix things.
Tomorrow I jump on a plane and my actual Christmas break starts! Happy holidays everyone!
The Atom Smashers November 24, 2008 Posted by gordonwatts in Fermilab, science, USA.
The Atom Smashers (http://www.pbs.org/independentlens/atomsmashers/) will show on PBS on Tuesday night. It looks like it focuses on Fermilab and the particle physics research occurring there. I like their tag line:
After funding cut backs, Fermilab—a premier U.S. government research laboratory focusing on particle physics—is struggling to survive. Physics, politics and international competition collide as scientists race to find one of the most elusive sub-atomic particles ever theorized: the Higgs boson.
Elsewhere on the site the film makers claim they don’t try to answer questions – but rather to get you to "think":
We hope this film will raise the awareness of America’s strange relationship with science. We don’t attempt to answer questions in our film, but rather to raise them. Is this research worth doing? Should we care about it? Should the U.S. participate in it or let it get done elsewhere? Also, we hope to help demystify science and scientists. We’d love it if a viewer came away thinking, “You know, those scientists are not really that different from me."
That last line being one of the main points of this blog!! Leave a comment if you get a chance to see it – I’d like to know what you think!
This show is part of PBS’ Independent Lens project. I have no idea if it will be available online. I hope so as I don’t have a TV receiver (their videos online are all very short, so I might be out of luck)!
P.S. Sorry about the links (and lack of them) – the computer I’m on doesn’t have my normal blogging software and so is a pain-in-the-butt to use.
Parts Not Available? eBay! October 6, 2008 Posted by gordonwatts in D0, Fermilab.
The DZERO detector – or at least parts of it – is quite old now. And the goal of the Tevatron is to collect as much data as possible as cheaply as possible. So what do you do when you need a spare part that hasn’t been manufactured in 12 years? You can redesign the system to use a modern part… or you go to eBay. I didn’t realize this, but this is what we do at Fermilab for very old parts. How cool is that!? Smart use of money… which I’m guessing is going to be in very short supply in the near future!
UPDATE: Turns out I misheard. The parts were actually purchased on the grey market – eBay came up when the person was explaining what the grey market was. Sorry about that!
5 fb-1 – thanks, Fermilab! September 29, 2008 Posted by gordonwatts in D0, Fermilab.
Fermilab just reached 5 fb-1 of data delivered to the experiments. When things started in March of 2001 I don’t think I ever expected us to get here – but the recent performance of the Tevatron has been stellar! The DZERO experiment has recorded 4.36 fb-1 of data (I expect CDF is close to that). The 13% dead time is due to downtime on our detector’s part – broken bits and normal trigger dead time.
The current results the Tevatron is releasing are all for 3 fb-1 of data – so we have an additional 2/5ths worth of data to improve everything (like our Higgs).
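The percentages above are easy to check. Here is a quick Python sketch that uses only the three numbers quoted in this post (5 fb-1 delivered, 4.36 fb-1 recorded, 3 fb-1 in current results). The variable names are mine.

```python
delivered_fb = 5.0    # delivered to the experiments by the Tevatron
recorded_fb = 4.36    # recorded by DZERO
published_fb = 3.0    # data set used in the current results

# Fraction of delivered data that DZERO did not record.
lost_percent = (1.0 - recorded_fb / delivered_fb) * 100.0

# Data delivered beyond what the current results use.
extra_fb = delivered_fb - published_fb
extra_fraction_of_total = extra_fb / delivered_fb      # the "2/5ths"
extra_relative_to_published = extra_fb / published_fb  # growth over 3 fb-1

print(f"not recorded: {lost_percent:.1f}%")
print(f"extra data: {extra_fb} fb-1, {extra_fraction_of_total:.0%} of the total")
print(f"that is {extra_relative_to_published:.0%} more than the published sample")
```

So the "13%" is 12.8% rounded up, and the extra 2 fb-1 is two fifths of the delivered total, or about two thirds more than the 3 fb-1 behind the current results.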