
HEP in a Database March 19, 2008

Posted by gordonwatts in D0, computers.
7 comments

Not everyone is satisfied with ROOT as the “tool” to analyze HEP data. Back in D0’s Run I all the data was loaded into a commercial database.

So, before you roll your eyes - you are right. HEP is littered with database train wrecks (can anyone say Objectivity?). However, most of those had to do with trying to store every single last bit of data that came off the data acquisition system in the database. And then also store reconstructed data. And then, in some cases, even the analysis level objects. In fact, ROOT grew out of disagreement with this vision (and you can tell who won…).

This project, however, was different. The goal was to store only the high level physics information. For a reconstructed jet, for example, they had the four vector and some other quantities (like electromagnetic fraction of calorimeter energies - 28 values in all). They had separate markers for tight, very high quality electrons and loose, lower quality electrons. Same for muons, jets, etc. To understand the limitations of this - and what you might or might not do with this tool: if you changed your jet energy scale you would have to completely re-load the database. This is not something you do frequently, but you get the idea: this is to do your final selection - the last mile of your analysis. Indeed, the test case was to repeat the Run I top discovery analysis. However, if you can do selection quickly, imagine the power for scanning over a large SUSY parameter space!

How much data? About 62 million events. As raw ntuples it was 62.4 GB (small by today’s standards, of course!). It took almost 1000 hours to generate these ntuples - applying jet energy scale, etc. After being inserted into the database it was 80 GB of raw data, and another 30 GB of database index data.

They used Microsoft’s SQL Server for this. On a qual 450 MHz Pentium II with 256 MB of memory. Does that tell you how long ago this experiment was done!?

Actually, their DB design was pretty clever. All electrons in one table, all jets in another. Then another table which just listed all tight electrons, and another one that listed all loose electrons, etc.

So, how fast did this thing run? Looking for a Z boson decaying to two electrons took about 7 seconds. It found about 6000 events - the right number. Looking for a W boson decaying to an electron and neutrino took about 18 seconds to find 86,000 events. That is pretty darn good!
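To make the table layout and the Z search a bit more concrete, here is a minimal in-memory sketch in C++. It is not the project's actual schema or SQL: the `Electron` struct, the `tight_rows` index list (standing in for the "tight electron" table), the function names, and the mass window passed in by the caller are all assumptions made for illustration. The sketch groups tight electrons by event and keeps any event with a pair whose invariant mass falls inside the window.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// One row of a hypothetical "all electrons" table (energies and momenta in GeV).
struct Electron {
    long event_id;
    double e, px, py, pz;
};

// Invariant mass of an electron pair, computed from the summed four-vector.
double invariant_mass(const Electron& a, const Electron& b) {
    const double e = a.e + b.e;
    const double px = a.px + b.px;
    const double py = a.py + b.py;
    const double pz = a.pz + b.pz;
    const double m2 = e * e - px * px - py * py - pz * pz;
    return m2 > 0.0 ? std::sqrt(m2) : 0.0;
}

// tight_rows plays the role of the separate "tight electron" table:
// it only lists row numbers in the main electron table.
// Returns the ids of events with at least one tight pair in [m_lo, m_hi].
std::vector<long> select_z_events(const std::vector<Electron>& electrons,
                                  const std::vector<std::size_t>& tight_rows,
                                  double m_lo, double m_hi) {
    // Group the tight electrons by event, much like a join on the event id.
    std::map<long, std::vector<std::size_t>> by_event;
    for (std::size_t row : tight_rows) {
        assert(row < electrons.size());
        by_event[electrons[row].event_id].push_back(row);
    }

    std::vector<long> selected;
    for (const auto& entry : by_event) {
        const std::vector<std::size_t>& rows = entry.second;
        bool found = false;
        // Try every pair of tight electrons in this event.
        for (std::size_t i = 0; i < rows.size() && !found; ++i) {
            for (std::size_t j = i + 1; j < rows.size() && !found; ++j) {
                const double m = invariant_mass(electrons[rows[i]], electrons[rows[j]]);
                found = (m >= m_lo && m <= m_hi);
            }
        }
        if (found) selected.push_back(entry.first);
    }
    return selected;
}
```

Calling `select_z_events(electrons, tight_rows, 80.0, 100.0)` would pick out events with a tight pair near the Z mass; the 80-100 GeV window is only a placeholder, and a real selection would add charge, isolation, and kinematic cuts.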

Are there plans to do this in ATLAS? Well, perhaps. We have a physics summary database - but it isn’t complete (e.g. it doesn’t have all the jets in an event). Its design goal is different: you use it to select a sample of events you actually want to run over.

The project was led by Rich Partridge at Brown University (with a lot of help from an undergraduate, Matt Bowen). For more raw information you can see a talk by Rich at a SLAC meeting the other day (CERN ATLAS agendas, look for meetings on Feb 27, the SLAC ATLAS forum).

At any rate, this was something I’ve been meaning to write about for a while. Unfortunately for an approach like this, about 95% of an analyzer’s time is spent trying to understand what exactly is a tight electron - and its fake rate. However, anything that makes for fast turn around is a boon in my book!

Understanding an old Level 3 Bug March 17, 2008

Posted by gordonwatts in D0, computers.
1 comment so far

About one or two years ago I had to fix a bug in the D0 DAQ Supervisor. The Supervisor is responsible for coordinating the configuration of 400 or 500 farm nodes and about 80 front end crates that generate the data. It is massively multi-threaded. When it is at its busiest it has over 200 threads running. Most are simply me being too lazy to do anything but block while trying to send data to the Internet. Back in the day it ran on a slow dual-core machine under Linux and I did my best to avoid all locks that I could in my multi-threaded code - because locking is expensive, and the Supervisor needed every bit of speed help it could get back then (on a modern machine it is plenty fast enough).

My code was basically some initialization like the following:

global_a = 1.0;
global_b = 5.0;
global_inited = true;

Once global_inited was set to true, then I knew it was safe for the rest of my other threads to look at a and b:

if (global_inited) {
  // use global_a and global_b …
}

Unfortunately this didn’t always work - sometimes the program behaved as if random values had been entered for a and b. I was never able to reproduce this either. It would happen only once in a while, and restarting the supervisor usually fixed it. Eventually, to fix this bug, I re-structured my code so that all the initialization happened before any other thread was started. After that I never saw the bug again. But I never understood why I was seeing the bug!

A guy who works deep in the stack at Microsoft recently started a blog. One of his first posts explains, possibly, what bit me: the compiler and the CPU (both!!) are allowed to reorder the writes to global_a, global_b, and global_inited!! So another thread can see global_inited as true before a and b have their values. Since this bug was not reproducible it was probably done by the CPU, though at the time I never tested that (or ever really figured out what caused this).
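For what it is worth, here is a sketch of how the same flag pattern could be written in modern C++ so that neither the compiler nor the CPU may move the writes past the flag. This is not what the Supervisor did (it predates `std::atomic`, and my actual fix was to initialize before starting threads); the function names are made up for illustration, and only the three globals come from the snippet above. The key assumption is that the flag is a `std::atomic<bool>` written with release ordering and read with acquire ordering.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

double global_a = 0.0;
double global_b = 0.0;
std::atomic<bool> global_inited{false};

// Writer side: fill in the data first, then publish the flag.
void init_globals() {
    global_a = 1.0;
    global_b = 5.0;
    // Release store: the two writes above may not be reordered after it.
    global_inited.store(true, std::memory_order_release);
}

// Reader side: wait for the flag, then use the data.
double read_sum_when_ready() {
    // Acquire load: once it returns true, the writes made before the
    // matching release store are guaranteed to be visible here.
    while (!global_inited.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return global_a + global_b;
}

// Small demo: one writer thread, one reader thread.
double run_demo() {
    double seen = 0.0;
    std::thread reader([&seen] { seen = read_sum_when_ready(); });
    std::thread writer(init_globals);
    writer.join();
    reader.join();
    assert(seen == 6.0);  // 1.0 + 5.0, never a half-initialized value
    return seen;
}
```

A plain `bool` flag gives no such guarantee, which matches the symptoms I saw. The release/acquire pair is cheaper than a lock, which is what I was after back then.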

Superstition in the D0 Control Room March 14, 2008

Posted by gordonwatts in D0, Fermilab.

There are lots of old superstitions - some of them we still live our lives by. Running a large experiment like D0 is no different. For example, there is a set of ducks along the console - the rumor is if they aren’t there then the whole system will cease to operate. I don’t think anyone has been brave enough to remove them… ;-)

I pulled the following quote from a recent shift report:


Beam was nice for a while.  Then while talking to Bill Lee about losing the beam, we lost the beam, thereby illustrating Bill’s spooky powers in the control room.

Bill has long been making our control room run smoothly, and should know the lesson: don’t talk about losing the beam! You’ll jinx it!! [Technical reason: apparently an important power supply went out of allowed operating range].

What is this helicopter doing? March 3, 2008

Posted by gordonwatts in D0.
7 comments

I spotted this guy hovering behind the D0 outback building last week when I was at Fermilab. What the heck was it doing there? Practice?

Robust Trigger March 2, 2008

Posted by gordonwatts in ATLAS, D0.
2 comments

I’ve got to give a talk on Trigger Robustness next Tuesday (the link is protected unless you are a member of ATLAS - sorry). This is supposed to be robustness in all aspects of the word: against beam-gas events, against DAQ problems, against calibration problems, etc.

If you were attending this thing and were to hear a report from the Tevatron, what would you want to hear?

The Sky Is Falling! January 3, 2008

Posted by gordonwatts in D0, Fermilab, physics life.
4 comments

I’m on shift. At D0 this means we are 4 stories underground, in an all concrete building. One of the shifters looked up and saw a single drop of water fall from the drop-ceiling (think those awful ceilings that every school has).

We called the guy responsible for this sort of thing - even though a single drop of water isn’t much of anything, it seemed like a very odd place: in the middle of a room, many feet from any exterior wall.

After a few minutes of investigation we discovered cracks in the concrete floor and water was dripping through from the large AC unit above (which cools our Level 3 computer farm of 1200 CPUs).

Now no one wants to sit over in that corner of the control room. :-)

End Of Year Gift From the Tevatron December 31, 2007

Posted by gordonwatts in D0, Fermilab, science.

About a week ago or so the Tevatron developed a vacuum leak. The accelerator can’t run under conditions like that: the protons whizzing around tens of thousands of times a second would collide with the air or helium that had leaked in and would go crashing out of the accelerator. It wouldn’t take long before all the protons and anti-protons were lost and there would be no collisions.

So, there is only one thing to do - shut down and fix it. Unfortunately, because of the leak’s location, they had to warm up a superconducting magnet. A controlled warming (and the required cool-down) takes about 4 days. Then you have to fix the leak - so at least a week with no beam.

As a side note - I think the timing was rather interesting. Clearly, the Tevatron was complaining about the budget situation in Washington DC. :-)

At any rate, no one expected beam until tomorrow. This has been great: I’ve not had to do my owl shifts since I got here! But yesterday it started to look like they might make it. In the electronic logbook things like “lets get a store in this year!” started to appear everywhere. As a result, I was on shift from midnight until 8am last night, just in case. And they were busy. And even worse - they were calling all sorts of experts when they had problems - so a lot of people got woken up.

And lo - I just read they did it! That was a lot of work! But it is exactly the right way for an accelerator to end the year: making physics!

Happy New Year!

The Joke Marches On: Julia now a member of D0? October 10, 2007

Posted by gordonwatts in Conference, D0, physics life.

A long long time ago, when I was a post-doc working for Dave Cutts, a Brown University professor, he lamented that he couldn’t make it out for a D0 collaboration meeting. These collaboration meetings are just shy of a week long and involve lots of meetings but also a chance to get together with your colleagues - they can be a lot of fun. That was back in the day when I was feeling rather full of myself, so I decided to do something about Dave missing the meeting.

The result was this collaboration picture. If you look near the bottom, near the front, you’ll see me (you might want to click on the hi-res link). I’m holding up a poster of Dave. See — Dave was there! Unfortunately for me, that turned into one of the more popular collaboration photos. It is on the title slide of many D0 talks, it is on a few posters… And there I am holding up a picture of Dave. A little embarrassing!

[Image: blow-up of Aran]

Needless to say, I’ve not done that again. And I just missed another collaboration meeting - this one was just last week. And there was a collaboration photo - which is pictured above. I mentioned to Aran that I was sad I’d missed it. So, Aran, my post-doc, had a little fun with me. If you look at the high-res version (which takes a while to load), you’ll find the below picture. It turns out that Aran wasn’t the only one having fun - check out the person several rows back.

Good Students! August 27, 2007

Posted by gordonwatts in D0, physics life.

You can’t help but follow the path of your students and feel pretty darn proud of them, no matter how little their current accomplishments actually had to do with you or the (lack of) training you provided as they got their Ph.D. Thomas Gadfort was just appointed co-convener of the D0 b-quark identification group, and Andy Haas will be D0’s new Higgs co-convener! Congratulations to both of you! I expect this is only the beginning… :-)