CHEP Day 1
September 13, 2007
Posted by gordonwatts in computers, physics.
I’m spending this week in Victoria, Canada attending the Computing in High Energy and Nuclear Physics (CHEP) conference. At one time this was my favorite conference (more on that in a later post). These summaries are just the things in the talks I found interesting. [I spent last week there, and I’m just getting around to cleaning up these posts!!!]
Plenary: LHC Computing – Ian Fisk
Ian talked mostly about HEP’s use of the GRID – that big computing appliance in the sky… err, I mean cloud. He showed a great slide that really tells you how much trouble we’ve gotten ourselves into – all those layers upon layers in the GRID: “Not all of them included just for technical reasons.” We keep adding layers and adding layers to hide complexity. It is no wonder some of our systems are so fragile! He pointed out that everyone is developing their own front-end interface to the GRID. He also made the point (along with others before and, I’m sure, after) that this level of complexity means we have to develop more and better automated testing of the tools.
The ALICE experiment has the ability to make custom datasets based on simple event cuts. A user can define a particular set of event attributes and have a dataset made from that. I can’t help but wonder how that scales if every user requests their own dataset (which is very similar to the next user’s) – that is a lot of I/O. But it is good, nonetheless. I see this in ATLAS too with the TAG – and I’m nervous about the same scalability problems there.
ALICE also has a clever way of sending jobs out to the GRID (I think this is old hat now): the pull model. They submit a little pilot job. It checks out the machine for basic installation and then calls back to a central scheduler to get the actual job that needs to be run. I’ve known this gives them the ability to avoid black holes, but I didn’t appreciate how much more job scheduling flexibility it gave them: some GRID submission queues can be very long!
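A minimal sketch of the pull model described above, not ALICE’s actual implementation. The in-memory queue stands in for the central scheduler, and the node dictionaries, `node_is_healthy`, and `run_pilot` are all hypothetical names made up for illustration.

```python
import queue

# Stand-in for the central scheduler; a real one would be a remote service.
central_queue = queue.Queue()
for job_name in ["sim_001", "reco_002", "sim_003"]:
    central_queue.put(job_name)


def node_is_healthy(node):
    """Basic installation check the pilot runs before asking for work."""
    return node["has_software"] and node["free_disk_gb"] >= 10


def run_pilot(node, task_queue):
    """Pilot job: validate the node first, then pull one real job."""
    if not node_is_healthy(node):
        # A broken node (a "black hole") never receives a real job.
        return None
    try:
        job = task_queue.get_nowait()
    except queue.Empty:
        # Nothing waiting in the scheduler; the pilot simply exits.
        return None
    return f"{job} ran on {node['name']}"


good_node = {"name": "wn01", "has_software": True, "free_disk_gb": 50}
bad_node = {"name": "wn02", "has_software": False, "free_disk_gb": 50}

results = [run_pilot(bad_node, central_queue), run_pilot(good_node, central_queue)]
print(results)
```

The point of the design is that the real job is bound to a machine only after the pilot is already running there, so a long submission queue or a misconfigured node costs a cheap pilot rather than a real job.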
In the first of many references to multi-core, he pointed out that with the new machines we are using now we need 8 or 9 copies of the simulation or reconstruction running at once in order to fully utilize the capabilities of an 8-core machine.
In general he painted a rosy picture of the tools. This didn’t match my experience, and I asked how things can look so good yet seem not to work for the end user. He mused that perhaps 90% operation was still pretty bad from an end user’s point of view. I think it is still worse. I got a bunch of ribbing for that comment later on in the conference. 🙂
Plenary: LHC DAQ – Sylvain Chapeland
This was the usual overview of the LHC DAQ. One thing that impressed me was that everyone is using ROOT to display plots. Java makes an appearance also, but ROOT is the real king here any time a plot has to be made. Most of the operational displays shown were a few buttons and lots of histograms, whereas I usually think of DAQ operations as being a bunch of buttons and some text files. I think this was speaker bias.
He had a comment on the last slide — wondering if we couldn’t have a common DAQ system. Years ago I also wondered why we couldn’t do that. At the very least, a common DAQ system. Indeed, Fermilab has a project that does this for their fixed target experiments (I can’t remember the name). I think this will never happen for the large experiments like ATLAS or CMS. The hardware is so specialized, and the dataflow is so specialized. I can imagine that we might make use of libraries of utilities — but we’ll never see unification of the overall DAQ system or its control.
Plenary: Computer Facilities – Eng Lim Goh (sgi)
Unfortunately, the slides have not been posted. This talk was fascinating (as were all the vendor talks, actually). Goh noted that the data rate out of the ALICE DAQ system, 1.2 GByte/sec, was the same as what the film industry was seeing for their “4D” compression for a movie – so the real world is catching up!
He has seen 10,000 node rack & stack clusters close to production and hears people now talking about 100,000 node rack & stack clusters. Wow!
sgi’s vision of the future is very tightly packed racks. They do away with the pizza box stacking and instead just build CPUs on a board, stuff the boards into a special high speed backplane, and build modules out of these. Input and output uses Infiniband, which gives performance on par with 10 Gb Ethernet.
Supplying power efficiently was also a focus. Apparently, one usually gets from line voltage to what the boards need in two conversion steps. sgi does it in a single step, and they have moved the power supply out to the back of the rack, where it feeds a whole lot of nodes. As a result they claim to have one of the most power-efficient designs (watts/flop). Of course, Goh mentioned that the backplane they plug all their processors into now has 1000 amps coursing through it! Yikes!
Despite the efficiency improvements, they still have to cool the rack down – they use water cooling. Goh claims it is neutral — you don’t have to add extra AC to the room – the water chiller will take out at least as much heat as what goes in. Interestingly enough, the coolers are mounted on the back of the rack, after the air has flowed through the processor.
The solution is quite unique, in that it doesn’t look like it plays well with other solutions. This question was asked, and Goh said that in units of a single rack it does play well: sgi supplies standard GB interfaces in and out of a rack so that one’s whole data center doesn’t have to be Infiniband.
Parallel: ATLAS Analysis Model – Amir Farbin
I managed to miss most of this (late back from getting a Dr. Pepper!), but his talk mentioned several things that looked interesting that I’m not already familiar with:
- SPyROOT – The specific example knows about data sets, relative luminosity, and cross sections and can automatically make plots for multiple datasets at once. Written in Python. The batch version seems to be a simplified version of what we use in D0 (in Python rather than C++). Cute. Not obvious it is robust enough – seems designed only for simple things — no way people are going to end up doing just simple things there!
- We will only simulate 20% of the data we take. I hear this said all the time. A particular analysis only looks at a small subset of the data — will this number matter?
He also talked about his EventView approach to doing analysis — which I think is not going to work out well.
Parallel: A Class To Make Combinations – Nobu Katayama
This was a very cool talk on an algorithm to solve a physics problem – a set of ideas you could walk away from CHEP with and immediately apply. I’d like to see more of this sort of thing!
Nobu was dealing with combinatorics in B-physics. Specifically, a 3 body decay. He wanted to represent it with C++ operator syntax: “D0 = Kaon*Pion*Kaon”. To do this you need to loop over all three objects at once, but C++ operator syntax is left-associative, which means you will do the Kaon*Pion first, and then <result>*Kaon second, rather than evaluate all three together. So he wrote a delayed evaluation library: it builds up a syntax tree of all operations but doesn’t actually do them until the “=” sign is hit. At that time the whole parse tree is available and he can see that he needs to do a 3-way combination. Very cool! I’ve seen this technique used to solve all sorts of computational problems (like shipping matrix operations out to a GPU).
I asked why not use a Domain Specific Language: a text file that contains the physics in a more readable form that you could then use a program to automatically turn into C++: easy to read and write, and the speed of C++. He really wanted the complete flexibility of C++ available (for cuts, etc.). Sounds good!
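Here is a rough sketch of the delayed-evaluation idea, written in Python rather than C++ and not based on Nobu’s actual library. `ParticleList` and `Combination` are hypothetical names, and because Python cannot overload “=”, an explicit `evaluate()` call stands in for the assignment that triggers the work.

```python
from itertools import product


class ParticleList:
    """A named list of candidates (plain labels here, as a stand-in)."""

    def __init__(self, name, candidates):
        self.name = name
        self.candidates = list(candidates)

    def __mul__(self, other):
        # Start recording an expression instead of looping right away.
        return Combination([self]) * other


class Combination:
    """Records which lists are to be combined; does no looping yet."""

    def __init__(self, parts):
        self.parts = parts

    def __mul__(self, other):
        extra = other.parts if isinstance(other, Combination) else [other]
        return Combination(self.parts + extra)

    def evaluate(self):
        """Run one N-way loop over all recorded lists at once."""
        results = []
        index_ranges = [range(len(p.candidates)) for p in self.parts]
        for idx in product(*index_ranges):
            used = {(id(p), i) for p, i in zip(self.parts, idx)}
            if len(used) < len(idx):
                continue  # same candidate picked twice from the same list
            results.append(tuple(p.candidates[i] for p, i in zip(self.parts, idx)))
        return results


kaons = ParticleList("Kaon", ["K1", "K2"])
pions = ParticleList("Pion", ["pi1"])

expr = kaons * pions * kaons   # only builds the expression
d0_candidates = expr.evaluate()  # the 3-way loop happens here
print(d0_candidates)
```

Each `*` only appends to the recorded expression, so by the time `evaluate()` runs the code can see that three lists are involved and do a single 3-way combination instead of two nested 2-way ones.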
Parallel: The CERN Analysis Facility – J. F. Grosse-Oetringhaus
ALICE is probably the heaviest user of ROOT of any of the ongoing large experiments. They use it for everything – reconstruction, simulation, visualization, etc. You name a feature of ROOT, and they are using it. This includes PROOF, the remote cluster version of ROOT. They currently have about 40 machines set up (500 is the goal). PROOF has, from the sounds of it, received significant debugging on this cluster. 🙂 Listening to this talk you really get the idea that PROOF is big-iron. Lots of support is required to have a fully functioning system, which is too bad.
I asked how you distribute user code and libraries: you send source files and they are compiled on each PROOF machine that is running. I also asked what users do when they want to run an analysis in PROOF with lots of plots. The speaker didn’t know, but he did know that some people were making hundreds of plots in a single event loop (the only way to do it, in my opinion).
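To show what “many plots in a single event loop” means, here is a toy sketch in plain Python. It does not use ROOT or PROOF; the event records, the `plots` table, and the bin widths are all invented for illustration.

```python
from collections import Counter

# Made-up events; a real analysis would read these from a file.
events = [
    {"pt": 12.5, "eta": 0.3},
    {"pt": 47.0, "eta": -1.2},
    {"pt": 18.1, "eta": 0.4},
]

# Each plot is (function that extracts the value, bin width).
plots = {
    "pt": (lambda e: e["pt"], 10.0),
    "eta": (lambda e: e["eta"], 0.5),
}

# One histogram (bin index -> count) per plot.
hists = {name: Counter() for name in plots}

# Single pass over the data: every event fills every plot.
for event in events:
    for name, (get_value, width) in plots.items():
        bin_index = int(get_value(event) // width)
        hists[name][bin_index] += 1

print(hists)
```

Adding another plot means adding one entry to `plots`; the data is still read only once, which is what matters when the I/O dominates.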
Parallel: Visualization with ROOT – Matevz TADEL
Wow. ROOT visualization is pretty impressive! The ALICE detector is a heavy-ion detector. This means when the ions collide there will be a huge mess of tracks in their detector. They are using the visualization code in ROOT as their detector display, and they claim decent performance when they show as many as 1000 tracks. And this is without using acceleration hardware (as far as I can tell). I wonder how fast things would go if they used modern graphics cards and did processing there?
Parallel: The Online Track
There were a lot of good talks on High Level Triggers at this point in the Online Track, but I have seen many of them before. Suffice it to say that good progress is being made and people are already at the point of testing the algorithms to make sure there is enough CPU to do the things that they need. In particular, one talk referred to a version of the track finding algorithm, called a Kalman Filter, that didn’t require matrix inversion (very slow). I’d not heard of this gain-matrix formalism before, but I found what looks to be a decent reference on the web.
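For what it is worth, here is a small sketch of a textbook gain-matrix Kalman update, which may or may not match the variant shown in the talk. It assumes a two-parameter track state (position and slope) and a scalar measurement of the position only, so the one matrix that would normally be inverted shrinks to a single number and the inversion becomes a division. The function name and the numbers are made up.

```python
def kalman_update(x, C, m, V):
    """One gain-matrix update for a scalar measurement of x[0] (H = [1, 0]).

    x: state [position, slope]; C: 2x2 covariance as nested lists;
    m: measured position; V: measurement variance.
    """
    s = C[0][0] + V                  # innovation variance H C H^T + V (a scalar)
    K = [C[0][0] / s, C[1][0] / s]   # gain vector C H^T / s: division, no inversion
    r = m - x[0]                     # residual between measurement and prediction
    x_new = [x[0] + K[0] * r, x[1] + K[1] * r]
    # Updated covariance (I - K H) C, written out for H = [1, 0].
    C_new = [
        [C[0][0] - K[0] * C[0][0], C[0][1] - K[0] * C[0][1]],
        [C[1][0] - K[1] * C[0][0], C[1][1] - K[1] * C[0][1]],
    ]
    return x_new, C_new


x0 = [0.0, 0.0]
C0 = [[1.0, 0.0], [0.0, 1.0]]
x1, C1 = kalman_update(x0, C0, m=2.0, V=1.0)
print(x1, C1)
```

With equal prior and measurement variances the updated position lands halfway between the prediction and the measurement, and the position variance is halved.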