
Pick The Trends March 27, 2009

Posted by gordonwatts in CHEP, computers, Conference.

I’m at CHEP 2009 – a conference which combines physics with computing. In the old days it was one of my favorites, but over the past 10 years it has been all about the Grid. Tomorrow is the conference summary – and I sometimes play a game where I try to guess what the person who has to summarize the whole conference is going to say – the trends, if you know what I mean.

  • Amazon, all the time (cloud computing). Everyone is trying out its EC2 service. Some physics analysis has even been done on it, and people have tried putting large storage managers up on it. Cost comparisons suggest it is almost the same cost as using the GRID – so it might make a lot of sense when used as overflow. Cost-wise it is not at all favorable for storing large amounts of data (really, for transferring the data in and out of the service).
  • Virtualization. I’ve been saying this for years myself, and it looks like everyone else is saying it now too. This is great. It is driven partly by the cloud – which uses virtualization – and that was something I definitely did not foresee. Cloud computing and virtualization go hand-in-hand. CERNVM is my favorite project along these lines, but lots of people are playing with this. I’ve even seen calls to make the GRID more like the cloud.
  • Multi-core. This is more wishful thinking on my part, but it should be more of a trend than it is. The basic problem is that as we head towards 100 cores on a single chip there is just no way to get 100 events’ worth of data onto the chip – the bandwidth just isn’t there. Thus we will have to change how we process the data, spending multiple cores on the same event’s data – something no one in HEP has done up to now. One talk mentioned that problems will start at about 24 cores on a single chip.
  • CMS has put together a bunch of virtual control rooms. Now everyone in CMS wants one (80% in a few years). These are supposed to be used for both outreach and also remote monitoring. This seems successful enough that I bet ATLAS will soon have its own program. I’m not convinced how useful it is. 🙂
  • It is all about the data! Everyone now says running jobs on the GRID isn’t hard – feeding them the data is. Cynically, I might say that is only because we now know how to run those jobs – several years ago that was the problem. This is a tough nut to crack. To my mind, for the first time, all the bits are in place to solve this problem; but nothing really works reliably yet.
  • And now a non-trend. I keep expecting HEP to gather around a single database. That hasn’t happened yet, and at this point I don’t think it ever will! That is both good and bad. We have a mix of open-source and vendor-supplied solutions in play.
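To make the multi-core point above concrete, here is a toy sketch of the two processing models (hypothetical event data and a stand-in reconstruct function – not any experiment’s actual framework; threads stand in for cores to keep it simple):

```python
from concurrent.futures import ThreadPoolExecutor

def reconstruct(event):
    # Stand-in for reconstruction of one event's raw data.
    return sum(hit * hit for hit in event)

events = [[float(i), float(i + 1)] for i in range(8)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Today's model: one whole event per core. Each worker needs its
    # own event's worth of data on-chip -- the bandwidth problem.
    per_event = list(pool.map(reconstruct, events))

    # The alternative: several cores share ONE event, each taking a
    # chunk of its hits, so only one event's data has to be moved.
    event = events[0]
    chunks = [event[:1], event[1:]]
    per_chunk = sum(pool.map(reconstruct, chunks))

print(per_event[0], per_chunk)  # both reconstruct event 0
```

The second pattern is the hard one: it only works if the algorithm itself can be decomposed, which is exactly the change in how we write code that the bullet is talking about.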

Ok – there are probably other small ones, but these are the big ones I think Dario will mention in his final talk.



UPDATE: So, how did I do? Slides should appear here eventually.

    1. Data – and its movement was the top problem on his list.
    2. GRID – and under this was cloud computing. He made the suggestion that some GRIDs should move to look more like the cloud – no one reacted.
    3. Performance – optimization, multi-core and many core appeared under this as well.

So, I didn’t do too badly. The big ones I had were all addressed. He had a very cool word analysis of the abstracts – which I have to figure out how to do.
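For what it’s worth, the core of that word analysis is just a word-frequency count over the abstracts. A minimal sketch (the abstract strings and stopword list here are made up – in practice the text would come from the conference’s abstract PDF):

```python
from collections import Counter
import re

# Hypothetical abstracts -- the real ones would be extracted from
# the conference's book of abstracts.
abstracts = [
    "Grid data management for distributed analysis",
    "Cloud computing and virtualization on the grid",
    "Multi-core performance optimization for reconstruction",
]

STOPWORDS = {"the", "and", "for", "on", "of", "a", "in", "to"}

words = Counter(
    w for text in abstracts
    for w in re.findall(r"[a-z]+", text.lower())
    if w not in STOPWORDS
)

# The most common words are what a word cloud scales up.
print(words.most_common(3))
```

A word-cloud tool then just maps each count to a font size.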

I’ve got some good notes from the conference, I’ll try not to get too distracted by teaching next quarter (ahem) and post some of them in the near future.



1. Corey - March 27, 2009

Hi Gordon!

I have to admit I’m skeptical about multi-core computing in physics. It seems to imply multi-threaded programs, and there’s just no way physicists are going to get that right.

We build great hardware because we have to pay a ton of money for it.

We build terrible software because we don’t pay for it. (Which probably costs us more money in terms of time wasted, but when has our time ever been valuable?)

If a grad student’s analysis is produced by a multi-threaded program, I must admit that I simply wouldn’t believe the result…

2. Bill C - March 27, 2009

The web site for making those cool word clouds is

3. gordonwatts - March 28, 2009

Bill — thanks – I have to figure out how to do it because I think I can only get the abstracts from the conference by looking at a big PDF!

Corey – I totally agree. For this to happen there would have to be a sea change. At the conference I learned of one experiment that managed to do it – Belle. They did it in the trigger, by severely restricting the programming environment and then supplying a framework. What data you could get at, and how you could access it, was carefully controlled by the framework. It worked – they were able to massively multi-thread their trigger. They had to design this in from the ground up. Apparently ATLAS tried an experiment with this in their trigger, bolting it onto the current framework, and that didn’t work so well.

So, if it starts, I think it will start like that – start in the trigger, we’ll learn, and then it will propagate outwards.

BTW, there are tools, similar to valgrind, that scan your program for multi-threaded race and deadlock conditions. They are quite clever and fairly effective. Unfortunately, like valgrind, they are very slow. But I would imagine tools like that will become more common to help with this sort of thing.
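The kind of bug those tools hunt for is the classic unprotected read-modify-write. A generic illustration (not HEP code – any language with shared memory has the same problem; the lock is what makes the count come out right):

```python
import threading

counter = 0
lock = threading.Lock()

def bump(n):
    global counter
    for _ in range(n):
        # Without the lock, "counter += 1" is a read-modify-write
        # data race -- exactly what a race detector would flag.
        with lock:
            counter += 1

threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 with the lock held
```

Remove the lock and the final count can silently come up short – which is why nobody trusts an unaudited multi-threaded analysis.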

4. superweak - April 10, 2009

Speaking of single events on multi-core: certainly this won’t happen while we stick to the processing model where we manually resolve all the dependencies between different parts of reconstruction by ‘scheduling algorithms’ in the configuration – it’s extremely difficult to do this for a single thread with something as complex as ATLAS reconstruction…

5. Gordon Watts - April 10, 2009

Right. There are two options:

– There are parts of the reco algorithm that can be run in parallel – for example, muon segment finding and tracking and jet finding. However, the software framework would have to know about data dependencies in order to get that to work automatically – and that would probably require some real changes.

– The other approach is to multi-thread each algorithm. Peak finding in the calorimeter for seed location could probably be made parallel fairly easily. The nice thing about this is you also get data locality – the very issue that will eventually prevent us from using the current model to do processing.
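The first option – a framework that knows about data dependencies – amounts to a topological scheduler. A toy sketch (the step names and dependency table are invented for illustration, not ATLAS’s actual reconstruction chain):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical reco steps and their declared data dependencies --
# the framework-level knowledge discussed above.
deps = {
    "tracking": set(),
    "muon_segments": set(),
    "jet_finding": set(),
    "muon_id": {"tracking", "muon_segments"},
    "b_tagging": {"tracking", "jet_finding"},
}

def run(step):
    return step  # stand-in for running the real algorithm

done, waves = set(), []
with ThreadPoolExecutor() as pool:
    while len(done) < len(deps):
        # Every step whose declared inputs are finished can run now,
        # concurrently with the other ready steps.
        ready = [s for s in deps if s not in done and deps[s] <= done]
        waves.append(sorted(ready))
        done.update(pool.map(run, ready))

print(waves)
```

Each "wave" is a set of algorithms the framework could dispatch in parallel; today that schedule is resolved by hand in the job configuration.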

6. CHEP 2009 | Trapped in the Middle - June 22, 2010

[…] I could not have had any bad experience at my first attendance, but I did read on the blog of one of the people present (Gordon Watts of the University of Washington) that this conference […]
