Pick The Trends
March 27, 2009
Posted by gordonwatts in CHEP, computers, Conference.
I’m at CHEP 2009 – a conference which combines physics with computing. In the old days it was one of my favorites, but over the past 10 years it has been all about the Grid. Tomorrow is the conference summary – and I sometimes play a game where I try to guess what the person who has to summarize the whole conference is going to say – the trends, if you know what I mean.
- Amazon, all the time (cloud computing). Everyone is trying out its EC2 service. Some physics analysis has even been done using it. People have tried putting large storage managers up on it. Cost comparisons seem to indicate it is almost the same cost as using the GRID – and so might make a lot of sense when used as overflow. On cost, though, it is not at all favorable for large amounts of data – the expensive part is actually transferring the data in and out of the service.
- Virtualization. I’ve been saying this for years myself, and it looks like everyone else is saying it now too. This is great. It is driven partly by the cloud – which uses virtualization – and that was something I definitely did not foresee. Cloud computing and virtualization go hand-in-hand. CERNVM is my favorite project along these lines, but lots of people are playing with this. I’ve even seen calls to make the GRID more like the cloud.
- Multi-core. This is more wishful thinking on my part, but it should be more of a trend than it is. The basic problem is that as we head towards 100 cores on a single chip there is just no way to get 100 events’ worth of data onto the chip – the bandwidth just isn’t there. Thus we will have to change how we process the data, spending multiple CPUs on the same data – something no one in HEP does up to now. One talk mentioned that problems will occur at about 24 cores on a single chip.
- CMS has put together a bunch of virtual control rooms. Now everyone in CMS wants one (80% in a few years). These are supposed to be used for both outreach and remote monitoring. This seems successful enough that I bet ATLAS will soon have its own program. I’m not convinced of how useful it is.
- It is all about the data! Everyone now says running jobs on the GRID isn’t hard, it is feeding them the data. Cynically, I might say that was only because we now know how to run those jobs – several years ago that was the problem. This is a tough nut to crack. To my mind, for the first time, I see all the bits in place to solve this problem; but nothing really works reliably yet.
- And now a non-trend. I keep expecting HEP to gather around a single database. That hasn’t happened by now, and so I don’t think it ever will! That is both good and bad. We have a mix of open source and vendor-supplied solutions.
Ok – there are probably other small ones, but these are the big ones I think Dario will mention in his final talk.
UPDATE: So, how did I do? Slides should appear here eventually.
- Data – and its movement – was the top problem on his list.
- GRID – and under this was Cloud computing. He made the suggestion that some GRIDs should move to look more like the cloud – no one reacted.
- Performance – optimization, multi-core and many-core appeared under this as well.
So, I didn’t do too badly. The big ones I had were all addressed. He had a very cool word analysis of the abstracts – which I have to figure out how to do.
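I don’t know how he actually did his word analysis, but a first guess at the simplest version is just counting how often each word shows up across the abstracts. Here is a minimal sketch in Python, assuming the abstracts are already available as plain-text strings. The function name `word_counts`, the tiny stop-word list, and the sample abstracts are all made up for illustration and are not taken from the talk.

```python
import re
from collections import Counter

# Hypothetical stop-word list, just for illustration; a real analysis
# would want a much fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on", "is", "we", "with"}


def word_counts(abstracts, top_n=10):
    """Return the top_n most frequent non-stop words across all abstracts."""
    counts = Counter()
    for text in abstracts:
        # Lower-case the text and keep only runs of letters as words.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Made-up sample abstracts, not real CHEP submissions.
    sample = [
        "Moving data on the grid is the hard part",
        "Cloud computing and the grid: feeding jobs their data",
    ]
    for word, n in word_counts(sample, top_n=5):
        print(word, n)
```

Feeding the resulting counts into a word-cloud or plotting tool would be the obvious next step, though that part is beyond this sketch.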
I’ve got some good notes from the conference. I’ll try not to get too distracted by teaching next quarter (ahem) and will post some of them in the near future.