
Opportunistic cross-platform distributed computing January 9, 2008

Posted by gordonwatts in computers.

There is room for two types of distributed computing in particle physics. The first one is boring – large clusters of computers, with thousands of CPUs, run by laboratories or large universities. These are essentially very big batch platforms. There are lots of unsolved problems – for example, how do you get the data in and out?

The second type is much smaller: say, a small set of machines you are using to help analyze the 4 TB of ntuple data for your analysis. While the batch farms are bound to be run by computing professionals, these small clusters are bound to be run by, well, us.

This got me wondering: in the university environment, why aren’t we using some cross-platform tool for running our analysis? For example, at UW we have large collections of Windows machines freely available to us (well, not freely) – the undergraduate lab computers. We also have a small number of dedicated Linux machines. ROOT does hold us back: it is difficult to get the same code running on more than one platform.

I stumbled on this article while trying to track down some other information on distributed computing. It describes using Microsoft’s virtual machine (the CLR, MS’s answer to Java), which got me thinking: why haven’t we seen anything like this in Java? Wikipedia has a few references to projects in Java – but nothing that has crossed over to physics. Is it ROOT that is holding us back? Or something else?

[Update: A ROOT developer wrote in to ask whether it was ROOT or C++ that was holding things back. The answer is C++, which is of course what ROOT is based on. It should also be noted that a lot of what we can do now is possible only because of ROOT.]

Comments»

1. apetrov - January 9, 2008

I was thinking about something along these lines (I actually wanted to use computer labs at night as makeshift clusters). It turns out that it is only partially possible: some of the University computers were bought with money earmarked specifically for instruction, so our bureaucrats told me that this dual use would violate some regulation or another… Another way of doing a similar thing is described here: http://apetrov.wordpress.com/2007/04/07/project-hischool-disco/

2. MIke M - January 9, 2008

We’ve successfully harvested our Apples, but it’s not cross-platform. We have donations now from multiple departments (Urban Studies, Physics, Math, and the IT group at MIT), which gives us something like 100+ job slots for a $4000 investment (for a file server). We have enough that we stopped trying to recruit more. It really has been “one click to join” and rock-solid robust. We’ve been toying with the idea of running cross-platform, but that would mean actually installing software on the harvested nodes (e.g., Condor). I’m a firm believer that real harvested computing can’t require installing anything on the harvested nodes, which is why Apple/xgrid rocks.

3. Dmitry - January 9, 2008

A simple solution that doesn’t require any changes in the core software is using a virtual machine such as VMware (the player is free and fast). As long as you have a reasonable amount of memory, any machine can be reused.

4. Mike Procario - January 10, 2008

I am surprised that someone as familiar with HEP computing as you has not heard of JAS (Java Analysis Studio), http://jas.freehep.org/jas3/. It does seem to have made much progress in the last few years. It came out of SLAC. It always made more sense to me than ROOT, but I never really did any serious C++ coding. My last serious code was in C.

It seemed to me that ROOT was adopted by people who were writing C++ code full time, which made it very convenient for them. Then when others asked how to do analysis, they said, “Use ROOT. Here are my scripts.”

5. gordonwatts - January 10, 2008

Apetrov – cool idea. How did it work out in the end? Hosting the computers at a high school is one thing, but actually getting them into the physics is another.

Mike – Yes, I’ve watched your work with fascination. Your talk at the OSG meeting at UW was part of what made me stop just thinking about it and try to do something, which we did… As far as not installing anything goes: you are, actually. You just have the good fortune to have had it installed as part of the OS installation. At our schools there are fairly few Apples and mostly Windows machines, however, so we went where the CPU was. BTW, going cross-platform is worse, isn’t it? You have to build multiple binaries – or, if you are going to use Condor’s emulation of the OS so you can move things around, you have a fairly restrictive programming environment.

Dmitry – this is the approach I used when I did this: we use VMs. I found that we took a 25% hit in speed when this was done, so that is a definite possibility. However, the memory requirements are large, especially considering that things had to be ejected as soon as a student touched the keyboard. Toby, another prof here at UW, was able to get the GLAST simulation up and running on Windows without too much trouble, and he became a major source of GLAST MC because of the large Windows farms we had available. That was, by far, the most successful. However, it still wasn’t cross-platform.

Mike – I’ve definitely heard of JAS. And I think your rough outline of how ROOT has spread in the field and how this has starved projects like JAS is right. I was not, however, aware that JAS had a cross-platform batch execution engine to process ntuples (or similar). That makes much more sense to me: Java is a relatively old language and so I was expecting much more on this front – but perhaps the problem was with how I was searching the web.

6. hrivnac - January 11, 2008

Hi,
just a note concerning the monopolistic position of ROOT and the invisibility of other (often Java-based) projects. There was a long (and heated) discussion about it on the ROOT WikiPage in June 2006. Here is my view (with links to other contributions):

Need Help!

In short: After the disaster of the LHC++ project at CERN, CERN IT didn’t want to take any risk and outsourced all software development to the ROOT team. Since that time, all non-ROOT development is effectively forbidden at CERN. There are alternative projects, but it is very difficult to get them through due to the CERN boycott.

Here are just two alternative project examples (besides JAS):
http://projects.hepforge.org/jhepwork (Java)
http://openscientist.lal.in2p3.fr (C++)

Julius

7. Weblog of Julius Hrivnac » Blog Archive » Why Root ? - January 11, 2008

[…] The whole thread is here. […]

8. gordonwatts - January 14, 2008

Julius, thanks for the comment — it would have come out earlier except my spam filter marked it as spam and I didn’t see that until last night.

I took a quick look at some of those links — both of them feel a bit over the top in ideology — sort of like the open source conspiracy theories I sometimes see against companies (like MS) doing software development.

BTW – ROOT started out as an anti-CERN CD project. It won out only because people voted with their feet.

9. Running ATLAS in a Virtual Machine « Life as a Physicist - January 16, 2008

[…] a long time ago; I’m not sure why it stayed in my drafts folder. But comments in response to this post on the java VM and Microsoft CLR made me realize I’d not posted it yet – though I thought I […]

