Virtual Machines for the GRID
August 25, 2006. Posted by gordonwatts in computers, physics.
I’ve said this before, and I continue to say it: GRID jobs should run inside virtual machines. Until this conference I had never gotten what I consider a good argument against the idea. When I do say it, most people get that look in their eye that says “hey, that is such a crazy off-the-wall idea that I’m not even going to address it.” I think this is starting to change, as I saw some people experimenting with Xen at the last CHEP.
When we run on the GRID we don’t just run some C++ code that reads in a data file and spits out a new data file. We frequently have to run complex scripts to set up an executable, or we want to run one executable after another, and we need scripts to coordinate the executables. Our scripts are written in Python or Perl. We thus have to make sure that any GRID machine we are planning to run on is correctly configured. If something small but required is missing, then all jobs will fail. A missing wget utility, something most Linux users take for granted, was cited as an example of a real problem.
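To make the failure mode concrete, here is a minimal sketch of the kind of pre-flight check a job script could run on a node before doing any real work. It is only an illustration: the function name and the list of tools in the usage note are my own assumptions, not something any GRID middleware provides.

```python
import shutil


def missing_tools(required):
    """Return the names in `required` that cannot be found on this node's PATH.

    `required` is a hypothetical list of command names the job depends on.
    """
    # shutil.which returns None when no executable of that name is on PATH.
    return [name for name in required if shutil.which(name) is None]
```

A job wrapper might call `missing_tools(["wget", "python", "perl"])` first and bail out with a clear message if the list is not empty, rather than failing halfway through the job.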
I have always wondered whether virtual machines are the correct answer to this. Each user creates their own custom install of Linux, with the proper versions of all the various utilities, and even the executables they need to run. This image, probably 5-10 GB in total size, is then distributed to each compute node. Now the applications won’t fail because a node is misconfigured. Given the number of problems and the amount of effort that goes into dealing with configuration changes between jobs, I would think this would be a huge saver of time.
Of course, you do take a hit. CPU is about 10% slower, which isn’t so bad, but disk reads and writes can be a lot slower, as much as 50% I’ve heard. But most of our (HEP) compute jobs are CPU bound, not I/O bound, so the hit in disk speed wouldn’t be as great as it could be.
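A back-of-the-envelope sketch of that trade-off, using the rough 10% and 50% figures quoted above. The assumptions are mine: "slower" is read as "takes that much longer", CPU time and I/O time are treated as not overlapping, and the job mixes in the usage note are invented for illustration.

```python
def vm_slowdown(cpu_fraction, cpu_penalty=0.10, io_penalty=0.50):
    """Estimate the fractional increase in wall time when a job runs in a VM.

    cpu_fraction: share of the native run time spent on CPU (0.0 to 1.0);
                  the rest is assumed to be disk I/O.
    cpu_penalty, io_penalty: assumed extra time for CPU and I/O work in the VM.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    io_fraction = 1.0 - cpu_fraction
    # Native run time is normalized to 1.0, so this is the relative VM run time.
    vm_time = cpu_fraction * (1.0 + cpu_penalty) + io_fraction * (1.0 + io_penalty)
    return vm_time - 1.0
```

Under these assumptions a job that spends 90% of its time on CPU comes out about 14% slower (`vm_slowdown(0.9)`), while one that spends only 20% on CPU comes out about 42% slower (`vm_slowdown(0.2)`).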
Simon Lin, who coordinates TWGrid, pointed out that I was thinking about HEP too much. Most GRIDs are designed to run many different types of jobs. A common class of jobs, not used in particle physics, is parallel jobs. These jobs are best positioned to take advantage of the multi-core trend currently underway in CPU design. They use libraries like MPI that allow several programs running together to communicate. The virtual machine hosts of today do not emulate multiple processors (even on a multiple-processor machine). This means that the several programs may have to run in different virtual machines to take full advantage of a multi-core CPU, and each time they pass a message it will have to go out of one VM and into the other, which is quite a penalty.
Hopefully the virtual machine vendors will soon address this and this argument will be gone. 🙂
BTW, I also found out during this OSG meeting how nodes are configured. When a new set of jobs comes in, a special installation job is run first. It installs all the required software in a special application directory. These installation jobs run on each machine in the farm, and then the farm is ready to accept the new type of job. This method really only works when you can assign a large collection of machines to a particular task (like ATLAS).