Running ATLAS in a Virtual Machine
January 16, 2008 Posted by gordonwatts in computers.
[Note: I wrote this post a long time ago; I’m not sure why it stayed in my drafts folder. But comments in response to this post on the Java VM and Microsoft CLR made me realize I’d not posted it yet, though I thought I had!!]
I’ve been curious for a long time how much of a CPU hit you take if you run your physics software in a Virtual Machine as opposed to on the bare machine. I’m sure anyone who has used VMs knows you take a hit: they always feel more sluggish. My impression is that for anything disk related they can be as much as 50% slower, but if it is just CPU they are only 5% or 10% slower. So, where does a real physics program stand in that?
To test this, Todd, who works with our group back at UW, and I ran a few quick tests. We simulated about 20 ATLAS events. All of this was done on the same physical CPU. In all cases the ATLAS software was running on Scientific Linux 4 (basically a Red Hat clone). None of the VM setups used hardware virtualization.
| Configuration | Time |
|---|---|
| SL4 installed on the machine | 3:30 |
| Windows Vista + VMWare | ** (see below) |
| Windows Vista + Virtual PC | 4:22 |
| Windows Vista + CoLinux | 3:47 |
** We also ran tests with VMWare, but I can’t find the email with the results now. However, I remember them being almost exactly the same as Virtual PC. The same hardware was used for all the tests.
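So… running in a full VM like Virtual PC took about 25% longer than the bare machine (4:22 vs. 3:30), which means you have to buy roughly 25% more machines to get the same work done. When you are running on a farm of 1000 CPUs that can be significant. On the other hand, using the approach that CoLinux does means you don’t lose nearly as much: it was only about 8% slower (3:47 vs. 3:30)!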
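For anyone who wants to redo the arithmetic, here is a minimal sketch in Python. It only uses the three timings from the table above; the helper names (`to_units`, `overhead_percent`, `extra_machines`) are made up for illustration, and the "extra machines" figure assumes that farm throughput scales linearly with the number of machines. Since only ratios matter, it does not need to know what units the two fields of each timing are in, as long as the second field counts sixtieths of the first.

```python
# Timings copied from the table above (VMWare is omitted: no number survived).
TIMINGS = {
    "SL4 native": "3:30",
    "Virtual PC": "4:22",
    "CoLinux": "3:47",
}


def to_units(timing: str) -> int:
    """Convert 'a:b' into a single count of the smaller unit (60 per larger unit)."""
    major, minor = timing.split(":")
    return int(major) * 60 + int(minor)


def overhead_percent(baseline: str, measured: str) -> float:
    """Extra run time of `measured` relative to `baseline`, in percent."""
    base = to_units(baseline)
    return 100.0 * (to_units(measured) - base) / base


def extra_machines(farm_size: int, baseline: str, measured: str) -> int:
    """Machines to add to a farm to keep the same throughput (assumes linear scaling)."""
    base = to_units(baseline)
    slowdown = to_units(measured) - base
    # Integer ceiling division, so a fractional machine rounds up.
    return -(-farm_size * slowdown // base)


if __name__ == "__main__":
    native = TIMINGS["SL4 native"]
    for name in ("Virtual PC", "CoLinux"):
        pct = overhead_percent(native, TIMINGS[name])
        extra = extra_machines(1000, native, TIMINGS[name])
        print(f"{name}: {pct:.1f}% slower, {extra} extra machines per 1000")
```

With these numbers Virtual PC comes out at just under 25% and CoLinux at about 8%, which is where the rough figures in the text come from.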
One reason we looked into this is that UW has a very large pool of Windows machines that sit in student labs. Most of the time they are idle, so why not configure them to run simulation jobs during off hours? Of course, we can’t install Linux on them, so we thought about using VMs instead. We did a pilot run with about 30 machines running the Virtual PC version, and produced about 40,000 events fairly quickly. It was very useful in allowing us to work with the machines we had.
Unfortunately, we had several BSODs installing CoLinux, but never one installing VMWare or Virtual PC. And if one is running on lab machines that are used by students one cannot afford to have a BSOD: the lab manager will kick you off those machines so fast your head will spin! 🙂 The other thing about CoLinux was that it was significantly more difficult to configure than any of the VMs. However, just like the VMs, once you got a disk up and running you could distribute that everywhere.