
Maps! Maps! Maps! November 27, 2009

Posted by gordonwatts in computers, DeepTalk, Maps.

I have become a big fan of the DeepZoom technology, as anyone who has been reading these posts for a while knows. I’m also a big fan of maps – especially old ones. I’ve never been brave enough to purchase any on eBay or anything like that, but I’d love to eventually own a few and hang them on my wall.


In the meantime I make use of the fantastic resources of the web. UW recently put up a small collection of old maps, from the 16th to the 19th century. Some of them are stunning. I definitely recommend spending some quality time exploring them.

The default interface that is presented to you, however, is a bit of a pain. For each map, scroll down to the “detailed view” entry below the map picture and click on that. They used Zoomify to encode the images in a nice zoom-able interface.

Sweet. I wish they had done one or two things a little differently:

  • Higher resolution images so you can zoom in even further
  • Put all the maps on a single page, with perhaps some information (and a search tool) on the right hand side. Check out Hard Rock’s example.
  • Make it possible to go full screen. 🙂

I wish more people would do this for collections of images like these maps. It makes navigating them a lot of fun, and it is still possible to display the metadata.


EPS And PS Files on Windows November 12, 2009

Posted by gordonwatts in computers.

Windows has had this very nice feature called a “preview pane” ever since Vista. I think just about everyone (certainly on a Mac and on Windows) is familiar with this thing – you click on a text file or a C++ source file or a PowerPoint presentation or something like that, and without having to open up an editor, you can see exactly what is inside the file. Sweet!

But one place on Windows where this fell down is PostScript files (.PS and .EPS files). Viewing the contents of those was a pain. Turns out fixing this isn’t that hard – check it out (this is on a Windows 7 computer):

[Screenshot: PSPreview in Explorer]

How nice is that?

You can get it here. Make sure to install a recent version of Ghostscript first!

While it turns out doing this to first order is pretty easy – there is a very simple plug-in architecture that I was able to take advantage of, and for rendering the files I just used Ghostscript – the devil is in the details. There are two “features” (as software developers say). First, the mouse scroll wheel doesn’t work when you have lots of pages to scroll through – you have to use the scroll bar. Second, it is hard or impossible to resize the window when you are viewing the PS file (the workaround: click on some other file, resize the window, and then click back). I’ll fix those as soon as I find out how (read: spend more than a few hours on this project).
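For the curious, the Ghostscript half of this can be sketched in a few lines of Python. This is not how the plug-in itself calls Ghostscript; it only builds a command line for the console version of Ghostscript that renders each page of a PS/EPS file to a PNG. The function names, the output file pattern, and the default resolution are all made up for this illustration, and the executable name is an assumption about a typical install.

```python
import subprocess
import sys


def ghostscript_command(ps_path, out_pattern="page-%03d.png", dpi=96, gs_exe=None):
    """Build a Ghostscript command that renders every page of ps_path to PNG."""
    if gs_exe is None:
        # Assumption: the console executable is "gswin64c" on Windows, "gs" elsewhere.
        gs_exe = "gswin64c" if sys.platform.startswith("win") else "gs"
    return [
        gs_exe,
        "-dSAFER",            # restrict file access from inside the PostScript program
        "-dBATCH",            # exit when the input has been processed
        "-dNOPAUSE",          # do not wait for a key press between pages
        "-sDEVICE=png16m",    # 24-bit colour PNG output device
        f"-r{dpi}",           # rendering resolution in dots per inch
        f"-sOutputFile={out_pattern}",  # %03d is replaced by the page number
        ps_path,
    ]


def render_pages(ps_path, **kwargs):
    """Run Ghostscript; raises CalledProcessError if rendering fails."""
    subprocess.run(ghostscript_command(ps_path, **kwargs), check=True)
```

A previewer would then only have to display the resulting PNG files, which is where the scrolling and resizing headaches start.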

If you find it misbehaving I’d love to know about it – particularly if you are willing to email me the file that caused the failure!!

The next thing along these lines I’d love to see is one for ROOT. There is one for you Mac users, btw – so if you are using a Mac and have .root files on it, I’d suggest going and getting that!

Data mining November 7, 2009

Posted by gordonwatts in computers, Health.

In particle physics this is what we do. We have petabyte (1000 terabytes!) datasets consisting of billions of physics interactions. For the particularly rare ones we need to pick out several hundred or thousand and study them in detail. As you might expect, we are drowning in data and have developed many tools to help us. Computers are central – without them we would not be able to do the science we currently do!

The most common public example of data mining I’ve heard about is looking at all the receipts from Walmart purchases. This is why grocery stores like you to sign up for their frequent-use cards – they can track everything you buy, sell that data, and, more importantly, send you ads that are likely to get you in and get you to buy other things. It is an amazingly powerful tool. In business it has been getting a bit of a bad name recently because it has been connected to some fairly serious invasion of privacy issues (i.e. creepy things – like knowing what hour of the day you check your email, how much the average person in your zip code makes, etc.).

But one place that it could obviously be applied for the greater good that I’d never really given much thought to is medicine. Check out this long article from the NYT on the topic – Making Health Care Better. It starts with some history – and the bromide “The amount of death and disease would be less if all disease were left to itself.” from 1835… to the present day:

“Medicine adopted the scientific method,” James said… “It transformed medicine, and it’s easy to make the case.”

He talks about the testing and science applied to any new method, drug, procedure before it is allowed to be used by mainstream doctors. But…

But there is one important way in which medicine never quite adopted the scientific method… …once a treatment enters the mainstream — once we know whether it works in certain situations — science is largely left behind. The next questions — when to use it and on which patients — become matters of judgment, not measurement.

The article provides a dizzying array of treatments available to a doctor who is trying to treat heart disease. And which treatment to use is left up to the doctor’s judgment.

Clearly some hospitals and doctors have better average outcomes than others – so some doctors must have better judgment than others. Wouldn’t it be great if every doctor could start with a default procedure that has been shown to work for a patient that looks like the one the doctor is trying to treat and then modify it to fit the patient’s specifics?


“I thought there wasn’t anybody better in the world at twiddling the knobs than I was,” Jim Orme, a critical-care doctor, told me later, “so I was skeptical that any protocol generated by a group of people could do better.”

And that is just it – there are so many variations in treatments. Today’s instruments are quite complex and have many settings – so how do you know what works correctly? When the procedure is approved there is a fair amount of science recorded for each setting, and presumably most doctors follow it. But not all of them!

And this is where I think data mining could come in. What if every single modern instrument were hooked into the network, and each adjustment were recorded and linked to a patient’s medical file (so you could see the history)? Each time a nurse or doctor did something, it would be recorded. All of that would be in some standard format – and then shared across hospitals and doctors the country or world over.
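To make “some standard format” a little more concrete, here is a minimal sketch of what one recorded adjustment might look like if it were written out as JSON. The field names and the example values are entirely invented for illustration; no real medical record standard or device is being described here.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AdjustmentRecord:
    """One adjustment made to one instrument (hypothetical schema)."""
    patient_id: str      # anonymised link to the patient's file
    device: str          # which instrument was adjusted
    setting: str         # which knob was turned
    value: float         # the new value of that setting
    unit: str            # unit of the value
    timestamp_utc: str   # ISO 8601 time of the adjustment
    operator_role: str   # e.g. "nurse" or "doctor"


def to_json(record):
    """Serialise a record to a JSON string with a stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json(text):
    """Rebuild a record from the JSON produced by to_json."""
    return AdjustmentRecord(**json.loads(text))


# Made-up example values, only to show the shape of a record.
example = AdjustmentRecord(
    patient_id="anon-0001",
    device="ventilator",
    setting="tidal_volume",
    value=450.0,
    unit="mL",
    timestamp_utc="2009-11-07T12:00:00Z",
    operator_role="nurse",
)
```

The hard part is agreeing on the fields across vendors and hospitals, not writing them down; the sketch only shows that one record per adjustment is a small thing to store.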

This has, of course, been a dream for a while of electronic health care records. It always struck me as obvious that you would attach x-rays, CT scans, descriptions of medicine given, etc., but it never occurred to me the level of detail you could go into! From a technical point of view this is hard – the data is so non-uniform, unlike the particle physics experiments I work on, but the long term benefits could be quite good. The article describes when this data mining technique was applied ad-hoc in just a single hospital:

One widely circulated national study overseen by doctors at Massachusetts General Hospital had found an ARDS [Acute respiratory distress syndrome] survival rate of about 10 percent. For those in Intermountain’s study, the rate was 40 percent.

At any rate, this tickled my fancy, which is why I wrote about it. I found it ironic that on the Health home page yesterday there was also the following article:

Five years later, Medicare underwrites more than half of the $4 billion the nation now spends annually on defibrillators, but the agency is no closer to knowing how many lives that big investment is saving.

My impression of the health care bills working their way through Congress right now is that none of them really go after cost savings. Science can help*.

* Ok – making devices that can spit out data in a common format will add to their cost. But you can do simple things like the Intermountain study to start as better devices come online!

Zoomify September 22, 2009

Posted by gordonwatts in computers, DeepTalk.

A bit of a technical post.

One of the biggest criticisms I get about DeepTalk (besides the fact that you can’t navigate using the arrow keys) is that it requires Microsoft’s Silverlight. There are two other options I’m aware of. First, to understand the problem that I’m working with, check out this simple conference that I’ve deeptalk’ed. Use the mouse wheel to zoom in/out and see how the display works.

For this discussion it is important to keep in mind the steps that a conference goes through on its way to becoming a DeepTalk:

  1. All the slides are sucked down from the internet, turned into jpgs, and then programmatically laid out.
  2. A rendering program reads the layout and all the images in and slices and dices the images into layers. These slices are stored on a web server with a decent internet connection.
  3. Code is downloaded to the browser that reads the layout and the slices and renders them just like any mapping website with zoom capabilities does.
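The “slices and dices the images into layers” step is easier to picture with some numbers. The sketch below computes the layers of a generic tiled image pyramid: each layer is half the size of the one above it, and each layer is cut into square tiles. The tile size of 256 pixels, the absence of any tile overlap, and halving all the way down to a single pixel are assumptions made for this illustration, not a description of the exact file layout the rendering library writes.

```python
import math


def pyramid_levels(width, height, tile_size=256):
    """Return (level_width, level_height, tile_columns, tile_rows) per layer.

    The list is ordered from the smallest layer (1x1 pixel) up to the
    full-resolution image.
    """
    levels = []
    w, h = width, height
    while True:
        cols = math.ceil(w / tile_size)
        rows = math.ceil(h / tile_size)
        levels.append((w, h, cols, rows))
        if w == 1 and h == 1:
            break
        # Each coarser layer has half the resolution, rounded up.
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
    levels.reverse()
    return levels


def total_tiles(width, height, tile_size=256):
    """Total number of tile files needed for the whole pyramid."""
    return sum(cols * rows for _, _, cols, rows in pyramid_levels(width, height, tile_size))
```

The browser only ever fetches the handful of tiles that are visible at the current zoom level, which is why the viewer stays responsive even when the full-resolution layout is enormous.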

First, raw JavaScript. This is an ideal solution. Every browser already has it installed and most modern browsers are pretty efficient. Indeed, all the mapping programs I use, like Live Maps and Google Maps, use this solution for terabytes of data. So why not me!? Well, the first requirement is that I’m not willing to re-write the code, so I have to find it on the web. Actually, I did find one (are there others?) – from Microsoft, and it can replace the Silverlight code. Ok! Then I’m all set, right? Well, no. The code isn’t as capable as I need. For example, it can render only a single image at a time. For DeepTalk a single image is roughly equivalent to a single talk. I could render the whole conference as a single “image”; however, I do not have the memory on any machine I own to do that.

Second is a commercial Adobe Flash library called Zoomify. Check out their web page – very cool. It does exactly what I need. It requires Flash, which pretty much everyone has (even if they have to update – please do it – old software == hacker target!!!). Further, unlike Silverlight, Flash works on Linux – so this would be a big plus. Unfortunately, there are two problems. First, in order to automate the rendering you need the Enterprise version ($800 US – more than was spent on the server that is currently serving the DeepTalk content). Second, the project is well integrated with Adobe Flash – which is all great and fine for people who are used to Flash. But the rest of us would need to learn a new programming language.

And finally there was the Silverlight version. This had the zooming built-in and the tools, including a rendering library I could link against, were all free. Further, the programming model for Silverlight is any .NET language – which includes C#, which looks a lot like C/C++ – something I can immediately start writing code in without having to buy a reference book.

So. That is why I’m using Silverlight for this project, and why, for the moment at least, it remains the best choice for me.

Now, as for the most popular criticism I’ve gotten about the project. I now have working on my desktop a version that allows you to use arrow keys to move around. Sadly, it still crashes due to bugs on about 1 in 3 conferences – which means it isn’t good enough to go on the web backend. You all will have to wait, sadly, for a little while longer: classes start next week, so a lot of my summer spare time is going to disappear!! Happy end of the summer!

Presentation September 20, 2009

Posted by gordonwatts in computers, Conference.

Ok. Really. This is my last post on video for a while. Ever since I started the DeepTalk project I’ve started to be much more aware of how conference data is put out on the web. So it has become a bit of a soap-box for me. 🙂 But this is the last one for a while, I promise.

During my last several posts on this there have been a bunch of comments on how other conferences have presented their video online. I thought I’d give you my opinion. 🙂

  • Pycon 2009 – the annual Python conference. At first I was hopeful about this – the web page is quite nice and you’ll notice right at the top there is a nice iCal link so you can download the schedule. However, the schedule is just that – a schedule. You can’t get access to the links to the talks or video from there. Associated with the web page is an RSS feed too – which is excellent – I could now use my podcast software (any software should be able to read it) to download the audio of the whole event. Sweet. However, there is no way to connect the slides and the video or audio together via a program (as far as I can tell). The video looks like it is all archived on blip.tv. The beauty of this system is that it makes files available in lots of formats (see this talk, click on the “files and links” to see). AND there is a small RSS link at the bottom – so I can get all the talks down as video to my podcast software (the default seems to be the MP4 format, which satisfies most of my requirements as a good video format). So this conference has made its schedule available in a standard format (iCal) and made all of its videos available through a standard service (blip.tv). I’d like to see some integration between the two so that one could find the slides, abstract, and video together, using a program. 🙂
  • Strings ‘07 Conference – a conference on strings. The conference website is basically a series of static web pages – including the schedule (I’ve extracted that page – but you can get to it by looking at the home page –> Scientific Program –> Speakers&Titles). There are links to the slides and video. The video is in MP4 format (fantastic!). None of this is discoverable, unfortunately, by a program – you would have to scrape the web page in order to find it. Chimpanzee, who has left a lot of comments on these video postings, has done some work with this conference, putting it in iTunes as a show. Unfortunately, unless you have iTunes installed, this is not very useful as it brings you to an Apple page that asks you to download and install iTunes. However, Chimpanzee did put this on blip.tv as several shows (one show per day – I think from the point of view of subscribing I’d have preferred a single show for the whole conference). Also, the nice RSS feeds to blip.tv are well hidden. So, well done with MP4 and PDF files up there. The blip.tv solution is quite nice, again. The static web page that links them together isn’t so good – it isn’t very discoverable, unfortunately.
  • Lepton-Photon 2009 – The agenda is posted in the standard agenda software in use in HEP, Indico, which makes it easily exportable. Each talk has a link to the PDF as well as a Video link. Unfortunately, the Video leads to a RealMedia file – which my open source tools cannot play. So the video format doesn’t pass muster.
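“Discoverable by a program” is worth spelling out, because with an RSS feed it really is only a few lines. The sketch below pulls the video links out of an RSS 2.0 feed using the standard `enclosure` element. The sample feed and its example.org URLs are invented for this illustration; a real feed from any of the conferences above may carry more or different elements.

```python
import xml.etree.ElementTree as ET

# A tiny made-up feed, only to show the structure a program would look for.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example conference</title>
    <item>
      <title>Opening talk</title>
      <enclosure url="https://example.org/video/opening.mp4" type="video/mp4" length="0"/>
    </item>
    <item>
      <title>Audio only</title>
      <enclosure url="https://example.org/audio/second.mp3" type="audio/mpeg" length="0"/>
    </item>
  </channel>
</rss>"""


def list_enclosures(rss_text, wanted_type="video/mp4"):
    """Return (item title, URL) for every enclosure of the wanted MIME type."""
    root = ET.fromstring(rss_text)
    found = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        for enclosure in item.findall("enclosure"):
            if enclosure.get("type") == wanted_type:
                found.append((title, enclosure.get("url")))
    return found
```

Compare that with scraping a static HTML page, where every conference needs its own hand-written parser. What the feed still does not give you is the link from each video back to its slides and abstract.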

I am pleasantly surprised by blip.tv. It looks like a very nice service. I have no idea what their business model is. The good news is that people won’t watch most talks from a physics conference very much – so they will require very little bandwidth.

No conference gets it quite right (IMHO), but they all come close. From my point-of-view, combining Indico with blip.tv seems like a fairly ideal solution given current technology constraints.

Two quick notes. First, there has been a hope that perhaps HTML5 would standardize a single video format – and we could all just depend on all browsers running it without having to install plugins like the security-hole-ridden Flash or RealMedia. This is not to be, however. There is an excellent blog series for those of you who want to know what is happening to HTML5 that I stumbled on. This posting makes it clear that a preferred video format no longer exists in the standard (for details, see the change log for the standard).

Second, I keep holding up Indico as a nice way to post meeting agendas. But perhaps there is a standard for this sort of thing? A microformat or perhaps something from the Semantic Web? Then Indico (and everyone else) could produce that for various tools to parse. I did only a brief search, but didn’t find anything.
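One candidate that already exists is iCalendar itself: a VEVENT can carry ATTACH properties holding URLs, so in principle each talk in an exported schedule could point at its slides and video. Whether any agenda tool actually does this is not something claimed here; the sketch below just shows what such an export could look like. The function names, the example talk, and the example.org URLs are invented, times are assumed to be in UTC, and the folding of long lines that the iCalendar specification asks for is left out to keep it short.

```python
from datetime import datetime, timezone

ICAL_TIME = "%Y%m%dT%H%M%SZ"  # UTC date-time form used by iCalendar


def ical_escape(text):
    """Escape the characters that iCalendar TEXT values treat specially."""
    return (text.replace("\\", "\\\\")
                .replace(";", "\\;")
                .replace(",", "\\,")
                .replace("\n", "\\n"))


def talk_to_vevent(uid, title, start, end, stamp, links):
    """Return the content lines of one VEVENT; links become ATTACH URLs."""
    lines = [
        "BEGIN:VEVENT",
        "UID:" + uid,
        "DTSTAMP:" + stamp.strftime(ICAL_TIME),
        "DTSTART:" + start.strftime(ICAL_TIME),
        "DTEND:" + end.strftime(ICAL_TIME),
        "SUMMARY:" + ical_escape(title),
    ]
    lines += ["ATTACH:" + url for url in links]
    lines.append("END:VEVENT")
    return lines


def agenda_to_ical(talks):
    """Wrap a list of talk dictionaries in a VCALENDAR, using CRLF line ends."""
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example.org//agenda sketch//EN"]
    for talk in talks:
        lines += talk_to_vevent(**talk)
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"


# Hypothetical talk, only to show the output shape.
utc = timezone.utc
example = agenda_to_ical([{
    "uid": "talk-1@example.org",
    "title": "Top quark results, part 1",
    "start": datetime(2009, 8, 17, 9, 0, tzinfo=utc),
    "end": datetime(2009, 8, 17, 9, 30, tzinfo=utc),
    "stamp": datetime(2009, 9, 20, 12, 0, tzinfo=utc),
    "links": ["https://example.org/talk-1/slides.pdf",
              "https://example.org/talk-1/video.mp4"],
}])
```

Any calendar program could read the result as an ordinary schedule, while a smarter tool could also follow the attachments.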

Bjarne Stroustrup September 8, 2009

Posted by gordonwatts in CERN, computers, ROOT.

If you are even semi-conscious of the computing world you know this name: Bjarne Stroustrup. He is the father of C++. He started designing the language sometime in the very late 1970s and continues to this day trying to keep it from getting too “weird” (his words).

He visited CERN this last week, invited by the ROOT team (I took a few pictures). I couldn’t see his big plenary talk due to a meeting conflict, but my friend Axel, on the ROOT team, was nice enough to invite me along to a smaller discussion. Presentations made at this discussion should be posted soon here. The big lecture is posted here, along with video (sadly, in Flash and WMV formats – not quite MP4 as I’ve been discussing!!)! I see that Axel also has a blog and he is posting a summary there too – in more detail than I am.

The C++ standard – which defines the language – is currently overseen by an ISO standards committee. Collectively they decide on the features and changes to the language. The membership is made up of compiler vendors, library vendors, library authors, large banking organizations, Intel, Microsoft, etc. – people who have a little $$ and make heavy use of C++. Even high energy physics is represented – Walter Brown from Fermilab. Apparently the committee membership is basically open – it costs about $10K/year to send someone to all the meetings. That is it. Not very expensive. The committee is currently finishing off a new version of the C++ language, commonly referred to as C++0x.

The visit was fascinating. I’ve always known there was plenty of politics when a group of people get together and try to decide things. Heck, I’m in High Energy Physics! But I guess I’d never given much thought to a programming language! Part of the reason it was so fascinating was that several additions to the language that folks in HEP were interested in were taken out at the last minute – for a variety of reasons – so we were all curious as to what happened.

I learned a whole bunch of things during this discussion (sorry for going technical on everyone here!):

  • Bjarne yelled at us multiple times: people like HEP are not well represented on the committee. So join the thing and get views like ours better represented (though he worried if all 150 labs joined at once that might cause a problem).
  • In many ways HEP is now pushing several multi-core computing boundaries, both in the number of cores we wish to run on and in how we use memory. Memory is, in particular, becoming an acute problem. Some support in the standard would be very helpful. Minimal support is going into the new standard, but Bjarne said, amazingly enough, there are very few people on the committee who are willing to work on these aspects. Many have the attitude that one core is really all that is needed!!! Crazy!
  • In particle physics we leak memory like a sieve. Many times our jobs crash because of it. Most of the leaks are pretty simple and a decent garbage collector could efficiently pick up everything and allow our programs to run longer. Apparently this almost made it into the standard until a coalition of the authors of the boost library killed it: if you need a garbage collector then you have a bug; just fix it. Which is all good and glorious in an ideal world, but give me a break! In a 50 million line code base!? One thing Bjarne pointed out was it takes 40 people to get something done on the committee, but it takes only 10 to stop it. Sort of like health insurance. 🙂
  • Built-in support for memory pools would probably be quite helpful here too. The idea is that when you read in a particle physics event you allocate all the data for that event in a special memory pool. The data from an event is pretty self-contained – you don’t need it once you have finished processing that event and moved on to the next one. If it is all in its own memory pool, then you can just wipe it out all at once – who cares about actually carefully deleting each object. As part of the discussion of why something like this wasn’t in there (scoped allocators sound like they might be partway there) he mentioned that HP was “on our side”, Intel was “not”, and Microsoft was one of the most aggressive when it came to adding new features to the language.
  • I started a discussion of how the STL is used in HEP – pointing out that we make very heavy use of vector and map, and then very little else. Bjarne expressed the general frustration that no one was really writing their own containers. In the ensuing discussion he dissed something that I often make use of – the for_each loop algorithm. His biggest complaint was how much stuff it added – you had to create a whole new class, which involves lots of extra lines of code – and that the code is no longer near where it is being used (non-locality can make source code hard to read). He is right that both are problems, but to him they are big enough to nix its use except in rare circumstances. Perhaps I’ll have to re-look at the way I use them.
  • He is not a fan of OpenMP. I don’t like it either, but sometimes people trot it out as the only game in town. Surely we know enough to do better now. Task-based parallelism? By slots?
  • Bjarne is very uncomfortable with lambda functions – a shorthand way to write one-off functions. To me this is the single best thing being added to the language – it will now be possible to totally avoid having to write another mem_fun or bind2nd template. That is huge, because those things never worked anyway – you could spend hours trying to make the code build, and they added so much cruft to your code you could never understand what you were trying to do in the first place! He is nervous that people will start adding large amounts of code directly into lambda functions – as he said “if it is more than one line, it is important enough to be given a name!!” We’ll have to see how use develops.
  • He was pretty dismissive of proprietary languages. Java and C# both were put in this category, citing vendor lock-in (C#, at least, does have an international standard behind it, just like C++). But the most venom I detected was when he was discussing the LLVM open source project. This is a compiler infrastructure with a JIT; its Clang front end handles C++. This project was loosely run but has now been taken over by Apple – presumably to be, among other things, packaged with their machines. His comment was basically “I used to think that was very good, but now that it has been taken over by Apple I’d have to take a close look at it and see what direction they were taking it.”
  • Run-time type information. C++ came into its own around 1983 or so. Almost no modern language is without the ability to inspect itself. Given an object, you can usually determine what methods are on the object, what the arguments of those methods are, etc. – and most importantly, build a call to that method without having ever seen the code in source form. C++ does not have it (the typeid and dynamic_cast it does offer fall far short of this). We all thought there was a big reason this wasn’t the case. The real reason: no one has pushed hard enough or is interested enough on the committee. For folks doing dynamic coding or writing interpreters this is crucial. We have to do that in our code and adding the information in after-the-fact is cumbersome and causes code bloat. Apparently we just need to pack the C++ committee!
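For readers who have not used a language that can inspect itself, here is roughly what the last point in the list above looks like in one that can. The example is in Python purely because it keeps the idea compact; it is not C++, and it is not anything that was shown in the discussion. The `Track` class and its methods are invented for the illustration.

```python
import inspect


class Track:
    """A toy stand-in for a physics object (hypothetical)."""

    def __init__(self, pt, eta):
        self.pt = pt
        self.eta = eta

    def scaled_pt(self, factor):
        return self.pt * factor

    def is_central(self, max_eta=2.5):
        return abs(self.eta) < max_eta


def describe_methods(obj):
    """Map each public method name to the names of its arguments."""
    result = {}
    for name, member in inspect.getmembers(obj, predicate=inspect.ismethod):
        if name.startswith("_"):
            continue  # skip __init__ and other private methods
        result[name] = list(inspect.signature(member).parameters)
    return result


def call_by_name(obj, method_name, *args):
    """Build and make a call knowing only the method's name as a string."""
    return getattr(obj, method_name)(*args)
```

Nothing in `describe_methods` or `call_by_name` knows anything about `Track` ahead of time, which is exactly what an interpreter or an I/O layer wants. In C++ that information has to be generated separately and bolted on afterwards, which is the after-the-fact cost described in that bullet.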

Usually as someone rises in importance in their field they get more and more diplomatic – it is almost a necessity. If that is the case, Bjarne must have been pretty rough when he was younger! It was great to see someone who was attempting to steer-by-committee something he invented vent his frustrations, show his passion, name names, and at one point threaten to give out phone numbers (well, not really, but he almost gave out phone numbers). He can no longer steer the language exactly as he wants it, but he is clearly still very much guiding it.

You can find slides that were used to guide the informal discussion here. I think archived video from the plenary presentation will appear linked to here eventually if you are curious.

Time shifting Video: Recording September 6, 2009

Posted by gordonwatts in computers, Conference.

In my first post on video there were a few comments on the effort required to record the video in the room. The basic question from Chip was the following:

The question I have is the on-site effort and expense. Take the PyCon setup: any clue what synch software they used? Because of the zooming, they had a person with a camera. Maybe I’ve not noticed, but having the slides small and the person large is an interesting idea. With the slides separately available in full-resolution, one could use the on-screen slide images as just a key to tell you when to actually click on the full size ones. Usually, it’s the other way with the person being very small and the slides larger. In fact, pedagogically, having the viewer then have to manipulate something during the talk would keep them in the game, so to speak.

Ok, there are several questions. First point: I want to be able to view this stuff on my MP3 player – so “keeping someone in the game” is not what I have in mind for that sort of viewing. 🙂

Now, the more important thing: cost of recording. There was a reply to this from Tim:

Why don’t you just record the video from the camera and the input to the projector? This would seem like an easy way to get synchronized slides.

For some dumb reason that hadn’t occurred to me – get a VGA splitter and hook one of its outputs up to your computer. The Lepton-Photon folks seem to have basically done that:


Judging from the quality of the slides (which is worse here because this was a low resolution image), I’d guess they had a dedicated camera recording the slides rather than actually looking at the computer output. A second stream focused on the presenter and they can use common post-processing tools to combine the two streams as they have above. In fact, the above is from a real-time stream. I don’t know what tool they used, but I can think of a few open-source ones that wouldn’t have too much difficulty as long as you had a decent CPU behind you. One caveat here: in a production environment I have no idea how hard it is to capture two streams and keep them in sync. If they are on two computers then you need software to make sure they start at the same time. Or to cope if there is a glitch and you lose one, etc.

Chip also asks the key question:

what did it cost?

I’m not sure what the biggest expense for these things is – but the usual culprit is the person doing the work – so I’ll go with that. To record a conference I assume you need to set up the video, run the system while it is recording, and then post-process the video to make it available on the web. The post processing could be fairly time consuming: you have to find where each talk ends and the next one begins, cut the talks, render the final video, etc.

Thinking about this, it seems like one could invest a little money up front and perhaps drop the price quite a bit. First, software to record the two streams and keep track of the sync can’t be too hard to write. On the Windows platform I’ve seen plenty of samples using video and doing real-time interpretation. Basically, at the end of the day you would want two files with synchronization information: one with video focused on the slides, and the other on the person (with a decent audio pickup!).
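The “synchronization information” could be as simple as the wall-clock time at which each recording started. Assuming the two recorders have reasonably well synchronized clocks (a real assumption, and one that would need checking in practice), the sketch below works out how much to trim from the front of each stream so that they line up. The stream names and the function name are made up for this illustration.

```python
def alignment_offsets(start_times):
    """Seconds to skip at the start of each stream so all begin together.

    start_times maps a stream name to the wall-clock time (in seconds)
    at which that stream began recording.
    """
    latest = max(start_times.values())
    return {name: latest - started for name, started in start_times.items()}
```

For example, if the slide recording started at 100.0 s and the speaker camera at 102.5 s, the first 2.5 s of the slide stream would be dropped and the speaker stream left alone. Dropped frames or a restarted recorder would need more bookkeeping than this.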

If one wants to stream the conference live – that is harder. I don’t know enough about streaming technology to know how it would fit in above without impacting the timing – which is fairly important for the next step.

A human could probably recognize almost the complete structure of the conference from the slide stream alone. I suspect we could write a computer program to do something similar, especially if we also handed the computer program the PDFs of all the talks. Image comparison is probably decent enough that it could match almost every slide to the slide stream. As a result you’d get a complete set of the timings for the conference – when the title slide went up, when the last slide was reached, when the next talk started. Heck, even when every single slide was transitioned. You could then use these timings to automatically split the video into talk-by-talk video files. Or generate a timing file with detailed information (I’d love slide-by-slide timing for my DeepTalk project). During this step you could also combine the two streams, much as is done in the above live stream I recorded. You could even discard the slide stream and put high quality images from the PDF in its place.
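Here is a rough sketch of the two pieces this would need, under heavy simplifying assumptions. Frames from the slide stream and pages from the PDFs are assumed to have already been reduced to small grayscale thumbnails of the same size, stored as flat lists of 0 to 255 values. The comparison used is a plain mean absolute difference, which is a stand-in for whatever image comparison would really be needed with a camera pointed at a screen. The function names and the threshold of 20 are invented for the illustration.

```python
def mean_abs_diff(a, b):
    """Average per-pixel difference between two equal-length thumbnails."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def best_match(frame, slides, max_diff=20.0):
    """Return the (talk_id, slide_no) whose thumbnail is closest to the frame.

    slides maps (talk_id, slide_no) to a thumbnail. Returns None when even
    the closest slide differs by more than max_diff (no slide on screen).
    """
    best_key, best_score = None, None
    for key, thumb in slides.items():
        score = mean_abs_diff(frame, thumb)
        if best_score is None or score < best_score:
            best_key, best_score = key, score
    if best_score is None or best_score > max_diff:
        return None
    return best_key


def talk_segments(matches):
    """Turn per-frame matches into (talk_id, start_time, end_time) segments.

    matches is a time-ordered list of (time_seconds, match), where match is
    None or a (talk_id, slide_no) pair. Unmatched frames are skipped, so a
    talk carries on through them. A talk ends when the next one starts; the
    last talk ends at the final matched frame.
    """
    segments = []
    current, start, last = None, None, None
    for time_s, match in matches:
        if match is None:
            continue
        talk_id = match[0]
        if talk_id != current:
            if current is not None:
                segments.append((current, start, time_s))
            current, start = talk_id, time_s
        last = time_s
    if current is not None:
        segments.append((current, start, last))
    return segments
```

The segment list is what a cutting tool would need to split the recording talk by talk, and keeping the slide numbers instead of discarding them would give the slide-by-slide timing file.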

I doubt this would be perfect, but I bet it would get you 90% of the way there. It would have trouble at the edges – before the conference started, for example. Or if someone gives a talk with no slides or slides that are very different from the ones it is given to parse. But, heck, that is to be fixed in Version 2.0. I do not know if 90% is good enough for a project like this.

Seems like a perfect small inter-disciplinary project between CS and physics (with a small grant for one year of work). 🙂 I wonder how far fetched this is?

Time shifting a Conference: Video Formats August 31, 2009

Posted by gordonwatts in computers, Conference.

I got a few interesting comments when I wrote about discoverability of conference video the other day. I’ve been away for a week, so instead of replying there I thought I’d write a whole new post. But I wanted to change the topic a little bit – motivated by two things: Lepton/Photon and a comment by chimpanzee. And sorry if this gets a little technical… I’m on a rant here!

First Lepton/Photon. The conference is over now. Check out the agenda:

[Image: agenda screenshot]

This got me very excited! Look at that – a link to the video! Yes! Finally! This is what I’d been hoping for in the last post – making it easy to find video for talks! Woo hoo!

Click on it and… they require the RealPlayer to be installed. Bummer. The RealVideo format is a proprietary video format. You have to have RealPlayer installed in order to use it. There are some open source implementations of RealVideo out there, but I think they handle an older version of the RealVideo format (for example, VLC claims to know what to do with the RealVideo streams, but falls over before it plays anything). For most people that may not be a big deal – just install RealPlayer. I personally have a problem with the RV software. But for the purposes of this post my problem is I can’t download the video stream and pack it up into my mp3 player (I have a Zune). For 40 bucks, I might be able to do it for an iPod (not clear from their website).

If anyone knows a way to play those above files without having to use the RealPlayer software, I’d love to know!

Second, there was the comment by chimpanzee. I recommend reading the full thing – I’m going to cherry pick for this post and stick with video formats:

For those who love “competition” when it comes to codecs – the competitive war just heated up. Google, just bought On2/VP8 and apparently is going to Open Source VP8.

I know nothing about VP8 other than what is on on2’s web site. It is currently a proprietary video codec. And chimpanzee says in his comment it seems reasonable that Google would open source it. For any of you that have downloaded video files from the internet (Bad! Bad!) you already know there are a plethora of video formats out there and one often needs to install lots of different codecs to get them all to play. VP8 will not solve this, at least not in the near term (<5 years). But this got me thinking – we can solve this now, can’t we?

So, I have a modest proposal. Physics conferences should archive the video of their conferences in a format that plays natively on the n most used operating systems out-of-the-box (where n is > …):

  • Linux – this is funny. In HEP we mostly use Scientific Linux. This is not optimized for watching video. So choose the most popular distro – Ubuntu I think? I used to have a distro of that running on my laptop but had to delete it for space reasons, so I couldn’t test it…
  • Mac – A recent version of OSX running on Intel Macs.
  • Windows – this is tricky. XP is the most popular version out there, however the OS is quite old and plays almost nothing modern out of the box. Plug-ins are available (including my favorite – wow I hate the new sourceforge) that will allow it to play almost anything, but that isn’t out-of-the-box, so that isn’t good. Vista, I think, is in the same boat – it doesn’t have much in the way of extra codecs. W7, however, supports most formats I’ve seen out there (I couldn’t find the docs on Microsoft’s site, but I did find this which matches my experience with the release candidate). So, I think we are almost forced to pick Windows 7 for the Windows branch.
  • iPhone/mobile – this I’m not too worried about. Usually if the host system can play it (iTunes, Zune, etc.) then it can be transcoded and placed on the mobile device.
  • Others?

Given all this, it strikes me that MP4 is the only video format that comfortably fits into this. There are plenty of open source tools – heck, tools in general – that allow you to manipulate it to your heart’s content. Play it on your mobile player, your TV, heck, most modern burner software can burn it to a DVD if you want. Further, if you are like me, and want to manipulate the video for whatever reason, well, you can because there are so many tools.

So, that is my modest proposal for archiving.
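As a rough illustration of how little is needed to get an archive into that shape, here is a minimal sketch that builds a conversion command for ffmpeg, one of the open source tools alluded to above. The choice of ffmpeg, of H.264 video and AAC audio inside the MP4 container, and the helper name `mp4_command` are all assumptions made for the example, not something any conference actually uses; it also assumes an ffmpeg build that includes the libx264 encoder.

```python
import subprocess
from pathlib import Path


def mp4_command(source, target_dir="."):
    """Build an ffmpeg command that re-encodes `source` into an MP4 file.

    Hypothetical helper: the codec choices (H.264 + AAC) are an assumption.
    """
    target = Path(target_dir) / (Path(source).stem + ".mp4")
    return [
        "ffmpeg",
        "-i", str(source),   # input in any format ffmpeg can read
        "-c:v", "libx264",   # encode the video as H.264
        "-c:a", "aac",       # encode the audio as AAC
        str(target),         # .mp4 extension selects the MP4 container
    ]


def convert(source, target_dir="."):
    """Run the conversion; returns ffmpeg's exit code (0 on success)."""
    return subprocess.call(mp4_command(source, target_dir))
```

Calling `convert("plenary.avi", "archive")` would write `archive/plenary.mp4`, provided ffmpeg is installed and can read the input; the file name here is made up.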

Streaming is more complex. I don’t know as much about streaming. I’d be inclined to vote for MP4, but I’m not sure how well it works in a streaming protocol.

I have to sneak one vacation picture in… 🙂 The French countryside is pretty amazing… This is near the salt-flats outside of Rochelle.


Timeshifting A Conference: Can we all agree? Please? August 21, 2009

Posted by gordonwatts in computers, Conference, DeepTalk, Video.

A video feed or recording of a big physics conference is a mixed blessing.


If there is a video recording of a huge conference – like DPF – it would be hundreds of hours long. Many of the parallel sessions describe work that is constantly being updated – so it isn’t clear how long a posted video would stay relevant. I’ve seen conferences just post video of plenary sessions and skip the parallel sessions for, I imagine, this very reason.

I definitely appreciate it when one of the big conferences does furnish video or streaming. But I have a major problem: time shifting. Even if I’m awake during the conference it is rare I can devote real time to watching it. Or if there is a special talk I might have to try to arrange my schedule around the special talk. But, come on folks – we’ve solved this problem, right? Tivo!?!? Or for us old folks, it is called a VCR!!!

Which brings me to the second issue with conference video. Formats. For whatever reason the particle physics world has mostly stuck to using RealMedia of one form or another. Ugh. I was badly burned back in the day with the extra crap that RealMedia installed on my machine so I’m gun shy now. But the format is also hard to manipulate. I tried a recent version of their player (maybe about 6 months ago) and they have a nice recording feature – exactly what I need here. But I couldn’t figure out how to convert its stored format to mp4 or other things to download to my mp3 player! There are some open source implementations out there – but I’ve never encountered one that has been good enough to reliably parse these streams.

This year’s Lepton-Photon is trying something new. They are streaming in RealMedia, but they also have an mp4 stream. And the free VLC player can play it. What is better is the free VLC player can record it! And convert it! Hooray!!! I can now download and convert these guys and listen/watch them on my commute to work and back, which is perfect for me (the picture above is a screen capture of the stream in VLC). The picture isn’t totally rosy, however. VLC seems to lose the stream every now-and-then. So when I’m recording it I have to watch the player like a hawk and restart it. Sometimes it will go two hours between drops, and other times just 10 minutes. It would be nice if it would auto-restart.
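An auto-restart does not have to live inside the player. Below is a minimal sketch of a wrapper that simply re-launches a command-line recorder every time it exits, until the session is over. It assumes the recorder exits when the stream drops, which may not hold for every player. The function names and the use of ffmpeg in `ffmpeg_command` are assumptions for illustration; any recorder that can be started from the command line could be swapped in.

```python
import subprocess
import time


def record_with_restart(make_command, end_time, run=subprocess.call,
                        clock=time.time, pause=time.sleep, retry_delay=5.0):
    """Re-launch a recorder each time it exits, until end_time (epoch seconds).

    make_command(n) returns the argument list for the n-th attempt, so each
    attempt can write to its own file instead of overwriting the last one.
    Returns the number of attempts that were started.
    """
    attempts = 0
    while clock() < end_time:
        command = make_command(attempts)
        attempts += 1
        run(command)            # blocks until the recorder exits
        if clock() < end_time:
            pause(retry_delay)  # short wait before reconnecting
    return attempts


def ffmpeg_command(stream_url):
    """Hypothetical command factory: copy the stream into numbered files."""
    def make_command(n):
        return ["ffmpeg", "-i", stream_url, "-c", "copy", f"session_{n:03d}.ts"]
    return make_command
```

A two hour session could then be captured with `record_with_restart(ffmpeg_command(url), time.time() + 2 * 3600)`, where `url` stands for whatever address the conference publishes. The pieces would still need to be joined afterwards, and anything said during a reconnect is lost.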

Which brings me to the last problem. Discoverability. I really like the way my DeepTalk project puts up a conference as a series of talks. But the only reason it works is because the conference is backed by a standard agenda/conference tool, Indico. My DeepTalk tools can interface with that, grab the agenda in a known format, and render it. We have no such standard for video.

Wouldn’t it be great if everyone did it the same way? You could point your iTunes/Zune/RealMedia/Whatever tool at a conference, it would figure out the times the conference ran, schedule a recording for streams, or if the video was attached, it would download the data… you’d come back after the conference was over, click the “put conference on my mp3 player” and jump on that long plane flight to Europe and drift off to sleep to the dulcet sounds of someone describing the latest update to the W mass and how it has moved the most probable Higgs mass a few GeV lower.

Would that be bliss, or what!?
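To make the wish a little more concrete, here is a minimal sketch of what a tool could do if every agenda entry carried its video information in a known form. The `Talk` record and its field names are purely hypothetical; no such standard exists, and this is not how Indico or DeepTalk represent an agenda.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Talk:
    """Hypothetical agenda entry with optional video information."""
    title: str
    start: datetime
    end: datetime
    video_url: Optional[str] = None    # recorded file attached to the talk
    stream_url: Optional[str] = None   # live stream, if there is one


def plan(talks: List[Talk], now: datetime) -> List[tuple]:
    """Decide, talk by talk, whether to download, record, or skip."""
    actions = []
    for talk in sorted(talks, key=lambda t: t.start):
        if talk.video_url:
            # Video already attached: just fetch it.
            actions.append(("download", talk.title, talk.video_url))
        elif talk.stream_url and talk.end > now:
            # Still streaming or yet to start: schedule a recording.
            begin = max(talk.start, now)
            actions.append(("record", talk.title, talk.stream_url, begin, talk.end))
        else:
            # Nothing attached and the stream is over.
            actions.append(("skip", talk.title))
    return actions
```

The point of the sketch is only that the decision logic is trivial once the agenda says where the video is and when the talk runs; the hard part is getting everyone to publish that information the same way.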

A Depressingly Good Plot August 17, 2009

Posted by gordonwatts in computers.

Long time readers of this blog will know that I love graphics – and think we need to use visual data representation better in particle physics than we currently do (heck, that is one of the main reasons I started working on DeepTalk). A friend of mine here in Marseille pointed out this fantastic one from Slate:

[image: static version of the Slate job map]

What you are looking at is a static version of the real thing. Seriously, go to the article, and click on the green play button. I’ll wait. You’ll be depressed. You might recognize maps like this from the election: every single county in the USA is outlined. Here the blue circles represent the number of jobs gained each month, the red ones the number lost, and the size of each circle is proportional to the actual number. As you might imagine, as the animation makes its way into 2008/2009 the map turns rather red.

There is a lot of data being shown here. But in about 20 seconds you can get a good idea of what has happened to the US job market over the last 2.5 years. So, while a depressing plot, what a great way to show the time evolution of data! It has very high information density.

In particle physics most of our plots are static and simple histograms – very low information density. The problem is that making high density plots like this is very work intensive. And we’d need the tools to do it quickly. Do such tool kits exist?
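The encoding itself, at least, is not the hard part. Here is a minimal sketch of the mapping the Slate plot appears to use: colour from the sign of the monthly change, circle size from its magnitude, one frame per month. It assumes that "size" means circle area (the article may scale the radius instead), and the function names and the `jobs_per_unit_area` scale are made up for the example. Drawing and animating the frames is left to whatever plotting tool is at hand.

```python
import math


def bubble(net_jobs, jobs_per_unit_area=1000.0):
    """Map a county's monthly net job change to (colour, radius).

    Assumption: circle area, not radius, is proportional to the job count.
    """
    colour = "blue" if net_jobs >= 0 else "red"
    area = abs(net_jobs) / jobs_per_unit_area
    radius = math.sqrt(area / math.pi)
    return colour, radius


def frames(monthly):
    """Yield (month, {county: (colour, radius)}) in chronological order.

    `monthly` maps a 'YYYY-MM' string to {county name: net job change}.
    """
    for month in sorted(monthly):
        yield month, {county: bubble(n) for county, n in monthly[month].items()}
```

Feeding `frames` a dictionary of per-county numbers gives one set of circles per month, which is all an animation loop needs as input.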