
Tests are Good for You January 21, 2011

Posted by gordonwatts in Teaching, university, University of Washington.

The New York Times had an article the other day talking about a discovery that is making rounds:

Taking a test is not just a passive mechanism for assessing how much people know, according to new research. It actually helps people learn, and it works better than a number of other studying techniques.

I’m here to tell you: duh!

In fact, we’ve institutionalized this in our physics graduate schools. Most university physics departments have the mother-of-all tests. Here at UW we call it the Qualifying Exam. Others call it a prelim (short for preliminary). And there is a joke associated with this exam, usually said with some bitterness if you’ve not passed it yet, or some wistfulness if you long since have passed it:

You know more physics the day you take the qual than you ever do at any other time in your life.

The exam usually happens at the end of your first year in graduate school. The first-year classes are hell; up to that point in my life it was the hardest I’d ever worked at school. Then the summer hits, and you get a small rest. But it is impossible to rest while staring down the barrel of that exam, often given at the end of the summer just before the second year of classes starts. You have to pass this exam in order to go on to get your Ph.D. And for most of us, it is the last (formal) exam in our career that actually matters. So psychologically, it is a big hurdle as well.

How hard is it? My standard advice to students is that they should spend about one month studying, 8 hours a day. For most people, if they study effectively, that is enough to get by; some need less and some need more. This is about what it took me. What is the test like? At UW ours is 2 hours per topic, closed book, and consists entirely of working out problems. No multiple choice here! It lasts two days.

So, how do you study? There is, I think, really only one way to get past this. For 30 days, 8 hours a day, work out problems. There are lots of old qualifier problems on websites; our department provides students with copies of all the old exams. Even if you don’t know the solution, you force yourself to try to work it out without looking it up in a book – break your brain on it. Once you can solve those problems without having to look at a textbook, you know you are ready. Imagine trying to study by reading a textbook, or by reviewing your first-year homework problems. There is no way your brain will be able to work out a new problem after that unless you are a rare individual.

Note how similar this is to the results shown in the article:

In the first experiment, the students were divided into four groups. One did nothing more than read the text for five minutes. Another studied the passage in four consecutive five-minute sessions.

A third group engaged in “concept mapping,” in which, with the passage in front of them, they arranged information from the passage into a kind of diagram, writing details and ideas in hand-drawn bubbles and linking the bubbles in an organized way.

The final group took a “retrieval practice” test. Without the passage in front of them, they wrote what they remembered in a free-form essay for 10 minutes. Then they reread the passage and took another retrieval practice test.

The last group did the best, as you might imagine from the theme of this post!

This is also how you know more physics than at any other time in your life: at no other time do you spend 30 days working out problems across such a broad spectrum of physics topics. If you study and work out a sufficiently broad range of problems, you can breeze through the exam (I remember watching one guy who took it with me just nail it in about half the time it took the rest of us).

Working out problems – without any aids – is active learning. I suppose you could follow the article and say that forcing the brain to come up with the solution means it organizes the information in a better way… actually, I have no idea what the brain does. But so far this seems to be the best way to teach yourself: you are actively playing with the new concepts and topics. This is why homework is absolutely key to a good education. And this is why tests are good – if you study correctly. If you actively study for the test (vs. just reading the material) then you will learn the material better.

And we need to get better at designing tests that force students to study actively. I feel we are sometimes slipping backwards here. One byproduct of the large budget cuts universities are suffering is that the money we have to hire TAs to help grade our large undergraduate classes is dropping. That means we can’t ask as many open-ended exam questions and have to increase the fraction of multiple choice. It is much harder to design a multiple-choice test that goes after problem solving in physics. This is too bad.

So, is this qualifier test a hazing process? Or is there a reason to do it? Actually, that is a point of controversy. Maybe there is a way to force the studying component without the high anxiety of a make-or-break exam. Certainly some (very good) institutions have eliminated the qual. Now, if we could figure out how to do that and still get the learning results we want…


16,000 Physics Plots January 12, 2011

Posted by gordonwatts in ATLAS, CDF, CMS, computers, D0, DeepTalk, physics life, Pivot Physics Plots.

Google has 20% time. I have Christmas break. If you work at Google, you are supposed to spend 20% of your time on your own little side project rather than the work you are nominally supposed to be doing. Lots of little projects are started this way (I think Gmail, for example, started as one).

Each Christmas break I tend to hack on some project that interests me – but is often not directly related to something that I’m working on. Usually by the end of the break the project is useful enough that I can start to get something out of it. I then steadily improve it over the next months as I figure out what I really wanted. Sometimes they never get used again after that initial hacking time (you know: fail often, and fail early). My deeptalk project came out of this, as did my ROOT.NET libraries. I’m not sure others have gotten a lot of use out of these projects, but I certainly have. The one I tackled this year has turned out to be a total disaster. Interesting, but still a disaster. This post is about the project I started a year ago. This was a fun one. Check this out:


Each of those little rectangles represents a plot released last year by DZERO, CDF, ATLAS, or CMS (the Tevatron and LHC general-purpose collider experiments) as a preliminary result. That huge spike in July – 3600 plots (click to enlarge the image) – is everyone preparing for the ICHEP conference. In all, the four experiments put out about 6000 preliminary plots last year.

I don’t know about you – but there is no way I can keep up with what the four experiments are doing – let alone the two I’m a member of! That is an awful lot of web pages to check – especially since the experiments, though modern, aren’t modern enough to be using something like an Atom/RSS feed! So my hack project was to write a massive web scraper and a Silverlight front-end to display it. The front-end is based on the Pivot project originally from MSR, which means you can really dig into the data.
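A scraper like this mostly boils down to pulling image links and their captions out of free-form HTML. Here is a minimal sketch in Python (my project was actually a .NET/Silverlight affair, and every experiment's page layout differs; the table structure and class name below are made up for illustration):

```python
from html.parser import HTMLParser

class PlotScraper(HTMLParser):
    """Collect (image src, caption) pairs from a results page.

    The page layout assumed here -- an <img> followed by a
    <td class="caption"> cell -- is hypothetical; a real scraper
    needs one parser per experiment's page style.
    """
    def __init__(self):
        super().__init__()
        self.plots = []           # list of (src, caption) tuples
        self._pending_src = None  # last image seen, awaiting a caption
        self._in_caption = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "src" in attrs:
            self._pending_src = attrs["src"]
        elif tag == "td" and attrs.get("class") == "caption":
            self._in_caption = True

    def handle_data(self, data):
        if self._in_caption and self._pending_src and data.strip():
            self.plots.append((self._pending_src, data.strip()))
            self._pending_src = None

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_caption = False

# A toy fragment in the assumed layout:
sample = """
<table>
  <tr><td><img src="zjets_pt.png"></td>
      <td class="caption">Z+jets inclusive cross section vs pT</td></tr>
</table>
"""
scraper = PlotScraper()
scraper.feed(sample)
```

After `feed`, `scraper.plots` holds the plot/caption pairs ready to be stuffed into the viewer's database.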

For example, I can explode December by clicking on “December”:


and that brings up the two halves of December. Clicking in the same way on the second half of December I can see:


From that it looks like 4 notes were released – so we can organize things by notes that were released:


Note the two funny icons – those allow you to switch between a grid layout of the plots and a histogram layout. And after selecting that we see that it was actually 6 notes:



That left note is titled “Z+Jets Inclusive Cross Section” – something I want to see more of, so I can select it to see all the plots for that note at once:


And say I want to look at one plot – I just click on it (or use my mouse scroll wheel) and I see:


I can actually zoom way into the plot if I wish using my mouse scroll wheel (or typical touch-screen gestures, or the usual zoom gesture on the Mac). Note the info bar that shows up on the right-hand side. That includes information about the plot (a caption, for example) as well as a link to the web page it was pulled from. You can click on that link (see the caveat below!) to bring up the web page. Even a link to a PDF note is there, if the web scraper could discover one.

Along the left hand side you’ll see a vertical bar (which I’ve rotated for display purposes here):


You can click on any of the years to get the plots from that year. Recent will give you the last 4 months of plots. By default, this is where the viewer starts up – it seems like a nice compromise between speed and breadth when you want to quickly check what has recently happened. The “FS” button (yeah, I’m not a user-interface guy) is short for “Full Screen” – I definitely recommend viewing this on a large monitor! “BK” and “FW” are like the back and forward buttons on your browser and enable you to undo a selection. The info bar on the left allows you to do some of this if you want, too.

Want to play? Go to http://deeptalk.phys.washington.edu/ColliderPlots/… but first read the following. And feel free to leave suggestions! And let me know what you think about the idea behind this (and perhaps a better way to do it).

  • Currently this works only on Windows and the Mac. Linux will happen when Moonlight supports v4.0 of Silverlight. For Windows and the Mac you will have to have the Silverlight plug-in installed (if you are on Windows you almost certainly already have it).
  • This thing needs a good network connection and a good CPU/GPU. There is some heavy graphics lifting that goes on (wait till you see the graphics animations – very cool). I can run it on my netbook, but it isn’t that great. And loading when my DSL line is not doing well can take upwards of a minute (when loading from a decent connection it takes about 10 seconds for the first load).
  • You can’t open a link to a physics note or webpage unless you install this so it is running locally. This is a security feature (cross site scripting). The install is lightweight – just right click and select install (control-click on the Mac, if I remember correctly). And I’ve signed it with a certificate, so it won’t get messed up behind your back.
  • The data is only as good as its source. Free-form web pages are a mess. I’ve done my best without investing an inordinate amount of time on the project. Keep that in mind when you find some data that makes no sense. Heck, this is open source, so feel free to contribute! Updating happens about once a day. If an experiment removes a plot from their web pages, then it will disappear from here as well at the next update.
  • Only public web pages are scanned!!
  • The biggest hole is the lack of published papers/plots. This is intentional, because I would like to get them from arXiv. But the problem is that my scraper isn’t intelligent enough when it hits a website – it grabs everything it needs all at once (don’t worry, the second time through it asks only for headers to see if anything has changed). As a result it is bound to set off arXiv’s robot sensor. And the thought of parsing TeX files for captions is just… not appealing. But this is the most obvious big hole, and I would like to fix it at some point soon.
  • This depends on public web pages. That means if an experiment changes its web pages or where they are located, all the plots will disappear from the display! I do my best to fix this as soon as I notice it. Fortunately, these are public facing web pages so this doesn’t happen very often!
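The header-only re-check mentioned in the list above is just HTTP conditional requests. A sketch of the idea in Python (the cache record's field names are made up; the server answers 304 Not Modified when nothing has changed, so the page body never has to be re-downloaded):

```python
def conditional_headers(cache_entry):
    """Build request headers for a header-only re-check of a page
    fetched on an earlier crawl.

    cache_entry is our stored record from the first full fetch; the
    field names ('etag', 'last_modified') are illustrative, not from
    any particular scraper.
    """
    headers = {}
    if cache_entry.get("etag"):
        headers["If-None-Match"] = cache_entry["etag"]
    if cache_entry.get("last_modified"):
        headers["If-Modified-Since"] = cache_entry["last_modified"]
    return headers

# Second pass: ask "has this changed since last time?"
cached = {"etag": '"abc123"',
          "last_modified": "Tue, 11 Jan 2011 08:00:00 GMT"}
hdrs = conditional_headers(cached)
```

A server that honors these headers replies `304 Not Modified` with no body, and the scraper moves on; a fresh page comes back as a normal `200` and gets re-parsed.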

Ok, now for some fun. Who has the most broken links on their public pages? CDF, by a long shot. Who has the pages that are most machine readable? CMS and DZERO. But while their pages are machine readable, the images have no captions (which makes searching the image database for text less useful than it should be). ATLAS is a happy medium – their preliminary results are in a nice automatically produced grid that includes captions.

The Ultimate Logbook January 8, 2011

Posted by gordonwatts in logbooks, physics life.

I couldn’t leave this alone. I mentioned the ultimate logbook in my last posting. This is the logbook that would record everything you did and archive it.

It isn’t difficult. The web already has a perfect data format for this – Atom (or RSS). Just imagine: each source code repository you commit to would publish a feed of all of your changes (with a time stamp, of course!) in the Atom format. Heck, your computer could keep track of what files you edited and publish a list of those too (many cloud storage services already do this). Make a plot in ROOT? Sure! A feed could be published. Ran a batch job? The command you used for submission could be published.

Then you need something central that is polling those RSS feeds with some frequency, gathering the data, and archiving it. Oh, and perhaps even making it available for easy use.
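If every tool publishes Atom, the central polling-and-archiving service is a small amount of code. A sketch in Python, with a hypothetical commit feed (the feed content and the dict-as-archive are stand-ins for whatever a real service would use):

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def archive_entries(feed_xml, archive):
    """Fold one Atom feed's entries into an archive keyed by entry id,
    so polling the same feed repeatedly stores nothing twice.
    Returns the number of entries that were new this time."""
    new = 0
    for entry in ET.fromstring(feed_xml).findall(ATOM + "entry"):
        eid = entry.findtext(ATOM + "id")
        if eid not in archive:
            archive[eid] = (entry.findtext(ATOM + "updated"),
                            entry.findtext(ATOM + "title"))
            new += 1
    return new

# A made-up feed that a source-code repository might publish:
feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>commits</title>
  <entry>
    <id>urn:commit:42</id>
    <updated>2011-01-08T10:15:00Z</updated>
    <title>Fix jet-axis sign error in ntuple maker</title>
  </entry>
</feed>"""

log = {}
archive_entries(feed, log)
```

Run this on a timer against every feed you care about and the archive accumulates a time-stamped record of everything you did – the "write" half of the ultimate logbook.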

Actually, there is a service that does this already. Facebook. Sure! Just tell it about every RSS feed and it will suck that data in. Some of you are probably reading this on Facebook – and this posting got there because I told Facebook about this blog’s Atom feed and it sucked the data in.

Of course, having a write-only repository of everything you did is a little less than useful. You need a powerful search engine to bring the data you are interested in back out. Especially because a lot of that data is just a random command which contains no obvious indication of what you were working on (i.e. no meta-data).

And finally, at least for me, I don’t really want something that is static. Rarely is there a project that I’m finished with and I can neatly wrap it up and move on. Heck, there are projects I put down and pick up again many months later. This ultimate logbook doesn’t really support that.

Perhaps it is best to split the functions: call this ultimate logbook a daily log instead, and then keep separate bits of paper where you do your thinking… Awww heck, right back to where we started!

BTW, if you think Facebook might be interesting as a solution here, remember several things. First, as far as I can tell, there is no way to search your comments or posts. Second, you might get ‘Zuckerberged’ – that is, the privacy settings might get changed and your logbook might become totally public.

Log Book Follow-up January 5, 2011

Posted by gordonwatts in logbooks, physics life.

Starting back in March I wrote a bunch of posts on logbooks: where do you keep your log book?, what do you keep in it? (and more of what you put in it). I can’t help it. The logbook is near and dear to my heart. I promised a follow-up posting. Finally… In summary (nothing in any particular order):

  • What goes into a log book: pictures, code, text, screenscrapes, files, plots, handwriting, paper
  • What do you use: Evernote, old style (bound notebook), loose paper, wiki/twiki, Yojimbo, Google Wave, email (as in email a plot to yourself), TiddlyWiki, blogging software, text file, DEVONthink Personal, Journler (now defunct).

One thing I didn’t ask about but all of you contributed anyway was how the logbook got used (there is no right way – the logbook has to work for you, of course):

  • Gave up – nothing but an inbox
  • Just keep track of thinking
  • Exploded: link services to track papers, paper for jotting down notes, email, etc. – a bit of everything
  • Every last thing goes into the logbook, including bathroom breaks.

No one mentioned using a Kindle or Nook to read their logbook, btw. Of the software, Evernote looks to me like the one that gets used most like a logbook.

For me the most surprising method was email. And by surprising, I mean smacking myself on the forehead because I’d not already thought of it. Here is the idea: just email your logbook entries – with files and attachments, etc. – to your logbook email account. Then use the power of search to recover whatever you want. And since you can stick it on Gmail or Hotmail or Yahoo Mail, you have almost no size restrictions – and it is available wherever you happen to have an internet connection. Further, since it is just email, it is trivial to write scripts to capture data and ship it off to the logbook.
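Such a capture script really is only a few lines. A sketch using Python's standard email library (the logbook address is made up, and the actual smtplib send to your mail provider is omitted – this only builds the message):

```python
from email.message import EmailMessage

LOGBOOK_ADDR = "my.logbook@example.com"   # hypothetical dedicated account

def logbook_entry(subject, body, png_bytes=None, filename=None):
    """Package one logbook entry as an email message, optionally with
    a plot attached. In practice you would hand the result to smtplib
    and let Gmail/Hotmail's search engine do the indexing."""
    msg = EmailMessage()
    msg["To"] = LOGBOOK_ADDR
    msg["Subject"] = subject
    msg.set_content(body)
    if png_bytes is not None:
        msg.add_attachment(png_bytes, maintype="image", subtype="png",
                           filename=filename)
    return msg

# e.g. called from the end of an analysis script:
entry = logbook_entry("mass fit, v3",
                      "Chi2/ndf finally reasonable after the rebinning.",
                      png_bytes=b"\x89PNG...", filename="massfit.png")
```

Because the entry is just email, any tool that can run a script can append to the logbook automatically.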

Now, I’ll ramble a bit in way of conclusion…

Do you remember Microsoft’s failed phone, the Kin? It was basically a smartphone without the apps. But one of the cool things it did was called Kin Studio. The point was this: everything you did on the phone was uploaded to the cloud – all the text messages you sent or received, all the pictures you took, etc. Then on the web you could look back at any time and have a complete record. Now, that is a logbook.

Of course, there are some problems with this. Who wants to look at lots of messages that say “ok!” or “ttl” or similar? And the same problem would occur if we were able to develop the equivalent of Kin Studio for logbooks: it would be a disaster. Which I think gets to the crux of what many of you were wrestling with in the comments of those posts (and something I wrestle with all the time): what do you put in a logbook!? There is a part of me that would like to capture everything – the ultimate logbook. Given today’s software and technology this wouldn’t be very hard to write!

In thinking about this I came up with a few observations of my own behavior over the last few years:

One way to look at this is: what do you look up in a logbook? I have to say – what I look up in my logbook has undergone some dramatic changes since I was a graduate student. Back then we didn’t have the web (really) or search engines, so writing down exactly what I needed to do to get some bit of code working was very important. Now it is almost certain I can find a code sample on the web in one or two searches, so that doesn’t need to go into the logbook anymore. Plots still go in – but 90% of them are wrong. You know how it goes: you make the plot, think you are done, move on to the next step, and in the process discover a mistake – so you go back and have to remake everything, and put the updated version of the plot into your logbook. Soon that becomes a waste of time, so you just auto-generate a directory with all the plots, which always has the latest-and-greatest version. Hopefully you remember to put some of those into your logbook when you are done… but often not (at least for me).

What is the oldest logbook entry you’ve ever gone back to? For me it was the top discovery – but that was nostalgia, not because I needed some bit of data. I rarely go back more than a few months. And, frankly, in this day and age, if you do an analysis that is published in January, by July someone (perhaps you) has redone it with more data and a better technique. You need those January numbers to compare – but you get them from an analysis note, not from your logbook! In short, the analysis note has become the “official” logbook of the experiment.

I have to say that my logbook currently serves two functions: meeting notes and thinking. Meeting minutes are often not recorded – so I keep a record. Especially since I’m using an electronic notebook, I can mark things with an “action” flag and go back later to find out exactly what I need to do as a result of that meeting. The second heaviest use for me is brainstorming. Normally one might scribble ideas on some loose paper, perhaps leave them around for a day or two, come back, refine them, etc. I use my logbook for that rather than loose paper.

Nowadays I definitely do not keep a logbook in the traditional way – certainly not in the way I was taught to use one in my undergraduate physics classes! Here is a quote from an ex-student of mine (in the comments of one of the previous posts – and I can copy this because he already has a job!!):

I have a rather haphazard attitude toward these things–I have a logbook, but I use it to remember things and occasionally to sort out and prioritize my thoughts. So it’s fairly sparse, and it certainly would be of no help in a patent dispute! Often I keep my old working areas around on my computer, and I use them if I forget what I did in my previous work.

This is pretty typical of what I see in people around me in the field. Other commenters made reference to more careful use of logbooks. I wonder how much usage style varies by field (medicine, physics (particle vs. condensed matter, theory vs. experiment), engineering, industry vs. academic, etc.)?

Getting WiFi in a conference of online addicts is hard January 1, 2011

Posted by gordonwatts in Conference, physics life.

This post was triggered by an article I saw pointing out some fundamental limitations of WiFi at tech conferences:

Last month in San Francisco at the Web 2.0 Summit, where about 1,000 people heard such luminaries as Mark Zuckerberg of Facebook, Julius Genachowski, chairman of the Federal Communications Commission, and Eric E. Schmidt of Google talk about the digital future, the Wi-Fi slowed or stalled at times.

I like the way one of my students, Andy Haas, put it once. He was giving a talk at a DZERO workshop on the Level 3 computer farm and trying to make a point about the number and type of computers that were in the farm. He drew an analogy to the number of laptops that were open in the room. It can be a little spooky – almost everyone has one, and almost everyone has them open during conference talks. In Andy’s case there were about 100 people in the room. And when you are giving the talk you have to wonder: how many people are listening!?

There is another side-effect, however. It is rare that the hotel, or whatever venue, is ready for the large number of devices that we particle physicists bring to a meeting. In the old days it was a laptop per person; now add in a cell phone that also wants an internet connection. Apparently most conference organizers used to guess that about 1 in 5 people would have a portable that needed a connection at any one time. Folks from particle physics, however, just blew that curve! The result was often lost WiFi connections, many seconds to load a page, and an inability to download the conference agenda! As conference organizers we learned long ago that this is one of the most important things to get right – and one of the key things that will be used to judge the organization of your conference.

The article is interesting in another aspect as well (other than pointing out a problem we’ve been dealing with for more than 10 years now). WiFi is not really designed for this sort of use. Which leads to the question: what is next?

How was your year? December 31, 2010

Posted by gordonwatts in life.

Watching my Facebook stream I’ve seen a bunch of comments about how bad people’s 2010 was. That got me thinking about my own 2010. Actually, I’m lucky: it has been pretty good. I got to live for almost 4 months in the South of France (and got a lot of work done there), and the ‘mo moved through her 4’s – a very cool age (for those of you who can’t remember that far back!). Work-wise it was also great – my last Tevatron student got his Ph.D., I finished being convener of the ATLAS b-tagging group, watched the first set of calibrations for b-tagging get “published” – an effort I helped start over 3 years ago – and hired a very cool post-doc. And I got to learn a bunch of things.

That isn’t to say nothing bad happened – the poor economy continues to push back: I’ve not had a raise for the last two years and I would guess I won’t have one for the next two either. However, I do have a job, and, even, a job I like a lot. My eyes are going. Our condo’s price still hasn’t recovered.

Bye 2010! Looking forward to 2011. Oh, wait, I only have a few hours to make resolutions! Ack!

Email is dead! Long live… err… uhh…. hmmm… December 29, 2010

Posted by gordonwatts in email, physics life.

You know something is on its way out when the Old Grey Lady picks it up. Apparently e-mail is dead.

The problem with e-mail, young people say, is that it involves a boringly long process of signing into an account, typing out a subject line and then sending a message that might not be received or answered for hours. And sign-offs like “sincerely” — seriously?

Those of you around the web I’m sure have seen this – murmurs about the death of email have been going on for a long time. Text – on the phone – has been taking over. You have to look no further than text message usage statistics to see this is very real… 1 in 3 teens send more than 100 text messages a day. [I wasn’t able to find any recent overall usage statistics for text, but this one is from back in 2005.] If you look at similar plots of the number of cell phones out there, you’ll note the increase in messages is faster – we are sending more of them than we used to. Facebook, which is attempting to be our communications hub, is altering how it does email – removing the subject line, etc. – making it more like text messaging. Hotmail, Gmail, and Yahoo Mail have already gone through this, and they continue further down this road.

I, of course, teach, and so am often in contact with lots of students… and they say the same thing. We’ve heard the comment “I only read my email because old people send me things, like my parents or professors.” (no, I’m not making that up…).

But, really, is email dead? Can it be so? Or is it a situational thing?

David McDowell, senior director of product management for Yahoo Mail … said this was less a generational phenomenon than a situational one. Fifteen-year-olds, for example, have little reason to send private attachments to a boss or financial institution.

This I can buy. Heck, I’m a huge user of email, and I am religious about putting a subject on all my emails when I send them professionally. When I use Facebook email for a quick note to a friend of mine… I almost never put a subject on it. I am definitely seeing more use of IM – especially now that Facebook has allowed 3rd parties to tap into its IM system – people often contact me through it with questions or comments, for a quick chat about some physics gossip or where to find some paper, etc.

But with students – the next generation – it really does seem like a more fundamental change is occurring. At the moment, when they enter the work force, they are entering our world and so are, at some level, forced to adopt our model of e-mail usage. But that will change – us old people are living on borrowed time – at some point we will be living in their world. How will communication look? Will it look similar to today or will it be a continuous stream of constant interruptions as text messages roll in? Or will it be a mix of the two, depending on the topic and the kind of question that is being asked?

And second, how do you deal with the modern class? Say I have 250 students. I want to tell them to study chapters 1-7 for the exam later this week. Normally I’d blast a class-wide email. Should I be setting up a class fan-page on facebook (and not all of them will be members)? Get the phone number for them all so I can send a text (sounds like too much work)? Just post to a web page and assume they saw it?

My guess is that all those emails which people just add one “line” and then hit send on are going to become a thing of the past – they will become these text and IM’s we’ve been talking about. How is this for starters? In our experiments we have a number of email lists. The name of the email lists should double as a chat room. When you have a question you post to the chat room. If no one replies, you create a more detailed (and formal) email.

Other ideas?

Did ATLAS Make a Big Mistake? December 16, 2010

Posted by gordonwatts in ATLAS, computers.

Ok. That is a sensationalistic headline. And, the answer is no. ATLAS is so big that, at least in this case, we can generate our own reality.

Check out this graphic, which I’ve pulled from a developer survey.


Ok, I apologize for this being hard to read. However, there is very little you need to read here. The first column is Windows users, the second Linux, and the third Mac. The key colors to pay attention to are red (Git), green (Mercurial), and purple (Subversion). This survey was completed just recently, with about 500 people responding. So it isn’t perfect… but…

Subversion, Mercurial, and Git are all source code version control systems. When an experiment says we have 10 million lines of code – all that code is kept in one of these systems. The systems are fantastic – they can track exactly who made what modifications to any file under their control. It is how we keep anarchy from breaking out as >1000 people develop the source code that makes ATLAS (or any other large experiment) go. Heck, I use Subversion for small little one-person projects as well. Once you get used to using them you wonder how you ever did without them.

One thing to note is that cvs, which is the grand-daddy of all version control systems and was the standard 10 or 15 years ago, doesn’t even show up. Experiments like CDF and DZERO, however, are still using it. The other thing to note… how small Subversion is, particularly amongst Linux and Mac users. It is still fairly strong on Windows, though I suspect that is in part because there is absolutely amazing integration with the operating system which makes it very easy to use. And the extent to which it is used on Linux and the Mac may also be influenced by the people who took the survey – they used Twitter to advertise it, and those folks are probably a little more cutting-edge on average than the rest of us.

Just a few years ago Subversion was huge – about the current size of Git. And therein lies the key to the title of this post. Sometime in March 2009 ATLAS decided to switch from cvs to Subversion. At the time it looked like Subversion was the future of source control. Oops!

No, ATLAS doesn’t really care for the most part. Subversion seems to be working well for it and its developers. And all the code for Subversion is open source, so it won’t be going away anytime soon. At any rate, ATLAS is big enough that it can support the project even if it is left as one of the only users. Still… this shift makes you wonder!

I’ve never used Git or Mercurial – both of which are a new type of distributed source control system. The idea is that instead of having a central repository where all changes to your files are tracked, each person has their own. They can trade batches of changes back and forth with each other without contacting the central repository. It is a technique that is used in the increasingly high-speed development industry (for things like agile programming, I guess). Also, I’ve often heard the term “social coding” applied to Git, though it sounds like that may have more to do with the GitHub web site than the actual version control system. It is certainly true that everyone I talk to raves about GitHub and things like it. While I might not get it yet, it is pretty clear that there is something to “get”.

I wonder if ATLAS will switch? Or, I should say, when it will switch! This experiment will go on 20 years. Wonder what version control system will be in ascendance in 10 years?

Update: Below, Dale included a link to a video of Linus talking about GIT (and trashing cvs and svn). Well worth a watch while eating lunch!

Linus on GIT– he really hates cvs and svn–and makes a pretty good case

The Particle Physics Version of the Anecdote December 13, 2010

Posted by gordonwatts in ATLAS, Hidden Valley.

Anecdotes are wonderful things, used (and misused) all the time. They tell great little stories, can be the seed of a new idea, or bring down an argument. Have something that is always true? Then you need but one anecdote to bring it tumbling to the ground. People fighting the evolution vs. creationism battle know this technique well! Of course, it is often misused too – an anecdote does not a theory make or break!

In experimental particle physics we have our own version of the anecdote: the event display. We use it mostly in the first sense above – as the seed of a new idea. Our eyes and brain are better at recognizing a new pattern than any computer algorithm currently known. I’ve often said that gut instinct plays a role in physics – and the event display is one place where we train that gut instinct!

Take, for example, this event display shown by Monica Verducci at the Discrete2010 conference:


You are looking at the inner detector of ATLAS – first (from inner to outer) are the highly accurate pixel detectors, then the silicon strip detectors, and finally all the dots of the transition radiation tracker (TRT). The hits from a simulated Hidden Valley event are shown. Now, to the average particle physicist, most of that display looks very normal and wouldn’t even raise an eyebrow – except for two features. Opposite each other, just above and below the horizontal, there are two plumes of particles. While plumes of particles (“jets”) are not uncommon, the fact that they draw to a point a long way – meters – from the center of the detector is. Very uncommon, in fact.

Your eye can pick those out right away. Perhaps, if you aren’t a particle physicist, you didn’t realize they were unusual, but I bet your eye caught them right away regardless. Now, the problem is to develop a computer algorithm to pick those guys out. It may look trivial – after all, something your eye gets that easily can’t be that hard – but that turns out not to be the case. Especially using full-blown tracking to find those guys… tracking that is tuned to find tracks that originate from the center of the detector. Just staring at it like this, I’m having a few ideas for things we could do to find those tracks.

Say you already have an algorithm, but it fails some 30% of the time. Then you might take 100 interactions that fail, make event displays of all of them, create a slide show, and then just watch them one after the other. If you are lucky you’ll start to see a pattern.

None of this proves anything, unfortunately. Anecdotes aren’t science. But they do lead to ideas that can be tested! Once you have an idea for the algorithm you can write some code – which is not affected by human bias! – and run it on your sample of interactions. Now you can test it, measure its performance, and see if your idea is going to work. By measuring, you’ve turned your anecdote into science.
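To make the “measure it” step concrete, here is a toy sketch – not real ATLAS code; the simulated sample, the 0.5 m displacement cut, and the 70% per-decay reconstruction efficiency are all made up for illustration:

```python
# Toy measurement of a pattern-recognition idea on simulated events.
# Each "event" is tagged with the true decay radius (in metres) of the
# hypothetical Hidden Valley particle; the algorithm and numbers are invented.
import random

random.seed(42)  # fixed seed so the measurement is reproducible

def displaced_jet_candidate(decay_radius_m, reco_probability=0.7):
    """Pretend algorithm: flags decays well away from the beamline,
    but (like real tracking) sometimes fails to reconstruct them."""
    return decay_radius_m > 0.5 and random.random() < reco_probability

# A made-up simulated sample: decay radii spread over a few metres.
events = [random.uniform(0.0, 4.0) for _ in range(10_000)]

# Measure the efficiency: of the truly displaced decays, how many do we find?
signal = [r for r in events if r > 0.5]
found = sum(displaced_jet_candidate(r) for r in signal)
efficiency = found / len(signal)
print(f"efficiency on displaced decays: {efficiency:.2f}")
```

The number that comes out is exactly the kind of unbiased performance figure that turns “this looks promising on one event display” into something you can defend in a collaboration meeting.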

That is what I mean when I say the event display can be the germ of an idea. I’ve seen this technique used a number of times in my field – though not enough! Our event displays are very hard to use, and so many of us (myself included) tend to turn to them only as a last resort. This is unfortunate, because when you are looking for some new sort of pattern recognition algorithm – as in this case – they are incredibly valuable. Another trend I’ve noticed: the older generation seems to resort to them much more quickly than the younger one. <cough>bubble chambers<cough>.

Just like with real anecdotes, we particle physicists misuse our event displays all the time. The most public example is when we show an event display at a conference and call it “a typical event.” You should chuckle. Any time you hear that, it is code for “we searched and searched for the absolutely cleanest event we could find, one that most clearly demonstrates what we want you to think of as normal – and that probably happens less than once a year.” <smile>

Wikileaks December 3, 2010

Posted by gordonwatts in politics.

Normally I try to stick to science and things affecting education (or just not write much), but I find the Wikileaks thing in the news lately fascinating for several reasons. The most mundane of which is that I’ve always been interested in international diplomacy, and this gives quite a peek into what conversations were actually going on. When I was reading lots of books (more on that in a future post, if I get back to posting), one of the topics was international relations. In many cases the authors would use public actions to infer the diplomacy that must have gone on behind closed doors. This gives us a direct glimpse into that – and I can’t wait to see more articles sifting through the cables.

I suspect how you feel about Wikileaks depends on where you sit on two issues. First, do you trust your government to do the right thing internationally? If the answer is no, then I would guess you will like anything that makes the government more transparent. The other axis is how much damage you think this does to the US’s ability to conduct international relations and, perhaps, how important you think it is to get unvarnished opinions from others around the globe without fear of their words being published. The balance of those two issues probably governs your basic reaction. Personally, I’m more concerned about the latter.

But that isn’t what prompted me to write this.

Everyone is going after Wikileaks. They have been kicked off servers in the UK, now out of Amazon, and as I write this their servers are offline or at least not responding. My guess is that since countries are out to get Wikileaks and its founder, its life is going to be short. But… I think it doesn’t matter what happens to Wikileaks.

When you break it down, Wikileaks is doing three things. Well, four things.

  1. First, Wikileaks receives the secrets. Someone sends them in with the expectation that Wikileaks will publish the secrets and do its best to keep the source anonymous.
  2. Wikileaks does do some filtering – removing names and other identifying details that might get people killed or otherwise put them in danger.
  3. Indexing and collation. This is especially true of the latest batch of cables – if Wikileaks had dumped the complete trove on the web it would have been a lot less accessible or interesting to most of us. Right now you can just browse their website by topic and look at the cables that concern each one.
  4. Finally they publish the secrets to the web and the world.

I think it is a given that as long as you have humans working with secrets you’ll have leaks. So I don’t think the supply of leaks in the world is going to dry up. I think this is especially true now that Wikileaks has shown everyone how easy it is to publish the secrets. But let’s say Wikileaks goes away, and that the world is successful in keeping new Wikileaks clones from being created. Now what?

Well, we already know how to widely publish something we aren’t supposed to with very little effort: file sharing! Once a file gets going it is very difficult to tell exactly where it came from – so it quickly becomes anonymous. That takes care of steps #1 and #4. What about #3? I’m going to go with crowdsourcing here. The idea is that some data is published, and then the people who are interested comb through it and publish their findings online – blogs, tweets, etc. That takes care of #3.

This isn’t without problems. For example, #2 won’t be dealt with – or rather, you’ll have to rely on every single person combing through the data to do it. Second, #4 isn’t going to be as nice – the material will be spread all over the web. Perhaps more serious is the way #4 will occur – most people who put up the information will also put up an interpretation, perhaps cherry-picking the contents. We saw a classic example of this with ClimateGate (http://en.wikipedia.org/wiki/Climategate). I’m also not sure how well #1 is going to work. Doing that with file sharing isn’t trivial: you have to have a computer connected to the net, up and running long enough to advertise that you have the file – something someone leaking a secret may not want to do. There are other methods, but it is unlikely that the person who has access to the secrets also knows how to publish them anonymously and effectively.

Nonetheless, the information is out there for all to see. Now that these leaked secrets have gotten so much publicity, people will start to realize how easy it is. So while Wikileaks might disappear, I think the cat is out of the bag, so to speak. You know the saying, right? Don’t write anything in email you wouldn’t want repeated, even if it is a private email… well… more proof!