
Jumping the Gun April 4, 2011

Posted by gordonwatts in Uncategorized.
16 comments

The internet has come to physics. Well, I guess CERN invented the internet, but, when it comes to science, our field usually moves at a reasonable pace – not too fast, but not (I hope) too slow. That is changing, however, and I fear some of the reactions in the field.

The first I heard about this phenomenon was some results presented by the PAMELA experiment. The results were very interesting – perhaps indicating dark matter. The scientists showed a plot at a conference to show where they were, but explicitly didn’t put the plot into any public web page or paper to indicate they weren’t done analyzing the results or understanding their systematic errors. A few days later a paper showed up on arXiv (which I cannot locate) using a picture taken during the conference while the plot was being shown. Of course, the obvious thing to do here is: not talk about results before they are ready. I and most other people in the field looked at that and thought that these guys were getting a crash course in how to release results. The rule is: you don’t show anything until you are ready. You keep it hidden. You don’t talk about it. You don’t even acknowledge the existence of an analysis unless you are actually releasing results you are ready for the world to get its hands on and play with as it may.

I’m sure something like that has happened since, but I’ve not really noticed it. But a paper out on the archives on April 1 (yes) seems to have done it again. This is a paper on a Z’ set of models that might explain a number of the small discrepancies at the Tevatron. A number of the results they reference are released and endorsed by the collaborations. But there is one source that isn’t – it is a thesis: Measurement of WW+WZ Production Cross Section and Study of the Dijet Mass Spectrum in the l-nu + Jets Final State at CDF (really big download). So here are a group of theorists, basically, announcing a CDF result to the world. That makes me a bit uncomfortable. What is worse, however, is how they reference it:

In particular, the CDF collaboration has very recently reported the observation of a 3.3σ excess in their distribution of events with a leptonically decaying W± and a pair of jets [12].

I’ve not seen any paper released by the CDF collaboration yet – so that above statement is definitely not true. I’ve heard rumors that the result will soon be released, but they are rumors. And I have no idea what the actual plot will look like once it has gone through the full CDF review process. And neither do the theorists.

Large experiments like CDF, D0, ATLAS, CMS, etc. all have strict rules on what you are allowed to show. If I’m working on a new result and it hasn’t been approved, I am not allowed to even show my work to others in my department except under a very constrained set of circumstances*. The point is to prevent this sort of paper from happening. But a thesis, which was the source here, is a different matter. All universities that I know of demand that a thesis be public (as they should). And frequently a thesis will show work that is in progress from the experiment’s point of view – so they are a great way to look and see what is going on inside the experiment. However, now with search engines one can do exactly the above with relative ease.

There is plenty of potential for over-reaction here.

On the experiment’s side they may want to put restrictions on what can be written in a thesis. This would be punishing the student for someone else’s actions, which we can’t allow.

On the other hand, there has to be a code-of-standards that is followed by people writing papers based on experimental results. If you can’t find the plot on the experiment’s public results pages then you can’t claim that the collaboration backs it. People scouring the theses for results (as you can bet there will be more now) should get a better understanding of the quality level of those results: sometimes they are exactly the plots that will show up in a paper, other times they are an early version of the result.

Personally, I’d be quite happy if results found in theses would stimulate conversation and models – and those could be published or submitted to the archive – but then one would hold off making experimental comparisons until the results were made public by the collaboration.

The internet is here – and this information is now available much more quickly than before. There is much less hiding-thru-obscurity than there has been in the past, so we all have to adjust.

* Exceptions are made for things like job interviews, students presenting at national conventions, etc.

Update: CDF has released the paper

Digitize the world of books March 26, 2011

Posted by gordonwatts in Books, physics life.
4 comments

Those of you watching would have noticed that a judge threw a spanner in the plans of Google to digitize the world’s book collection:

The company’s plan to digitize every book ever published and make them widely available was derailed on Tuesday when a federal judge in New York rejected a sweeping $125 million legal settlement the company had worked out with groups representing authors and publishers.

I am a huge fan of the basic idea. Every book online and digital and accessible from your computer. I’m already almost living the life professionally: all the journal articles I use are online. The physics preprint archive, arXiv.org, started this model and as a result has spawned new types of conversation – papers that are never submitted to journals. Pretty much the only time I walk over to the library is to look at some textbook up there. The idea of doing the same thing to all the books – well I’m a huge fan.

However, I do not like the idea of one company being the gateway to something like that. Most of the world’s knowledge is written down in one form or another – it should not be locked away behind some wall that is controlled by one company.

I’d rather see a model where we expect, in the long term, that all books and copyrighted materials will eventually enter the public domain. At that point they should be easily accessible online. When you think of the problem like this it seems like there is an obvious answer: the Library of Congress.

Copyrighted books are a tougher nut to crack. There, publishers and authors presumably will still want to make money off this. And making out-of-print books available will offer some income (though not much – there is usually a reason those books are out of print). In this case the Google plan isn’t too bad – but having watched journals price gouge because they can, I’m very leery of seeing this happen again here. I’d rather see an independent entity set up that will act as a clearing house. Perhaps they aren’t consumer facing – rather they sell access and charge for books to various companies that then make the material available to us end users. This model is similar to what is done in the music business. I purchase (or rent) my music through Zune – I don’t deal directly with any of the record labels. The only problem is that this model doesn’t have competition to keep prices down (i.e. nothing stops this one entity from price gouging).

Lastly, I think having all this data available will open a number of opportunities for things we can’t think of now. But I think that we need to make sure the data is also available in a raw form so that people can innovate.

Print books are dying. Some forms will take longer than others – I would expect the coffee table picture book to take longer before it converts to all digital than a paper-back novel. But I’m pretty confident that the switch is well underway now. What we do with all the print books is a crucial question. I do think we should be spending money on moving these books into the digital age. Not only are they the sum of our knowledge, but they are also a record of our society.

Under Attack March 23, 2011

Posted by gordonwatts in DOE, university, University of Washington.
7 comments

I’ve been trying not to make a comment on the budget situation in the USA. Or on the current discussion about teacher pay and benefits. Or about the state of science funding in this budget atmosphere. Or the drive to eliminate the Department of Education. Or the revival of the teach the controversy push. Others have made the case much more eloquently than I could have. This is more of a personal take on some of this: I’ve never felt under attack quite the way I do right now.

There seems to be a concerted attack on science funding in the US at the federal level. The feds fund most research that is too long term for a company to fund – which is more and more of it as the stock market forces companies to think more and more short term. A healthy research program in a country needs to contain a balance for the sake of the long-term health of the economy. And a healthy economy is the only way to make jobs. The large cuts that are reputed to befall the Office of Science, which funds most of the national labs, will force lab closures. Facilities where we do science – gone! 1000’s of people laid off. Heck, if you are trying to cut out 60 billion you can take a guess as to how many jobs that is worth. At $100,000 per person per year – so really nice jobs! – that is another 0.6 million added to the unemployment rolls. Right. That’s going to turn out well!

Second is this constant discussion about teacher pay. I’ve seen comments on newspaper articles with statements like “we are just paying them to babysit our kids.” Seriously?? Maybe we should just eliminate the schools and have the kids all at home. No formalized education system. Now, that has never been done before! And so obviously it must be better! Oh… wait. I guess it has been done before. I think it was called the middle ages… Arrgh! Yes, our K-12 system needs some real work. But beating the crap out of teachers in newspapers is not the way to get good people into the classroom! And the idea that teachers are overpaid? Seriously? [I’m not trying to channel Grey’s Anatomy here] I find that hard to believe. Perhaps they are getting better retirement plans for what they are paid – but I suspect that is because when the unions couldn’t negotiate a pay raise they went for an increase in the pension instead. I wonder if you paid teachers a more fair wage, but kept their pension plans the same size, if the rate would be more in line with normal?

On a more local note, one of our state legislators was heard to say “Higher education is a luxury we no longer can afford.” I don’t even know where to start with that. Washington is like every other state, it has some rich people and some poor people. UW is a state school – the state provides subsidies for the in-state students to make it more affordable. A robust state and federal scholarship program backfills for people really in need. The idea is if you are good and you want to get a higher level education, the federal government, the state government, and the university will do their best to make sure that finances do not get in your way. This has been a bedrock of all higher education in the USA for many years now. Do we go back to a class based system? What are people thinking, really? I get they are trying to cut the budget, but think for a few minutes about the implications of what you are saying!

And to those who say education is radically more expensive than it has been in the past – at the UW it is definitely true that the cost an in-state student pays has gone up a lot over the last 10-15 years. Definitely more than inflation (by a bit). But if you look at the amount of $$ the university pays to educate a single student, that has remained almost constant. Wait. For. It… That is right! State support has dropped dramatically. So the university has to cut expenses and find other sources of income – i.e. raise tuition. Blaming the university for this is misplaced. Last year in the state of Washington after the state legislature cut the UW funding by 26% the university raised tuition by 14% over two years. Legislators were known to stand up at town halls, etc., and express their displeasure at UW for doing that in hard economic times. I’m happy with them being displeased – I was displeased – but at least be honest and say that the state cut 26% of the university’s funding. It isn’t like that was a capricious raise!

Next is the push to increase the teaching load. I currently teach one class a quarter – so three a year (I get paid for only the 9 months that I’m teaching – I have to find my own funding for the rest of the year). That one class is about 3 hours in the classroom in front of students. Pretty cushy, eh!? I taught graduate particle physics this year. This is my third year so I’d like to think that I know it by now (not) – but all told during the week it would take about 20 hours of my time. The first time I taught it – when I had to teach myself some field theory – it was taking more like 50 hours a week. When I teach the easier undergraduate courses I tend to have 100’s of students – so it also works out to be about 20 hours a week. Some weeks a lot less, some a lot more. So, it would seem I have at least enough time to take on another course! Except there is one big problem here – my job isn’t just to teach undergraduates. My job is to also teach graduate students, mentor post-docs, and do research. UW is the #1 public institution in the USA when it comes to bringing in $$ from grants. You add another class, then you will effectively change the nature of the University of Washington – make it a teaching institution rather than a research institution. The ramifications of something like that are huge – rankings, desirability, research & undergrads, etc. Do people who say things like this understand how all this is connected?

This last election brought in a lot of new people (at least at the federal level). I remember being elected to a few positions having to do with HEP. I had all sorts of ideas – but I discovered when I arrived that all the decisions that had been made were made for a reason! They weren’t arbitrary. You can’t go wrecking around like a bull in a china shop – you have to carefully consider what you are doing and the ramifications. I get the feeling many of these new folks just don’t care. Really just don’t care. Even worse, they don’t know history – which means they are doomed to repeat it. Many of the ideas on the table around America have been tried before – if not here, then other places. I would love them to take a careful look. There is plenty of room for new things to achieve some of the same goals – why not try them rather than closing your eyes and just letting the knife fall where it may? In physics we call this a “prescale” – we just randomly throw out data because we have too much. Here we are randomly throwing out programs because we have too little. In both cases this is an implicit admission of defeat: we aren’t smart enough to make a strategic cut.

Ok. Enough. Thank goodness there is a counter balance in most cases to these drives to change things so radically. It won’t be pleasant, but the system is too large and what comes out of it too valuable to actually destroy it in a few short years, despite best efforts of some. Now that I’ve vented, back to working on my classes and my research!

Update: Fixed “under paid” –> “over paid”. Of all the typos!

We’re Broke… or not… where is the data!? January 26, 2011

Posted by gordonwatts in DOE, NSF, science, University of Washington, USA.
6 comments

It is hard for me not to feel very depressed about the way government funding is going in Washington. Especially all the “cuts” that keep being mentioned. So I thought I’d spend an hour doing my best to understand what cuts are being talked about. Ha! Sheer fantasy!

Before I write more, I should point out that I very much have a dog in this race. Actually, perhaps a bit more than one dog. Funding for almost all my research activities comes via the National Science Foundation (NSF) – this is funded directly by Congress. My ability to hire post-docs and graduate students, train them, do the physics – everything, is dependent on that stream of money. Also, two months of salary a year come from that stream. In short, almost everything except for the bulk of my pay. That comes from two sources: the state of Washington and student tuition. A further chunk of money comes from the Department of Energy’s (DOE) Office of Science – they fund the national labs where I do my research, for example. In short, particle physics does not exist without government funding.

So when people start talking about large, across-the-board cuts in funding levels I get quite nervous. Many Republicans in 2010 campaigned on cutting back the budget, hard:

“We’re broke, and decisive action is needed to help our economy get back to creating jobs and end the spending binge in Washington that threatens our children’s future,” Mr. Boehner said.

Up until recently they really haven’t said how they were going to do it – a typical political ploy. But now things are starting to show up: cut funding to 2008 levels, and then no increases to counter inflation. The latter amounts to a 2-3% cut per year. Not so bad for one year but when you hit 3-4 years it starts to add up. You’ll have to let go a student or perhaps down-size a post-doc to a student.
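To put a rough number on how a flat budget adds up, here is a minimal sketch. It assumes a constant inflation rate and a budget frozen in nominal dollars; the function name and the 2.5% rate (the middle of the 2-3% range above) are illustrative choices, not figures from any budget document.

```python
def real_value_after(years, inflation):
    """Fraction of the original purchasing power left after `years` of a
    nominally flat budget, assuming constant annual inflation."""
    return 1.0 / (1.0 + inflation) ** years


# Illustrative assumption: 2.5% inflation every year.
for years in (1, 2, 3, 4):
    loss = 1.0 - real_value_after(years, 0.025)
    print(f"after {years} year(s): {loss:.1%} effective cut")
```

Under that assumption, four flat years come to a bit over a 9% cut in real terms.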

But what about all these other cuts? So… I’m a scientist and I want to know: Where’s the data!? Well, as any of you who aren’t expert in the ways of Washington know… boy, is it hard to figure out what they really want to do. I suppose this is to their advantage. I did find out some numbers. For example, here is the NSF’s budget page. The 2008 funding level was $6.065 billion. In 2010 it was funded at a rate of $6.9 billion. So dropping from 2010 back to 2008 would be a 12% cut. So, if that was cut blindly (which it can’t be – there are big projects and small ones and some might be cut or protected), that would translate into the loss of about one post-doc, perhaps a bit more. In a group our size we would definitely notice that!

But is that data right? While I was searching the web I stumbled on this page, from the Heritage Foundation, which seems to claim reducing the NSF to 2008 levels will save $1.7 billion, about twice what it looks like above. Who is right? I know I tend to believe the NSF’s web page is more reliable. But, seriously, is it even possible for a citizen who doesn’t want to spend days or weeks to gather enough real data to make an independently informed decision?
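The comparison above is simple enough to check by hand, but here is a small sketch of the arithmetic using only the three numbers quoted in this post. The function and variable names are mine.

```python
def percent_cut(current, target):
    """Percentage reduction needed to go from `current` down to `target`."""
    return 100.0 * (current - target) / current


nsf_2008 = 6.065         # billions of dollars, from the NSF budget page
nsf_2010 = 6.9           # billions of dollars, from the NSF budget page
heritage_savings = 1.7   # billions of dollars, the Heritage Foundation claim

implied_savings = nsf_2010 - nsf_2008
print(f"cut to reach the 2008 level: {percent_cut(nsf_2010, nsf_2008):.1f}%")
print(f"implied savings: ${implied_savings:.3f} billion")
print(f"Heritage claim / implied savings: {heritage_savings / implied_savings:.1f}")
```

The implied savings come to about $0.8 billion, which is where the factor of two against the $1.7 billion claim comes from.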

Check out this recent article from the NYTimes about a recent proposal coming from Congressman Jordan whose goal is to reduce federal spending by $2.5 trillion through fiscal year 2021 (am I the only one that finds the wording of that title misleading?). As a science/data guy the first thing I want to know is: where is he getting all that savings from? There are lists of programs that are eliminated, frozen, or otherwise reduced – but that document contains no numbers at all. And I can’t find any supporting documentation that he and his staff must have in order to have made that $2.5 trillion claim. So, in that document, which is 80 pages long, I’m left scanning for the words “national science foundation”, “science”, “energy”, etc. Really, there is very little mentioned. But I have a very hard time believing that those programs are untouched – as the article in the New York Times points out, things like Medicare, Social Security, etc., are left untouched (the lion’s share of the budget – especially in out years), and so all the cuts must come from other programs:

As a result, its effect on the entire array of government programs, among them education, domestic security, transportation, law enforcement and medical research, would be nothing short of drastic.

I agree with that statement. $2.5 trillion is a lot of cash! Can you find the drastic lines in that document? Well, perhaps you know more about Washington. I can’t. This gets to me because now if I have to get into an argument it is a very abstract one.

Pipedream: What I would love these folks to do is release a giant spreadsheet of the US gov’t spending that had 2008, 2009, 2010 levels, and then their proposed cuts, with an extra column for extra text. That is a lot of data, and would probably be hard to compile. But, boy, it would be nice!

Tests are Good for You January 21, 2011

Posted by gordonwatts in Teaching, university, University of Washington.
3 comments

The New York Times had an article the other day talking about a discovery that is making the rounds:

Taking a test is not just a passive mechanism for assessing how much people know, according to new research. It actually helps people learn, and it works better than a number of other studying techniques.

I’m here to tell you: duh!

In fact, we’ve institutionalized this in our physics graduate schools. Most university physics departments have the mother-of-all tests. Here at UW we call it the Qualifying Exam. Others call it a prelim (short for preliminary). And there is a joke associated with this exam, usually said with some bitterness if you’ve not passed it yet, or some wistfulness if you long since have passed it:

You know more physics the day you take the qual than you ever do at any other time in your life.

The exam usually happens at the end of your first year in graduate school. The first year classes are hell. Up to that point in my life it was the hardest I’d ever worked at school. Then the summer hits, and you get a small rest. But it is impossible to rest staring down the barrel of that exam, often given at the end of the summer just before the second year of classes start. You have to pass this exam in order to go on to get your Ph.D. And for most of us, it is the last (formal) exam in our career that actually matters. So psychologically, it is a big hurdle as well.

How hard is it? My standard advice to students is that they should spend about one month studying, 8 hours a day. For most people, if they study effectively, that is enough to get by. Some need less and some need more. This is about what it took me. What is the test like? At UW ours is 2 hours per topic, closed book, and all it is is working out problems. No multiple choice here! It lasts two days.

So, how do you study? There is, I think, really only one way to get past this. For 30 days, 8 hours a day, work out problems. There are lots of old qualifier problems on websites. Our department provides students with copies of all the old exams. Even if you don’t know the solution, you force yourself to try to work it out without looking it up in a book – break your brain on it. Once you can solve those problems without having to look at a textbook, you know you are ready. Imagine trying to study by reading a textbook, or by reviewing your first year homework problems. There is no way your brain will be able to work out a new problem after that unless you are a very unique individual.

Note how similar this is to the results shown in the article:

In the first experiment, the students were divided into four groups. One did nothing more than read the text for five minutes. Another studied the passage in four consecutive five-minute sessions.

A third group engaged in “concept mapping,” in which, with the passage in front of them, they arranged information from the passage into a kind of diagram, writing details and ideas in hand-drawn bubbles and linking the bubbles in an organized way.

The final group took a “retrieval practice” test. Without the passage in front of them, they wrote what they remembered in a free-form essay for 10 minutes. Then they reread the passage and took another retrieval practice test.

The last group did the best, as you might imagine from the theme of this post!

This is also how you know more physics than at any other time in your life. At no other time do you spend 30 days working out problems across such a broad spectrum of physics topics. If you study and try to work out a sufficiently broad spectrum of problems you can breeze through the exam (literally, I remember watching one guy taking it with me just nail the exam in about half the time of the rest of us).

Working out problems  – without any aids – is active learning. I suppose you could follow the article and say that forcing the brain to come up with the solution means it organizes the information in a better way… Actually, I have no idea what the brain does. But, so far this seems to be the best way to teach yourself. You are actively playing with the new concepts and topics. This is why homework is absolutely key to a good education. And this is why tests are good – if you study correctly. If you actively study for the test (vs. just reading the material) then you will learn the material better.

And we need to work better at designing tests that force students to study actively. For example, I feel we are slipping backwards sometimes. With the large budget cuts that universities are suffering, one byproduct is that the amount of money we have to hire TA’s to help grade our large undergraduate classes is dropping. That means we can’t ask as many open-ended exam questions – and have to increase the fraction of multiple choice. It is much harder to design a test that goes after problem solving in physics using multiple choice. This is too bad.

So, is this qualifier test a hazing process? Or is there a reason to do it? Actually, that is a point of controversy. Maybe there is a way to force the studying component without the high-anxiety of the make-or-break exam. Certainly some (very good) institutions have eliminated the qual. Now, if we could figure out how to do that and still get the learning results we want…

16,000 Physics Plots January 12, 2011

Posted by gordonwatts in ATLAS, CDF, CMS, computers, D0, DeepTalk, physics life, Pivot Physics Plots.
4 comments

Google has 20% time. I have Christmas break. If you work at Google you are supposed to have 20% of your time to work on your own little side project rather than the work you are nominally supposed to be doing. Lots of little projects are started this way (I think GMail, for example, started this way).

Each Christmas break I tend to hack on some project that interests me – but is often not directly related to something that I’m working on. Usually by the end of the break the project is useful enough that I can start to get something out of it. I then steadily improve it over the next months as I figure out what I really wanted. Sometimes they never get used again after that initial hacking time (you know: fail often, and fail early). My deeptalk project came out of this, as did my ROOT.NET libraries. I’m not sure others have gotten a lot of use out of these projects, but I certainly have. The one I tackled this year has turned out to be a total disaster. Interesting, but still a disaster. This plot post is about the project I started a year ago.  This was a fun one. Check this out:

image

Each of those little rectangles represents a plot released last year by DZERO, CDF, ATLAS, or CMS (the Tevatron and LHC general purpose collider experiments) as a preliminary result. That huge spike in July – 3600 plots (click to enlarge the image) – is everyone preparing for the ICHEP conference. In all, the 4 experiments put out about 6000 preliminary plots last year.

I don’t know about you – but there is no way I can keep up with what the four experiments are doing – let alone the two I’m a member of! That is an awful lot of web pages to check – especially since the experiments, though modern, aren’t modern enough to be using something like an Atom/RSS feed! So my hack project was to write a massive web scraper and a Silverlight front-end to display it. The front-end is based on the Pivot project originally from MSR, which means you can really dig into the data.

For example, I can explode December by clicking on “December”:

image

and that brings up the two halves of December. Clicking in the same way on the second half of December I can see:

image

From that it looks like 4 notes were released – so we can organize things by notes that were released:

image

Note the two funny icons – those allow you to switch between a grid layout of the plots and a histogram layout. And after selecting that we see that it was actually 6 notes:

image

 

That left note is titled “Z+Jets Inclusive Cross Section” – something I want to see more of, so I can select that to see all the plots at once for that note:

image

And say I want to look at one plot – I just click on it (or use my mouse scroll wheel) and I see:

image

I can actually zoom way into the plot if I wish using my mouse scroll wheel (or typical touch-screen gestures, or on the Mac the typical zoom gesture). Note the info-bar that shows up on the right hand side. That includes information about the plot (a caption, for example) as well as a link to the web page where it was pulled from. You can click on that link (see caveat below!) and bring up the web page. Even a link to a PDF note is there if the web scraper could discover one.

Along the left hand side you’ll see a vertical bar (which I’ve rotated for display purposes here):

image

You can click on any of the years to get the plots from that year. “Recent” will give you the last 4 months of plots. By default, this is where the viewer starts up – seems like a nice compromise between speed and breadth when you want to quickly check what has recently happened. The “FS” button (yeah, I’m not a user-interface guy) is short for “Full Screen”. I definitely recommend viewing this on a large monitor! “BK” and “FW” are like the back and forward buttons on your browser and enable you to undo a selection. The info bar on the left allows you to do some of this if you want to.

Want to play? Go to http://deeptalk.phys.washington.edu/ColliderPlots/… but first read the following. And feel free to leave suggestions! And let me know what you think about the idea behind this (and perhaps a better way to do this).

  • Currently works only on Windows and a Mac. Linux will happen when Moonlight supports v4.0 of Silverlight. For Windows and the Mac you will have to have the Silverlight plug-in installed (if you are on Windows you almost certainly already have it).
  • This thing needs a good network connection and a good CPU/GPU. There is some heavy graphics lifting that goes on (wait till you see the graphics animations – very cool). I can run it on my netbook, but it isn’t that great. And loading when my DSL line is not doing well can take upwards of a minute (when loading from a decent connection it takes about 10 seconds for the first load).
  • You can’t open a link to a physics note or webpage unless you install this so it is running locally. This is a security feature (cross site scripting). The install is lightweight – just right click and select install (control-click on the Mac, if I remember correctly). And I’ve signed it with a certificate, so it won’t get messed up behind your back.
  • The data is only as good as its source. Free-form web pages are a mess. I’ve done my best without investing an inordinate amount of time on the project. Keep that in mind when you find some data that makes no sense. Heck, this is open source, so feel free to contribute! Updating happens about once a day. If an experiment removes a plot from their web pages, then it will disappear from here as well at the next update.
  • Only public web pages are scanned!!
  • The biggest hole is the lack of published papers/plots. This is intentional because I would like to get them from arxiv. But the problem is that my scraper isn’t intelligent enough when it hits a website – it grabs everything it needs all at once (don’t worry, the second time through it asks only for headers to see if anything has changed). As a result it is bound to set off arxiv’s robot sensor. And the thought of parsing TeX files for captions is just… not appealing. But this is the most obvious big hole that I would like to fix at some point soon.
  • This depends on public web pages. That means if an experiment changes its web pages or where they are located, all the plots will disappear from the display! I do my best to fix this as soon as I notice it. Fortunately, these are public facing web pages so this doesn’t happen very often!
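The "headers only" second pass mentioned in the list above can be sketched in a few lines. This is a minimal Python illustration, not the project's actual code: the function names `head_request` and `has_changed` are made up for the example, and it assumes the server sends an `ETag` or `Last-Modified` header, which not every web server does.

```python
import urllib.request


def head_request(url):
    """Build (but do not send) a HEAD request, which asks the server for headers only."""
    return urllib.request.Request(url, method="HEAD")


def has_changed(cached, fresh):
    """Decide whether a page needs to be downloaded again.

    Both arguments are plain dicts of header name -> value: `cached` holds the
    headers saved on the previous pass, `fresh` the ones from a new HEAD request.
    """
    for key in ("ETag", "Last-Modified"):
        if key in cached and key in fresh:
            return cached[key] != fresh[key]
    # No validator in common: assume the page changed and re-download to be safe.
    return True
```

A real scraper would send the request with `urllib.request.urlopen(head_request(url))` and would probably also want a pause between requests so that it does not trip a site's robot detection.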

Ok, now for some fun. Who has the most broken links on their public pages? CDF by a long shot. Who has the pages that are most machine readable? CMS and DZERO. But while they are that, the images have no captions (which makes searching the image database for text words less useful than it should be). ATLAS is a happy medium – their preliminary results are in a nice automatically produced grid that includes captions.

The Ultimate Logbook January 8, 2011

Posted by gordonwatts in logbooks, physics life.
2 comments

I couldn’t leave this alone. I mentioned the ultimate logbook in my last posting. This is the logbook that would record everything you did and archive it.

It isn’t difficult. The web already has a perfect data format for this – Atom (or RSS). Just imagine. Each source code repository you commit to would publish a feed of all of your changes (with a time stamp, of course!) in the Atom format. Heck, your computer could keep track of what files you edited and publish a list of those too (many cloud storage services already do this). Make a plot in ROOT? Sure! A feed could be published. Ran a batch job? The command you used for submission could be published.

Then you need something central that is polling those RSS feeds with some frequency, gathering the data, and archiving it. Oh, and perhaps even making it available for easy use.
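To make the publish-then-poll idea concrete, here is a rough Python sketch using only the standard library. Everything in it is an assumption made for illustration: the helper names (`make_entry`, `make_feed`, `archive`), the `urn:` style ids, and the in-memory dict standing in for the archive. The feed it builds is also stripped down; a valid Atom feed needs feed-level `id`, `title` and `updated` elements as well.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"


def q(tag):
    """Return the namespace-qualified name ElementTree expects for an Atom tag."""
    return "{%s}%s" % (ATOM, tag)


def make_entry(entry_id, title, updated):
    """Build one Atom entry, e.g. for a commit, a plot, or a batch submission."""
    entry = ET.Element(q("entry"))
    ET.SubElement(entry, q("id")).text = entry_id
    ET.SubElement(entry, q("title")).text = title
    ET.SubElement(entry, q("updated")).text = updated
    return entry


def make_feed(entries):
    """Wrap entries in a (simplified) Atom feed and return it as a string."""
    feed = ET.Element(q("feed"))
    for entry in entries:
        feed.append(entry)
    return ET.tostring(feed, encoding="unicode")


def archive(feed_xml, store):
    """Copy entries not seen before into `store` (id -> (updated, title)).

    Returns the number of new entries, so polling the same feed twice adds nothing.
    """
    added = 0
    for entry in ET.fromstring(feed_xml).findall(q("entry")):
        entry_id = entry.findtext(q("id"))
        if entry_id not in store:
            store[entry_id] = (entry.findtext(q("updated")), entry.findtext(q("title")))
            added += 1
    return added
```

The central service would call `archive` on each feed at whatever polling frequency it likes; keying on the entry id is what keeps repeated polls from duplicating the log.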

Actually, there is a service that does this already. Facebook. Sure! Just tell it about every RSS feed and it will suck that data in. Some of you are probably reading this on Facebook – and this posting got there because I told Facebook about this blog’s Atom feed and it sucked the data in.

Of course, having a write-only repository of everything you did is a little less than useful. You need a powerful search engine to bring the data you are interested in back out. Especially because a lot of that data is just a random command which contains no obvious indication of what you were working on (i.e. no meta-data).

And finally, at least for me, I don’t really want something that is static. Rarely is there a project that I’m finished with and I can neatly wrap it up and move on. Heck, there are projects I put down and pick up again many months later. This ultimate logbook doesn’t really support that.

Perhaps it is best to split the functions. Call this ultimate logbook a daily log instead, and then keep separate bits of paper where you do your thinking… Awww heck, right back to where we started!

BTW, if you think Facebook might be interesting as a solution here, remember several things. First, as far as I can tell, there is no way to search your comments or posts. Second, you might get ‘Zuckenberged’ – that is, the privacy settings might get changed and your logbook might become totally public.

Log Book Follow-up January 5, 2011

Posted by gordonwatts in logbooks, physics life.
1 comment so far

Starting back in March I wrote a bunch of posts on logbooks: where do you keep your log book?, what do you keep in it? (and more of what you put in it). I can’t help it. The logbook is near and dear to my heart. I promised a follow-up posting. Finally… In summary (nothing in any particular order):

  • What goes into a log book: pictures, code, text, screenscrapes, files, plots, handwriting, paper
  • What do you use: Evernote, old style (bound notebook), loose paper, wiki/twiki, Yojimbo, Google Wave, email (as in email a plot to yourself), TiddlyWiki, blogging software, text file, DEVONthink Personal, Journler (now defunct).

One thing I didn’t ask about but all of you contributed anyway was how the logbook got used (there is no right way – the logbook has to work for you, of course):

  • Gave up – nothing but an inbox
  • Just keep track of thinking
  • Exploded: link services to track papers, paper for jotting down notes, email, etc. – a bit of everything
  • Every last thing goes into the logbook, including bathroom breaks.

No one mentioned using a kindle/nook to read their logbook, btw. For software that gets used most like a logbook it looks to me like Evernote wins.

For me the most surprising method was email. And by surprising, I mean smacking myself on the forehead because I’d not already thought of it. Here is the idea: just email your log book entries – with files and attachments, etc., to your logbook email account. Then use the power of search to recover whatever you want. And since you can stick it on Gmail or Hotmail or Yahoo mail, you have almost no size restrictions – and it is available wherever you happen to have an internet connection. Further, since it is just email, it is trivial to write scripts to capture data and ship it off to the logbook.
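As a rough idea of what such a script might look like, here is a minimal Python sketch using the standard library's `email` and `smtplib` modules. The address `logbook@example.org`, the `localhost` mail server and the function names are placeholders invented for the example; a real setup on Gmail or similar would also need authentication and TLS.

```python
import smtplib
from email.message import EmailMessage


def make_log_entry(subject, body, attachment_name=None, attachment_bytes=None,
                   logbook_address="logbook@example.org"):
    """Build a logbook entry as an email, optionally carrying one file."""
    msg = EmailMessage()
    msg["To"] = logbook_address
    msg["From"] = logbook_address
    msg["Subject"] = subject
    msg.set_content(body)
    if attachment_bytes is not None:
        # Generic binary type; a plot could use maintype="image", subtype="png".
        msg.add_attachment(attachment_bytes, maintype="application",
                           subtype="octet-stream", filename=attachment_name)
    return msg


def send(msg, host="localhost"):
    """Ship the entry off through an SMTP server (hypothetical host)."""
    with smtplib.SMTP(host) as server:
        server.send_message(msg)
```

A plotting script could then end with something like `send(make_log_entry("dijet mass, first look", "cuts v3", "mjj.png", png_bytes))`, and the mail provider's search does the rest.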

Now, I’ll ramble a bit in way of conclusion…

Do you remember Microsoft’s failed phone, the Kin? It was basically a smart phone w/out the apps. But one of the cool things it did was called Kin Studio. The point was this – everything you did on the phone was uploaded to the cloud. All the text messages you sent or received, all the pictures you took, etc. Then on the web you could look back at any time at what you did and have a complete record. Now, that is a logbook.

Of course, there are some problems with this. Who wants to look at lots of messages that say “ok!” or “ttl” or similar? And the same problem would occur if we were able to develop the equivalent of the Kin Studio for logbooks. It would be a disaster. Which I think gets to the crux of what many of you were wrestling with in the comments of those posts (and something I wrestle with all the time): what do you put in a logbook!? There is a part of me that would like to capture everything – the ultimate logbook. Given today’s software and technology this wouldn’t be very hard to write!

In thinking about this I came up with a few observations of my own behavior over the last few years:

One way to look at this is: what do you look up in a logbook? I have to say – what I look up in my logbook has undergone some dramatic changes since I was a graduate student. Back then we didn’t have the web (really) or search engines. As a result writing down exactly what I needed to do to get some bit of code working was very important. Now it is almost certain I can find a code sample on the web in one or two searches. So that doesn’t need to go into the logbook anymore. Plots still go in – but 90% of them are wrong. You know – you make the plot, think you are done, move on to the next step and in the process discover a mistake – so you go back and have to remake everything. And put the updated version of the plot into your logbook. Soon it becomes a waste of time – so you just auto-generate a directory with all the plots. So it always has the latest-and-greatest version. Hopefully you remember to put some of those into your logbook when you are done… but often not (at least me).

What is the oldest logbook entry you’ve ever gone back to? For me it was the top discovery – but that was nostalgia, not because I needed some bit of data. I rarely go back more than a few months. And, frankly, in this day and age, if you do an analysis that is published in January, by July someone (perhaps you) has redone it with more data and a better technique. You need those January numbers to compare – but you get them from an analysis note, not from your logbook! In short, the analysis note has become the “official” logbook of the experiment.

I have to say that my logbook currently serves two functions: meeting notes and thinking. Meeting minutes are often not recorded – so keep a record. Especially since I’m using an electronic notebook I can mark things with an “action” flag and go back later to find out exactly what I need to do as a result of that meeting. The second heaviest use for me is brainstorming. Normally one might scribble ideas on some loose paper, perhaps leave them around for a day or two, come back, refine them, etc. I use my logbook for that rather than loose paper.

Nowadays I definitely do not keep a log book in the traditional way. Certainly not in the way I was taught to use a logbook in my undergraduate physics classes! Here is a quote from an ex-student of mine (in the comments of one of the previous posts – and I can copy this because he already has a job!!):

I have a rather haphazard attitude toward these things–I have a logbook, but I use it to remember things and occasionally to sort out and prioritize my thoughts. So it’s fairly sparse, and it certainly would be of no help in a patent dispute! Often I keep my old working areas around on my computer, and I use them if I forget what I did in my previous work.

This is pretty typical of what I see in people around me in the field. Other commenters made reference to more careful use of logbooks. I wonder how much usage style varies by field (medicine, physics (particle vs. condensed matter, theory vs. experiment), engineering, industry vs. academic, etc.)?

Getting WiFi in a conference of online addicts is hard January 1, 2011

Posted by gordonwatts in Conference, physics life.
3 comments

This post was triggered by an article I saw pointing out some fundamental limitations of WiFi at tech conferences.

Last month in San Francisco at the Web 2.0 Summit, where about 1,000 people heard such luminaries as Mark Zuckerberg of Facebook, Julius Genachowski, chairman of the Federal Communications Commission, and Eric E. Schmidt of Google talk about the digital future, the Wi-Fi slowed or stalled at times.

I like the way one of my students, Andy Haas, put it once. He was giving a talk at a DZERO workshop on the Level 3 computer farm and trying to make a point about the number and type of computers that were in the farm. He drew an analogy to the number of laptops that were open in the room. It can be a little spooky – almost everyone has one, and almost everyone has them open during conference talks. In Andy’s case there were about 100 people in the room. And when you are giving the talk you have to wonder: how many people are listening!?

There is another side-effect, however. It is rare that the hotel, or whatever, is ready for the large number of devices that we particle physicists bring to a meeting. In the old days it was a laptop per person and now add in a cell phone that also wants an internet connection. Apparently most conference organizers used to guess that about 1 in 5 people would have a portable that needed a connection at any one time. Folks from particle physics, however, just blew that curve! The result was often lost wifi connections, many seconds to load a page, and an inability to download the conference agenda! As conference organizers we learned long ago that this is one of the most important things to get right – and one of the key things that will be used to judge the organization of your conference.

The article is interesting in another aspect as well (other than pointing out a problem we’ve been dealing with for more than 10 years now). WiFi is not really designed for this sort of use. Which leads to the question – what is next?

How was your year? December 31, 2010

Posted by gordonwatts in life.

Watching my Facebook stream I’ve seen a bunch of comments from people about how bad their 2010 was. That got me to thinking about my 2010. Actually, I’m lucky: it has been pretty good. I got to live for almost 4 months in the South of France (and got a lot of work done there); the ‘mo moved through her 4’s – a very cool age (for those of you who can’t remember that far back!). Work-wise it was also great – had my last Tevatron student get his Ph.D., finished being convener of the ATLAS b-tagging group, watched the first set of calibrations for b-tagging get “published”, an effort I helped start over 3 years ago, and hired a very cool post-doc. And got to learn a bunch of things.

That isn’t to say no bad stuff happened – the poor economy continues to push back; I’ve not had a raise for the last two years and I would guess I won’t have one for the next two either. However, I do have a job, and, even, a job I like a lot. My eyes are going. Our condo’s price still hasn’t recovered.

Bye 2010! Looking forward to 2011. Oh, wait, I only have a few hours to make resolutions! Ack!
