Research notes in the Internet Archive
December 18, 2007
Posted by gordonwatts in archive, computers, physics life.
Articles like this one in Ars Technica always fascinate me: Internet Archive to store researchers’ notes, raw data. The idea is to collect everything an academic does (web pages, notes, local files, etc.) and upload it all to the Internet Archive. Check out the article for more details.
They fascinate me because there is a good part of me that is a***-retentive: I want to keep track of everything I do. For example, I use OneNote as my electronic logbook (along with a Tablet PC so I can write in it). I keep a huge amount in there, gigabytes of it: plots, notes, papers I’ve proofread, etc. All of it is searchable. It is pretty cool: it will OCR any PDFs I put in there, so they become searchable, and it can read my handwriting too (something my second-grade teacher could never do).
Would it be useful? Timmer, who wrote the Ars article, took a stab:
Will the material that’s uploaded be of any value? Based on my personal experience, the answer here will be mixed. I’ve taken notes and made annotations for everything from peer-reviewed publications to articles for Ars, but only a fraction of the ideas ever make it into the publication. Within the remainder, there are some genuine insights that don’t make the cut due to a lack of direct relevance or space constraints. But there are also a lot of spur-of-the-moment thoughts that I later reject due to further reading or analysis. Unless all contributors are careful about what they upload, this effort may produce a storehouse of bad ideas.
Let me go further: No. It will not be very useful. At least, not my research notes. I make so many mistakes, try out so many ideas that are just plain dumb in retrospect (and many that were dumb in the first place – I just missed that fact). And I can’t spell.
There is also the issue of unpublished data. My logbook contains a great deal of D0 (and hopefully soon, ATLAS) data that has never been published: plots with unrefined selection cuts (with interesting bumps!), and items that have not been vetted by the collaboration for public release. Perhaps after some statute of limitations one could release this to the public, but certainly not immediately.
On the other hand, my logbook is incredibly useful to me personally. It is available on all the computers I use (well, the Windows ones), synced across the net automatically, and so much lighter to carry around than a real logbook. So, while I’m 100% behind logbooks, mine will remain private for the near future.
I guess the biggest question I have is: what would this be useful for? I don’t see a use for it other than as a way to save things for posterity (can you say “information overload”?). What am I missing here?