Sometimes we just get away from ourselves. I wrote up the post below, built the web site discussed, and actually got the whole thing up and functioning nicely. That was before I saw that Instapaper already had an excellent clipping function built in, out of the box, already there and waiting for me, and that it would do the exact same thing as the system I designed.
So let the attached be a memorial to all those poorly thought-out development projects, the ones best deliberated more deeply before they’re begun, the ones best killed when they are young, at the initiation stage, before the palm-to-the-forehead slap of realization, the cruel knowledge of just how much time we’ve wasted…
Making a Reading Notebook From a Private Web Site
I’ve been keeping notes on my reading for years. Pre-PC I jotted notes on 3 x 5 cards. Post-PC I typed notes in various electronic formats. Pre-web I cut out and filed articles. Post-web I saved them with a few clicks. For a manic and frustrating period I began scanning all my old paper articles until I abandoned that Sisyphean task. These notes, in all their formats, go back a decade or more, and their maintenance was a significant factor in the creation of my File System Information Manager.
Over the past three years, managing my fiction writing has taken on greater emphasis in the FSIM, but I still capture a lot of random notes and text from my non-book reading. (For my book reading I have a separate process called “Readers Notes” that works directly in the FSIM.) But I’m reading on the iPad now, so my old system no longer works.
For a long time I PDF’ed most articles. I’d read the PDF, not the web page, highlighting and annotating as I went along. These files make up the bulk of the “Topics” category in my FSIM. The benefits of the PDF format: it retains the graphical layout of the page, which can carry subtle information, and both highlighting and annotation are easy. The downsides: the files are big and generating the PDF is a multi-step process. An additional benefit is, as others have commented, that PDF, along with ASCII text, is about as future-proof an archiving format as we will find.
For a while I grabbed articles as WebArchives, which I found unsatisfactory, or as HTML, which was better, but was not functional for annotations.
I have also copied text from articles and saved the clips as text files, then pre- or post-pended notes to the quoted material. The OS X service in Notational Velocity, “Send to Notational Velocity,” was the primary reason I began to do this. Unlike similar services in WriteRoom, Bean, Scrivener, and TextEdit, NV adds the originating site’s URL as the first line of the new text. That’s a graceful addition, and it made text-file snipping useful for the FSIM.
(There are script solutions to this, but I don’t have the programming chops to make them work. Quicksilver had a nice clipping function too, but then you’re in the Quicksilver development/non-development quagmire…)
All these approaches had one thing in common: they were laptop- or PC-based practices. I’m increasingly reading on an iPad, and my workflow is now heavily dependent on Instapaper. I find stuff while browsing in an RSS reader or from email subscriptions, pop it into Instapaper, and read it later on the iPad, which unfortunately is a place where a “print to PDF” function does not exist.
So my problem was how to carry the workflow from reading through to archiving. How do I create a note in the FSIM as I go through articles in Instapaper, the same way I would save to PDF on my laptop?
Sure, there’s Twitter and Tumblr. I could “share” my reading, but a lot of the stream really has nothing of interest to others. And I’m not so sure I’d want you all to see my notes anyway. With Twitter I’d get at best a hundred or so characters and a URL, not the page.
Also, I always worry about the amberV prohibition: the web is not now, and was never meant to be, a permanent repository of data. Just linking is not enough. Web pages change, they are modified, they go missing. I’ve watched the editorial proclivities of the New York Times unfold before my very eyes with each refresh of the page, as adjectives and adverbs arrive and depart from published stories while they are shaped to fit the paper’s preferred narrative. It happens on most other pages too, so if material is important, you have to get a copy of it off the web and onto your own file system, be it a local hard drive or a server. So Twitter fails as a notebooking tool here on two counts, capacity and permanence, while Tumblr, Posterous, and the like fail on one: they’re on someone else’s system. The same goes for other social bookmarking sites like Delicious and Digg.
There are the BibCite apps like Pages. They are fabulous, but they are not going to be in the iPad Safari browser any time soon. Zotero, besides being Firefox-only, will not be integrated with Instapaper or mobile Safari. I have other gripes with Zotero, like its Microsoft view of data architecture, but it was close enough that I considered switching from Safari, for about an hour or so, before I realized that wasn’t going to work.
So when I found the core function in WordPress for clipping web pages via a JavaScript bookmarklet, the “Press This” function, it wasn’t too long before I could see an architectural solution to distributed note-taking.
Here’s the goal: When reading in any browser, I want to be able to log an article, its contents, and any notes. A further requirement: be able to get everything out for integration with the FSIM.
First, I set up a subdomain on one of our (numerous) registered domains, put up an install of WordPress, grabbed a super-simple minimalist theme framework, and started ripping stuff out.
(I could have used another CMS or blog platform, but the “Press This” function was really what I was after. I wanted to replicate the fabulous way Tumblr allows quick posts from any web site on a drive-by basis.)
I pulled off all the footers, sidebars, and graphics. I cut out all the comments code, which is no trivial matter in most themes; discussions with myself felt psychotic. I structured the main index to show only the title of an article, not the post body or an excerpt, to get a nice folder-like article list. I did keep the category and tagging functions, because they are nice for looking at the data from various angles. I also kept search and the static pages, even though I’m not sure what I’ll ever use a static page for. More talking to myself?
Then I began the process of locking the thing down. I password protected the site, pulled it off the web search spider trails, all the usual fortification stuff.
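For the curious, that lockdown amounts to something like the following, assuming an Apache host; the realm name and file paths here are placeholder examples. Note that robots.txt only turns away polite crawlers — it is a request, not a lock — so the password is what actually keeps the site private.

```
# robots.txt — ask all crawlers to stay out entirely
User-agent: *
Disallow: /

# .htaccess — HTTP basic auth in front of the whole site (Apache)
AuthType Basic
AuthName "Private Notebook"
AuthUserFile /home/example/.htpasswd
Require valid-user
```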
These are pretty standard procedures for anyone running their own blog, so I’ll not describe the steps in any more detail, except to say that if you’re not used to messing with the guts of CSS or HTML, this is not something you’re likely to implement. But if you are, it’s a remarkably simple application of technology you already use every day. It took me about three hours to go from idea to functionality.
Those geekier than me would craft their own HTML frame, perhaps using a text-only web page, perhaps something rendered from MMD. My hat’s off to you if you do that. I felt like I needed the wheels and gears of WordPress, and if the thing really works and ever gets big, I think I’ll be happy for the application’s more robust support and extensibility.
So now I have a server-based, HTML-formatted repository of my web reading. I’m one click away from grabbing and annotating text with a URL back to the original. I can sort and filter articles and, using Joe Boydston’s Export Posts plugin, drop anything out as plain text as needed.
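I can’t vouch for the exact file layout Export Posts writes, but as a fallback sketch: WordPress’s own Tools > Export produces a WXR (XML) file, and a short script can flatten that into one text file per post, ready for the FSIM. The filename `notebook-export.xml` and the crude tag-stripping regex are illustrative assumptions, not part of the plugin.

```python
# Sketch: flatten a standard WordPress WXR export into one text note per post.
import os
import re
import xml.etree.ElementTree as ET

# WXR declares post bodies under the content: namespace.
NS = {"content": "http://purl.org/rss/1.0/modules/content/"}

def wxr_to_text(wxr_path):
    """Yield (title, link, body_text) for each post item in a WXR export."""
    tree = ET.parse(wxr_path)
    for item in tree.getroot().iter("item"):
        title = item.findtext("title", default="untitled")
        link = item.findtext("link", default="")
        html = item.findtext("content:encoded", default="", namespaces=NS)
        text = re.sub(r"<[^>]+>", "", html)  # crude tag strip; fine for notes
        yield title, link, text

# Hypothetical usage: write title, source URL, then body into per-post files.
if os.path.exists("notebook-export.xml"):
    for title, link, text in wxr_to_text("notebook-export.xml"):
        safe = re.sub(r"[^\w\- ]", "", title)[:60] or "untitled"
        with open(safe + ".txt", "w", encoding="utf-8") as f:
            f.write(title + "\n" + link + "\n\n" + text + "\n")
```

Keeping the URL on its own line mirrors the Notational Velocity habit mentioned earlier, so the clip always points back at its source.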
Time will tell if this is sustainable or just another nice hobby project for a November afternoon, but having used it for a while now on the iPad, the iMac, and a laptop, it’s really a joy to click and note into my own system.