File System Infobase Manager

Using the file system for your notes.

I’ve been keeping notes and journals for as long as I can remember. And I was born long before these cool PC/Mac thing’ies became ubiquitous, so my earliest notes were written on the best technology of the day: paper. Then came the revolution, and before too long I got with the program, moved on to the electronics, and started typing up all kinds of stuff in all kinds of applications.

Technology over this time period has been about as fickle as a saloon girl after a roundup, so I’ve used almost every type of system that’s been rolled out since the green screen VAX I played with in 1983. The result was a pile of notes that collected then collapsed into a mish-mash of various file types, in different formats, with incompatible structures, all strewn about various locations on multiple generations of mediums.

For example, in my notes folders I had files produced by AmiPro, WordStar, WordPerfect, Commence, Ecco Pro, and Word. There was text in Lotus 123 files, and Excel spreadsheets. There were files from an outliner app called Think Tank, and others from an outliner called Outliner. There were emails from Outlook, emails from Lotus Notes, stuff from an HP95LX, an HP200LX and a number of Palms. There were text files, doc files, files with extensions I had forgotten from applications I’d forgotten – all kinds of electronic exotica. But I carefully saved them all ‘cause I was sure that someday, somehow, I’d use them.

In 2007 I became a full time writer. All of a sudden this hoard of electronic chaff became a mineable resource. Making sense of it changed from deferrable issue to current todo because someone told me that note taking, journaling they called it, and, crucially, retrieving said notes so they could actually be used, was a key skill for a writer. I scrounged up the old data folders, consolidated them and began the search for a system to manage it all.

IT architects like to call collections like this “unstructured infobases”, and there are lots of programs around – variously called information managers, PIMs, or Everything Buckets – to help manage them. Surveying the field I adopted two, Journler and DEVONThink, after I demoed a dozen more (and did this all, probably, while I should have been writing).

First, I poured all my notes into Journler, a fabulous but sadly abandoned gem of a program. Journler allowed me to think of my infobase as a structured whole, rather than as disparate segments, and it prompted me to habitualize the process of capturing and synthesizing the random bits of data flowing past my writing desk every day. Primarily the import to Journler standardized all my file formats. From the transition I got a fairly fixed TXT/RTF/RTFD/HTML set of documents, augmented with some PDFs, various image and audio files. This was not an insignificant feat.

When I outgrew Journler (and you always outgrow these packages, always, eventually, each and every one, no matter what the developer says about capacity and growth potential when you sign on) I transitioned to a beast of an application called DEVONThink. DT ultimately showed itself to be both constricting and superfluous (see my Dating DEVONThink post about this). But DT further refined my file formats and got me to add tags to files in a common data set rather than categorizing by topic into groups.

Along the way I played with Evernote, MacJournal, SoHo Notes, Mori, EagleFiler and Yojimbo. I’ve written about these attempts, and my struggles with DEVONThink, elsewhere. In their own way each of these apps was lacking, and as a group they all demanded attention to their own set of quirks that their programmers thought of as features. You had to conform your dates, workflow, ideas, cataloging, to their app’s functionality. This for me was perfectly backwards.

So, now, while my data format was standardized, thanks to Journler and DT, and, as such, much more usable, the whole process was still not stable, not at least for any time horizon of more than a year or two, since everything was still in someone else’s app. I was dependent on one or another of these applications to make sense of it all, which was kind of where I’d been all along. Then I found a better way.

Now I’m using a system that is stable, and sustainable, and scalable; one that seems to fall into the background while I work; one that is as future proof as can be. It allows me to refer to my notes, do my writing, create new ideas, synthesize old ones and not wrestle with an application while I’m doing it. I think it’s a long term solution that is platform neutral and vastly extensible.

It’s called the “file system”. Yep, the file system, that’s all. The very thing we use to run our computer every day. Shocking huh? After all those applications and proprietary file structures who would’a thunk that the best answer to electronic note taking would be the good old file system?

By using consistent file naming conventions and some highly abstract codes, I have produced a vastly flexible system that is portable, that lets me find just what I need, and that does so without wasting my time in arcane processes or leaving me to learn the quirks of a program that may well be abandoned in a year or so.

I credit a denizen of the Scrivener discussion boards, amberV, with creating the core of the system. In a series of posts she turned the light on for me, the one that let me think of organizing my data in this simple but deeply powerful way. I’ve taken her ideas and modified them, but not so very much as to be able to claim any credit for the origination of the system. amberV is brilliant, and her ability to create vast robustness in a simple design is evidence of that gift.

Her original discussions are here, and here, but just to be clear, the credit for these ideas should go to her. Any problems due to my modification are my responsibility alone.

How it works

The system relies on file naming conventions and folders, two things that are as stable, permanent and accessible as anything that will ever exist in the computing world. To understand this system you have to start with some philosophical ideas about info management. So strap on your seat-belts, this won’t take long.

Every note you take, every article you clip, every email you write is metaphorically just a sheet of paper, a slip in your infobase. Think: a giant 3×5 card, or a single page entry in a notebook. The item can be complex or simple, but each is a record, and each record needs to be retrievable in a consistent way, and should be retrievable in multiple ways.

An organizational approach to these kinds of slips was developed by a neo-Luddite from Japan, Noguchi Yukio, who built a filing philosophy to manage his infobase of folders. Others have used it to manage thousands and thousands of index cards.

It’s not that hard to think of RTFs or TXTs (or DOCs for that matter) as a series of 3×5 cards, and when you look closely at Journler or DEVONThink from an architectural perspective you see that’s really what they are doing: creating a file system for a bunch of text, graphic and pdf slips.

So the path from a card management system to an infobase is pretty straight and direct.

We all start out the same way, right? We save a file or two in a folder, then we begin to do more work, maybe on the same topic, maybe on another, and the number of files grows, and all of a sudden we can’t see the structure of our work anymore because of the clutter, so we folder some files, and subfolder others, then the folder and file names don’t make sense anymore, and the tree structure has gone overgrown and tangled.

This happens because the names we use are one dimensional. As the architects like to say, folders and files have only one “axis of information” for retrieval. The FSIM changes that.

Abstract Coding

Most people want to use multiple axes of data identification to work with their notes (meaning they want to get at their information from different directions at different times). The frustration users have with topic folders comes from the limitations of the one-axis categorization that comes with a file or folder name. It’s the old paper journaling problem transferred to the 21st century. Journals are either perfectly chronological or perfectly topical. It is very difficult to make them be both chronological and topical at the same time (i.e., to use multiple axes of coding), and it is impossible for them to be multi-topical, regardless of chronology, unless you have the transcription skills and determined habits of a monk to copy and recopy items over and over.

When I used Journler, I found that all my notes had not just one but other, more abstract, sets of categories that I would use to retrieve data, if I could. These went beyond the single topic of the folder each note resided in. Besides an item being about “Art”, or “Productivity”, or “Non-Profit Management”, or “Strategy Formulation”, or “Phenomenological Philosophy”, each slip was also either a chunk of text I had created, which I called “Thoughts”, or things other people had created, which I called “Notes”.

So, besides unique file name and topic categorization there was this additional dimension of categorization about who created the information at its origin.

Don’t be fooled by the simplicity of this. The split between “Thoughts” and “Notes”, this distinction between what others have produced (inputs into your creative process) and the synthesis which follows (what you do with it), is really at the core of academic and artistic work. Recently both Merlin Mann and Twyla Tharp have been writing about this at some length and when you think about it, the input vs output relationship is not only obvious but profound.

In addition to “Thoughts” and “Notes” I realized that there were other categories at this level of abstraction. (When we get to the codes themselves below you’ll see them.) And as with any good system development project I found that what I first believed were categories at this level, really weren’t. So I changed them, and it was easy. The idea is to keep an equivalent level of abstraction throughout the system, and not get spooked about making adjustments.

Chronology

Besides abstraction, the other idea from amberV’s thinking on this was the predominance of chronology for the retrieval of information. Noguchi’s system was based on what he called, “the importance of recency.” He felt time was the best way to find relevant items. For example, I may not remember that the Chomsky article I annotated ended up in “Linguistics” or “Anthropology” or “MIT” or “Chomsky”, but I’m sure going to know that I worked on it in the Fall of 2008. And with a date you get surrounding chronological context.

One of the wonderful things about Journler, and to a lesser extent DEVONThink, was the ability to flip through your work, to see what you were doing yesterday, or last week or this time last year, and to see it in the context of a set of ideas: what was I reading, what else was going on, was there an art exhibit or conference going on at the same time that influenced my thinking, what was in the news? Incorporating chronology in the system lets you recreate this function.

Metadata can also be used to get at these dimensions, but it’s very vulnerable to time and error, as anyone who has had their “File Creation Date” redefined in a file copy operation can attest. As can those who lost a set of tags in the hidden OS X .DS_Store files that did not make the jump to a new folder, or were synced out of existence. So the identification of this information, date in particular, has to be more robust than a metadata tag. It also has to be modifiable and definable. A record may well be best dated in December because that was when you first worked on the project, not yesterday when you created the file. Relying on the file modification dates or other metadata means losing that control completely.

The File Name

So the system has to be accessible from multiple axes, one of which needs to be chronological, and it needs to be robust enough to be application independent.

In her system amberV uses codes added in MultiMarkdown, a very sophisticated approach. I use the file name. For me it’s simpler and I don’t have to learn MMD. For every file I save I add a date, a code, and a unique identifier. That’s the file name.

So this file would be …

090608-W2-File System Philosophy.rtf

Where …

  • 090608 is the YYMMDD date of my choosing (might be today, might reference prior dates if the material needed to be fixed in a different time)
  • “W” stands for Writing, and “2” is a sub category for non-fiction essays.
  • “File System Philosophy” is the unique title, which can be preceded by its own characterization if I desire (I rarely do). The file itself is in a folder called “Dougist” because that is where it was published. But it could have gone in a folder called “Productivity”, or inside a rather extensive tree I have called “Systems”. It doesn’t matter because Spotlight can find the files anywhere.
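
A minimal sketch, in Python, of how a name built this way can be taken apart again. The function and type names are hypothetical, and the pattern assumes the plain form shown above: a six-digit date, a capital letter with optional digits, and a title, joined by hyphens. It also assumes every two-digit year falls in the 2000s.

```python
import re
from datetime import date
from typing import NamedTuple, Optional


class CodedName(NamedTuple):
    day: date       # the date chosen for the record, not the file's metadata
    category: str   # one-letter category, e.g. "W"
    sub: str        # sub-category digits, e.g. "2" (empty if none)
    title: str      # the unique title
    ext: str        # extension including the dot (empty if none)


# YYMMDD - letter + optional digits - title [. extension]
PATTERN = re.compile(r"^(\d{6})-([A-Z])(\d*)-(.+?)(\.[^.]+)?$")


def parse_coded_name(filename: str) -> Optional[CodedName]:
    """Split a coded file name into its parts, or return None if it is not coded."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    yymmdd, category, sub, title, ext = m.groups()
    try:
        # Assumption: two-digit years are all in the 2000s.
        day = date(2000 + int(yymmdd[:2]), int(yymmdd[2:4]), int(yymmdd[4:6]))
    except ValueError:
        return None  # six digits that do not form a real date
    return CodedName(day, category, sub, title, ext or "")
```

Longer codes with a trailing letter, such as R2P, would need a looser pattern than this one.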

I like using the file name for these codes because I can use the same system for every file type, RTFs, PDFs, JPEGs, Scrivener containers, whatever. And when you get all your file names coded this way they line up in perfect, neat columns down rows in Finder so what looks like clutter turns out to be a very good visual reference.

I also like using the file name because you can use a bulk file renamer to do all the coding for you. Coding by hand would be an insurmountable obstacle for most people (like me) if they have more than a few hundred files. I use A Better File Renamer and it adds the codes like magic. PathFinder does the same thing, but I like the stand alone ABFR because of its power. And it is very fast. It took me months to structure up my data in Journler, a week to go from Journler to DEVONThink, but less than an hour to go from DT’s export to a fully coded and indexed system using ABFR. And then one day when I decided that I liked “R” (Record) better than “D” (Daily) for my every day records, the renaming took 18 seconds. Similarly, I re-categorized all the sub-categories of personal and professional development from “3” to “4” and it took about a minute.
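
For anyone without a renaming utility to hand, the same kind of code swap can be sketched in a few lines of Python. This is a stand-in for illustration, not how ABFR works; the function names are hypothetical, and it assumes the code sits right after a six-digit date. It defaults to a dry run that only reports what would change.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Six-digit date, then a capital letter with optional digits, each followed by "-".
CODE = re.compile(r"^(\d{6})-([A-Z])(\d*)-")


def recode(filename: str, old: str, new: str) -> str:
    """Return filename with code `old` replaced by `new`; other names pass through."""
    m = CODE.match(filename)
    if m is None:
        return filename
    stamp, letter, digits = m.groups()
    if old == letter:                 # letter-only change keeps the digits: D1 -> R1
        code = new + digits
    elif old == letter + digits:      # exact code change: R3 -> R4
        code = new
    else:
        return filename
    return f"{stamp}-{code}-{filename[m.end():]}"


def bulk_recode(folder: Path, old: str, new: str,
                dry_run: bool = True) -> List[Tuple[Path, Path]]:
    """Recode every file under folder; with dry_run=True nothing is renamed."""
    changes = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        target = path.with_name(recode(path.name, old, new))
        if target == path or target.exists():
            continue                  # nothing to do, or the new name is taken
        changes.append((path, target))
        if not dry_run:
            path.rename(target)
    return changes
```

Because only the code directly after the date is touched, a “D” or “R” elsewhere in a title is left alone. Folders (including package-style containers) are skipped in this sketch.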

And here is a key aspect of the system: rather than putting data into an application and using the ho-hum functions of that app to work with my ideas, I keep my data separate and have best-in-class applications, using higher levels of functionality, work on it.

For example, both Journler and DT (and EagleFiler, and notoriously MacJournal) have anemic text editing functionality. I use Bean and Scrivener (and occasionally Word, OmniOutliner, and WriteRoom) on my Thought and Writing files and get full-functionality.

Similarly, on my PDFs I use Preview or Skim, or if I’m really out for some major modifications, Adobe. This was the key architectural point that Alex Payne was after in his article about Everything Bucket applications. By using Everything Bucket applications you give up functionality for compactness and eventually that equation works against your creative process. By working in the file system you use the best app for each specific purpose.

As an example: From a tagging perspective, ABFR is vastly more sophisticated than the internal tags of Together will ever be, and even if the ABFR developer goes belly up, you can just move on to the next bulk file renaming utility and proceed, no modification, import or export required. The integrity and functionality of your data is not dependent on the existence of an application.

When you start to contemplate the power you get from Word, Pages, or Chronosync vs what you give up in say, Evernote, the technical obstacles to setting up a file-system-based info management system begin to melt.

A fellow Journler user, a brilliant and dedicated supporter of the product named NovaScotian, once commented on this approach, “but all you’ve done is recreate Journler,” to which I say, yes, but it’s unbreakable and fully extensible. It will not have file size limits, or file type limitations, it can port data into any project and it provides information supplies to all my work efforts. And some day when OS X is replaced by ??? I’ll still be in business the next day with all my material.

The Tags

Now whether this coding goes into the file name, as I do it, or the first line of the document as others do, or the last line as a tag, or in the MultiMarkdown text is really less important than getting your head around the actual codes you will use.

Here’s my set.

The six File Name categories I use are:

• Record -R- Just personal recording. Ideas; observations; people watching; basically anything you might put in a diary. AmberV said, “It was liberating to separate thoughts from diary for me. In the past, I’ve had a problem with feeling guilty about keeping a mundane diary. I always felt like I should be doing something of quality in it. This category is not about quality–simply getting the “facts” down. I don’t have to worry about it being filled with eloquence, or using only the nicest inks, nibs, and papers. Just get it all out.”

  • 1. Diary
  • 2.
  • 3. Action (ToDo, Project, etc) (Sub type “P” -R2P- for major projects)
  • 4. Development

• Thoughts -T- AmberV described it as, “I draw the line between Record and Thoughts by saying, something that intends to “become” something goes in thoughts. Whether that be a thing that is already taking shape, or just an idea that might expand later. Perhaps creative things that are not attached to any particular project, like a line of prose. If I feel it is going to become a story, or if it is a list of subjects for the next time I take my camera out, then it goes in Thoughts. This is where I am most liberal about sub-categories. It just makes sense to designate which book something is about, or whatever.” (Was once C=Creative)

  • 1. Snips, Fiction
  • 2. Observations, Non-Fiction
  • 3. My processes and procedures, (Craft processes synthesis T3-W)
  • 4. My life ideas, dreams (Goals: T4-G)

• Notes -N- Notes is just that; very similar to Record, except it is material that I have collected as opposed to produced. Everything from research for books, to funny anecdotes. This is also where I store bulk documents downloaded from the web or scanned from paper media. (Was once I=Information = Reference)

  • 1. Research
  • 2. Book notes (to sort and get my book list)
  • 3. Processes (Craft notes N3-W)
  • 4.
  • 5. Reference
  • 6. Quotes (??? Quotes are currently sub-categorised by QUOTE)

• Communications -C- Forums, emails, letters to friends, blog posts, tech support, and other things like that go here. I’ll sub-categorise this one too, if it is a person or forum that I frequently communicate with. (Was once M=coMmunication)

  • 1. Private
  • 2. Public
  • 3. Meetings (Large or small, F2F to conferences, includes phone call notes)
  • 4. Work submitted for review

• Writings -W- Thoughts that have grown, matured and been awarded a driver’s license. This is my work of creation. Before long, writings end up in a Scrivener file, but output of versions are kept as separate files with the name of the recipient as a sub-category.

  • 1. Fiction
  • 2. Essays and Non-Fiction
  • 3. Writing about my writing (The process of my writing, what I am writing about)

• Projects -P- Transformational efforts that can have notes, thoughts and records. The P is usually affixed to the containing folder. All writing work is project work, but it is not included in this category.

  • 1. Active Projects
  • 2. Finished Projects
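
The code set above is small enough to hold in a lookup table. A minimal sketch in Python follows; the table simply restates the lists above, the names `CATEGORIES` and `describe` are hypothetical, and numbers left blank in the lists are reported as “unassigned”.

```python
# Letter -> (category name, {sub-category digit -> label}), as listed above.
CATEGORIES = {
    "R": ("Record", {"1": "Diary", "3": "Action", "4": "Development"}),
    "T": ("Thoughts", {
        "1": "Snips, Fiction",
        "2": "Observations, Non-Fiction",
        "3": "My processes and procedures",
        "4": "My life ideas, dreams",
    }),
    "N": ("Notes", {
        "1": "Research",
        "2": "Book notes",
        "3": "Processes",
        "5": "Reference",
        "6": "Quotes",
    }),
    "C": ("Communications", {
        "1": "Private",
        "2": "Public",
        "3": "Meetings",
        "4": "Work submitted for review",
    }),
    "W": ("Writings", {
        "1": "Fiction",
        "2": "Essays and Non-Fiction",
        "3": "Writing about my writing",
    }),
    "P": ("Projects", {"1": "Active Projects", "2": "Finished Projects"}),
}


def describe(code: str) -> str:
    """Turn a code such as "W2" into a readable label."""
    letter, digit = code[:1], code[1:2]
    if letter not in CATEGORIES:
        raise ValueError(f"unknown category letter: {letter!r}")
    name, subs = CATEGORIES[letter]
    if not digit:
        return name                      # a bare letter, e.g. "R"
    return f"{name} / {subs.get(digit, 'unassigned')}"
```

Keeping the table in one place makes it cheap to change a category later, which is the point of not getting spooked about adjustments.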

The other axis is Contextual, and it is carried by File Folders:

  • Journal – Just like a paper one, a chronological list of items. Created from a smart folder that gets every code above.
  • Topics – A vast sea of labels in sub-folders, roughly mirroring a library catalogue system or the course offerings at a University, culled based on my interests. Few of these are in current use for a project, but if they were, there would be an alias to them in a Project Folder. When does an item end up in Topics and not Journal? At some level of substance a card will belong in Topics; it’s arbitrary. The parallel question in a paper based system would be, when would you copy out your journal notes and file them with torn out articles in a manila folder?
  • Projects – The main difference between Projects and Research is the transformative nature of the work occurring; sequential steps to get something done. Sub-folders are by year, because Projects are (should be) time bound with beginnings and ends. Quite often there is a Scrivener file in a project folder.
  • Writing – Writings are different, somewhat timeless and un-categorizable. My writing folder is a special case; a combination of Journal, Research and Project. A purist would have put current writings in Projects and future ideas in Research, but it’s my system, so I have them separate. Groupings by my Fiction vs Essays, WIP vs Published, a few topical smart folders, mostly in support of potential writing projects.
  • Organizer – This folder tells me where to go. It is a series of subfolders on my current contexts, like Writing Projects, Current Projects. The idea here is that these folders hold aliases to the data files in the rest of the database. The key is that these aliases are to current work. I once used flags and labels for this function, but I found that what I wanted to see was that I had a current project called “Develop Community of Writers”, not the 34 files associated with organizing a reading in February.
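
The Journal view can be approximated outside Finder too. A minimal sketch in Python, assuming all coded files live under one root folder and carry the date-code prefix described earlier; the function name is hypothetical, and this is a stand-in for a smart folder, not a description of how one works.

```python
import re
from pathlib import Path
from typing import List

# Six-digit date, capital letter, optional digits, each part followed by "-".
CODED = re.compile(r"^\d{6}-[A-Z]\d*-")


def journal(root: Path) -> List[Path]:
    """Return every coded file under root, oldest first, whatever folder it is in."""
    coded = [p for p in root.rglob("*") if p.is_file() and CODED.match(p.name)]
    # YYMMDD sorts correctly as plain text while all dates share a century.
    return sorted(coded, key=lambda p: p.name)
```

The folder a file sits in plays no part in the ordering, so Topics, Projects and Writing all interleave into one chronological list.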

Other

Labels – I use labels arbitrarily to sort items in large folders. For my WIP, labels connote stage of development, from “goofy uncharacterized thoughts” to “ready to send out”. In class folders they separate administrative stuff, like syllabi, from things like notes and assignments. The point is they have no global significance, their meaning can change from folder to folder.

X Files – Managing the undone. If I have unfinished work in a file, like I only partially completed a draft, I’ll add “X-” to the file name. I have a saved search that collects all these X files in one place, like a flagging system. Sometimes, if the list gets too long I’ll add ordinals to the X, “X1-” or “X2-” etc so they sort by some priority. I’ve also used labels in this situation. I keep a little folder over in Organizer called “Administrative Tasks”. If something comes up that I will need to do, and that doesn’t fit anywhere else I’ll make an RTF/TXT here just so I have a file with an “X-” in its file name. It then shows up in the saved search. I’ve tried and tried all the todo list managers. OmniFocus alone devoured a collective month of my life. If you need to manage a list with that level of precision then you are a project manager not a writer. iCal todo’s, or a little rtf with tasks, work just fine. I could be convinced, maybe, to use TaskPaper for this stuff, but I’d manage it the same way, in the file system.
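
The X-file search is easy to reproduce in a script. A minimal sketch in Python, with two assumptions that go beyond the text: that the “X-” marker is placed at the very start of the file name, and that numbered markers should sort ahead of a bare “X-”. The function name is hypothetical.

```python
import re
from pathlib import Path
from typing import List

# Assumption: the marker is a prefix, "X-" or "X" plus an ordinal, e.g. "X1-".
X_PREFIX = re.compile(r"^X(\d*)-")


def unfinished(root: Path) -> List[Path]:
    """Collect every X-marked file under root, ordered by ordinal, then name."""
    hits = []
    for p in root.rglob("*"):
        m = X_PREFIX.match(p.name)
        if p.is_file() and m:
            # A bare "X-" has no ordinal, so it ranks after all numbered files.
            rank = int(m.group(1)) if m.group(1) else float("inf")
            hits.append((rank, p.name, p))
    hits.sort(key=lambda h: h[:2])
    return [p for _, _, p in hits]
```

A file called “Xylophone.txt” is not picked up, because the pattern insists on the hyphen after the X and its optional number.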

Current Jags Folder – In addition to the Organizer folder, my most used folder is called “Current Jags”. It lives in the Organizer folder. I tend to have a lot of stuff I’ve pulled down and saved but haven’t gotten around to reading or filing yet. I keep my desktop clean so it can be used as a work space for the current activity I’m on, so all this unprocessed stuff goes into Current Jags. To help you understand its purpose I’ll tell you that at one time I called it, “Reading”.

Spotlight Comments – I use them sparingly. I have &trips tagged on R1’s about travel so I can search on them and see a history of all my sojourns. Similarly some N1’s are tagged &wrtitersonwriting when I have taken a note where an author speaks about craft. But I go back and forth about adding this text to the rtf itself so it is less susceptible to being washed away in a file copy someday. All my reading notes were once tagged &books but N2 took care of that.

More on file systems, archiving and note taking from Dougist…

Dating DEVONThink

Writing Tools – Journler

The Low Fi Manifesto – Data Architecture, and Journler

Shifting Mediums

WriteRoom and Notational Velocity
