The Website of The Magazine of the Science Fiction & Fantasy Field
Locus Online


Saturday, May 9, 2009

Extreme Geek

by Cory Doctorow

I am by no means the geekiest SF writer working in the field today; on the power-law curve of geekiness, there are many ancient and gnarly masters before whom I am but a novitiate, barely qualified to check the syntax in their shell-scripts. Stross, I'm looking at you here.

Nevertheless, I am far more geeky than average, and that geekiness has crept into my writing practice in a way that is very close to perfectly geeky inasmuch as it probably costs me as much effort as it saves me, inasmuch as it delights me, and inasmuch as it points the way to civilian applications that someone else might want to develop into products that the less geekified may enjoy.

In that spirit, I offer you three quirky little tassels from the fringes of technology and SF writing:

1. Business: Book donation program

This is the lowest-tech entry on the list, but it's also the most generally applicable. As you know (Bob), I give away all my books as free, Creative Commons-licensed e-books the same day they go on sale in stores, on the grounds that for most people, a free e-book is more apt to entice them to buy the print book than to substitute for it.

But there's a small minority — mostly other geeks — for whom the e-book is all they want, and who, nevertheless, want to see the writers they enjoy compensated (bless 'em!). They write to me with some variation on, "Can't I just send you a donation?" And my answer has always been no, because:

  1. I don't want to have to bookkeep, file taxes on, and otherwise track your $5;

  2. I don't want to cut my extremely valuable and useful publisher out of the loop;

  3. I don't want to reduce my print-books' sell-through rates (which determine advance sizes, print runs, and bookstore orders).

So, traditionally, I asked my readers to compensate me by donating a book to a school or library or halfway house. But, practically speaking, this isn't very useful advice. Most of us have no idea how to give books away to schools or libraries — do you just show up at the reception desk with a book, shove it into the clerk's hands and say, "Here, this is for you?"

Starting with my novel Little Brother, I've been doing something different: I actually provide a matchmaking service to connect donors with willing recipients. I hired an assistant, the talented Olga Nunes, to monitor a Googlemail address that I published in a solicitation to schools, libraries, etc., telling them to e-mail their work contact details if they wanted a free copy of the book. Olga vetted these to ensure that they weren't fakers or scam artists, and then posted a geographically sorted list of would-be donees to my site.

Then, I put the word out to potential donors that there was an easy (or at least easier) way to compensate me if you liked the e-book and didn't need the hardcopy: visit your favorite bookstore and buy as many copies as you'd like for any of the organizations that solicited donations, then e-mail us the receipt so we can cross them off the list. Judging from donor e-mails, many of them just gave to the first outstanding request, others looked for requests from their region, and others judged by merit. Some donated several copies, as many as 15! As I type this, we've given away well over 200 copies to people who really wanted the book. I got the sales number, my publisher got the sale, the library or school got the material, and the reader got to feel like s/he had paid for the value s/he'd received.

Now, this wasn't cheap. I needed to hire someone with the good judgment to tell scammers from honest people and with the HTML skills to format and update the page. I definitely spent at least twice as much as I made on this program. As a commercial venture, it was a flop.

But as a proof-of-concept, it was a ringing success. There is a market opportunity here for someone who wants to automate the service. I envision something run jointly by, say, the American Library Association (or maybe the International Federation of Library Associations) and the Adopt-a-School program (to ease vetting), that works with a couple dozen booksellers, national and local, and lists books by all kinds of authors and requests from all over the world. Donors can get a suggestion for a book to donate (perhaps based on preferences like "Science Fiction" or "Young-Adult Novels" and "Schools in My Area" or "Schools in the Nation's Poorest ZIP Codes") and, with a few clicks, donate it, receiving a tax-deduction receipt in return.

2. Research: Twitter meets notekeeping

I'm in the middle of a research-intensive novel, for which I've read some 50 or 60 books. I made extensive notes as I did, unconsciously falling into a Twitter-style shorthand in my long text-file, for example:

  • Newborn babies are swaddled tightly at birth, it tames them. If you aren't swaddled, you grow up wild and restless. Socialism 79 #china #childhood #control

  • Louche boy wearing wide-bottom "trumpet trousers" and shirt rolled up to expose his belly on a hot day. Socialism 86 #china #fashion

  • "Drink vinegar" is "conjugal jealousy." Socialism 155 #china #slang #romance

These notes are from "Socialism is Great!", Lijia Zhang's amazing memoir of life in rural China during the period of economic reform and industrialization. The hashtags (#tag) are loose categories that each note seemed to fit into while I was writing them down. These notes, and hundreds more, live in a text file.

As I made these notes, I had a sense that, somewhere, there'd be a program that would parse through them, generating a tag-cloud [see picture] with clickable links to different hashtags' contents. Unfortunately, as this file grew longer, I realized that no such program existed.

I put the call out to the readership at Boing Boing, the blog I co-edit, and Dan McDonald, one of my readers, came through with a fantastic little Perl script that does exactly this, parsing all my notes into a database that I can search or query visually, by clicking on the cloud.
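For readers who want to experiment, here is a minimal sketch of the same idea. It is written in Python rather than Perl and is not Dan's script; the function names and the rule that a hashtag is "#" followed by word characters are assumptions made for illustration. It builds an index from each hashtag to the notes that carry it, then lists tags by frequency, which is the raw material for a tag cloud.

```python
import re
from collections import defaultdict

# Assumption: a hashtag is '#' followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#(\w+)")

def index_notes(lines):
    """Map each lowercased hashtag to the list of notes that carry it."""
    index = defaultdict(list)
    for line in lines:
        # Drop the leading bullet and surrounding whitespace, skip blank lines.
        note = line.strip().lstrip("•").strip()
        if not note:
            continue
        # dict.fromkeys removes repeated tags within one note, keeping order.
        for tag in dict.fromkeys(t.lower() for t in HASHTAG.findall(note)):
            index[tag].append(note)
    return dict(index)

def tag_counts(index):
    """Return (tag, count) pairs, most frequent first, ties in alphabetical order."""
    return sorted(((tag, len(notes)) for tag, notes in index.items()),
                  key=lambda pair: (-pair[1], pair[0]))

NOTES = [
    '• "Drink vinegar" is "conjugal jealousy." Socialism 155 #china #slang #romance',
    '• Louche boy wearing wide-bottom "trumpet trousers". Socialism 86 #china #fashion',
]

index = index_notes(NOTES)
for tag, count in tag_counts(index):
    print(f"#{tag} ({count})")
```

A real tag cloud would scale each tag's font size by its count and link it to the notes in `index[tag]`; that presentation layer is left out here.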

Now, as I write the novel, this has become an invaluable aid: for one thing, it lends itself to a kind of casual, clicky browsing in which one hashtag leads to another, to a search-query, to another tag, exploring my notes in a way that is both serendipitous and directed.

For another, the format is one that comes naturally to me, because of all the other services I use — such as Twitter — that employ this telegraphic, brief style.

Dan's Perl script is freely licensed and can be downloaded from

3. Process: Flashbake

I know a lot of archivists and one of their most common laments is the disappearance of the distinct draft manuscript in the digital age. Pre-digital, authors would create a series of drafts for their work, often bearing hand-written notations tracking the thinking behind each revision. By comparing these drafts, archivists and scholars could glean insights into the author's mental state and creative process.

But in the digital era, many authors work from a single file, modifying it incrementally for each revision. There are no distinct, individual drafts, merely an eternally changing scroll that is forever in flux. When the book is finished, all the intermediate steps that the manuscript went through disappear.

It occurred to me that there was no reason that this had to be so. Computers can remember an insane amount of information about the modification history of files — indeed, that's the norm in software development, where code repositories are used to keep track of each change to the codebase, noting who made the changes, what s/he changed, and any notes s/he made about the reason for the change.

So I wrote to a programmer friend of mine, Thomas Gideon, who hosts the excellent Command Line podcast, and asked him which version control system he'd recommend for my fiction projects: which one would be easiest to automate so that every couple of minutes, it checked to see if any of the master files for my novels had been updated, and then checked the updated ones in.

Thomas loved the idea and ran with it, creating a script that made use of the free and open-source version control system "Git" (the system used to maintain the Linux kernel), checking in my prose at 15-minute intervals, noting, with each check-in, the current time-zone on my system clock (where am I?), the weather there, as fetched from Google (what's it like?) and the headlines from my last three Boing Boing posts (what am I thinking?). Future versions will support plug-ins to capture even richer metadata: say, the last three tweets I twittered, and the last three songs my music player played for me.
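To make the mechanics concrete, here is a rough sketch of one check-in cycle in Python. It is not Thomas's code: the function names are hypothetical, and fetching the weather and the blog headlines is left out, so they arrive as plain arguments. The only Git commands it relies on are the standard `git add` and `git commit -m`.

```python
import os
import subprocess

def changed_since(paths, last_seen, getmtime=os.path.getmtime):
    """Return the paths modified since the previous call, updating last_seen in place.

    getmtime is injectable so the logic can be exercised without touching the disk.
    """
    changed = []
    for path in paths:
        mtime = getmtime(path)
        if path not in last_seen or mtime > last_seen[path]:
            changed.append(path)
        last_seen[path] = mtime
    return changed

def build_message(timezone, weather, headlines):
    """Assemble the commit message from the metadata described in the article."""
    lines = [f"Timezone: {timezone}", f"Weather: {weather}"]
    # Keep only the three most recent headlines.
    lines += [f"Recent post: {headline}" for headline in headlines[:3]]
    return "\n".join(lines)

def check_in(repo_dir, paths, message):
    """Stage the given files and commit them with the metadata as the message.

    Assumes repo_dir is an existing Git repository and paths are relative to it.
    """
    subprocess.run(["git", "add", "--", *paths], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
```

A scheduler such as cron could call `changed_since` every 15 minutes and invoke `check_in` only when the returned list is non-empty, so idle periods leave no empty commits behind.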

He called it "Flashbake", a neologism from my first novel, Down and Out in the Magic Kingdom. I was honored.

It's an incredibly rich — even narcissistic — amount of detail to capture about the writing process, but there's no reason not to capture it. It doesn't cost any more to capture all this stuff every 15 minutes than it would to capture a daily file-change snapshot at midnight without any additional detail. And since Git — and other source repositories — is designed to let you summarize many changes at a time (say, all the changes between version 1 and version 2 of a product), it's easy to ignore the metadata if it's getting in the way.

Now, this may be of use to some notional scholar who wants to study my work in a hundred years, but I'm more interested in the immediate uses I'll be able to put it to — for example, summarizing all the typos I've caught and corrected between printings of my books. Flashbake also means that I'm extremely backed up (Git is designed to replicate its database to other servers, in order to allow multiple programmers to work on the same file). And more importantly, I'm keen to see what insights this brings to light for me about my own process. I know that there are days when the prose really flows, and there are days when I have to squeeze out each word. What I don't know is what external factors may bear on this.
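The typo-summarizing use can be sketched without Git at all. The snippet below is an illustration only, with a hypothetical function name: it uses Python's standard `difflib` module to compare two versions of a passage word by word and list each run of words that changed. In practice the two texts would be pulled from two points in the repository's history.

```python
import difflib

def word_changes(old_text, new_text):
    """Return (old, new) pairs for every changed run of words.

    Pure insertions have an empty old side; pure deletions an empty new side.
    """
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((" ".join(old_words[i1:i2]),
                            " ".join(new_words[j1:j2])))
    return changes

first_printing = "three quirky little tassles from the fringes"
second_printing = "three quirky little tassels from the fringes"
for old, new in word_changes(first_printing, second_printing):
    print(f"{old!r} -> {new!r}")
```

Comparing by words rather than by lines suits prose, where a single paragraph is often one very long line and a line-based diff would flag the whole thing.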

In a year, or two, or three, I'll be able to use the Flashbake to generate some really interesting charts and stats about how I write: does the weather matter? Do I write more when I'm blogging more? Do "fast" writing days come in a cycle? Do I write faster on the road or at home? I know myself well enough to understand that if I don't write down these observations and become an empiricist of my own life, all I'll get are impressionistic memories that are more apt to reflect back my own conclusions to me than to inform me of things I haven't noticed.
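As a small taste of that kind of analysis, here is a sketch that turns a series of check-in snapshots into a net word count per day. The input format, a chronological list of (ISO timestamp, total word count) pairs, is an assumption made for illustration; extracting those pairs from the repository's history is not shown, and the sample numbers are invented.

```python
from collections import defaultdict
from datetime import datetime

def words_per_day(snapshots):
    """Sum the change in total word count between consecutive snapshots, by day.

    Each change is credited to the day of the later snapshot. The first snapshot
    only sets the baseline. Negative totals mean more was cut than written.
    """
    totals = defaultdict(int)
    previous = None
    for stamp, count in snapshots:
        if previous is not None:
            day = datetime.fromisoformat(stamp).date().isoformat()
            totals[day] += count - previous
        previous = count
    return dict(totals)

# Invented sample data, for illustration only.
SNAPSHOTS = [
    ("2009-05-08T23:45:00", 1000),
    ("2009-05-09T09:00:00", 1200),
    ("2009-05-09T09:15:00", 1500),
    ("2009-05-10T10:00:00", 1450),
]
print(words_per_day(SNAPSHOTS))
```

Joining these daily totals against the weather or blogging activity recorded in each check-in would be the next step toward the charts described above.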

Thomas has released Flashbake as free/open software that you can download and start tinkering with. As I said, it's not the kind of thing that an info-civilian will be able to start using without a lot of tinkering, but in the month I've used it, I've already found it to be endlessly fascinating and useful, and with enough interest, it's bound to get easier and easier.



OpenID blog said...

Excellent article! I'm particularly interested in the tag-driven notetaking. The first thing that springs to mind is more immediate interaction with a webapp that allows you to enter text notes as in a text editor, but automatically and immediately indexes them and makes them clickable.

Also, your tag cloud would look a whole lot cooler if you fed it through :)

May 10, 2009 4:15 AM  
Blogger Alouisius A. Arthur said...

Okay, you've got my attention. Flashbake is an interesting idea. I was invited into the home of a local bookstore owner. In the library, framed under glass, was a poem by Mark Strand, former US poet laureate. It contained two drafts of a poem he wrote. The first was handwritten and had extensive notes and corrections. The second was the finished work. I have to tell you that there is real knowledge to be learned by seeing how the masters do it. Maybe someday someone will want to know step-by-step how I did it. Kudos. Bill Ruesch

May 10, 2009 10:15 AM  
Blogger Dan said...

Tiddlywiki has a tag cloud plugin. Close?

May 10, 2009 2:37 PM  
Blogger Vuk Cosic said...

I want flashbake to (first ask in settings and then) make a picture of me each 15 minutes and then compile a movie of my changing moods while preparing some long project.
Actually I want my laptop to do that every minute and then make different movies relating to different things I do.
And a big cumulative photo across a year or the entire life.
Not soundtrack, not wallpaper, but moodtrack or facetrack.
No, better, Dorian Greyscale.

May 11, 2009 5:32 AM  
OpenID cwilliams11 said...

Cory: Until your idea of a donation site is developed: works to match donors with classroom and school library wishlists. Friends of Library USA (FOLUSA) has a help page, Hope this helps everyone!

Cosic: Makes me think of a personalized hack of with Though moods change rapidly so you'd want to break down your database of tagged video into smaller increments than minutes. A set-up to generate automatic recognition of your facial expressions to categories seems possible - perhaps by tracking shape of eye and mouth. Fun to think about!

May 11, 2009 8:54 AM  
Blogger Kevin Carson said...

You actually file taxes on PayPal contributions? For me, one of the advantages of the informal/household/barter economy is that it evades tribute to our corporate and state overlords, and all the increased costs of the conventional high-overhead economy.

May 12, 2009 3:02 PM  
Blogger bowerbird said...


i'd like to program my own version of some of these tools.
can you point to a notes-file that has the type of tags that
you talk about, so that i can use it as some sample content?



May 12, 2009 5:59 PM  
Blogger mutter mutter said...

Ooh! I love the hashtag-search script idea. I have been trying to do that with various programs: Tag2Find, which is good for tagging whole files (and has autocomplete, which is nice) but no good for a notecard (tweetcard?)-size piece of information. Zotero has tag autocomplete, searchable tags and automatic tag capture from previously-tagged documents but has an awful tag-editing interface. I had ultimately been left just depositing notes onto different running text files according to topic - this is a great solution, though.

May 14, 2009 5:19 AM  
Blogger Chris Noto said...

Cory, I love the ideas and processes that you are using in your work, and have also enjoyed your science fiction and your work at BoingBoing. [GRAMMAR POLICE ON]However, as a recovering Catholic, and almost-priest who later became a Presbyterian minister, I've gotta tell you that your usage of the word novitiate to indicate a novice, though permissible, is extremely rare. It is also somewhat jarring, for me, at least, and perhaps also for others with a similar background, to whom the word novitiate is usually used to denote the time during which a person carries the title of novice.[GRAMMAR POLICE OFF]

All the best,

May 23, 2009 1:50 PM  


© 2009 by Locus Publications. All rights reserved.