Open books

by James Somers, September 19, 2009

I would happily trade the world’s libraries, all its cozy coffee shops, and the feel of a hardback in my hands for on-demand access to the complete plain text of every book.

Without difficulty, I can find nearly every published academic paper written in the last fifty years—you cite it and, within minutes, I’m reading it. That should work for books, too.

I’d want the plain text in particular because it’s so easy to work with. It’s already small and compresses very efficiently; it’s fully searchable; everyone knows what to do with it; it can be copied, pasted, and annotated trivially; and it works everywhere.
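To make the "small, compressible, searchable" point concrete, here is a minimal sketch using only Python's standard library. The sample text is invented and artificially repetitive, so it compresses far better than real prose would; the `find_all` helper is likewise just an illustration, not anything a real book archive is known to use.

```python
import zlib

# A made-up stand-in for a book's plain text (not a real excerpt).
# It repeats one passage 200 times, so it compresses unrealistically well.
sample_text = (
    "It was a long chapter about libraries and the readers who love them. "
    "Every page repeated familiar words, as prose tends to do.\n"
) * 200

raw_bytes = sample_text.encode("utf-8")

# Compress at the highest zlib level (9).
compressed = zlib.compress(raw_bytes, 9)

# Compression is lossless: decompressing gives back the exact text.
restored = zlib.decompress(compressed).decode("utf-8")


def find_all(text, phrase):
    """Return the character offset of every occurrence of phrase in text."""
    offsets = []
    start = text.find(phrase)
    while start != -1:
        offsets.append(start)
        start = text.find(phrase, start + 1)
    return offsets


hits = find_all(restored, "libraries")

print(f"{len(raw_bytes)} bytes raw, {len(compressed)} bytes compressed")
print(f"{len(hits)} occurrences of 'libraries'")
```

Nothing here needs a special reader, a license server, or a file format newer than the 1970s, which is rather the point.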

Voracious book readers will tell you that, as things stand, “there is very little on the web but elementary expositions, provisional and forbidding technicalities, and rubbish.” Part of the reason is that there’s a lot you can do with hundreds of carefully wrought, tightly integrated pages that won’t work in the kinds of short forms encouraged on the Web. We will be cheating ourselves intellectually if reading books is even relatively inconvenient.

Would people stop writing if the text of their books were distributed this way? I interact with a dozen small-time authors every day, and yes, every one of them is just goddamn thrilled that their book is “in print,” which makes you think that they’re excited by the physical object. But really what excites them is having finished their work, which is hard, and earning the approval of people whose approval is hard to earn.

As an analogy, publishing on a blog is less satisfying than publishing in Vanity Fair because Vanity Fair is very selective and your blog is hardly selective at all. Getting tens of thousands of people to read your post, though, would probably do the trick just as well as a piece in VF or a similar magazine. Once books are broadcast instantaneously over the Web, and shared in kind among their readers, “hard” and “satisfying” tasks for authors won’t go away; they’ll just take on new forms.

What’s nice is that there are no technical roadblocks in the way. In fact, someone at Google could probably run a single MySQL query to finish the first 90% of the project. (UPDATE books SET full_text = 1;)

(One wonders why there has to be so much scanning. Haven’t authors been using word processors for like thirty years? How about TeX?)