the jsomers.net blog.

Compuchemical Economies

A commenter on Why The Law wondered whether we can "have a society whose output has equivalent K-complexity as ours, but is generated from simpler rules"—in effect seeking the simplest legal system that can sustain something like the U.S. economy.

The idea reminded me of two papers—"Economic Production as Chemistry," by Padgett, Lee, and Collier (2003), and "Evolution of Cooperative Problem-Solving in an Artificial Economy," by Baum and Durdanovic (2000)—because in each, the authors are trying to develop a minimal model of fruitful markets: what are the necessary conditions for productive collaboration? Which treatments of property and compensation generate stability, and which degrade? Must agents be intelligent? Altruistic?

Hypercycles

Padgett et al. imagine a grid-world with three basic components: firms, one in each cell; rules, which are simple procedures for transforming products, and which each firm receives as a random initial endowment; and the products themselves. So, for example, the cell at (2, 3) may have a rule C -> F that takes the product C as input from one neighboring firm and spits out an F to another. It looks something like this:

My quick-and-dirty drawing

A natural question to ask of such a world is whether it can sustain chain reactions, or coincidences where firm after firm spits out the very product that the next guy needs as input, as when a firm with rule A -> B happens to send off its B to a firm with rule B -> C.

In fact the model is deliberately set up with these chains in mind. Here are the rules of the game:

  1. Initially, technologies are distributed randomly among the firms, which are arrayed on a large grid with wraparound—so that every firm, even if it's on an "edge," has eight potential trading partners.
  2. In each "round" of the model (which proceeds in asynchronous discrete steps):
    • A random rule is chosen or "activated."
    • The firm that owns this rule grabs a product from its "input environment," modeled as an urn containing a bunch of products.
    • If the product it chooses is an input for the activated rule (e.g., if the product is a C and the rule is C -> Z), the rule does its thing and the output of its transformation is passed randomly onto one of the firm's neighboring trading partners.
    • Otherwise, the input product is sent to an output environment (also modeled as an urn).
  3. The round's action continues as firms receive their neighbors' output. This step is important: if you've just successfully transformed A into Z, and pass the Z onto me, and I have a Z -> V rule, then we've just engaged in a "successful transaction."
  4. Every time there is a successful transaction, one of the rules involved (in our example above, that would be one of A -> Z or Z -> V) is reproduced. It turns out that the behavior of the overall model depends a great deal on which one of the source or destination rules is copied. (See below.)
  5. To encourage competition, every time one rule is reproduced, another one, chosen randomly, is killed off. So the total number of rules in play is held constant. This selection pressure, combined with the ability for rules to reproduce, is what keeps the world from degrading into an unproductive mess. (Again, more on this below.)
  6. Firms that run out of rules die or "go bankrupt." Products sent their way are immediately directed to the output environment.
  7. Firms continue to pass products around until one of these lands on a firm that doesn't have any compatible rules (e.g., if a D shows up at a firm that only has Z -> F, E -> F, and C -> D). At that point, the product is ejected into the output urn and a new rule is randomly chosen, as in step 2. (Meanwhile, and here's the "asynchronous" part, other firms are passing around other products. So the whole model doesn't reset when a single chain comes to an end.)
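To make the rules above concrete, here is a minimal Python sketch of one round. It is a simplification under stated assumptions, not the authors' code: products are the integers 0 to 3, every rule has the form p -> (p + 1) mod 4, the input urn is a uniform random draw, and chains run one at a time rather than asynchronously. All names (`make_world`, `run_round`, `mode`, and so on) are hypothetical.

```python
import random
from collections import Counter

N_PRODUCTS = 4   # assumption: products 0..3; rule p means "p -> (p + 1) % N_PRODUCTS"
SIZE = 5         # SIZE x SIZE grid with wraparound

def neighbors(firm):
    """The eight wraparound neighbors of a grid cell."""
    r, c = firm
    return [((r + dr) % SIZE, (c + dc) % SIZE)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def make_world(rules_per_firm, rng):
    """Give every firm a random endowment: a Counter of rule input -> copies."""
    return {(r, c): Counter(rng.randrange(N_PRODUCTS) for _ in range(rules_per_firm))
            for r in range(SIZE) for c in range(SIZE)}

def all_rules(world):
    """One (firm, rule) entry per rule copy currently alive."""
    return [(firm, rule) for firm, rules in world.items()
            for rule, copies in rules.items() for _ in range(copies)]

def run_round(world, mode, rng, max_chain=1000):
    """Activate one random rule, follow its chain, return the transaction count."""
    firm, rule = rng.choice(all_rules(world))
    if rng.randrange(N_PRODUCTS) != rule:
        return 0                                  # wrong input drawn: off to the output urn
    transactions = 0
    for _ in range(max_chain):
        product = (rule + 1) % N_PRODUCTS         # the rule transforms its input
        target = rng.choice(neighbors(firm))      # pass the output to a random neighbor
        if world[target][product] == 0:
            break                                 # no compatible rule (or bankrupt firm)
        transactions += 1
        # Reproduce one side of the successful transaction...
        if mode == "source":
            world[firm][rule] += 1
        else:
            world[target][product] += 1
        # ...then kill one random rule so the total stays constant.
        dead_firm, dead_rule = rng.choice(all_rules(world))
        world[dead_firm][dead_rule] -= 1
        if world[target][product] == 0:
            break                                 # the receiving rule itself was killed
        firm, rule = target, product
    return transactions
```

Calling `run_round(world, "source", rng)` or `run_round(world, "target", rng)` in a loop switches which side of each transaction is copied (step 4); the total number of rule copies never changes (step 5).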

So how does anything interesting happen in this world? How do firms even survive? The short answer is "loops":

In this model, the minimal requirement for long-term survival, both of firms and of clusters, is to participate in at least one spatially distributed production chain that closes in on itself, to form a loop. Not all production chains within a trading cluster need be closed into loops. And more than one loop within a cluster is possible, in which case we may have a dense hypercyclic network of spatially distributed production rules. But loops within distributed chains of skill are crucial, not for production itself, but for the competitive reproduction of skills. Loops set up positive feedbacks of growth in skills that give firms that participate in them the reproductive ability to out-produce firms that do not. Put another way, clusters of firms can produce products with or without hypercycles, but firms whose skill sets participate in production chains that include loops have the further capacity to keep renewing each other through time. This is the chemical definition of life.

So the key is to promote these stable hypercycles. And it turns out that for loop maintenance, the model's most important ingredient is one touched on in step #4 above—the choice between "source" and "target" reproduction:

In the spatial topology setting, there are two variants of “learning by doing” that can and will be explored:

  • “Source reproduction” is where the originating rule in a successful transaction is reproduced.
  • “Target reproduction” is where the receiving rule in a successful transaction is reproduced.

For example, if A -> B receives an A from the input environment, transforms it into a B, and then successfully passes that B onto a neighboring B -> C, who transforms it again, then “source reproduction” is where the initiating A -> B reproduces, and “target reproduction” is where the recipient B -> C reproduces. Variation in mode of reproduction thus defines who benefits from the transaction.

We think of source reproduction as “selfish learning,” because the initiator of the successful transaction reaps the reward (like a teacher). And we think of target reproduction as “altruistic learning,” because the recipient of the successful transaction reaps the reward (like a student).

So which works better?

...in comparison with source reproduction, target reproduction dramatically increases the likelihood of growing stable hypercycles. And it also increases the spatial extensiveness and complexity of the firm cluster that hypercycles produce.

And why?

As explained in Padgett (1997), the basic reason for this superiority of target reproduction, or altruistic learning, is repair. Target reproduction combats dynamic instability in a way that source reproduction does not. The basic process of dynamic instability, causing hypercycles to crash, is that as one skill reproduces rapidly, under competition other skills are driven to zero, thereby breaking the reproductive loop of skills. Spatial topology distributes this dynamic into an overlapping series of neighborhoods, thereby inducing local heterogeneity. This opens the door for localized co-adaptive feedbacks to operate. But source reproduction, or selfish learning, does not really attack the basic dynamic instability itself. Source reproduction is this: an initial activated rule passes on its transformed product to a neighboring compatible rule, causing the original activated rule to reproduce. Under source reproduction, frequently activated rules reproduce more frequently, often eventually driving out of business even compatible neighbors on whom they depend for their own survival. As we shall see in the next section, other factors like endogenous environment can sometimes moderate this destructive dynamic, but source reproduction in and of itself does not eliminate the non-spatial instability problem.

In contrast, target reproduction is this: an initial activated rule passes on its transformed product to a neighboring compatible rule, causing the recipient rule to reproduce. Here the more frequently the initial rule is activated the more frequently the second recipient rule reproduces. In this way, a hypercycle can repair itself: as the volume of one skill in a loop gets low, high volumes of compatible skills in neighboring firms reach in to that low-volume skill to build it back up. Peaks and valleys along loops are smoothed.

To help get a grip on this distinction, it's worth asking whether there are any real-world examples of "altruistic reproduction." These would be domains in which the output of "upstream" firms promotes the growth of other firms down the line, thereby creating a rich self-sustaining production chain.

This seems true of many industries but especially true of information technology, where small software companies create huge markets (for themselves and others) by developing cheap, portable code for large enterprises. Even open source components—like Linux or MySQL—accelerate industry growth and feed opportunities back to software providers in the form of support contracts and custom coding projects. The result is symbiotic.

(In contrast, one could imagine how—in the kind of closed world explored in this paper—the success of something like a casino would have the perverse effect of draining its own lifeblood, by siphoning the bankrolls of active gamblers. In that way it could be construed as an example of "selfish (source) reproduction.")

The Hayek Machine

The immediate goal of Baum and Durdanovic is to find a way to solve computational problems with large state spaces, like a Rubik's cube. Their approach is to kickstart an economy of modules (little bits of computer code) that dynamically combine into complicated chains. If these chains were somehow arranged to exploit the structure of the problem at hand, the system would be able to limit its search to a tiny fraction of the space of possible moves.

Intelligent computation of that sort is easier said than done, and so most of the paper is devoted to exploring precisely which features of the module "economy" improve performance, and why.

Before getting into that, though, it might help to explore the simplest of the three problems under attack, called "Blocks World." With that under our belt it should be easier to understand the model more generally.

An instance of Blocks World has four stacks (0, 1, 2, and 3) of colored blocks. The leftmost stack (0) is your template—you never move any of its blocks as you play. Instead, your goal is to rearrange the blocks in the other three stacks—with the important caveats that (a) you can only grab the topmost block off a given stack and (b) you can only pick one up at a time—until stack 1 matches the template.
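As a rough illustration of those rules, here is a small Python sketch of the game state. The class and method names are hypothetical, and treating "matches the template" as exact equality between stack 1 and stack 0 is an assumption.

```python
class BlocksWorld:
    """Four stacks of colored blocks; stack 0 is the fixed template."""

    def __init__(self, stacks):
        self.stacks = [list(s) for s in stacks]   # each stack is listed bottom-to-top
        self.hand = None                          # at most one block is held at a time

    def grab(self, i):
        """Pick up the topmost block of stack i (1, 2, or 3). True on success."""
        if i not in (1, 2, 3) or self.hand is not None or not self.stacks[i]:
            return False
        self.hand = self.stacks[i].pop()
        return True

    def drop(self, i):
        """Put the held block on top of stack i (1, 2, or 3). True on success."""
        if i not in (1, 2, 3) or self.hand is None:
            return False
        self.stacks[i].append(self.hand)
        self.hand = None
        return True

    def solved(self):
        """Assumption: solved means stack 1 equals the template exactly."""
        return self.hand is None and self.stacks[1] == self.stacks[0]


world = BlocksWorld([["r", "g", "b"], ["r"], ["b", "g"], []])
for stack in (2, 2):
    world.grab(stack)      # take the top block of stack 2...
    world.drop(1)          # ...and place it on stack 1
```

In this tiny instance the two blocks on stack 2 happen to come off in the right order, so two grab/drop pairs finish the job; harder instances need detours through stack 3.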

Here's a picture:

blocks world

The "Hayek" referred to in the caption is the name of the economics-inspired computer program developed in the paper. It is of course an homage to the great Austrian economist Friedrich von Hayek, and it works as follows:

  1. The program is a collection of modules, or mini computer programs, each with an associated "wealth." The authors think of these modules as agents because they can interact with the world (the game).
  2. "The system acts in a series of auctions. In each auction, each agent simulates the execution of its program on the current world and returns a nonnegative number. This number can be thought of as the agent's estimate of the value of the state its execution would reach. The agent bids an amount equal to the minimum of its wealth and the returned number. The solver with the highest bid wins the auction. It pays this bid to the winner of the previous auction, executes its actions on the current world, and collects any reward paid by the world, as well as the winning bid in the subsequent auction. Evolutionary pressure pushes agents to reach highly valued states and to bid accurately, lest they be outbid." (You might want to re-read the preceding paragraph a few times, until you get a handle on exactly how the auctions work. It's not complicated—it's just that there's a lot of procedure packed into a few tight sentences.)
  3. "In each auction, each agent that has wealth more than a fixed sum 10 W_init creates a new agent that is a mutation of itself. Like an investor, the creator endows its child with initial wealth W_init and takes a share of the child's profit. Typically we run with each agent paying one-tenth of its profit plus a small constant sum to its creator, but performance is little affected if this share is anywhere between 0 and .25. Each agent also pays resource tax proportional to the number of instructions it executes. This forces agent evolution to be sensitive to computational cost. Agents are removed if, and only if, their wealth falls below their initial capital, with any remaining wealth returned to their creator. Thus, the number of agents in the system varies, with agents remaining as long as they have been profitable.

    "This structure of payments and capital allocations is based on simple principles (Baum, 1998). The system is set up so that everything is owned by some agent, interagent transactions are voluntary, and money is conserved in interagent transactions (i.e., what one pays, another receives) (Miller & Drexler, 1988). Under those conditions, if the agents are rational in that they choose to make only profitable transactions, a new agent can earn money only by increasing total payment to the system from the world. But irrational agents are exploited and go broke. In the limit, the only agents that survive are those that collaborate with others to extract money from the world.

    "When not everything is owned, or money is not conserved, or property rights are not enforced, agents can earn money while harming the system, even if other agents maximize their profits. The overall problem, then, cannot be factored because a local optimum of the system will not be a local optimum of the individual agents" [emphasis in original].

What happens when this system is let loose on Blocks World?

The population contains 1000 or more agents, each of which bids according to a complex S-expression that can be understood, using Maple, to be effectively equal to A * NumCorrect + B, where A and B are complex S-expressions that vary across agents but evaluate, approximately, to constants. The agents come in three recognizable types. A few, which we call "cleaners," unstack several blocks from stack 1, stacking them elsewhere, and have a positive constant B. The vast majority (about 1000), which we call "stackers," have similar positive A values to each other, small or negative B, and shuffle blocks around on stacks 2 and 3, and stack several blocks on stack 1. "Closers" bid similarly to stackers but with a slightly more positive B, and say Done.

At the beginning of each instance, blocks are stacked randomly. Thus, stack 1 contains about n/3 blocks, and one of its lower blocks is incorrect. All agents bid low since NumCorrect is small, and a cleaner whose B is positive thus wins the auction and clears some blocks. This repeats for several auctions until the incorrect blocks are cleared. Then a stacker typically wins the next auction. Since there are hundreds of stackers, each exploring a different stacking, usually at least one succeeds in adding correct blocks. Since bids are proportional to NumCorrect, the stacker that most increases NumCorrect wins the auction. This repeats until all blocks are correctly stacked on stack 1. Then a closer wins, either because of its higher B or because all other agents act to decrease the number of blocks on stack 1 and thereby reduce NumCorrect. The instance ends successfully when this closer says Done. A schematic of this procedure is shown in Figure 3 [below].

figure 3

This NumCorrect that they keep referring to is a hand-coded module that counts the number of blocks in stack 1 (contiguous from the bottom up) that match the template. It's a good proxy for performance, and the measure is critical in helping the agents to know whether they're contributing value to the world (i.e., whether the program is on its way toward a solution).
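Going by that description, a NumCorrect-style counter might look like the following Python sketch. The function name and the bottom-to-top list representation are assumptions.

```python
def num_correct(template, stack):
    """Count blocks in `stack` that match `template`, contiguous from the bottom up."""
    count = 0
    for wanted, actual in zip(template, stack):
        if wanted != actual:
            break          # the first mismatch ends the contiguous run
        count += 1
    return count
```

For example, with template `["r", "g", "b", "r"]` and stack `["r", "g", "r", "b"]`, the count stops at 2: the third block is wrong, so nothing above it is counted.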

If NumCorrect is left out of the system, then, performance degrades. Although it's possible for Hayek to evolve its own quick-and-dirty version, with this mere approximation it can only solve instances about 1/4 the size of those it handles when NumCorrect comes hand-coded.

It's worth asking, then: how does code, be it NumCorrect or something simpler like Grab(i,j), "evolve" in the first place?

Agents are composed of S-expressions, or recursively-built trees of procedures. You can think of these as sort of like recipes:

recipe S-expression; idea courtesy of nikhil

The difference here, of course, is that whereas in the Recipe World you might have "sauté" or "chop," in Blocks World you have things like "grab" or "drop" or "look":

Our S-expressions are built out of constants, arithmetic functions, conditional tests, loop controls, and four interface functions: Look(i,j), which returns the color of the block at location i, j; Grab(i) and Drop(i), which act on stack i; and Done, which ends the instance. Some experiments also contain the function NumCorrect, and some contain a random node R(i, j), which simply executes branch i with probability 1/2 and j with probability 1/2. [This last randomizer function has the effect of "smoothing" the fitness landscape.]

All our expressions are typed, taking either integer, void, color, or boolean values. All operations respect types so that colors are never compared to integers, for example. The use of typing semantically constrains mutations and thus improves the likelihood of randomly generating meaningful and useful expressions.
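To see how typing can rule out meaningless mutations, here is a small Python sketch of a type checker over S-expressions written as nested tuples. The tuple encoding, the operator names, and the `SIGNATURES` table are all assumptions made for illustration; the paper's actual representation may differ.

```python
# Hypothetical signatures: operator -> (argument types, result type).
SIGNATURES = {
    "add":  (("int", "int"), "int"),
    "look": (("int", "int"), "color"),   # Look(i, j) returns a color
    "grab": (("int",), "void"),
    "drop": (("int",), "void"),
    "done": ((), "void"),
}

def type_of(expr):
    """Return the type of an expression tree, or raise TypeError if ill-typed."""
    op, *args = expr
    if op == "const":                    # ("const", 3) is an integer literal
        return "int"
    arg_types = tuple(type_of(a) for a in args)
    if op == "eq":                       # equality only between like types
        if len(arg_types) != 2 or arg_types[0] != arg_types[1]:
            raise TypeError(f"cannot compare {arg_types}")
        return "bool"
    if op == "if":                       # (if test then else), branches must agree
        if len(arg_types) != 3 or arg_types[0] != "bool" or arg_types[1] != arg_types[2]:
            raise TypeError(f"bad conditional {arg_types}")
        return arg_types[1]
    expected, result = SIGNATURES[op]
    if arg_types != expected:
        raise TypeError(f"{op} expects {expected}, got {arg_types}")
    return result


same_color = ("eq", ("look", ("const", 1), ("const", 0)),
                    ("look", ("const", 0), ("const", 0)))
program = ("if", same_color, ("done",), ("grab", ("const", 1)))
mutant = ("eq", ("look", ("const", 1), ("const", 0)), ("const", 2))
```

Here `program` type-checks as a void action, while `mutant`, which compares a color to an integer, is rejected before it ever runs. A mutation operator could use a check like this to discard such candidates cheaply.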

It's hard to overstate how cool this process is. Out of a soup of random S-expressions (primitive codelets like adders and subtracters, if...then statements, loops, and "look" / "grab" / "drop" modules) it's possible for coherent strategic "agents" to evolve. On its own this idea is incredibly powerful: one can imagine all sorts of programs being developed automatically in this way, growing increasingly powerful by mutating and recombining useful snippets of code. Indeed, there's a whole field devoted to this approach to programming.

But what's even more spectacular is that these evolved computational lifeforms (of a sort) cooperate—via the auction process—in long computational chains, even when this requires that the initiating agents defer their rewards.

This last point is critical. It's not always obvious what the game's next best move may be, and often moves that are in the interest of long-term progress are (or look) counterproductive in the short term, as when you throw a bunch of incorrect blocks onto stack 1 to fish out an important block on stack 2. So it becomes basically impossible to solve any but the easiest Blocks World instances without the ability to "delay gratification," that is, to link up lots of low-or-no-reward actions into chains that collectively work toward some bigger reward at their end.

The only way for this to work is to enforce the rules specified earlier:

A program cannot hope to form long chains of agents unless conservation of money is imposed and property rights are enforced. To evolve a long chain, where the final agent achieves something in the world and is paid by the world, reward has to be pushed back up the long chain so that the first agents are compensated as well. But pushing money up a long chain will not succeed if money is being created or is leaking, or if agents can steal money or somehow act to exploit other agents. If money is leaking, the early agents will go broke. If money is being created or being stolen, the system will evolve in directions to exploit this rather than creating deep chains to solve the externally posed problems. Solving the world's problems is very hard, and an evolutionary program will always discover any loophole to earn money more simply. [Eric Baum, What Is Thought?, p. 246.]

That's also why it's so important for auction winners in this round to pay their bids to winners from the last round. That way, those agents who most help folks downstream are able to collect larger bid payments—which helps them reproduce and stay alive—and, if their work consistently leads to large rewards down the line, their downstream partners will also reproduce (having collected money from the world) and set up yet more collaboration. It's exactly the same principle that we saw in the hypercycles world.
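The payment rule is easy to lose in prose, so here is a minimal Python sketch of the auction loop. It leaves out mutation, taxes, and agent removal, and it assumes that the very first winner pays nobody (there is no previous winner to pay). The `Agent` fields and the two toy agents are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class Agent:
    name: str
    wealth: float
    estimate: Callable[[int], float]           # state -> value the agent expects to reach
    act: Callable[[int], Tuple[int, float]]    # state -> (new state, reward from the world)

def run_auctions(agents, state, n_auctions):
    """Run a series of auctions; return the final state and what the world paid out."""
    previous_winner = None
    world_paid = 0.0
    for _ in range(n_auctions):
        # Each agent bids the minimum of its wealth and its (nonnegative) estimate.
        bids = [(min(a.wealth, max(0.0, a.estimate(state))), a) for a in agents]
        bid, winner = max(bids, key=lambda pair: pair[0])
        if previous_winner is not None:        # assumption: the first winner pays nobody
            winner.wealth -= bid
            previous_winner.wealth += bid      # money moves, it is never created here
        state, reward = winner.act(state)
        winner.wealth += reward                # only the world adds money
        world_paid += reward
        previous_winner = winner
    return state, world_paid


# Toy chain: "setup" earns nothing from the world but enables "finisher".
setup = Agent("setup", 5.0, lambda s: 1.0 if s == 0 else 0.0, lambda s: (1, 0.0))
finisher = Agent("finisher", 5.0, lambda s: 8.0 if s == 1 else 0.0,
                 lambda s: (2, 10.0) if s == 1 else (s, 0.0))
final_state, world_paid = run_auctions([setup, finisher], state=0, n_auctions=2)
```

In this toy run the setup agent never touches the world's reward, yet it ends up richer because the finisher's winning bid is paid back to it. Total wealth rises by exactly what the world paid out.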

Conclusion

To return to the comment that kicked off this whole discussion, it now seems clearer what sorts of laws one minimally needs to sustain a rich economy. The trick is to encourage long, dynamically stable collaborative chains, and to do so requires mechanisms for transferring the rewards of productive activity to everyone who contributed; otherwise, agents who make the kind of short-term sacrifices that fuel deep cooperation will die out, and only the shallowest computation (i.e., production) will be possible.

In the hypercycles model, this was achieved by connecting reproductive success to successful transactions. Target reproduction in particular ensures that successful firms don't "burn out" their own partners.

And in the Hayek machine, property rights and conservation of money ensure that (a) the only way to earn money oneself is to create wealth for the world and that (b) money trickles back along productive chains to every contributor. [1]

These, then, are the deep compuchemical reasons that such a substantial portion of the U.S. legal system is devoted to enforcing contracts and protecting property rights. Such laws are the bedrock of our—or any—economy.

Notes

[1] Although it may seem that in these models payments go in opposite directions, they actually don't—even under a regime of "target" reproduction, hypercycles loop back on themselves, which means that A's altruistic contribution to B's reproductive success ends up helping A in the end, precisely because the two are wound up in a circle. Target reproduction works better because it keeps such circles alive, and thus allows payments to continue flowing appropriately.

What it used to be like to look things up

Yesterday a paper I was reading made reference to a "Galtonian composite photograph." From the context I had a vague idea of what the phrase referred to, but I wanted to learn more. So I:

  1. Googled "Galton" and clicked on the first result, the Wikipedia page for Sir Francis Galton.
  2. Searched the text there for "composite," which turned up the following:
    Galton also devised a technique called composite photography, described in detail in Inquiries in human faculty and its development...
  3. Googled once again for "Inquiries in human faculty and its development" (no quotes), which turned up a complete Google Books entry.
  4. Searched inside that text for "composite," and found, finally, "Appendix A.—Composite Portraiture," which contained everything I might want to know on the subject.

The whole process took about two minutes.

I couldn't help but wonder, though, what my search would have looked like forty years ago, long before the Internet and the proliferation of personal computers. How would I have traced a casual allusion to its source?

Step 1: Go to the library

Short of phoning a friend, there would have been no way to look something up without going to some sort of library. If I lived in the sticks, my best hope would have been a wealthy resident's set of encyclopedias or some sort of bookmobile—essentially a cart that hauled boxes of books around to small towns. For any remotely obscure topic, I would have been out of luck.

Step 2: Card catalogs and the metasearch

Once I got to my local library, probably funded by Carnegie, my first task would have been to figure out where to start looking. It's a process that has been mostly obviated by search engines, whose crawling and indexing and relevance rankings allow one to scour the whole world at once, without much regard to where the info comes from.

Back in the day, though, I would have had to start a sort of metasearch to find the right books, periodicals, journals, or microfilms in which to begin my actual search.

Luckily, by the 1960s these materials were indexed exhaustively in large card catalogs, which mapped author names, titles, subject headings, or keywords to the precise coordinates of actual items.

Not so luckily, I would have had to know some author name, title, subject heading, or keyword to look for. In the case of my query above, "Galtonian composite photograph," I would have had several plausible angles of attack: the name "Galton, Sir Francis," which, if I didn't know it, I could discover in an encyclopedia; the key words "composite photograph"; or the subject heading "photography."

But what if I had no context for my query? For example, what if I ran into a sentence like, "He resembled Little Chandler, if not in size then in stature"? Clearly this is a reference to some moderately famous person or character, but without more to go on, how could I figure out what to look for in the card catalog?

It depended. If I suspected that this was a real person, I could have searched the Encyclopaedia Britannica, something like the Dictionary of National Biography (for dead people), or a Who's Who (for living people). Once I found my man, I could pick up a biography or two.

Finding a fictional character would have been more difficult, but still possible. There was actually something called the "Cyclopedia of Fictional Characters," or I could have consulted the once-apparently-indispensable "Benét's Reader's Encyclopedia," which in addition to characters also catalogs author names, novels, stories, literary terms, etc. If I had gotten a handle on where my character first appeared, it would then have been a matter of picking up that particular book. Problem solved.

Unfortunately, it turns out that "Little Chandler" doesn't appear in any of these. Which means I would have had to ask a librarian, who might have some tricks up his sleeve (Dewey, of the Decimal System, once said, "...the librarian must be the index of indexes"), or a very well-read friend, or maybe a professor of literature.

Step 3: The deep crazy world of indexes

Supposing I did have some success with the card catalog—as I would if I were searching for "Galtonian composite photography" instead of "Little Chandler"—I would now be sitting at a table with a large stack of reference works, regular books, a newspaper microfilm, and maybe a few journal indexes, which are like regular indexes, but span the full output of hundreds or thousands of journal issues.

Indexes, by the way, are curious things. They don't work the same way as Google's "index," which is, in technical parlance, really a concordance, because it maps every word to either a snippet or the full text of the page on which it originally appears. Concordances, because they are such a pain to compile by hand, were really only created for stuff like the Bible or the works of Shakespeare. (Example.)
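For a sense of what "every word" means, here is a toy concordance builder in Python. It is only a sketch of the idea, not a claim about how any search engine works, and the function name and tokenizing regex are assumptions.

```python
import re
from collections import defaultdict

def concordance(lines):
    """Map every word to the (line number, line text) pairs in which it appears."""
    entries = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        # set() so a word repeated within one line is listed once for that line
        for word in sorted(set(re.findall(r"[a-z']+", line.lower()))):
            entries[word].append((number, line))
    return dict(entries)


text = ["To be, or not to be", "That is the question"]
table = concordance(text)
```

Here `table["be"]` points back to line 1 and `table["question"]` to line 2. Nothing is left out, which is exactly what makes a concordance tedious to compile by hand and trivial for a machine.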

Real indexes (don't say "indices") only contain a small, carefully chosen subset of all possible terms. They're put together by professional indexers who work for publishers on a contractual basis. (Some authors do write their own indexes, but this is apparently frowned upon by the American Society for Indexing and one of Kurt Vonnegut's characters.)

Theirs is difficult work. Indexers must take care to use a controlled vocabulary, i.e., a set of canonical headings that "solve the problems of homographs, synonyms and polysemes by a bijection between concepts and authorized terms." The idea is to avoid multiple entries like "Cats, 50-62" and "Felines, 175-183", and to make sure not to confuse the two meanings of words like "pool" (think swimming and billiards).
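A controlled vocabulary is, at bottom, a lookup table from variant terms to one authorized heading. The Python sketch below shows the merging step; the `AUTHORIZED` table and its headings are made up for illustration, and pages are given as simple lists rather than ranges.

```python
from collections import defaultdict

# Hypothetical authority file: each variant term maps to exactly one heading.
AUTHORIZED = {
    "cats": "Cats",
    "felines": "Cats",
    "pool (swimming)": "Swimming pools",
    "pool (billiards)": "Billiards",
}

def build_index(raw_entries):
    """Merge (term, pages) pairs under their authorized headings."""
    index = defaultdict(list)
    for term, pages in raw_entries:
        heading = AUTHORIZED[term.lower()]     # a KeyError flags an unvetted term
        index[heading].extend(pages)
    return {heading: sorted(pages) for heading, pages in index.items()}


merged = build_index([("Felines", [175, 183]), ("Cats", [50, 62])])
```

The two synonymous entries collapse into a single "Cats" heading, and the ambiguous "pool" can only enter the index once someone has decided which meaning is intended.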

They must compose a semantically sensible and reliable syndetic structure, which is the complex network of inter-index references ("See also...") that link related topics. It's important to avoid circularity (A -> B -> A) and to consistently connect semantically "close" entries (i.e., if "General Patton" points to "WWII", you probably also want "General MacArthur" to point there). To do so requires a fairly detailed understanding of the book and its subject matter: Which topics are related? Are these four ideas subunits of some general concept? Is X important enough to index?
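Checking a cross-reference network for circularity is a standard graph problem. Below is a minimal Python sketch using depth-first search. The dictionary representation (heading -> list of headings it points to) and the function name are assumptions.

```python
def find_cycle(see_also):
    """Return one circular chain of cross-references as a list, or None."""
    IN_PROGRESS, FINISHED = 1, 2
    status = {}
    path = []

    def visit(heading):
        status[heading] = IN_PROGRESS
        path.append(heading)
        for target in see_also.get(heading, ()):
            if status.get(target) == IN_PROGRESS:
                # target is already on the current path: we have looped back
                return path[path.index(target):] + [target]
            if target not in status:
                found = visit(target)
                if found:
                    return found
        path.pop()
        status[heading] = FINISHED
        return None

    for heading in see_also:
        if heading not in status:
            found = visit(heading)
            if found:
                return found
    return None


circular = {"Cats": ["Felines"], "Felines": ["Cats"]}
sound = {"General Patton": ["WWII"], "General MacArthur": ["WWII"]}
```

On `circular` the function reports the loop Cats -> Felines -> Cats; on `sound` it returns `None`, since both generals point to "WWII" and nothing points back.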

Finally, good indexers have to put themselves in the reader's shoes. What sorts of things will someone new to the subject be looking for? If they want to know more about X, what word will pop into their head? Will they care to see the reference on p. 124, or is it too cursory to be included?

There are whole journals devoted to the subject. In fact, if you want to see a great index, presumably you couldn't do much better than the index to the journal The Indexer. (Among other things, there is a section in each issue devoted to "praising" and "censuring" indexes found and submitted by readers. Down the rabbit hole...)

Step 4: Read, read, read

In any case, to return to my hypothetical 1960s search, I would now be (reverently) perusing the indexes of each item in my stack. I'd be looking for words like "Galton" and "composite," and I'd quickly skim the text of every reference until I found something promising. If yet more books were mentioned—or if I found some encouraging titles in a bibliography—I'd dig them up as well. My stack would slowly grow, and recede, and grow again, as I discarded dead ends and pursued new leads.

In contrast to how things work today, I wouldn't just be reading the stuff most directly relevant to my query. I'd be tempted by all sorts of tangents along the way: unusual or obscure topics, or hilariously terrible writing, or some new fantastic author I'd never yet encountered. In the search for truth I'd be taking the scenic route—longer, sure, but maybe more rewarding.

The Role of Deliberate Practice in the Acquisition of Expert Performance

I have just finished reading a famous paper by Ericsson, Krampe, and Tesch-Römer called "The Role of Deliberate Practice in the Acquisition of Expert Performance" (1993, Psychological Review).

The paper's key claim is that performance—be it in chess, or swimming, or violin—is a monotonic function of accumulated deliberate practice. More deliberate practice equals better performance.

So what is "deliberate" practice, and how is it different from the regular kind?

The most cited condition concerns the subjects' motivation to attend to the task and exert effort to improve their performance. In addition, the design of the task should take into account the preexisting knowledge of the learners so that the task can be correctly understood after a brief period of instruction. The subjects should receive immediate informative feedback and knowledge of results of their performance. The subjects should repeatedly perform the same or similar tasks.

So there should be an "active search for methods to improve performance," immediate informative feedback, structure, supervision from an expert, and "close attention to every detail of performance 'each one done correctly, time and again, until excellence in every detail becomes a firmly ingrained habit.'" Deliberate practice is demanding and can be quickly exhausting. It's usually not enjoyable in its own right.

Here's an example from chess:

In informal interviews, chess masters report spending around 4 hr a day analyzing published chess games of master-level players. Selecting the next moves in such games provides an informative learning situation in which players compare their own moves against those selected in an actual game. A failure to select the move made by the chess masters forces the chess players to analyze the chess position more carefully to uncover the reasons for that move selection. There exists also a large body of chess literature in which world-class chess players explicitly comment on their games and encyclopedias documenting the accumulated wisdom on various types of chess openings and middle-game tactics and strategies. An examination of biographies of world-class chess players... shows, contrary to the common belief that chess players have developed their chess skills independently, that these elite players have worked closely with individuals... who explicitly taught them about chess and introduced them to the literature.

Now that youngsters have access to cheap and super-powerful chess software, in which every position can be meticulously analyzed and moves can be compared across vast historical databases, it's no wonder that grandmasters are getting better faster. Structured deliberate practice is now easier to come by.

(As it is in poker. All these kids playing online can not only play many orders of magnitude more hands than their veteran predecessors, but they also—like chess players—have continuous access to analytics and hand histories which make perfect fodder for intense study. They can play at a dozen tables simultaneously and work hard on various techniques, situations, and methods, all the while collecting data and keeping track of metrics like percentage of limp-ins, percentage of hands played from the small blind, percentage of flops seen, etc., for themselves and everyone else at the table.)

Anyway, that 4-hour figure cited above is actually quite common. In fact, the profile of top performers in every field the paper surveyed is remarkably consistent:

  • They start practicing seriously at around 8 years old (sometimes younger), usually after showing unusual "promise" or interest.
  • They seek out and work individually with a handful of mentors or teachers.
    ...it is generally recognized that individualized supervision by a teacher is superior. Research in education reviewed by Bloom (1984) shows that when students are randomly assigned to instruction by a tutor or to conventional teaching, tutoring yields better performance by two standard deviations.
  • In addition to coaches / trainers / tutors, they regularly compete on a local or regional level. Doing well in these competitions validates their and their parents' expense (of time, money for transportation / travel, etc.). So intense practice continues.
  • The duration and frequency of practice sessions gradually increase. (If you immediately started training at an expert's pace, you'd burn out.) Top performers practice about 4 hours per day in 80-120 minute sessions. There are decreasing marginal returns after the first hour and a half of a session.
  • Athletes work most intensely in the mid-afternoon. Scientists and novelists almost uniformly prefer the morning. These choices make sense from a biophysical perspective.
  • Experts are totally immersed. In addition to practice, they spend 50-60 hours per week on domain-related activities, like lessons, competitions, study, group practice / performance, individual play / performance, etc.
  • Before producing their best work, they need to have completed about 10 years or 10,000 hours of deliberate practice.

It's hard to overstate the importance of an early start. It's simply not possible for late bloomers to catch up, since they won't even be able to practice as much as the elites who started early:

...it is impossible for an individual with less accumulated practice at some age to catch up with the best individuals, who have started earlier and maintain maximal levels of deliberate practice not leading to exhaustion. As noted earlier, the amount of possible practice appears to slowly increase with accumulated practice and skill. Hence, individuals intent on catching up may suddenly increase the amount of deliberate practice... Within months these individuals are likely to encounter overuse injuries and exhaustion... Furthermore, the difference in accumulated deliberate practice in late adolescence for the good and best violinists [the two groups in one study, the latter of whom started earlier] is remarkably large and to eliminate this difference the good violinists would have to practice an additional 5 h per week beyond their current optimal level of weekly practice for more than 8 full years.
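
To get a feel for the size of these numbers, here is a rough back-of-the-envelope sketch. The function names are my own, and the 52-week and 365-day years are simplifying assumptions; the 5 hours per week, 8 years, and 10,000 hours over 10 years are the figures quoted above.

```python
def catch_up_hours(extra_hours_per_week, years, weeks_per_year=52):
    """Total extra practice implied by a weekly surplus sustained for some years."""
    return extra_hours_per_week * weeks_per_year * years


def average_daily_hours(total_hours, years, days_per_year=365):
    """Average hours per day needed to accumulate total_hours over the given years."""
    return total_hours / (years * days_per_year)


# Gap between the "good" and "best" violinists, per the passage above.
print(catch_up_hours(5, 8))

# Daily average implied by 10,000 hours spread evenly over 10 years.
print(round(average_daily_hours(10_000, 10), 2))
```

Under those assumptions the violinists' gap comes to roughly 2,000 hours, and 10,000 hours spread over 10 years averages a bit under 3 hours a day, every single day.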

There are some exceptions. Scientists in particular don't start intense research until their late teens or early twenties, which is why they often produce their best work in their mid-thirties. The key for them is to write:

In support of the importance of writing as an activity, Simonton (1988) found that eminent scientists produce a much larger number of publications than other scientists. It is clear from biographies of famous scientists that the time the individual spends thinking, mostly in the context of writing papers and books, appears to be the most relevant as well as demanding activity. Biographies report that famous scientists [like Darwin, Pavlov, and Skinner] adhered to a rigid daily schedule where the first major activity of each morning involved writing for a couple of hours.

Why is writing so useful? Writing is hard and structured, and it clarifies thinking. It's also one of the most effective ways for scientists to get feedback. This is part of the reason why consistent academic blogging can be so effective, especially if the blogger has a decently large audience. (And it makes one yearn for the writerly equivalent of chess or poker software, like a program that spat out a quality metric for each of your sentences as you typed.)

Anyway, the overall picture that emerges from this (quite long) paper is that innate talent counts for very little—even things like lung capacity, heart size, capillary density, dexterity, etc., that we might take to be genetically endowed, turn out to change considerably with years of deliberate practice. Or to take another example, excellent pianists don't have faster reaction times than amateurs; they only outperform the amateurs on tests specifically related to training on the piano. Nor is there a clean relationship between chess ability and IQ. And so on.

Which is why the authors are right to instruct us to think of "expert performers not simply as domain-specific experts but as experts in maintaining high levels of practice and improving performance." These are people who have what Sir Francis Galton called "an adequate power of doing a great deal of very laborious work."

So you want to become good at something? Use the Archimedean method:

Archimedes taught us that a small quantity added to itself often enough becomes a large quantity (or, in proverbial terms, every little bit helps). When it comes to accomplishing the bulk of the world's work, and, in particular, when it comes to writing a book, I believe that the converse of Archimedes' teaching is also true: the only way to write a large book is to keep writing a small bit of it, steadily every day, with no exception, with no holiday.

- Paul R. Halmos, I Want to Be a Mathematician

Kenjitsu

You can sort of let a novel run through you: the language is loose enough that you don't need to chew on sentences—you can swallow them whole, steadily one by one, and still have a perfectly clear picture of who everyone is and what's going on.

Mathematics textbooks are different. If you churned through even an introductory text at anything close to a novel-y clip, you probably wouldn't be able to solve the most basic exercises. If given an exam on the subject, you'd fail.

I think that's roughly what Paul Halmos had in mind when he penned this excellent advice:

Don’t just read it; fight it! Ask your own questions, look for your own examples, discover your own proofs. Is the hypothesis necessary? Is the converse true? What happens in the classical special case? What about the degenerate cases? Where does the proof use the hypothesis?

That's how you work through a math text—with lots of chewing, and brooding, and musing. You have to play with the stuff in the same way that a programmer might play with another person's code: not by reading it straight through, but rather, by running it on his own machine—exploring each function with a range of inputs, tracing stepwise through the algorithms, exposing data structures with print statements or a debugger, etc., until he becomes so well-versed in the code's architecture and purpose that he could rewrite it himself in a different way.

I have a hunch that this approach generalizes beyond math, that everything you read—be it a blog post, or a paper, or even a novel—presents you with the option to "fight it," to "run it on your own machine" instead of merely reading. The trouble is in breaking the habit of passivity and, more critically, figuring out what sorts of questions to ask. (Because obviously you won't get very far with "Where does the proof use the hypothesis?" in non-mathematical contexts.)

I've thought a lot about this recently. I read a fair amount, but I'm afraid that too little of it sticks. Even if you asked me to describe an article just after I've read it, there are too many times where I'd hand-wave and stammer my way through a patchy explanation. And part of the problem, I've surmised, is that I take too much at face value—I don't engage, or wrestle with, 90% of the sentences that I encounter. Occasionally I'll look up words or Wikipedia entries, sure, but I don't attack most texts in the way I would if I were actually trying to understand them, like if I were preparing to answer hard questions on the subject.

So I've tried to develop a modest set of techniques to overcome my own readerly inadequacy. Think of them as the basic tenets of what I'll call "the art of knowledge-fighting" or, more succinctly, kenjitsu, from ken = "one's range of knowledge" and jitsu = "fighting art":

  1. Try to become like the kind of pestering student who slows down classes. Incessantly ask questions and restate what the "teacher" says in your own words. Read at the speed of understanding—don't disengage from the hard stuff just to finish an article. When you start to glaze, or skim, or you feel like you're just sort of scanning over the forms of words, reboot.
  2. Read with a pen. I've perused the books and notebooks of my smart friends, and one thing these people have in common is that (a) they pack their reading with margin-notes and (b) these notes seem to harass the author. They're highly critical, in that they go past just trying to figure out what the author means and ask, "What would that imply? What other theories fit these facts? Isn't this a kind of wishful thinking?..." So every time you highlight a passage or circle a word, think about why you found it important. Rather than writing "yes" or "interesting," think about what led you to agree or how it's interesting. Be contentful, specific, and concrete—all the time.
  3. Think like Feynman:

    We had the Encyclopedia Britannica at home, and even when I was a small boy, [my father] used to sit me on his lap and read to me from the Encyclopedia Britannica. And we would read, say, about dinosaurs. And maybe it would be talking about the brontosaurus or something [. . .] or the tyrannosaurus rex. And he would say something like "this thing is twenty-five feet high, and the head is six feet across."

    So he’d stop always, and say, "Now let’s see what that means. That would mean that if he stood in our front yard, he would be high enough to put his head through the window… But not quite, because the head is a little bit too wide — it would break the window as it came by."

    Everything we’d read would be translated (as best we could) into some reality… And I learned to do that — everything I read I try to figure out what it really means, what it’s really saying.

    Imagine actively. Use the phrase "that would mean..." to force yourself to think on your own terms with your own vivid images. It's easier said than done, but I'm convinced that this little trick is what made Feynman such a great explainer. Because everything he ended up teaching to someone else he had already taught himself, that first time he encountered it and tried to translate it into his own words and pictures.
  4. One thing that good philosophers and lawyers are good at is generating counterexamples. For each of your assertions, they seem to be able to conceive of a simple scenario where your thesis doesn't hold. Or if you present a thought experiment, they somehow know which knobs to turn—i.e., which parameters to change—so that it no longer serves your point. I'm not sure how exactly they do this, or what sort of practice one needs to develop the skill, but it can't hurt to constantly throw caveats at the general claims you might encounter in a day's reading.
  5. Be adversarial. For every position you run into—and nearly every blog post, article, paper, or magazine feature takes a side—put yourself in the shoes of someone arguing the opposite. What would their objections be? Would they feel that their position is being represented fairly by the other guy? What in the argument would they be forced to concede, and what would they be inclined to push back on?
  6. Explain stuff. There is no easier way to expose the holes in your own understanding than to try teaching someone else. Or if you really want to go nuts, try writing up the ideas that make you uncomfortable—the process, while painful, will clarify your thinking. The point is to never let ideas cross your mind without being engaged, or debated, or somehow extruded through discourse. When in doubt, hash it out.

Why the Law

I've been trying to articulate why I would want to study the law, in part to prepare a "statement of purpose" for applications, but also as a way of selling the subject to people who might be cold to it, or who don't understand its appeal. [1]

This is what I've come up with:

The law is a kind of intellectual cathedral -- a massive structure filled with loosely related thought-artifacts, each carefully wrought and later refined, whose total impact easily exceeds the sum of its parts.

In that sense it's not much different from any other discipline: we could say the same thing about economics, say, or philosophy.

But law has a few features which in my view set it apart:

1. It is mostly non-mathematical, i.e., the predominant idea-vehicle is English prose. So there is jargon but very few nonstandard symbols. For me this is critical simply because I don't have as much patience for highly technical mathematical symbology as I do for highly technical prose; I just wouldn't want my days to be filled with deciphering that stuff. [2]

But the point is deeper than that: like many branches of contemporary philosophy, the law is at its core a kind of "language-game" in the Wittgensteinian sense -- "doing law" amounts in large part to playing with words and tracing the boundaries between them: Does Martha's grief constitute "severe emotional distress"? What sort of legislation is precluded by the "necessary and proper" clause? Is a tricycle a "vehicle"? Can an e-mail act as a "written agreement"?

I am attracted to a discipline whose most common form of puzzle-solving involves words. It's familiar territory.

2. Maybe for this reason, lawyers tend to write exceptionally well. This may be surprising given that the subject is notorious for spewing "legalese," but it's important to realize that that type of tediously precise language is not what lawyers use to communicate among themselves. Law review articles do not read like contracts.

Instead, they are remarkably crisp. Which makes sense given that convincing arguments are the basic currency of the profession. [3]

3. I know very little of the law. So the raw information gain of a three-year J.D. program would probably edge out other options for me, especially if I'm excluding the hard sciences.

4. Philosophy seems like it would be an excellent rival: it, too, is essentially "about" language; its practitioners are extraordinary writers and thinkers; its students are trained to become argument-jedis; and it even makes heavy use of hypotheticals. Etc.

The difference, I think, is that the concept-taxonomy developed by philosophers -- the particular way they have arranged the subject's ideas and vocabulary into a loose hierarchy -- is far less instrumental than the law's. By that I mean that if you rearranged some nodes in the "org. chart" of philosophy, your only real impact would be on the way philosophers talk and think; in that sense philosophy is wrapped up in itself in a way that the law is not.

Real people pay real damages when legal reasoning goes one way instead of another, and whether a person lives inside or outside of a prison cell can depend on how exactly the facts of his case figure into a broad juridical narrative. Which is to say that it matters how you decide to cut up the conceptual space.

5. Like physicists, lawyers can in some sense "see the man behind the curtain," i.e., they have a privileged, detailed understanding of forces that drive the everyday world we non-physicist non-lawyers take for granted.

A physicist, for example, can explain why a mirror seems to reverse left-and-right but not up-and-down, or how trains stay on the tracks, or why planes can fly upside down. Likewise a lawyer can explain whether you're liable if you unwittingly let a thief into your building, or what you're allowed to do if someone sucker-punches you at a bar, or what rights you have as a tenant.

Both types of knowledge play well at cocktail parties, and both are powerful, because they each expose the mechanics of complicated things: nature for physics, and society for law.

6. The most common kind of law school exam question, I'm told ([4]), is what's called an "issue-spotter," in which you're presented with a detailed hypothetical and are expected to discuss the operative ambiguities, i.e., the "issues" on which a legal judgment of the situation might turn.

Example:

Brandishing a large hunting knife, Melissa entered Gary's elegant Coconut Grove mansion and threatened to stab him. Elliott, a free-lance director, was in Gary's living room at the time shooting a commercial for Bedford Falls University, and he captured the moment on videotape. A week later, Suzanna, Gary's fiancee, played a copy of Elliott's tape on Gary's VCR, mistakenly thinking that it was the couple's favorite episode of *thirtysomething*. Upon viewing the tape, Suzanna suffered severe emotional distress, but no bodily injury. Can Suzanna prevail in a lawsuit against Melissa in a jurisdiction that treats §46 of the Restatement (Second) of Torts as highly persuasive? Why or why not? [4, pp. 289-290]

To answer these capably you need to know the relevant law so well that the mapping from key features of the situation:

A. Suzanna wasn't in the room at the time of the incident;
B. Melissa knew she was being taped;
C. Gary and Suzanna weren't married, just engaged

to the lines drawn by statutes or cases:

A. "...who is present at the time..."
B. "...intentionally or recklessly causes severe emotional distress..."
C. "...to a member of such person's immediate family..."

occurs to you almost naturally. Hence the heavy books, and hence my confidence that if nothing else law students acquire a huge volume of information.

Of course, once you map the territory and get good at identifying these "forks," [4] or potential turning points, you still have the problem of weighing competing interpretations of each. Indeed that seems like the crux of the enterprise: since questions are designed to be tricky, odds are that every argument will come with qualifications. Which encourages a careful kind of adversarial thinking/writing -- developing each claim in light of the fact that another lawyer, on whichever side you're not on, will play the devil's advocate.

It adds up to excellent intellectual training, a gauntlet which actually sounds like fun, especially if you see it from the other side: the problem of generating good hypotheticals, those that awkwardly slice the law or stretch it in uncomfortable ways. [5]

I'm encouraged that such questions figure so centrally in the curriculum.

7. Lastly, a law student is asked a lot of "should" questions: Should Marcus have prevailed in yesterday's case? Should universities consider race in their admissions decisions? How should the federal government compensate the owners of land taken under eminent domain?

In a way, you have to earn the right to be asked these questions, since absent exhaustive training you'd likely struggle to give the mere beginnings of an answer.

But for the prepared, this type of policy analysis -- where you're asked to consider a particular ruling or piece of legislation in its broader context, perhaps as the leading indicator of an important change -- would become a battleground for the big ideas -- Ethics, Equity, Justice -- and an opportunity to articulate your picture of a more perfect world.

Notes

[1] I should say that there are plenty of reasons I'd be after a J.D. that have nothing to do with its appeal as an intellectual object. Example: the law quad at the U of M was its architectural sine qua non -- inspiring on a purely aesthetic level. Also, because of how LSAT scores are weighted in the admissions process, I could plausibly attend a top 20 law school in spite of my low GPA, whereas if I wanted a philosophy Ph.D., say, I'd likely end up in a no-name program.

[2] This is somewhat strange given that I get along fine with computer code. Mind you, I actually make regular use of math, and am happy to read symbol-laden papers -- I'm just not willing to make a life of it. I'm far more comfortable with words.

[3] It may seem like a minor point, but a discipline's writerly aesthetic becomes insanely important once you realize that you'll be reading hundreds of thousands of pages of the stuff.

[4] Getting to Maybe: How to Excel on Law School Exams, by Richard Michael Fischl and Jeremy Paul (1999).

[5] This kind of thought-experimenting is a crucial cognitive skill, though it's rarely emphasized outside of philosophy ("Yeah, but what if Mary had never seen a raven?") or math ("Suppose G is a non-abelian group of order..."). I like that law school encourages you to constantly invent edge cases.