The Bestseller Code

The Bestseller Code, by Jodie Archer and Matthew L. Jockers, describes the results of a five-year computer analysis of over 20,000 novels. The authors wanted to figure out what differentiates the 500 or so New York Times bestsellers in their corpus from the rest of the titles that didn’t make the bestseller list.

The subtitle, Anatomy of a Blockbuster Novel, states the book’s goal: to describe the common traits of bestsellers, revealing some hidden and unexpected characteristics along the way. The book does not pretend to offer a formula for writing bestsellers. A similar book in the world of sports might reveal insights into what made Michael Jordan, Kobe Bryant, and LeBron James such great basketball players, but that doesn’t mean you or I could follow the formula and become the next NBA superstar. This book is a descriptive anatomy, not a prescriptive how-to.

The authors use highly customized Natural Language Processing (NLP) software to analyze thousands of data points within each book, including the frequency of different words and word types, sentence length, which topics appear and with what frequency, and where the emotional high and low points of the plot occur.

One thing you should know about NLP software is that it enables the computer to describe a text, but it does not enable the machine to understand the text. For example, in reading Harry Potter, NLP software will point out that virtually every paragraph that mentions Voldemort is full of words that express negative sentiment (words like evil, terrible, fearful, etc.). From this, the software can infer that Voldemort is the villain. However, NLP software does not understand what it reads the way a human does. It cannot answer complex questions like, “How does Mrs. de Winter’s understanding of her world change when Maxim says, I never loved her?”
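To make the Voldemort example concrete, here is a minimal sketch in Python of how software might measure the negative language around a character's name. It is only an illustration, not the authors' actual method: the tiny word list, the function names, and the sample sentences are all my own assumptions.

```python
import re

# Tiny illustrative word list (an assumption); real sentiment tools
# rely on lexicons with thousands of scored words.
NEGATIVE_WORDS = {"evil", "terrible", "fearful", "cruel", "dark"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def negative_share(paragraphs, name):
    """Average share of negative words in the paragraphs that mention `name`."""
    shares = []
    for paragraph in paragraphs:
        words = tokenize(paragraph)
        if name.lower() in words:
            negatives = sum(1 for word in words if word in NEGATIVE_WORDS)
            shares.append(negatives / len(words))
    # A name that never appears gets a neutral score of 0.0.
    return sum(shares) / len(shares) if shares else 0.0


# Made-up sample sentences, not quotations from any novel.
sample = [
    "Voldemort was evil and terrible.",
    "Harry smiled at his friends.",
    "The fearful crowd whispered about Voldemort.",
]
print(negative_share(sample, "Voldemort") > negative_share(sample, "Harry"))
```

The function counts words and nothing more. It can flag that a name keeps bad company, but it has no idea why, which is the gap between describing a text and understanding it.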

Most of Archer and Jockers’ findings make perfect sense, and will be familiar to people who read a lot of novels and to those who read the advice of successful authors on the craft of writing. Readers prefer an active protagonist to a passive one. Readers prefer language that’s close to the everyday vernacular over the more formal type of writing that appears in essays. Bestselling authors do not overload their sentences with adjectives and adverbs. They convey meaning with nouns and verbs, which makes the reading experience smoother and more fluid.

Bestsellers tend to focus on a few topics within each work, rather than trying to hit on every theme the author can think of. Typically, the three major topics of a bestseller account for about 30-40 percent of the total topical matter of the work. Certain topics are more likely to make a bestseller: technology, work, family life, close interpersonal relationships. Surprisingly, sex is not one of those topics, and bestsellers in general tend to feature less sex than non-bestsellers.

Archer and Jockers identify seven structural patterns common to the plots of all bestsellers. They present the structural patterns as line graphs, which give a clear picture of the story’s emotional highs and lows. The summary of the classic love story, for example, is “Boy meets girl, boy loses girl, boy gets girl back.” If you were to draw that as a line graph, there would be a high point near the beginning when the boy first meets the girl, followed by a low point in the middle when he loses her, and another high point at the end, when he gets her back.
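The idea of turning a plot into a line graph can be sketched in a few lines of Python. This is a rough illustration under my own assumptions (hypothetical word lists, a made-up three-sentence "story," and equal-sized segments), not a reconstruction of the software Archer and Jockers used.

```python
import re

# Hypothetical mini-lexicons, chosen only for this illustration.
POSITIVE = {"love", "joy", "happy", "smile", "together"}
NEGATIVE = {"lost", "alone", "grief", "tears", "gone"}


def emotional_curve(text, segments=3):
    """Split the text into equal word-count segments and score each one.

    A segment's score is (positive words - negative words) / total words,
    so a value above zero marks a high point and below zero a low point.
    """
    words = re.findall(r"[a-z]+", text.lower())
    size = max(1, len(words) // segments)
    curve = []
    for i in range(segments):
        start = i * size
        # The last segment absorbs any leftover words.
        end = len(words) if i == segments - 1 else start + size
        chunk = words[start:end]
        score = sum((w in POSITIVE) - (w in NEGATIVE) for w in chunk)
        curve.append(score / len(chunk) if chunk else 0.0)
    return curve


# Made-up "boy meets girl, loses girl, gets girl back" story.
story = (
    "He felt love and joy together. "
    "She was gone he was alone. "
    "They were happy and together again."
)
print(emotional_curve(story))
```

Plotting the three numbers gives the high-low-high shape described above. A real analysis would use many more segments and a far richer vocabulary, but the principle of scoring each slice of the text and connecting the dots is the same.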

One of the book’s surprising findings is that the emotional curves of all bestsellers follow one of these seven graphs. The plot lines hold for trashy romances, far-fetched thrillers, and revered literary prizewinners. The Bestseller Code even charts the plots of some of these seemingly disparate novels together on the same graph to show you how similar they are.

Much of the value of this book comes from its clear and well-described insights into what readers respond to, and from the authors diving into a number of texts to provide illuminating examples of the generalized patterns that the computer has revealed. Jockers has more of a traditional English-lit background, and will occasionally touch on books by Virginia Woolf and James Joyce. Archer comes from the publishing world, and her discussions focus on more current works.

One work they dig into is the unexpected phenomenon Fifty Shades of Grey, which, on its surface, seems to defy all the rules. It was written by an indie author with no marketing to propel it. One of its primary topics is sex (and not mainstream vanilla sex, but BDSM), which is not part of the “bestseller DNA.” And both readers and critics mocked the quality and style of its writing. But for all that, it sold hundreds of millions of copies. Why?

This is where the computer analysis really shines, as it points out characteristic patterns of the work without the baggage of emotional or aesthetic judgement that a human reader would bring. The analysis showed that E.L. James, despite what some might call a lack of style, had hit on every element of the blockbuster novel, from topical makeup to plot structure to character. The analysis also showed that, based on the number of paragraphs devoted to each topic, the book was more about close interpersonal relationships than sex in general. And “close personal relationships,” the authors remind us, is one of the top themes common to all bestsellers.

While the overall public discussion of Fifty Shades tended to focus more on the sex, the computer was able to see that readers were experiencing, perhaps on a less conscious level, the same sorts of interpersonal relations that fascinate them in the genres of mystery, thriller, and historical drama.

Even more interesting, Archer and Jockers point out, is the plot structure of Fifty Shades, which is a subtle and unusual variation of one of the seven basic structures common to all bestsellers. James’ novel generally follows “Plotline 4,” which Christopher Booker, in The Seven Basic Plots, calls “Rebirth.” Archer and Jockers point out that “these plots tend to see the main characters experience change, renewal, and some sort of transformation.”

The twist that James put on this basic plot is that, instead of following the plot’s typical emotional pattern of beginning-high-low-high-low-end, she created a series of highs and lows throughout the book, which occurred at such regular intervals that the graph of them looks almost like a perfect sine wave. Archer and Jockers refer to this pattern as the emotional rhythm of the plot.

Outside of the Harry Potter series, which was primarily aimed at young readers, the only “adult” book in the past twenty years whose sales numbers compare to Fifty Shades is Dan Brown’s The Da Vinci Code. The authors point out that although The Da Vinci Code’s basic plot differs from Fifty Shades, the two books share an almost identical emotional rhythm. Page 106 of The Bestseller Code shows a graph of the two plot lines, with the high points, low points, and inflection points of both novels appearing almost in lockstep. Now who would have thought to compare those two books, or even mention them in the same breath?

Both were runaway bestsellers, and on an emotional level, both provided a strikingly similar reading experience despite their differences in topic, style, tone, and genre. That insight about emotional experience reminded me of a question and answer I read recently on Quora. A reader asked why the Harry Potter series was so popular despite the fact that its plots were not necessarily new and other writers had more interesting styles and were better at world building. An author named Nick Travers offered this response:

I used to think the same as you. I even thought, ‘I can write as well as that,’ so I started to write a novel to prove it.

Four novels later and I can write as well as J.K.Rowling, but the quality of my stories pale into insignificance compared to hers. What I’ve learned is that J.K. Rowling is not a particularly good writer, but she is a master story teller. When she tells a story it sparkles with magic in a way that draws people (especially youngsters) into her world. I wish I could tell stories like that.

The Bestseller Code maps out some of the characteristics and common traits of great story-telling in illuminating ways, and the authors’ commentary on a diverse body of well-known works makes it a fun and interesting read. If you’re interested in understanding what drives readers to buy books, this one is worth a read.

How’s this for coincidence?

Back in April of this year, I was heading to south Florida to present at a conference at the University of Miami. By coincidence, my wife was returning from Miami just as I was packing to go. She had been visiting her father, and she said, “My dad wants a copy of Impala. Will you bring him one?”

I said, “Sure.” I signed a copy of the book and tossed it into my suitcase. And then for no particular reason, I tossed in a second copy. I flew out on a Tuesday for the conference that would occupy all of Wednesday and Thursday. On Thursday night, I would have dinner with my father-in-law and his wife in Key Biscayne, where I’d give them the book. On Friday, I’d fly back to Virginia.

After arriving in Miami and checking in to my hotel in Coconut Grove, I had dinner, and then went to a bar called The Grove Spot and had a beer. I gave my credit card to the bartender. She rang up the sale, and when she returned with the card, she said, “You have the same name as my husband.”

I said, “Andrew?”

She said, “Andrew Diamond.”

I told her I had once met another Andrew Diamond, in New York, around 1990 or so.

Before I went to bed that night, I grabbed the second copy of Impala from my suitcase, signed it with a message for the bartender and her husband, and dropped the book off at the bar.

The conference occupied most of the next two days, though each morning, I got up between 4:00 and 5:00 to write. I was working on another mystery, and for several weeks, I’d been compiling a list of questions about the logistics of drug smuggling, distribution, and money laundering. My plan was to finish a first draft, in broad strokes, and then find answers to these questions so I’d be able to fill in the finer details. I was wondering who would be able to answer these questions, and I figured I could get some leads by asking other authors, people on Goodreads, and the readers and writers on Quora.

The conference went well, for the most part, and after it ended late Thursday afternoon, I took an Uber across the bridge to Key Biscayne and had dinner with my father-in-law and his wife on the patio of an Italian restaurant. We talked for an hour or so, lingering over a glass of wine, and then I gave them the book.

They drove me back to Coconut Grove, and as it was still early and the air was warm, I took a walk. Before I went back to the hotel, I stopped again at The Grove Spot for a beer. The woman who had served me the first night wasn’t behind the bar this time. She was at a table on the patio with some friends. The bartender, a man, gave me my beer. I took two sips, put the glass down on the bar, and started thinking through my questions about drug smuggling and money laundering, and how I should write them all down.

Just then, someone beside me said, “Andrew Diamond!”

I turned and saw a guy with brown hair and a brown mustache, dressed casually in slacks and a collared shirt. He held a beer in one hand and extended the other in greeting.

I shook his hand, and he said, “I’m Andrew Diamond. Thanks for the book.”

He took the seat next to me and we started chatting. He asked me what brought me to Miami, and I told him about the conference. I asked him what he did, and he said he was a retired federal agent.

“What agency?”

“Customs, which used to be part of the Treasury Department, then was merged into Homeland Security.”

“What kind of stuff did you work on?” I asked.

He worked on cases involving drug smuggling, distribution, and money laundering.

We talked for about an hour, and I went through every one of the questions I’d been compiling. He answered all of them, and even gave detailed examples, from cases in which he was personally involved, of how drugs are smuggled into the US, how they are distributed after they get here, how the dealers launder the profits, and how the feds, through painstaking work, are able to bust them. He even touched on some topics I had not considered, such as the tensions within and among different agencies of federal law enforcement, and different ways in which they approach their work.

Now how’s that for serendipity? My wife returns home to catch me while I’m packing and tells me to put a book in my suitcase. For no particular reason, I throw in a second copy, which will soon open the door to an unexpected meeting. On my first visit ever to Coconut Grove, I happen to go to the bar where the guy with the same name as me hangs out. And he happens to be the person who can answer all the questions I’ve been turning over in my mind for the past few weeks.

It’s now mid-August. I got an email from Andrew Diamond a few days ago saying he had just finished reading my novel. In case you haven’t read it, Impala is about a twenty-something hacker named Russ who finds himself with a load of stolen Bitcoin that a bunch of Russian thugs are eager to take back from him. He’s also being pursued by another gang, and by a federal agent who wants to haul him in before the thugs can get him.

In a final twist of coincidence, here’s Andrew Diamond’s commentary on Impala.

Thanks for the book. Good summertime read. Character reminded me of a Russian whiz kid we picked up in Cyprus in 2010. All of 23 years old and managed to swindle millions in New York and laundered it through bitcoin. Unfortunately for him, he also took a bunch from some businessmen in Russia with strong Kremlin ties, so – in the spirit of detente – we were all looking for him. Like your character, this kid was smart and tenacious. Unlike your character, he was arrogant and in love with the flash. It was ultimately his undoing. Truth be told, he was lucky my partner and I got to him first. US prisons always get far better TripAdvisor ratings than Russian gulags…

Not only did I give the book to someone with the same name as the author, he had actually participated in a story similar to the one I wrote. The federal agent in Impala, Jack Hayes, is not an upstanding citizen. Andrew Diamond seems to be, from what I know of him. And that’s a good place for the coincidences to end.