Thursday, February 20, 2014

Legacy John Fisks Hugh Howey

Joe sez: There is a new report by Hugh Howey and Anonymous Data Guy. This time they looked at the rankings of over 54,000 Amazon titles.

The data blew me away. But my imaginary Big 5 Pundit, Legacy John, wasn't impressed.

Legacy John: His data is all full of lies and nonsense and nonsensical lies. I want to do one of those fisting things that you do.

Joe: You mean fisking?

Legacy John: Where you take someone's post and insult him, line-by-line.

Joe: Actually, the sarcasm is only a by-product of debating poor arguments, faulty logic, and bad data. I use it to accentuate how shoddy and worthy of ridicule their points are.

Legacy John: Whatever. You say you'll let traditional publishers have their say on your blog, so will you let me fask Howey or what?

Joe: Sure. Have at it. I'm all about contrary opinions.

Hugh: One week ago, we released our initial Author Earnings report on the prevalence and breakdown of nearly 7,000 genre e-books on Amazon’s bestseller lists. We only looked at three categories of genre fiction. Since that time, our spider has been hard at work gathering data on a wider variety of titles as it probes deeper into the lists. This time, over 54,000 titles were collected, practically every book on every Amazon bestseller list.

Legacy John: It is widely known by those who know things that the Amazon bestseller lists represent less than 2% of all book sales. This data is all bullshit, and I've heard that Hugh steals cars from the poor and tries to run over the elderly and military veterans and the disabled. Also, he invaded Peru.

Joe: Do you have any sort of evidence to back up these claims?

Legacy John: The truth needs no evidence. 

Hugh: For you techies out there who geek out on methodology, the spider works like this: It crawls through all the categories, sub-categories, and sub-sub-categories listed on Amazon, starting from the very top and working its way down. It scans each product page and parses the text straight from the source html. Along with title, author, price, star-rating, and publisher information, the spider also grabs the book’s overall Amazon Kindle store sales ranking. This overall sales ranking is then used to slot each title into a single master list. Duplicate entries, from books appearing on multiple bestseller lists, get discarded.
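For the techies: here is a minimal sketch of just the last two steps Hugh describes, slotting titles into one master list by overall rank and discarding duplicates. It is not the actual spider. The `Title` record, the `book_id` field, the `build_master_list` function, and the sample records are all made up for illustration, and no crawling or HTML parsing is attempted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Title:
    book_id: str        # hypothetical unique identifier for a product page
    title: str
    author: str
    publisher: str
    price: float
    stars: float
    overall_rank: int   # overall Kindle store sales rank


def build_master_list(scraped):
    """Merge records from many bestseller lists into one ranked master list.

    A book that appears on several lists is kept only once.
    """
    unique = {}
    for record in scraped:
        # Keep the first record seen for each book; discard later duplicates.
        unique.setdefault(record.book_id, record)
    # Slot every title into a single list ordered by overall sales rank.
    return sorted(unique.values(), key=lambda t: t.overall_rank)


# Toy records (made up) as they might come off several category lists.
scraped = [
    Title("B001", "Book A", "Author 1", "Indie", 2.99, 4.5, 120),
    Title("B002", "Book B", "Author 2", "Big 5", 9.99, 4.1, 45),
    Title("B001", "Book A", "Author 1", "Indie", 2.99, 4.5, 120),  # duplicate
    Title("B003", "Book C", "Author 3", "Small press", 4.99, 3.9, 3010),
]

master = build_master_list(scraped)
print([t.title for t in master])
```

Keying on an identifier rather than the title is a design guess on my part: two different books can share a title, so a title-based key could throw away a real entry.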

Legacy John: Bullshit! What kind of spider is smart enough to grab all that data? Insects have very small brains, and a spider couldn't possibly remember all of that. Plus, how could it pass that data along to Hugh? SPIDERS CAN'T WRITE OR TALK!

Joe: I'm pretty sure a spider is an Internet bot used for indexing information.

Legacy John: I'm pretty sure you lie about your sales, and are only popular because you got your start with the Big 6. You're just angry they kicked you out because your writing sucks.

Joe: That's actually not how it happened. If you read my blog--

Legacy John: No one reads your stupid blog, loser. Don't cry sour grapes to me because no one wants to publish you.

Joe: I sense a little hostility.

Legacy John: Can you sense me flipping you off? Because that's what I'm doing, right now, Konrath. And I'm going to Tweet that, too, and my six Twitter followers are going to RT and we're all going to laugh and laugh like a cool high school clique who laughs at others. Then we're going to take selfies combing our hair.

Hugh: As before, our spider is looking at a snapshot of sales rankings for one particular day — in this case February 7, 2014. Extrapolation is only useful for determining relative market share and theoretical earnings potential. Our conclusions assume that the proportion of self-published to traditionally published titles doesn’t change dramatically from day to day, and the similarity of this dataset, collected 9 days after the previous one, lends that assumption some support. By comparing successive reports over the coming months, we will be able to answer the day-to-day variance question more completely.

Legacy John: Snapshots! Now the spider can use a camera?! Give me a break.

Hugh: Of the ~54,000 titles sampled, ~11,000 (or 22%) were genre fiction. ~30,000 (60%) were non-fiction. ~900 (1.8%) were literary fiction. And ~10,000 (20%) were children’s books (young adult is not included in this last category). The preponderance of nonfiction in this sample does not reflect market share. Rather, it reflects the many hundreds of detailed Amazon sub-sub-sub-category bestseller lists for non-fiction (Health, Fitness & Dieting > Alternative Medicine > Holistic, for example), that make lower-selling nonfiction more visible to the spider than equally low-selling fiction.

Legacy John: Blah blah blah numbers blah blah blah data.

You know something, Howey? I got numbers too! Numbers like 17 and 3000% and 2.89/5 and eleventeen. Try to refute that!

Hugh: In order to better understand where and why this data differs from the three genre categories of our original report, let’s look at four different segments: Genre fiction, literary fiction, non-fiction, and children’s books. Here’s how daily unit sales and gross dollar sales divide up among them:

[Pie charts: unit sales by category; gross sales by category]

Legacy John: I got pie graphs too, ya pinhead.

Hugh: As with our previous report, daily unit sales and dollar sales estimates are based on crowdsourced sales rates by overall Amazon ranking. Adjusting these sales rates does not greatly alter any of our conclusions, as all titles are affected. What these graphs represent, then, is a snapshot of Amazon bestseller rank (which has been observed to correlate neatly with daily sales figures) and price.

With all of genre fiction lumped together, the previous estimates of 70% market share still hold. Future reports will break these genres down further. The goal of this report is to look at all e-books, rather than a single subset.

Legacy John: That's not your goal. Your goal is to try to make me admit I've been treating authors unfairly. Which is nonsense. We care about authors. Lots.

As I sit here, typing this in my solid gold swimming pool on my waterproof Cray XC30, I can't help but think of all the authors I've nurtured. If it wasn't for me, those authors wouldn't be paying part of their grocery bill, twice a year, with what they make in royalties that we send them when we remember to. And without groceries, they'd starve, and if they starved, they couldn't write more books that pay for my jet fuel. So don't tell me I don't care.

Hugh: Genre Fiction (All Genres)

The roughly 11,000 genre titles from our 54K sampling look very similar to the previous dataset of Mystery/Thriller & Suspense, Romance, and Science Fiction/Fantasy. These 11,000 genre books now also include Action & Adventure, Horror, Historical Fiction, Erotica, and the like. Here is the breakdown of how these 11,000 genre titles were published, and we can see that including all genres has given a boost to small publishers when compared to our initial report:

This breakdown is very similar to our original report, with self-published authors commanding roughly the same share of titles on bestseller lists as the Big 5 combined. Ah, but where on the lists are these books? By estimating daily sales according to rank on the overall list, we can get a clearer picture:

Not to belabor the point, but no matter how the unit sales by rank figures are adjusted, all titles are impacted equally, so the share of the pie remains largely unchanged. The above graph is a neat visual indicator of relative strength across Amazon bestseller charts. We can see that small publisher titles are well-represented on the lists but that the sales are relatively muted. As with our first report, self-published and Big 5 published genre works are roughly equivalent.
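To see why adjusting the sales-by-rank figures barely moves the pie, here is a rough sketch under stated assumptions. The power-law curve in `estimated_daily_sales`, its `scale` and `exponent` parameters, and the sample titles are all hypothetical. The report's real sales rates are crowdsourced and are not reproduced here.

```python
from collections import defaultdict


def estimated_daily_sales(rank, scale=1.0, exponent=0.8):
    """Hypothetical rank-to-sales curve: sales fall off as a power of rank."""
    return scale * 1000.0 * rank ** (-exponent)


def unit_share(titles, **curve):
    """Return each publisher type's share of estimated daily unit sales."""
    totals = defaultdict(float)
    for pub_type, rank in titles:
        totals[pub_type] += estimated_daily_sales(rank, **curve)
    grand_total = sum(totals.values())
    return {pub: units / grand_total for pub, units in totals.items()}


# Made-up (publisher type, overall rank) pairs, for illustration only.
titles = [
    ("Indie", 10), ("Big 5", 25), ("Indie", 300), ("Small press", 450),
    ("Big 5", 900), ("Indie", 2500), ("Small press", 7000), ("Big 5", 12000),
]

baseline = unit_share(titles)
doubled = unit_share(titles, scale=2.0)      # every sales estimate doubled
steeper = unit_share(titles, exponent=1.0)   # a differently shaped curve

# Columns: baseline share, share with doubled estimates, share with steeper curve.
for pub in baseline:
    print(f"{pub}: {baseline[pub]:.3f} {doubled[pub]:.3f} {steeper[pub]:.3f}")
```

Doubling every estimate cancels out when you divide by the total, so those shares match the baseline exactly. Changing the shape of the curve can nudge the shares, which is presumably why the report says "largely unchanged" rather than "unchanged".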

Legacy John: Genre fiction! Bullshit! Genre fiction is a cultural wasteland devoid of any redeeming values, as evidenced by this chart:

Hugh: Gross sales and author earnings come next, where we anticipated a fall-off for self-published authors as we included all genres, but we’re seeing just a few percentage points difference from our earlier report:

[Chart: genre gross sales by publisher type]

Legacy John: Ha! Look at your own data, Howie! We're making 55% of the money! Suck it!

Joe: That number represents gross sales. Authors don't care about what publishers earn. They care about how much they earn.

Legacy John: That's absurd. Authors don't care about money. They're clueless hobbyists, and should be kissing our asses that we might even consider publishing them in the first place.

Here's why those indie authors are doing so badly: They write crap. And there is so much crap, readers are drowning in it. Literally drowning. Last year there were 22,345 people who drowned to death reading crappy self-pubbed books, as evidenced by this statistic:


Bad self-pubbed writing also killed 124,928 puppies and kitties.


Without us gatekeepers to guard against indie swill, more will die. Quite possibly this many:


Do you want to be responsible for that many deaths? Do you, all you indie hacks?

Hugh: Again, because of the higher royalties for self-published works, this daily snapshot of earnings reveals indie authors as a group making more than traditionally published authors:

See our 11K genre spreadsheet at the bottom of the report for more graphs and the full data set. It would be a lot to include everything here. People can only take so much pie.

Legacy John: You can make "data" say whatever you want to using your "methods". But did you know that Amazon is evil and wants to take over the world so they can turn people into half-man/half-fish creatures called "mishes"? We'll see how much you like Amazon when you're flopping around on your sofa unable to breathe, gill-boy.

Now you had a bunch more worthless data and graphs, which I left out because they are easily refuted by Bookscan and the DBW survey and the USA Today Bestseller list and Bowker and Publishers Lunch and the AAR and the Authors Guild and The Guy I Met On The Bus Talking To Himself About Aliens and smart people like Mike Shatzkin who predicted that by 2021 there would be 101,979,020 bookstores in the USA.


Or maybe it wasn't Shatzkin. Maybe it was that bus guy. But that doesn't deny the fact that print books account for 130% of all book sales, and ebook sales are dying and will probably be gone by tomorrow. Saturday at the latest.

Also, Amazon is evil and gatekeepers are important defenders of culture and authors need us and self-publishing is only for stupid losers. But if those stupid losers make it big, we'll offer them contracts with deal points like:

Publication within 10 years of signing.
Maybe some sort of editing.
1/6 the royalties you can earn on your own.
7 free galleys!!!
Cover art you'll love, because you have no choice.
Rigorous, ongoing sodomy. Without lube. (We call this the "Love Room")

Hugh: Across the entire range of e-books, fiction and nonfiction, adult and children’s, genre and literary, indie authors make up a large slice of the overall Amazon pie. While indie market shares in bestselling nonfiction, literary fiction, and children’s fiction are still catching up with genre fiction—where indies already beat out the Big-5—indies have already made surprising inroads into those other categories, too. We see also that including all the other genres alongside the three top-selling categories of our first report did not appreciably alter the distribution of self-published titles across the lists. Once again, no matter how we tweak the relationship between daily sales and bestseller rank, the effect on all titles is more or less evenly distributed. That means the market share by publishing type holds steady. This can be buttressed by running more reports over time.

Indies All the Way Down

Let’s try something interesting. What if we ignore the top 1,000 e-books and look at the 49,000 titles that follow? By removing the most extreme outliers, we can see if the lists are top-heavy for traditionally published authors or if the most extreme self-published bestsellers are the exception as some claim. Frequently, self-publishing success stories are explained away as rarities. If this is true, once we remove the top 1,000 from consideration, we should see the needle move toward the traditionally published mid-list authors who are making a steady living further down the charts.

Here’s the top 50,000+ e-books again:

And here we have the same group of e-books but with the top 1,000 bestsellers removed:

It’s indies all the way down.

Once we look below the Top 1,000, indications are that the indie midlist is healthy indeed. Or it could be that we’re glimpsing the rising swell of tomorrow’s new Top 1,000. All of this remains to be seen.
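The "drop the top 1,000" check is easy to sketch. This is an illustration only: the `title_share` function and the sample data are invented, and it counts titles, where the report's charts may well use a different measure. The toy numbers prove nothing about the real market.

```python
from collections import Counter


def title_share(titles, min_rank=1):
    """Share of titles by publisher type, counting only ranks >= min_rank."""
    counts = Counter(pub for pub, rank in titles if rank >= min_rank)
    total = sum(counts.values())
    return {pub: n / total for pub, n in counts.items()}


# Made-up (publisher type, overall rank) pairs, for illustration only.
titles = [
    ("Big 5", 3), ("Indie", 40), ("Big 5", 700),
    ("Indie", 1500), ("Indie", 4200), ("Big 5", 9000),
    ("Small press", 15000), ("Indie", 32000),
]

everything = title_share(titles)
below_top_1000 = title_share(titles, min_rank=1001)  # ignore ranks 1 to 1,000

print(everything)
print(below_top_1000)
```

If self-publishing success were only a few outliers at the top, the indie share in the second result would shrink once those top ranks were removed. Comparing the two dictionaries is the whole test.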

Legacy John: There you go again, using your "data" to "prove" that "self-publishing" is a "better alternative" to "sodomy"--I mean "Signing With the Big 5".

Your data makes me laugh. What are you, Mr. Data? Do you have an academic background with advanced degrees in hard-science and engineering from MIT & Stanford and professional experience doing exactly this same kind of competitive analysis of App-store charts for leading game industry companies and online casinos?

Joe: Actually, that's exactly the background of Hugh's Anonymous Data Guy.

Legacy John: Oh yeah? Well, Amazon is evil and self-publishing is stupid!

Hugh: In Summary 

The picture emerging from relative ranking on Amazon bestseller lists is that self-published authors have captured a large piece of Amazon’s total market share, more than any other single publisher and often more than all five major publishers combined. Looking at daily sales rankings for 54,000+ titles reaches well beyond outliers and beyond even what might be considered midlist e-books.

Our next report will step away from Amazon for a moment. Our spider has been crawling up B&N’s waterspout. What we have discovered there surprised us. Stay tuned.

Legacy John: Once again, this was a big waste of my time. I don't even know why I bothered fasting Hugh and his so-called data. We had record profits this year, because we're taking such a big cut from clueless authors, and this gravy train is going to last forever because existence bias is a proven way to embrace the future. Authors are clueless and ignorant and naive and stupid and eager, and they'll keep begging for the crumbs we toss them no matter what contrary data or opinion is posted.

For every number Hugh has, I have a different, happier number. No one cares about Amazon's data, but they do care about the lofty state of quality literature that will only exist if people like me keep telling most authors how much they suck.

You suck too, Konrath. You're mean and rude and your blog posts are waaaaaaaay too long and all you do is yak yak yak about us vs. them and how right you are all the time. Oh, and thanks for letting me guest post. It's a wonder more legacy folks don't post here.

Joe sez: Thank you for your, uh, insights, Legacy John.

As for Hugh and Data Guy, congrats on another job well done. To show that self-pub domination of the market extends far beyond the top 7,000 ebook bestsellers on Amazon is a revelation, and you're to be commended for this Herculean task.

Legacy pundits may not value your data. They may ignore it. They may disparage it, and try to discredit it and you.

But authors are listening. You're revealing that a whole lot of self-pubbed writers are making a whole lot of money outside the legacy system. You're also showing a whole bunch of legacy-pubbed writers what they could be making if they struck off on their own. This is valuable, and appreciated.

On behalf of all writers, thank you Hugh and Anonymous Data Guy. And if you haven't read the latest author earnings report in its entirety, check it out.