This morning I read a fascinating blog article by Shane Snow in which he used two measures of reading level to rank a large number of books, both fiction and nonfiction. His main contention was that many of the most successful books, at least in modern times, are comparatively easy to read. This makes sense; not many people are going to slog through a novel if the reading level is too challenging for them. He also inferred that blog articles with a lower reading level are much more likely to be shared on social media. Obviously, these insights are of great interest to me as a writer. Because the article piqued my interest, and because I’m at the point in writing my own book where I am happy to jump at any distraction, I decided to extend his analysis a bit on my own.
It only took a minute or two to find an open source Java app that calculates the Flesch-Kincaid Grade Level and the Flesch Reading Ease score of any text or PDF file. The former estimates the number of years of education required to comprehend the writing. The latter is a related measure, in which a higher score indicates that the work is easier to read. Both are computed from just two quantities: the average number of words per sentence and the average number of syllables per word.
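For anyone who wants to see what is going on under the hood, here is a minimal Python sketch of the two standard formulas. It is not the Java app's code. The function names are my own, and the syllable counter is a rough vowel-group heuristic that I am assuming is good enough for illustration, so its scores will not exactly match those of any particular tool.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of vowels and drop a probable silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (Flesch-Kincaid Grade Level, Flesch Reading Ease) for text."""
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard published coefficients for both formulas.
    grade = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    return grade, ease


if __name__ == "__main__":
    grade, ease = readability("The cat sat on the mat.")
    print(f"grade level: {grade:.2f}, reading ease: {ease:.2f}")
```

Notice that nothing in either formula looks at which words are used, only at how long the words and sentences are.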
The first thing I did was to run it on several manuscripts that I have on my laptop. These included my recently published monograph, the current draft of the nonfiction book I’m writing, and a novel manuscript and three short stories that I am currently trying to sell. I also ran it on all four of my blogs.
|My Own Writing|
|Nonfiction||Flesch-Kincaid Grade Level||Flesch Reading Ease|
|Current Book Project(1)||12.91||43.67|
|Handyman Kevin Companion Blog||7.97||70.46|
|Angry Transportation Rants (Dormant)||7.69||68.43|
|Old School Essays (Dormant)||8.26||60.16|
|(1) First draft, about 6% complete|
|(2) Body text is nearly identical to my MBA thesis|
|(3) Unpublished manuscripts from my current “slush pile”|
Since raw numbers aren’t that intuitive, I plotted a chart. Notice how the different pieces of writing cluster quite neatly by type.
I was happy to see that both my fiction and my current nonfiction project are in the same zones that Snow found for these types of writing. This is quite important from a marketability standpoint, since any editor I send them to would be instantly turned off if the reading level were too high or too low.
My blogs fall in the middle, which makes sense since they are basically nonfiction, but are written more casually than a nonfiction book. However, going by Snow’s article, they are probably written at too high a reading level to be shared much. In fact, I don’t get many shares compared to other bloggers. I think I can live with that, since I tend to target my blogging towards my fellow writers. I suspect that you people are comfortable reading at a higher level than the general public.
My monograph, Freight Forwarding Cost Estimation: An Analogy Based Approach, appears to be nearly unreadable to anyone without a graduate degree in operations research. I suppose that explains why sales haven’t exactly skyrocketed. It is what it is, though: an adaptation of my master’s thesis. My committee loved it.
I think there is real benefit to a writer knowing that the reading level of his work is appropriate to the target audience.
Of course, being a Great Books fan, my next move was to run the app on all the Great Books that I have written about so far on this blog, as well as the next few I plan to cover.
|Selected Great Books|
|Work||Flesch-Kincaid Grade Level||Flesch Reading Ease|
|Hebrew Bible (1)||7.57||76.51|
|House of Atreus||2.23||90.86|
|Apology, Crito & Phaedo||8.03||70.01|
|Leaves of Grass||12.26||58.00|
|(1) King James Version|
Again, when I plotted the points, they clustered nicely by type.
These results held a few surprises. First was the fact that Homer and the Greek dramas are actually written at a very low reading level, at least in terms of sentence and word length. I believe this is because these works were intended to be recited or performed orally, and spoken language tends to be simpler than written language. Also, these reading level metrics don’t take vocabulary into account. Epic poetry and Greek drama tend to use a much wider range of words than a novel, for example. Examining this factor would require some sort of word frequency analysis. Unfortunately, I didn’t have an “off the shelf” app to conduct a frequency analysis. I’m sure I could have kludged up a Python script in a couple of hours, but that would have been more time than I wanted to spend.
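To give an idea of what such a script might look like, here is a bare-bones sketch. It is only a starting point, not an analysis I actually ran. The function name and the choice of the type-token ratio (distinct words divided by total words) as the measure of vocabulary range are my own assumptions. The ratio falls as a text gets longer, so it is only fair to compare samples of about the same length, which is why the sketch lets you cap the number of words examined.

```python
import re
from collections import Counter
from typing import Optional


def vocabulary_profile(text: str, sample_size: Optional[int] = None, top_n: int = 5) -> dict:
    """Summarize vocabulary range for a text.

    sample_size (hypothetical parameter): if given, only the first
    sample_size words are analyzed, so texts of different lengths
    can be compared on an equal footing.
    """
    # Lowercase so "The" and "the" count as the same word.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if sample_size is not None:
        words = words[:sample_size]
    if not words:
        raise ValueError("text contains no words")

    counts = Counter(words)
    return {
        "total_words": len(words),
        "distinct_words": len(counts),
        # Higher ratio = wider range of words for the same amount of text.
        "type_token_ratio": len(counts) / len(words),
        "most_common": counts.most_common(top_n),
    }


if __name__ == "__main__":
    profile = vocabulary_profile("The cat and the dog and the bird")
    print(profile)
```

Running this over equal-sized samples of an epic and a novel would be one way to check my hunch about vocabulary, though a serious analysis would probably also need to handle word endings and proper names.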
Another surprise was that Walt Whitman’s Leaves of Grass, which I would have expected to show up close to Homer’s epics, is actually a much tougher read. It graphs down closer to the serious Greek philosophical works. As I’ve stated before, though, Leaves of Grass is a rather unusual work.
The biggest surprise, however, is that the Great Books are written at a lower reading level, on average, than my own work. Granted, these sample sizes are pretty small. I suspect, however, that I have stumbled upon another of the factors that contribute to a book being Great: the authors manage to convey complicated ideas in simple, readable language.
So, besides being a good way to check the appropriateness of my manuscripts for the target audience, does any of this have a practical application? Well, the fact that books cluster by type means that reading level could be a good way to sort them. It would be quite simple to modify the Java app into a data mining tool to sort a collection of books into categories like fiction, nonfiction, plays, etc. I can easily see situations where this could be useful for anyone who has a large collection of e-books with incomplete meta-information. Project Gutenberg and Internet Archive, I’m looking at you.
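As a rough illustration of the sorting idea, here is a small nearest-centroid sketch in Python rather than Java. The example points are scores from the tables above, but the category labels attached to them are my own reading of those tables, and every name in the code is hypothetical. A real tool would need many more labeled examples per category, and probably some rescaling, since the reading ease score spans a wider range than the grade level and so dominates the distance.

```python
from math import dist
from statistics import mean

# (grade level, reading ease) pairs taken from the tables in this post.
# The labels are assumptions made for illustration only.
EXAMPLES = {
    "nonfiction book": [(12.91, 43.67)],
    "blog": [(7.97, 70.46), (7.69, 68.43), (8.26, 60.16)],
    "drama": [(2.23, 90.86)],
}


def centroids(examples: dict) -> dict:
    """Average the example points for each category."""
    return {
        label: (mean(g for g, _ in points), mean(e for _, e in points))
        for label, points in examples.items()
    }


def classify(grade: float, ease: float, examples: dict = EXAMPLES) -> str:
    """Return the category whose centroid is closest to (grade, ease)."""
    centers = centroids(examples)
    return min(centers, key=lambda label: dist((grade, ease), centers[label]))


if __name__ == "__main__":
    # A hypothetical unlabeled e-book with made-up scores.
    print(classify(11.5, 47.0))
```

In practice, the two scores for each unlabeled file would come from the readability app, and the classifier would simply assign the file to the nearest cluster.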