The accessibility of scientific writing #1: readability metrics

In science, communication is crucial. The work done by researchers only becomes useful after it is transformed into a publication or presentation of some kind. Otherwise it may as well have never been done. We should therefore be concerned if our publications are getting less readable. A recent article in eLife makes this claim, finding that papers have been steadily decreasing in readability over the last 135 years (Plavén-Sigray et al., 2017). They measured this with two metrics: Flesch Reading Ease (FRE), and the New Dale-Chall (NDC) readability formula. The FRE considers sentence length and the number of syllables in each word. The NDC also considers sentence length, along with the number of “difficult” words used.
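As a rough illustration of how the two metrics combine their inputs, here is a minimal Python sketch using the standard published coefficients for each formula. The function names are made up for this example, and the syllable counter is a crude vowel-group heuristic, so scores from this sketch will not exactly match those from any particular readability tool (including whatever was used for the figures quoted in this post). The NDC function takes the number of "difficult" words as an input, because identifying them requires the Dale-Chall list of familiar words, which is not reproduced here.

```python
import re


def flesch_reading_ease(n_words, n_sentences, n_syllables):
    """Flesch Reading Ease from raw counts (higher = easier to read)."""
    return 206.835 - 1.015 * (n_words / n_sentences) - 84.6 * (n_syllables / n_words)


def new_dale_chall(n_words, n_sentences, n_difficult):
    """New Dale-Chall score from raw counts (higher = harder to read).

    n_difficult is the number of words NOT on the Dale-Chall familiar-word
    list; the caller has to supply it.
    """
    pct_difficult = 100.0 * n_difficult / n_words
    score = 0.1579 * pct_difficult + 0.0496 * (n_words / n_sentences)
    if pct_difficult > 5.0:
        score += 3.6365  # adjustment applied when difficult words exceed 5%
    return score


def count_syllables(word):
    """Crude assumption: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fre_from_text(text):
    """Flesch Reading Ease for a text, using naive sentence and word splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return flesch_reading_ease(len(words), len(sentences), syllables)
```

Both formulas penalise long sentences. The FRE adds a penalty for long words, while the NDC penalises unfamiliar ones.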

For abstracts, the mean FRE in 1960 was around 20. In 2015, it was 10. The NDC score in 1960 was around 12.2. By 2015 it had increased to 13 (note that for NDC higher scores are worse). The authors establish a correlation between the scores found for abstracts and those in the full text of the articles. The FRE scores are higher (simpler sentences) for full articles than for abstracts. The change over time is also less pronounced. From the data in the article I calculated that the FRE scores for the full text decreased from 26 to 22 between 1960 and 2015. The NDC scores are lower (fewer difficult words) for full articles than for abstracts. Using the data from the article, they increased from 11.9 in 1960 to 12.3 in 2015 (note that these are estimates I extracted from the graphs).
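To compare how quickly abstracts and full texts have changed, the figures above can be turned into an average change per decade. The sketch below assumes a straight-line change between the 1960 and 2015 values, which is a simplification. The function name and dictionary labels are illustrative only.

```python
def change_per_decade(start_value, end_value, start_year=1960, end_year=2015):
    """Average change per 10 years, assuming a linear trend between two points."""
    return 10.0 * (end_value - start_value) / (end_year - start_year)


# Approximate values quoted in the paragraph above (read from graphs).
trends = {
    "FRE, abstracts": change_per_decade(20, 10),
    "FRE, full text": change_per_decade(26, 22),
    "NDC, abstracts": change_per_decade(12.2, 13),
    "NDC, full text": change_per_decade(11.9, 12.3),
}

for label, slope in trends.items():
    print(f"{label}: {slope:+.2f} per decade")
```

On these numbers, the FRE for abstracts falls by a little under two points per decade, and the full-text FRE falls by less than one.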

To put the FRE score in perspective I found this article from Shane Snow. He gives FRE scores for a variety of authors, including J. R. R. Tolkien (around 80), David Foster Wallace (around 68) and Malcolm Gladwell (around 66). Appropriately, he includes an “academic paper about reading level” which has a score of around 48. For the NDC scores it is more difficult to find examples. Generally, it appears that scores under 10 are understandable by college students. This seems an appropriate upper limit for scientific writing.

These metrics are relatively crude. Even a more sophisticated metric (the "Lexile") came under heavy criticism when the ratings it gave defied expectations. I do think, however, that these metrics are worth considering when we write papers. I am not arguing that better readability scores indicate “better” writing, but that being mindful of them makes papers more accessible. Just because the ideas may be hard to understand does not mean they need to be presented in an obscure way. Some of this is a matter of style. Where something can be made easier to read without losing meaning, though, it should be. Native English speakers should consider that they are writing for an international audience. Your readers may have had to learn a second language to read your papers. It is unfair to present them with writing that is unnecessarily difficult to follow.

[Figure: my_readability.png. FRE, NDC and Gunning Fog scores for my recent first-author papers.]

For my own part, I have been trying to make my writing more accessible. I have plotted metrics for my recent first-author papers in the figure above. As well as the FRE and NDC scores I also calculated the Gunning Fog metric (calculated in a similar way to the FRE). Note that the axes for the NDC and Gunning Fog metrics are flipped upside-down. This way, the upward slope in all three plots shows increasing readability. I analysed text only from the Introduction and Discussion sections of the papers. This means a straight comparison should not be made against the values from the Plavén-Sigray et al. (2017) study. Comparing the Introduction sections against the Discussions revealed negligible differences (I had expected that my Introductions might be more readable than my Discussions).
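For reference, the Gunning Fog index can be sketched in the same style as the FRE. This is a simplified version: it treats any word with three or more syllables as "complex", whereas the full definition also excludes proper nouns, familiar compound words and words that only reach three syllables through common suffixes. The helper names and the vowel-group syllable heuristic are assumptions for this example, not the tool used to produce the figure.

```python
import re


def count_syllables(word):
    """Crude assumption: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def gunning_fog(text):
    """Simplified Gunning Fog index (higher = harder to read)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    # Simplification: every word with 3+ syllables counts as complex.
    n_complex = sum(1 for w in words if count_syllables(w) >= 3)
    avg_sentence_length = len(words) / len(sentences)
    pct_complex = 100.0 * n_complex / len(words)
    return 0.4 * (avg_sentence_length + pct_complex)
```

Like the FRE, it combines average sentence length with a measure of word length, but it counts long words rather than averaging syllables, and a higher score means harder text.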

In 2016, I started specifically rewriting my paper drafts to improve their FRE score. I will cover how I do this in another post in the future. This extra step affected one of the papers in 2016 (obvious on the graph). For the 2017 paper the Introduction FRE was much better than the Discussion (45 vs 31), so I must have been a bit slack. In the future, I will be aiming to keep the FRE scores for my writing above 45. I will also try to keep the NDC scores below 10 (which I am already managing to do). I have seen suggestions (e.g. Armstrong, 1982) for journals to set readability standards for submitted articles. I am not aware of any journals that do this, but I think it is a great idea to at least have some “floor” value. I hope to come back to this topic soon with discussions of other metrics, and a demonstration of how to improve the readability of a piece of writing. I will also be posting updated versions of the graphs above as more papers come out.

P.S. For this blog post: Flesch Reading Ease = 67, New Dale-Chall = 7.2, Gunning Fog = 9.6

  • Armstrong, J. S. (1982). Research on scientific journals: implications for editors and authors. Journal of Forecasting, 1(1), 83–104.
  • Plavén-Sigray, P., Matheson, G. J., Schiffler, B. C., & Thompson, W. H. (2017). The readability of scientific texts is decreasing over time. eLife, 6(e27725), 1–14.