More Smoke and Mirrors: A Critique of the National Reading Panel (NRP) Report on "Fluency"

Stephen Krashen
Phi Delta Kappan (October, 2001)

In her review of the National Reading Panel's (NRP) report on phonics, Elaine Garan concluded that the report involved "a limited number of studies of a narrow population. . ."1 In this note, I will argue that this problem is not limited to the section on phonics: It also applies to the NRP's section on "fluency." It is only by omitting a large number of relevant studies, and misinterpreting the ones that were included, that the NRP was able to reach the startling conclusion that there is no clear evidence that encouraging children to read more improves reading achievement.2


The criteria used by the NRP for selecting studies were as follows:
"1. The study had to be a research study that appeared to consider the effect of encouraging students to read more on reading achievement.
2. The study had to focus on English reading education, conducted with children (K-12).
3. The study itself had to appear in a refereed journal.
4. The study had to have been carried out with English language reading."3

The NRP claimed it could find only 14 studies that met these criteria.4 Of these, 10 were studies of the impact of sustained silent reading (SSR) programs in which some class time is set aside for free voluntary reading with little or no "accountability." Of these 10, three had positive results, with the students who were engaged in free voluntary reading outperforming comparison groups. Another study showed positive results for one condition but not for other conditions, and the other studies showed no difference or no gains. Table 1 summarizes these outcomes.


Table 1. Summary of National Reading Panel Results: Duration of Treatment and Outcomes5

Duration                 Positive   No difference   Negative
Less than seven months   2          8               0
7 months--1 year         2          2               0

positive = students in sustained silent reading programs outperform comparisons
Results include ten studies, 14 comparisons

In other sections of the NRP report, such as the sections on phonics and phonemic awareness, the NRP listed studies that were excluded from its analysis. This was not done for the section on fluency. We do not know, therefore, which excluded studies were simply missed and which were rejected, nor do we know the specific rationale for their rejection.

In table 2, I present an "expanded" set of SSR studies in which tests of reading comprehension were used. Many of the studies summarized in table 2 meet the four criteria of the NRP and were apparently missed, but there were some "violations": A few were done with students slightly older than the age limit imposed by the NRP; in all cases, the subjects were undergraduate college students. Subjects in some of the studies were students of English as a second language.6 In several studies, students read in Spanish, not English; in these cases, the students were native speakers of Spanish. Finally, some studies were not published in refereed journals.

Table 2 summarizes the results of these studies. It includes studies included by the NRP as well as those that the NRP did not include.


Table 2. Duration of Treatment and Outcomes of SSR Studies: Expanded Set7

Duration                 Positive   No difference   Negative
Less than 7 months       7          13              3
7 months--1 year         9          11              0
Greater than 1 year      8          2               0

In the studies in table 2, SSR students did as well as or better than comparison students in 50 out of 53 comparisons. For longer term studies (those longer than one year), SSR students were superior in eight out of ten studies, and there was no difference in the other two. Moreover, there are plausible reasons why the results were not even more positive: In one study carried out by Isabel Schon, Kenneth Hopkins, and Carol Vojir, there was no difference between SSR students and comparison groups, but only five of the eleven SSR teachers actually carried out SSR conscientiously.8 The classes taught by these five achieved significantly better gains.
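The tallies reported for table 2 can be checked with a few lines of arithmetic. The following Python sketch is not part of the original analysis; it simply hard-codes the counts from table 2, and the names `TABLE_2`, `tally`, and `positive_share` are illustrative.

```python
# Counts of comparisons from table 2 (expanded set of SSR studies).
# Each tuple is (positive, no difference, negative).
TABLE_2 = {
    "Less than 7 months": (7, 13, 3),
    "7 months to 1 year": (9, 11, 0),
    "Greater than 1 year": (8, 2, 0),
}


def tally(table):
    """Return (comparisons where SSR did as well or better, all comparisons)."""
    as_well_or_better = sum(pos + same for pos, same, _neg in table.values())
    total = sum(sum(row) for row in table.values())
    return as_well_or_better, total


def positive_share(row):
    """Fraction of comparisons in one duration band that favored SSR."""
    return row[0] / sum(row)


good, total = tally(TABLE_2)
print(f"SSR as good or better in {good} of {total} comparisons")
for duration, row in TABLE_2.items():
    print(f"{duration}: {positive_share(row):.0%} positive")
```

Run as written, this reproduces the 50 out of 53 figure, and the share of positive outcomes rises from the shortest studies to the longest (eight of ten for studies lasting more than a year).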

In a study by Ruth Cline and George Kretke, another study showing no difference, subjects were junior high school students who were reading two years above grade level, and probably had already established a reading habit.9 Similarly, in Zephaniah Davis' study of eighth graders, SSR helped medium level readers but not better readers.10 SSR appears to be most effective for less mature readers, its aim being to interest them in outside reading. Those who are already dedicated readers will not show dramatic gains. It is doubtful, for example, that readers of this paper will improve if they add to their daily schedule an extra 10 minutes of reading.

It is important to note that the NRP did not include any studies lasting longer than one year. A more comprehensive review of the literature indicates that the positive impact of recreational reading increases over time.

Even when the NRP's stricter criteria are applied, SSR does very well, with readers doing as well as or better than comparisons in 35 out of 36 comparisons. This suggests that the "violations" do not affect the central issue of whether encouraging recreational reading impacts literacy development. Even if one only allows studies that strictly meet the NRP's criteria, the result still favors recreational reading.

Misinterpreted Studies

In addition to excluding relevant studies, the NRP misinterpreted some of the studies that it did include. Carver and Liebert's study11 should not have been cited as evidence for or against recreational reading because the students were constrained with respect to what they could read. They were allowed to read books only at or below their level, and the choice of books was limited to 135 titles (the regular library stacks were off limits). There was heavy use of extrinsic motivators, students had to take multiple choice tests on the books they read, and reading time was heavily concentrated, with students reading in two hour blocks. Successful sustained silent reading programs allow access to any books readers want to read, do not use extrinsic motivators, do not make students accountable for what they read, provide a wide variety of books, and typically meet for a short time each day over a long period.12

The NRP claimed that the advantage shown by readers in Joanne Burley's study13 was "small." Students in sustained silent reading were clearly significantly better in reading comprehension than comparison students in three other conditions, but it was not possible to calculate measures of the size of the effect. It is not clear how the NRP concluded that the difference was small, especially considering the fact that the treatment lasted only six weeks and contained only 14 hours of reading. In a response to a commentary of mine, Shanahan claims that "the problem here was not with the statistics, but with the design of the study. Each of the four treatments was offered by a different teacher, and students were not randomly assigned to the groups. It is impossible to unambiguously attribute the treatment differences to the methods."14 This is not accurate: Student assignment was in fact random and the four teachers were randomly assigned to one of the four groups.15 In addition, the group that did SSR was superior to all three comparison groups, taught by three different teachers.

Not included in my summary of studies in Table 2 is a study that the NRP did include. Janet Langford and Elizabeth Allen16 used the Slosson Oral Reading Test, which consisted of reading words aloud and so may or may not involve genuine reading comprehension. The difference between the groups was highly significant and I calculated an effect size of 1.005, which is quite large. Nevertheless, in discussing this study, the NRP concluded that "the gains were so small as to be of questionable educational value."17

In Gary and Maryann Manning's study,18 students who engaged in SSR made better gains than a comparison group, but the difference was not statistically significant. SSR was significantly better than traditional instruction, however, when readers interacted with each other, that is, when they discussed their reading with each other and shared books. The NRP refers to this group's advantage as "slight,"19 but it is not clear how they arrived at this conclusion. I computed a respectable effect size of .57 for the difference between the peer-interaction group and the comparison group.

The NRP interpreted a study by Sandra Holt and Frances O'Tuel as showing no difference between readers and comparisons in reading comprehension.20 This study contained two samples, seventh and eighth graders. According to the text of the article, for the total sample, the readers were significantly better on tests of reading comprehension. The text also states that the difference was statistically significant for the seventh graders but not the eighth graders, a conclusion that is consistent with mean posttest scores presented in the researchers' Table 1 (pretest means were not presented). In their Table 2, however, the difference for reading comprehension for grade 7 was not statistically significant. The effect size for grade 7 (my calculations), based on posttest means, was a substantial .58. The NRP did not mention this discrepancy. I classified the results of this study as a split-decision.
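The effect sizes cited above for the Langford and Allen, Manning and Manning, and Holt and O'Tuel studies are standardized mean differences. The article does not state which formula was used, so the Python sketch below assumes one common choice, Cohen's d with a pooled standard deviation. The function names and the input numbers are hypothetical and are not taken from any of the studies discussed.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pooled standard deviation of two independent groups."""
    numerator = (n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2
    return math.sqrt(numerator / (n_a + n_b - 2))


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (Cohen's d), assuming pooled SD."""
    return (mean_a - mean_b) / pooled_sd(sd_a, n_a, sd_b, n_b)


# Hypothetical posttest scores, invented for illustration only:
# a reading group (mean 55) and a comparison group (mean 50),
# each with a standard deviation of 10 and 30 students.
d = cohens_d(55, 10, 30, 50, 10, 30)
print(f"d = {d:.2f}")  # prints "d = 0.50"
```

Under this definition, values of .57 and .58 mean the reading group's average was more than half a standard deviation above the comparison group's average.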

The NRP reported that D. Ray Reutzel and Paul Hollingsworth found no differences between SSR and skills practice.21 What the panel did not mention is that the entire treatment lasted only ten days (not one month, as the NRP reports), and that each of four skills groups did intensive work on specific comprehension skills (locating details, drawing conclusions, finding the main idea, finding the sequence). Reutzel and Hollingsworth found no difference among the five groups on tests of comprehension skills and concluded that "engaging in sustained reading of connected and meaningful text appeared to be just as effective as spending time of the learning and practicing of discrete comprehension skills."22

Additional Evidence

It should also be pointed out that the case for reading does not rest entirely on studies of sustained silent reading. In "read and test" studies, subjects show clear gains in vocabulary and spelling after a brief exposure to comprehensible text.23 It is hard to attribute these gains to anything but reading. There are, in addition, compelling case histories that cannot be easily explained on the basis of competing hypotheses, cases such as Richard Wright, who credits reading with providing him with high levels of literacy development: "I wanted to write and I did not even know the English language. I bought English grammars and found them dull. I felt that I was getting a better sense of the language from novels than from grammars."24 Or consider the case of Ben Carson,25 a neurosurgeon who says that his mother's insistence that he read two books a week (of his own choosing) when he was in the fifth grade was a turning point in his life. Carson credits reading with improving his reading comprehension, vocabulary, and spelling, and it helped him move from the bottom of his class in grade 5 to the top in grade 7. Yes, I know; there was no control group, no tests were given, and the results were not in a refereed journal. But it is hard to imagine any other source for this obvious improvement, and cases like these are not uncommon.


The NRP concluded that "the handful of experimental studies" of encouraging voluntary reading "raise serious questions" about its efficacy.26 There are more than a handful of studies. Moreover, the addition of more studies to the analysis provides substantial evidence in support of the effectiveness of recreational reading.

Note that even a finding of "no difference" between free readers and students in traditional programs suggests that free reading is just as good as traditional instruction, which confirms that free reading does indeed result in literacy growth, an important theoretical and practical point. Because free reading is so much more pleasant than regular instruction (for both students and teachers), and because it provides students with valuable information and insights, a finding of no difference provides strong evidence in favor of free reading in classrooms.

At worst, the impact of free reading appears to be the same as traditional instruction, and it is often better, especially when studies are continued for more than an academic year, a finding that the National Reading Panel has obscured by omitting important studies and describing others incorrectly. Garan asks that we look beneath the smoke and behind the mirrors of the NRP phonics report. The same needs to be done with the report on "encouraging fluency."


