Language Learning 46:2, June 1996, pp. 327-369
Review Article
The Case Against Grammar Correction
in L2 Writing Classes
John Truscott
National Tsing Hua University
The paper argues that grammar correction in L2 writing classes should be abandoned, for the following reasons:
(a) Substantial research shows it to be ineffective and none
shows it to be helpful in any interesting sense; (b) for both
theoretical and practical reasons, one can expect it to be
ineffective; and (c) it has harmful effects. I also consider
and reject a number of arguments previously offered in
favor of grammar correction.
This paper has benefitted from the comments of Nathan Jones, Johanna
Katchen, Hsien-Chin Liou, and Yenlung Shieh; not, of course, that these
helpful people agree with everything I have to say.
Correspondence concerning this article should be addressed to the author
at Department of Foreign Languages, National Tsing Hua University,
Hsinchu, Taiwan. Internet: truscott@fl.nthu.edu.tw.
In second language (L2) writing courses, grammar correction
is something of an institution. Nearly all L2 writing teachers do
it in one form or another; nearly everyone who writes on the
subject recommends it in one form or another. Teachers and
researchers hold a widespread, deeply entrenched belief that
grammar correction should, even must, be part of writing courses.
But on what do they base this belief? The literature contains
few serious attempts to justify the practice on empirical grounds;
those that exist pay scant attention to the substantial research
that has found correction ineffective or harmful. Most writing on
the subject simply takes the value of grammar correction for
granted. Thus, authors often assume the practice is effective,
without offering any argument or citing any evidence. When
someone cites evidence, it generally consists of only one or two
token sources, with no critical assessment of them.
Researchers have similarly failed to look critically at the
nature of the correction process. Work on the subject rarely
considers the many practical problems involved in grammar
correction and largely ignores a number of theoretical issues
which, if taken seriously, would cast doubt on its effectiveness.
Finally, researchers have paid insufficient attention to the
side effects of grammar correction, such as its effect on students'
attitudes, or the way it absorbs time and energy in writing classes.
Commentators seem to feel that we cannot eliminate such problems through limited adjustments in the correction process, so we
simply have to live with them. They assume that grammar
correction must be used in writing classes, regardless of the
problems it creates; this assumption is very rarely discussed
seriously.
Grammar correction is too important to be dealt with so
casually. We have an obligation to our students and to our
profession: to go beyond this uncritical acceptance and to look
more seriously at the evidence, at the logic of correction, and at the
problems it creates. This will mean seeing the subject through the
eyes of a skeptic, which is what I propose to do.
My thesis is that grammar correction has no place in writing
courses and should be abandoned. The reasons are: (a) Research
evidence shows that grammar correction is ineffective; (b) this
lack of effectiveness is exactly what should be expected, given the
nature of the correction process and the nature of language
learning; (c) grammar correction has significant harmful effects;
and (d) the various arguments offered for continuing it all lack
merit.
Before proceeding with the argument, though, I need to
clarify a few points. First, I do not deny the value of grammatical
accuracy; the issue is whether or not grammar correction can
contribute to its development. Nor do I generally reject feedback
as a teaching method; I will have very little to say about responses
to the content, organization, or clarity of a composition, for
instance, and I certainly will not suggest that such responses are
misguided. Finally, the key term needs some clarification: By
grammar correction, I mean correction of grammatical errors for
the purpose of improving a student's ability to write accurately.
This correction comes in many different forms, but for present
purposes such distinctions have little significance, simply because
there is no reason to think any of the variations should be used in
writing classes, and there is considerable reason to think they are
all misguided.
Grammar Correction Does Not Work
A large number of studies have attempted to show the effects
(or lack of effects) of grammar correction. Their general logic is
straightforward: The researchers compare the writing of students
who have received grammar correction over a period of time with
that of students who have not. If correction is important for
learning, then the former students should be better writers, on
average, than the latter. If the abilities of the two groups do not
differ, then correction is not helpful. The third possibility, of
course, is that the uncorrected students will write better than the
corrected ones, in which case correction is apparently harmful.
Evidence Against Grammar Correction
To begin with, there is a great deal of evidence regarding first
language (L1) writing. Knoblauch and Brannon (1981) and
Hillocks (1986) have done extensive reviews of this research (see
also Krashen, 1984; Leki, 1990). They looked at many studies,
including research done with various types of students and many
different types of grammar correction. They found that correction
had little or no effect on students' writing ability. It made no
difference who the students were, how many mistakes were
corrected, which mistakes were corrected, how detailed the comments were, or in what form they were presented. The corrections
had no effect. The conclusion for L1, then, is clear: Correction is
not helpful.
These studies on L1 learning certainly do not prove that
correction is ineffective in L2 language learning; conceivably a
technique that is not helpful in the one case could be helpful in the
other. But they certainly provide strong grounds for doubt; in
view of their results, it would be folly to assume, without strong
evidence, that correction is useful in L2 learning. In other words,
the effect of the L1 research is to place the burden of proof firmly
on those who would claim that correction is helpful.
So I turn now to the research on L2 learning. Can a case be
made that correction works? Clearly and unambiguously not. In
fact, the L2 evidence fits very well with that from the L1 studies;
correction is clearly ineffective.
Hendrickson (1978) reviewed the available research and
concluded that little was known. He claimed that learners should
be corrected, but the work he reviewed did not support such a
view. His own work (1978, and in more detail in Hendrickson,
1981) indicated that correcting all errors was no better than
correcting only those that produced communicative problems:
Neither method had any significant effects. A few more recent
papers (Krashen, 1992; Leki, 1990; VanPatten, 1986a, 1986b)
have briefly reviewed the evidence, all of them reaching the same
conclusion: Grammar correction is ineffective.
Looking at the rest of the literature, one has no difficulty
understanding these pessimistic assessments. Cohen and Robbins
(1976), for instance, examining the written corrections received
by three students in an advanced ESL course, concluded that "the
corrections did not seem to have any significant effect on students'
errors" (p. 50). They found that the corrections were not well done;
they believed that this was the real cause of students' problems,
but offered no reason that better-done correction would have
helped.
Semke's (1984) large, 10-week study of German students
produced similar results. She divided the students into four
groups, each receiving a different type of feedback. Group 1
received only comments on content, with no concern for errors.
Group 2 received only comments on errors. Group 3 received both
types of comments, and Group 4 had their errors pointed out and
were expected to make corrections themselves. Semke found no
significant differences among the groups in the accuracy of their
writing. In addition, Group 1 (comments on content only) was
significantly better than all the others on fluency and on a cloze
test. Thus, feedback on errors was not only unhelpful, but also
harmful to learners. Those who received comments on content
plus correction were significantly inferior to those who received
only comments on content. Semke also found Group 4 (self-correction)
inferior to all the other groups on fluency,¹ which is evidence
against the use of a technique frequently recommended in the
literature (but always with little or no supporting evidence; e.g.,
Bartram & Walton, 1991; Hendrickson, 1978, 1980; Higgs, 1979;
Hyland, 1990; Raimes, 1983).
Grammar correction's futility also showed in a study by
Robb, Ross, and Shortreed (1986). They used four very different
types of feedback: (a) explicit correction, indicating the errors and
the correct forms; (b) the use of a correction code to point out type
and location of errors; (c) the use of highlighting to indicate the
locations of errors, without any explanation; and (d) a marginal
tally of the number of errors in each line, with no indication of
what the errors were or where in the line they were located. In all
four conditions, students were to rewrite their compositions,
making the appropriate changes. At the end of the course, the
authors found no significant differences in students' writing
ability.
Robb et al.'s (1986) study could have been made more clear
and compelling by the inclusion of a fifth group, which would have
received no feedback of any kind. But the negative implications
for grammar correction are reasonably clear nonetheless. For one
thing, the amount of information contained in the feedback varied
so much among the four groups that one would expect significant
differences among them if the information were at all valuable.
That there were no differences argues strongly against its having
any value. Moreover, the practical difference between the hypothetical fifth group and the actual fourth group would have been
small. In fact, Frantzen and Rissel (1987) found that, even when
told the exact location of an error, learners usually could not
determine exactly what that error was; in view of this finding, it
would be extremely surprising if the learners in the fourth group
gained any insights from their much more limited information. So
one can reasonably treat these learners as a control group. The
lack of any contrast between them and the groups that received
more informative feedback thus provides good evidence for the
ineffectiveness of grammar correction.
More evidence of this ineffectiveness comes from Kepner
(1991), who experimented with two forms of feedback in intermediate Spanish as a foreign language (FL) courses. Half the
participants received comprehensive correction on sentence-level
errors with brief explanations or statements of rules; the other
half received comments on content instead, written in the target
language. Kepner then checked their sixth assignment, written
after 12 weeks of instruction, for grammatical accuracy, as measured by a count of all grammar and vocabulary errors. Kepner
checked the quality of the writing's content by measuring the
number of "higher-level propositions" it contained. Kepner
found no significant differences in accuracy. However, students
who had received content-oriented feedback were significantly
superior in the measure of content. These results held for both
high-verbal-ability and low-verbal-ability students, and there
were no significant interactions between the variables. Thus,
once again grammar correction was not helpful.
Sheppard (1992) experimented with two different types of
feedback in a writing class. One group received comprehensive
responses to errors, using a correction code, and discussed their
errors (and nothing else) in conferences with the instructor. For
the other group, feedback and conferences dealt exclusively with
the content of the students' writing. Thus, if error correction were
helpful, the content group should have suffered on measures of
grammatical ability. However, Sheppard found no advantage for
the error-correction group, the results actually favoring the content group. In accuracy of verb forms, there were no differences
between the groups, both improving significantly. For accurate
marking of sentence boundaries (through appropriate punctuation), the content group made significant gains, the error group
did not, and the difference was significant. Finally, on a measure
of the complexity of students' writing (the relative frequency
with which they used subordinate clauses), the content group had
no significant changes, although the error group got significantly
worse (though there was no significant difference between the two
groups on this measure). Sheppard attributed this latter result to
an avoidance strategy on the part of the students who had been
frequently corrected: their fear of making mistakes led them to
limit the complexity of their writing.
Thus Sheppard's (1992) work resembles that of Semke (1984)
and Kepner (1991). Correction was not only unhelpful in these
studies but also actually hindered the learning process.
Finally, a few additional studies are worthy of notice. Work
by Steinbach, Bereiter, Burtis, and Bertrand (cited in Carroll &
Swain, 1993) found that feedback on compositions had no benefits
for students' grammar, diction, or mechanics. Similarly, VanPatten
(1986b, 1988) described two studies by Dvorak, one covering a full
year, in which lack of correction did not affect students' accuracy.
Dvorak's research was primarily concerned with oral correction,
but apparently covered some written work as well.
Some Possible Limitations of the Research
The studies discussed above show that the situation for L2 is
the same as for L1: Grammar correction in writing courses is not
helpful. Any interesting research is subject to alternative
interpretations, though. The variable relevant here is the use or
nonuse of grammar correction, but a number of other factors could
have influenced the results of the experiments. However, all the
obvious candidates can be discounted.
First, the results probably cannot be explained by the difference between FL and SL learning, the identity of the target
language, or the learners' L1. The studies that found correction
ineffective included ESL, EFL, German FL, and Spanish FL;
besides, the students' origins and L1s differed widely.
Another factor that can probably be dismissed is the form of
correction used. The studies varied between direct techniques
(learners given correct forms for each error) and indirect ones
(errors pointed out, usually by means of a code, but correct forms
not given). In addition, Robb et al. (1986) alone included four
different degrees of directness. The case is somewhat less clear for
the other major variable of this sort, the difference between
comprehensive and selective correction. Most of the studies
reviewed here relied on the former, but Hendrickson (1981) used
both types and found no difference between them. Also, the L1
research described above found comprehensiveness of correction
irrelevant. Additional reasons to doubt the value of selective
correction will be presented below.
Another explanation of the results is that the correction used
in these studies could have had a delayed effect that did not show
up during the research. However, available evidence argues
against such a view: Robb et al.'s (1986) study, covering mid-April
to mid-January, showed no more evidence of beneficial effects
than did studies lasting a single semester or a single quarter. Nor
did Dvorak's year-long study. Besides the lack of any a priori
reasons to expect delayed effects, this (admittedly limited) research makes the possibility of such effects rather remote.
Another possibility: The results were affected by the types of
assessment used. However, this argument also has plausibility
problems, for several reasons. First, all the studies used actual
writing samples from the students (rather than relying on grammar exercises, for instance); so in terms of authenticity (Bachman,
1991; Hoekje & Linnell, 1994; Skehan, 1988, 1989) this work fares
quite well. Second, they used a variety of measures. For accuracy,
these included counts of all grammatical and lexical errors in two
studies (plus style in one of them), verb form problems in two
others, and an independent measure of sentence boundaries in
one. They also frequently included measures of quantity and
complexity of writing, and one study added a cloze test. Not one
of these measurements found any significant advantages for
students whose writing had been corrected. Third, the measures
used in some of the studies did find significant differences between
groups, always favoring the uncorrected students. Some also
found significant gains (and occasionally losses) from pretest to
posttest. Clearly, these measures can detect such differences.
Thus, that none of them found any significant advantages of any
sort for corrected students must be taken seriously.
Similar comments apply to differences in the type of instruction used in the various studies. The authors provided only
limited information, but this information suggests substantial
variation. In Robb et al.'s (1986) research, most of the class time
was devoted to correction practice and sentence-combining. Kepner
(1991) described her classes as proficiency-based, with a large
concern for personal growth and the development of faith.
Sheppard's (1992) students, in addition to extensive writing
experience, read two novels and underwent selective grammar
instruction (on topics overlapping the points later examined in
their writing). Thus, the consistent failure of grammar correction
probably cannot be attributed to any particular form of instruction; the studies vary substantially in this regard.
Nor can the results be explained by what was done or not
done after corrections were made. In Kepner (1991), students
apparently were not required to do anything with the corrections,
but rewriting was a requirement for all the students in Robb et
al.'s (1986) and Sheppard's (1992) studies. In addition, some of the
materials Cohen and Robbins (1976) examined included rewrites,
and Semke's (1984) indirect-correction group rewrote all their
assignments. (This group showed no advantage in grammatical
ability and was inferior to the uncorrected group on a cloze test
and to all the other groups on fluency.) Hendrickson (1981) did not
use rewriting, but after each assignment was returned, set aside
class time for students to study the corrections they had received.
It is also unlikely that the lack of benefits can be explained
by the students' proficiency level or ability. The classes studied
ranged from beginning to advanced levels of language proficiency.
In addition, Hendrickson (1981) included communicative proficiency as one of his independent variables, and Kepner (1991)
included verbal ability; neither found any effect.
Of course, other learner variables could be crucially involved;
learners differ from one another in an enormous number of ways
and the research discussed here considered very few of them.
However, though such a possibility cannot be ruled out, it remains
no more than speculation.
However, assume for the sake of argument that learner
variables are crucial to the effects of grammar correction: that
certain types of students do benefit. A new problem now arises,
because the knowledge that such students exist will not be helpful
unless instructors can determine exactly who they are. But for
now this is not realistic. The (hypothetical) distinction between
those who benefit and those who do not could rest on any number
of variables, such as gender, age, educational background, aptitude, field-independence, tolerance for ambiguity, anxiety, or any
of countless others. It might depend on certain characteristics of
the teacher or of the learning environment. It might involve some
complex interaction of some or all of these factors, or of these and
other, unknown factors. So research cannot identify correction-benefitters now and is highly unlikely to be able to do so in the
foreseeable future.² Thus, for the practical purpose of evaluating
grammar correction in educational settings, it makes no difference whether they exist or not.
Last, and perhaps most interesting, the negative results in
these studies could have been due not to problems inherent in
correction but rather to bad timing. Researchers investigating
naturalistic L2 learning have found clear and consistent orders in
which learners acquire certain grammatical structures; other
research has found these same sequences in formal classroom
learning situations, in spite of instructional sequences that run
counter to them. This raises the possibility that the corrections
used in the research described above failed because they did not
respect these sequences: Teachers corrected students on grammar points for which they were not yet ready.
The research on developmental sequences originated in the
morpheme studies of Dulay and Burt (1973, 1974), Bailey, Madden, and Krashen (1974), and Perkins and Larsen-Freeman
(1975). This work has since become the subject of some debate
(Dulay, Burt, & Krashen, 1982; Larsen-Freeman, 1976; Larsen-Freeman & Long, 1991; Rosansky, 1976) and therefore cannot be
considered conclusive. However, subsequent work on a variety of
languages (e.g., Cancino, Rosansky, & Schumann, 1975; Ellis,
1984, 1988, 1989; Felix, 1981; Hyltenstam, 1977; Pienemann,
1984, 1989; VanPatten, 1987; Weinert, 1987; Wode, 1984) has left
little doubt that developmental sequences are real. This conclusion has met wide acceptance among SLA researchers (e.g., Cook,
1993; Dulay et al., 1982; Ellis, 1990; Harley, 1988; Larsen-Freeman & Long, 1991; Lightbown & Spada, 1993; Littlewood,
1984; VanPatten, 1986b). It signifies, for present purposes, that
grammar instruction (or correction) that does not respect these
sequences will probably encounter problems.
Thus, the failure of grammar correction in the research could
be due to lack of concern with timing. However, the significance
of this possibility is limited. There is no distinct evidence that
properly timed correction will be effective; the possibility remains
hypothetical. In addition, current knowledge of sequences has
serious limitations, and serious questions remain about how to
apply the research to the classroom. So, researchers might (or
might not) need to reexamine the research on grammar correction
in light of work on developmental sequences. But the latter does
not now make a compelling case that correction can be effective,
even in principle. It certainly offers no reason to think that
correction can be effective in the classroom now.
Nonevidence for Grammar Correction
The discussion of possible limitations on the research has
arrived at the same conclusion reached previously: Grammar
correction (at least in any form now available) does not work. It
is not enough, though, to show that many studies have obtained
negative results. A number of additional studies are commonly
presented as evidence favoring grammar correction; it is necessary to look at these as well. However, none of them contradict the
negative findings described above, primarily because none of
them actually address the present issue: Does grammar correction in writing classes make students better writers (better in any
sense)?
First, it is not unusual to find vague references to works that
seem, in the context of the discussion, to provide evidence that
correction works, but actually do not even attempt to do so. Two
examples will suffice: Higgs (1979) and Gaudiani (1981). The
former is simply a detailed description of Higgs' preferred method
of correction. Similarly, Gaudiani simply provided a design for a
writing course along with guidelines for teachers who wish to
implement it. Neither provided, or claimed to have provided,
evidence for the effectiveness of correction; they assumed that it
is effective.
Another work sometimes cited as evidence is Kulhavy (1977).
This paper is a review of research on feedback, but it is not about
feedback in language classes. Kulhavy was concerned primarily
with programmed learning in assorted content areas, a type of
learning far removed from the process of acquiring literate skills
in the use of an L2. There is no basis for generalizing Kulhavy's
findings to language learning or, more specifically, to the improvement of accuracy in students' writing.
A number of other studies commonly cited in discussions of
correction deal only with oral contexts and therefore have little
relevance to the issue of correction in writing classes (e.g.,
Chaudron, 1977; Herron, 1981; Herron & Tomasello, 1988; Ramirez
& Stromquist, 1979; Tomasello & Herron, 1988, 1989). In addition,
this oral research's credibility is weakened by a number of
other studies that found oral (or in some cases the combination of
oral and written) correction ineffective (Ellis, 1984; Felix, 1981;
Holley & King, 1971; Lightbown, 1983a; Plann, 1977).
Fathman and Whalley (1990) studied the process of revision,
having one group of ESL students revise their compositions with
the benefit of comments from the teacher, while a second group did
their revisions without such comments. Not surprisingly, the
former group produced better final drafts than the latter. This
result, though interesting and valuable, does not address the
question: Does grammar correction make students better writers?
Fathman and Whalley have shown that students can produce
better compositions when teachers help them with those particular compositions. But will those students be better writers in the
future because of this help? Nothing in this study suggests a
positive answer.
Lalande's (1982) work appears more relevant; it did look at
the effects of correction procedures in writing classes and was
concerned with effects beyond the particular composition being
considered. But it too actually dealt with a question distinct from
that being considered here. Lalande's purpose was to test a
composition teaching method he developed, involving comprehensive correction by means of a special code, extensive rewriting
based on the corrections, and the use of a table showing the type
and frequency of the errors committed by each student throughout the course. The experimental group went through this
program, but the control group (and this is the crucial point) was
taught through what Lalande described as a traditional type of
writing course, which included comprehensive correction and
rewriting based on the corrections. Thus, Lalande did not compare the effects of correction with the effects of noncorrection, but
rather with the effects of a different form of correction; as a result,
he found his own version to be significantly better than the
traditional alternative.
However, "better than" could just as well read "less harmful
than." The significant difference between the two groups resulted