Case Study Of A Japanese Learner
Darren Elliott (MA ELT, DELTA)
Writing
I looked at three samples: a drafted MA assignment with
N's corrections, a short set writing task about reading
and listening skills, and a collection of informal email correspondence.
Speaking
Again, samples ranged from the formal (a recording of a
presentation given as part of an MA assignment) to the informal
(an interactive speaking activity, recorded as N told a story
to other MA students at dinner). The interview and a pronunciation
test reading provided further data.
Reading
Reading assessment came largely from N's own analysis in
interview and writing.
Listening
N's own evaluation of his listening in interview and writing
was the basis for this study, supplemented with observation
of performance in MA seminars and lectures.
Two of the major considerations in test construction are
reliability and validity. Reliability is the achievement of
consistency in scoring. To put it simply: if a group of students
take a test on a Wednesday, they should achieve the same results
on Thursday (Hughes,
2003: 36). A wide range of factors might affect this, from
the environmental (the temperature of the room, the comfort
of the furniture) to test features (item types, clarity of
the rubric) to scorer characteristics (standardisation procedures,
sufficient marking time and training).
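
The Wednesday and Thursday comparison above can be illustrated with a
small calculation. The sketch below is not part of the original study:
it assumes invented scores for five hypothetical students and uses a
Pearson correlation between the two sittings as one common way of
expressing test-retest consistency. The function name and the data are
illustrative only.

```python
from math import sqrt


def pearson(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 scores each")
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    # Sum of products of deviations from each sitting's mean
    covariance = sum((a - mean_first) * (b - mean_second)
                     for a, b in zip(first, second))
    spread_first = sqrt(sum((a - mean_first) ** 2 for a in first))
    spread_second = sqrt(sum((b - mean_second) ** 2 for b in second))
    if spread_first == 0 or spread_second == 0:
        raise ValueError("scores in a sitting must not all be identical")
    return covariance / (spread_first * spread_second)


# Hypothetical scores: the same five students sit the same test twice.
wednesday = [62, 75, 48, 90, 81]
thursday = [60, 78, 50, 88, 83]

r = pearson(wednesday, thursday)
print(f"test-retest coefficient: {r:.2f}")
```

A coefficient close to 1 would suggest that the students were ranked
consistently across the two sittings; it says nothing about whether the
test measures what it is meant to measure.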
A test can assert a certain degree of validity if it tests
what it is meant to test and if there is 'correspondence between
the test and the non-test activities that the scores are expected
to reflect' (Luoma, 2004: 184).
In testing it is commonly accepted that 'there is a trade
off between [reliability and validity]' (Alderson et al, 1995:
187). That is, one might make claims for the reliability of
a carefully designed multiple-choice vocabulary test, as
scorer consistency is easier to achieve. However, the validity
of such a test is possibly limited as a measure of the ability
of the examinee to understand and produce the vocabulary in
real life situations. Weir (1988: 34) argues that for a fuller
learner profile the largest possible sample of data should
be collected. Scale often makes this impractical, but with
only one subject I have been able to focus on the data in
a way which an institution designing regular assessment would
not.
The assessment of validity can be further broken down into
a number of types. Baxter focuses on three of these sub-divisions:
content validity, construct validity and face validity (Baxter,
1997: 18).
Content validity assesses the fit between the syllabus and
the test (Underhill, 1987: 106). In this case, there is no
syllabus of language study, so I am testing N's ability to
fulfil his academic obligations in English. If content validity
need not be linked to a course, but can be based purely on
authentic use, then this method of testing can be deemed to
have content validity.
Construct validity is seen by many as validity itself –
the main notion into which other types of validity feed, although
it '…essentially involves assessing to what extent the
test is successfully based on its underlying theory' (Alderson
et al, 1995: 182). It is, according to Weir (1988: 27), often
linked to a close fit between the test and the teaching that
comes ahead of it. N attended a pre-sessional EAP course in
preparation for the MA, taught by some of the same tutors
who guided him on the MA.
I would like to turn finally to the question of face validity:
simply put, whether the test appears to test what it is trying
to test (Baxter, 1997: 20). If the test takers and scorers
cannot see how this test relates to the aspect being tested,
they will lose confidence and the test itself will suffer.
Face validity 'is not a scientific notion' (Hughes, 2003:
33), and cannot really be measured as such, although test
designers can judge the reactions of scorers and examinees
through interviews, questionnaires and other such methods.
The methodology under discussion here is transparent in what
it aims to test, and is directed specifically at the learner.
N was happy to participate in the study as the
benefits to him are clear, so face validity is high.
The main problem with focusing on authentic samples and
not using a standardised test is that of practicality. It
has been difficult to mine the rich data collected in this
study for strengths and weaknesses, and a practice test with
an answer key would no doubt have been a simpler option. However,
Tarone & Yule (1989: 77) point out the danger of variability
in testing. However well designed a test is, it is hard
to correlate accuracy in tests precisely with production in
authentic discourse. That is why I focused almost solely on
samples from my subject's real life.
Evaluation and analysis of the learner's strengths and weaknesses
Productive Skills: Writing
Strengths
Task fulfilment and effect on the reader
Each of the samples fulfils its communicative purpose to
a certain extent. The emails, although error-prone, create
a strong impression by appealing directly to the reader (D37
"Right?", D40 "YOU know the reason").
Appropriacy of style and genre
Again, the writer has a strong understanding of the conventions
of email and uses capitalisation and vocabulary for emphasis
or dramatic effect (D25 "but what they mailed me back is totally
THE SAME", D41 "Just because they are tamed"). These techniques
are not used in the semi-formal writing done for the purposes
of this study, or in the assignment draft, where they would have
been inappropriate. The writer has done especially well to select
the appropriate tone for Appendix C without any clear guidance.
This is what I had hoped for from an advanced English learner.
Organisation, cohesion and layout
Paragraphing is accomplished, as expected for the level.
In the academic assignment, the writer uses a number of cohesive
devices to link his ideas accurately and clearly (B70 "However",
B73 "That is", B86 "As a result", B97 "To cover these factors").
These and other linkers are both inter- and intra-sentential.
The range of linkers in the emails is more limited (D19, 21,
23, 27 "so") but appropriate for the genre. The reader
is still able to understand the chronological order of the
story and the repetition of language (lexical cohesion) helps
to give a sense of the frustration of the writer.
Range of lexis and grammar
N attempts a broad range of both complex and simple structures
in his academic writing, as is to be expected at this level.
The range of lexis in the assignment is appropriate for an
advanced learner, with use of collocations and set phrases.
Accuracy of lexis and grammar
Lexis is used accurately for the most part, and grammatical
errors do not interfere with communication. N has been very
careful to proofread his assignment and, with the help of
his tutor, correct his grammatical errors, as seen in the
second part of Appendix B (B235 – D505).
Weaknesses
Task fulfilment and effect on the reader
Appendix D demonstrates that it is difficult for the writer
to answer the question fully and clearly for the reader in
academic texts at this level. This will be discussed further
under the category of organisation.
Appropriacy of style and genre
Although generally appropriate, occasional stylistic techniques
in Appendix B are too informal for this genre (B27 "Why is
this so?", a rhetorical question). The switch
between the theoretical and the practical is somewhat challenging
and sometimes jars.
Organisation, cohesion and layout
At sentence and paragraph level, all the texts are well
organised and punctuated for the genre and level. However,
as stated above, it is difficult for the reader to follow the
argument in Appendix B due to the overall arrangement of ideas.
This is something N knows is challenging for him. The problem
is set out at the beginning of the piece but no solution is
offered until the end, in the Japanese style. The assignment
would benefit from restructuring to bring the proposals to
the fore, followed by a discussion of each and a restatement
at the end.
Range of lexis and grammar
Reporting verbs are overused at this level ("state"
B21, 33, 42, 48, 80, 122, 134, etc.; similar use of "assert").
Accuracy of lexis and grammar
A number of grammatical errors come up in Appendix B, and
although these are informal emails and errors are more acceptable,
tense mistakes (B20 "I noticed that no book has arrived", B13
"Before I start travelling this time") are quite surprising
at this level. The grammatical corrections in the latter part
of Appendix B are a wealth of data. For example, at this level
one would expect fewer mistakes with passive forms (B256, 265).
Lexis is selected well for the most part, although word forms
are sometimes problematic (D11 "frustrating").