College Writing Assessment Online Community & Resources
New Jersey Institute of Technology

Best Paper Assessment Procedures

Each semester, we hold a best paper reading when the course is complete. Instructors inform their students that their papers will be evaluated by the stated criteria (Figure 1) and, with their students, select the best papers for review. Instructors then meet at the end of each semester to review the papers. Historically, our experienced readers have scored the papers reliably using a holistic assessment method; that is, the inter-reader reliability coefficients have always exceeded .6, the minimum coefficient of correlation set by the Department to establish the inter-reader reliability needed to undertake further analysis.

Figure 1

Each semester, instructors (untenured and tenured, one-year appointments and lifers) gather on Reading Day to review papers on the six-point scale. Readers use sample training papers to augment the scale and resolve discrepant readings (those that are not matching or adjacent) in order to award a score from 2 (the lowest) to 12 (the highest). At the conclusion of the reading, program coordinators create a one-page summary sheet (Figure 2) to use as a touchstone for discussion with instructors about the level of student achievement.

Figure 2

In the instance shown here, instructors may be confident that they are evaluating as a community and scoring the best papers reliably (r = .79). In addition, the scores are well distributed along the scoring matrix (from 2 to 12). The scores are also consistent across semesters, with no significant difference found across two administrations (t = .941, p = .1737). Thus, faculty and administrators may be confident that our assessment has verified that graduating seniors demonstrate proficiency in writing ability as measured by our assessment community.
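The reliability check described above can be illustrated with a short sketch. It assumes that the "coefficient of correlation" is a Pearson r computed over the two readers' scores for the same papers, and that the 2 to 12 range comes from adding two readers' scores on the six-point scale; the text does not state either point explicitly. The function names and the sample scores are hypothetical and are not data from the program.

```python
from math import sqrt

MIN_RELIABILITY = 0.6  # minimum coefficient quoted in the text


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one reader's scores never vary")
    return cov / sqrt(var_x * var_y)


def meets_threshold(first, second, minimum=MIN_RELIABILITY):
    """True when the coefficient exceeds the minimum (the text says 'exceeded')."""
    return pearson_r(first, second) > minimum


def combined_score(first, second):
    """Assumed rule: add two six-point readings to get a score from 2 to 12."""
    for score in (first, second):
        if not 1 <= score <= 6:
            raise ValueError("each reading must be on the six-point scale")
    return first + second


# Hypothetical readings of six papers by two readers.
reader_a = [1, 2, 3, 4, 5, 6]
reader_b = [2, 2, 3, 5, 5, 6]
reliable = meets_threshold(reader_a, reader_b)
totals = [combined_score(a, b) for a, b in zip(reader_a, reader_b)]
```

With these made-up readings the coefficient is well above .6, so `reliable` is true and `totals` holds one combined score per paper.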
Assessment Procedure: From Portfolio to Cluster (taken from "The Assessment of Technical Writing: A Case Study" in the Journal of

It soon became apparent, however, that it would be impossible to evaluate complete portfolios. The amount of information, so vital to the pedagogical life of the course, was simply overwhelming. Because they included all drafts, the portfolios provided too much material. Even if the portfolios included only an unmarked final version of each assignment, the sheer volume of reading would inhibit efficient group scoring.

We finally decided to evaluate clusters of student writing taken from the portfolios. After an end-of-semester conference with their instructors to select their strongest samples, students were asked to submit their cover letters, resumes, and two best samples. The cluster, though only a subset of the portfolio, allowed us to sample the best writing of our students.

To set guidelines for inter-reader agreement, we decided that only matching or adjacent scores (e.g., 4/4 or 4/3) would be acceptable; scores differing by more than one point (e.g., 4/2 or 3/1) would be termed discrepant. These discrepant scores would be subtracted from the overall rate of agreement to produce an inter-reader agreement scale.

Near the end of the semester, all instructors met for a four-hour reading session. Six pre-selected sample clusters were presented to the readers. Readers began by reviewing a cluster illustrating a high level of student performance; its merits were described in terms of the rubric. Readers then reviewed a second cluster, a sample demonstrating low performance. The final four samples were read in silence, and the readers nominated a score for each. Extensive discussion then followed. When the group felt it had reached consensus, the "live" papers were scored.

Each cluster was read and scored by two readers, neither of whom was the student's instructor. To preserve the objectivity provided by this measure, scores were masked so that the second reader could not be influenced by the score given by the first reader. Clusters with discrepant scores were then given a third reading, although the readers did not know they were reading such papers.

This procedure was followed consistently in the second (1989-1990) and the third (1990-1991) years of the study. We evaluated 308 clusters in all.
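The agreement rule described in this section can be sketched as follows. The sketch reads "subtracted from the overall rate of agreement" as one minus the share of discrepant pairs, which is an assumption; the function names are hypothetical, and the sample pairs are the four examples quoted in the text, not study data.

```python
def classify_pair(first, second):
    """Label a pair of readings as matching, adjacent, or discrepant."""
    difference = abs(first - second)
    if difference == 0:
        return "matching"
    if difference == 1:
        return "adjacent"
    return "discrepant"  # scores differ by more than one point


def agreement_rate(pairs):
    """Assumed reading of the text: 1 minus the share of discrepant pairs."""
    if not pairs:
        raise ValueError("need at least one pair of readings")
    discrepant = sum(1 for a, b in pairs if classify_pair(a, b) == "discrepant")
    return 1 - discrepant / len(pairs)


def needs_third_reading(pairs):
    """Positions of clusters whose two readings are discrepant."""
    return [i for i, (a, b) in enumerate(pairs)
            if classify_pair(a, b) == "discrepant"]


# The four example pairs quoted in the text: 4/4, 4/3, 4/2, 3/1.
example_pairs = [(4, 4), (4, 3), (4, 2), (3, 1)]
rate = agreement_rate(example_pairs)
third_readings = needs_third_reading(example_pairs)
```

For the four quoted pairs, two are acceptable and two are discrepant, so `rate` is 0.5 and the last two clusters would go to a third reader.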