Key URLs and Links from Talks

Brian Huot:
The Big Test by Nicholas Lemann
On a Scale: A Social History of Writing Assessment in America by Norbert Elliot
Standards for Educational and Psychological Testing (1999) by AERA
Assessing Writing: A Critical Sourcebook by Brian Huot and Peggy O'Neill

Bob Cummings/Ron Balthazor:
No Gr_du_te Left Behind by James Traub
EMMA, UGA's electronic writing and e-portfolio environment

Marti Singer:
GSU's Critical Thinking Through Writing Project



Wednesday, October 24, 2007

8:45 – 10:15: Understanding and Using Mandates for Writing Assessment as Opportunities for Improving Our Programs and Teaching

Brian Huot, Kent State University Department of English

Along with Peggy O'Neill, Brian Huot is the editor of a soon to be published collection of essays on assessment from Bedford/St. Martin's called Assessing Writing: A Critical Sourcebook.

To learn more about Brian and Peggy see Assessing Writing: About the Editors.

To see an annotated table of contents see Assessing Writing: Contents.

To read the introduction to the collection see Assessing Writing: The Introduction.



Brian's talk:

Raw, directly typed notes. Typos and errors are mine –Nick Carbone—not Brian's.

History
History of the search for reliability, mainly inter-rater reliability. If you had a target, reliability would be how consistently you hit the same part of the target. It's about consistency more than being dead center.

1912: Starch and Elliott show that English teachers don't agree on the grades they give students.

1930s: The SAT was new and could help with scholarships. It was used for students who needed scholarships, since that information had to come in sooner.

12/1941: With WWII, the SAT becomes part of the national security zeitgeist. The SAT grows as the CEB gives the ECT (English Comprehensive Test) – they write the prompts and teachers read and grade. No reliability.

Nicholas Lemann, _The Big Test_: ETS founded in 1947.

Norbert Elliot, _On a Scale_

In the 1950s it was common to assess writing with no writing at all.

Now we have computer scoring (e-rater, ACCUPLACER), and the scores are more reliable than what you get with human readers (but at the sacrifice of validity).

We have moved from an arena where you have people who cannot agree to a machine that always agrees.

Validity
Intelligence testing takes off at the turn of the century in response to laws demanding universal education. Children whose parents never went to school arrive, along with others, and they're hard to teach. What worked for the privileged didn't work for the masses. So testing began: intelligence testing starts measuring students to find out why they can't learn.
We don't often hear validity defined; we don't hear a lot about it at all. People trusted test makers to affirm that their tests were valid for measuring the thing they purported to measure.

Validity: how well a particular measure correlates to other measures of the same thing. It becomes a circle of one test checking another.

Traditional approach: does a test measure what it purports to measure?

In the '50s a test is valid if it serves the purpose for which it is used. This raises a question of worth and value: what's the value of this test? Do you have a rationale for having this assessment? How will it increase …

Three keys:

Content Validity – is it on the content and skills it purports to measure? Are writers writing?

Criterion Validity – is it a consistent measure and does it match other measures

Conformity of measure with phenomenon of interest – does it support a theory of what good content is?


COMPASS Test: an indirect test of writing (yeah, right). Instead of a grammar test, it's an untimed editing test on a computer, and that information is used to place you in writing courses (750,000 students take it).

To claim validity, they had a group of essays, and the correlation between COMPASS scores and scores on the essay test was close enough to count as a match (criterion validity).

The degree to which an argument can be made for an integrated judgment:

• it's about judgment
• it's about decision making
• it's not about the measures you use, it's about the decisions you make
• validity is partial and always ongoing
• it assesses the assessment
• it's about consequences

For example, if instructors will do some things and not others, you can't force them or measure the thing they won't do.

Reliability is subsumed under validity: any argument about validity must consider reliability. Newer assessment models where folks don't give scores work better. So instead of holistic scoring, use a scheme that asks teachers which class a student should be in.

If assessment has no benefits for teaching and learning, don't do it. But assessment is part of our jobs and can impact in a positive way.

If you know a program is working well, assessment can protect it. Make your own assessment or you will have assessment imposed upon you.

_Standards for Educational and Psychological Testing_ (1999): not from NCTE or CCCC; this is the measurement community saying what counts as ethical and responsible use of assessment.

Research

Kent started a new program. They moved the second course to the sophomore year and moved from a literature-based CT to a process-based comp program with more computer use and multimodal texts.

Prepare folder:

a syllabus – all assignments and handouts
samples of student writing: above, at, and below what students should be doing
a one-page self-assessment of how the class went from the teacher's point of view

Pay $100 per folder. Ten teams with four people on each team, and every team will read four portfolios. Pay readers $100 to read. Cost for a large program: $15,000.00.
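
The arithmetic in my notes is compressed, so here is a minimal back-of-the-envelope sketch of how such a budget might be computed. The stipend amounts and the ten four-person reading teams are from Brian's description; the number of folders is my own assumption, added only to show how the pieces might add up to roughly $15,000.

```python
# Hedged sketch: rough arithmetic for the folder-review budget described above.
# The $100 stipends and the 10 teams of 4 readers come from the notes; the
# folder count is a hypothetical placeholder, since the notes don't say how
# many sections the "large program" has.

FOLDER_STIPEND = 100          # paid per folder prepared
READER_STIPEND = 100          # paid per reader

teams = 10
readers_per_team = 4
folders = 110                 # assumption only, chosen to illustrate scale

readers = teams * readers_per_team
total_cost = folders * FOLDER_STIPEND + readers * READER_STIPEND

print(f"{readers} readers, {folders} folders -> ${total_cost:,}")
# With these assumed numbers the total lands near the $15,000 figure in the notes.
```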

This will give them a snapshot of what their program is doing, and they will get a sense of how well courses are meeting the requirements and goals of the new curriculum. It's removed from personnel decisions.

People will also get a sense of cool ideas from other classes.

This will give a sense of what's going on.

They will write a report to the administration on what is going on. They will see patterns; if certain goals aren't being met, they can address that.

No student performance data is collected except for the samples. In year five they will do deeper research on student performance.

Opportunity

Assessment can be a community-building activity. People can talk about teaching and students, but they need to do it so people don't feel under the gun. Research in the second year is not evaluative but descriptive, because they are asking people to change the way they're teaching. It's a radical change for some, for example, teaching in a computer classroom. So an adjustment period is required, and it needs nurturing, not chain-ganging, of instructors.

10:30 – 11:15: Outcomes-Based Assessment and Writing Portfolios: Feedback for Students, Insights for Instructors, Guidance for WPAs, and Data for VP

Robert Cummings, Columbus State University English Department, for himself and Ron Balthazor, University of Georgia at Athens English Department

Bob added the following comment, that I want to elevate because you'll find it helpful:

Ron and I would love to hear your opinions, positive or negative. Please feel free to contact us at either cummings_robert1@colstate.edu and/or rlbaltha@uga.edu.

Nick's raw notes on Bob's and Ron's presentation

Bob: Evaluating Levels of Cognitive Thought

Context: Spellings is leveraging NCLB-like practices to go after colleges via calls for accountability assessment.

quote from "No Gr_du_ate Left Behind," James Traub, NYT Magazine, 9/30/07

To anticipate NCLB-like tests of learning to "prove" education, testing companies are gearing up. For example, ETS has Major Field Tests. From what Bob and Ron could see, the English major questions were canonical, like GRE questions. These tests are designed for broad-scale assessment using multiple-choice exams.

The purpose is to demonstrate learning outcomes for a field. Administrators go to these tests because they need to meet the purpose the tests claim to meet: they need to show some constituent that learning is happening.

Meanwhile, the University System of Georgia is considering moving to outcomes rather than credit hours as a way to measure achievement. If outcomes-based testing is coming, then it's worth keeping this idea in mind:

"If the proverbial gun is being pointed at our collective heads, how can we improve our current assessment systems to meet these demands without shooting ourselves first?"

What teachers don't want:

* Bubble-sheet exams to measure outcomes, which attempt to be nationally normed.

* Machine readers for student writing. (Computers are fast, efficient and dumb)

* Or human readers in Bangalore. (Shorthand for not wanting to outsource assessment.)

* Exams which are external to our curriculum.

* Assessment which determines our curriculum.

What teachers want:

assessment that allows us to teach to our strengths and tap our passions to determine the curriculum

FYC Directors want:

to see who is learning to write better, according to disciplinary measures and consistent with disciplinary values

assessment that helps teachers strengthen their pedagogy.

U. Administrators want:
assessment that is quantitative; independently verified; and broad enough to allow comparisons across system institutions so every institution is a winner in some way.

We have to have some type of quantitative element to assessment because of ETS/College Board precedents.

How do we meet all these disparate wants?

One idea is to use an Outcomes Based E-portfolio Assessment

At UGA, the e-portfolio requires a reflection, two revised essays, a bio with an image, peer review, and so on.

The end-of-semester e-portfolio is a capstone and must be a substantial part of the course.

The reflective intro would address the Board of Regents' first-year composition outcomes, describing how writing artifacts from the course satisfied those outcomes.

The student's reflective essay connects course artifacts in the portfolio to the BOR outcomes.

Persuade the reader the artifacts meet the outcomes.

The instructor of record uses a holistic rubric to assess the reflective outcomes essay.

The same essay is also read by another FYC instructor in the USG system, who scores the reflective essay with the same rubric as the instructor of record.
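
Since each reflective essay gets two independent scores on the same rubric, the double reading also makes a simple inter-rater reliability check possible (the kind of consistency Brian discussed in the morning session). Below is a minimal sketch of such a check; the 1-4 holistic scale and the sample scores are hypothetical, not actual UGA or USG data.

```python
# Hedged sketch: checking inter-rater reliability for the double-read
# reflective essays. The 1-4 holistic scale and all scores below are
# hypothetical illustrations, not actual UGA or USG data.

def agreement_rates(scores_a, scores_b):
    """Return exact agreement and adjacent (within one point) agreement."""
    pairs = list(zip(scores_a, scores_b))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
    return exact, adjacent

# Scores from the instructor of record and the second USG reader (made up).
instructor_of_record = [4, 3, 2, 4, 1, 3, 3, 2]
second_reader        = [4, 3, 3, 4, 2, 3, 2, 2]

exact, adjacent = agreement_rates(instructor_of_record, second_reader)
print(f"exact agreement: {exact:.0%}, adjacent agreement: {adjacent:.0%}")
```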

What would this do?

  1. Have departmental discussions about what works
  2. Increase student transference: students know what they learned sooner and are able to apply it to other courses.
  3. Students, teachers, and departments have artifacts to share with external stakeholders: parents, employers.

Problems with a portfolio outcome assessment

Students

What if I don't want to switch to portfolios?

What's an outcome?

Teachers

Is this extra grading?

I don't want to use portfolios.

I don't use an e-writing platform. I like to hold the paper. I like to grade in bed. I don't like CompClass.

Administrators

Will reflective essays yield authentic prose?

Will my teachers suspect the panopticon?

Next step: Create a pilot.

Hardest thing to overcome after a pilot? Most likely the work, says Bob.

11:15 – 12:00 Assessing Critical Thinking through Writing: a University-wide Initiative

Marti Singer, Georgia State University's Rhetoric and Composition Program



Nick's Raw Notes From Marti Singer's Presentation



Key URLs from Marti's talk:
Critical Thinking Through Writing: http://www.gsu.edu/ctw/

Marti currently chairs the SACS accreditation committee.

Leading up to that, she was part of a committee that worked on establishing university wide learning outcomes (LOs). In 2003 that committee came up w/ the following LOs:

Written communication; critical thinking; collaboration; contemporary issues; quantitative skills; technology; and then oral communication was added.

People on campus were aware of these new outcomes, but not always involved. Thus the LOs were on paper, but not yet in the curriculum, as was learned from 2005 reports on LO implementation university wide.

In 2006, the university bought into Weave Online – outcomes-based software that helps you analyze progress on outcomes.

Fall 2006: Marti was asked to chair the university-wide assessment committee to analyze the Weave results.

The Assoc. Provost for Institutional Effectiveness had them look at one thing for SACS, and they decided everyone cares about thinking and writing, so that was the focus. WAC helped, and they developed the Critical Thinking Through Writing initiative: http://www.gsu.edu/ctw/

They are using University Senate committees to help get faculty buy-in. A motion was made to require every student at Georgia State to take two courses that offer CTTW.

There is a university-wide assessment committee (a big committee, 15 people) that approves the CTTW course for each major.

The next step was a group called coordinators, who offer workshops to teach and help faculty create assignments that meet CTTW. What is critical thinking in a given discipline? Hard to do.
Then when you add the writing piece, it's even harder.

How do you encourage disciplines to come on board if you say their writing has to look like mine?

As of today, all of the ambassadors (46) are to put forth and share the drafts of their initiative plans and the courses they are going to use to meet the CTTW goal.

They will look at the drafts and do follow-up workshops in November.

Have to define critical thinking
Have to describe why they chose the courses they did.

1:00 – 1:45: Restrict Rubrics to Regain Rhetorical Responsibility

Rebecca Burnett, Georgia Institute of Technology, School of Literature, Communication, and Culture (LCC)


Nick's Notes:

Restrict Rubrics to Regain Rhetorical Responsibility:

Rubrics have huge benefit because of work load; they make it possible for time-challenged instructors and placement readers and other assessors to provide quicker feedback on writing. (Rubrics are often necessary for survival.)

But there is a tension between rhetorical theory and how rubrics are applied.

Rubrics have an inherent caste system:

There is the one size fits all rubric
or
the custom rubric

One-size-fits-all example: superficial commercial tools; Rebecca recalled seeing a site that boasted instructors could "create rubrics in seconds." Not just a commercial issue, however; many higher-ed sites (including lots of WAC programs) offer examples of rubrics that fit this model. Maybe not intentionally, but certainly when an "example" rubric is taken and applied without any thought to customizing it.

Custom rubrics:
let you set item and category criteria; when enacted programmatically, they enable raters to compare artifacts. This use of rubrics makes certain assessments not only easier but possible, e.g., home-grown placement exams where a program designs rubrics to match its curriculum and course goals.


Student benefits to rubrics:
• assignment specific feedback
• enable one kind of self-assessment (Herb Simon says experts can self-assess, and teaching students to self-apply rubrics can help them do that).
• encourage peer assessment – students use language and criteria of rubrics as guide to critically (and constructively) reading peers' essays.
• identify competence and weakness


Teacher Benefits:
• focus and sharpen thinking on expectations
• facilitate movement of instructors to new courses
• provoke new thinking about a process or an artifact
• provide consistency in multi-section course
• support Composing or Communication Across the Curriculum

Admin Benefits
• demo that courses deal w/ considerably more than "correctness"
• provide benchmarks for rater consistency, providing reliability for large-scale assessment.

Yet, for all these benefits, major and important-to-consider complications exist.

To excavate these complications, Rebecca passed out a copy of a rubric she found at the WWW site of a Research I university. The rubric is used to help faculty across the curriculum know what good writing is and to give students feedback on the criteria that go into good writing. (Sorry table is not formatted much, nc):


Scale ------------ excellent --------- good------------ fair-------------- poor
WRITING QUALITIES:
accuracy of content
adaptation to a particular audience
identification and development of key ideas
conformity with genre conventions
organization of information
supporting evidence
spelling
punctuation
grammar
usage

Some of the issues with the above, as brought out in the discussion Rebecca led:

No way to know how to apply rubric – checks, numbers, scores?

Concepts were vague and some things overlapped – e.g., genre and audience.

40 percent of it is on mechanics and surface errors.

The reality is that this rubric form is reproduced all over the place; people use something akin to this and feel like they're doing a good job because they're using it.

On the plus side, at least a 4-point scale does force a choice about the writing (Brian Huot noted from the audience); with an odd number of scale options, people tend to choose the middle-of-the-road option a disproportionate number of times.

Other inherent rubric problems: what will the rubric encourage and reward?

• will it reward risk-taking
• or will it encourage conformity.

A lot depends on the content of the rubric as to whether it encourages risk-taking or conformity.

NC questions: Can you write a rubric that encourages risk-taking? How do you do that and apply it?

Synergy of Communication is lost when using rubrics:

An argument is not inherently persuasive, nor is it persuasive in isolation.

Instead, an argument is persuasive for a particular situation, to a particular audience.

That synergy is lost in the way rubrics are typically presented.

Rubrics by their very nature create bounded objects of rhetorical elements. That is, they isolate qualities as distinct when in reality many of those qualities can only be inferred and judged in their synergistic relationship to other qualities. You cannot, in reality, separate a consideration of idea development from one of audience. How much development an idea needs often depends upon who the audience is, what the argument to that audience intends, how much space one has to write in, and other factors.

NC questions: is it possible to apply rubrics with synergy kept in mind? Can you assess development in light of intended audience? If so, how does the rubric scale and scoring communicate to the writer that the judgment on development was tied to an understanding/judgment on audience?

How might that look? What if a rubric was based on rhetorical elements? asks Rebecca.

Rhetorical Elements
• sufficiency and accuracy of content PLUS
• culture and context
• exigence
• roles/responsibilities
• purposes
• audiences
• argument
• organization
• kinds of support
• visual images
• design
• conventions of language, visuals, and design
• evaluation criteria

Think about these things in an interrelated way: some are at the forefront of the mind, some absorbed/assumed, or even native.

It's not about the specific elements, or that you list them all, but how they interact.

An alternative scale to excellent/good/fair/poor might use these terms:
• exemplary
• mature
• competent
• developing
• beginning
• basic

(These come from the Communication Across the Curriculum program at ISU, where Rebecca taught before joining Georgia Tech.)


GT now using a WOVEN Curriculum

Written communication
Oral communication
Visual communication
Electronic communication
Nonverbal communication
…individual and collaborative
…in cross-cultural and international contexts

Assessment/Rubric Features:
• idiosyncratic (match to specific assignment)
• organic
• synergistic
• self-assessing
• methodologically legitimate
• theoretically supportable

Activity we did:
Develop an assessment plan for an assignment where students were doing a health project. The scenario is that they work for a company and HR wants to do a health campaign. The students are working in teams to develop five pieces:
posters, 15-second "push" phone messages, a PowerPoint with voiceover, a memo, and a booklet on better health. Students are sent to appropriate government and health WWW sites and print sources for researching the necessary data.

At our workshop, tables worked on what they would do to develop a synergistic rubric for such an assignment. RB said rubrics can be used for formative and/or summative assessment.
I don't have as many notes on the table reports because I was spending too much time listening. I do recall that our table had some disagreement about what we would emphasize: an overall project rubric, or rubrics for the parts of the project. Brian H. noted that because each of the five elements was different in media, purpose, and audience (the memo was to company heads as a progress report, for example), each would need its own rubric.

Others felt the whole project needed a unifying rubric. I thought the challenge was finding a way to do both.

NC final thoughts and questions: I remember a room consensus that ideally the feedback would come from seeing what worked. In a real office, the effectiveness of the campaign would be measured by changes in employee behavior and their reception of the campaign. But in fact, a lot of the feedback work on such a project wouldn't be rubric-based. It would be discussion-based, meeting-based, and workshop-based. The team and the team's managers would meet to discuss their campaigns, to answer questions on why something was the way it was, on why they thought it was effective. Assessment would be in the form of acceptance, rejection, or acceptance with revisions (not so dissimilar from academic journal procedures, only faster).

NC Questions: Can you really create a rubric that approximates that dynamic in some way? What if the rubric were used in a feedback meeting with each team in a teacher conference?

1:45 – 2:30 Collaboration and Transparency: Creating and Disseminating Program Assessment Materials

Karen Gardiner and Jessica Kidd, University of Alabama English Department


Nick's Notes:

UA is building a new writing program. The old program had – and this is my word, not our speakers' – atrophied, fallen into a kind of autopilot where adjunct and TA training had faded and folks were teaching pretty much how and what they wanted in the FYW course.

TA training wasn't working

The FYWP was inconsistent: no common goals, no common course curriculum.

Carolyn Handa came in as the new WPA and has helped them make constructive changes and strides.

At about the same time as Carolyn began, the Dean of the school (a math prof by training) became interested in learning-centered teaching (because of the SACS review).

So there was a need to redesign the FYWP both for the sake of program integrity and to meet the pressure from above that the courses be more learning-centered (rather than lecture-centered, as many of the FYW courses had become).

In fall of 2005, C. Handa started a composition committee to derive goals, a mission statement, and outcomes for FYC. She reached out to all levels of the dept.: grads, adjuncts, tenure-track faculty, and so on. Using the fact that there was pressure (and support) from the top, Carolyn moved to change the program from the roots up, which is key to buy-in from key people – those who will teach the new courses and curriculum.

People were excited and engaged to be part of the project. They started by creating a mission statement, which they wanted to meet these criteria:

  1. The mission statement needed to reflect UA's and UA's A&S's mission statements.
  2. It needed to address the whole spectrum of FY students – from ESL to Honors.
  3. They wanted to avoid negative language.
  4. They didn't want to claim they were handling all writing-teaching needs for everyone.

Next, they established goals, objectives, and then outcomes for each course.

To see what they accomplished, including the mission statement, go to http://www.as.ua.edu/fwp

To see how a particular course was given its Goals, Objectives, and Outcomes (GOO), go to

http://www.as.ua.edu/fwp/101cg_ro.html

Happening now:

Ongoing process, not nearly done yet.

The mission this semester is to move from GOO to rubrics. The process is recursive – establishing rubrics makes them look back and see that some goals/objectives/outcomes are impossible to do or not clearly defined.

The process and discussion remains highly collaborative – lots of emailing to lots of people.

Also important to the process is keeping things very transparent, in four ways:

  1. Simply creating a WWW site so people can visit and see what is going on. (They couldn't get the elephant to sing the Alabama fight song.) It enables them to share quickly with constituents of all kinds what they are doing, and helps when talking to high school teachers.

  2. Created a customized handbook with the comp program. They use A Writer's Reference and Comment. They're brainstorming how to have a Comment code in the tab and so on to make things easier for students. Comment instructions, along with the goals, objectives, outcomes, and other materials from the program, are part of the book. It's a really important tool for getting program information into the hands of the people who need it, and it can change each year to reflect program changes and new policies.

  3. The site and handbook combine to help with TA training. They used to have prescribed texts picked by someone else; a TA was given the book and assignments to use and sent to class to work on their own. TAs can pick their own books now and can use anything that works for them. TAs take a 3-credit-hour pedagogy course where they select and analyze texts, create a syllabus, and analyze and create course materials. Then, before teaching, there is a week-long orientation. There are also four teaching observations, all formative (2 by peers; 2 by faculty), student opinion surveys on the class and learning, and a teaching dossier that uses the Dean's form. The GOO also helps with teacher growth.

  4. The opportunity that comes with being this self-aware extends to outreach. Karen's interest is the high school to college transition. They are working on alignment. They did a workshop with 25 high school teachers. The HS teachers wanted to see what the college was urging so they could compare it to what they were doing. So they worked on the fly on creating assignments that HS teachers could give to their students to ready them for college.

http://www.thenexus.ws/english

Karen's personal goal is to create a site where HS teachers can find info on college expectations. GOOs are not just for one's own program, but also for others, especially HS teachers and students.

NC observation: It's also worth noting that the transparency and public information do two things that are important: they promote and advertise the program change, and they show U. admins that progress is being made on meeting UA outcomes and learning objectives. Also, the rubrics and other elements will provide benchmarks for the program to measure its progress against over time, setting up a way for fine-tuning, program self-assessment, and continued innovation.

High School connection helps with recruitment, but more importantly retention. All good stuff.

2:45 – 3:30 Using the Web to Enable Authentic Assessment Practices

George Pullman, Georgia State University's Rhetoric and Composition Program.


Nick's Notes:

Activity for table to set discussion stage and get people thinking:

1. What is assessment? Define it in two sentences (one would be better).

2. What is grading?

3. How do you make assessment not an add-on to the act of teaching?

4. How do you increase learning w/out increasing teaching? (NC: Some days I want to say, how do you not.)

+++++++++++++++++++++++
GP says: as WAC coordinator, I tell people about revision, and they have students go off and write a ten-page paper and then ask them to revise it. Now he is trying to ask for thought experiments in writing, because writing captures student engagement. He also wants to capture faculty response to that thinking.

He has set up a WWW site/database where thought experiments via writing can be created, assigned, and assessed via rubrics and written feedback by both instructors and students. The goal is to have lots of short thinking pieces, a kind of writing to learn, but tied more specifically to critical thinking through writing (à la Marti's talk).

He asks depts to write and attach their own rubrics for these experiments in thought. The real value for the department is the conversation involved in creating the rubric.

Another goal is to have content and data that can be used for study and review of the programs.

Trying to get people to construct assignments they are often not used to constructing.

The program is an elaborated email system with a database backend. You can set assignments, rubrics, and peer review, and the work goes out via email.

Write an assignment and attach rubrics. Rubrics offer dimensions for an assignment.

Rubrics are created by departments, not by individual instructors.
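
Since George didn't show the schema itself, here is a minimal, hypothetical sketch of how the pieces he described might be modeled: departments own rubrics, rubrics carry the dimensions, and each short thought-experiment assignment attaches one departmental rubric. None of these names or fields come from the actual GSU system.

```python
# Hypothetical sketch of a data model for a system like the one George
# described: departmental rubrics with dimensions, attached to short
# thought-experiment assignments that circulate by email. Not the real schema.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Rubric:
    department: str            # rubrics are created by departments, not individual instructors
    name: str
    dimensions: List[str]      # the "dimensions for an assignment"

@dataclass
class Assignment:
    instructor_email: str
    prompt: str                # the short thought experiment in writing
    rubric: Rubric
    peer_review: bool = True

@dataclass
class Submission:
    student_email: str
    assignment: Assignment
    text: str
    scores: Dict[str, int] = field(default_factory=dict)   # dimension -> score

# Example: a made-up history department rubric attached to a short thinking piece.
history_rubric = Rubric("History", "CTW short response",
                        ["claim", "use of evidence", "reasoning"])
assignment = Assignment("instructor@example.edu",
                        "In 300 words, argue whether X caused Y.",
                        history_rubric)
```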

+++++++++++++
RB says: The assignment interface and use of text only make the assumption that communication is entirely about words. At GTech, RB tries to teach students that design is part of communication and not separable. George agrees that the system he has designed doesn't accommodate the kinds of composition or critical thinking beyond text that Rebecca would need or want to use. For example, says GP, people in studio design don't want to use the program, because they think visually and the program is prose-based.

The goal, says GP, is attempted thought, not finished or even complete thought, and then getting feedback and re-thinking.

3:30 – 4:00: What have we learned?

We'll use this half hour to thank our discussion and workshop leaders and to summarize what we've learned.