How would you feel about having your essays or short written test answers graded by a software program? Instead of getting results back days or weeks later from an instructor, you’d get instant feedback — and a chance to rewrite the piece for a better grade.
This is the artificial intelligence-based grading model that EdX, a nonprofit co-founded by Harvard and MIT, has just unleashed for the courses it offers on the web. And it will make the automated software available free online to any institution that wants to use it. Are the days of teachers and tutors commenting on essays numbered?
As the New York Times points out, this new service puts EdX at the forefront of a growing ideological conflict over the role of automation in educational testing. There seems to be little controversy over using computers to grade multiple-choice and true-false exams. But relying on artificial intelligence to grade essays draws wide and vociferous criticism from educators.
According to proponents of the technology, its potential for “instant feedback” is highly useful, letting students resubmit and iteratively improve their work. But are today’s AI systems anywhere near as capable of grading essays as a real, live educator?
There seems to be little evidence that they are. Indeed, according to the group Human Readers, computerized essay grading is, even at its best, trivializing, reductive, inaccurate, poorly correlated with actual writing performance, and unfair to second-language learners and cultural minorities. As the group’s website states:
“Computers cannot ‘read.’ They cannot measure the essentials of effective written communication: accuracy, reasoning, adequacy of evidence, good sense, ethical stance, convincing argument, meaningful organization, clarity, and veracity, among others.”
In a HuffPost blog post, Steve Nelson calls the essay grading software “insulting.” He makes the compelling point that “no software can or will ever be able to discern the stuff that rests between the lines of poetry and prose.” Beauty and meaning in language rely on things left unsaid, ambiguous clues, and a broad spectrum of expression that the heart as much as the mind interprets.
Mr. Nelson also points out that automating essay grading “threatens to dehumanize and digitize a significant part of education, from pre-school to post-graduate.” Will future generations learn to write to please machines, rather than experience the give-and-take of interaction with human educators about their thoughts and feelings as expressed in words?
Grading “toolbars” and grammar checkers have been around for a while and undoubtedly have a place in the realm of computer-supported writing composition. But having Microsoft Word tell me I just used the passive voice is a very different thing from having a 2,000-word opinion piece on a thought-provoking topic evaluated by AI software whose algorithms are proprietary.
Interestingly, though, the EdX software is adaptive. It allows a human educator to grade 100 essays first, while it “observes the grading technique.” The software then assigns grades according to whatever scoring system the teacher creates (e.g., letter grade or numerical rank). It can also ostensibly provide generic feedback, such as “whether an answer was on topic or not.” But is grading essays simply about “technique,” especially given that English is perhaps the most malleable and plastic language on earth?
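To make the “grade some essays first, then let the software take over” idea concrete, here is a deliberately crude sketch in Python. It is not EdX’s algorithm, which this post does not describe; it simply assumes a grader that copies the teacher’s grade from whichever already-graded essay shares the most vocabulary with the new one. The function names and the sample essays are hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its words (a crude bag-of-words)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors, from 0.0 to 1.0."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_grade(essay, graded_examples):
    """Return the teacher's grade of the most similar already-graded essay."""
    target = word_counts(essay)
    best_grade, best_score = None, -1.0
    for text, grade in graded_examples:
        score = cosine_similarity(target, word_counts(text))
        if score > best_score:
            best_grade, best_score = grade, score
    return best_grade


# Hypothetical teacher-graded essays; the article's account would have 100.
graded_examples = [
    ("The treaty ended the war because both sides were exhausted and bankrupt.", "A"),
    ("War is bad and people do not like it.", "C"),
]

new_essay = "Both sides signed the treaty because the war had left them bankrupt."
print(predict_grade(new_essay, graded_examples))
```

Note what this toy grader rewards: overlapping words, not reasoning. A real system is surely more sophisticated, but the sketch shows why critics worry that imitating a teacher’s “technique” is not the same as reading.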
Potentially this system can add value in accelerating the grading of short-answer questions. But I don’t think the benefits of quick feedback outweigh one massive disadvantage: the feedback could well be a crock, offering insight into the mechanics of writing but blind to the nuances that are at the core of communication and persuasion.
Whatever its shortcomings and the potentially disastrous long-term consequences of its overuse, essay grading software will save time and money and facilitate the convenience of online learning. Thus it will be embraced widely and quickly, especially in MOOCs and other enormous classes where grading essays is challenging and often relies on a peer-grading model.
According to EdX, “the quality of the grading is similar to the variation you find from instructor to instructor.” No doubt the system will be improved over time. But is it “good enough”? We’ll soon see: the twelve institutions that currently participate in EdX, which offer certificates for the completion of online courses, will be using it.
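One simple way to put a claim like that to the test is to compare how often two instructors give the same grade with how often the software matches an instructor. The Python sketch below assumes exact agreement as the yardstick, which is only one of several possible measures and not necessarily the one EdX used; the grades are invented for illustration.

```python
def exact_agreement(grades_a, grades_b):
    """Fraction of essays on which two graders gave the same grade."""
    if not grades_a or len(grades_a) != len(grades_b):
        raise ValueError("need two non-empty grade lists of equal length")
    matches = sum(1 for a, b in zip(grades_a, grades_b) if a == b)
    return matches / len(grades_a)


# Invented grades for the same six essays (not real data).
instructor_1 = ["A", "B", "B", "C", "A", "D"]
instructor_2 = ["A", "B", "C", "C", "B", "D"]
software = ["A", "C", "B", "C", "A", "D"]

human_vs_human = exact_agreement(instructor_1, instructor_2)
software_vs_human = exact_agreement(software, instructor_1)

print(f"instructor vs. instructor: {human_vs_human:.2f}")
print(f"software vs. instructor:   {software_vs_human:.2f}")
```

Even if the two numbers come out close, agreeing on the grade says nothing about whether the accompanying feedback is any good.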
Featured image courtesy of ntoper.