In this series of blog posts, we explore five ways you can use Chat GPT to revolutionise your classroom. Part 1 explained how to create more personalised learning experiences, while Part 2 discussed how to use Chat GPT as a virtual teaching assistant. Part 3 explored how teachers can take advantage of AI’s language and communication skills, and Part 4 examined possibilities for collaboration and peer-to-peer learning using Chat GPT. Part 5 concludes the series with ways teachers can use Chat GPT to increase their grading and feedback efficiency.

In these blog posts, I include screenshots of Chat GPT responses as well as usable prompts for teachers. Keep in mind that, because Chat GPT is continuously updated, responses to these prompts may vary in content or format. Also, the version of Chat GPT used in these blog posts is the free version, not GPT-4, to keep these methods accessible to all teachers.

5. Grading and Feedback Efficiency

Chat GPT streamlines the grading and feedback process, saving teachers valuable time. Teachers can use Chat GPT to automate grading for objective questions such as multiple-choice or fill-in-the-blank exercises. Chat GPT can quickly analyse student responses, provide instant feedback, and calculate scores. This automation allows teachers to focus on more nuanced evaluation and personalised feedback for subjective assignments. Teachers can also use Chat GPT to track student progress over time, identifying areas of improvement and adjusting instructional strategies accordingly. The efficient grading and feedback process facilitated by Chat GPT enhances student understanding, motivation, and growth.
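In fact, for purely objective questions, the scoring step doesn't strictly need AI at all. As a minimal sketch (the questions, answers, and function name below are invented for illustration, not taken from Chat GPT or this post), a few lines of Python can score answers against a teacher-made key and produce instant per-question feedback:

```python
# Sketch: grading objective answers (multiple choice, fill-in-the-blank)
# against a teacher-made answer key, without any AI involved.
# Question IDs and answers below are invented examples.

def grade_objective(answer_key: dict, responses: dict) -> tuple[int, list]:
    """Return (score, per-question feedback) for one student's answers."""
    score = 0
    feedback = []
    for q, correct in answer_key.items():
        given = responses.get(q, "").strip().lower()
        if given == correct.strip().lower():
            score += 1
            feedback.append(f"{q}: correct")
        else:
            feedback.append(f"{q}: incorrect (expected '{correct}')")
    return score, feedback

key = {"Q1": "B", "Q2": "photosynthesis", "Q3": "D"}
student = {"Q1": "b", "Q2": "Photosynthesis", "Q3": "A"}
total, notes = grade_objective(key, student)
print(f"Score: {total}/{len(key)}")
for line in notes:
    print(line)
```

What Chat GPT adds on top of a strict comparison like this is tolerance for fuzzier matches (synonyms, spelling slips), which this script would mark wrong.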

Throughout writing this series, I’ve become increasingly wary of Chat GPT’s proclamations about its own capabilities. This time, however, the above paragraph is an almost-perfect description of what Chat GPT can do. There are a few limitations, which we’ll cover shortly, but for the most part, Chat GPT can save teachers immense amounts of time when it comes to grading and providing feedback. To begin, I told Chat GPT to “Provide an example of student writing and Chat GPT feedback on it after being given a grading rubric from a teacher (include the example rubric as well)”.

I was impressed with this response for multiple reasons. Chat GPT proved it can follow a rubric, provide a score, and give feedback, all within a few seconds. The time-saving implications for teachers are massive. Still, although many aspects of grading and feedback are on display above, I think it’s worth testing each one individually on real, non-AI-generated work. We looked at personalised feedback in Part 1, so in this post, I wanted to demonstrate grading and feedback capabilities beyond individual analysis. I used the same pieces of writing from Part 1 (written reflection assignments about EdTech topics) but changed the prompt: “Here are 8 written assignments. I’d like you to design a rubric for scores from 0 to 6. The highest scores will have a deep level of their own engagement with the topic as well as personal reflection rather than summary of what certain concepts are. Once you are finished analysing the written assignments, can you display the grading rubric and then give each assignment a score based on it?

[Assignments copy-pasted below]”

The output definitely took my grading priorities into account, and the rubric seemed reasonable, so I was pleased with the result. However, the resulting scores did not quite match the scores I had given the students. As Chat GPT itself admits, there is a degree of subjectivity in grading written work, so defining specific grading guidelines is essential when using AI to grade.

To get more specific, I asked Chat GPT whether it had based its scores on grammar or proper language usage. It said it had not, and asked if I would like to include that in a re-evaluation. I responded, “No, these are ESL students and I don’t want their assignments to be graded on language technicalities, unless their errors significantly affect the meaning of their work.” The response included a re-created rubric based on the same guidelines I had previously provided. Strangely, when it recalculated the scores, they differed from the original set. When I pointed this out, it recalculated again, producing yet another different set of scores. Finally, I asked it to explain why all three sets of grades were different. This was its response:

The explanation for the differences in output was fair, but I’m unsure how long this re-evaluation process could go on, or whether it would ever reach a point of consistency, especially on something as subjective as written English.

I asked Chat GPT to “create a new grading rubric for these assignments that prioritises proper English use and the clarity of ideas as well as the structure of the writing”. It did so, but when it assigned new grades, huge discrepancies once again surfaced between the grades I would have given and the ones Chat GPT provided.

In this case, the writing mistakes were clearer, as they were based on things like grammar and sentence structure, and yet Chat GPT’s scores seemed completely inconsistent, with some well-written passages being given lower scores than unclear passages with multiple language mistakes.

My lesson from this is to be careful when using Chat GPT to grade more open-ended assignments. Personally, I find its feedback much more useful and dependable than its scoring. If you do want to use it to grade written assignments, give it incredibly specific guidelines and double-check scores for consistency (meaning: ask it to grade the assignments again).

Multiple-choice question generation and grading

Chat GPT does excel when it comes to more straightforward assessment. It can generate multiple-choice questions and answers with ease, and the plain-text format means the questions can be copied and pasted into any field that accepts text. This alone should make test and quiz creation much easier, whether you build them manually or with a quiz-creation tool.

As an additional bonus, even vague and general prompts appear to be sufficient for the LLM to produce decent questions; as usual, though, the more specific you get, the better your test questions will be. The only caveat is that, for copyright reasons, you can’t prompt Chat GPT to generate assessments from specific textbooks. Instead, you can give chapter titles or even subtitles so that Chat GPT knows roughly what its questions should be based on. I tried this with the following prompt: “Create a 10-question high school-level multiple choice quiz about play, development, and learning for early years children.” Despite the potentially ambiguous wording, Chat GPT returned a very coherent quiz, partly pictured below – even though I think some of the questions arguably have more than one correct answer. (NOTE: If Chat GPT generates a quiz without an answer key, as it sometimes does, simply prompt: “Please provide the answers to the last quiz you generated.”)

There do seem to be subtle workarounds for this, though. For example, I asked Chat GPT to “Write a 10-question middle school level true or false quiz on chapter 13 of Brinkley, American History: Connecting with the Past 2017, 15e” (I must have been feeling nostalgic for my old history textbook). When it refused, I simply went along with its next suggestion and received a surprisingly relevant quiz.

How Chat GPT knew that Chapter 13 was about 19th-century US history is a mystery – and it suggests it may simply be hiding the specifics of its training data. It later told me that it based those questions on its general knowledge of US history textbooks, but I remain sceptical. When I asked it to create questions about better-known novels, such as Herman Melville’s Moby Dick or Chinua Achebe’s Things Fall Apart, it had no issues at all. Keep in mind that I varied my prompts to show the variety of assessment questions Chat GPT can effectively generate.

Depending on what systems you already have in place for grading, using Chat GPT to grade multiple-choice, true or false, or short answer questions may or may not save you time. The downside of using it for straightforward grading is that test or homework answers need to be copied in and out of the chat boxes. Many other online education tools and platforms (such as Google Classroom or ClassIn) already have test or quiz functions built in, so those may be more expedient. The area in which Chat GPT shines is prompt-specific assessment question generation.
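One way to soften that copy-and-paste downside, for teachers comfortable with a small script, is to parse the answer key Chat GPT pastes back and score responses locally. This is only a sketch under one assumption: that the key arrives as numbered lines like “1. B”, which is a common layout for Chat GPT answer keys but not a guaranteed one. The key text and student answers below are invented examples.

```python
import re

# Sketch: parse a pasted Chat GPT answer key such as "1. B" / "2. True"
# into a dict of question number -> answer, then score a student's
# answers against it. Assumes a "1. B"-style numbered-line layout.

def parse_answer_key(text: str) -> dict:
    """Turn numbered lines like '1. B' into {1: 'b', ...}."""
    key = {}
    for line in text.strip().splitlines():
        m = re.match(r"\s*(\d+)[.)]\s*(.+)", line)
        if m:
            key[int(m.group(1))] = m.group(2).strip().lower()
    return key

def score(key: dict, answers: dict) -> int:
    """Count case-insensitive exact matches against the key."""
    return sum(1 for q, a in answers.items()
               if key.get(q) == a.strip().lower())

pasted_key = """
1. B
2. True
3. C
"""
student = {1: "b", 2: "False", 3: "C"}
print(f"{score(parse_answer_key(pasted_key), student)}/{len(student)} correct")
```

If the key comes back in a different layout, it is usually quicker to re-prompt Chat GPT for a numbered list than to extend the parser.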

Conclusion on Chat GPT’s effect on assessment and feedback efficiency

Giving feedback and preparing tests are parts of a teacher’s job that can, at times, be cumbersome – they also happen to be parts that Chat GPT revolutionises. As demonstrated in the examples above, Chat GPT can create its own grading rubrics, or be prompted to do so within specific parameters. With a bit of copying and pasting, teachers have a much quicker way to grade and give feedback on writing assignments. Of course, it is highly recommended to look over AI-generated feedback before giving it to students. Additionally, teachers will need to ensure that Chat GPT is grading against the correct rubric rather than just giving general feedback (be specific when prompting it, e.g. “Using the above rubric, please grade and give feedback for the following assignment…”).

Chat GPT is also capable of creating and grading multiple-choice and other types of exams. It can be asked to create questions based on certain texts (as long as the text appears in its training data and isn’t copyright-protected) or on certain subjects. As usual, the more specific the prompt, the better the questions. Experiment to see whether the text you’re using in the classroom can be referenced directly by Chat GPT, or whether you’ll need to get more specific by prompting with chapter titles, headings, and key vocabulary. Once you get used to using it as an aid, you’ll notice how much time you can save.

Any questions? Comments? Suggestions? Feel free to send me an email at Luke.Kemper@haringeyeducationpartnership.co.uk

About the Author:

Luke Kemper

Luke Kemper is Insight and Intelligence Lead at HEP. He recently graduated from the University of Cambridge with an MPhil in Education, Globalisation and International Development. Before that, he worked for seven years as a university lecturer and high school teacher in China and Poland.

