22 January 2025

Can LLMs reliably mark responses to exam questions?

Written by

Computing at School

If you were unable to join us for the thematic community meeting on "Can LLMs reliably mark responses to exam questions?", don’t worry! Here’s a detailed recap of the session, highlighting key discussions, insights, and resources shared by our expert speaker Omar Afzal from HigherSummit.

Key Takeaways

  • AI tools like ChatGPT can provide instant, detailed feedback at scale, saving significant time for educators.
  • Large Language Models (LLMs) can mimic human-like feedback, enhancing students' learning experience.
  • Current limitations of AI include inconsistencies and challenges in understanding nuanced responses.
  • Students can challenge AI feedback, fostering critical thinking and deeper engagement.
  • Platforms like teepee.ai automate marking, analytics, and feedback, streamlining administrative workloads for teachers.

Exploring AI’s Potential in Education

Omar Afzal kicked off the session with an engaging exploration of how large language models (LLMs) like ChatGPT are revolutionising marking and feedback processes in education. He outlined the challenges educators face: balancing extensive marking loads with providing personalised, meaningful feedback.

AI, Omar explained, can address these challenges by delivering immediate, high-quality feedback at scale. For instance, LLMs can evaluate 400 student responses in under five seconds, a pace no human marker could match.

Demonstrating AI in Action

Participants were introduced to a tool developed by HigherSummit to illustrate AI's capabilities. A mock assessment asked students to write code to convert Celsius to Fahrenheit, including specific conditions for input validation. The tool not only graded the responses but also provided detailed feedback, highlighting errors, identifying areas for improvement, and suggesting actionable changes.
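
To give a sense of what a student answer to this kind of task might look like, here is a minimal Python sketch. The recap does not record the exact validation conditions used in the session, so the two checks below (the input must be a finite number, and it must not be below absolute zero) are assumptions, as are the function and constant names.

```python
import math

# Assumed lower bound for valid input; the session's actual
# validation conditions are not recorded in this recap.
ABSOLUTE_ZERO_C = -273.15


def celsius_to_fahrenheit(raw: str) -> float:
    """Convert a Celsius value supplied as text to Fahrenheit.

    Raises ValueError if the text is not a finite number or is
    below absolute zero.
    """
    try:
        celsius = float(raw.strip())
    except ValueError:
        raise ValueError(f"not a number: {raw!r}") from None
    if not math.isfinite(celsius):
        raise ValueError(f"not a finite number: {raw!r}")
    if celsius < ABSOLUTE_ZERO_C:
        raise ValueError("temperature is below absolute zero")
    return celsius * 9 / 5 + 32


print(celsius_to_fahrenheit("100"))  # 212.0
```

An answer like this gives a marker, human or AI, several distinct things to comment on: the conversion formula, the handling of non-numeric input, and the range check.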

This demonstration emphasised how AI can simulate one-on-one teacher-student interactions, providing timely and individualised insights for learners. Omar also showcased teepee.ai, a platform designed to create and mark assessments at scale. It offers analytics that help teachers identify class-wide and topic-specific learning gaps, allowing for targeted interventions.

Challenges and Limitations

Despite its advantages, AI has limitations. Omar highlighted inconsistencies in feedback and the difficulty of understanding nuanced student responses. However, these challenges can serve as learning opportunities. Students, for example, can request re-evaluations of their answers, engaging critically with the feedback process and deepening their understanding.

Questions and Community Discussions

The session concluded with an interactive Q&A where pertinent topics such as marking accuracy, GDPR compliance and personalising content were raised.

  • How accurate is AI compared to human markers? AI achieves approximately 80% accuracy. Its main advantage over traditional marking is speed.
  • Can teachers upload their own assessment materials? Although platforms like teepee.ai primarily use predefined content, custom questions can be incorporated upon request.
  • What about safeguarding student data? HigherSummit says it complies with data protection regulations, and schools can carry out their own data protection assessments before using the platform.

Next Steps for Teachers

Here are some reflective questions and practical exercises to consider as you explore AI tools for your classroom:

Reflective Questions

  • How could AI tools streamline your marking and feedback processes?
  • What are the potential benefits and drawbacks of integrating AI into your current teaching practices?
  • How might your students react to receiving AI-generated feedback?

Practical Exercises

  • Test an AI tool like ChatGPT or teepee.ai with a small set of student responses.
  • Design an activity where students critically evaluate AI feedback and suggest improvements.
  • Compare AI-generated feedback with your own to identify gaps or strengths.
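
For the comparison exercise, one simple starting point is to count how often the AI's mark matches your own. The Python sketch below is only an illustration: the marks are made-up example data, the function names are hypothetical, and this is not necessarily how the accuracy figure quoted in the session was calculated.

```python
def agreement_rate(teacher_marks, ai_marks, tolerance=0):
    """Fraction of responses where the AI mark is within
    `tolerance` marks of the teacher mark."""
    if len(teacher_marks) != len(ai_marks):
        raise ValueError("both lists must cover the same responses")
    if not teacher_marks:
        raise ValueError("no marks to compare")
    matches = sum(
        abs(t - a) <= tolerance for t, a in zip(teacher_marks, ai_marks)
    )
    return matches / len(teacher_marks)


def disagreements(teacher_marks, ai_marks):
    """Indices of responses where the two marks differ."""
    return [
        i for i, (t, a) in enumerate(zip(teacher_marks, ai_marks)) if t != a
    ]


# Hypothetical marks for five responses to a 4-mark question.
teacher = [3, 2, 4, 1, 0]
ai = [3, 2, 3, 1, 1]

print(agreement_rate(teacher, ai))               # 0.6
print(agreement_rate(teacher, ai, tolerance=1))  # 1.0
print(disagreements(teacher, ai))                # [2, 4]
```

The responses listed by `disagreements` are the ones worth rereading, since they show where the AI was either too harsh or too generous relative to your own judgement.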

Further Resources

For those eager to dive deeper, here’s a list of helpful links and resources: