Rubrics & AI

June 25, 2025

GenAI tools make information more accessible, but they can also reduce users’ inclination to critically assess or expand on that information (Larson et al., 2024). Research indicates that AI tools have the “potential to provide immense gains to experts but disproportionately harm novices” (Valcea, Hamdani, & Wang, 2024). With this in mind, we conducted a workshop for B.Ed. students who already possess expertise in designing rubrics. As part of the workshop, we included a section focused on critical AI to help participants assess and evaluate the use of AI tools in creating rubrics for assignments.

Arri Morales, an elementary teacher candidate in the B.Ed. program, joined us for this workshop and shared his findings and reflections. Explore Arri’s findings and learn more about rubrics and AI in the presentation below. (Click or tap the right-hand side of the presentation image to advance to the next slide.)

AI and Rubrics by LDDI

Reflections

by Arri Morales, B.Ed. Teacher Candidate

I investigated Copilot, ChatGPT, MagicSchool, and Brisk for this task. How credible and useful are the generated rubrics, and how well do they align with British Columbia’s curriculum? I prompted each tool to create a rubric for a spoken word project in a Grade 6/7 classroom, with a focus on figurative language and performance quality. Across the outputs, I was curious to see how each rubric would differ and where similarities would emerge.

As I reviewed each rubric, I paid attention to how each output related to British Columbia’s Provincial Proficiency Scale. The rubrics from Copilot and ChatGPT graded students on the scale’s four levels: Emerging, Developing, Proficient, or Extending. The rubrics from MagicSchool and Brisk did not carry this language; instead, they focused on how well students met expectations. Even so, each generated rubric laid out a clear progression of knowledge, marked by qualitative language such as “minimal”, “some”, “appropriate”, and “vivid”. This was familiar ground for the workshop participants: such wording produces ambiguous areas that can leave a rubric incomplete.

When I reflect on the generated rubrics, both content and progression are easy to identify: each one names content criteria and shows a progression of knowledge. However, gaps are revealed when the rubrics are brought into context-specific classrooms. What would differentiate an appropriate showing of knowledge from a vivid one? Who, then, becomes the arbiter that separates one grade from another? How would the rubric change depending on the project? It becomes difficult to rely fully on what generative AI can produce.

In the workshop, many of the participants reflected on the need for human intelligence and teacher intervention to better ground a generated rubric. It is not sufficient to adopt a generated rubric as-is, because criteria and expectations necessarily change from one context to another. Moreover, generated rubrics may homogenize education and discount diverse learning needs. With this in mind, I believe it is paramount for novice teachers to treat these generated rubrics as a springboard for meeting the various needs within their classrooms. It is worthwhile to augment the generated output with the expertise that only human teachers bring.

References

Larson, B. Z., Moser, C., Caza, A., Muehlfeld, K., & Colombo, L. A. (2024). Critical thinking in the age of generative AI. Academy of Management Learning & Education, 23(3), 373-378.

Valcea, S., Hamdani, M. R., & Wang, S. (2024). Exploring the impact of ChatGPT on business school education: Prospects, boundaries, and paradoxes. Journal of Management Education, 48(5), 915-947.