Evaluating ChatGPT's Assessment of Creative Writing

As a teaching assistant for a course that involves students submitting creative writing, I was curious to examine the strengths and limitations of using ChatGPT in assessment. After grading a class, I took one student's submission and asked ChatGPT to evaluate it based on the criteria provided to the students. I then compared the resulting response to my comments on the piece of writing, applying interpretive frameworks (phenomenology, hermeneutics, semiotics, and iconography) to compare our approaches. From this analysis, I summarised the ethics, limitations, and strengths of ChatGPT's interpretive approach in assessing creative writing.

Use Case Title:

Description:

Tutorial for Use and Best Practices:

1. Isolate the assessment criteria for the creative writing task.
2. Enter the prompt into ChatGPT: “Critically evaluate this piece of writing in relation to the criteria identified in step 1 [insert copied and pasted assignment]”. Paste the criteria themselves into the prompt, since ChatGPT cannot see what was identified in step 1. Best practices: ChatGPT's free version is limited to 4,096 characters per enquiry, so only use it for assignments that fit within this length. Also be aware that ChatGPT does not respond to the visual components of creative writing (i.e. organisation, spacing, structure, etc.).
3. Review the generated response. If needed, make the assessment more specific by asking ChatGPT to assess the application of a certain technique or theory, e.g. “Be critical of the application of X in the following piece [insert copied and pasted assignment]”.
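The prompt assembly and length check from the steps above can be sketched in Python. This is only an illustrative sketch, not part of the original workflow: the names `build_prompt`, `fits_limit`, and `MAX_CHARS` are hypothetical, the sample criteria and submission are placeholders, and the sketch assumes the 4,096-character limit applies to the whole enquiry (criteria plus assignment). It does not call ChatGPT; the resulting text would still be pasted in by hand.

```python
# Character limit quoted in the tutorial for ChatGPT's free version.
# Assumption: the limit covers the entire enquiry, not just the assignment.
MAX_CHARS = 4096


def build_prompt(criteria: list[str], submission: str) -> str:
    """Assemble the step 2 prompt with the criteria written out in full."""
    criteria_text = "\n".join(f"- {c}" for c in criteria)
    return (
        "Critically evaluate this piece of writing in relation to "
        "the following criteria:\n"
        f"{criteria_text}\n\n"
        f"{submission}"
    )


def fits_limit(prompt: str, limit: int = MAX_CHARS) -> bool:
    """Return True if the prompt is short enough to submit in one enquiry."""
    return len(prompt) <= limit


if __name__ == "__main__":
    # Placeholder criteria and text, for illustration only.
    prompt = build_prompt(
        ["Use of imagery", "Consistency of voice"],
        "The tide came in like a held breath...",
    )
    if fits_limit(prompt):
        print(prompt)
    else:
        print(f"Prompt is {len(prompt)} characters; shorten it to {MAX_CHARS}.")
```

Checking the length before pasting avoids discovering the limit only after the enquiry is rejected or truncated.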

Impacts on Learning:

In these examples, ChatGPT provided a well-organised response with a much more thorough analysis of the specific literary devices used than I did. The language I used was certainly more personal, and arguably less concise and precise. Rather than identifying and naming certain techniques, my comments tended towards a broader thematic analysis. Moreover, I noticed that I engaged with the piece in order to provoke more critical, interpretive questions about the student’s process and relationship to expression. These two approaches can be differentiated by their levels of iconographic analysis.

Naturally, ChatGPT is not familiar with the author of the piece and is therefore unable to consider their experience in its analysis. Its inability to know a person inherently limits its understanding of the text, especially at the level of considering questions of ‘why’ (the author’s motivations, purpose, etc.). Though not essential or intrinsic to the interpretation of a text, and certainly not present in all grading, this interface between artist and audience can be fruitful in the analysis of a work. In the hermeneutic sense, understanding the artist’s relationship to the text can update our understanding of the text itself, which in turn updates our understanding of the artist, and so on.

Note that this could also be considered an advantage of the technology, particularly where assessment unbiased by personal relationships is important. I will not go much further into it here, but I wish to mark ChatGPT’s lack of a relationship with the artist as a potential tool to combat subjectivity in grading, where that subjectivity is deemed problematic.

Limitations and Ethical Considerations:

There are some obvious limitations with programs such as ChatGPT when assessing creative writing: it is not familiar with specific grading systems, its interpretation of nuanced theories can be contested, its immediate response is fairly surface-level, it cannot engage with intuitive individual emotional or physical responses, and it is difficult to provide it with all the background from the readings and in-class discussions needed to gear it towards consideration of these theories. Additionally, when asked to grade the student’s work, it returned the following response: “As an AI language model, I am not programmed to assign grades or scores to pieces of writing”. Subjectivity therefore has to be re-introduced if giving scores is required. There also appears to be a limit on the amount of text the program can process in one response, restricting the types of pieces one is able to evaluate.

Moreover, and I believe most importantly, there are certain considerations in grading that are mediated by the grader’s discernment and might not be apparent to the AI. Examples include phrasing feedback in a compassionate tone, being sensitive to the unique circumstances of the student beyond their work, offering new creative directions based on an understanding of the student’s interests, responding compassionately to the personal experiences they share, and offering follow-up support. These practices can facilitate the formation of relationships within educational settings, which can be a very useful element in the learning process. Some of them are specific to the context of assessing art; many of them aren’t.

So though perhaps, for now, I haven’t yet been replaced, it is still fascinating to consider, amid debates over the use and implications of ChatGPT, how AI might be used on this side of the classroom: how we might use it to the advantage of both educators and students whilst still preserving the ‘human’ aspects I’ve emphasised as important to the feedback process.

In the words of ChatGPT: “Human judgment and interpretation are still essential components of the creative writing process, and should be used in conjunction with AI tools like ChatGPT to fully appreciate the complexity and richness of creative writing.” That is certainly one point on which we do not diverge.

Link to a Paper:

https://drive.google.com/file/d/1ybRd08zPVO6iWhC3aWsf9S7mkphYIwdE/view?usp=share_link

Built With

chatgpt

Updates

Emma Tam started this project — Oct 23, 2023 08:05 AM EDT
