The Role of ChatGPT in Introductory Programming Courses

Introduction

Programming education is on the cusp of a major transformation with the emergence of large language models (LLMs) like ChatGPT. These AI systems have demonstrated impressive capabilities in generating, explaining, and summarizing code, leading to proposals for their integration into coding courses. Aligning with ISTE Standard 4.1e for coaches, which urges the “connection of leaders, educators, and various experts to maximize technology’s potential for learning,” this post examines how ChatGPT and similar tools can be effectively integrated into introductory programming classes. It covers the benefits of AI tutors, insights from educators on their use, and current best practices and trends for deployment in the classroom.

The Current State of AI in Computer Science Education

The current integration of AI in computer science education is showing promising results. ChatGPT excels in providing personalized and patient explanations of programming concepts, offering code examples and solutions tailored to students’ individual needs. Its interactive conversational interface encourages students to engage in a dialogue, solidifying their understanding through active participation and feedback. Students can present coding issues in simple terms and receive a comprehensive, step-by-step explanation from ChatGPT, clarifying fundamental principles throughout the process.
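To make this concrete, consider the kind of snippet a student might paste in (a hypothetical Python example). The mutable default argument is a classic beginner pitfall, and an LLM tutor can walk through why the second call misbehaves and how to fix it, step by step:

```python
# A classic beginner pitfall a student might bring to ChatGPT:
# the default list is created once and shared across every call.
def add_score(score, scores=[]):
    scores.append(score)
    return scores

print(add_score(90))  # [90]
print(add_score(75))  # [90, 75] -- surprising if a fresh list was expected

# The fix a tutor would typically suggest: use None as a sentinel.
def add_score_fixed(score, scores=None):
    if scores is None:
        scores = []
    scores.append(score)
    return scores

print(add_score_fixed(90))  # [90]
print(add_score_fixed(75))  # [75]
```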

Such dynamic assistance can clarify misunderstandings more effectively than static textbooks or videos. ChatGPT’s round-the-clock availability as an AI tutor offers crucial support, bridging gaps when human instructors are unavailable. According to research by Kazemitabaar et al. (2023), using LLMs like ChatGPT can bolster students’ abilities to design algorithms and write code while reducing the stress that often accompanies these tasks. The study also noted increased enthusiasm for learning programming among many students after exposure to LLM-based instruction.

Pros of Incorporating ChatGPT into the Classroom

The rapid advancement of AI systems such as ChatGPT offers many opportunities and poses some challenges in computing education. ChatGPT’s conversational interface and its capability to provide personalized content make it an exceptional asset for adaptive learning in AI-assisted teaching. Biswas (2023) identifies multiple applications for LLMs in educational settings, including their role in creating practice problems and code examples that enhance teaching. Furthermore, ChatGPT can anticipate and provide relevant code snippets tailored to the programming task and user preferences, accelerating development processes. It can also fill in gaps in code by analyzing the existing framework and project parameters. Additionally, LLM-facilitated platforms help with explanations, documentation, and resource location for troubleshooting and diagnosing issues from error messages, streamlining debugging and reducing the time spent on minor yet frustrating problems.
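As a sketch of how an instructor might script the practice-problem use case, the snippet below uses OpenAI’s official Python client; the model name and prompt wording are illustrative assumptions, not a recommended configuration:

```python
# Sketch: generating practice problems with the OpenAI Python client
# (pip install openai). Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

def make_practice_problem(topic: str, level: str = "CS1") -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model would do
        messages=[
            {"role": "system",
             "content": "You write short programming exercises with reference solutions."},
            {"role": "user",
             "content": f"Write one {level} practice problem about {topic}, "
                        f"followed by a reference solution in Python."},
        ],
    )
    return response.choices[0].message.content

print(make_practice_problem("while loops"))
```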

Cons of Incorporating ChatGPT in Education

Despite the advantages of ChatGPT, there is concern that its proficiency in solving basic programming tasks may lead to student overreliance on its code generation, potentially diminishing actual learning, as evidenced by Finnie-Ansley et al. (2022) and Kazemitabaar et al. (2023). Finnie-Ansley’s research indicates that, while LLMs can perform at a high level (scoring in the top quartile on CS1 exams), they are not without significant error rates. Moreover, the benefits attributed to ChatGPT, such as code completion, syntax correction, and debugging assistance, overlap with features already available in modern Integrated Development Environments (IDEs).

Concerns extend to ChatGPT facilitating ‘AI-assisted cheating,’ which threatens academic integrity and assessment validity (Finnie-Ansley et al., 2022). To counteract this, researchers suggest crafting more innovative, conceptual assignments beyond simple coding tasks (Finnie-Ansley et al., 2022; Kazemitabaar et al., 2023). Educators in computing must adopt careful strategies for integrating ChatGPT, using it as a scaffolded instructional tool rather than a crutch for solving exam problems, to maintain a focus on in-depth learning.

Instructors’ Perspectives and Experiences

Lau and Guo (2023) interviewed 20 introductory programming instructors from nine countries about their adaptation strategies for LLMs such as ChatGPT and GitHub Copilot. In the near term, most instructors intend to limit the use of LLMs to curb cheating on assignments, which they view as a potential detriment to learning. Their strategies range from emphasizing in-person examinations to scrutinizing code submissions for patterns indicative of LLM use and outright prohibiting certain tools. Some, however, are keen to explore ChatGPT’s capabilities and propose cautious applications, such as demonstrating its limitations by having students assess its output against test cases.
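A minimal version of that exercise might look like the sketch below: students paste LLM-generated code and run instructor-written tests against it. The median function here is a hypothetical stand-in with deliberate bugs for students to uncover:

```python
# Hypothetical LLM output for "write a median function" -- it mishandles
# even-length lists and the empty list, which the tests below expose.
def median(values):
    values = sorted(values)
    return values[len(values) // 2]

tests = [
    (([1, 3, 2],), 2),       # passes
    (([4, 1, 3, 2],), 2.5),  # fails: naive middle-element picking
    (([],), None),           # fails: raises IndexError instead of returning None
]

for args, expected in tests:
    try:
        actual = median(*args)
    except Exception as exc:
        actual = f"raised {type(exc).__name__}"
    status = "PASS" if actual == expected else "FAIL"
    print(f"{status}: median({args[0]}) -> {actual} (expected {expected})")
```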

In contemplating the future, these educators showed greater willingness to integrate LLMs as teaching tools, recognizing their congruence with real-world job skills, their potential to enhance accessibility, and their use in facilitating more innovative forms of coursework. For example, they discussed transitioning from having students write original code to evaluating and improving upon code produced by LLMs—a few envisioned LLMs functioning as custom-tailored teaching aids for individual learners.

Pedagogical Strategies and Opportunities for Future Research

Designing problems that demand a deep understanding of concepts rather than the execution of routine coding tasks, which LLMs easily handle, is a vital pedagogical shift proposed by Finnie-Ansley et al. (2022) and Kazemitabaar et al. (2023). Utilizing ChatGPT as an interactive educational tool to complement teaching—instead of as a mere solution provider—may strike an optimal balance between its advantages and potential drawbacks. Given the pace at which AI technology is being adopted in education, there’s a pressing need for further empirical research to identify the most effective ways to integrate these tools and assess their impact on student learning.

References

Biswas, S. (2023). Role of ChatGPT in Computer Programming. Mesopotamian Journal of Computer Science, 8–16. https://doi.org/10.58496/mjcsc/2023/002

Kazemitabaar, M., Chow, J., Ma, C. K. T., Ericson, B. J., Weintrop, D., & Grossman, T. (2023). Studying the effect of AI code generators on supporting novice learners in introductory programming. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23). https://doi.org/10.1145/3544548.3580919

Finnie-Ansley, J., Denny, P., Becker, B. A., Luxton-Reilly, A., & Prather, J. (2022). The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming. Australasian Computing Education Conference. https://doi.org/10.1145/3511861.3511863

Lau, S., & Guo, P. (2023). From “Ban it till we understand it” to “Resistance is futile”: How university programming instructors plan to adapt as more students use AI code generation and explanation tools such as ChatGPT and GitHub Copilot. Proceedings of the 2023 ACM Conference on International Computing Education Research – Volume 1 (pp. 106–121). https://doi.org/10.1145/3568813.3600138

Teaching Computer Science with Minecraft

Introduction to Minecraft

Minecraft remains one of the most popular games in the world in 2023, boasting over 140 million monthly active users, according to searchlogistics.com. Despite this popularity, many players overlook the fact that Minecraft offers an engaging, immersive environment for learning terminal commands, programming basics, computational thinking, and even artificial intelligence. ISTE Standard 4.3a for coaches states that a successful coach should “Establish trusting and respectful coaching relationships that encourage educators to explore new instructional strategies.” In this blog post, I will delve into the educational benefits of Minecraft and explore the differences between the Java and Education editions.

While Minecraft is often regarded as merely a game, educators have recognized its potential as a valuable learning tool. Many of Minecraft’s core mechanics map naturally onto programming concepts. Players use blocks of various materials to construct anything they can imagine, from simple houses to complex machines that demand real knowledge of electronics, chemistry, and physics. This encourages computational thinking, creativity, and problem-solving as students work to bring their visions to life.

On the programming side, Minecraft helps teach fundamental coding concepts, including commands, functions, variables, loops, and conditionals. Students can employ block-based coding or full-fledged programming languages such as Python and JavaScript to automate actions within the game. This hands-on approach captivates students more effectively than traditional coding lessons because Minecraft gives them an imaginative space to apply new skills immediately. Creating Minecraft modifications (mods) then teaches students how to extend existing programs, a critical programming skill.
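For example, the community mcpi library connects Python to a Java Edition server running the RaspberryJuice plugin (or to the Raspberry Pi edition). Under those assumptions, a few lines of code are enough to automate a small build:

```python
# Sketch: automating a build in Python with mcpi (pip install mcpi).
# Assumes a Java Edition server with the RaspberryJuice plugin on localhost.
from mcpi.minecraft import Minecraft

mc = Minecraft.create()        # connects to localhost:4711 by default
pos = mc.player.getTilePos()   # the player's current block coordinates

# Variables and coordinates in action: a 5x5 stone platform by the player.
STONE = 1
mc.setBlocks(pos.x + 2, pos.y, pos.z + 2,
             pos.x + 6, pos.y, pos.z + 6, STONE)
mc.postToChat("Platform built!")  # immediate, visible feedback in-game
```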

Minecraft Versions

Several versions of Minecraft are available for players to choose from, including Minecraft: Java Edition, Minecraft: Bedrock Edition, Minecraft: Education Edition, and Minecraft: Pocket Edition. However, for the specific purpose of our educational analysis, we will concentrate solely on the Java and Education editions. These two versions offer unique features and opportunities for learning that make them particularly relevant in an educational context.

Minecraft: Java Edition

The Java Edition is the original version of Minecraft, developed in 2009 by Mojang Studios for Windows, macOS, and Linux, and it remains popular among long-time Minecraft players.

The Java Edition offers distinct advantages for teaching advanced computer science concepts because of its “mod-ability.” Although the game is not open source, its Java codebase can be extended through community toolchains, allowing nearly limitless customization through mods and plugins. Writing mods can illustrate a wide range of advanced programming concepts, including event handling, parallel programming, algorithms, data structures, debugging, and software design patterns. Developing mods not only imparts practical software development skills but also encourages students to exercise their creativity.
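Full mods are written in Java against toolchains such as Forge, but the flavor of event handling can be previewed in Python too. This sketch again assumes an mcpi-compatible server and simply polls for block-hit events in a loop:

```python
# Sketch: event handling via polling with mcpi. Hitting a block with a
# sword (right-click) fires an event; we react by placing glowstone.
import time
from mcpi.minecraft import Minecraft

mc = Minecraft.create()
GLOWSTONE = 89

while True:
    for hit in mc.events.pollBlockHits():  # drain the event queue
        p = hit.pos
        mc.setBlock(p.x, p.y + 1, p.z, GLOWSTONE)  # mark the struck block
        mc.postToChat(f"Block hit at ({p.x}, {p.y}, {p.z})")
    time.sleep(0.2)  # avoid hammering the server
```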

The Minecraft community has produced numerous mods that cater to various lesson plans. For instance, ComputerCraft introduces programmable turtle robots, while RedstonePlus enhances the game with advanced circuitry. The diversity of available mods supports a wide range of educational objectives, not only in CS but other disciplines.

Minecraft: Education/Bedrock Edition

Minecraft: Bedrock Edition grew out of the Pocket Edition first released in August 2011 and is particularly advantageous for classrooms with a mix of devices. Bedrock supports mobile hardware such as iPads and Android tablets, which many schools already incorporate into their teaching environments. Students can begin a Minecraft lesson on a classroom desktop computer and seamlessly continue playing on a smartphone or game console at home.

However, Bedrock Edition offers less mod support and limited access to code customization. Minecraft Education Edition is a version of Bedrock specifically tailored for classroom use. According to Microsoft, it “typically runs about one full version behind the current Minecraft Bedrock production version” (FAQ: Game Features, 2023).

Advantages of Minecraft Education in the Classroom

One of the most significant advantages of Minecraft Education in a computer science course is its block-based Code Builder (MakeCode) editor, similar to Scratch or Snap!. This editor allows students to drag and drop command blocks to perform actions in the game. Younger students can learn coding logic and structure by building houses, gardens, and machines with these visual blocks before transitioning to text-based programming languages like Python or JavaScript.

Another advantage of the Education Edition is that teachers can impose special restrictions, such as limiting chat or preventing students from destroying blocks. These classroom controls create a safe environment for student exploration. Teachers can also switch to spectator mode to observe students and provide feedback, and they can build worlds and restrict access as needed.

The Education Edition library offers hundreds of pre-made interactive worlds and lesson plans aligned with computer science curriculum standards (source: https://education.minecraft.net/en-us/resources/computer-science-subject-kit). Teachers can find lesson plans tailored to any grade level, making it much easier for educators to get started with Minecraft compared to building worlds from scratch.

Bile (2022) found that children aged 8 to 10 in a Minecraft education setting were able to solve abstract and complex scientific problems without prior prompting or theoretical knowledge, and that the game format helped them retain knowledge better. Vostinar & Dobrota (2022) similarly reported that primary school students found a Minecraft programming lesson enjoyable and easy, even though most had never programmed before in blocks or in Python. Furthermore, Klimová et al. (2021) found that girls in grades 5-10 typically outperform boys in Minecraft: Education Edition coding challenges, suggesting the platform may be a valuable tool for increasing diversity in computer science.

Disadvantages of Minecraft

As Vostinar & Dobrota (2022, p. 652) pointed out, there are significant disadvantages to using Minecraft in education. One such drawback is that Minecraft is not free and requires an additional cost per student, which, as mentioned in my previous post, raises ethical concerns about the practice of making students pay for educational software. Another disadvantage is that Minecraft may only appeal to a certain type of student, particularly those with a more creative inclination, potentially excluding students who do not have an affinity for the game.

Furthermore, teachers must become proficient in the game’s mechanics and capabilities to integrate it into the classroom effectively. Given the abundance of “cheats” in Minecraft, more experienced players may find trivial command-line solutions to problems if the teacher is unaware of their existence. Finally, as highlighted by Vostinar & Dobrota (2022), it’s essential to impose adequate constraints on the virtual world, especially when students collaborate, to prevent them from destroying the world with TNT blocks and other mining tools.

References:

Vostinar, P., & Dobrota, R. (2022). Minecraft as a Tool for Teaching Online Programming. 2022 45th Jubilee International Convention on Information, Communication and Electronic Technology (MIPRO). https://doi.org/10.23919/mipro55190.2022.9803384

Bile, A. (2022). Development of intellectual and scientific abilities through game-programming in Minecraft. Education and Information Technologies, 1–16. https://doi.org/10.1007/s10639-022-10894-z

Klimová, N., Sajben, J., & Lovászová, G. (2021). Online game-based learning through Minecraft: Education Edition programming contest. 2021 IEEE Global Engineering Education Conference (EDUCON). https://doi.org/10.1109/educon46332.2021.9453953

FAQ: Game Features. (2023, September 15). Minecraft Education. https://educommunity.minecraft.net/hc/en-us/articles/360047117692-FAQ-Game-Features

The Pros and Cons of Autograders in Programming Courses

Programming courses typically require assignments where students write code to meet specific specifications. In such courses, an autograder is an automated tool that assesses student code submissions by running input and output tests. Autograders have existed since the inception of computer science as a field of study (Hollingsworth, 1960). More recently, with the rise of large online programming courses enrolling hundreds of students, autograders have gained popularity as an efficient means of grading programming assignments (Keuning et al., 2018). They are instrumental in student engagement (Iosup & Epema, 2014) and pivotal in providing students with constructive feedback (Keuning et al., 2018). However, like any educational technology, autograders come with their own set of advantages and disadvantages that warrant consideration. This post explores the significant pros and cons of employing autograders for assessment in programming courses.
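To ground the discussion, here is a minimal sketch of input/output autograding in Python. The file name and test data are hypothetical, and a production autograder would add sandboxing, per-test reporting, and resource limits:

```python
# Minimal I/O autograder sketch: run the student's program on each test
# input and compare its stdout with the expected output.
import subprocess

TESTS = [
    ("3 4\n", "7\n"),
    ("10 -2\n", "8\n"),
]

def grade(submission="student_sum.py"):
    passed = 0
    for stdin_text, expected in TESTS:
        try:
            result = subprocess.run(
                ["python", submission],
                input=stdin_text, capture_output=True, text=True, timeout=5,
            )
        except subprocess.TimeoutExpired:
            continue  # a hung submission fails this test
        if result.stdout == expected:
            passed += 1
    return passed, len(TESTS)

print("Score: %d/%d" % grade())
```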

Several well-known proprietary autograders are currently available, including CodePost, CodeGrade, Codio, and Mimir. Each offers a wealth of academic programming resources: built-in problems, user-friendly interfaces, flexible question setting, and code review capabilities. However, these companies charge institutions a substantial annual fee, ranging from $20,000 to $100,000 CAD for a typical school of 1,000 students, and each student must additionally pay a monthly fee of $10 to $50 CAD.

In my view, such pricing is excessive (and greedy) and contradicts the principles outlined in the computer science code of ethics, particularly when the software is intended to advance software development. As a result, many post-secondary institutions opt to develop and maintain autograders in-house, tailoring them to their specific preferences. This approach allows faculty to propose new features and enhancements, and students can also contribute suggestions for improvement.

Advantages of Autograders

One of the most compelling incentives for using an autograder is the significant time savings it offers instructors compared to manual grading. Studies indicate that autograders can assess assignments at least three to four times faster than human graders (Ihantola et al., 2010; Keuning et al., 2018). This substantial reduction in grading workload allows instructors to allocate more time to essential teaching tasks such as lesson planning, curriculum development, and providing student support and feedback. The time savings can be particularly substantial in large classes.

Autograders also benefit students by providing quicker feedback on their work. This is especially valuable in introductory programming classes, where receiving prompt results on smaller assignments can significantly enhance student learning and motivation (Keuning et al., 2018). Unlike human grading, which can take days or weeks, autograders can assess submissions within seconds or minutes and instantly inform students whether their code has passed or failed the test cases. This expedited feedback allows students to validate and refine their work much more rapidly than traditional grading methods permit.

A prevalent concern with human graders is the inconsistency in grading from one assignment to another, from one student to another, or even within a single assignment. Factors such as fatigue, emotional states, and biases can impact the quality of human grading, potentially leading to unfairness or errors. Autograders, by contrast, eliminate this subjectivity by applying uniform standards and tests to all submissions, ensuring consistent and equitable grading across the entire class, and thereby enhancing student satisfaction (Hagerer, 2021).

In courses that employ autograders, students quickly learn the necessity of writing code that meets all the autograder test cases to secure maximum assignment credit. While the efficacy of test-driven development (TDD) as a software testing methodology is debatable, this workflow provides students with experience in the TDD framework. Here, students continually run tests on their code to rectify errors and attain the desired functionality (Wang et al., 2011). Essentially, autograders compel students to consider testing as an integral part of coding, rather than merely striving to meet the minimal functional requirements.
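From the student’s side, the workflow looks something like the sketch below: the tests exist first, and the implementation is revised until they pass (run with pytest; the names are illustrative):

```python
# Tests-first workflow: these assertions define the target behavior, and
# the student reruns `python -m pytest` after each change until all pass.
def count_vowels(text: str) -> int:
    return sum(1 for ch in text.lower() if ch in "aeiou")

def test_empty_string():
    assert count_vowels("") == 0

def test_mixed_case():
    assert count_vowels("Queue") == 4

def test_no_vowels():
    assert count_vowels("rhythm") == 0
```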

Disadvantages of Autograders

A significant drawback of autograders, frequently cited in the literature, is their inflexibility compared to human graders (Ihantola et al., 2010; Keuning et al., 2018; Wang et al., 2018). Autograders strictly apply identical test cases to all submissions without exception. Consequently, creative solutions that meet the assignment requirements but deviate from the expected implementation or output format are marked incorrect. Even a minor discrepancy, such as a missing space, can be the difference between a pass and a fail. Unlike autograders, human graders can exercise judgment to accommodate alternative approaches.
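A common mitigation is to normalize output before comparing, so that a stray trailing space or blank line no longer decides a grade; a small sketch:

```python
# Whitespace-tolerant comparison: strip trailing spaces and blank lines
# before checking a submission's output against the expected output.
def normalize(output: str) -> str:
    return "\n".join(line.rstrip() for line in output.strip().splitlines())

assert normalize("7 \n") == normalize("7\n")       # trailing space ignored
assert normalize("a\nb\n\n") == normalize("a\nb")  # trailing blank line ignored
```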

Most autograders assess the functional correctness of student code by checking its output on given tests. However, programming courses also aim to instill good coding practices, such as readability, modularization, adherence to naming conventions, coherent design, and appropriate commenting. Autograders do not adequately assess these crucial design and style aspects, so students may neglect good design principles as long as their code passes the functionality tests.

Another concern is uniformity. Autograders can give students a structured, consistent way to track their progress across multiple courses, but achieving consistent adoption is difficult, especially at larger institutions. Faculty members teaching diverse courses with varying requirements may prefer different tools they are more comfortable with, and some may choose not to use autograders at all. The result is a patchwork of tools from one course to the next and a disjointed student experience.

Relying exclusively on autograders poses the risk of students learning to pass test cases without acquiring a deeper understanding of programming concepts and problem-solving skills. The emphasis on meeting the autograder’s criteria can push students toward a procedural approach, focusing on achieving the correct output rather than understanding the underlying logic. Some resort to trial and error, tweaking their program until it gains autograder approval; this may secure the desired grades, but it does not foster genuine understanding or long-term retention. To discourage such over-reliance, Baniassad et al. (2021) introduced a submission penalty into the University of British Columbia’s in-house autograding tool, an adaptation that illustrates the flexibility of modifying a tool’s requirements when it is developed in-house.
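As a rough illustration of how such a policy can be encoded, a grader might deduct points for submissions beyond a free allowance. Note this is a simplified stand-in, not the regression-penalty formula Baniassad et al. actually use:

```python
# Hypothetical submission-penalty rule: a few free attempts, then a small
# deduction per extra submission, floored at zero.
def penalized_score(raw_score: float, submissions: int,
                    free_attempts: int = 5, penalty: float = 2.0) -> float:
    extra = max(0, submissions - free_attempts)
    return max(0.0, raw_score - penalty * extra)

print(penalized_score(95.0, submissions=9))  # 95 - 2*4 = 87.0
```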

Finally, like any web-based software system, autograders can suffer technical failures that halt grading and frustrate students. An incident at UC Berkeley highlights the “single point of failure” risk: when a centralized autograder goes down, all grading capability stops, and some students miss deadlines through no fault of their own. Unlike distributed human graders, a centralized automated grader is a single vulnerability to technical problems. Moreover, if instructors refuse to make accommodations for autograder malfunctions, students can feel cheated and perceive grading as unfairly disconnected from actual instruction. This speaks to larger concerns about over-reliance on algorithmic systems in education; automated aids like autograders should not be the sole means of assessment.

Conclusion

The existing body of research on autograders underscores that they are not a panacea for replacing human graders entirely. Instead, to optimize their advantages and mitigate their limitations, autograders are most effective when thoughtfully integrated into a course assessment strategy, complemented by manual grading where it is most beneficial. Below are some best practices for incorporating autograders effectively:

  • Employ autograders for basic functionality testing, while manually reviewing selected assignments for flexibility, creativity, and design.
  • Utilize autograders to assess the correctness of core logic, and rely on human graders to evaluate structure, style, and readability.
  • Complement autograder evaluations with human feedback on prevalent mistakes and areas requiring enhancement.
  • Impose penalties for excessive submissions to discourage over-reliance on the autograder.

Properly integrated, autograders align with technology integration frameworks like SAMR: they augment existing grading processes without entirely transforming them, while also redefining how students engage with programming by introducing a more gamified feedback loop. Like any educational technology, their value derives from strategic use within well-defined goals and contexts.

References

Hollingsworth, J. (1960). Automatic graders for programming classes. Communications of the ACM, 3(10), 528–529. https://doi.org/10.1145/367415.367422

Keuning, H., Jeuring, J., & Heeren, B. (2018). A systematic literature review of automated feedback generation for programming exercises. ACM Transactions on Computing Education, 19(1), 1–43. https://doi.org/10.1145/3231711

Iosup, A., & Epema, D. (2014). An experience report on using gamification in technical higher education. Proceedings of the 45th ACM Technical Symposium on Computer Science Education – SIGCSE ’14. https://doi.org/10.1145/2538862.2538899

Ihantola, P., Ahoniemi, T., Karavirta, V., & Seppälä, O. (2010). Review of recent systems for automatic assessment of programming assignments. Proceedings of the 10th Koli Calling International Conference on Computing Education Research – Koli Calling ’10. https://doi.org/10.1145/1930464.1930480

Hagerer, G. (2021). An analysis of programming course evaluations before and after the introduction of an autograder. IEEE Xplore.

Wang, T., Su, X., Ma, P., Wang, Y., & Wang, K. (2011). Ability-training-oriented automated assessment in introductory programming course. Computers & Education, 56(1), 220–226. https://doi.org/10.1016/j.compedu.2010.08.003

Baniassad, E., Zamprogno, L., Hall, B., & Holmes, R. (2021). STOP THE (AUTOGRADER) INSANITY: Regression Penalties to Deter Autograder Overreliance. Proceedings of the 52nd ACM Technical Symposium on Computer Science Education. https://doi.org/10.1145/3408877.3432430