Models to Measure Students’ Learning in Computer Science

As computer science becomes integrated into K-12 education systems worldwide, educators and researchers continuously search for effective methods to measure and understand students’ learning in this field. The challenge lies in developing reliable, comprehensive assessment models that accurately gauge student learning. Teachers must assess learning to better support students’ educational needs. Similarly, students and parents expect schools to document students’ proficiency in computing and their ability to apply it in practice. Unlike conventional subjects such as math and science, K-12 CS education has very few relevant assessments available. This article explores specific models used to measure knowledge in various CS contexts and then examines several examples of student learning indicators in computer science.

Randomized Controlled Trials and Measurement Techniques

An innovative approach to measuring student performance in computer science education involves evaluating the effectiveness of teaching parallel programming concepts. Research by Daleiden et al. (2020) focuses on assessing students’ understanding and application of these concepts.

The Token Accuracy Map (TAM) technique supplements traditional empirical analysis methods, such as timings, error counting, or compiler errors, which often lack the depth needed to analyze the causes of errors or to provide detailed insight into the specific problem areas students encounter. The study applied TAM to examine student performance across two parallel programming paradigms: threads and process-oriented programming based on Communicating Sequential Processes (CSP), measuring programming accuracy through an automated process.

The TAM approach analyzes the accuracy of student-submitted code by comparing it against a reference solution using a token-based comparison. Each element of the code, or “token,” is compared to determine its correctness, and the results are aggregated to provide an overall accuracy score ranging from 0% to 100%. This scoring system reflects the percentage of correctness, allowing a detailed examination of which specific elements of each programming paradigm students intuitively understand or are more likely to implement correctly.
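Daleiden et al. do not publish reference code for TAM, but the core idea can be sketched in a few lines: tokenize both the submission and a reference solution, align the token sequences, and report the proportion of reference tokens matched. The tokenizer, alignment strategy, and scoring rule below are illustrative assumptions, not the authors’ implementation.

```python
import re
from difflib import SequenceMatcher

def tokenize(code: str) -> list[str]:
    """Split source code into rough tokens (identifiers, numbers, single symbols)."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)

def token_accuracy(student_code: str, reference_code: str) -> float:
    """Return the percentage of reference tokens matched by the student's code.

    This mirrors the idea behind a Token Accuracy Map: compare the submission
    against a reference solution token by token and aggregate to a 0-100% score.
    """
    student, reference = tokenize(student_code), tokenize(reference_code)
    matcher = SequenceMatcher(None, reference, student)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(reference) if reference else 100.0

reference = "for i in range(10): total += i"
submission = "for i in range(10): total = total + i"
print(f"Token accuracy: {token_accuracy(submission, reference):.1f}%")
```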

This approach extends error counts, offering insights into students’ mistakes at a granular level. Such detailed analysis enables researchers and educators to identify specific programming concepts requiring further clarification or alternative teaching approaches. Additionally, TAM can highlight the strengths and weaknesses of different programming paradigms from a learning perspective, thereby guiding curriculum development and instructional design.

Competence Structure Models in Informatics

Magenheim et al. (2015) introduced a competence structure model for informatics, with a focus on system comprehension and object-oriented modelling. The model, developed as part of the MoKoM project (Modeling and Measurement of Competences in Computer Science Education), is intended to be both theoretically sound and empirically validated. The project’s goals include identifying essential competencies in the field, organizing them into a coherent framework, and devising assessments to measure them accurately. The study employed Item Response Theory (IRT) to construct the test instrument and analyze survey data.
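The article does not reproduce the IRT model itself. For orientation, the one-parameter (Rasch) formulation commonly used in such analyses expresses the probability of a correct response as a function of student ability and item difficulty; the snippet below is a generic illustration, not the MoKoM instrument.

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """1PL (Rasch) model: probability that a student of ability theta
    answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A student slightly above average ability on an average-difficulty item.
print(round(rasch_probability(theta=0.5, b=0.0), 2))  # -> 0.62
```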

The initial foundation of the competence model was based on theoretical concepts from international syllabi and curricula, such as the ACM’s “Model Curriculum for K-12 Computer Science” and expert papers on software development. This framework encompasses cognitive and non-cognitive skills pertinent to computer science, especially emphasizing system comprehension and object-oriented modelling.

The study further included conducting expert interviews using the Critical Incident Technique to validate the model’s applicability to real-world scenarios and its empirical accuracy. This method was instrumental in pinpointing and defining the critical competencies needed to apply and understand informatics systems. It also provided a detailed insight into student learning in informatics, identifying specific strengths and areas for improvement.

Limitations

The limitation of this approach is its specificity, which may hinder scalability to broader contexts or different courses. Nonetheless, the findings indicate that detailed, granular measurements can offer valuable insights into the nature and types of students’ errors and uncover learning gaps. The resources mentioned subsequently propose a more general strategy for assessing learning in computer science.

Evidence-centred Design for High School Introductory CS Courses

Another method for evaluating student learning in computer science involves using Evidence-Centered Design (ECD). Newton et al. (2021) demonstrate the application of ECD to develop assessments that align with the curriculum of introductory high school computer science courses. ECD focuses on beginning with a clear definition of the knowledge, skills, and abilities students are expected to gain from their coursework, followed by creating assessments that directly evaluate these outcomes.

The approach entails specifying the domain-specific tasks that students should be capable of performing, identifying the evidence that would indicate their proficiency, and designing assessment tasks that would generate such evidence. The model further includes an analysis of assessment items for each instructional unit, considering their difficulty, discrimination index, and item type (e.g., multiple-choice, open-ended, etc.). This analysis aids in refining the assessments to gauge student competencies and understanding more accurately.
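Newton et al. report these item statistics rather than the code used to produce them. A minimal sketch of how classical item difficulty (proportion correct) and a simple discrimination index (upper-group minus lower-group difficulty) could be computed from scored responses is shown below; the 27% grouping rule is a common convention and an assumption here, not a detail from the paper.

```python
import numpy as np

def item_statistics(scores: np.ndarray) -> list[dict]:
    """Compute per-item difficulty and a simple discrimination index.

    scores: 2D array (students x items) of 0/1 scored responses.
    Difficulty     = proportion of students answering the item correctly.
    Discrimination = difficulty in the top 27% of students (by total score)
                     minus difficulty in the bottom 27% (a common convention).
    """
    totals = scores.sum(axis=1)
    order = np.argsort(totals)
    k = max(1, int(round(0.27 * len(scores))))
    low, high = scores[order[:k]], scores[order[-k:]]
    stats = []
    for item in range(scores.shape[1]):
        stats.append({
            "item": item,
            "difficulty": scores[:, item].mean(),
            "discrimination": high[:, item].mean() - low[:, item].mean(),
        })
    return stats

# Toy data: 6 students x 3 items.
responses = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
])
for row in item_statistics(responses):
    print(row)
```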

This model offers a more precise measurement of student learning by ensuring that assessments are closely linked to curriculum objectives and learning outcomes.

Other General Student Indicators

The Exploring Computer Science website, a premier resource for research on indicators of student learning in computer science, identifies several key metrics for understanding concepts within the field:

  • Student-Reported Increase in Knowledge of CS Concepts: Students are asked to self-assess their knowledge in problem-solving techniques, design, programming, data analysis, and robotics, rating their understanding before and after instruction.
  • Persistent Motivation in Computer Problem Solving: This self-reported measure uses a 5-point Likert scale to evaluate students’ determination to tackle computer science problems. Questions include, “Once I start working on a computer science problem or assignment, I find it hard to stop,” and “When a computer science problem arises that I can’t solve immediately, I stick with it until I find a solution.”
  • Student Engagement: This metric again relies on self-reporting to gauge a student’s interest in further pursuing computer science in their studies. It assesses enthusiasm and inclination towards the subject.
  • Use of CS Vocabulary: Through pre- and post-course surveys, students respond to the prompt, “What might it mean to think like a Computer Scientist?” Responses are analyzed for the use of computer science-related keywords such as “analyze,” “problem-solving,” and “programming” (a minimal keyword count is sketched after this list). A positive correlation was found between CS vocabulary use and self-reported CS knowledge levels.
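As a rough illustration of the vocabulary metric, free-text responses can be scanned against a keyword list; the keywords below are assumptions based on the examples above, not an official ECS lexicon.

```python
import re
from collections import Counter

# Illustrative keyword list; ECS does not publish an official lexicon here.
CS_KEYWORDS = {"analyze", "problem-solving", "programming", "algorithm",
               "data", "debug", "logic", "code"}

def vocabulary_hits(response: str) -> Counter:
    """Count CS-related keywords in a free-text survey response."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)?", response.lower())
    return Counter(w for w in words if w in CS_KEYWORDS)

answer = ("Thinking like a computer scientist means you analyze a problem, "
          "design an algorithm, and debug your code.")
print(vocabulary_hits(answer))  # e.g. Counter({'analyze': 1, 'algorithm': 1, ...})
```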

Comparing the Models

Each model discussed provides distinct benefits but converges on a shared objective: to precisely gauge students’ understanding of computer science. The Evidence-Centered Design (ECD) model is notable for its methodical alignment of assessments with educational objectives, guaranteeing that evaluations accurately reflect the intended learning outcomes. Conversely, the randomized controlled trial and its novel measurement technique present a solid approach for empirically assessing the impact of instructional strategies on student learning achievements. Finally, the competence structure model offers an exhaustive framework for identifying and evaluating specific competencies within a particular field, like informatics, ensuring a thorough understanding of student abilities. As the field continues to evolve, so will our methods for measuring student success.

References

Daleiden, P., Stefik, A., Uesbeck, P. M., & Pedersen, J. (2020). Analysis of a Randomized Controlled Trial of Student Performance in Parallel Programming using a New Measurement Technique. ACM Transactions on Computing Education, 20(3), 1–28. https://doi.org/10.1145/3401892

Magenheim, J., Schubert, S., & Schaper, N. (2015). Modelling and measurement of competencies in computer science education. KEYCIT 2014: Key Competencies in Informatics and ICT, 7(1), 33–57.

Newton, S., Alemdar, M., Rutstein, D., Edwards, D., Helms, M., Hernandez, D., & Usselman, M. (2021). Utilizing Evidence-Centered Design to Develop Assessments: A High School Introductory Computer Science Course. Frontiers in Education, 6. https://doi.org/10.3389/feduc.2021.695376

Potential of LLMs and Automated Text Analysis in Interpreting Student Course Feedback

Integrating Large Language Models (LLMs) with automated text analysis tools offers a novel approach to interpreting student course feedback. As educators and administrators strive to refine teaching methods and enhance learning experiences, leveraging AI’s capabilities could unlock more profound insights from student feedback. Traditionally seen as a vast collection of qualitative data filled with sentiments, preferences, and suggestions, this feedback can now be more effectively analyzed. This blog will explore how LLMs can be utilized to interpret and classify student feedback, highlighting workflows that could benefit most teachers.

The Advantages of LLMs in Feedback Interpretation

Bano et al. (2023) shed light on the capabilities of LLMs, such as ChatGPT, in analyzing qualitative data, including student feedback. Their research found a significant alignment between human and LLM classifications of Alexa voice assistant app reviews, demonstrating LLMs’ ability to understand and categorize feedback effectively. This indicates that LLMs can grasp the nuances of student feedback, especially when the data is rich in specific word choices and context related to course content or teaching methodologies.

LLMs excel at processing and interpreting large volumes of text, identifying patterns, and extracting themes from qualitative feedback. Their capacity for thematic analysis at scale can assist educators in identifying common concerns, praises, or suggestions within students’ comments, tasks that might be cumbersome and time-consuming through manual efforts.

Limitations and Challenges

Despite their advantages, LLMs have limitations. Linse (2017) highlights that fully understanding the subtleties of student feedback requires more than text analysis; it demands contextual understanding and an awareness of biases. LLMs might not accurately interpret outliers and statistical anomalies, often necessitating human intervention to identify root causes.

Kastrati et al. (2021) identify several challenges in analyzing student feedback sentiment. One major challenge is accurately identifying and interpreting figurative speech, such as sarcasm and irony, which can convey sentiments opposite to their literal meanings. Additionally, many feedback analysis techniques designed for specific domains may falter when applied to the varied contexts of educational feedback. Handling complex linguistic features, such as double negatives, unknown proper names, abbreviations, and words with multiple meanings commonly found in student feedback, presents further difficulties. Lastly, there is a risk that LLMs might inadvertently reinforce biases in their training data, leading to skewed feedback interpretations.

Tools and Workflows

According to ChatGPT (OpenAI, 2024), a suggested workflow for analyzing data from course feedback forms is summarized as follows:

  1. Data Collection: Utilize tools such as Google Forms or Microsoft Forms to design and distribute course feedback forms, emphasizing open-ended questions to gather qualitative feedback from students.
  2. Data Aggregation: Employ automation to compile feedback data into a single repository, like a Google Sheet or Microsoft Excel spreadsheet, simplifying the analysis process.
  3. Initial Thematic Analysis: Import the aggregated feedback into qualitative data analysis software such as NVivo or ATLAS.ti. Use the software’s coding capabilities to identify recurring themes or sentiments in the feedback.
  4. LLM-Assisted Analysis: Engage an LLM, like OpenAI’s GPT, to further analyze the identified themes, categorize comments, and potentially uncover new themes that were not initially evident. It’s crucial to review AI-generated themes for accuracy and relevance (a minimal API sketch of this step follows the list).
  5. Quantitative Integration: Combine qualitative insights with quantitative data from the feedback forms (e.g., ratings) using tools like Microsoft Excel or Google Sheets. This integration offers a more holistic view of student feedback.
  6. Visualization and Presentation: Apply data visualization tools such as Google Charts or Tableau to create interactive dashboards or charts that present the findings of the qualitative analysis. Employing visual aids like word clouds for common themes, sentiment analysis graphs, and charts showing thematic distribution can render the data more engaging and comprehensible.
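A minimal sketch of step 4 might look like the following, assuming the OpenAI Python SDK (v1), an `OPENAI_API_KEY` in the environment, and an exported `feedback.csv` with a `comment` column; the model name, theme list, and prompt wording are placeholders.

```python
# Sketch of LLM-assisted theme tagging for aggregated feedback comments.
# Assumes the OpenAI Python SDK (v1); file/column names, model name, and
# the theme list are placeholders for illustration only.
import csv
from openai import OpenAI

client = OpenAI()
THEMES = ["pacing", "clarity of instruction", "technical issues", "engagement", "other"]

def classify_comment(comment: str) -> str:
    """Ask the LLM to assign one theme label to a single feedback comment."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": f"Classify the student comment into one of: {', '.join(THEMES)}. "
                        "Reply with the theme only."},
            {"role": "user", "content": comment},
        ],
    )
    return response.choices[0].message.content.strip()

with open("feedback.csv", newline="", encoding="utf-8") as f:  # aggregated export from step 2
    for row in csv.DictReader(f):
        print(row["comment"][:40], "->", classify_comment(row["comment"]))
```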

Case Study: Minecraft Education Lesson

ChatGPT’s recommended workflow was used to analyze feedback from a recent lesson on teaching functions in Minecraft Education.

Step 1: Data Collection

A Google Forms survey was distributed to students, which comprised three quantitative five-point Likert scale questions and three qualitative open-ended questions to gather comprehensive feedback.

[Figure: MCE questionnaire]

Step 2: Data Aggregation

Using Google Forms’ export to CSV feature, all survey responses were consolidated into a single file, facilitating efficient data management.

Step 3: Initial Thematic Analysis

The survey data was then imported into ATLAS.ti, an online thematic analysis tool with AI capabilities, to generate initial codes from the qualitative responses. This process revealed several major themes, providing valuable insights from the feedback.

[Figure: Results of AI coding]

Step 4: Manual Verification and Analysis

A manual review of the survey data confirmed the main themes identified by ATLAS.ti. This manual step also highlighted specific approaches students took to solve problems presented in the lesson. Overall, the AI-generated codes were quite accurate, but a closer reading of the comments (like the ones below) reveals even more insightful student suggestions.

[Figure: AI coding of student comments]

Step 5: Quantitative Integration

Because the exported spreadsheet already contained both the qualitative and quantitative responses, a separate quantitative-integration step was unnecessary.

Step 6: LLM-Assisted Analysis and Visualization

Next, the themes were analyzed further using ChatGPT’s code interpreter feature. ChatGPT summarized the aggregated data accurately and even provided Python code for generating additional visualizations, enhancing the interpretation of the feedback.

[Figure: Python pandas code provided by ChatGPT]

ChatGPT’s guidance facilitated the creation of insightful visualizations such as bar charts and word clouds.

[Figure: Bar chart of qualitative data]
[Figure: Word cloud output]
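The exact code ChatGPT generated appears only in the screenshots above; a representative sketch of that kind of pandas/matplotlib output, assuming a CSV export with a Likert-scale `rating` column and an open-ended `comment` column, might look like this:

```python
# Representative sketch only: file name, column names, and chart labels are
# placeholders, not the actual ChatGPT output.
import pandas as pd
import matplotlib.pyplot as plt
from wordcloud import WordCloud

df = pd.read_csv("mce_feedback.csv")

# Bar chart of the Likert-scale ratings.
df["rating"].value_counts().sort_index().plot(kind="bar")
plt.xlabel("Rating (1-5)")
plt.ylabel("Number of students")
plt.title("Student ratings of the Minecraft functions lesson")
plt.tight_layout()
plt.savefig("ratings_bar_chart.png")
plt.close()

# Word cloud of the open-ended comments.
text = " ".join(df["comment"].dropna())
WordCloud(width=800, height=400, background_color="white").generate(text) \
    .to_file("comments_wordcloud.png")
```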

Python offers a wealth of data visualization libraries for even more detailed analysis (https://mode.com/blog/python-data-visualization-libraries).

Best Practices for Using LLMs in Feedback Analysis

Research by Bano et al. (2023) and insights from Linse (2017) highlight the potential of LLMs and automated text analysis tools in interpreting student course feedback. Adopting best practices for integrating these technologies is critical for educators and administrators to make informed decisions that enhance teaching quality and the student learning experience, contributing to a more responsive and dynamic educational environment. Below are several recommendations:

  1. Educators or trained administrators must review AI-generated themes and categorizations to ensure alignment with the intended context and uncover nuances possibly missed by the AI. This step is vital for identifying subtleties and complexities that LLMs may not detect.
  2. Utilize insights from both AI and human analyses to inform changes in teaching practices or course content. Then, assess whether subsequent feedback reflects the effects of these changes, thereby establishing an iterative loop for continuous improvement.
  3. Offer guidance on using student course evaluations constructively. This involves understanding the context of evaluations, looking beyond average scores to grasp the distribution, and considering student feedback as one of several measures for assessing and enhancing teaching quality.
  4. This process should act as part of a holistic teaching evaluation system, which should also encompass peer evaluations, self-assessments, and reviews of teaching materials. A comprehensive approach offers a more precise and balanced assessment of teaching effectiveness.

References

Bano, M., Zowghi, D., & Whittle, J. (2023). Exploring Qualitative Research Using LLMs. arXiv. https://doi.org/10.48550/arxiv.2306.13298

Linse, A. R. (2017). Interpreting and using student ratings data: Guidance for faculty serving as administrators and on evaluation committees. Studies in Educational Evaluation, 54, 94–106. https://doi.org/10.1016/j.stueduc.2016.12.004

Kastrati, Z., Dalipi, F., Imran, A. S., Pireva Nuci, K., & Wani, M. A. (2021). Sentiment Analysis of Students’ Feedback with NLP and Deep Learning: A Systematic Mapping Study. Applied Sciences, 11(9), 3986. https://doi.org/10.3390/app11093986

OpenAI. (2024). ChatGPT (Feb 10, 2024) [Large language model]. https://chat.openai.com/chat

Effective Technology Tools for K-12 CS Teachers

Technology plays a crucial role in helping K-12 teachers introduce computer science and programming concepts. The most effective technology tools include interactive coding platforms such as Scratch, Snap!, and Blockly. These tools provide a user-friendly interface and visual coding blocks, allowing students to learn programming concepts through hands-on activities and projects (Amanullah & Bell, 2020). Additionally, online learning platforms such as Code.org offer computer science curricula and resources designed specifically for K-12 teachers. This blog examines various technologies used to teach CS in K-12 schools, drawing insights from a comprehensive study on visual programming languages (VPLs) and their suitability across different school levels.

Role of VPLs in K-12 Education:

VPLs like Scratch and ALICE have revolutionized CS education in schools. Scratch, developed by MIT, is particularly effective in elementary education due to its simplicity and interactive environment, making it an ideal tool for introducing programming concepts (Sáez-López et al., 2016). Although not web-based, ALICE has positively impacted all educational levels – elementary, high school, and undergraduate. Its ability to facilitate learning and enhance student confidence makes it an asset in the CS curriculum (Graczyńska, 2010). In a 2019 study, do Nascimento et al. concluded that different VPLs suit different school levels. The study focused on three VPLs: ALICE, Scratch, and iVProg. The findings indicate that Scratch is strongly suited to elementary education, while ALICE is more appropriate for high school students. iVProg, on the other hand, shows indications of suitability for high school and undergraduate levels.

Enhancing Computational Thinking with Scratch

Studies have shown that Scratch’s block-based programming approach can significantly improve students’ computational thinking skills. Its integration into various disciplines through programming games and projects encourages creative problem-solving and logical reasoning among students (Stewart & Baek, 2023). In a significant study, Scratch was also found to integrate well into other subjects in the curriculum, such as math, science, and even art and history, where students achieved comprehension and application levels in Bloom’s taxonomy (Sáez-López et al., 2016).

[Figure: The Scratch interface]

A key advantage of using Scratch in the classroom is that its intuitive drag-and-drop interface simplifies the programming process, allowing students to focus on the logic behind their creations rather than on code syntax. Overall, the visual programming approach via Scratch was effective for developing computational thinking, improving programming skills, enabling the creation of interactive projects, and supporting active learning pedagogies (Sáez-López et al., 2016). This is significant because Sun, Hu, and Zhou (2022) found that although girls in K-12 had higher computational thinking skills, they held more negative programming attitudes, which may impact their continued development in computational thinking. Visual programming may therefore be a good strategic approach for engaging girls in computer science.

ALICE for STEAM Education

ALICE (which stands for Alice Learning in a Cyberworld Environment) is a free 3D programming platform developed at Carnegie Mellon University. The visual aspect of ALICE makes programming concepts more engaging and hands-on for students. Actions like loops, methods, and events correspond to actual animated motions they can see on screen. This helps concretize abstract coding notions that beginners often struggle to grasp.

[Figure: Lists in ALICE]

Graczyńska (2010) highlights several example uses of ALICE targeted at middle school students:

  • Creating videos set to music, with lyrics displayed as subtitles. This combines coding with music appreciation and language arts.
  • Recording narration for animations, like reciting poetry in English or other languages. This boosts public speaking and foreign language skills.
  • Building simple games with sound effects and animations like fire. This makes programming exciting and fun.

After testing ALICE with students, Graczyńska found increased engagement and interest in programming and academics overall. The visual nature of ALICE also helps attract female students to computer science, where they are traditionally underrepresented.

The use of 3D visual programming tools like Alice has shown positive effects on students’ performance and attitude towards computer programming. Al-Tahat (2019) found that teaching visual programming greatly improved understanding of related concepts in object-oriented programming, making it a perfect fit for the intermediate grades.

Challenges and Future Directions:

The adoption of these technologies in K-12 computer science (CS) education has shown promise, yet challenges remain. There is evidence suggesting that incorporating VPLs into the K-12 curriculum can boost female engagement (Sun et al., 2022; Graczyńska, 2010), so it is important to focus on course design that appeals to diverse learners, including girls and underrepresented minorities. Additionally, ongoing research and development are necessary to keep pace with technological progress and the changing needs of education (McGill et al., 2023). Sáez-López et al. (2016) have suggested that VPLs should be implemented across various subjects, particularly in the social sciences and the arts, where their visual nature can inspire creative projects. Lastly, the successful integration of new programming tools hinges on teacher training and professional development; teachers need robust support to acquire and apply these technologies effectively.

References

Amanullah, K., & Bell, T. (2020). Teaching Resources for Young Programmers: The use of Patterns. https://doi.org/10.1109/fie44824.2020.9273985

Sáez-López, J.-M., Román-González, M., & Vázquez-Cano, E. (2016). Visual programming languages integrated across the curriculum in elementary school: A two year case study using “Scratch” in five schools. Computers & Education, 97, 129–141. https://doi.org/10.1016/j.compedu.2016.03.003

Graczyńska, E. (2010). ALICE as a tool for programming at schools. Natural Science, 2(2), 124–129. https://doi.org/10.4236/ns.2010.22021

do Nascimento, M. D., Felix, I. M., Ferreira, B. M., de Souza, L. M., Dantas, D. L., de Oliveira Brandao, L., & de Oliveira Brandao, A. (2019). Which visual programming language best suits each school level? A look at Alice, iVProg, and Scratch. 2019 IEEE World Conference on Engineering Education (EDUNINE). https://doi.org/10.1109/edunine.2019.8875788

Stewart, W., & Baek, K. (2023). Analyzing computational thinking studies in Scratch programming: A review of elementary education literature. International Journal of Computer Science Education in Schools, 6(1), 35–58. https://doi.org/10.21585/ijcses.v6i1.156

Sun, L., Hu, L., & Zhou, D. (2022). Programming attitudes predict computational thinking: Analysis of differences in gender and programming experience. Computers & Education, 181, 104457. https://doi.org/10.1016/j.compedu.2022.104457

Al-Tahat, K. (2019). The Impact of a 3D Visual Programming Tool on Students’ Performance and Attitude in Computer Programming. Journal of Cases on Information Technology, 21(1), 52–64. https://doi.org/10.4018/jcit.2019010104

Teaching Programming with Minecraft Education: A Reflection

Introduction

Integrating innovative tools to enhance learning is essential in the dynamic landscape of computer science education. This term, I embarked on a collaborative journey to weave Minecraft Education into a Programming 11/12 course. Our objective was to enliven the curriculum by presenting programming concepts in a more engaging and interactive manner. This reflection delves into our experiences, with a particular focus on the concept of functions.

Lesson Overview

Our lesson was carefully prepared to guide students through the fundamentals of functions in programming via the Minecraft Education platform. This approach aimed to convert abstract concepts into concrete, relatable experiences, thus making learning both enjoyable and impactful.

The session began with a simple introduction to functions in Minecraft Education using MakeCode, drawing parallels with real-life scenarios to demystify these concepts. The goal was to underscore the significance of reusing code efficiently. For instance, we showcased a function that could construct various parts of a structure, such as walls, roofs, and fences. This hands-on demonstration helped students visualize the workings of functions, deepening their comprehension.

Subsequently, we organized the students into small teams for a series of Minecraft challenges. Each group applied their newfound knowledge to construct farm elements using coded functions. Encouraging students to build barns, animal enclosures, and residential structures, this immersive experience was crucial in reinforcing the lessons imparted and empowering students to explore coding within the game environment. While the MakeCode IDE is freely available online at https://minecraft.makecode.com/, it is important to note that running the code within Minecraft Education itself requires a paid subscription for each student (which we lacked for this iteration).

Following the building activities, groups presented their projects, explained their code, and engaged in Q&A sessions. This exercise culminated in the creation of a complete farm ecosystem (with a small amount of manual intervention), facilitating peer learning and evaluating their understanding of the lesson.

The lesson wrapped up with a debriefing segment, which focused on the role of functions in streamlining complex coding tasks. We also distributed surveys to gauge the students’ experiences with the lesson.

Reflections and Learnings

Reflecting on the teaching process, I’ve recognized the crucial need for thorough preparation ahead of each class. Although the lesson itself was effective, there are areas where we could have utilized our time more judiciously.

Time Management:

Our planning meetings often veered towards administrative topics, detracting from the core lesson content. This experience has ingrained in me the importance of arriving at meetings well-prepared and with preliminary research completed, to maximize our collaborative efforts.

Technical Challenges:

Establishing a connection to the same Minecraft world across various platforms, such as PC and Mac, presented significant hurdles. This impacted our preparations and underscored the necessity for preemptive compatibility checks for future sessions. The tightly controlled environment of Minecraft Education by Microsoft impeded remote learning, suggesting that Minecraft Education is best suited to in-lab settings. Remote functionality was unreliable, as indicated by non-descriptive connection error messages like “timed out,” and support from Microsoft was less than helpful. The trial version of the software, supposedly available to schools with Microsoft logins, also failed to work, potentially necessitating IT intervention.

Student Engagement:

The lesson garnered positive feedback and high engagement levels, with the practical application of programming concepts within a familiar gaming environment being a key factor in its success. Nonetheless, some students noted that the inability to run the code hindered the debugging process. Ensuring every student has access to the necessary software and hardware will be a priority for future lessons.

The Power of Interactive Learning:

A major insight from this endeavour is the profound impact of interactive learning tools such as Minecraft in teaching intricate subjects like programming. Students were more engaged and assimilated the concept of functions more thoroughly compared to conventional teaching methods.

Conclusion

Incorporating Minecraft into our programming curriculum has been enlightening for students and educators. It has accentuated the significance of preparation, flexibility, and the assurance of technical compatibility to facilitate a seamless learning experience. The positive student feedback and evident boost in engagement and comprehension underscore our conviction in the power of interactive learning tools. As we progress, we are determined to refine our methods, confront the technical obstacles, and seek inventive strategies to render education more captivating and effective.

The Role of ChatGPT in Introductory Programming Courses

Introduction

Programming education is on the cusp of a major transformation with the emergence of large language models (LLMs) like ChatGPT. These AI systems have demonstrated impressive capabilities in generating, explaining, and summarizing code, leading to proposals for their integration into coding courses. Aligning with ISTE Standard 4.1e for coaches, which urges the “connection of leaders, educators, and various experts to maximize technology’s potential for learning,” this post examines how ChatGPT and similar tools can be effectively integrated into introductory programming classes. It covers the benefits of AI tutors, insights from educators on their use, and current best practices and trends for deployment in the classroom.

The Current State of AI in Computer Science Education

The current integration of AI in computer science education is showing promising results. ChatGPT excels in providing personalized and patient explanations of programming concepts, offering code examples and solutions tailored to students’ individual needs. Its interactive conversational interface encourages students to engage in a dialogue, solidifying their understanding through active participation and feedback. Students can present coding issues in simple terms and receive a comprehensive, step-by-step explanation from ChatGPT, clarifying fundamental principles throughout the process.

Such dynamic assistance clarifies misunderstandings more effectively than static textbooks or videos. ChatGPT’s round-the-clock availability as an AI tutor offers crucial support, bridging gaps when human instructors are unavailable. According to research by Kazemitabaar et al. (2023), using LLMs like ChatGPT can bolster students’ abilities to design algorithms and write code, reducing the stress often accompanying these tasks. The study also noted increased enthusiasm for learning programming among many students after exposure to LLM-based instruction.

Pros of Incorporating ChatGPT into the Classroom

The rapid advancement of AI systems such as ChatGPT offers many opportunities and poses some challenges in computing education. ChatGPT’s conversational interface and its capability to provide personalized content make it an exceptional asset for adaptive learning in AI-assisted teaching. Biswas (2023) identifies multiple applications for LLMs in educational settings, including their role in creating practice problems and code examples that enhance teaching. Furthermore, ChatGPT can anticipate and provide relevant code snippets tailored to the programming task and user preferences, accelerating development processes. It can also fill in gaps in code by analyzing the existing framework and project parameters. Additionally, LLM-facilitated platforms help with explanations, documentation, and resource location for troubleshooting and diagnosing issues from error messages, streamlining debugging and reducing the time spent on minor yet frustrating problems.

Cons of Incorporating ChatGPT in Education

Despite the advantages of ChatGPT, there is concern that its proficiency in solving basic programming tasks may lead to student overreliance on its code generation, potentially diminishing actual learning, as evidenced by Finnie-Ansley et al. (2022) and Kazemitabaar et al. (2023). Finnie-Ansley’s research indicates that, while LLMs can perform at a high level (scoring in the top quartile on CS1 exams), they are not without significant error rates. Moreover, the benefits attributed to ChatGPT, such as code completion, syntax correction, and debugging assistance, overlap with features already available in modern Integrated Development Environments (IDEs).

Concerns extend to ChatGPT facilitating ‘AI-assisted cheating,’ which threatens academic integrity and assessment validity (Finnie-Ansley et al., 2022). To counteract this, researchers suggest crafting more innovative, conceptual assignments beyond simple coding tasks (Finnie-Ansley et al., 2022; Kazemitabaar et al., 2023). Educators in computing must adopt careful strategies for integrating ChatGPT, using it as a scaffolded instructional tool rather than a crutch for solving exam problems, to maintain a focus on in-depth learning.

Instructors’ Perspectives and Experiences

In a study conducted in 2023, Lau and Guo interviewed 20 introductory programming instructors from nine countries regarding their adaptation strategies for LLMs like ChatGPT and GitHub Copilot. In the near term, most instructors intend to limit the use of LLMs to curb cheating on assignments, which they view as a potential detriment to learning. Their strategies range from emphasizing in-person examinations to scrutinizing code submissions for patterns indicative of LLM use and outright prohibiting certain tools. Some, however, are keen to explore the capabilities of ChatGPT, proposing its cautious application, such as demonstrating its limitations to students by having them assess its output against test cases.

In contemplating the future, these educators showed greater willingness to integrate LLMs as teaching tools, recognizing their congruence with real-world job skills, their potential to enhance accessibility, and their use in facilitating more innovative forms of coursework. For example, they discussed transitioning from having students write original code to evaluating and improving upon code produced by LLMs—a few envisioned LLMs functioning as custom-tailored teaching aids for individual learners.

Pedagogical Strategies and Opportunities for Future Research

Designing problems that demand a deep understanding of concepts rather than the execution of routine coding tasks, which LLMs easily handle, is a vital pedagogical shift proposed by Finnie-Ansley et al. (2022) and Kazemitabaar et al. (2023). Utilizing ChatGPT as an interactive educational tool to complement teaching—instead of as a mere solution provider—may strike an optimal balance between its advantages and potential drawbacks. Given the pace at which AI technology is being adopted in education, there’s a pressing need for further empirical research to identify the most effective ways to integrate these tools and assess their impact on student learning.

References

Biswas, S. (2023). Role of ChatGPT in Computer Programming. Mesopotamian Journal of Computer Science, 8–16. https://doi.org/10.58496/mjcsc/2023/002

Kazemitabaar, M., Chow, J., Carl, M., Ericson, B. J., Weintrop, D., & Grossman, T. (2023). Studying the effect of AI Code Generators on Supporting Novice Learners in Introductory Programming. https://doi.org/10.1145/3544548.3580919

Finnie-Ansley, J., Denny, P., Becker, B. A., Luxton-Reilly, A., & Prather, J. (2022). The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming. Australasian Computing Education Conference. https://doi.org/10.1145/3511861.3511863

Lau, S., & Guo, P. (2023). From “Ban it till we understand it” to “Resistance is futile”: How university programming instructors plan to adapt as more students use AI code generation and explanation tools such as ChatGPT and GitHub Copilot. Proceedings of the 2023 ACM Conference on International Computing Education Research – Volume 1, 106–121. https://doi.org/10.1145/3568813.3600138

Measuring Student Contribution in a Software Engineering Team

Introduction

In software engineering, there is very little consensus on how to measure an individual developer’s contribution. Although many measures have been proposed, their usefulness in industry lacks validation, particularly from the perspectives of team leaders and managers (Lima et al., 2015). The lack of measurement also challenges educators (Gardner, 2003). This post will examine student developer contributions within the context of a software engineering project.

ISTE Standard 4.6 advocates for ed tech coaches to be data-driven decision-makers using qualitative and quantitative data to inform their decisions. Standard 4.6b states, “Support educators to interpret qualitative and quantitative data to inform their decisions and support individual student learning.” Techniques discussed in this article could be used to measure student engagement and fulfillment in a team project and give insight into where instruction can be altered in a software engineering course.

I will begin by examining the use of chat platforms like Discord to track individual student contributions. Next, I’ll discuss the role of peer evaluations in assessing team member input. Lastly, I’ll introduce repository mining techniques to quantify these contributions.

Live Chat Activity

We’ll start with what I consider the least effective among the three metrics. In recent years, many modern developers have adopted Discord as a tool for real-time communication and collaboration in software engineering projects. Fundamentally, Discord channels serve as dedicated spaces for text, voice, and video communication. In educational contexts, these channels can be structured to reflect the various teams within a software project, facilitating organized, topic-specific discussions. Such channels can host various activities, from casual interactions and planning sessions to problem-solving discussions and code reviews, closely mirroring a real-world software development environment. Furthermore, Discord captures all these interactions, creating a comprehensive, searchable archive of every conversation and exchange.

Moreover, thanks to its bot-integration features, Discord is increasingly seen as an innovative tool for gauging student contributions in team-based projects. Analytical bots like Statbot offer detailed statistics on individual interactions on the platform, enabling the assessment of each student’s engagement. Chat histories also supply quantitative data on the quality of contributions in software engineering team projects.
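Statbot’s internals are not public, but the same kind of tally is straightforward once a channel history has been exported. The sketch below assumes a CSV export (e.g., produced by a chat-export tool) with `author` and `timestamp` columns; the file and column names are placeholders.

```python
# Tally per-student chat activity per week from an exported channel history.
# Assumes a CSV export with "author" and "timestamp" columns (tool-dependent).
import pandas as pd

log = pd.read_csv("team_channel_export.csv", parse_dates=["timestamp"])
weekly = (log.groupby(["author", pd.Grouper(key="timestamp", freq="W")])
             .size()
             .rename("messages")
             .reset_index())
print(weekly.pivot(index="timestamp", columns="author", values="messages").fillna(0))
```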

However, while bots offer valuable quantitative and analytical insights, it’s important to complement this data with qualitative evaluations. Direct observations, feedback sessions, and individual discussions remain indispensable for grasping the subtleties of each student’s input. It’s also vital to address privacy concerns and uphold ethical standards in monitoring, ensuring clear guidelines and transparency from the instructor’s side.

Peer Evaluations

Gardner (2003) conducted a study exploring the use of group member ratings to gauge relative contributions among students in a software engineering team project course. At the end of the project, students rate each team member’s contributions across four criteria using a five-point scale:

  • Attendance at team meetings.
  • Volunteering for and carrying out tasks.
  • Quality of work performed.
  • Effectiveness in communicating ideas.

The findings suggest that these anonymous peer ratings are reliable for ranking team members on their contributions. While students often rate themselves higher than their teammates, the relative contributions ranking remains consistent, which aligns with previous research (West, 2018, Ch.16).
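As an illustration of how such ratings might be aggregated (the paper’s exact procedure may differ), each student’s received ratings can be averaged across the four criteria and compared with the team mean:

```python
# Toy aggregation of peer ratings: each entry is (rater, ratee, [4 criterion scores, 1-5]).
from collections import defaultdict
from statistics import mean

ratings = [
    ("alice", "bob",   [5, 4, 4, 5]),
    ("carol", "bob",   [4, 4, 5, 4]),
    ("bob",   "alice", [3, 3, 4, 3]),
    ("carol", "alice", [4, 3, 3, 4]),
    ("alice", "carol", [5, 5, 5, 4]),
    ("bob",   "carol", [4, 5, 4, 4]),
]

received = defaultdict(list)
for rater, ratee, scores in ratings:
    if rater != ratee:                      # ignore self-ratings, which tend to run high
        received[ratee].append(mean(scores))

team_mean = mean(m for scores in received.values() for m in scores)
for student, scores in sorted(received.items()):
    print(f"{student}: avg {mean(scores):.2f} (team mean {team_mean:.2f})")
```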

This approach quantifies peer perceptions of engagement and effort. It motivates students to interact and collaborate and allows teams to self-manage contributions. However, limitations exist. Students may not accurately judge true contributions. Dominant personalities could influence ratings. Moreover, if grades hinge directly on these ratings, it might encourage score inflation.

Despite its limitations, peer ratings offer a systematic method to encourage and gauge participation in team projects. They represent the firsthand insights of teammates into individual efforts and team dynamics. Instructors should triangulate peer evaluations with other performance indicators to mitigate potential biases. When applied thoughtfully, group member ratings can be a scalable tool to enhance accountability and ensure equitable effort distribution within student engineering teams.

Using Git Repositories

While subjective peer evaluations are commonly used, analyzing data from git repositories provides an objective lens into individual contributions, revealing insights into aspects like collaboration patterns, subsystem ownership, and consistency of participation (Lima et al., 2015). Instructors can combine these repository-based metrics with subjective evaluations to assess student effort and engagement better.

A fundamental metric is examining each student’s number of commits over time, called code contribution (Lima et al., 2015). This helps reveal whether students contribute regularly throughout the project or make concentrated commits right before deadlines. Students with relatively few commits thinly spread across the weeks likely contributed minimally, while a student with a steady stream of commits each week demonstrates consistent engagement (Glassy, 2006).

Examining the content of commits also provides insight into contribution quality. Code complexity is widely accepted as a useful measure of contribution because it accounts for the difficulty of the sub-problem being solved. McCabe proposed his complexity measure in 1976, and it is still widely used today when mining git repositories: the complexity of the code is analyzed before and after a team member alters it. Consistently low commit complexity suggests weaker contributions to the team’s software development processes.
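A minimal sketch of the commit-count metric, run against a local clone of a team repository, is shown below; the weekly grouping is an illustrative choice rather than a prescription from the literature. Complexity deltas could be layered on top with a tool such as radon, but that is beyond this sketch.

```python
# Count commits per author per ISO week from a local git clone.
import subprocess
from collections import defaultdict
from datetime import date

log = subprocess.run(
    ["git", "log", "--pretty=format:%an|%ad", "--date=short"],
    capture_output=True, text=True, check=True,
).stdout

commits = defaultdict(int)
for line in log.splitlines():
    author, day = line.rsplit("|", 1)
    year, week, _ = date.fromisoformat(day).isocalendar()
    commits[(author, f"{year}-W{week:02d}")] += 1

for (author, week), count in sorted(commits.items()):
    print(f"{week}  {author}: {count} commit(s)")
```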

A variation on the code complexity measure is the set of bug-related measures, which capture a developer’s contribution to introducing and fixing bugs. However, this measure has limitations because some bug fixes do not require writing code, understating the developer’s effort (Lima et al., 2015). Advanced repository analysis can also reveal collaboration patterns within student teams. Tools like FRASR and ProM, introduced by Poncin et al. (2011), can extract event logs from student repository data (using FRASR) and then analyze the development process (with ProM). This approach also incorporates developer roles and adherence to particular development models.

Of course, reliance solely on git metrics has limitations. First, commits mainly represent coding contributions, overlooking other forms of participation like verbal collaboration and project leadership (Lima et al., 2015). Second, students can artificially inflate their repository activity metrics if they know the algorithm being used. Despite these drawbacks, analyzing git data provides valuable insights into individual participation on student software teams. Instructors should interpret repository metrics not as absolute contribution measures but as launching points for further investigation.

Conclusion

By balancing quantitative git data with qualitative peer evaluations, product assessments, and student interviews, instructors can obtain a more equitable evaluation of individuals. Nonetheless, there is a strong correlation between subjective and objective measures of contribution to a project (Hundhausen et al., 2022). Software engineering courses require team projects, but assessing individual accountability remains vital. Combining subjective reviews and objective repository analysis helps reveal a more accurate picture of each student’s contributions and commitment.

References

Lima, J., Treude, C., Figueira Filho, F., & Kulesza, U. (2015). Assessing developer contribution with repository mining-based metrics. https://doi.org/10.1109/icsm.2015.7332509

Gardner, W. (2003). Assessing individual contributions to group software projects. In 8th Western Canadian Conference on Computing Education (WCCCE’03) (pp. 33-50).

Hundhausen, C. D., Conrad, P. T., Carter, A. S., & Adesope, O. (2022). Assessing individual contributions to software engineering projects: a replication study. Computer Science Education, 32(3), 335–354. https://doi.org/10.1080/08993408.2022.2071543

West, R. E. (2018). Foundations of Learning and Instructional Design Technology. https://doi.org/10.59668/3

Glassy, L. (2006). Using version control to observe student software development processes. Journal of Computing Sciences in Colleges, 21(3), 99–106.

McCabe, T. J. (1976). A Complexity Measure. IEEE Transactions on Software Engineering, SE-2(4), 308–320. https://doi.org/10.1109/tse.1976.233837

Poncin, W., Serebrenik, A., & van den Brand, M. (2011). Mining student capstone projects with FRASR and ProM. https://doi.org/10.1145/2048147.2048181

The Pros and Cons of Autograders in Programming Courses

Programming courses typically require assignments where students write code to fulfill specific specifications. In such courses, an autograder serves as an automated tool that assesses student code submissions by running input and output tests. Autograders have existed since the inception of computer science as a field of study (Hollingsworth, 1960). More recently, with the growth of large online programming courses hosting up to 500 students, autograders have gained popularity as an efficient means of grading programming assignments (Keuning et al., 2018). They are instrumental in student engagement (Iosup & Epema, 2014) and pivotal in providing students with constructive feedback (Keuning et al., 2018). However, like any educational technology, autograders come with their own set of advantages and disadvantages that warrant consideration. This post aims to explore the significant pros and cons of employing autograders for assessments in programming courses.
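To make the input/output testing concrete, a bare-bones autograder check might look like the sketch below; the submission file name and test cases are placeholders. Note that the exact string comparison is also what makes autograders strict about output format, a point revisited under the disadvantages.

```python
# Bare-bones I/O autograder: run a student's program against test cases
# and compare stdout with the expected output. File name and cases are placeholders.
import subprocess

TEST_CASES = [
    {"stdin": "2 3\n", "expected": "5\n"},
    {"stdin": "10 -4\n", "expected": "6\n"},
]

def grade(submission: str = "student_solution.py") -> float:
    """Return the percentage of test cases whose output matches exactly."""
    passed = 0
    for case in TEST_CASES:
        result = subprocess.run(
            ["python", submission],
            input=case["stdin"], capture_output=True, text=True, timeout=5,
        )
        if result.stdout == case["expected"]:
            passed += 1
    return 100.0 * passed / len(TEST_CASES)

if __name__ == "__main__":
    print(f"Score: {grade():.0f}%")
```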

Several renowned proprietary programming autograders are currently available, including CodePost, CodeGrade, Codio, and Mimir. Each tool offers a wealth of academic programming resources, including built-in problems, user-friendly interfaces, flexible question setting, and code review capabilities. However, these companies impose a substantial annual fee on institutions, ranging from $20,000 to $100,000 CAD, for a standard school comprising 1000 students. Additionally, each student is required to pay a monthly fee between $10 and $50 CAD.

In my view, such pricing is excessive (and greedy) and contradicts the principles outlined in the computer science code of ethics, particularly when the software is intended to advance software development. As a result, many post-secondary institutions opt to develop and maintain autograders in-house, tailoring them to their specific preferences. This approach allows faculty to propose new features and enhancements, and students can also contribute suggestions for improvement.

Advantages of Autograders

One of the most compelling incentives for using an autograder is the significant time savings it offers instructors compared to manual grading. Studies indicate that autograders can assess assignments at least three to four times faster than human graders (Ihantola et al., 2010; Keuning et al., 2018). This substantial reduction in grading workload allows instructors to allocate more time to essential teaching tasks such as lesson planning, curriculum development, and providing student support and feedback. The time savings can be particularly substantial in large classes.

Autograders also benefit students by providing quicker feedback on their work. This is especially valuable in introductory programming classes, where receiving prompt results on smaller assignments can significantly enhance student learning and motivation (Keuning et al., 2018). Unlike human grading, which can take days or weeks, autograders can assess submissions within seconds or minutes and instantly inform students whether their code has passed or failed the test cases. This expedited feedback allows students to validate and refine their work much more rapidly than traditional grading methods permit.

A prevalent concern with human graders is the inconsistency in grading from one assignment to another, from one student to another, or even within a single assignment. Factors such as fatigue, emotional states, and biases can impact the quality of human grading, potentially leading to unfairness or errors. Autograders, by contrast, eliminate this subjectivity by applying uniform standards and tests to all submissions, ensuring consistent and equitable grading across the entire class, and thereby enhancing student satisfaction (Hagerer, 2021).

In courses that employ autograders, students quickly learn the necessity of writing code that meets all the autograder test cases to secure maximum assignment credit. While the efficacy of test-driven development (TDD) as a software testing methodology is debatable, this workflow provides students with experience in the TDD framework. Here, students continually run tests on their code to rectify errors and attain the desired functionality (Wang et al., 2011). Essentially, autograders compel students to consider testing as an integral part of coding, rather than merely striving to meet the minimal functional requirements.
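In practice this resembles the short pytest workflow below: students run the instructor-provided tests repeatedly, watch them fail, and iterate until they pass. The module and function names are illustrative.

```python
# test_stats.py -- illustrative instructor-provided tests that students run
# repeatedly (with pytest) while developing their solution.
import pytest
from stats import median  # hypothetical module students are asked to implement

def test_odd_length_list():
    assert median([3, 1, 2]) == 2

def test_even_length_list():
    assert median([1, 2, 3, 4]) == 2.5

def test_empty_list_raises():
    with pytest.raises(ValueError):
        median([])
```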

Disadvantages of Autograders

A significant drawback of autograders, frequently cited in literature, is their inflexibility compared to human graders (Ihantola et al., 2010; Keuning et al., 2018; Wang et al., 2018). Autograders strictly apply identical test cases to all submissions without exception. Consequently, creative solutions that meet the assignment requirements but deviate from the expected implementation or output format are marked incorrect. Even a minor discrepancy such as a missing whitespace can be the difference between a pass and a fail. Unlike autograders, human graders can exercise judgment to accommodate alternative approaches.

Most autograders assess the functional correctness of students’ code by evaluating its output on given tests. However, programming courses also aim to instill good coding practices, such as readability, modularization, adherence to naming conventions, coherent design, and appropriate commenting. Autograders do not adequately assess these crucial design and style aspects, which can lead students to neglect good design principles as long as their code passes the functionality tests.

Another concern is that while autograders are designed to offer students a structured means to advance their knowledge across multiple courses, achieving uniformity in their application across various courses is challenging, especially in larger institutions. Typically, post-secondary institutions employ autograders to maintain consistency across different courses, enabling students to track their progress effectively. However, in institutions where numerous faculty members teach diverse courses with varying requirements, achieving universal acceptance and use of autograders is complex. Faculty members may prefer different tools they are more comfortable with, and some might choose not to use autograders. This results in a lack of uniformity in tool usage from one course to another, creating a disjointed student experience.

Relying exclusively on autograders poses the risk of students learning to pass test cases without acquiring a deeper understanding of programming concepts and problem-solving skills. The emphasis on meeting the autograder’s criteria can lead students to adopt a procedural approach, focusing on achieving the correct output rather than understanding the underlying logic. Some might resort to a trial-and-error method, tweaking their program until it gains autograder approval. While this approach may secure the desired grades, it does not foster genuine understanding or long-term retention of knowledge. Baniassad et al. (2021) introduced a submission penalty at the University of British Columbia to discourage over-reliance on their in-house autograding tool. This adaptation exemplifies the flexibility of modifying tool requirements, a possibility uniquely available when the tool is developed in-house.

Finally, like any web-based software system, autograders can experience technical issues that lead to grading failures and student frustration. An autograder incident at UC Berkeley highlights the “single point of failure” risk, where a disruption blocks all grading capabilities. Unlike distributed human graders, a centralized automated grader is vulnerable to technical problems, and some students may miss deadlines through no fault of their own. Furthermore, if instructors refuse to make accommodations for autograder malfunctions, students can feel cheated and perceive the grading as unfairly disconnected from actual instruction. This speaks to larger concerns about over-reliance on algorithmic systems in education; automated aids like autograders should not be treated as the sole means of assessment.

Conclusion

The existing body of research on autograders underscores that they are not a panacea for replacing human graders entirely. Instead, to optimize their advantages and mitigate their limitations, autograders are most effective when thoughtfully integrated into a course assessment strategy, complemented by manual grading where it is most beneficial. Below are some best practices for incorporating autograders effectively:

  • Employ autograders for basic functionality testing, while manually reviewing selected assignments for flexibility, creativity, and design.
  • Utilize autograders to assess the correctness of core logic, and rely on human graders to evaluate structure, style, and readability.
  • Complement autograder evaluations with human feedback on prevalent mistakes and areas requiring enhancement.
  • Impose penalties for excessive submissions to discourage over-reliance on the autograder.

Proper integration of autograders aligns with technology integration frameworks like SAMR, enhancing existing processes without entirely transforming the grading in programming courses. It also redefines the manner in which students engage with programming, introducing a more gamified approach. Like any educational technology, the value of autograders is derived from their strategic utilization within well-defined goals and contexts.

References

Hollingsworth, J. (1960). Automatic graders for programming classes. Communications of the ACM, 3(10), 528–529. https://doi.org/10.1145/367415.367422

Keuning, H., Jeuring, J., & Heeren, B. (2016). Towards a Systematic Review of Automated Feedback Generation for Programming Exercises. Proceedings of the 2016 ACM Conference on Innovation and Technology in Computer Science Education. https://doi.org/10.1145/2899415.2899422

Iosup, A., & Epema, D. (2014). An experience report on using gamification in technical higher education. Proceedings of the 45th ACM Technical Symposium on Computer Science Education – SIGCSE ’14. https://doi.org/10.1145/2538862.2538899

Ihantola, P., Ahoniemi, T., Karavirta, V., & Seppälä, O. (2010). Review of recent systems for automatic assessment of programming assignments. Proceedings of the 10th Koli Calling International Conference on Computing Education Research – Koli Calling ’10. https://doi.org/10.1145/1930464.1930480

Hagerer, G. (2021). An Analysis of Programming Course Evaluations Before and After the Introduction of an Autograder. IEEE Xplore.

Wang, T., Su, X., Ma, P., Wang, Y., & Wang, K. (2011). Ability-training-oriented automated assessment in introductory programming course. Computers & Education, 56(1), 220–226. https://doi.org/10.1016/j.compedu.2010.08.003

Baniassad, E., Zamprogno, L., Hall, B., & Holmes, R. (2021). STOP THE (AUTOGRADER) INSANITY: Regression Penalties to Deter Autograder Overreliance. Proceedings of the 52nd ACM Technical Symposium on Computer Science Education. https://doi.org/10.1145/3408877.3432430

Reflecting on a Study of Competitive Programming and Cultural Inclusion

Length of Study

The study is designed to span three course offerings, which provides adequate time to collect meaningful data. The initial summer term, taught without competitive programming, establishes a baseline for comparison. The second summer term incorporates competitive programming using standardized questions, allowing assessment of this pedagogical approach on its own. The fall offering adds the dimension of culturally relevant questions, enabling analysis of their impact. Extending the study over multiple terms enables more robust data collection and analysis.

Promoting Active and Engaged Learning

The core content is delivered through weekly lectures focused on programming concepts. The competitive programming contests complement the lectures by providing opportunities to practice applying concepts. Weekly competitive programming contests foster active learning in several key ways. Students must apply conceptual knowledge to solve concrete programming problems. This process reinforces their understanding and helps identify knowledge gaps. The contest format adds an engaging gamification element through scoring, feedback, and peer comparison. Using standardized questions initially assesses whether baseline content needs are being met.

Introducing culturally relevant questions aims to promote better integration of concepts by relating them to students’ cultural knowledge and experiences. Having students co-create contest questions in the fall term further activates learning. They must think critically to develop culturally relevant problems that integrate with the content. This approach promotes deeper engagement with the material and encourages collaboration with classmates, allowing students to take ownership of their learning.

Addressing Teachers’ Needs

The study aims to provide teachers with insight into using competitive programming and culturally relevant pedagogy. The data collected will help determine the effectiveness of these approaches in an international educational setting. Instructors will gain an understanding of how competitive programming engages students versus standardized practice problems. They will also see whether student-created culturally relevant questions increase participation and motivation. The study addresses teachers’ needs for effective and inclusive instructional strategies. They will gain practical knowledge from the comparative data on different contest designs.

Promoting Collaborative Participation

Collaboration is encouraged through the group development of culturally relevant contest questions. Students can brainstorm and build on each other’s ideas, which fosters teamwork, and producing questions from diverse cultural perspectives requires working together. Students are also given the choice of problem-solving in teams, where they can motivate one another and strategize for the competitions. Their scores are tracked on a collective leaderboard, which reinforces the collaborative element. The shift from individual to team contest creation both necessitates and enables productive collaboration.

The multi-term study design, interactive contest format, customized problems, and collaborative elements demonstrate an interesting pedagogical approach that promotes engaged and inclusive learning. The results should provide valuable insights for computer science educators.

Culturally Responsive Computing Approaches

Introduction

Culturally responsive computing (CRC) is an approach to designing technology education programs and tools that responds to the cultural contexts of learners; it sits at the intersection of computer science, education, and sociocultural understanding. It has roots in the extensive and well-studied area of culturally responsive teaching (CRT), which argues that empowering diverse students requires building on the cultural assets they bring to the classroom. CRC translates the fundamental principles of CRT to computer science education and ensures that the cultural experiences of learners, particularly those from underrepresented groups, are valued and used to enhance their learning. In this blog post, I will highlight some of the research that has established the critical role CRC plays in promoting inclusion, diversity, and equity in the computer science classroom.

History of CRC

Foundational concepts for CRC were established in the mid-1990s. Henderson (1996) argued that instructional design models for teaching technology must consider diverse learners’ cultural orientations and proposed the Multiple Cultural Model for instructional design, which sheds light on the various dimensions that influence how diverse cultural groups interact with multimedia learning environments. For instance, some cultures might lean towards cooperative learning, while others favour competition.

In 1999, McLoughlin outlined features necessary for culturally appropriate online learning for Indigenous Australian students, emphasizing participatory tasks and problem-based dialogue. Subsequently, Lee (2003) presented a framework designed to ensure that computing tools and environments respond effectively to the prior knowledge, perspectives, and motivations of minority learners. The framework was demonstrated through software that supported literacy development among African American students, illustrating the effectiveness of the approach.

Limitations of the CRC Framework

Drawing on their own CRC programs, Scott, Sheridan, and Clark (2014) critiqued the limitations of traditional asset-based approaches and advocated for direct cultural responsiveness. Their arguments highlighted the following points:

  1. All youth possess the capability for digital innovation, thereby challenging deficit perspectives.
  2. Learning environments should promote transformational uses of technology.
  3. Paying attention to intersectional identities can foster innovation in computing.
  4. Students should utilize technology to reflect on their complex identities.
  5. Success should be defined by creating for community benefit rather than merely acquiring skills.

They provided examples such as critiquing biased media representations and encouraging students to create media that affirmed their identities. The implications of their arguments include the need to revise methods and measures, conduct intersectional research, and promote collaboration between computer experts and communities. CRC can potentially address digital equity through innovation, especially when implementations consider students’ multifaceted identities.

Culturally Responsive Computing Tools

Reflecting on these limitations, Morales-Chicas et al. (2019) conducted a comprehensive study on the tools and strategies employed in K-12 computing education for CRC. They identified the following emergent themes:

The first theme is sociopolitical consciousness-raising, which pertains to lessons that address real-world issues and promote activism. For example, COMPUGIRLS is a CRC program for adolescent girls of colour from underserved communities. Drawing on principles of culturally responsive teaching, including asset building, connectedness, and reflection, the program equips girls with the technological skills needed to research and address community issues. Participants reported increased confidence, the development of identities as technology innovators, and a feeling of empowerment from creating projects that address social justice issues.

Another theme is incorporating heritage culture through artifacts, such as designs and symbols. Examples include programs that encourage student-created media to challenge stereotypes and software that builds on cultural practices such as hair-braiding patterns (Eglash & Bennett, 2009). A related theme is building community connections, in which community members share cultural knowledge and motivate students to engage actively.

A third theme, vernacular culture, draws on local cultural practices that are relevant to students. An example is the African American Distributed Multiple Learning Styles System (AADMLSS), a culturally relevant learning environment that engages African American students through math content and characters drawn from their vernacular culture. Studies have reported a surge in youth engagement due to the high cultural relevance of this approach.

Lastly, the theme of lived experiences connects computing to students’ identities and real-world contexts. For instance, Scott and White (2013) argued that CRC should consider students’ lived experiences and encourage self-representation, as evidenced by a COMPUGIRLS exercise in which youth identified gender biases in avatar creation. Introducing personalized elements into a course also allows students to analyze this aspect of the computing experience critically and to customize their computing projects further.

Conclusions

Studies have scrutinized the implications of these developments in CRC. For assessment, they necessitate a move beyond narrow measures such as grades or test scores to capture complex identity outcomes (Scott & White, 2013). From a methodological perspective, they require attention to intersectionality, considering how factors such as race, gender, and class shape technology experiences (Scott, Sheridan & Clark, 2014), and more research is needed to understand CRC’s effects on diverse populations and domains. In practice, CRC should take a multi-disciplinary stance, fostering collaboration between communities, social scientists, and computer scientists (Eglash et al., 2013).

Therefore, we call on computer science educators, tech companies, and community organizations to take the following actions:

  • Allow greater curriculum flexibility for CS instructors to adapt courses to their students’ cultures and identities and to discover each student’s intersecting identities.
  • Develop alternative metrics focused on identity development, community impact, and equitable outcomes to complement skills-based measures.
  • Increase engagement of families and communities as partners in developing computing programs.
  • Foster collaboration (through incentives) between tech companies, social scientists, and CS educators to exchange knowledge.

References

McLoughlin, C. (1999). Culturally responsive technology use: developing an on‐line community of learners. British Journal of Educational Technology, 30(3), 231–243. https://doi.org/10.1111/1467-8535.00112

Lee, C. D. (2003). Toward A Framework for Culturally Responsive Design in Multimedia Computer Environments: Cultural Modeling as a Case. Mind, Culture, and Activity, 10(1), 42–61. https://doi.org/10.1207/s15327884mca1001_05

Henderson, L. (1996). Instructional design of interactive multimedia: A cultural critique. Educational Technology Research and Development, 44(4), 85–104. https://doi.org/10.1007/bf02299823

Morales-Chicas, J., Castillo, M., Bernal, I., Ramos, P., & Guzman, B. (2019). Computing with Relevance and Purpose: A Review of Culturally Relevant Education in Computing. International Journal of Multicultural Education, 21(1), 125. https://doi.org/10.18251/ijme.v21i1.1745

Eglash, R., & Bennett, A. (2009). Teaching with Hidden Capital: Agency in Children’s Computational Explorations of Cornrow Hairstyles. Children, Youth and Environments, 19(1), 58–73. https://doi.org/10.1353/cye.2009.0024

Scott, K. A., & White, M. A. (2013). COMPUGIRLS’ Standpoint. Urban Education, 48(5), 657–681. https://doi.org/10.1177/0042085913491219

Scott, K. A., Sheridan, K. M., & Clark, K. (2014). Culturally responsive computing: a theory revisited. Learning, Media and Technology, 40(4), 412–436. https://doi.org/10.1080/17439884.2014.924966

Incorporating Competitive Programming into a Beginner Programming Course

Introduction

Driven by the increasing automation and digitalization of virtually every workflow, programming has become an indispensable part of our lives. As a result, introducing programming at the earliest stage of education has become a hot topic of discussion among educators and academics alike.

A particular area of interest is competitive programming (CP). Long viewed as a niche domain, CP has traditionally been pursued by a small group of enthusiasts looking to test their coding abilities, and many faculty have dismissed it as an unnecessary part of computer science. However, recent research underscores the potential of competitive programming as a useful pedagogical tool, especially in the context of introductory programming courses. In this blog post, I’ll review existing studies on integrating CP into intro-level programming courses, examining its effects on learning outcomes, student engagement, and skill acquisition, and I’ll propose some areas of CP that require further research.

Understanding Competitive Programming

Competitive programming is a mind sport, like chess or bridge, in which participants compete to solve algorithmic problems as quickly and efficiently as possible. The ACM ICPC (Association for Computing Machinery – International Collegiate Programming Contest), which began in the 1970s, is one of the world’s oldest, largest, and most prestigious programming contests. Today it attracts tens of thousands of participants from the world’s top computer science universities.

Several elements define each problem in the contest. First, there’s a problem statement describing the issue the team needs to solve. Next are the input and output specifications, which explain the type of data the team’s program should accept and produce. Thirdly, sample inputs and outputs are given to help the team understand the problem. Finally, constraints are provided to outline the maximum size or other limitations of the inputs and the required efficiency of the solution.

The contest is scored by the number of problems solved and by a time penalty. Teams are ranked primarily by how many problems they solve: the more problems solved, the higher the rank. Ties among teams with the same number of solved problems are broken by a time penalty, calculated for each solved problem as the time from the start of the contest to the first correct submission, plus an additional penalty for every incorrect submission on that problem. The team with the lowest total penalty time is ranked highest.
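
To make this ranking rule concrete, here is a minimal sketch of ICPC-style scoring in Python, assuming the conventional 20-minute charge for each rejected submission to a problem that is eventually solved; the team data are invented for illustration.

```python
# Teams are ranked by problems solved (descending), then by total time
# penalty (ascending). Each solved problem contributes the minutes from the
# start of the contest to its first accepted submission, plus 20 minutes for
# every earlier rejected submission on that problem.

def time_penalty(solved):
    """solved: list of (minutes_to_first_accept, wrong_tries) per problem."""
    return sum(minutes + 20 * wrong for minutes, wrong in solved)

teams = {
    "Team A": [(25, 0), (70, 1), (110, 2)],  # 3 solved, penalty 265
    "Team B": [(30, 0), (95, 0), (100, 1)],  # 3 solved, penalty 245
    "Team C": [(15, 0), (40, 0)],            # 2 solved, penalty 55
}

ranking = sorted(teams.items(), key=lambda kv: (-len(kv[1]), time_penalty(kv[1])))

for rank, (name, solved) in enumerate(ranking, start=1):
    print(rank, name, len(solved), time_penalty(solved))
# 1 Team B 3 245
# 2 Team A 3 265
# 3 Team C 2 55
```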

The Impact of Competitive Programming on Beginners

Studies such as those conducted by Moreno and Pineda (2018) and Bandeira et al. (2019) employed this scoring system and contest setup to engage first-year students in programming classes. Both studies found that students introduced to competitive programming in their first year demonstrated a superior understanding of programming principles compared to those who were not. These students exhibited faster problem-solving abilities, improved code efficiency, and an increased capacity to work under pressure. Additionally, these students reported higher retention of material and reduced difficulty in grasping programming concepts.

However, not all studies concluded that CP led to improved performance. Coore and Fokum (2019), facing a lack of teaching assistants and quality feedback in first-year programming courses, employed a system of weekly competitive programming competitions to reinforce the week’s material. Their study found that while using competitive programming in assessments did increase student engagement and interest, it did not enhance the overall performance of the first-year students.

The Challenges

While CP introduces students to the rigours and excitement of coding under constraints, it’s important to recognize that CP cannot address every aspect of introductory programming. Also, certain facets of CP, such as its pace and competitive element, may only suit some learners.

Astrachan (2004) pointed out that competitive programming leaves little room for students to delve into key areas such as Object-Oriented Programming (OOP) design principles and code quality. CP emphasizes speed and efficiency, often overlooking the importance of well-structured, maintainable code, a crucial aspect of real-world development.

While competitive programming can inject a sense of competition into the classroom, it’s important to remember that it’s not a one-size-fits-all solution. The competitive aspect of CP may be intimidating for some students, leading to heightened anxiety and stress. This could, in turn, hinder learning and deter participation. Moreover, the pace of competitive programming, which requires swift comprehension of problem statements and speedy code implementation, may only cater to some learning styles. Some students may require more time to thoroughly grasp concepts and develop robust solutions, which could make the fast-paced environment of CP feel overwhelming.

Given these characteristics of CP, it’s clear that it should not be used as the sole determinant in course assessments. Relying too heavily on CP for grading could inadvertently favour students who possess abilities unrelated to computer science, such as high reading speed and fast typing. Such skills can be advantageous in a competitive programming environment but have little relevance to a student’s understanding of computer science principles or their potential as a programmer.

Future of Competitive Programming in Classrooms

Although much research has been done on introducing competitive programming into the classroom, little work explores the impact of cultural relevance in problem-setting, the role of artificial intelligence (AI) in integrating CP, or how CP interacts with the various cultural and social intersections present in academic settings.

The classroom is often characterized by a variety of cultural and social intersections. Incorporating CP in such a setting prompts us to consider how it might affect the likeability, acceptability, and academic performance across these intersections. Is CP equally appealing and accessible to students of different cultures, genders, or social backgrounds? How might the competitive nature of CP impact the dynamics of these intersections? Delving into these questions would allow us to devise strategies to ensure a more equitable and inclusive learning environment.

A unique feature of competitive programming is its creative liberty in problem-setting. This opens the possibility of integrating culturally relevant problems. Introducing programming problems referencing students’ home countries or cultures could make the learning experience more relatable and be a powerful tool to increase engagement among international students. However, the impact of such an approach is yet to be fully understood. How might culturally sensitive problems influence students’ interest and engagement? Could they enhance learning outcomes, or could they unintentionally alienate students who do not share the same cultural background?

Artificial intelligence offers exciting possibilities for CP. For instance, large language models such as ChatGPT can assist in problem-setting, which typically places a significant demand on an instructor’s time. AI-based tools could also serve as programming partners for first-year students, providing personalized assistance such as debugging help or hints for specific problems during a contest. This could supplement the responses from auto-grading judges, which are currently limited to categorized feedback that can sometimes be vague. Such an approach could increase access to individualized learning support and mitigate common challenges associated with competitive programming, such as anxiety and intimidation. However, the effectiveness of such tools and the best strategies for integrating them into the learning experience require further exploration.
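
As a purely hypothetical sketch of how richer feedback could be layered on top of a judge’s categorized verdicts, the example below maps common verdict names to concept-level hints; the verdict labels follow widespread online-judge conventions, while the hint text and the suggest_hint helper are invented for illustration and do not correspond to any particular judge’s API.

```python
# Hypothetical hint layer over categorized judge verdicts. The verdict names
# mirror common online-judge categories; the hints are invented examples of
# the kind of guidance an AI assistant might add.

JUDGE_HINTS = {
    "ACCEPTED": "Solution passed all tests.",
    "WRONG_ANSWER": "Re-check edge cases such as empty input or the largest allowed values.",
    "TIME_LIMIT_EXCEEDED": "Consider whether a faster algorithm or data structure fits the constraints.",
    "RUNTIME_ERROR": "Look for out-of-range indexing or unhandled input formats.",
    "COMPILE_ERROR": "Fix the reported syntax errors before resubmitting.",
}

def suggest_hint(verdict):
    """Return a concept-level hint for a categorized judge verdict."""
    return JUDGE_HINTS.get(verdict, "Unrecognized verdict; ask the instructor for help.")

print(suggest_hint("TIME_LIMIT_EXCEEDED"))
```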

References

Moreno, J., & Pineda, A. F. (2018). Competitive programming and gamification as strategy to engage students in computer science courses. Revista ESPACIOS, 39(35).

Bandeira, I. N., Machado, T. V., Dullens, V. F., & Canedo, E. D. (2019, October 1). Competitive programming: A teaching methodology analysis applied to first-year programming classes. IEEE Xplore. https://doi.org/10.1109/FIE43999.2019.9028518

Astrachan, O. (2004). Non-competitive programming contest problems as the basis for just-in-time teaching. https://doi.org/10.1109/fie.2004.1408553

Coore, D., & Fokum, D. (2019). Facilitating Course Assessment with a Competitive Programming Platform. Proceedings of the 50th ACM Technical Symposium on Computer Science Education. https://doi.org/10.1145/3287324.3287511