CSC/ECE 517 Fall 2019 - E1993 Track Time Between Successive Tag Assignments

Introduction

The Expertiza project takes advantage of peer review among students to allow them to learn from each other. Tracking the time that a student spends on each submitted resource helps instructors study and improve the teaching experience. Unfortunately, most peer assessment systems do not manage the content of students’ submissions within the system. They usually allow authors to submit external links to the submission (e.g. GitHub code or a deployed application), which makes it difficult for the system to track the time that reviewers spend on the submissions.

Special note

Note that we are not just "copying" from the project description: I, Kai, wrote the final project description in Google Docs in the first place and am part of the mentoring team for this very project. So technically I wrote both documents and do not want to be accused of copying.

Project Description

CSC/ECE 517 classes have helped us by “tagging” review comments over the past two years. This is important for us, because it is how we get the “labels” that we need to train our machine-learning models to recognize review comments that detect problems, make suggestions, or that are considered helpful by the authors. Our goal is to help reviewers by telling them how helpful their review will be before they submit it.

Tagging a review comment usually means sliding four sliders to one side or the other, depending on which of four attributes the comment has. But can we trust the tags that students assign? In past semesters, our checks have revealed that some students appear not to be paying much attention to the tags they assign: the tags seem to be unrelated to the characteristic they are supposed to rate, or they follow a set pattern, such as alternating one tag yes, then one tag no. Studies on other kinds of “crowdwork” have shown that the time spent between assigning each label indicates how careful the labeling (“tagging”) has been. We believe that students who tag “too fast” are probably not paying enough attention, and we want to set their tags aside to be examined by course staff and researchers.

Proposed Approach

We would like to modify the model that records review tagging actions (the answer_tag entity), adding new fields to track the time interval between successive tagging actions, and revise the controller to implement this interval tracking. A few things to take into consideration:

  • A user might not tag reviews in sequence; they may jump around and tag only the ones they are interested in.
  • An annotator who tags one rubric across all reviews before moving on to the next will behave very differently from one who tags all four rubrics of each review in turn.
  • An annotator may take long breaks to refresh and relax, or even take days off; these irregularities need to be handled.
  • A user may slide a slider back and forth a number of times before going to the next slider, or may come back later and revise a tag; these cases need to be treated differently.

The first step would be to examine the literature on measuring the reliability of crowdwork and determine appropriate measures to apply to review tagging. The next step would be to code these metrics and call them from report_formatter_helper.rb, so that they are available to the course staff when they grade answer tagging. There should also be a threshold, so that any record in the answer_tags table created by a user with suspect metrics is marked as unreliable (a field for that can be added to the answer_tags table).
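As a rough illustration of the interval tracking described above, the following plain-Ruby sketch computes the gaps between one user's successive tagging timestamps, drops gaps that look like breaks, and flags the user when the typical gap is very short. It is not the project's implementation: the method names, the constants, and the use of the median are all assumptions made for this example. The 3-minute break cutoff mirrors the interval mentioned in the test plan, while the 2-second "too fast" threshold is purely hypothetical and would need to come from the crowdwork literature.

```ruby
# Hypothetical cutoff: gaps longer than this are treated as breaks,
# not as time spent deciding on a tag (3 minutes, as in the test plan).
BREAK_CUTOFF_SECONDS = 180

# Hypothetical threshold: a median gap below this marks the tagger as suspect.
MIN_MEDIAN_SECONDS = 2.0

# Returns the gaps (in seconds) between successive tag timestamps.
# Timestamps are sorted first, so the order in which reviews were
# visited on the page does not matter; gaps above the cutoff are dropped.
def tag_intervals(timestamps, cutoff: BREAK_CUTOFF_SECONDS)
  gaps = timestamps.sort.each_cons(2).map { |earlier, later| later - earlier }
  gaps.select { |gap| gap <= cutoff }
end

# Median of a list of numbers, or nil for an empty list.
def median(values)
  return nil if values.empty?

  sorted = values.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# True when the user's median gap between tags is below the threshold.
# Users with no usable gaps (zero or one tag) are never flagged.
def suspect_tagger?(timestamps, min_median: MIN_MEDIAN_SECONDS)
  typical_gap = median(tag_intervals(timestamps))
  !typical_gap.nil? && typical_gap < min_median
end

# Example: one careful tagger who takes a break, and one rushed tagger.
start   = Time.utc(2019, 11, 1, 12, 0, 0)
careful = [0, 20, 45, 400, 430].map { |offset| start + offset }
rushed  = [0, 1, 2, 3, 4].map { |offset| start + offset }

tag_intervals(careful)   # => [20.0, 25.0, 30.0] (the 355-second break is dropped)
suspect_tagger?(careful) # => false
suspect_tagger?(rushed)  # => true
```

The sketch treats every timestamp as a separate tagging action. It does not distinguish a slider moved back and forth from a tag revised later, which the considerations above say need different handling.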

Behavior Diagram for Instructor/TA

Flow Chart

Proposed Test Plan

Automated Testing

Automated testing is not available for this specific project because:

  • To test the review tagging interval one has to
    • Create courses
    • Create an assignment
    • Set up assignment due dates as well as reviews and review tagging rubrics
    • Enroll more than 2 students in the assignment
    • Let students finish assignment
    • Change due date to the past
    • Have students review each other's work
    • Change review due date to the past
    • Have students tag each other's work, with time intervals (pause or sleep between each review comment, having at least one interval longer than 3 minutes)
    • Generate review tagging report
    • Generate review tagging time interval line chart

In addition, the content of the line chart cannot be verified with RSpec because the chart is an image, so a test could not tell whether intervals longer than 3 minutes were filtered out.

Each run of this script would take minutes even before counting the intervals themselves. Because the script would be included every time Expertiza runs its system-level tests, it would further lengthen an already long testing process.

We therefore propose to verify this functionality manually.

Team Information

  1. Anmol Desai(adesai5@ncsu.edu)
  2. Dhruvil Shah(dshah4@ncsu.edu)
  3. Jeel Sukhadia(jsukhad@ncsu.edu)
  4. Yunkai "Kai" Xiao(yxiao28@ncsu.edu)

Mentors: Yunkai "Kai" Xiao (yxiao28@ncsu.edu), Mohit Jain (mjain6@ncsu.edu)

References

  1. Expertiza on GitHub
  2. RSpec Documentation