Design Document for E1632. Collusion detection
Introduction to Expertiza
Expertiza is an open source project developed on Ruby on Rails framework. It is primarily developed as a platform for instructors to create, assign and grade assignments and quizzes and for the students to view, submit assignments and take quizzes. Other activities involved in a course can also be done via Expertiza, like creating teams, reviewing assignments among others.
Problem Statement
Collusion detection is a feature that Expertiza needs. Collusion detection is supposed to detect when peer reviewers “return favors” to each other by giving each other higher scores than they deserve. There are 2 types of collusion that instructors may worry about: small-circle collusion and pervasive collusion. Small-circle collusion refers to the practice that a few teams/authors reach an agreement that they give each other high scores if they get chances to review each other’s work. The number of the teams/authors could be any number. Pervasive collusion refers to the practice that giving each other high scores becomes a trend in some assignments. This practice has been observed by Dr. Luca de Alfaro in his paper:
... Once a nucleus of students starts to assign top grades to all the submissions they review, other initially honest students see what is happening, and join the colluders, both to save work in reviewing, and to avoid being penalized in the review precision as the only honest students who disagree with their colluding peers.
Issues Identified
- The code in collusion_cycle.rb is not called anywhere. Even it were called, the code would not work (due to some nasty refactoring done previously). The algorithm was badly designed - creating separate methods for detecting x-node cycles.
Tasks Identified
The following tasks need to be accomplished at the end of this project:
- Remove all the methods in this model. Redesign and recode the whole model from scratch.
- In the new model we will create a new collusion cycle detection method which will take x as an input. Given the parameter x, the algorithm then can detect any collusion cycles consisting of x nodes.
- We will use spread and the ratio spread/(# of peer reviews done) to measure the pervasive collusion in one assignment.
Here spread is the peer-review score difference a reviewer (max peer review score given - min peer review score given) A higher spread is better, because it indicates that the reviewer is discriminating between good and bad work. [Dr. Gehringer’s survey, 2014] Spread/(# of peer-review done) should work better than spread alone if the # of review done varies between reviewers.
Implementation
- We will consider each student in an assignment a node and each peer-review record (namely, user A reviews user B, which is recorded in review_mappings table) as a directed edge. So finding small-circle collusion becomes a task of finding cycles within x nodes in a directed graph.
- This algorithm can use depth-first search.
- The idea of having a threshold :
If A gives B a low peer-review score (less than min_collusion_peer_review_score), we prune this A->B edge.
- We will create a report page (small collusion cycles report) under “view review report” icon for instructors. The instructor can input x and run this algorithm to detect small collusion cycles. The instructor can also set the min_collusion_peer_review_score on this report page. The default of min_collusion_peer_review_score can be 95%.
- We might also need a similar report page (pervasive collusion report) for pervasive collusion.
The Use Case diagram for such design is:
Proposed Code
We are planning to implement the modified DFS for finding a cycle of size "n". The basic algorithm for this looks like:
DFS(G) { for each u in V{ color[u]<-White pred[u]<-NULL } time = 0 for each u in V{ if(color[u]==White){ DFSVisit(u) } } } DFSVisit(u){ color[u]<-Grey d[u]=++time for each v in adj[u]{ if(color[v]==White){ pred[v]<-u DFSVisit(v) }else{ //we have a cycle between current node u and already visited node v cycle_list<-(u,v) //use later to find cycle length } } color[u]<-black f[u]= ++time } legend: color:: it stores the information whether the node has been visited so far or not, pred:: stores the parent information of the given node adj:: adjacency list d[u] :: gives you the discovery time f[u]:: gives you the finish time, i.e all the adjacent nodes of u have been visited
This algorithm has to be implemented in "collusion_cycle.rb", and the implementation will roughly look like:
def cycle_detection (graph, n) white_set = set.new gray_set = set.new black_set = set.new vertex = vertex (graph) # get all the vertex from graph vertex.each do |current| dfs_within_n (current, white_set, gray_set, black_set, [], n) end end def dfs_within_n (current, white_set, gray_set, black_set, recent_n_vertex, n) move_vertex (current, white_set, gray_set) neighbours = neighbours(current) neighbors.each do |neighbor| if black_set.contains(neighbor) continue end if gray_set.contains(neighbor) # a cycle detected if recent_n_vertex.contains(current) # a cycle within n vertex detected output_loop (recent_n_vertex, current) end end if recent_n_vertex.length==n recent_n_vertex.remove[0] end recent_n_vertex.add(neighbor) dfs_within_n (current, white_set, gray_set, black_set, recent_n_vertex, n) end move_vertex (current, gray_set, black_set) end def move_vertex(current, source_set, destination_set) source_set.remove(current) destination_set.add(current) end
Test Cases
The test case will be written as "rspec"
- We will need a test case, which returns the cycles detected in the graph, and confirms that the size of the cycles is no less or no more than the user specified size "n".
- We will need to check the performance of this algorithm. In some cases when the graph becomes too large the algorithm should still perform in acceptable time frame.
Visual Representation
This is a soft commit for the project. We are planning to have a visually asthetic way of representing the collusion detected. For this purpose we have identified a few ways:
- Visulization tool called viz.js
- Cytoscape.js
- D3.js