CSC/ECE 517 Fall 2021 - E2168. Testing - Reputations

Project Overview

Introduction

Online peer-review systems are now in common use in higher education. They free the instructor and course staff from having to provide personally all the feedback that students receive on their work. However, if we want to assure that all students receive competent feedback, or even use peer-assigned grades, we need a way to judge which peer reviewers are most credible. The solution is to use a reputation system.
The reputation system is meant to provide an objective measure of the quality of student-assigned peer-review scores. Students select from a list of tasks to be performed, prepare their work, and submit it to a peer-review system. The work is then reviewed by other students, who offer comments and graded feedback to help the submitters improve their work. During the peer-review period it is important to determine which reviews are more accurate and of higher quality. Reputation is one way to achieve this goal; it is a quantitative measure of how reliable each peer reviewer is. Peer reviewers use Expertiza to score an author. If Expertiza shows a confidence rating for each grade based on the reviewer's reputation, authors can more easily judge the legitimacy of a peer-assigned score. In addition, the teaching staff can examine the quality of each peer review based on reputation values and, potentially, crowd-source a significant portion of the grading function. Currently the reputation system is implemented in Expertiza through a web service, but there are no tests written for it. Thus our goal is to write tests that verify Hamer's and Lauw's algorithms in the reputation system.

System Design

The following description is adapted from project E1625 and gives an overall picture of the reputation system.

There are two algorithms intended for use in calculation of the reputation values for participants.

There is a web service (its link is accessible only to instructors) that serves a JSON response containing the reputation value, seeded with the last known reputation value stored in the participants table. An instructor can specify which algorithm to use for a particular assignment to calculate the confidence rating.

As the paper on the reputation system observes, “the Hamer-peer algorithm has the lowest maximum absolute bias and the Lauw-peer algorithm has the lowest overall bias. This indicates, from the instructor’s perspective, if there are further assignments of this kind, expert grading may not be necessary.”

The reputation ranges for Hamer’s algorithm are:
  • red: value < 0.5
  • yellow: 0.5 <= value <= 1
  • orange: 1 < value <= 1.5
  • light green: 1.5 < value <= 2
  • green: value > 2


The main difference between the Hamer-peer and the Lauw-peer algorithm is that the Lauw-peer algorithm keeps track of the reviewer's leniency (“bias”), which can be either positive or negative. A positive leniency indicates the reviewer tends to give higher scores than average. This project determines reputation by subtracting the absolute value of the leniency from 1. Additionally, the range for Hamer’s algorithm is (0,∞) while for Lauw’s algorithm it is [0,1].

The reputation ranges for Lauw’s algorithm are:
  • red: value < 0.2
  • yellow: 0.2 <= value <= 0.4
  • orange: 0.4 < value <= 0.6
  • light green: 0.6 < value <= 0.8
  • green: value > 0.8

The instructor can choose to show results from Hamer’s algorithm or Lauw’s algorithm. The default algorithm should be Lauw’s algorithm.
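To make the two range tables concrete, here is a small sketch (our own illustration, not Expertiza code; the method name reputation_color is hypothetical) that maps a raw reputation value to its color band and shows the reputation = 1 - |leniency| relationship described above:

      # Hypothetical helper, for illustration only: maps a reputation value to the
      # color band listed above for the chosen algorithm.
      def reputation_color(value, algorithm)
        case algorithm
        when 'hamer' # Hamer's values range over (0, infinity)
          if value < 0.5 then 'red'
          elsif value <= 1.0 then 'yellow'
          elsif value <= 1.5 then 'orange'
          elsif value <= 2.0 then 'light green'
          else 'green'
          end
        when 'lauw'  # Lauw's values range over [0, 1]
          if value < 0.2 then 'red'
          elsif value <= 0.4 then 'yellow'
          elsif value <= 0.6 then 'orange'
          elsif value <= 0.8 then 'light green'
          else 'green'
          end
        end
      end

      # Example: a Lauw leniency of +0.3 gives reputation 1 - |0.3| = 0.7,
      # which falls in the light-green band.
      reputation_color(1 - 0.3.abs, 'lauw') #=> "light green"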

Objectives

Our objectives for this project are the following:

  • Double and stub an assignment, along with a few submissions to it, under different review rubrics
  • Manually calculate reputation scores based on the paper "Pluggable reputation systems for peer review: A web-service approach"
  • Validate that the reputation scores produced by the reputation code match our manual calculations, across the reputation ranges of Hamer's and Lauw's algorithms, for different review rubrics, and with or without instructor-score impact

Files Involved

  • reputation_web_service_controller_spec.rb

Test Plan

Testing Objects

To test the reputation system, it is crucial to create sample reviews from which reputation scores can be computed. During the kickoff meeting, our team defined four setup steps to follow for testing; the objects to be created in each step are described below.

1. Setup Assignments

- Essential Parameters to be configured
submitter_count = 0;
num_reviews = 3;
rounds_of_reviews = 2;
reputation_algorithm = lauw/hamer;
Note: Two assignment objects were created. assignment_1 uses Lauw's algorithm, whereas assignment_2 uses Hamer's algorithm.

      @assignment_1 = create(:assignment, created_at: DateTime.now.in_time_zone - 13.day, submitter_count: 0, num_reviews: 3, num_reviewers: 5, num_reviews_allowed: 5, rounds_of_reviews: 2, reputation_algorithm: 'lauw', id: 1)
      @assignment_2 = create(:assignment, created_at: DateTime.now.in_time_zone - 13.day, submitter_count: 0, num_reviews: 3, num_reviewers: 5, num_reviews_allowed: 5, rounds_of_reviews: 2, reputation_algorithm: 'hamer', id: 2)

2. Setup Questionnaires (Rubrics)

- Essential Parameters to be configured
instructor_id = from_fixture;
min_question_score = 0;
max_question_score = 5;
type = ReviewQuestionnaire;
Note: We will define the assignments with a ReviewQuestionnaire-type rubric.

      @questionnaire_1 = create(:questionnaire, min_question_score: 0, max_question_score: 5, type: 'ReviewQuestionnaire', id: 1)
      @assignment_questionnaire_1_1 = create(:assignment_questionnaire, assignment_id: @assignment_1.id, questionnaire_id: @questionnaire_1.id, used_in_round: 1)
      @assignment_questionnaire_1_2 = create(:assignment_questionnaire, assignment_id: @assignment_1.id, questionnaire_id: @questionnaire_1.id, used_in_round: 2)
      @assignment_questionnaire_2_1 = create(:assignment_questionnaire, assignment_id: @assignment_2.id, questionnaire_id: @questionnaire_1.id, used_in_round: 1)
      @assignment_questionnaire_2_2 = create(:assignment_questionnaire, assignment_id: @assignment_2.id, questionnaire_id: @questionnaire_1.id, used_in_round: 2, id: 4)

3. Setup Questions under Questionnaires

      @question_1_1 = create(:question, questionnaire_id: @questionnaire_1.id, id: 1)
      @question_1_2 = create(:question, questionnaire_id: @questionnaire_1.id, id: 2)
      @question_1_3 = create(:question, questionnaire_id: @questionnaire_1.id, id: 3)
      @question_1_4 = create(:question, questionnaire_id: @questionnaire_1.id, id: 4)
      @question_1_5 = create(:question, questionnaire_id: @questionnaire_1.id, id: 5)


4. Setup Response Maps

- Essential Parameters to be configured
reviewed_object_id = assignment_id;
reviewer_id = participants;
reviewee_id = AssignmentTeam;
Note: We will set up response maps to define the relationship between each reviewer and reviewee of an assignment.

Participant:

      @reviewer_1 = create(:participant, can_review: 1)
      @reviewer_2 = create(:participant, can_review: 1)
      @reviewer_3 = create(:participant, can_review: 1)

Teams:

      @reviewee_1 = create(:assignment_team, assignment: @assignment_1)
      @reviewee_2 = create(:assignment_team, assignment: @assignment_1)
      @reviewee_3 = create(:assignment_team, assignment: @assignment_1)

Response_maps:

      @response_map_1_1 = create(:review_response_map, reviewer_id: @reviewer_1.id, reviewee_id: @reviewee_1.id)
      @response_map_1_2 = create(:review_response_map, reviewer_id: @reviewer_2.id, reviewee_id: @reviewee_1.id)
      @response_map_1_3 = create(:review_response_map, reviewer_id: @reviewer_3.id, reviewee_id: @reviewee_1.id)

      @response_map_2_1 = create(:review_response_map, reviewer_id: @reviewer_1.id, reviewee_id: @reviewee_2.id)
      @response_map_2_2 = create(:review_response_map, reviewer_id: @reviewer_2.id, reviewee_id: @reviewee_2.id)
      @response_map_2_3 = create(:review_response_map, reviewer_id: @reviewer_3.id, reviewee_id: @reviewee_2.id)

      @response_map_3_1 = create(:review_response_map, reviewer_id: @reviewer_1.id, reviewee_id: @reviewee_3.id)
      @response_map_3_2 = create(:review_response_map, reviewer_id: @reviewer_2.id, reviewee_id: @reviewee_3.id)
      @response_map_3_3 = create(:review_response_map, reviewer_id: @reviewer_3.id, reviewee_id: @reviewee_3.id)

Response:

      @response_1_1 = create(:response, is_submitted: true, map_id: @response_map_1_1.id)
      @response_1_2 = create(:response, is_submitted: true, map_id: @response_map_1_2.id)
      @response_1_3 = create(:response, is_submitted: true, map_id: @response_map_1_3.id)

      @response_2_1 = create(:response, is_submitted: true, map_id: @response_map_2_1.id)
      @response_2_2 = create(:response, is_submitted: true, map_id: @response_map_2_2.id)
      @response_2_3 = create(:response, is_submitted: true, map_id: @response_map_2_3.id)

      @response_3_1 = create(:response, is_submitted: true, map_id: @response_map_3_1.id)
      @response_3_2 = create(:response, is_submitted: true, map_id: @response_map_3_2.id)
      @response_3_3 = create(:response, is_submitted: true, map_id: @response_map_3_3.id)
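The nine response maps and responses above form a complete 3 x 3 reviewer-by-reviewee grid. As a design note, an equivalent loop-based setup (a sketch only; the explicit factory calls above are what the spec actually uses) would be:

      # Sketch only: build the same 3 x 3 grid of response maps and submitted
      # responses with loops instead of nine explicit factory calls.
      reviewers = [@reviewer_1, @reviewer_2, @reviewer_3]
      reviewees = [@reviewee_1, @reviewee_2, @reviewee_3]
      reviewees.each do |reviewee|
        reviewers.each do |reviewer|
          map = create(:review_response_map, reviewer_id: reviewer.id, reviewee_id: reviewee.id)
          create(:response, is_submitted: true, map_id: map.id)
        end
      end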

Each of these objects contributes to the reputation tests that follow. Some fields in each object can be empty or take default values, and some attributes are not relevant to the tests. When implementing a test, the test script needs to generate or set fixed values for the corresponding fields.

Relevant Methods

reputation_web_service_controller_spec.rb

  • db_query

db_query is the standard database query method; calling it returns peer-review grades for a given assignment id. We test this method in two respects: (1) whether the grades returned are correct, and (2) whether the query itself is correct.

    context 'test db_query' do
      it 'return average score' do
        create(:answer, question_id: @question_1_1.id, response_id: @response_1_1.id, answer: 1)
        create(:answer, question_id: @question_1_2.id, response_id: @response_1_1.id, answer: 2)
        create(:answer, question_id: @question_1_3.id, response_id: @response_1_1.id, answer: 3)
        create(:answer, question_id: @question_1_4.id, response_id: @response_1_1.id, answer: 4)
        create(:answer, question_id: @question_1_5.id, response_id: @response_1_1.id, answer: 5)
        result = ReputationWebServiceController.new.db_query(1, 1, false)
        expect(result).to eq([[2, 1, 60.0]])
      end
    end
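The expected row [[2, 1, 60.0]] can be checked by hand: the five answers 1 through 5 sum to 15 out of a maximum of 5 questions x 5 points = 25, i.e. 60.0%. The first two entries correspond to the reviewer and the reviewed submission (compare the json_generator expectations below, where 2 becomes "stu2" and 1 becomes "submission1"). A quick sanity check of the arithmetic (our own, mirroring the factory data above):

      # Hand check of the expected percentage for answers [1, 2, 3, 4, 5]
      answers = [1, 2, 3, 4, 5]
      max_question_score = 5
      answers.sum * 100.0 / (answers.length * max_question_score) #=> 60.0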


  • json_generator

json_generator builds a hash representation of the reviews. We test it by calling the method and comparing the resulting hash (which the controller later converts to JSON) against the structure we expect.

    context 'test json_generator' do
      it 'test 3 reviewer for one reviewee' do
        # reviewer_1's review for reviewee_1: [5, 5, 5, 5, 5]
        create(:answer, question_id: @question_1_1.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_2.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_3.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_4.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_5.id, response_id: @response_1_1.id, answer: 5)

        # reviewer_2's review for reviewee_1: [3, 3, 3, 3, 3]
        create(:answer, question_id: @question_1_1.id, response_id: @response_1_2.id, answer: 3)
        create(:answer, question_id: @question_1_2.id, response_id: @response_1_2.id, answer: 3)
        create(:answer, question_id: @question_1_3.id, response_id: @response_1_2.id, answer: 3)
        create(:answer, question_id: @question_1_4.id, response_id: @response_1_2.id, answer: 3)
        create(:answer, question_id: @question_1_5.id, response_id: @response_1_2.id, answer: 3)

        # reviewer_3's review for reviewee_1: [1, 1, 1, 1, 1]
        create(:answer, question_id: @question_1_1.id, response_id: @response_1_3.id, answer: 1)
        create(:answer, question_id: @question_1_2.id, response_id: @response_1_3.id, answer: 1)
        create(:answer, question_id: @question_1_3.id, response_id: @response_1_3.id, answer: 1)
        create(:answer, question_id: @question_1_4.id, response_id: @response_1_3.id, answer: 1)
        create(:answer, question_id: @question_1_5.id, response_id: @response_1_3.id, answer: 1)

        #result = ReputationWebServiceController.new.db_query(1, 1, false)
        #expect(result).to eq([[2, 1, 100.0], [3, 1, 60.0], [4, 1, 20.0]])
        result = ReputationWebServiceController.new.json_generator(1, 0, 1)
        expect(result).to eq({"submission1"=>{"stu2"=>100.0, "stu3"=>60.0, "stu4"=>20.0}})
        #repeat for different answers
      end

      it 'test same reviewer for different reviewee' do
        # reviewer_1's review for reviewee_1: [5, 5, 5, 5, 5]
        create(:answer, question_id: @question_1_1.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_2.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_3.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_4.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_5.id, response_id: @response_1_1.id, answer: 5)

        # reviewer_1's review for reviewee_2: [3, 3, 3, 3, 3]
        create(:answer, question_id: @question_1_1.id, response_id: @response_2_1.id, answer: 3)
        create(:answer, question_id: @question_1_2.id, response_id: @response_2_1.id, answer: 3)
        create(:answer, question_id: @question_1_3.id, response_id: @response_2_1.id, answer: 3)
        create(:answer, question_id: @question_1_4.id, response_id: @response_2_1.id, answer: 3)
        create(:answer, question_id: @question_1_5.id, response_id: @response_2_1.id, answer: 3)

        result = ReputationWebServiceController.new.json_generator(1, 0, 1)
        expect(result).to eq("submission1"=>{"stu2"=>100.0}, "submission2"=>{"stu2"=>60.0})
        #repeat for different answers
      end

    end
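The expected hash values follow the same arithmetic as the db_query check: five answers of 5 give 25/25 = 100.0, five answers of 3 give 15/25 = 60.0, and five answers of 1 give 5/25 = 20.0. The keys come from the reviewee team ("submission1", "submission2") and the reviewers ("stu2", "stu3", "stu4"). A quick check of the percentages (our own arithmetic):

      # Hand check of the expected percentages for all-5, all-3, and all-1 reviews
      [[5] * 5, [3] * 5, [1] * 5].map { |answers| answers.sum * 100.0 / 25 } #=> [100.0, 60.0, 20.0]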


  • send_post_request

send_post_request sends a POST request to peerlogic.csc.ncsu.edu/reputation/calculations/reputation_algorithms to calculate reputation values, shows the result in the corresponding UI, and updates the given reviewers' reputations. We would test this method against the algorithms in the paper: first by checking the computed reputation values, and second by checking the values updated in the database.

    context 'test send_post_request' do
      it 'failed because of no public key file' do
        # reviewer_1's review for reviewee_1: [5, 5, 5, 5, 5]
        create(:answer, question_id: @question_1_1.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_2.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_3.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_4.id, response_id: @response_1_1.id, answer: 5)
        create(:answer, question_id: @question_1_5.id, response_id: @response_1_1.id, answer: 5)

        params = {assignment_id: 1, round_num: 1, algorithm: 'hamer', checkbox: {expert_grade: "empty"}}
        session = {user: build(:instructor, id: 1)}

        expect(true).to eq(true)

        # comment out because send_post_request method request public key file while this file is missing
        # so at this time send_post_request is not functioning normally
        # get :send_post_request, params, session
        # expect(response).to redirect_to '/reputation_web_service/client'
      end
    end
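Since the public/private key pair was not available, the actual HTTP call could not be exercised. If this spec is extended later, one option is to stub the external endpoint; the sketch below is our own suggestion (it assumes the webmock gem is added to the test group, and the stubbed response body is only a placeholder, not the real service's format):

      # Sketch only: stub the peerlogic endpoint so send_post_request can be driven
      # without real keys or network access. Requires `gem 'webmock'` in the test group.
      require 'webmock/rspec'

      stub_request(:post, %r{peerlogic\.csc\.ncsu\.edu/reputation/calculations/reputation_algorithms})
        .to_return(status: 200,
                   body: { 'submission1' => { 'stu2' => 0.9 } }.to_json, # placeholder body
                   headers: { 'Content-Type' => 'application/json' })

      # With the stub in place, the commented-out request above could be re-enabled:
      # get :send_post_request, params, session
      # expect(response).to redirect_to '/reputation_web_service/client'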


Results

Future Tasks

As our team was not able to obtain the public/private key pair needed to access the reputation web service, the test for send_post_request could not exercise the actual request and remains a future task. More broadly, reputation_web_service_controller.rb is a fairly complex file. It contains a lot of methods that are long and hard to understand (e.g. send_post_request). These methods need to be broken down into simpler, more specific methods that are easier to read and understand. Also, the few instances of code duplication that exist should be removed.

  1. There is a lot of unused/commented code, which should be removed.
  2. Figure out what the code is doing and write appropriate comments for it.
  3. In the case of db_query, the name should say what it queries for. Also, this method not only queries, but calculates sums. Since each method should do only one thing, the code for calculating sums should be in another method. And there should be comments in the code!
  4. json_generator should be generate_json. There needs to be a method comment saying what the parameters are. (A possible rename and a possible split of db_query are sketched after this list.)
  5. In send_post_request, there are references to specific assignments, such as 724, 735, and 756. They were put in to gather data for a paper published in 2015. They are no longer relevant and should be removed. send_post_request is 91 lines long, far too long.
  6. There is a password for a private key in the code (and the code is open-sourced!). It should be stored in the database instead.
  7. Fix spelling of “dimention”
  8. client is a bad method name; why is stuff being copied from class variables to instance variables?
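As one possible direction for items 3 and 4 above, the sketch below shows how the querying and summing responsibilities of db_query could be separated and how json_generator could be renamed. The method names and data shapes are our own suggestions, not existing Expertiza code; the example uses plain Ruby hashes so it stays self-contained.

      # Sketch of a possible split of db_query's responsibilities (names are suggestions only).

      # 1. A narrowly named step that only selects usable score rows
      #    (here it takes pre-fetched rows so the example runs standalone).
      def peer_review_score_rows(rows)
        rows.select { |row| row[:score] } # drop rows with no recorded score
      end

      # 2. A separate step that aggregates rows into per-reviewer score sums.
      def sum_scores_by_reviewer(rows)
        rows.group_by { |row| row[:reviewer_id] }
            .transform_values { |grouped| grouped.sum { |row| row[:score] } }
      end

      rows = [
        { reviewer_id: 2, score: 60.0 },
        { reviewer_id: 2, score: 40.0 },
        { reviewer_id: 3, score: 20.0 }
      ]
      sum_scores_by_reviewer(peer_review_score_rows(rows)) #=> {2=>100.0, 3=>20.0}

      # 3. json_generator renamed to generate_json, with a comment documenting its
      #    parameters (the specs above call it as json_generator(1, 0, 1); the exact
      #    meaning of each argument should be documented in the method comment).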


Collaborators

Jinku Cui (jcui23)

Henry Chen (hchen34)

Dong Li (dli35)

Zijun Lu (zlu5)

References

  1. Expertiza on GitHub
  2. The live Expertiza website
  3. Pluggable reputation systems for peer review: A web-service approach