CSC/ECE 517 Fall 2025 - E2564. Review calibration
This page contains information about E2564. Review Calibration, a project in CSC/ECE 517, Fall 2025.
Please see below for a description of the design of the project.
Background
Calibration is a mechanism by which a review left by a student can be compared to the reviews left by instructors or course staff. The main idea behind calibration is that if a student's review of an assignment resembles the instructor's review, then the student is deemed to be a competent reviewer.
In Expertiza, calibration serves the following purposes:
- Ensuring the quality and consistency of peer reviews by providing a benchmark for the reviews.
- Understanding which student-reviewers are reliable.
- Promoting fairness in grading and enhancing the feedback process.
Problem Statement
Currently, Expertiza supports calibration, but the implementation can be improved, as it suffers from usability and reporting issues:
- In the calibration workflow, the current codebase lacks a clear backend service that does the following:
- Identifying which Response Maps are marked for calibration (for_calibration = true)
- Retrieving the instructors' calibrated review and all the students' reviews for a given submission (reviewed_object_id)
- Computing comparison metrics and sending them via JSON for the front end
- The calibration report is poorly structured, making it difficult to interpret and to draw meaningful insights from. The existing Calibration tab and associated flows do not make it easy to see how a student's review compares to an instructor's review.
Previous Implementation
The previous implementation of calibration worked as follows:
- Instructors can enable calibration for an assignment by impersonating other users (phantom participants) to submit the documents that will later be used for calibration reviews. This sets the field calibrate_to = true in the Response Maps.
- After these submissions are created, the instructor returns to their own account and navigates to the Calibration tab in the assignment view where they can "Begin" their calibrated review. The instructor can also "View" and "Edit" the reviews.
- Students then perform calibration reviews by accessing the instructor-provided submissions in the "Others' Work" section. After completing a review, a student can open "Show Calibration Results" to compare their own review against the instructor's review.
- Behind the scenes, each calibration review creates a ReviewResponseMap that links the student reviewer to the reviewee (calibration submission team). The instructor's reviews are stored as ReviewResponseMap records where the reviewer is the instructor. These mappings are distinguished from regular review mappings by setting the flag (calibrate_to = True). A Response object stores the actual review scores and comments.
- When a student views "Show Calibration Results", the system retrieves the student's Response for that calibration submission and the instructor's corresponding Response. It then compares the scores rubric item by rubric item and reports the distribution of how closely students' reviews align with the instructor's.
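For illustration, a minimal query sketch of how calibration mappings could be separated from regular review mappings, using the field names described above (this is an assumption-level sketch, not the exact Expertiza code):

```ruby
# Hypothetical sketch: "assignment" stands in for an Assignment record;
# calibrate_to is the legacy flag described above.
regular_maps     = ReviewResponseMap.where(reviewed_object_id: assignment.id, calibrate_to: false)
calibration_maps = ReviewResponseMap.where(reviewed_object_id: assignment.id, calibrate_to: true)
```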
Limitations of the previous implementation
- There are no dedicated API endpoints for delivering calibration review data or comparison results in JSON format.
- Calibration-related logic is intermixed with standard review functionality, reducing maintainability.
- The calibration report page is difficult to interpret and does not clearly show the comparison between student and instructor reviews.
Project Deliverables
Goals
The deliverables focus on creating robust controller methods to manage and process calibration data for both the instructor and the student interfaces.
1. Instructor Interface Functionality (Assignment Creation/Editing)
- The goal is to provide the instructor with the list of submissions that they need to review to create the calibration "answer key."
- Deliverable: Implement the controller method get_instructor_calibration_submissions (accessible via /calibration/assignments/{assignment_id}/submissions).
- Function: Retrieves a JSON list of all calibration submissions for a given assignment where the instructor is the reviewer.
- Data Included: The submitted hyperlinks, files, team name, reviewee ID, response map ID, and the current status of the instructor's review.
2. Student Calibration Report Generation
- The goal is to generate a detailed report for a specific student, comparing their reviews against the instructor's standard.
- Deliverable: Implement the controller method calibration_student_report (accessible via /calibration/calibration_student_report).
- Function: Compares the logged-in student's most recent calibration review for each assigned calibration submission against the corresponding instructor's "answer key" review.
- Data Included: A detailed breakdown for each question/rubric item, showing the student's score and comments, the instructor's score and comments, and the difference in scores along with its direction (exact, over, under).
- Summary Statistics: Calculates and reports overall statistics for the student's performance, including the Exact Matches count and percentage (match_rate_percentage), the Close Matches count (difference of 1 or 2 points), and the Average Absolute Difference across all questions.
3. Aggregate Class Report Generation
- The goal is to provide the instructor with an overall, class-wide summary of how students performed on a specific calibration submission.
- Deliverable: Implement the controller method calibration_aggregate_report (accessible via /calibration/assignments/:assignment_id/report/:reviewee_id).
- Function: Calculates aggregate statistics for all students who reviewed a specific calibration submission (identified by its reviewee_id).
- Summary Statistics: Reports overall class statistics for that submission, including the total number of student reviews analyzed and the Average Agreement Percentage (avg_agreement_percentage) across all questions.
- Question Breakdown: Provides a detailed breakdown for each question, including the instructor's (answer-key) score, the average student score, the percentage of students who gave an exact-match score (match_rate), and the distribution counts of student scores: exact, off_by_one, off_by_two, and off_by_three_plus points from the instructor's score.
4. Calibration Submission Summary (Metadata)
- The goal is to retrieve the necessary metadata about the submissions being reviewed in a calibration assignment.
- Deliverable: Implement the controller method summary (accessible via /calibration/assignments/:assignment_id/students/:student_participant_id/summary).
- Function: Fetches submission content and associated metadata for all calibration submissions reviewed by a given student.
- Data Included: Reviewee team ID, a list of reviewers (participants/team members who submitted the work), all submitted hyperlinks, and the for_calibration flag status.
5. Backend Utility/Helper Methods
- The project requires the creation of several robust private methods to centralize database interaction and complex calculations, ensuring code quality and reusability.
- Utility Methods: Functions to retrieve common data, such as get_instructor_participant_id, get_submitted_content, get_team_submitted_files, and get_instructor_review_for_reviewee.
- Comparison Logic: Implementation of core comparison methods, notably instructor_review_better? and calculate_aggregate_question_stats, to perform the score comparisons required for the student and aggregate reports.
Scope and Assumptions
Scope includes backend behavior for:
- CalibrationController#get_instructor_calibration_submissions, CalibrationController#summary, CalibrationController#calibration_aggregate_report, and CalibrationController#calibration_student_report
- Helper Methods in CalibrationController used for comparison/aggregation
Front-end UI design and implementation are out of scope; the responsibility of this project is solely to expose correct and well-structured JSON data that the front end can consume.
Workflow Visualization
The following image illustrates the calibration workflow; the calibration controller implements the functions that this workflow requires.
Implementation Details
Code Changes
Main Methods in CalibrationController (Public API Endpoints)
These four public methods form the core API for managing and reporting on calibration assignments, serving both instructor and student data needs.
1. get_instructor_calibration_submissions
This endpoint, accessed via: GET /calibration/assignments/{assignment_id}/submissions
is designed for the instructor interface. Its purpose is to retrieve the list of submissions for a specific assignment that the instructor must review to create the calibration "Answer Key."
Core Logic:
- Identifies the instructor's participant ID via get_instructor_participant_id.
- Queries the ResponseMap model for records where:
  - the reviewer is the instructor, and
  - for_calibration = true.
- For each map:
  - Retrieves the associated Team info.
  - Gets submitted hyperlinks/files using get_submitted_content.
JSON Response: Returns an object with the key calibration_submissions, an array of objects containing:
- team_name and reviewee_id
- response_map_id
- submitted_content (hyperlinks and files, including size and download metadata)
- review_status (not_started, in_progress, or completed)
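A minimal sketch of how this endpoint might be assembled from the helpers listed later in this section; the response keys follow the JSON in Output/Results, while the exact query conditions and model associations are assumptions:

```ruby
# Hedged sketch of get_instructor_calibration_submissions.
def get_instructor_calibration_submissions
  instructor_participant_id = get_instructor_participant_id(params[:assignment_id])

  maps = ResponseMap.where(reviewer_id: instructor_participant_id,
                           for_calibration: true)

  submissions = maps.map do |map|
    team = Team.find_by(id: map.reviewee_id) # calibration submission team
    {
      team_name: team&.name,
      reviewee_id: map.reviewee_id,
      response_map_id: map.id,
      submitted_content: get_submitted_content(map.reviewee_id),
      review_status: get_review_status(map.id) # not_started / in_progress / completed
    }
  end

  render json: { calibration_submissions: submissions }, status: :ok
end
```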
2. summary
Endpoint: GET /calibration/assignments/:assignment_id/students/:student_participant_id/summary
Provides metadata essential for displaying the calibration context to a student.
Core Logic:
- Queries ReviewResponseMap for all calibration reviews (where for_calibration = true) for the given student participant ID.
- Resolves the reviewee (Team object).
- Determines team members (reviewers).
- Retrieves submitted hyperlinks via get_submitted_content.
JSON Response: Returns an object containing the relevant IDs and the key submissions, an array where each object includes:
- reviewee_team_id
- an array of reviewers (each with participant_id and full_name)
- an array of hyperlinks
- the for_calibration flag
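A sketch of the summary flow under the same caveats; team.participants and user.full_name are assumed association and attribute names:

```ruby
# Hedged sketch of summary.
def summary
  student_id = params[:student_participant_id]
  maps = ReviewResponseMap.where(reviewer_id: student_id, for_calibration: true)

  submissions = maps.map do |map|
    team = Team.find_by(id: map.reviewee_id)
    {
      reviewee_team_id: map.reviewee_id,
      reviewers: team.participants.map do |p| # assumed association
        { participant_id: p.id, full_name: p.user.full_name }
      end,
      hyperlinks: get_submitted_content(map.reviewee_id)[:hyperlinks],
      for_calibration: true
    }
  end

  render json: { student_participant_id: student_id.to_i,
                 assignment_id: params[:assignment_id].to_i,
                 submissions: submissions }, status: :ok
end
```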
3. calibration_student_report
Endpoint: GET /calibration/calibration_student_report
This provides a detailed report comparing one student's calibration performance against the instructor's Answer Key.
Core Logic:
- Fetches all calibration ResponseMap records for the student.
- For each map:
  - Retrieves the student's latest Response.
  - Gets the instructor's answer-key response via get_instructor_review_for_reviewee.
  - Compares the two using instructor_review_better?, which compares scores/comments item by item.
JSON Response: Returns an object containing:
- student_participant_id
- assignment_id
- calibration_reviews (array). Each review includes:
  - reviewee_name and reviewee_id
  - a comparison object with:
    - a questions array (per-item comparison of scores/comments)
    - a stats object: exact_matches, close_matches (difference of 1 or 2), match_rate_percentage, average_difference
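A sketch of the student-report flow; instructor_review_better? is assumed to return the comparison hash (questions plus stats) shown in Output/Results:

```ruby
# Hedged sketch of calibration_student_report.
def calibration_student_report
  student_id = params[:student_participant_id]
  maps = ResponseMap.where(reviewer_id: student_id, for_calibration: true)

  reviews = maps.filter_map do |map|
    student_response    = Response.where(map_id: map.id).order(:created_at).last
    instructor_response = get_instructor_review_for_reviewee(params[:assignment_id],
                                                             map.reviewee_id)
    next if student_response.nil? || instructor_response.nil?

    {
      reviewee_name: Team.find_by(id: map.reviewee_id)&.name,
      reviewee_id: map.reviewee_id,
      comparison: instructor_review_better?(instructor_response, student_response)
    }
  end

  render json: { student_participant_id: student_id.to_i,
                 assignment_id: params[:assignment_id].to_i,
                 calibration_reviews: reviews }, status: :ok
end
```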
4. calibration_aggregate_report
Endpoint: GET /calibration/assignments/:assignment_id/report/:reviewee_id
Generates class-wide statistical data for all students reviewing a particular calibration submission.
Core Logic:
- Retrieves the instructor's answer-key Response and its Answer records.
- Retrieves all student ResponseMap records for the same reviewee (excluding the instructor).
- Collects all student responses and their Answer records.
- For each rubric item, calls calculate_aggregate_question_stats to compute:
  - exact matches
  - off-by-one
  - off-by-two
  - off-by-three-plus
JSON Response: Returns an object containing:
- reviewee_id and reviewee_name
- aggregate_stats, which includes:
  - total_reviews
  - avg_agreement_percentage
  - question_breakdown (per-question stats), each entry containing:
    - instructor_score
    - avg_student_score
    - match_rate
    - counts: exact, off_by_one, off_by_two, off_by_three_plus
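A sketch of the aggregate flow; it assumes Answer records reference their Response via response_id and their rubric item via item_id, which is a schema assumption:

```ruby
# Hedged sketch of calibration_aggregate_report.
def calibration_aggregate_report
  instructor_response = get_instructor_review_for_reviewee(params[:assignment_id],
                                                           params[:reviewee_id])
  if instructor_response.nil?
    return render json: { error: 'Instructor review not found. Cannot generate report.' },
                  status: :not_found
  end

  instructor_answers = Answer.where(response_id: instructor_response.id)

  # All student maps for the same reviewee, excluding the instructor's own map.
  student_maps = ResponseMap.where(reviewee_id: params[:reviewee_id], for_calibration: true)
                            .where.not(id: instructor_response.map_id)
  student_responses = student_maps.filter_map { |m| Response.where(map_id: m.id).last }

  breakdown = instructor_answers.map do |instructor_answer|
    student_answers = student_responses.filter_map do |r|
      Answer.find_by(response_id: r.id, item_id: instructor_answer.item_id)
    end
    calculate_aggregate_question_stats(instructor_answer, student_answers)
  end

  avg_agreement = breakdown.empty? ? 0.0 : (breakdown.sum { |q| q[:match_rate] } / breakdown.size).round(2)

  render json: { reviewee_id: params[:reviewee_id].to_i,
                 reviewee_name: Team.find_by(id: params[:reviewee_id])&.name,
                 assignment_id: params[:assignment_id].to_i,
                 aggregate_stats: { total_reviews: student_responses.size,
                                    avg_agreement_percentage: avg_agreement,
                                    question_breakdown: breakdown } }, status: :ok
end
```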
Helper Methods
| Private Method | Description | Used By |
|---|---|---|
| get_instructor_participant_id(assignment_id) | Locates the instructor's user_id from the Assignment and finds the corresponding AssignmentParticipant ID for use in ResponseMap lookups. | get_instructor_calibration_submissions, calibration_aggregate_report, get_instructor_review_for_reviewee. |
| get_submitted_content(team_id) | Fetches the team's latest submitted hyperlinks from SubmissionRecord and retrieves actual files from the local filesystem submission directory. | get_instructor_calibration_submissions, summary. |
| get_team_submitted_files(team, assignment) | Constructs the filesystem path using assignment.id and team.directory_num to list the actual submitted files, calculating their size and generating download URLs. | get_submitted_content. |
| format_file_size(bytes) | Converts a file size in bytes into a human-readable string (e.g., "1.5 MB"). | get_team_submitted_files. |
| build_hyperlinks_from_records(hyperlink_records) | Maps SubmissionRecord objects containing hyperlink content into a standardized array format for the JSON response. | get_submitted_content. |
| truncate_url(url) | Shortens a URL string to a maximum of 50 characters for clean display in the front end. | build_hyperlinks_from_records. |
| get_review_status(response_map_id) | Determines if a review has been started, is in progress, or is completed by looking at the latest Response object's is_submitted flag. | get_instructor_calibration_submissions. |
| get_instructor_review_for_reviewee(assignment_id, reviewee_id) | Locates the ResponseMap where the instructor reviewed the given reviewee_id and returns the most recent Response object (the Answer Key). | calibration_student_report, calibration_aggregate_report. |
| instructor_review_better?(instructor_review, student_review) | The core logic that iterates through Answer records to compare the student's score/comments against the instructor's, generating per-question differences and summary stats. | calibration_student_report. |
| get_question_text(item_id) | Retrieves the question text from the Item model based on the provided item_id, defaulting to a placeholder string on error. | instructor_review_better?, calculate_aggregate_question_stats. |
| student_review_higher?(student_score, instructor_score) | A simple utility to classify the score relationship as 'exact', 'over', or 'under'. | instructor_review_better?. |
| calculate_average_difference(comparisons) | Computes the mean of the absolute score differences across all questions in a comparison set. | instructor_review_better?. |
| calculate_aggregate_question_stats(instructor_answer, student_answers_list) | Calculates the statistical distribution for a single question across all student reviewers, counting exact matches, off-by-one, off-by-two, and off-by-three-plus differences. | calibration_aggregate_report. |
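To make the comparison logic concrete, here is a hedged sketch of three of these helpers. The signatures come from the table above; the bodies are assumptions consistent with the descriptions, including the assumption that the Answer model exposes its numeric score as answer:

```ruby
# Classifies the score relationship as 'exact', 'over', or 'under'.
def student_review_higher?(student_score, instructor_score)
  return 'exact' if student_score == instructor_score
  student_score > instructor_score ? 'over' : 'under'
end

# Mean of the absolute per-question score differences in a comparison set.
def calculate_average_difference(comparisons)
  return 0.0 if comparisons.empty?
  (comparisons.sum { |c| c[:difference].abs } / comparisons.size.to_f).round(2)
end

# Tallies the distribution of student scores around the instructor's score
# for a single rubric item.
def calculate_aggregate_question_stats(instructor_answer, student_answers_list)
  counts = { exact: 0, off_by_one: 0, off_by_two: 0, off_by_three_plus: 0 }
  student_answers_list.each do |ans|
    case (ans.answer - instructor_answer.answer).abs
    when 0 then counts[:exact] += 1
    when 1 then counts[:off_by_one] += 1
    when 2 then counts[:off_by_two] += 1
    else        counts[:off_by_three_plus] += 1
    end
  end
  total = student_answers_list.size
  {
    item_id: instructor_answer.item_id,
    question_text: get_question_text(instructor_answer.item_id),
    instructor_score: instructor_answer.answer,
    avg_student_score: total.zero? ? nil : (student_answers_list.sum(&:answer) / total.to_f).round(2),
    match_rate: total.zero? ? 0.0 : (counts[:exact] * 100.0 / total).round(2),
    counts: counts
  }
end
```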
Design Principles Followed
To keep the code maintainable and aligned with established design practices, we followed these principles:
1. Don't Repeat Yourself (DRY):
The use of private helper methods eliminates redundant code across the public endpoints, which is crucial since multiple endpoints access similar data structures.
- Reusable Data Retrieval: Methods like get_instructor_participant_id and get_instructor_review_for_reviewee are used by both student and aggregate report methods. Without these helpers, the complex queries to find the instructor's ID and their review map would be duplicated in at least two different public actions.
- Reusable Utility: Helper methods like get_submitted_content (which itself relies on get_team_submitted_files, etc.) encapsulate file path construction and parsing, preventing submission logic from being rewritten for get_instructor_calibration_submissions and summary.
2. Single Responsibility Principle (SRP): While the controller as a whole has the single responsibility of managing calibration data and generating the reports, the private methods adhere to SRP at a lower level, making them easier to test and maintain.
- Focused Helpers: Each helper method is highly specialized:
- format_file_size: Solely responsible for formatting byte size into a readable string.
- get_review_status: Solely responsible for deriving the string status (completed, not_started, etc.) from a ResponseMap.
- calculate_aggregate_question_stats: Solely responsible for tallying the distribution of scores for one question.
3. Code Readability and Maintainability: By applying the principles above, the code becomes easier to read, understand, and debug.
- Explanatory Method Names: The methods are named clearly (e.g., calibration_student_report, instructor_review_better?), making their intent obvious without needing to read their implementation details.
- Lean Public Actions: The public methods are short and focused on calling the correct sequence of helpers, making the overall flow of the API endpoint easy to follow.
These design choices lead to a robust and scalable controller that can easily integrate new reporting features by leveraging existing helper functions.
Output/Results
The following is the expected JSON output for each API call in calibration_controller.rb:
1. GET /calibration/assignments/:assignment_id/submissions
Method: get_instructor_calibration_submissions
{"calibration_submissions": [
{
"team_name": "Team Alpha",
"reviewee_id": 12,
"response_map_id": 44,
"submitted_content": {
"hyperlinks": [
{
"url": "https://github.com/team_alpha/project",
"display_text": "https://github.com/team_alpha/project",
"submitted_by": "student1",
"submitted_at": "2025-02-02T14:21:00Z"
}
],
"files": [
{
"filename": "report.pdf",
"size": 204800,
"size_human": "200 KB",
"type": "pdf",
"modified_at": "2025-02-02T15:00:00Z",
"download_url": "/submitted_content/download?team_id=12&filename=report.pdf"
}
]
},
"review_status": "in_progress"
},
{
"team_name": "Jane Doe",
"reviewee_id": 14,
"response_map_id": 47,
"submitted_content": {
"hyperlinks": [],
"files": []
},
"review_status": "not_started"
}
]}
2. GET /calibration/calibration_student_report?student_participant_id=X&assignment_id=Y
Method: calibration_student_report
{"student_participant_id": 33,
"assignment_id": 5,
"calibration_reviews": [
{
"reviewee_name": "Team Alpha",
"reviewee_id": 12,
"comparison": {
"questions": [
{
"item_id": 101,
"question_text": "Clarity of the explanation",
"instructor": {
"score": 4,
"comments": "Good explanation."
},
"student": {
"score": 3,
"comments": "Mostly clear."
},
"difference": 1,
"direction": "under"
},
{
"item_id": 102,
"question_text": "Technical correctness",
"instructor": {
"score": 5,
"comments": "Perfect."
},
"student": {
"score": 5,
"comments": "Looks correct."
},
"difference": 0,
"direction": "exact"
}
],
"stats": {
"total_questions": 2,
"exact_matches": 1,
"close_matches": 1,
"match_rate_percentage": 50.0,
"average_difference": 0.5
}
}
}
]}
3. GET /calibration/assignments/:assignment_id/students/:student_participant_id/summary
Method: summary
{"student_participant_id": 33,
"assignment_id": 5,
"submissions": [
{
"reviewee_team_id": 12,
"reviewers": [
{
"participant_id": 88,
"full_name": "John Smith"
},
{
"participant_id": 89,
"full_name": "Mary Jones"
}
],
"hyperlinks": [
{
"url": "https://example.com/demo",
"display_text": "https://example.com/demo",
"submitted_by": "John Smith",
"submitted_at": "2025-02-01T12:00:00Z"
}
],
"for_calibration": true
}
]}
4. GET /calibration/assignments/:assignment_id/report/:reviewee_id
Method: calibration_aggregate_report
{"reviewee_id": 12,
"reviewee_name": "Team Alpha",
"assignment_id": 5,
"aggregate_stats": {
"total_reviews": 18,
"avg_agreement_percentage": 62.5,
"question_breakdown": [
{
"item_id": 101,
"question_text": "Clarity of the explanation",
"instructor_score": 4,
"avg_student_score": 3.6,
"match_rate": 44.44,
"counts": {
"exact": 8,
"off_by_one": 6,
"off_by_two": 3,
"off_by_three_plus": 1
}
},
{
"item_id": 102,
"question_text": "Technical correctness",
"instructor_score": 5,
"avg_student_score": 4.7,
"match_rate": 72.22,
"counts": {
"exact": 13,
"off_by_one": 4,
"off_by_two": 1,
"off_by_three_plus": 0
}
}
]
}}
Test Plan
Objectives for testing:
- Verify that calibration data (submissions, instructor reviews, student reviews) is correctly retrieved for assignments where ReviewResponseMap.for_calibration = true.
- Verify that the comparison and aggregation logic correctly summarizes how student reviews align with the instructor’s review for each rubric item.
- Ensure the new calibration endpoints and helpers do not break existing review and calibration flows in Expertiza.
RSpec Testing
Added request specs with RSpec and RSwag that hit each Calibration API endpoint, seeding the database with the users, participants, teams, response maps, and responses each scenario needs. The specs live in reimplementation-back-end/spec/requests/api/v1/calibration_controller_spec.rb.
The tests are organized by API call, as follows:
- /calibration/assignments/:assignment_id/submissions – Verifies that the API call returns the calibration submissions when given a valid assignment_id.
- /calibration/calibration_student_report – Verifies that the API call outputs the report correctly when comparing a valid calibration review and user review.
- /calibration/calibration_student_report – Verifies that the API call correctly identifies when one of the reviews it is given is missing data needed to create the calibration report.
- /calibration/calibration_student_report – Verifies that the API call correctly identifies when there are no ResponseMap records, returning an empty array.
- /calibration/assignments/:assignment_id/students/:student_participant_id/summary – Verifies that the API call returns all reviews for a given valid assignment_id with its corresponding valid participant_id.
- /calibration/assignments/:assignment_id/students/:student_participant_id/summary – Verifies that the API call correctly identifies when one of the values, such as the participant_id, is invalid and relays the message "Assignment or student not found."
- /calibration/assignments/:assignment_id/students/:student_participant_id/summary – Verifies that the API call correctly handles the case where no hyperlinks/files are connected to a specific review, still outputting an empty array as specified.
- /calibration/assignments/:assignment_id/report/:reviewee_id – Verifies that the API call correctly generates the report when given a valid assignment_id and reviewee_id.
- /calibration/assignments/:assignment_id/report/:reviewee_id – Verifies that the API call correctly identifies when the instructor review is missing and outputs the error "Instructor review not found. Cannot generate report."
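As an example of the spec style, a hedged sketch of one of the error-path tests (factory names are assumptions; only the error message is taken from the behavior described above):

```ruby
# spec/requests/api/v1/calibration_controller_spec.rb (abridged sketch)
require 'rails_helper'

RSpec.describe 'Calibration API', type: :request do
  describe 'GET /calibration/assignments/:assignment_id/report/:reviewee_id' do
    it 'reports an error when the instructor review is missing' do
      assignment = create(:assignment)               # assumed FactoryBot factory
      team       = create(:team, assignment: assignment)

      get "/calibration/assignments/#{assignment.id}/report/#{team.id}"

      expect(JSON.parse(response.body)['error'])
        .to eq('Instructor review not found. Cannot generate report.')
    end
  end
end
```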
Future Scope
- In the current implementation, there is no dedicated frontend for visualizing calibration reports for students or instructors. A major future enhancement would be developing an intuitive UI that consumes the JSON returned by calibration_aggregate_report and calibration_student_report, presenting metrics, question-level breakdowns, and performance trends in a clear, interactive dashboard tailored separately for instructors and students.
- The existing CalibrationController performs multiple responsibilities: retrieving reviews, computing calibration statistics, assembling submission data, and returning final payloads. This makes the controller harder to test, reuse, and maintain. A future improvement would be restructuring the logic into dedicated service objects—such as CalibrationReportService, SubmissionContentService, and InstructorComparisonService, allowing the controller to act purely as a lightweight coordinator.
- The current calibration report includes basic metrics such as match rate, average score differences, and score distribution. There is substantial room for enhancing the analytical depth of the system. Future work could introduce additional insights like weighted scoring, variance and deviation metrics, rubric-specific reliability indicators, longitudinal trends of a student’s calibration accuracy across assignments, and even machine-learning-assisted anomaly detection for outlier reviews. These refinements would provide instructors with a much richer understanding of student grading behavior and calibration accuracy.
- During development, the SubmittedContentController could not be fully integrated because its existing behavior and data retrieval flow did not function correctly within the calibration pipeline, as the pull request was quite old and the schema has changed since then. As a result, calibration endpoints currently retrieve submitted content using stubs. A future enhancement would involve refactoring or rebuilding the SubmittedContentController so that it reliably returns standardized artifact data (hyperlinks, files, and other submission metadata).
Team Members
- Siddhi Khaire (skhaire@ncsu.edu)
- Spencer Kersey (sgkersey2@ncsu.edu)
- Surabhi Nair (snair3@ncsu.edu)
Mentor: Dr. Ed Gehringer (efg@ncsu.edu)