CSC/ECE 517 Spring 2023 - E2300. Refactor E1858. Github metrics integration

About Expertiza

Expertiza is an open source project based on the Ruby on Rails framework. Expertiza allows the instructor to create new assignments and customize new or existing assignments. It also allows the instructor to create a list of topics the students can sign up for. Students can form teams in Expertiza to work on various projects and assignments. Students can also peer review other students' submissions. Expertiza supports submission across various document types, including URLs and wiki pages.

Introduction

Expertiza provides teammate reviews to gauge how much each team member contributed, but we would like to augment this data with data from external tools like GitHub (for example, number of commits, number of lines of code modified, number of lines added, and number of lines deleted) from each group’s submitted repo link. This information should prove useful for differentiating the performance of team members for grading purposes. Overall data for the team, like the number of committers and number of commits, may also help instructors predict which projects are likely to be merged.

Project Overview and Mission

  • The first step is to extract Github metadata of the submitted repos and pull requests.
  • The metadata should be stored in the local Expertiza DB. For each participant, the record should include at least:
  -Committer id
  -Total number of commits
  -Number of files changed
  -Lines of code changed
  -Lines of code added
  -Lines of code removed
  -Lines of code added that survived until final submission [if available from Github]
  • The code should sync the data with Github whenever someone (student or instructor) looks at a view that shows Github data.
  • The data for teams should be shown in the instructor’s View Scores window, in a new tab, probably between Reviews and Author Feedback:
  -Design a good view for showing data on individuals
  -This should be on the Teammate Reviews tab, right below the grid for teammate reviews. The reason for this is that we’d like to see all the data on an individual in a single view. For teams, by contrast, there is already a pretty large grid, and potentially multiple grids for multiple rounds, so adding a new grid is more likely to clutter the display.
  • Create a bar chart of the number of lines changed for each assignment team on the “view_submissions” page. The x-axis should show time, starting at the assignment creation time and ending at the last deadline of the assignment or the current time, whichever is earlier.
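
The x-axis rule in the last item can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the project's implementation: chart_window and lines_changed_per_day are hypothetical helper names, the commit hashes use made-up keys (:committed_at, :additions, :deletions), and "lines changed" is assumed here to mean additions plus deletions.

```ruby
# Hypothetical helper: returns [start, finish] for the chart's x-axis.
# The window ends at the last deadline or the current time, whichever is earlier.
def chart_window(created_at, last_deadline, now = Time.now)
  [created_at, [last_deadline, now].min]
end

# Hypothetical helper: sums lines changed per day for commits inside the window.
# Assumption: "lines changed" = additions + deletions.
def lines_changed_per_day(commits, window)
  start, finish = window
  totals = Hash.new(0) # days with no entry yet start at 0
  commits.each do |commit|
    at = commit[:committed_at]
    next unless at.between?(start, finish) # drop commits outside the window
    totals[at.strftime("%Y-%m-%d")] += commit[:additions] + commit[:deletions]
  end
  totals.sort.to_h # chronological order for plotting
end

# Made-up sample data, for illustration only.
created  = Time.utc(2023, 3, 1)
deadline = Time.utc(2023, 3, 10)
now      = Time.utc(2023, 3, 5)
commits  = [
  { committed_at: Time.utc(2023, 3, 2, 10), additions: 10,  deletions: 5 },
  { committed_at: Time.utc(2023, 3, 2, 15), additions: 1,   deletions: 0 },
  { committed_at: Time.utc(2023, 3, 4, 9),  additions: 3,   deletions: 3 },
  { committed_at: Time.utc(2023, 3, 7, 9),  additions: 100, deletions: 0 }
]

window = chart_window(created, deadline, now)
puts lines_changed_per_day(commits, window).inspect
```

In this sample the window ends at the current time because it precedes the deadline, so the commit dated after it is left out of the chart data.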



Current Implementation

  • Integrating the previous implementation into the Expertiza application produced errors that took a significant amount of time to resolve. Now that the integration is complete, we have shifted our focus back to the current requirements.
  • We added test cases and refactored metrics_controller.rb

Changes in metrics_controller.rb

  • Instead of controller: 'assignments', we used the list_submissions_assignment_path helper method to generate the correct path for redirect_to.
  • The unless statement in require_instructor_privileges is simplified to a one-liner.
  • Removed the only: [:action_allowed?] from require_instructor_privileges, since there is no action_allowed? method defined in the controller.
  • Removed the comments that were not providing any additional value.
  • Extracted some code to private methods to improve readability and organization.
  • Removed unnecessary instance variables and local variables.
  • Simplified the retrieve_github_data and create_github_metrics methods.
  • Removed unnecessary @total_* instance variables.
  • Removed the @check_statuses instance variable, which isn't used in this method.
  • The retrieve_github_data method has been refactored to use partition instead of select, simplifying the code.
  • In query_all_pull_requests, we've combined two lines of code to simplify the creation of the head_refs hash. We've also passed hyperlink_data to parse_pull_request_data instead of github_data to match the method's expected argument.
  • We've refactored extract_hyperlink_data to use slice and zip to create the hash, which is shorter and easier to read.
  • In pull_request_data, we replaced the while loop with a loop do construct and used break to exit the loop instead of modifying a flag variable. We also used the safe navigation operator (&.) to avoid NoMethodError exceptions when accessing nested hashes that may be nil.
  • In parse_pull_request_data, we used the safe navigation operator to avoid NoMethodError exceptions when accessing nested hashes that may be nil. We also added some nil checks to skip over commits or commit data that is missing or invalid. Finally, we simplified the date-slicing logic by chaining the to_s and slice methods.
  • retrieve_repository_data now calls parse_hyperlink to extract the owner and repository names from the GitHub repository URL. It also uses loop instead of while to simplify the code and avoid unnecessary variables.
  • query_and_parse_repository_data now returns the page info hash instead of modifying it in place, to make it clearer and less error-prone.
  • parse_repository_data now uses dig to access nested hashes safely, avoiding errors if any of the expected keys are missing. It also simplifies the date string extraction by using a slice instead of a regex.
  • Used the dig method to access nested keys in the hash, which returns nil if any intermediate key is nil instead of raising an exception.
  • Removed unnecessary comments and made the existing ones more concise.
  • Used a ternary operator to simplify the if statement.
  • Used dig instead of nested hash indexing with square brackets. This makes the code more concise and avoids NoMethodError exceptions when a hash key is missing.
  • Used return to exit early from count_github_authors_and_dates if the author is a collaborator. This avoids unnecessary code execution and improves performance.
  • Used Hash.new(0) to initialize the hash for each author's commits. This simplifies the code and avoids the need for if...else logic.
  • Used default to set the default value for each author's commits to 0. This avoids the need for if...else logic when incrementing the commit count.
  • Used to_h to convert the sorted array of commits back to a hash. This is more concise than building a new hash with Hash[].
  • Used dig to access nested hash keys in team_statistics. This avoids NoMethodError exceptions and makes the code more concise.
  • Used ternary operator to simplify
  • Used the find_by method instead of where and first for simplicity.
  • Used the safe navigation operator (&.) to avoid errors when trying to call a method on a nil object.
  • Used present? instead of nil? to check if a record exists.
  • Used update instead of assigning values and calling save separately for readability.
  • Used string interpolation to concatenate the @ncsu.edu suffix to the email prefix.
  • Removed unnecessary comments and code.
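
Several of the idioms listed above (partition, loop with break, dig, Hash.new(0), slice, and to_h) can be illustrated outside the controller. The following is a minimal sketch and not the code in metrics_controller.rb: the PAGES fixture, fetch_page, count_commits_by_author, and split_links are hypothetical names, and the response shape is an assumed stand-in for a paginated reply rather than real GitHub output.

```ruby
# Hypothetical fixture standing in for two pages of a paginated API reply.
# The shape is an assumption for illustration, not real GitHub output.
PAGES = {
  nil => {
    "data" => { "repository" => { "commits" => {
      "pageInfo" => { "hasNextPage" => true, "endCursor" => "c1" },
      "nodes" => [
        { "author" => { "login" => "alice" }, "committedDate" => "2023-03-01T10:00:00Z" },
        { "author" => nil, "committedDate" => "2023-03-01T11:00:00Z" }
      ]
    } } }
  },
  "c1" => {
    "data" => { "repository" => { "commits" => {
      "pageInfo" => { "hasNextPage" => false, "endCursor" => nil },
      "nodes" => [
        { "author" => { "login" => "bob" }, "committedDate" => "2023-03-02T09:00:00Z" },
        { "author" => { "login" => "alice" }, "committedDate" => nil }
      ]
    } } }
  }
}.freeze

# Hypothetical stand-in for the HTTP call: looks up a page by cursor.
def fetch_page(cursor)
  PAGES.fetch(cursor)
end

# partition splits the links into two arrays in a single pass.
def split_links(links)
  links.partition { |link| link.include?("/pull/") }
end

# Walks every page, counting commits per author and collecting commit dates.
def count_commits_by_author
  counts = Hash.new(0) # unseen authors start at 0, so no if/else is needed
  dates = []
  cursor = nil
  loop do # loop with break replaces a while loop driven by a flag variable
    commits = fetch_page(cursor).dig("data", "repository", "commits")
    break if commits.nil?
    commits.fetch("nodes", []).each do |node|
      login = node.dig("author", "login") # nil when the author is missing
      next if login.nil?
      counts[login] += 1
      dates << node["committedDate"].to_s.slice(0, 10) # keep YYYY-MM-DD only
    end
    break unless commits.dig("pageInfo", "hasNextPage")
    cursor = commits.dig("pageInfo", "endCursor")
  end
  # sort_by returns an array of pairs; to_h turns it back into a hash.
  sorted = counts.sort_by { |login, count| [-count, login] }.to_h
  [sorted, dates.reject(&:empty?).uniq.sort]
end

counts, dates = count_commits_by_author
puts counts.inspect
puts dates.inspect
```

The commit with a nil author is skipped by the nil check instead of raising NoMethodError, and the commit with a nil date contributes to the count but not to the date list.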

Use Github Metrics Functionality

This functionality gives the instructor the option to use Github metrics when creating an assignment.

Future Implementation

  • Add the ability to turn the GitHub metrics feature on and off.

Project Mentor

Ed Gehringer (efg@ncsu.edu)

Contributors to this project

Soham Bapat (sbapat2@ncsu.edu)
Srihitha Reddy Kaalam (skaalam@ncsu.edu)
Sukruthi Modem (smodem@ncsu.edu)

Pull Request Link

https://github.com/expertiza/expertiza/pull/2524