CSC/ECE 517 Spring 2019 - Project E1925. Refactor E1858. Github metrics integration

Introduction

Problem Statement

Expertiza provides teammate reviews to gauge how much each team member contributed. This information could also be derived from data from external tools like GitHub (for example, number of commits, number of lines of code modified, number of lines added, and number of lines deleted), taken from each group’s submitted repo link. Currently, Expertiza provides Teammate Reviews under the View Scores functionality for each assignment.

1. Teammate Reviews functionality in the View Scores page gauges teammate views on how much other team members contributed to the project. We need to augment this data with data from external tools like GitHub in order to validate that feedback. New metrics will be appended under each student data under the same functionality.
2. GitHub Metrics under the View Submissions page should include a bar chart that shows the number of commits by the team throughout the assignment timeline. This will help instructors get an overview of each team and aid the grading process.

While this data will not have marks associated with it directly, it will help the instructor differentiate the performance of team members and award marks according to contribution. Overall data for the team, like the number of committers and number of commits, may also help instructors predict which projects are likely to be merged.

This task was partially implemented by another group as project E1858, Github Metrics Integration, in the Fall 2018 semester. However, their work was rejected with the feedback "They have integrated the github metrics into expertiza to show the number of commits, pull requests status, etc against every project. They have also integrated it into the metrics. Looks like they covered the edge cases. The code looks good but needs comments as it is pretty complex. The documentation feels like it is flooded with code, if there was a description of the changes, it would have been better. Extensive tests, but it might be good to see if additions to existing tests really belong in those same tests". The goal of our current project is to resolve the issues in their work, refactor the code they created, and modify it to follow the "DRY" principle. The ultimate goal is to have the GitHub metrics integration working in Expertiza.

Project Task

Extract Github metadata of the submitted repos and pull requests.

  1. The metadata should be stored in the local Expertiza DB (we have a sample design here, which you may use or modify). For each participant, the record should include at least:
    • Committer id
    • Total number of commits
    • Number of files changed
    • Lines of code changed
    • Lines of code added
    • Lines of code removed
    • Lines of code added that survived until final submission [if available from Github]
  2. The code should sync the data with Github whenever someone (student or instructor) looks at a view that shows Github data.
  3. The data for teams should be shown in the instructor’s View Scores window, in a new tab, probably between Reviews and Author Feedback.
    • Design a good view for showing data on individuals. Please discuss this with the project mentor(s).
    • It seems to me that this should be on the Teammate Reviews tab, right below the grid for teammate reviews. The reason for this is that we’d like to see all the data on an individual in a single view. For teams, by contrast, there is already a pretty large grid, and potentially multiple grids for multiple rounds, so adding a new grid is more likely to clutter the display.
  4. Create a bar chart for the # of lines changed for each assignment team on “view_submissions” page. The x-axis should be the time starting from the assignment creation time, till the last deadline of the assignment, or current time, whichever is earlier.
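The x-axis rule in item 4 can be sketched as a small helper. This is only an illustration: the name chart_window and its parameters are hypothetical and are not part of the Expertiza code base, and the current time is passed in so that the behavior is easy to check.

```ruby
# Hypothetical helper (not from Expertiza): returns [start, finish] for the
# chart's x-axis.
#   created_at - Time the assignment was created
#   deadlines  - Array of Time objects, one per assignment deadline
#   now        - current time, injectable for testing
def chart_window(created_at, deadlines, now = Time.now)
  last_deadline = deadlines.max
  # End at the last deadline or the current time, whichever is earlier.
  # With no deadlines at all, fall back to the current time.
  finish = last_deadline.nil? ? now : [last_deadline, now].min
  [created_at, finish]
end

created   = Time.utc(2019, 3, 1)
deadlines = [Time.utc(2019, 3, 20), Time.utc(2019, 4, 10)]

# Viewed before the last deadline: the window ends at the current time.
p chart_window(created, deadlines, Time.utc(2019, 3, 25))
# Viewed after the last deadline: the window ends at the last deadline.
p chart_window(created, deadlines, Time.utc(2019, 5, 1))
```

Capping the window at the current time keeps the chart from showing an empty stretch of future dates while the assignment is still in progress.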

Project E1858 Issues

The detailed implementation document can be found on the E1858 wiki page.

Project E1858 was rejected in the Fall 2018 semester. Here is the feedback from the Expertiza management group:

  • "They have integrated the github metrics into expertiza to show the number of commits, pull requests status, etc against every project. They have also integrated it into the metrics. Looks like they covered the edge cases. The code looks good but needs comments as it is pretty complex. The documentation feels like it is flooded with code, if there was a description of the changes, it would have been better. Extensive tests, but it might be good to see if additions to existing tests really belong in those same tests."

Refactors and Improvements

To refactor and improve project E1858, we will complete the following tasks:

  • Run the E1858 code on the latest Expertiza environment.
  • Clean up redundant and inaccurate code.
  • Move the old code into the correct controller (this could be a new controller).
  • Move the JavaScript code to the correct location.
  • Write new helper code.
  • Design the new interface page.
  • Write new test cases.

<Refactor>: grades_controller

File location: app/controllers/grades_controller.rb

Project E1858 Issues

Issues description here.

Project E1925 Improvements

Detailed improvement work here.

<Refactor>: grades_helper

File location: app/helpers/grades_helper.rb

Project E1858 Issues

Issues description here.

Project E1925 Improvements

Detailed improvement work here.

Project E1858 Document [1]

This project document is from the Project E1858 group. It gives a quick overview of how they completed the project.

Problem Statement

Expertiza provides Teammate Reviews under the View Scores functionality for each assignment. The purpose of this project is to augment existing assignment submissions with data from external tools like GitHub that can give a more realistic view of each team member's contribution. This external data may include the number of commits, number of lines of code modified, number of lines added, and number of lines deleted, taken from each group’s submitted GitHub repository link.

1. Teammate Reviews functionality in the View Scores page gauges teammate views on how much other team members contributed to the project. We need to augment this data with data from external tools like GitHub in order to validate that feedback. New metrics will be appended under each student data under the same functionality.
2. GitHub Metrics under the View Submissions page should include a bar chart that shows the number of commits by the team throughout the assignment timeline. This will help instructors get an overview of each team and aid the grading process.

While this data will not have marks associated with it directly, it will help the instructor differentiate the performance of team members and award marks according to contribution. Overall data for the team, like the number of committers and number of commits, may also help instructors predict which projects are likely to be merged.

Current Scenario

Currently, the View Submissions page does not show the work contribution of each teammate.


Teammate review shows peer review amongst teammates. Currently, however, there is no way to validate and verify these reviews.


Checking the commits made by each team member on GitHub is one solution, but it is inefficient from the instructor's or reviewer's perspective because there are many assignments, many submissions, and tight deadlines.

Use Case Diagram

1. Use Case diagram of two approaches to append 'GitHub contribution metric' in teammate review.
2. Use Case diagram explaining approach to add new column 'GitHub contribution metric' in 'View submission'.

Use Case Diagram Details

Actors:

  • Instructor: This actor is responsible for viewing GitHub metrics of teams and team members of an assignment.

Pre-Conditions:

  • The Team should have submitted the assignment with a PR link or GitHub repository.

Primary Sequence:

  • The instructor should log in.
  • The instructor should browse teams for an assignment.

Post Conditions:

  • Instructor will be able to see the team contribution done by each team member in 'View Submissions' page using graph diagrams, as shown in the figure.
  • Instructor will be able to see the work done by each student in 'Teammate Review Tab' with new metrics table appended at the end, as shown in the figure.

Design Considerations

  • The first thing was to determine what metrics we are looking for exactly. These are what the solution supports:
    1. Number of commits per user and total per team.
    2. Lines of code added.
    3. Lines of code deleted.
  • The next thing was to decide which version control hosting service to use. For now, we only support GitHub integration due to its popularity, ease of use and API documentation. Future projects could add support for GitLab and others, though it is far easier to just require that all students use GitHub.
    1. The main impact of this change will be that all submission repositories need to be made public as we need access to pull in the data.
    2. We also considered whether to ask students for the GitHub repository link separately (which requires changes to views) or to parse all the uploaded links and determine the correct one (which requires extra logic, for example to handle students uploading multiple links or no links at all). We decided to parse the links, since providing the PR link is mandatory anyway.
  • An important question was whether we needed to store metric information in our own db at all.
    1. An older investigation came up with this schema, but this would likely cause issues with stale information and would have been difficult to maintain.
    2. Having a db was redundant as every time a user wants to see the metrics, we would need to sync the db with GitHub and then show the results. So we end up hitting GitHub API anyway.
    3. An alternative to the above approach was to take snapshots of metrics and store them in the db right on the submission deadline of projects. This would allow for fairer grading by making sure we pull in data at the correct moment. Unfortunately, doing this for so many projects would put a lot of load on the server. Also, for open source projects, this would mean that we don’t have the latest data to work with (people will keep committing past the deadline). Thus, this approach might have been good for grading purposes but wouldn't have helped with determining the current status of a project.
    4. We have decided against using our own tables for this data and will be getting the GitHub data on-demand directly using the GitHub API.
  • We also considered if we needed to account for different branches. We only consider the master branch.
  • With respect to showing GitHub metrics in the View Scores page, it would have been very difficult to map Expertiza users and their names to public GitHub profiles, as students may use a different name. So instead of appending GitHub data to the Teammate Reviews table, we will show a new table below it to display the metrics. This will give the instructor a full view of how teammates rated each other and how that maps to factual information from GitHub.
  • The instructors will need to spell out exact guidelines for committing to the project's repositories (for example, everyone should commit their own code, keep master as the PR branch, commit regularly, and be mindful of squashing too many commits under one user), so that we have proper and correct data, and so that students can’t weasel their way out later by claiming they worked but forgot or didn’t know.
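The link-parsing decision above can be illustrated with a minimal sketch. The patterns, method names, and example URLs below are assumptions made for illustration; they are not the E1858 implementation, and they only recognize plain github.com pull request and repository URLs.

```ruby
# Hypothetical patterns for the two kinds of GitHub links a team may submit.
PULL_REQUEST_URL = %r{\Ahttps?://github\.com/([^/\s]+)/([^/\s]+)/pull/(\d+)}
REPOSITORY_URL   = %r{\Ahttps?://github\.com/([^/\s]+)/([^/\s]+?)(?:\.git)?/?\z}

# Classifies one submitted link. Returns nil for anything that is not a
# GitHub pull request or repository URL.
def classify_github_link(url)
  if (m = PULL_REQUEST_URL.match(url))
    { type: :pull_request, owner: m[1], repo: m[2], number: m[3].to_i }
  elsif (m = REPOSITORY_URL.match(url))
    { type: :repository, owner: m[1], repo: m[2] }
  end
end

# From all links a team uploaded, prefer a pull request link and fall back
# to a bare repository link. Returns nil when no GitHub link was submitted.
def pick_github_link(links)
  parsed = links.map { |link| classify_github_link(link.strip) }.compact
  parsed.find { |p| p[:type] == :pull_request } || parsed.first
end

submitted = [
  'https://example.com/design-doc',                  # made-up non-GitHub link
  'https://github.com/some-team/some-project',       # made-up repository link
  'https://github.com/expertiza/expertiza/pull/1179'
]
p pick_github_link(submitted)
```

Preferring the pull request link matches the reasoning that a PR link is mandatory, while the repository fallback covers teams that only submitted a repository.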

Design Principles

  • MVC – The project is implemented in Ruby on Rails that uses MVC architecture. It separates an application’s data model, user interface, and control logic into three distinct components (model, view and controller, respectively). We intend to follow the same when implementing our end-point for pulling GitHub data.
  • DRY Principle – We try to reuse existing functionality in Expertiza, thus avoiding code duplication. Whenever possible, we will modify existing classes, controllers, or tables instead of creating new ones.

Solution Design

  • The Github metrics that need to be integrated with Expertiza were finalized as below. These metrics are captured on a per-user basis:
    1. Total number of commits.
    2. Lines of code added.
    3. Lines of code deleted.
    4. Pull request status (includes Code Climate and Travis CI build status).
    5. User Github metrics:
      1. Committer ID
      2. Committer Name
      3. Committer email ID
  • A new link "Github Metrics" is provided under “View Submissions” for an assignment in the instructor view. This link opens a new tab and shows a stacked bar chart of the number of commits per user over the submission timeline, from the assignment creation date to the deadline.
  • In "View Scores" for an assignment in the instructor view, under the Teammate Reviews tab, a new table for GitHub metrics is added, which shows the following metrics per user: student name/ID, email ID, lines of code added, lines of code deleted, and number of commits.
  • For GitHub integration, we have used the GitHub GraphQL API v4. We have used the omniauth-github gem for authentication and authorization.
  • We parse the PR link to get the data associated with it. We have also handled projects which do not have a PR link, but just a link to the repository. We excluded Expertiza and Servo projects, as right now a PR link is expected. Future enhancements can look into getting separate GitHub submission links.
  • We also show the status of check runs in the View Github metrics view to help instructors view the status of various tools on the repos/PRs without having to go to the actual GitHub page.
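As an illustration of the per-user metrics and the stacked bar chart described above, the following sketch aggregates a list of commit records. The record shape (author, date, additions, deletions), the method names, and the sample data are assumptions; the actual E1858 code works on GitHub API responses and may structure this differently.

```ruby
require 'date'

# Totals per author: number of commits, lines added, lines deleted.
# These are the columns of the metrics table under Teammate Reviews.
def metrics_per_user(commits)
  totals = Hash.new { |h, k| h[k] = { commits: 0, additions: 0, deletions: 0 } }
  commits.each do |c|
    t = totals[c[:author]]
    t[:commits]   += 1
    t[:additions] += c[:additions]
    t[:deletions] += c[:deletions]
  end
  totals
end

# Commits per author per day: one series per author, which is the shape a
# stacked bar chart over the assignment timeline needs.
def commits_per_day(commits)
  series = Hash.new { |h, k| h[k] = Hash.new(0) }
  commits.each { |c| series[c[:author]][c[:date]] += 1 }
  series
end

# Made-up sample data for illustration only.
sample = [
  { author: 'alice', date: Date.new(2019, 3, 1), additions: 120, deletions: 10 },
  { author: 'bob',   date: Date.new(2019, 3, 1), additions: 40,  deletions: 0 },
  { author: 'alice', date: Date.new(2019, 3, 2), additions: 30,  deletions: 5 }
]

p metrics_per_user(sample)
p commits_per_day(sample)
```

Keying both results by the committer rather than by the Expertiza user reflects the design consideration that GitHub identities cannot be mapped reliably to Expertiza accounts.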



References

Project E1858 wiki

Expertiza_wiki

E1815:_Improvements_to_review_grader

Expertiza_PR_1179

Expertiza_PR_1179_Video

GitHub API documentation