CSC/ECE 517 Fall 2024 - E2457. GitHub metrics integration
About Expertiza
Expertiza is a web-based, open-source educational platform built with Ruby on Rails by students and faculty at NC State University. It enables instructors to create flexible assignments that students can select based on their interests and supports team-based projects with a robust peer review system. Through peer evaluations, students gain critical feedback, enhancing self-improvement skills. With support for diverse submission types, Expertiza adapts easily to various assignment formats and teaching approaches.
Problem Statement
Expertiza currently uses team reviews to measure how much each team member contributed to a project. However, it would be useful for instructors to see GitHub metrics in Expertiza for each team to corroborate those reviews. These metrics include the number of commits, the number of lines of code modified, the number of lines added, and the number of lines deleted.
Background
The main objective of the project is to refactor the code written by the Spring 2023 team for displaying GitHub metrics. The main issue with that team's code is that metrics_controller.rb is bloated: it contains logic that belongs in the model, and it has too many GitHub-specific methods. Since other metrics could be used in the future, a metrics_controller should be general. In addition, several method names are confusing and need to be renamed, and several methods violate the Single Responsibility Principle and so need to be split into multiple methods. Lastly, the tests and the comments on the tests need to be improved.
Feature Implementation
Use Case Diagram
New Github Metrics Functionality
The “Manage Users” page includes a link to import GitHub associations.
The “Import GitHubAssociation List” page allows configuration of import settings and file upload.
A preview page displays the GitHub associations from the uploaded file before finalizing the import.
The edit page for assignments allows enabling GitHub metrics for projects.
The pre-login page displays the interface to obtain the GitHub access token.
The post-authentication page shows the interface after successfully logging in with a GitHub account.
The GitHub metrics page displays visualizations of code frequency and total contributions by author, along with detailed team statistics.
Proposed Changes
Changes to metrics_controller.rb
Renaming metrics_controller.rb to github_metrics_controller.rb
Firstly, metrics_controller will be renamed to github_metrics_controller to reflect its focus on GitHub-related functionality, as no other metrics are currently anticipated for this controller.
Moving Methods out of the Controller
The following functions do not belong in the controller and should live in either model or helper classes.
- github_metrics_for_submission: This method uses the session information to submit the GitHub metrics. However, we can pass the session to the model to perform the same function.
- retrieve_github_data: This method takes a link and determines whether it is a pull request link or a repository link, which determines how the metrics are returned. It should not be a controller method because it is only concerned with how to handle the submitted link, not with what to show the user.
- query_all_pull_requests: This method iterates over multiple pull request links and fetches the data from GitHub. Because it is not concerned with what to show the user, it should live in the model.
- pull_request_data: This method iterates over pages of commits, preparing the metrics pulled from the commits for the parser. Since it handles data collection, it should be in the model.
- parse_pull_request_data: This method parses the metrics from the pull request and prepares them for the view. It should therefore be a model method instead of a controller method.
- query_all_merge_statuses: This method iterates through all the pull request links and retrieves the status of each link. This is data retrieval, so it should be in a model instead of the controller.
- retrieve_repository_data: This method fetches metrics from the GitHub repository, so it should be a model method instead of a controller method.
- parse_repository_data: This method parses the metrics retrieved from the repository and prepares them for the view. It should therefore be a model method instead of a controller method.
- count_github_authors_and_dates: This method processes the metrics from the commits, so it should be a model method rather than a controller method.
- find_user_by_github_email: This method receives an email address and parses it to find a user. Because it parses data, it should be in either a model or a service class rather than the controller. Moreover, this method falsely assumes that every email address used will be an NC State address, so it has to be reimplemented.
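The link handling described for retrieve_github_data above can be sketched in plain Ruby, independent of any controller. This is a minimal illustration, not the project's code: the module name GithubLinkClassifier, the regular expressions, and the returned hash keys are all assumptions made for the example, and only links on the github.com host are recognized.

```ruby
require 'uri'

# Hypothetical helper (not part of Expertiza): classifies a submitted hyperlink
# as a pull request link, a repository link, or neither.
module GithubLinkClassifier
  # Assumed URL shapes: /owner/repo/pull/123 and /owner/repo (optionally .git or trailing slash).
  PULL_REQUEST_PATH = %r{\A/([^/]+)/([^/]+)/pull/(\d+)(?:/.*)?\z}
  REPOSITORY_PATH   = %r{\A/([^/]+)/([^/]+?)(?:\.git)?/?\z}

  def self.classify(link)
    uri = URI.parse(link.to_s.strip)
    # URI::HTTPS is a subclass of URI::HTTP, so both schemes pass this check.
    return { type: :unknown } unless uri.is_a?(URI::HTTP) && uri.host == 'github.com'

    if (match = PULL_REQUEST_PATH.match(uri.path))
      { type: :pull_request, owner: match[1], repository: match[2], number: match[3].to_i }
    elsif (match = REPOSITORY_PATH.match(uri.path))
      { type: :repository, owner: match[1], repository: match[2] }
    else
      { type: :unknown }
    end
  rescue URI::InvalidURIError
    # Malformed input is treated the same as an unrecognized link.
    { type: :unknown }
  end
end
```

Because the classifier only inspects the link and returns a plain hash, it has no dependency on the session or the view, which is the reason given above for keeping this kind of logic out of the controller.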
Renaming Methods
- retrieve_github_data: The name of this method is ambiguous because it is unclear what ‘data’ is. This method takes the links that a group submitted, determines whether each is a repository link or a pull request link, and then calls the appropriate method to parse the metrics from the link. For this reason, this method should be renamed to retrieve_github_metrics.
- query_all_pull_requests: The name of this method is ambiguous because it is unclear what is being queried for each pull request. This method iterates over all of the links and parses out the relevant information by getting the name of the owner from the hyperlink and retrieving the GitHub metrics from the pull request. For this reason, this method should be renamed to parse_all_pull_requests.
- pull_request_data: The name is ambiguous because it is unclear what ‘data’ is. This method retrieves the GitHub metrics from the pull request link that it was passed. For this reason, the method should be renamed to retrieve_pull_request_metrics.
- parse_pull_request_data: The name is ambiguous because it is unclear what ‘data’ is. This method parses the metrics returned from the GitHub API for the given pull request link. For this reason, the method should be renamed to parse_pull_request_metrics.
- retrieve_repository_data: The name is ambiguous because it is unclear what ‘data’ is. This method retrieves GitHub metrics for the given GitHub repository link. For this reason, the method should be renamed to retrieve_repository_metrics.
- parse_repository_data: The name is ambiguous because it is unclear what ‘data’ is. This method parses the metrics returned from the GitHub API for the given repository link. For this reason, the method should be renamed to parse_repository_metrics.
Other Changes
Removing GitHub Metrics Uses Model and Controller
To simplify the implementation, the github_metric_uses.rb model and github_metrics_uses_controller.rb will be removed and replaced with a new boolean attribute, use_github_metrics, on the assignment.rb model. This change allows assignments to specify whether GitHub metrics should be used without the need for a separate model or controller, which streamlines the codebase and improves maintainability.
Visualization
The following UML diagrams illustrate the system before and after refactoring the GitHub metrics integration feature in the Expertiza project. Initially, the metrics_controller was overloaded with responsibilities: it handled data retrieval, parsing, and presentation, which violated the Single Responsibility Principle and caused tight coupling. After refactoring, the github_metrics_controller focuses solely on coordination, with data handling moved to a dedicated GithubMetric model. Redundant components, such as github_metric_uses.rb, are removed, and the assignment.rb model uses a boolean attribute to manage GitHub metrics usage. These changes improve modularity, clarity, and maintainability.
Before proposed changes
After proposed changes
Implemented Changes
Github Metric Controller (github_metrics_controller.rb)
This new controller now only manages the collection and display of GitHub metrics for assignments, handling OAuth authentication with GitHub, processing repository metrics for both individual teams and entire assignments, and providing functionality to view GitHub-related statistics. The controller includes error handling and ensures that only users with instructor privileges can access its functions. It works in conjunction with GitHub's OAuth system to maintain authenticated sessions for API requests.
class GithubMetricsController < ApplicationController
  include AuthorizationHelper
  include AssignmentHelper
  include GithubMetricsHelper

  def action_allowed?
    ...
  end

  def initialize_oath
    ...
  end

  def query_assignment_statistics
    ...
  end

  def callback
    ...
  end

  def show
    ...
  end

  private

  def create_team_metric
    ...
  end

  def error_redirect
    ...
  end
end
Github Metric Model (github_metric.rb)
This model handles the retrieval and processing of GitHub metrics for team submissions. It interacts with GitHub's GraphQL API to collect detailed statistics about pull requests, including commit information, author details, and repository status. The model tracks various metrics such as additions, deletions, commit counts, file changes, and merge status. It also handles pagination for large pull requests and provides error handling for API interactions.
class GithubMetric
  attr_reader :participant, :assignment, :team, :token
  attr_accessor :head_refs, :parsed_metrics, :authors, :dates, :total_additions,
                :total_deletions, :total_commits, :total_files_changed,
                :merge_status, :check_statuses

  def initialize(participant_id, assignment_id = nil, token = nil)
    ...
  end

  def build_metric
    ...
  end

  def pull_query(hyperlink_metrics)
    ...
  end

  private

  def initialize_metrics
    ...
  end

  def retrieve_metrics
    ...
  end

  def parse_all_pull_requests(pull_links)
    ...
  end

  def retrieve_pull_request_metrics(hyperlink_metrics)
    ...
  end

  def parse_pull_request_metrics(github_metrics)
    ...
  end

  def merge_status?
    ...
  end

  def count_authors_and_dates(author_name, author_email, author_login, commit_date)
    ...
  end

  def commit_statistics(metrics)
    ...
  end

  def pull_request_status(pr_object)
    ...
  end

  def team_statistics(github_metrics, metrics_type)
    ...
  end

  def set_unavailable_statistics
    ...
  end

  def sort_commit_dates
    ...
  end

  def parse_hyperlink_metrics(hyperlink)
    ...
  end
end
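The pagination and running totals that the model description mentions can be illustrated with a small, self-contained sketch. It does not call GitHub: the page shape (commits, has_next_page, end_cursor), the CommitTotals class, and the collect_commit_totals method are all assumptions made for this example, loosely following the cursor-based paging style of GraphQL connections. The real GithubMetric bodies are elided above and may differ.

```ruby
# Hypothetical accumulator for per-commit statistics (not the project's class).
class CommitTotals
  attr_reader :total_additions, :total_deletions, :total_commits, :authors

  def initialize
    @total_additions = 0
    @total_deletions = 0
    @total_commits = 0
    @authors = Hash.new(0) # author name => number of commits
  end

  def add(commit)
    @total_additions += commit[:additions]
    @total_deletions += commit[:deletions]
    @total_commits += 1
    @authors[commit[:author]] += 1
  end
end

# fetch_page is any callable that takes a cursor (nil for the first page) and
# returns one page. max_pages is a safety cap so a bad cursor cannot loop forever.
def collect_commit_totals(fetch_page, max_pages: 100)
  totals = CommitTotals.new
  cursor = nil
  max_pages.times do
    page = fetch_page.call(cursor)
    page[:commits].each { |commit| totals.add(commit) }
    break unless page[:has_next_page]

    cursor = page[:end_cursor]
  end
  totals
end

# Stubbed two-page response standing in for the API (made-up data).
PAGES = {
  nil => { commits: [{ author: 'alice', additions: 10, deletions: 2 },
                     { author: 'bob', additions: 5, deletions: 1 }],
           has_next_page: true, end_cursor: 'c1' },
  'c1' => { commits: [{ author: 'alice', additions: 3, deletions: 4 }],
            has_next_page: false, end_cursor: nil }
}.freeze

stub_fetcher = ->(cursor) { PAGES.fetch(cursor) }
totals = collect_commit_totals(stub_fetcher)
puts "commits=#{totals.total_commits} additions=#{totals.total_additions} deletions=#{totals.total_deletions}"
```

Passing the fetcher in as a callable keeps the paging loop testable without network access, which fits the goal of covering the model with RSpec.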
Testing
Our test plan closely follows that of the Spring 2023 team (link to their test plan here).
To ensure the functionality and reliability of the GitHub metrics integration, we will test with RSpec and with UI tests. This involves adding RSpec tests for github_metrics_controller and github_metric to verify that each component functions as expected within the refactored structure. Additionally, we will run UI tests to ensure that the GitHub metrics display accurately and integrate cleanly with the Expertiza interface.
Deployment
To deploy our changes to Expertiza, an OAuth app needs to be created for the production build of Expertiza. This link has the instructions needed to create the app. Note that the Authorization callback URL must be homepageURL/auth/github/callback for our changes to work. Moreover, an environment variable named 'GITHUB_KEY' needs to be defined with the Client ID of the OAuth app, and 'GITHUB_SECRET' needs to be defined with the Client secret you create in the app.
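The deployment requirements above can be checked with a short start-up sketch. The variable names GITHUB_KEY and GITHUB_SECRET and the /auth/github/callback path come from the text; the GithubOauthSettings struct, the github_oauth_settings method, and the example homepage URL are hypothetical and only illustrate one way to fail early when the configuration is incomplete.

```ruby
# Hypothetical container for the OAuth settings named in the deployment notes.
GithubOauthSettings = Struct.new(:client_id, :client_secret, :callback_url)

# env is anything that responds to [] with string keys, such as ENV or a Hash.
def github_oauth_settings(env, homepage_url)
  missing = %w[GITHUB_KEY GITHUB_SECRET].select { |name| env[name].to_s.strip.empty? }
  raise KeyError, "missing environment variables: #{missing.join(', ')}" unless missing.empty?

  # Drop one trailing slash so the callback path is joined with a single slash.
  callback_url = "#{homepage_url.chomp('/')}/auth/github/callback"
  GithubOauthSettings.new(env['GITHUB_KEY'], env['GITHUB_SECRET'], callback_url)
end

# Example with placeholder values; in production, ENV would be passed instead.
settings = github_oauth_settings(
  { 'GITHUB_KEY' => 'example-client-id', 'GITHUB_SECRET' => 'example-client-secret' },
  'https://expertiza.example.edu/'
)
puts settings.callback_url
```

The printed callback URL is the value that would be entered as the Authorization callback URL when registering the OAuth app.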
Team
Members
- Brandon Walia
- Kevin Dai
- Manav Patel
Mentor
- Dr. Ed Gehringer