CSC/ECE 517 Spring 2018- Project E1805: Convolutional data extraction from Github
Introduction
Expertiza is a web-based, open-source tool developed and maintained by current and past students of North Carolina State University. Instructors
Problem Statement
The objective of this project is to integrate GitHub metrics into Expertiza. The work was broken down into the following tasks:
- Retrieve data on a GitHub pull request
- Calculate meaningful metrics from the data for teams and individual students
- Display metrics to instructors
- Store metrics in the Expertiza database
Previous Design
Previous work on this project used the GitHub Statistics API, which was unable to retrieve information on individual students. That design could accurately fetch and display data for a team of contributors as a whole, but could not compute or display per-student metrics for instructors.
Proposed Design
The proposed design uses GitHub's new GraphQL API (v4), which allows simple extraction of metrics on a pull request. A single API request returns information on multiple commits. The returned data is then parsed into team-level and individual-level information. Finally, the metrics are calculated from that data and displayed to the user.
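The fetch-and-parse flow described above could look roughly like the following Ruby sketch. It is not the project's actual code: the helper names (build_pull_request_request, fetch_pull_request, extract_commits) are hypothetical, and the selected fields are just one plausible choice from GitHub's GraphQL schema. The query asks for at most 100 commits, so a larger pull request would need pagination.

```ruby
require 'json'
require 'net/http'
require 'uri'

GITHUB_GRAPHQL_ENDPOINT = URI('https://api.github.com/graphql')

# GraphQL query that selects only the commit fields needed for the metrics.
PULL_REQUEST_QUERY = <<~GRAPHQL
  query($owner: String!, $name: String!, $number: Int!) {
    repository(owner: $owner, name: $name) {
      pullRequest(number: $number) {
        commits(first: 100) {
          nodes {
            commit {
              committedDate
              additions
              deletions
              author { name email user { login } }
            }
          }
        }
      }
    }
  }
GRAPHQL

# Hypothetical helper: builds (but does not send) the POST request for one pull request.
def build_pull_request_request(owner, name, number, token)
  request = Net::HTTP::Post.new(GITHUB_GRAPHQL_ENDPOINT)
  request['Authorization'] = "bearer #{token}"
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(
    query: PULL_REQUEST_QUERY,
    variables: { owner: owner, name: name, number: number }
  )
  request
end

# Hypothetical helper: sends the request and returns the parsed JSON response.
def fetch_pull_request(owner, name, number, token)
  request = build_pull_request_request(owner, name, number, token)
  response = Net::HTTP.start(GITHUB_GRAPHQL_ENDPOINT.host,
                             GITHUB_GRAPHQL_ENDPOINT.port,
                             use_ssl: true) { |http| http.request(request) }
  JSON.parse(response.body)
end

# Hypothetical helper: flattens the nested response into one hash per commit.
# Falls back to the Git author name when the commit is not linked to a GitHub user.
def extract_commits(response)
  nodes = response.dig('data', 'repository', 'pullRequest', 'commits', 'nodes') || []
  nodes.map do |node|
    commit = node['commit']
    author = commit['author'] || {}
    {
      login: author.dig('user', 'login') || author['name'],
      date: commit['committedDate'],
      additions: commit['additions'],
      deletions: commit['deletions']
    }
  end
end
```

Calling fetch_pull_request requires network access and a valid token; extract_commits can be exercised offline against a saved response.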
Improvements
With the GraphQL API v4, we no longer need to pull a fixed, non-customizable payload from GitHub. Instead, we specify exactly which fields we want to fetch. This is much more efficient than the GitHub Pull Requests API and means we do not have to parse bulk information to get the portion we need.
Once we have the data, we divide it by date and by user so that a reviewer can clearly see the contribution of each member.
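As an illustration of dividing commit data by user and by date, here is a minimal Ruby sketch. It assumes each commit has already been reduced to a hash with :login, :date (an ISO 8601 timestamp), :additions, and :deletions keys. That shape and the function names are assumptions made for this example, not the project's actual data model.

```ruby
require 'time'

# Sums commit count, additions, and deletions for a list of commit hashes.
def summarize(commits)
  {
    commits: commits.size,
    additions: commits.sum { |c| c[:additions] },
    deletions: commits.sum { |c| c[:deletions] }
  }
end

# Individual-level metrics: one summary per contributor.
def metrics_by_user(commits)
  commits.group_by { |c| c[:login] }
         .transform_values { |list| summarize(list) }
end

# Per-contributor metrics broken down by calendar day (UTC).
def metrics_by_user_and_date(commits)
  commits.group_by { |c| c[:login] }.transform_values do |list|
    list.group_by { |c| Time.iso8601(c[:date]).utc.strftime('%Y-%m-%d') }
        .transform_values { |day| summarize(day) }
  end
end
```

Team-level totals come from calling summarize on the full commit list, while metrics_by_user and metrics_by_user_and_date give the per-member views a reviewer would chart.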
Results
The output of this module presents a user's GitHub data as various metrics in charts and tables. The images below show the metrics gathered and the charts created when looking at this pull request.
Environment Setup
To test this project manually on your local machine, you need to:
1. Clone this repository: https://github.com/rpot/expertiza/tree/E1805GithubDataExtraction
2. Go to your GitHub account -> Settings -> Developer Settings -> Personal access tokens -> Generate new token -> name the token and check the repo and admin:repo_hook options -> copy the generated token
3. Run GITHUB_ACCESS_TOKEN=<token> rails server. You can then manually test the functionality by logging in as instructor6 and going to Assignments -> view submissions -> Show Submission Records -> Show Github Data
Important note: If the project is added to the production environment of Expertiza, an access token must be created for the system and stored internally.
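Since the token is passed through the GITHUB_ACCESS_TOKEN environment variable, the application could read it with a small guard like the one below. This is only a sketch: the github_access_token helper is hypothetical and is not confirmed to exist in the project. It fails early with a clear message rather than sending unauthenticated requests.

```ruby
# Hypothetical helper: returns the GitHub token from the environment,
# raising a KeyError when the variable is missing or blank.
# The env parameter defaults to ENV and exists only to ease testing.
def github_access_token(env = ENV)
  token = env['GITHUB_ACCESS_TOKEN'].to_s.strip
  raise KeyError, 'GITHUB_ACCESS_TOKEN is not set' if token.empty?
  token
end
```

Keeping the lookup in one place also makes it easier to swap the environment variable for another storage mechanism in production.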
Testing Plan
Tests for this project cover two methods in app/models/github_datum.rb and one method in app/controllers/github_data_controller.rb. They are located in spec/models/github_datum_spec.rb and spec/controllers/github_data_controller_spec.rb, respectively.