CSC/ECE 517 Spring 2021 - E2111. Refactor github metrics integration: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
Line 59: Line 59:
==Important Data Structure==
==Important Data Structure==
Example of function get_pull_request_details return value:
Example of function get_pull_request_details return value:
<nowiki>
 
{"data"=>{"repository"=>{"pullRequest"=>{"number"=>23281,
{"data"=>{"repository"=>{"pullRequest"=>"{""number"=>23281,
                                         "additions"=>183,
                                         "additions"=>183,
                                         "deletions"=>119,
                                         "deletions"=>119,
Line 96: Line 96:
           }
           }
}
}
</nowiki>

Revision as of 18:31, 29 March 2021

Abstract

Github repository and pull request links are submitted as part of many projects in several NC State CSC courses. Instructors would like to compile statistics on these Github links on a per-project, per-team basis, and integrate display of these data within Expertiza (similar to Github Insights). Code was introduced to add this functionality in a 2018 project, E1858, which was not merged, and another team re-visited this project in 2019, which was not merged. Our team aims to tackle the issues we've found that prevented these past team's code from being merged.

Functionality

ADD SECTION INTRODUCING THE FUNCTIONALITY (Description and Maybe one screenshot. Past group went overboard with copy-pasting old screenshots.)

TALK ABOUT OMNIAUTH FOR API AND HOW IT ALL WORKS

Methods

The team has synced the original 2018 project code with the current status of the Expertiza/beta branch and begun analyzing the existing code. Prior customer feedback has advised the team to look for coupling, lack of comments/opaque code, and testing issues. We are performing our own end-to-end analysis of the code in addition to this feedback, and have so far found numerous action items. We have separated these action items into Primary and Secondary objectives: Primary objectives are "must-haves" that need to be addressed before the existing code is mergeable. Secondary objectives are "nice-to-haves," or features that were either left out of the original project or would make the integration display more complete.

Primary Objectives (Immediate Action Items)

These action items must be addressed before any feature improvements can be attempted.

API Changes Breaking Functionality

The Github API has had minor changes since 2018 which are causing the previously working code to either not display, or throw a NoMethodError. We have already analyzed and applied bugfixes to address these minor API changes, but would like to further refine how the API calls are set-up to decouple our code from the API and make it more robust.

ADD CODEBLOCKS TALKING ABOUT THE TWO BUGFIXES

Coupling to the Grades Controller

CONTINUED

Views Contain Large Javascript Blocks

DESCRIBE MOVE TO HELPERS

Methods Out-of-Order

CHANGE TO CALLING ORDER OR BETTER RE_ORG

Lack of Comments / Opaque Code

CONTINUED

Testing

SHORT COMMENTARY ON COUPLED TESTING TO GRADES/INSTRUCTOR VIEW. SEE TEST PLAN FOR AT_LENGTH DISCUSSION.

Secondary Objectives

The team would like to address one or more of these secondary objectives which were not addressed by prior teams, as long as they do not break existing functionality. These are listed in order of importance.

Data Permanence (Model)

The existing code polls Github every time the data is displayed. This makes the rendering code unnecessarily complex and introduces the potential for future data loss. For performance reasons, decoupling, and the benefit of future developers and data permanence, an improvement to this method would be creating a model and database table to hold these data, so that the display is querying the local DB instead of querying out to Github each time the display is rendered. The team would like to extend the proposed new controller to include a new model that will take this data, and only insert the diff into the database. One point of discussion among the team and our mentor will be, should an update query be run each time the page is displayed, or should a burst of queries be run at the expiration of a deadline to populate the database table? Another option may be to create a manual "Github Update" link, but reject all data after a deadline. Yet another option would be to keep all Github data and use the "late data" to call out number of late submissions on the statistics page.

New Feature: Hide Expertiza Development Team

At this time, if a team merges in many old commits by the Expertiza Development team, the bar graph statistics are thrown off by including the development team ("Winbobob" et al). If the data permanence objective is met, creating more complex queries should be easier, and the team would like to explore adding a feature to the Github integration display to "hide known contributors," or similar. This would hide Github data from any user included on the "blacklist" which can be manually curated with the short list of Expertiza developers. While this would not address edge cases where an Expertiza developer later participates in a course using Expertiza, this edge case should be quite rare.

Test Plan

The 2018 project wrote numerous tests, several of which are deeply coupled into the grades_controller tests. The team proposes several changes to the testing strategy.

Decouple Tests / Pass Builds

Builds never passed for this project because, in the current code, a user must be logged in and authorized to Github to render grades#view, and this broke several (expected) test cases. As part of decoupling Github from the grades controller, we expect to decouple and revert tests for the grades controller such that the tests become independent, permitting the build to pass if our Github controller tests pass as expected.

Detailed Analysis of Github Testing

The previous team wrote numerous intricate tests for the new functionality. Our team intends to go through these tests line-by-line and ensure that having so many tests is necessary, and the most elegant way to address the issue. We believe that some of the Github refactoring should move methods to become private, which may mean that specific tests for that method shouldn't be used. Alternately, particularly for API authentication which can be tricky, it may be prudent to have discrete tests. Finally, we want to ensure that the tests are actually covering the code -- we are receiving a TravisCI warning about tests without expectations, which needs to be resolved before the code can be merged.

New Tests for Secondary Objectives

If a new model is created along with associated database tables, new tests for this model will need to be written first in keeping with TDD principles.

Resources

ADD PERTINENT LINKS

Important Data Structure

Example of function get_pull_request_details return value:

{"data"=>{"repository"=>{"pullRequest"=>"{""number"=>23281,

                                        "additions"=>183,
                                        "deletions"=>119,
                                        "changedFiles"=>13,
                                        "mergeable"=>"UNKNOWN",
                                        "merged"=>true,
                                        "headRefOid"=>"58f027468c8908160196c34681ff5d224a84d75c",
                                        "commits"=>{"totalCount"=>2,
                                                    "pageInfo"=>{"hasNextPage"=>false,
                                                                 "startCursor"=>"MQ",
                                                                 "endCursor"=>"Mg"
                                                                 },
                                                    "edges"=>[{"node"=>{"id"=>"MDE3OlB1b...",
                                                                        "commit"=>{"author"=>{"name"=>"Manish Goregaokar"
                                                                                             },
                                                                                   "additions"=>10, "deletions"=>10,
                                                                                   "changedFiles"=>1,
                                                                                   "committedDate"=>"2019-04-30T02:44:08Z"
                                                                                  }
                                                                        }
                                                               },
                                                               {"node"=>{"id"=>"MDE3OlB1b...",
                                                                         "commit"=>{"author"=>{"name"=>"Maria Sable"},
                                                                                    "additions"=>173,
                                                                                    "deletions"=>109,
                                                                                    "changedFiles"=>12,
                                                                                    "committedDate"=>"2019-04-30T23:54:52Z"
                                                                                    }
                                                                         }
                                                                }
                                                               ]
                                                    }
                                        }
                        }
         }

}