E1858 Github metrics integration


Introduction

Problem Statement

Expertiza provides View Submissions and Teammate Reviews (under View Scores) functionality for each assignment. The purpose of this project is to augment existing assignment submissions with data that gives a more realistic view of each team member's contribution, using external tools such as GitHub. This external data may include the number of commits, lines of code modified, lines added, and lines deleted in each group's submitted GitHub repository.

1. View Submissions provides a list of submissions from all the teams for a particular assignment. We can add a new column that shows how work is distributed across the team, based on data from external tools such as GitHub, using bar graphs and other statistics.
2. Teammate Reviews gauges teammates' views on how much each team member contributed; we would like to augment this with data from external tools such as GitHub. New metrics will be appended under each student's data.

This information should prove useful for differentiating the performance of team members for grading purposes. Overall data for the team, such as the number of committers and the number of commits, may also help instructors predict which projects are likely to be merged.

Current Scenario

At present, group assignments are submitted as a single submission that shows the work done by the team as a whole. It does not show the contribution of each teammate.


Teammate Reviews shows each teammate's view of every other teammate's work. That can lead to discrepancies when validating the reviews.


Checking the commits made by each team member on GitHub is one possible solution, but it is not efficient from the instructor's or reviewer's perspective, given the large number of assignments and submissions and the tight deadlines.

Design Considerations

  • The first thing is to determine exactly what metrics we are looking for. The requirements document suggests:
    1. Total number of commits.
    2. Number of files changed.
    3. Lines of code added.
    4. Lines of code modified.
    5. Lines of code deleted.
    6. Lines of code that survived until the final submission (excluded from the MVP due to complexity and lower priority).
We are assuming these metrics are needed on a per-user basis.
The requirements document expects a view for these metrics under a new "Github Metrics" tab under "View Submissions" in the instructor view.
A line chart of number of lines vs. time needs to be included in the view.
  • The next thing is to narrow down which hosting service for version control we will support. For now, the plan is to support only GitHub integration because of its popularity, ease of use, and API documentation. Future projects could add support for GitLab and others, though it is far easier to simply require that all students use GitHub.
    1. The main impact of this change is that all submission repositories need to be made public, since we need access to pull in the data.
    2. We also need to consider whether to ask students for the GitHub repo link separately (which requires changes to views) or to parse all the uploaded links and determine the correct one (extra logic; students may upload multiple links or none at all). A link-parsing sketch follows this list.
  • An important question is whether we need to store metric information in our own DB at all.
    1. An older investigation came up with this schema, but it is likely to cause issues with stale information and will be difficult to maintain.
    2. Keeping a DB copy is redundant: every time a user wants to see the metrics, we would need to sync the DB with GitHub before showing the results, so we end up hitting the GitHub API anyway.
    3. An alternative is to take snapshots of the metrics and store them in the DB right at the submission deadline of each project. This allows fairer grading by making sure we pull in data at the correct moment. Unfortunately, doing this for so many projects could put a lot of load on the server. Also, for open-source projects it would mean we don't have the latest data to work with (people keep committing past the deadline). Thus, this approach might be good for grading purposes but does not help with determining the current status of a project.
    4. All that said, we might need to maintain some metadata in our DB; investigation pending.
  • We also need to consider whether to account for different branches. The initial plan is to consider only the master branch.
  • It was also suggested to make sure there isn't a lot of padding in the tables we show.
  • With respect to showing GitHub metrics in the View Scores page, it will be very difficult to map Expertiza users and their names to public GitHub profiles, since students may use different names. So instead of appending GitHub data to the Teammate Reviews table, it would be easier to add a new tab called "Github Metrics" and show the results there.
  • The instructors will need to spell out exact guidelines for committing to the project repos (e.g., everyone should commit their own code, keep master as the PR branch, commit regularly, be mindful of squashing too many commits onto one user), so that we have proper and correct data and so that students can't weasel their way out later by claiming they worked but forgot or didn't know.
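
Assuming we go with parsing the links students already submit (rather than adding a separate repo field), a minimal Ruby sketch of the link-detection logic could look like the following. The constant and method names are hypothetical and not existing Expertiza code.

<pre>
# Recognise github.com URLs among a team's submitted links and pull out owner/repo.
GITHUB_REPO_PATTERN = %r{\Ahttps?://(?:www\.)?github\.com/([\w.-]+)/([\w.-]+?)(?:\.git)?/?\z}

def extract_github_repo(submitted_links)
  submitted_links.each do |link|
    if (match = GITHUB_REPO_PATTERN.match(link.strip))
      return { owner: match[1], repo: match[2] }
    end
  end
  nil # no GitHub link found; the team would have to be asked for one
end

# extract_github_repo(['http://example.com/demo', 'https://github.com/expertiza/expertiza'])
# => { owner: "expertiza", repo: "expertiza" }
</pre>

Anything the pattern does not recognize would fall back to asking the team for an explicit repository link.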

Plan of action

  • The first thing is to investigate whether there are any Rails gems (Octokit/Unicorn, maybe?) for getting GitHub data, and whether we are allowed to use them. The idea is to always fetch the latest data from GitHub and show it to the user. For grading purposes, we could alternate the view between the latest data and data up to the submission deadline.
  • If there are no such gems, or if we aren't allowed to use gems, we need to write mock calls against the endpoints to work out how to get data from GitHub. There is ample documentation for this (see References). The GitHub API returns JSON, which can be parsed easily. A sketch of pulling per-user data this way follows this list.
  • We then need to figure out how to load this data (AJAX? or a gem?) when the instructor selects "Github Rubrics" (if we do this) or "Teammate Review" in the View Scores page. Similar data needs to be loaded in the View Submissions page.
  • We will need a mock-up view to display this data in a table.
  • We then implement the view. For charting, the suggestion is to reuse the charting work from a previous project (see References: E1815, PR 1179, and its video).
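
To make the first two bullets concrete, here is a minimal sketch (not Expertiza code) of pulling per-user metrics straight from the documented GitHub v3 REST API with plain Net::HTTP; the same data is reachable through the Octokit gem if we are allowed to use it. GET /repos/:owner/:repo/stats/contributors returns, for each contributor, the total number of commits plus weekly additions and deletions.

<pre>
require 'net/http'
require 'json'
require 'uri'

# Build a { login => { commits:, additions:, deletions: } } hash for a public repo.
def contributor_metrics(owner, repo)
  uri = URI("https://api.github.com/repos/#{owner}/#{repo}/stats/contributors")
  response = Net::HTTP.get_response(uri)
  return {} unless response.code == '200' # 202 means GitHub is still computing the stats

  JSON.parse(response.body).each_with_object({}) do |contributor, metrics|
    weeks = contributor['weeks']
    metrics[contributor['author']['login']] = {
      commits:   contributor['total'],
      additions: weeks.sum { |w| w['a'] },
      deletions: weeks.sum { |w| w['d'] }
    }
  end
end

# Illustrative shape of the result:
# contributor_metrics('expertiza', 'expertiza')
# => { "alice" => { commits: 42, additions: 1300, deletions: 250 }, ... }
</pre>

Note that GitHub may answer 202 while it is still computing the statistics, so a production version would need to retry after a short delay, and an authenticated client would be needed to avoid the low unauthenticated rate limit.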

Use Case Diagram

[Figure] Use case diagram of two approaches to append a 'Github contribution metric' to teammate reviews.

[Figure] Use case diagram of the approach to add a new 'Github contribution metric' column to 'View Submissions'.


Actors:

Instructor: This actor is responsible for creating assignments and adding students to the assignment.
Student: This actor is responsible for submitting assignments, self-reviewing, and viewing scores.

Database:

The database where all Expertiza data is stored.

All the other use cases are already implemented, except "View Scores with self-review column and composite score".

Use Case: View score with self-review column and composite score

Pre Conditions:

1. The student should have submitted the assignment and the self-review.

2. The other students should have submitted reviews of the work submitted.

Primary Sequence:

1. The student logs in.

2. The student browses for and uploads the assignment.

3. The student submits the assignment.

4. The student submits the self-review.

5. The student chooses "Your scores" to view the score.

6. The student should be able to see the peer-review scores along with the self-review scores and the composite score.

Post Conditions:

1. The self-review and composite scores will be visible on the page.

Database Design

As of now, we do not have plans for database modifications.

Design Principles

  • MVC – The project is implemented in Ruby on Rails, which uses the MVC architecture. It separates an application's data model, user interface, and control logic into three distinct components (model, view, and controller, respectively). We intend to follow the same pattern when implementing our endpoint for pulling GitHub data.
  • DRY principle – We try to reuse existing functionality in Expertiza, thus avoiding code duplication. Whenever possible, we will modify existing classes, controllers, or tables instead of creating new ones.

Proposed solutions

  • The first change would be under the "Teammate Review" tab in the "View Scores" page. We could add new rows to the table, one for each GitHub metric, and display the results for every student in the project, as mocked up here:
[Mock-up screenshot]

Or we might add a new tab called "Github Metrics" in the View Scores page and show the metrics and results in tabular format, as mocked up here:

[Mock-up screenshot]
  • The second change is in the View Submissions page, where we intend to add a new column to the table that shows a chart for each assignment team. The chart will show the proportion of contributions per team member; a sketch of the underlying computation follows below.
[Mock-up screenshot]
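
As an illustration of the data behind that chart, the sketch below computes each member's share of the team's total lines changed (additions plus deletions) from the per-user metrics hash sketched under "Plan of action"; all names here are hypothetical, not existing Expertiza code.

<pre>
# Turn per-user metrics into percentage shares of the team's total lines changed.
def contribution_shares(metrics)
  totals = metrics.transform_values { |m| m[:additions] + m[:deletions] }
  team_total = totals.values.sum
  return {} if team_total.zero?

  totals.transform_values { |lines| (100.0 * lines / team_total).round(1) }
end

# contribution_shares('alice' => { additions: 900, deletions: 100, commits: 10 },
#                     'bob'   => { additions: 450, deletions: 50,  commits: 5 })
# => { "alice" => 66.7, "bob" => 33.3 }
</pre>

These percentages would feed the per-team chart directly, whichever charting approach from the earlier projects we reuse.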

Test Plan

Subtask 1: Github metrics in teammate reviews

Test plan for proposed solution 1:

1) Log in as an instructor

2) Navigate to assignments through Manage --> Assignments

3) Select "View scores" icon for the assignment of your choice

4) Select the team for which you wish to view scores

5) Go to "Github metrics" tab

6) View data for each teammate based on the different GitHub metrics (e.g., lines of code added/changed/removed)

Test plan for proposed solution 2:

1) Log in as an instructor

2) Navigate to assignments through Manage --> Assignments

3) Select "View scores" icon for the assignment of your choice

4) Select the team for which you wish to view scores

5) Go to "Teammate reviews" tab

6) Select the student for whom you wish to view teammate review

7) Below the usual criteria, view the criteria for the different GitHub metrics (e.g., lines of code added/changed/removed), portrayed in a different color scheme (light blue)

Subtask 2: Line chart for # of lines changed by the overall team

1) Log in as an instructor

2) Navigate to assignments through Manage --> Assignments

3) Select "View submissions" icon for the assignment of your choice

4) Select the team whose submissions you wish to view

5) Verify that the newly added Github Metrics column shows the number of lines changed since the start of the assignment (see the sketch below)
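
For step 5, a minimal sketch of how the "lines changed since the start of the assignment" figure could be computed from the documented GET /repos/:owner/:repo/stats/code_frequency endpoint, which returns weekly [week, additions, deletions] triples (deletions are negative). The method name and date handling are illustrative only.

<pre>
require 'net/http'
require 'json'
require 'uri'

# Sum additions and deletions for all weeks on or after the assignment start date.
def lines_changed_since(owner, repo, start_date)
  uri = URI("https://api.github.com/repos/#{owner}/#{repo}/stats/code_frequency")
  response = Net::HTTP.get_response(uri)
  return 0 unless response.code == '200' # 202 means stats are still being generated

  JSON.parse(response.body)
      .select { |week, _add, _del| Time.at(week) >= start_date }
      .sum { |_week, add, del| add + del.abs }
end

# lines_changed_since('expertiza', 'expertiza', Time.new(2018, 10, 1))
</pre>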

References

Expertiza wiki: http://wiki.expertiza.ncsu.edu/index.php/Main_Page

E1815: Improvements to review grader: http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Spring_2018_E1815:_Improvements_to_review_grader

Expertiza PR 1179: https://github.com/expertiza/expertiza/pull/1179

Expertiza PR 1179 video: https://www.youtube.com/watch?v=HHdta64VHcY

GitHub API documentation: https://developer.github.com/v3/