CSC/ECE 517 Spring 2014/oss E1404 mnp: Difference between revisions
No edit summary |
|||
(11 intermediate revisions by the same user not shown) | |||
Line 7: | Line 7: | ||
* HTML for the whole page is generated, largely by controller methods, before anything is displayed. | * HTML for the whole page is generated, largely by controller methods, before anything is displayed. | ||
This code needs to be made faster. | This code needs to be made faster. | ||
== Design and implementation == | == Design and implementation == | ||
Line 25: | Line 15: | ||
* Participant already has its own participant id. Each user has a different participant for each assignment so we do not need to take participant id as the parameter. | * Participant already has its own participant id. Each user has a different participant for each assignment so we do not need to take participant id as the parameter. | ||
* This also means it should return a hash that will contain the scores of all the assignments in which this user is a participant since participant id changes each time. | * This also means it should return a hash that will contain the scores of all the assignments in which this user is a participant since participant id changes each time. | ||
* Thus, the above requirements need to be changed. Instead of having a Participant.find(participant id).scores, we wrote an all_scores method in the grades_controller (since grades controller does score calculation and display for other similar features such as view_my_scores). | |||
==== Implementation ==== | ==== Implementation ==== | ||
Line 35: | Line 26: | ||
[[File:View_all_scores.jpg]] | [[File:View_all_scores.jpg]] | ||
===== | ===== Method ===== | ||
We needed to write a method and add a view that computed and displayed the scores of a user across all the assignments user has participated in. We added a new method all_scores in the grades_controller.rb. This method executes the following steps: | We needed to write a method and add a view that computed and displayed the scores of a user across all the assignments user has participated in. We added a new method all_scores in the grades_controller.rb. This method executes the following steps: | ||
* :Executes the below sql query: | * :Executes the below sql query: | ||
SELECT assignments.name, questionnaire_type, avg(score), min(score), max(score) FROM | <pre> | ||
(SELECT r.id as response_id, team_id, tu.user_id as user_id, (SUM(weight*score)*100)/(sum(weight)*max_question_score) as score, parent_id as assignment_id, qs.type as questionnaire_type | SELECT | ||
FROM scores s ,responses r, response_maps rm, questions q, questionnaires qs , teams_users tu , teams t | assignments.name, questionnaire_type, avg(score), min(score), max(score) | ||
WHERE rm.id = r.map_id AND r.id=s.response_id AND q.id = s.question_id AND qs.id = q.questionnaire_id AND tu.team_id = rm.reviewee_id | FROM | ||
WHERE assignments.id = participant_score_aggregate.assignment_id | ( | ||
SELECT | |||
r.id as response_id, team_id, tu.user_id as user_id, | |||
(SUM(weight*score)*100)/(sum(weight)*max_question_score) as score, | |||
parent_id as assignment_id, qs.type as questionnaire_type | |||
FROM | |||
scores s ,responses r, response_maps rm, questions q, | |||
questionnaires qs , teams_users tu , teams t | |||
WHERE | |||
rm.id = r.map_id AND r.id=s.response_id AND q.id = s.question_id | |||
AND qs.id = q.questionnaire_id AND tu.team_id = rm.reviewee_id | |||
AND tu.team_id = t.id group by r.id | |||
) | |||
as participant_score_aggregate, assignments | |||
WHERE | |||
assignments.id = participant_score_aggregate.assignment_id AND user_id = ? | |||
GROUP BY assignment_id, questionnaire_type | |||
</pre> | |||
The user_id is of the current user is passed as a parameter to this query. | The user_id is of the current user is passed as a parameter to this query. | ||
Line 64: | Line 71: | ||
==== Issues ==== | ==== Issues ==== | ||
We observed the following issues in the previous code and took the following design decisions: | |||
* The previous code looped through the questionnaires of the assignment and collected the related questions in an array. This array is used to compute the score for each question. Instead, we get the scores in the query itself thereby reducing the complexity of the code. | |||
* The previous code was calling several methods that computed the different parts of the score in a loop. These methods seemed similar in their naming and looked to be a candidate of yo-yo effect. We tried to reduce this by having the database query (wherever possible) do most of the work for the method and the method just populate the hash. | |||
* Even though the single query approach works faster, for assignments that have a lot of responses, we noticed that the code still took some time to complete. The query in itself runs fast (when executed directly on the SQL server). However, this seems to be bottleneck for the rails interface to the database itself. | |||
==== Implementation ==== | ==== Implementation ==== | ||
Line 79: | Line 89: | ||
This speeds up the code since there is only a single SQL query that will get executed. The loop runs only for the number of rows in the returned result set. This means that the loop runs only for the assignment submissions that got scored. The query is outside the loop and hence reduces the time required. | This speeds up the code since there is only a single SQL query that will get executed. The loop runs only for the number of rows in the returned result set. This means that the loop runs only for the assignment submissions that got scored. The query is outside the loop and hence reduces the time required. | ||
==== Performance gain ==== | |||
The old code and the new code were tested against the same database, on the same machine and network and it was observed that there was a significant reduction in time taken to get the scores. | |||
Snapshot of the time taken for the previous method: | |||
[[File:Old_timing.png]] | |||
Snapshot of the time taken for the method after the above changes: | |||
[[File:New_timing.png]] | |||
As can be observed, the improvement is significant. | |||
=== Objective 3: Get scores of all the users for a given course === | === Objective 3: Get scores of all the users for a given course === | ||
Line 88: | Line 109: | ||
==== Implementation ==== | ==== Implementation ==== | ||
==== View ==== | ===== View ===== | ||
A new view was required since this view did not exist previously. This view will be displayed on clicking on the View Scores option in the Actions panel of the courses as highlighted below: | A new view was required since this view did not exist previously. This view will be displayed on clicking on the View Scores option in the Actions panel of the courses as highlighted below: | ||
[[File:Course_scores.jpg]] | [[File:Course_scores.jpg]] | ||
Line 95: | Line 116: | ||
[[File:Course_scores_view.jpg]] | [[File:Course_scores_view.jpg]] | ||
===== Method ===== | |||
We needed to write a method and add a view that computed and displayed the scores of all users that have enrolled in a course. We added a method named view_course_scores in the grades_controller.rb. This method does the following steps: | We needed to write a method and add a view that computed and displayed the scores of all users that have enrolled in a course. We added a method named view_course_scores in the grades_controller.rb. This method does the following steps: | ||
* Creates an empty hash. | * Creates an empty hash. | ||
* Runs a single SQL query as below: | * Runs a single SQL query as below: | ||
SELECT assignments.name, user_id, questionnaire_type, avg(score), min(score), max(score) | <pre> | ||
(SELECT r.id as response_id, team_id, u.name as user_id, (SUM(weight*score)*100)/(sum(weight)*max_question_score) as score, t.parent_id as assignment_id, qs.type as questionnaire_type | SELECT | ||
FROM scores s ,responses r, response_maps rm, questions q, questionnaires qs , users u, teams_users tu , teams t | assignments.name, user_id, questionnaire_type, avg(score), min(score), max(score) | ||
WHERE | FROM | ||
) as participant_score_aggregate, assignments | ( | ||
WHERE assignments.id = participant_score_aggregate.assignment_id | SELECT | ||
r.id as response_id, team_id, u.name as user_id, | |||
(SUM(weight*score)*100)/(sum(weight)*max_question_score) as score, | |||
t.parent_id as assignment_id, qs.type as questionnaire_type | |||
FROM | |||
scores s ,responses r, response_maps rm, questions q, questionnaires qs , users u, teams_users tu , teams t | |||
WHERE | |||
rm.id = r.map_id AND r.id=s.response_id AND q.id = s.question_id | |||
AND qs.id = q.questionnaire_id AND tu.team_id = rm.reviewee_id | |||
AND tu.team_id = t.id AND tu.user_id=u.id group by r.id | |||
) | |||
as participant_score_aggregate, assignments | |||
WHERE | |||
assignments.id = participant_score_aggregate.assignment_id | |||
AND assignments.course_id = ? group by assignment_id, questionnaire_type | |||
</pre> | |||
The course id is passed as a parameter to this query. | The course id is passed as a parameter to this query. | ||
Line 117: | Line 152: | ||
== Future work == | == Future work == | ||
* The SQL queries could possibly be made to work faster if we could come up with a better query plan or use indexing. | |||
* The SQL queries could be compiled into database views. We did not do it right now since we wanted to adhere to the existing hashes that were created and keep the code change as simple and minimal as possible. However, if such is required, the query can be made into a database view. | |||
* The view_my_scores method in the grades_controller can also be made similarly fast by replacing multiple database queries and loops with a single query. |
Latest revision as of 22:37, 31 March 2014
Improvement to View Scores
Introduction
Background
The scores for different users, participants and teams for several assignments belong to different courses are calculated for multiple views. This code is very slow because:
- Separate db queries are used for each rubric that has been filled out by anyone associated with the assignment; these queries are made sequentially while the HTML page is being written
- HTML for the whole page is generated, largely by controller methods, before anything is displayed.
This code needs to be made faster.
Design and implementation
Objective 1: Get scores of all the assignments for a given user
Requirement: Participant.find(participant id).scores # should return a Hash or new object, with a single database query. This method should return the scores of a participant from all the assignments in which the participant has submitted.
Issues
- Participant already has its own participant id. Each user has a different participant for each assignment so we do not need to take participant id as the parameter.
- This also means it should return a hash that will contain the scores of all the assignments in which this user is a participant since participant id changes each time.
- Thus, the above requirements need to be changed. Instead of having a Participant.find(participant id).scores, we wrote an all_scores method in the grades_controller (since grades controller does score calculation and display for other similar features such as view_my_scores).
Implementation
View
A new view was required since this view did not exist previously. A new link has been added to the user's homepage:
On clicking on this link, the user will be able see his scores for all the assignments he/she has participated in. This is a new view and will look as below:
Method
We needed to write a method and add a view that computed and displayed the scores of a user across all the assignments user has participated in. We added a new method all_scores in the grades_controller.rb. This method executes the following steps:
- :Executes the below sql query:
SELECT assignments.name, questionnaire_type, avg(score), min(score), max(score) FROM ( SELECT r.id as response_id, team_id, tu.user_id as user_id, (SUM(weight*score)*100)/(sum(weight)*max_question_score) as score, parent_id as assignment_id, qs.type as questionnaire_type FROM scores s ,responses r, response_maps rm, questions q, questionnaires qs , teams_users tu , teams t WHERE rm.id = r.map_id AND r.id=s.response_id AND q.id = s.question_id AND qs.id = q.questionnaire_id AND tu.team_id = rm.reviewee_id AND tu.team_id = t.id group by r.id ) as participant_score_aggregate, assignments WHERE assignments.id = participant_score_aggregate.assignment_id AND user_id = ? GROUP BY assignment_id, questionnaire_type
The user_id is of the current user is passed as a parameter to this query.
- From the result obtained from the above query, we create a hash of the format:
assignment name => questionnaire type
questionnaire type is again a hash that is of the form: questionnaire type => score_type
The score type is a hash that takes any of the three values [avg, max and min] as key the value will be the actual score. Once the hash is created and populated accordingly, this hash is used by the view to display to the user.
Objective 2: Improve the speed of getting scores for a given assignment
Requirement: assignment.scores # should return a Hash or new object, with a single database query. This method should return scores of all the participants in an assignment. This can be done by using following methods:
- Use joins on database tables participants and response_maps
- Create a join on above mentioned tables to retrieve grade for participants in an assignment.
Issues
We observed the following issues in the previous code and took the following design decisions:
- The previous code looped through the questionnaires of the assignment and collected the related questions in an array. This array is used to compute the score for each question. Instead, we get the scores in the query itself thereby reducing the complexity of the code.
- The previous code was calling several methods that computed the different parts of the score in a loop. These methods seemed similar in their naming and looked to be a candidate of yo-yo effect. We tried to reduce this by having the database query (wherever possible) do most of the work for the method and the method just populate the hash.
- Even though the single query approach works faster, for assignments that have a lot of responses, we noticed that the code still took some time to complete. The query in itself runs fast (when executed directly on the SQL server). However, this seems to be bottleneck for the rails interface to the database itself.
Implementation
The scores are computed in the get_scores method of the assignment.rb. Previously this code worked as follows:
- For each participant the participant.get_scores is called for each question. For all quiz questionnaires taken by the participant, get all the quiz responses. Create a hash and store the score computed by calling Score.compute_quiz_scores. The total score is then calculated and stored in the hash by calling compute_total_score for each participant.
- For all teams that participated in this assignment, get the scores into the hash by calling Score.compute_scores for each assessment and questions.
This method is specifically slow because the first step consists of several substeps, performed inside a big loop. Each of them might query the database adding to the time taken.
We refactored this code to have the following steps:
- Created an empty hash before hand.
- Run a single SQL query that will return all the data required by the hash.
- Loop through the result set to populate the hash.
This speeds up the code since there is only a single SQL query that will get executed. The loop runs only for the number of rows in the returned result set. This means that the loop runs only for the assignment submissions that got scored. The query is outside the loop and hence reduces the time required.
Performance gain
The old code and the new code were tested against the same database, on the same machine and network and it was observed that there was a significant reduction in time taken to get the scores. Snapshot of the time taken for the previous method:
Snapshot of the time taken for the method after the above changes:
As can be observed, the improvement is significant.
Objective 3: Get scores of all the users for a given course
Requirements: Add a method course.scores which should return a Hash. This method should return scores of all the participants enrolled in a course.
Issues
- We needed to get the scores of all users for all assignments for a given course.
Implementation
View
A new view was required since this view did not exist previously. This view will be displayed on clicking on the View Scores option in the Actions panel of the courses as highlighted below:
On selecting the highlighted option above, the following page will be displayed:
Method
We needed to write a method and add a view that computed and displayed the scores of all users that have enrolled in a course. We added a method named view_course_scores in the grades_controller.rb. This method does the following steps:
- Creates an empty hash.
- Runs a single SQL query as below:
SELECT assignments.name, user_id, questionnaire_type, avg(score), min(score), max(score) FROM ( SELECT r.id as response_id, team_id, u.name as user_id, (SUM(weight*score)*100)/(sum(weight)*max_question_score) as score, t.parent_id as assignment_id, qs.type as questionnaire_type FROM scores s ,responses r, response_maps rm, questions q, questionnaires qs , users u, teams_users tu , teams t WHERE rm.id = r.map_id AND r.id=s.response_id AND q.id = s.question_id AND qs.id = q.questionnaire_id AND tu.team_id = rm.reviewee_id AND tu.team_id = t.id AND tu.user_id=u.id group by r.id ) as participant_score_aggregate, assignments WHERE assignments.id = participant_score_aggregate.assignment_id AND assignments.course_id = ? group by assignment_id, questionnaire_type
The course id is passed as a parameter to this query.
- The result returned from the query is used to populate the hash. This hash is then used to display the different parts of the view.
Guidelines followed
The following guidelines were followed wherever possible to achieve the above objectives, especially keeping in mind that they must make the implementation fast:
- Single query instead of multiple queries.
- Queries extracted outside loops.
- Avoided method calls where values could be obtained in the same SQL query itself.
Future work
- The SQL queries could possibly be made to work faster if we could come up with a better query plan or use indexing.
- The SQL queries could be compiled into database views. We did not do it right now since we wanted to adhere to the existing hashes that were created and keep the code change as simple and minimal as possible. However, if such is required, the query can be made into a database view.
- The view_my_scores method in the grades_controller can also be made similarly fast by replacing multiple database queries and loops with a single query.