CSC/ECE 517 Spring 2015/oss E1510 FLP: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
 
(77 intermediate revisions by 3 users not shown)
Line 1: Line 1:
<font size="6"><b> Expertiza - Fix Instructor Login Performance Issue </b></font>
<font size="6"><b> Expertiza - Fix Instructor Login Performance Issue </b></font>


Expertiza is a web application where students can submit and peer-review learning objects (articles, code, web sites, etc). It is used in select courses at NC State and by professors at several other colleges and universities.<ref>[https://github.com/expertiza/expertiza Expertiza on GitHub]</ref>
Expertiza is a web application where students can submit and peer-review learning objects (articles, code, web sites, etc). It is used in select courses at NC State and by professors at several other colleges and universities.<ref>[https://github.com/expertiza/expertiza Expertiza on GitHub (low performance)]</ref>
__TOC__
__TOC__


==Project Description==
==Project Description==
Currently when an Instructor logs into Expertiza,there a lot of select* from assignments queries being fired on database which would have an adverse effect on performance. We analyzed the source of this issue and made some changes, which reduced the number of select queries executed. The performance is high improved.
Currently when an Instructor logs into Expertiza,there are a lot of select* from assignments queries being fired on database which would have an adverse effect on performance. We analyzed the source of this issue and made some changes, which reduced the number of select queries executed. The performance is high improved.
The mission involved is tracing the source of the issue and modify the code to fix the issue.
The mission involved is tracing the source of the issue and modify the code to fix the issue.


===Performance issue before modification===
===Performance issue before modification===
Before the improvement is achieved, each time an instructor logs into Expertiza, there would be several database queries, which are more than needed and take more time to render the view. Below is the console outputs when an instructor logs in.
[[File:before_modify.png|center]]
There are 7 assignment queries after load <code>_row_header.html.erb</code>, 3 assignment queries after load <code>_shared_actions.html.erb</code>, and 3 assignment queries after load <code>_assignments_actions</code>. All of these queries are same. These duplicate queries have an adverse effect on the performance.
==Locate source of issues==
To locate the files that generate the duplication of queries, we use a Gem called QueryReviewer<ref>https://github.com/nesquena/query_reviewer</ref> to help trace the datebase queries. QueryReviewer is an advanced SQL query analyzer<ref>https://github.com/nesquena/query_reviewer</ref>. It accomplishes the following goals:
* View all <code>EXPLAIN</code> output for all <code>SELECT</code> queries to generate a page
* Rate a page's SQL usage into one of three categories: <code>OK, WARNING, CRITICAL</code>
* Attach meaningful warnings to individual queries, and collections of queries
* Display interactive summary on page
===Install query_reviewer===
Installing query_reviewer is simple. All you have to do to install it into your Rails 2 or 3 project is
<pre>
<pre>
   Assignment Load (0.1ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1
   gem install query_reviewer
  CACHE (0.0ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
</pre>
  CACHE (0.0ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
 
  CACHE (0.0ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
Right now if you use bundler, simply add this to your <code>Gemfile</code>:
  CACHE (0.2ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
  CACHE (0.1ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
  CACHE (0.1ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
  Rendered tree_display/_row_header.html.erb (17.4ms)


  CACHE (0.0ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
<pre>
   Assignment Load (0.2ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1
   # Gemfile
   CACHE (0.0ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
   gem "query_reviewer"
  Rendered tree_display/actions/_shared_actions.html.erb (2.6ms)
</pre>


  CACHE (0.0ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
or to use the latest from github:
   SignUpTopic Load (0.1ms)  SELECT  `sign_up_topics`.* FROM `sign_up_topics`  WHERE (assignment_id = 1)  ORDER BY `sign_up_topics`.`id` ASC LIMIT 1
<pre>
   CACHE (0.0ms)  SELECT  `assignments`.* FROM `assignments`  WHERE `assignments`.`id` = 1 LIMIT 1  [["id", 1]]
   # Gemfile
  Rendered tree_display/actions/_assignments_actions.html.erb (8.8ms)
   gem "query_reviewer", :github => "nesquena/query_reviewer"
</pre>
</pre>
we can see that there are 7 assignment queries after load <code>_row_header.html.erb</code>, 3 assignment queries after load <code>_shared_actions.html.erb</code>, 3 assignment queries after load <code>_assignments_actions</code>. All these have an adverse effect on the performance.


==Locate bug==
After installing and running this Gem on the Rails application, it would pop-up database query information on the webpage, as is shown below.
 
[[File:Query_Reviewer.jpg|center]]
 
We can follow the trace and locate the source of the issue in <code>assignment_node.rb</code>, also the source code in <code>_assignment_action.html.erb</code>


===Install query_reviewer===
===Analyze code===
===Analyze code===
We log into Expertiza as an instructor with the QueryReviewer activate and watch the output of QueryReviewer. The above figure shows the tracing result of the QueryReviewer and it tells us that we should pay more attention to the three files: assignment_node.rb, _row_header.html.erb and _entry.html.erb.
==== <code>_assignment_actions.html.erb</code> ====
==== <code>_assignment_actions.html.erb</code> ====
In file <code>_assignment_actions.html.erb</code>, we can find in 3 places use <code>Assignment.find(node.node_object_id)</code>, which is actually a database query request.
In file [https://github.com/expertiza/expertiza/blob/rails4/app/views/tree_display/actions/_assignments_actions.html.erb<code>_assignment_actions.html.erb</code>], we can find in 3 places use <code>Assignment.find(node.node_object_id)</code>, which is actually a database query request.
<pre>
<pre>
1  <% owner_id = Object.const_get('Assignment').find(node.node_object_id).instructor_id %>
1  <% owner_id = Object.const_get('Assignment').find(node.node_object_id).instructor_id %>
Line 47: Line 61:


==== <code>_row_header.html.erb</code> and <code>assignment_node.rb</code> ====
==== <code>_row_header.html.erb</code> and <code>assignment_node.rb</code> ====
In <code>_row_header.html.erb</code>:
In [https://github.com/expertiza/expertiza/blob/rails4/app/views/tree_display/_row_header.html.erb <code>_row_header.html.erb</code>]:
<pre>
<pre>
26  <% if parent_node.get_name == 'Courses'&& session[:user].role_id == 6 %>
26  <% if parent_node.get_name == 'Courses'&& session[:user].role_id == 6 %>
Line 68: Line 82:
* <code>parent_node.get_creation_date</code> method
* <code>parent_node.get_creation_date</code> method
* <code>parent_node.get_modified_date</code> method
* <code>parent_node.get_modified_date</code> method
However, in <code>assignment_node.rb</code>:
However, in [https://github.com/expertiza/expertiza/blob/rails4/app/models/assignment_node.rb <code>assignment_node.rb</code>]:
<pre>
<pre>
6  class AssignmentNode < Node
6  class AssignmentNode < Node
Line 94: Line 108:
82 end
82 end
</pre>
</pre>
We find all these method called <code>Assignment.find(self.node_object_id)</code> method, which is actually a database query request.
We find all these method called <code>Assignment.find(self.node_object_id)</code> method, which is actually a database query request. These 'find's are actually redundant for a specific Assignment instance. So we will figure out a way to eliminate those redundant method calls.


==Modifications==
==Modifications==


===_assignment_actions.html.erb===
===_assignment_actions.html.erb===
An instance variable <code>@assignment</code> is used here instead of multiple times of calling <code>.find</code> method, thus reducing database queries.
<pre>
<pre>
+ <% @assignment = Assignment.find(node.node_object_id) %>
+ <% @assignment = Assignment.find(node.node_object_id) %>
- <% owner_id = Object.const_get('Assignment').find(node.node_object_id).instructor_id %>
+ <% owner_id = @assignment.instructor_id %>
+ <% owner_id = @assignment.instructor_id %>
...
...
- <% @assignment = Assignment.find(node.node_object_id) %>
...
- <% if Assignment.find(node.node_object_id).course_id %>
+ <% if @assignment.course_id %>
+ <% if @assignment.course_id %>
...
...
Line 108: Line 127:


===assignment_node.rb===
===assignment_node.rb===
In this class, we define a new variable <code>@assign_node</code>, and in every <code>get_</code> method, instead of using <code>Assignment.find(self.node_object_id)</code> to find the AssignmentNode object, we first search if <code>@assign_node</code> variable exist, if not, we define <code>@assign_node = Assignment.find(self.node_object_id)</code>, else we directly use <code>@assign_node</code>. Because <code>_row_header.html.erb</code> file uses many <code>get_</code> method, after modification, the number of assignment queries decreases much.
<pre>
<pre>
class AssignmentNode < Node
# Gets the name from the associated object
  def get_name
    unless @assign_node
    @assign_node = Assignment.find(self.node_object_id)
    end
    @assign_node.name
  end
 
  # Gets the directory_path from the associated object
  def get_directory
    #Assignment.find(self.node_object_id).directory_path
    unless @assign_node
    @assign_node = Assignment.find(self.node_object_id)
    end
    @assign_node.directory_path
  end
 
  # Gets the created_at from the associated object
  def get_creation_date
    #Assignment.find(self.node_object_id).created_at
    unless @assign_node
    @assign_node = Assignment.find(self.node_object_id)
    end
    @assign_node.created_at
  end
 
  # Gets the updated_at from the associated object
  def get_modified_date
    #Assignment.find(self.node_object_id).updated_at
    unless @assign_node
  @assign_node = Assignment.find(self.node_object_id)
    end
    @assign_node.updated_at
  end
end
</pre>
</pre>


== Performance Improvment==
== Performance Improvement==
 
* Originally there are 13 queries after loading these related pages, and now there are only 5 of them. As shown below.
* The time consumed for rendering the view is 943 sec before and 284 after the modification.
We can see the performance has been highly improved. If we have a larger database, the improvement would be more significant.
[[File:after_modify.png|center]]


==Conclusion==
Before our modification, when an instructor open the expertiza website and log in, it takes much time to show the page after log in, especially when a larger database hiding behind. We saw the problem on the console windows which emerged as a duplication of unnecessary queries. After our modification, the duplication disappeared and the time consumption is reduced significantly. Now when an instructor log in expertiza, it takes shorter time to get to page after log in. Please check our new version of [https://github.com/fwu8/expertiza expertiza] on github.


==References==
==References==
<references/><br>
<references/><br>

Latest revision as of 03:11, 24 March 2015

Expertiza - Fix Instructor Login Performance Issue

Expertiza is a web application where students can submit and peer-review learning objects (articles, code, web sites, etc). It is used in select courses at NC State and by professors at several other colleges and universities.<ref>Expertiza on GitHub (low performance)</ref>

Project Description

Currently when an Instructor logs into Expertiza,there are a lot of select* from assignments queries being fired on database which would have an adverse effect on performance. We analyzed the source of this issue and made some changes, which reduced the number of select queries executed. The performance is high improved. The mission involved is tracing the source of the issue and modify the code to fix the issue.

Performance issue before modification

Before the improvement is achieved, each time an instructor logs into Expertiza, there would be several database queries, which are more than needed and take more time to render the view. Below is the console outputs when an instructor logs in.

There are 7 assignment queries after load _row_header.html.erb, 3 assignment queries after load _shared_actions.html.erb, and 3 assignment queries after load _assignments_actions. All of these queries are same. These duplicate queries have an adverse effect on the performance.

Locate source of issues

To locate the files that generate the duplication of queries, we use a Gem called QueryReviewer<ref>https://github.com/nesquena/query_reviewer</ref> to help trace the datebase queries. QueryReviewer is an advanced SQL query analyzer<ref>https://github.com/nesquena/query_reviewer</ref>. It accomplishes the following goals:

  • View all EXPLAIN output for all SELECT queries to generate a page
  • Rate a page's SQL usage into one of three categories: OK, WARNING, CRITICAL
  • Attach meaningful warnings to individual queries, and collections of queries
  • Display interactive summary on page

Install query_reviewer

Installing query_reviewer is simple. All you have to do to install it into your Rails 2 or 3 project is

  gem install query_reviewer

Right now if you use bundler, simply add this to your Gemfile:

  # Gemfile
  gem "query_reviewer"

or to use the latest from github:

  # Gemfile
  gem "query_reviewer", :github => "nesquena/query_reviewer"

After installing and running this Gem on the Rails application, it would pop-up database query information on the webpage, as is shown below.

We can follow the trace and locate the source of the issue in assignment_node.rb, also the source code in _assignment_action.html.erb

Analyze code

We log into Expertiza as an instructor with the QueryReviewer activate and watch the output of QueryReviewer. The above figure shows the tracing result of the QueryReviewer and it tells us that we should pay more attention to the three files: assignment_node.rb, _row_header.html.erb and _entry.html.erb.

_assignment_actions.html.erb

In file _assignment_actions.html.erb, we can find in 3 places use Assignment.find(node.node_object_id), which is actually a database query request.

1  <% owner_id = Object.const_get('Assignment').find(node.node_object_id).instructor_id %>
...
11 <% @assignment = Assignment.find(node.node_object_id) %>
...
52 <% if Assignment.find(node.node_object_id).course_id %>
...

_row_header.html.erb and assignment_node.rb

In _row_header.html.erb:

26  <% if parent_node.get_name == 'Courses'&& session[:user].role_id == 6 %>
27  <%else%>
28    <%= parent_node.get_name %>
29    <% if parent_node.get_directory != nil%>
30      <BR/><I>Directory:</I> <%= parent_node.get_directory %>
31    <% end %>
32    <% if parent_node.get_creation_date != nil %>
33      <BR/><I>Creation Date:</I> <%= parent_node.get_creation_date %>
34    <% end %>
35    <% if parent_node.get_modified_date != nil %>
36      <BR/><I>Updated Date:</I> <%= parent_node.get_modified_date %>
37    <% end %>
38  <%end%>

we can see in this view, parent_node is a AssignmentNode object, it execute:

  • parent_node.get_name method
  • parent_node.get_directory method
  • parent_node.get_creation_date method
  • parent_node.get_modified_date method

However, in assignment_node.rb:

6  class AssignmentNode < Node
   ...   
58   # Gets the name from the associated object
59   def get_name
60     Assignment.find(self.node_object_id).name
61   end
62 
63   # Gets the directory_path from the associated object
64   def get_directory
65     Assignment.find(self.node_object_id).directory_path
66   end
67 
68   # Gets the created_at from the associated object
69   def get_creation_date
70     Assignment.find(self.node_object_id).created_at
71   end
72
73   # Gets the updated_at from the associated object
74   def get_modified_date
75     Assignment.find(self.node_object_id).updated_at
76   end
   ...
82 end

We find all these method called Assignment.find(self.node_object_id) method, which is actually a database query request. These 'find's are actually redundant for a specific Assignment instance. So we will figure out a way to eliminate those redundant method calls.

Modifications

_assignment_actions.html.erb

An instance variable @assignment is used here instead of multiple times of calling .find method, thus reducing database queries.

+ <% @assignment = Assignment.find(node.node_object_id) %>
- <% owner_id = Object.const_get('Assignment').find(node.node_object_id).instructor_id %>
+ <% owner_id = @assignment.instructor_id %>
...
- <% @assignment = Assignment.find(node.node_object_id) %>
...
- <% if Assignment.find(node.node_object_id).course_id %>
+ <% if @assignment.course_id %>
...

assignment_node.rb

In this class, we define a new variable @assign_node, and in every get_ method, instead of using Assignment.find(self.node_object_id) to find the AssignmentNode object, we first search if @assign_node variable exist, if not, we define @assign_node = Assignment.find(self.node_object_id), else we directly use @assign_node. Because _row_header.html.erb file uses many get_ method, after modification, the number of assignment queries decreases much.

class AssignmentNode < Node
# Gets the name from the associated object
  def get_name
    unless @assign_node
    	@assign_node = Assignment.find(self.node_object_id)
    end
    @assign_node.name
  end

  # Gets the directory_path from the associated object
  def get_directory
    #Assignment.find(self.node_object_id).directory_path
    unless @assign_node
    	@assign_node = Assignment.find(self.node_object_id)
    end
    @assign_node.directory_path
  end

  # Gets the created_at from the associated object
  def get_creation_date
    #Assignment.find(self.node_object_id).created_at
    unless @assign_node
    	@assign_node = Assignment.find(self.node_object_id)
    end
    @assign_node.created_at
  end

  # Gets the updated_at from the associated object
  def get_modified_date
    #Assignment.find(self.node_object_id).updated_at
    unless @assign_node
   	@assign_node = Assignment.find(self.node_object_id)
    end
    @assign_node.updated_at
  end
end

Performance Improvement

  • Originally there are 13 queries after loading these related pages, and now there are only 5 of them. As shown below.
  • The time consumed for rendering the view is 943 sec before and 284 after the modification.

We can see the performance has been highly improved. If we have a larger database, the improvement would be more significant.

Conclusion

Before our modification, when an instructor open the expertiza website and log in, it takes much time to show the page after log in, especially when a larger database hiding behind. We saw the problem on the console windows which emerged as a duplication of unnecessary queries. After our modification, the duplication disappeared and the time consumption is reduced significantly. Now when an instructor log in expertiza, it takes shorter time to get to page after log in. Please check our new version of expertiza on github.

References

<references/>