Use of bullet gem and .includes to speed up db accesses

From Expertiza_Wiki

Revision as of 17:10, 20 December 2021

Introduction

Team

Dr. Gehringer (mentor)

  • Harsh Kachhadia (hmkachha)
  • Tirth Patel (tdpatel2)

Background

N+1 problem

One of the most common performance problems in ORMs is the N+1 query problem. It arises when the application runs one query to fetch a list of N records and then, while looping over that list, runs one more query per record to fetch associated data, for a total of N+1 queries. The problem often goes unnoticed in small applications, where the data set is small and there are few requests and queries. In an application that must scale, however, it is important to handle the N+1 query problem from the start; otherwise performance degrades as the data and the codebase grow and more queries are fired.
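
The pattern can be illustrated without Rails. The following is a minimal sketch in plain Ruby, assuming a hypothetical in-memory FakeDb class (not part of Rails or Expertiza) that simply counts how many "queries" it receives; the participant and course data are made up for illustration.

```ruby
# A tiny in-memory stand-in for a database that counts every "query".
# FakeDb and its methods are hypothetical; they are not part of Rails.
class FakeDb
  attr_reader :query_count

  def initialize(participants, courses)
    @participants = participants # array of hashes: { id:, course_id: }
    @courses = courses           # hash: course_id => course name
    @query_count = 0
  end

  # One query that returns every participant row.
  def all_participants
    @query_count += 1
    @participants
  end

  # One query that returns a single course name by id.
  def find_course(course_id)
    @query_count += 1
    @courses[course_id]
  end
end

participants = (1..5).map { |i| { id: i, course_id: (i % 2) + 1 } }
courses = { 1 => "Course A", 2 => "Course B" }
db = FakeDb.new(participants, courses)

# Lazy loading: 1 query for the list, then 1 more query per participant.
names = db.all_participants.map { |row| db.find_course(row[:course_id]) }

puts "queries fired: #{db.query_count}" # prints "queries fired: 6" (1 + N, N = 5)
```

With 5 participants the loop fires 6 queries; the count grows linearly with the number of participants.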

Solution

How eager loading solves the problem

To solve the N+1 problem, we use a technique called 'eager loading'. With lazy loading, a query fetches only the resource requested at that moment; if associated resources or entities are needed later, additional queries are fired at the database. Inside a loop construct, this multiplies the number of queries and slows page loads. With eager loading, the associated entities that will be needed shortly are loaded up front, together with the main resource. In loops such as .each, eager loading significantly lowers page load time, because the associated data is already in memory and no per-iteration queries are fired.

For instance, suppose we fetch all participants and then loop over them with .each to display each participant's course name. With eager loading, the courses for all participants are loaded in bulk at the time the participants are fetched. With lazy loading, a separate query is fired for each participant's course.
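
The participant and course scenario can be sketched in plain Ruby. This is only an illustration of the batching idea, assuming a hypothetical in-memory FakeDb class that counts "queries"; it is not how Rails implements eager loading internally, and the data is made up.

```ruby
# A tiny in-memory stand-in for a database that counts every "query".
# FakeDb and its methods are hypothetical; they are not part of Rails.
class FakeDb
  attr_reader :query_count

  def initialize(participants, courses)
    @participants = participants # array of hashes: { id:, course_id: }
    @courses = courses           # hash: course_id => course name
    @query_count = 0
  end

  # One query that returns every participant row.
  def all_participants
    @query_count += 1
    @participants
  end

  # One query that returns all courses whose id is in the given list.
  def courses_by_ids(ids)
    @query_count += 1
    @courses.select { |id, _name| ids.include?(id) }
  end
end

participants = (1..5).map { |i| { id: i, course_id: (i % 2) + 1 } }
courses = { 1 => "Course A", 2 => "Course B" }
db = FakeDb.new(participants, courses)

# Eager loading: fetch the list, then fetch all needed courses in one batch.
rows = db.all_participants
course_lookup = db.courses_by_ids(rows.map { |row| row[:course_id] }.uniq)

# The loop now reads from memory and fires no further queries.
names = rows.map { |row| course_lookup[row[:course_id]] }

puts "queries fired: #{db.query_count}" # prints "queries fired: 2"
```

The query count stays at 2 no matter how many participants are in the list.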

How 'includes' method helps in eager loading

In Rails, the 'includes' method specifies which model associations should be loaded along with the result set of the query being fired at the database.

For example: current_user.participants.includes(:assignment, assignment: [:course]).each { |p| course_ids << p.assignment.course.id if p.assignment and p.assignment.course }

Here participants.includes(:assignment, assignment: [:course]) retrieves the assignment and course records for all participants in a small, fixed number of queries: typically 1 for participants, 1 for their assignments, and 1 for those assignments' courses. When the each loop then accesses 'p.assignment' or 'p.assignment.course', no new query is fired. Without 'includes', 1000 participants would trigger roughly 2000 additional queries (one for each assignment and one for each course); with 'includes', the count stays at about 3 regardless of the number of participants.
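
The arithmetic behind this comparison can be written out as a short sketch. It assumes one query per table when associations are preloaded, which is how the default preloading strategy of 'includes' usually behaves; the function names are hypothetical and exist only for this illustration.

```ruby
# Queries fired when each of n records lazily loads `levels` chained
# associations (e.g. participant -> assignment -> course is 2 levels):
# 1 query for the list, then 1 query per record per level.
def lazy_query_count(n, levels)
  1 + n * levels
end

# Queries fired when the associations are preloaded, assuming one query
# per table: 1 for the list plus 1 for each association level.
def eager_query_count(levels)
  1 + levels
end

n = 1000
puts lazy_query_count(n, 2) # prints 2001
puts eager_query_count(2)   # prints 3
```

The lazy count grows with the number of participants, while the eager count depends only on how many associations are loaded.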

Bullet Gem

How bullet gem helps in solving N+1 problem

The Bullet gem is designed to help reduce the number of queries an application makes, which improves application performance. While the application runs, Bullet watches the queries it fires and notifies the developer where eager loading should be added (N+1 queries) and where existing eager loading is unnecessary because the loaded associations are never used.

How to use bullet gem in rails

Adding bullet gem:

To Install:

You can install it as a gem: gem install bullet

or add this into a Gemfile (Bundler): gem 'bullet', group: 'development'

To enable the Bullet gem with generate command: bundle exec rails g bullet:install

The generate command creates the default configuration and may ask whether to enable Bullet in the test environment as well. See below for custom configuration in development.rb.

Once everything is set up, you can run your application. Bullet reports only on code paths that are actually exercised while you use the application, flagging the places where eager loading is needed. The Bullet log can be found at 'log/bullet.log'.
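
A custom configuration in development.rb might look like the following sketch. The options shown are standard Bullet settings, but which of them Expertiza's development.rb actually enables is not stated here, so the selection is an assumption.

```ruby
# config/environments/development.rb (fragment)
# The set of options enabled here is an assumption for illustration.
Rails.application.configure do
  config.after_initialize do
    Bullet.enable = true        # turn Bullet on
    Bullet.bullet_logger = true # write findings to log/bullet.log
    Bullet.rails_logger = true  # also write findings to the Rails log
    Bullet.console = true       # print warnings to the browser console
    Bullet.add_footer = true    # show findings in a footer on each page
  end
end
```

Setting Bullet.bullet_logger to true is what produces the bullet.log file.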

Overview of changes we made during this project

Using the Bullet gem described above, we located the places in the Expertiza code where eager loading needs to be implemented by examining the stack traces that Bullet recorded in the bullet.log file. Bullet identifies these code locations as we navigate through the website.

For this project, we targeted the following areas of Expertiza, which load very slowly and need eager loading.

tree_display_controller

  • Courses
  • Assignments
  • Questionnaires
  • Users

grades_controller

  • view (View scores)

review_mapping_controller

  • list_mappings

reports_controller

  • Review report
  • Author-feedback report
  • Teammate-review report
  • Answer-tagging report

student_task_controller

  • Impersonating a student

We navigated through the parts of the UI that use the code in the controllers listed above. While we did so, Bullet identified the locations in these files where eager loading has to be implemented using the 'includes' method.

Files that are modified in this project

  • tree_display_controller.rb
  • feedback_response_map
  • instructor.rb
  • student_task.rb
  • user.rb
  • _feedback_report.html.erb
  • _flash_notifications.html.erb
  • development.rb
  • Gemfile

Code changes to resolve N+1 Problem in Expertiza

  • tree_display_controller.rb

Explanation: Here FolderNode.includes(:folder) retrieves the folder records for all folder nodes using 2 queries (1 for the folder nodes, 1 for their associated folders), so that when the each loop accesses the folder for a folder node, no new query is fired. This reduces the number of queries to 2, compared with one extra query per folder node, which slows down the system.

  • feedback_response_map

Explanation: Here AssignmentTeam.includes(:users) retrieves the user records for all teams using a small, fixed number of queries (1 for the teams, plus the queries needed to load their associated users), so that when the each loop accesses the users of a team, no new query is fired. This replaces one extra query per team, which slows down the system, with a query count that does not grow with the number of teams.

  • instructor.rb

Explanation: Here assignments.includes(:participants) retrieves the participant records for all assignments using 2 queries (1 for the assignments, 1 for their associated participants), so that when the each loop accesses the participants of an assignment, no new query is fired. This reduces the number of queries to 2, compared with one extra query per assignment, which slows down the system.

Here p_s.includes(:user, user: [:role]) retrieves the user and role records for all participants using a small, fixed number of queries (typically 1 for the participants, 1 for their users, and 1 for those users' roles), so that when the each loop accesses the user and role records for a participant, no new query is fired. This replaces the extra queries fired per participant, which slow down the system, with a query count that does not grow with the number of participants.

  • student_task.rb

Explanation: Here assignment_participants.includes([:assignment, :topic]) retrieves the assignment and topic records for all participants using a small, fixed number of queries (1 for the participants, plus the queries needed to load the assignment and topic associations), so that when the each loop accesses the assignment and topic records for a participant, no new query is fired. This replaces the extra queries fired per participant, which slow down the system, with a query count that does not grow with the number of participants.


  • user.rb

Explanation: Here User.includes(:parent, :role, parent: [:parent, :role]) retrieves the parent and role records for all users, along with each parent's own parent and role, using a small, fixed number of queries (1 for the users, plus one for each preloaded association), so that when the each loop accesses the parent and role records for a user, no new query is fired. This replaces the extra queries fired per user, which slow down the system, with a query count that does not grow with the number of users.

  • _feedback_report.html.erb

Explanation: Here @review_responses.includes([:response_map]) retrieves the response map records for all responses using 2 queries (1 for the responses, 1 for their associated response maps), so that when the each loop accesses the response map for a response, no new query is fired. This reduces the number of queries to 2, compared with one extra query per response, which slows down the system.

  • _flash_notifications.html.erb

Explanation: Here participants.includes(:assignment, assignment: [:course]) retrieves the assignment and course records for all participants in a small, fixed number of queries (typically 1 for participants, 1 for their assignments, and 1 for those assignments' courses), so that when the each loop accesses 'p.assignment' or 'p.assignment.course', no new query is fired. Without 'includes', 1000 participants would trigger roughly 2000 additional queries; with 'includes', the count stays at about 3 regardless of the number of participants.

Testing plan

We performed both automated and manual testing for this project to improve confidence in the implementation.

Because only a few lines of code were modified in the files listed above, and because this is essentially a refactoring project, automated testing consisted of checking that the existing tests still pass.

Manual testing was done by navigating the UI to the pages that use the code in the modified files. All manual tests passed.

Important Links