E1835 Refactor delayed mailer and scheduled task

About Expertiza

Expertiza is an open source project based on the Ruby on Rails framework. It allows the instructor to create new assignments, modify existing assignments, and set deadlines. The instructor can also create a list of project topics that students can sign up for. Students can peer review other students' submissions and do teammate reviews, and they can form teams on Expertiza to work on assignments and projects.

Problem Statement

The following tasks were accomplished in this project:

  • Used the Sidekiq gem for asynchronous processing of email jobs
  • Created a new mailer class, MailWorker, that uses Sidekiq's queue to hold and process jobs
  • Defined the perform method to extract the Email IDs of all participants of the corresponding assignment and send an email reminder
  • Replaced the code that uses the existing DelayedMailer queue so that it uses Sidekiq's queue
  • Added RSpec test cases to test the creation of assignment deadlines and the background Sidekiq email job

About Sidekiq

Sidekiq is a background processing framework for Ruby. It lets us scale our application by performing work in the background. It consists of three parts: the client, Redis, and the server. The client runs in a Ruby process and lets us create jobs to be processed later. When a job is created, a Hash representing that job is built and serialized to a JSON string. This string is pushed onto a queue in Redis, which provides data storage for Sidekiq and holds all the job data. Each Sidekiq server process pulls jobs from the queue in Redis and processes them.
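
The client, Redis, and server roles described above can be sketched with the Ruby standard library alone. In this sketch a plain Array stands in for the Redis list, and the names QUEUE, EchoWorker, client_push, and server_process_one are hypothetical. Real Sidekiq adds more fields to the job hash (for example a job id and timestamps) and talks to an actual Redis server.

```ruby
require 'json'

# Stand-in for a Redis list (assumption: real Sidekiq pushes to Redis instead).
QUEUE = []

# Hypothetical worker class that follows the same perform convention.
class EchoWorker
  def perform(assignment_id, deadline_type)
    "reminder for assignment #{assignment_id}: #{deadline_type}"
  end
end

# Client side: describe the job as a Hash and serialize it to a JSON string.
def client_push(queue, worker_class, *args)
  payload = JSON.generate('class' => worker_class.name, 'args' => args)
  queue.push(payload)
  payload
end

# Server side: take the oldest JSON string, rebuild the Hash, and run the job.
def server_process_one(queue)
  payload = queue.shift
  return nil if payload.nil?

  job = JSON.parse(payload)
  worker = Object.const_get(job['class']).new
  worker.perform(*job['args'])
end

client_push(QUEUE, EchoWorker, 1, 'review')
puts server_process_one(QUEUE)
```

Because the job travels as JSON, its arguments should be simple values such as numbers and strings, which is why the worker in this project receives an assignment id instead of an Assignment object.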

Current Implementation

  • Uses the 'delayed_job_active_record' gem for asynchronous processing of delayed jobs
The current implementation uses Delayed::Job to implement a database-backed asynchronous priority queue that stores the email reminder jobs for the various deadline types.
  • Email reminders are sent to users according to each deadline type
The DelayedMailer class includes a perform method that finds the Email IDs of the target users for the deadline type. The subject and body of the email message are then constructed, and the delayed_message method of the Mailer class is used to send the emails to the users.
  • Uses DelayedJob to enqueue email jobs
The DelayedMailer object is enqueued into the DelayedJob queue using the enqueue method. These jobs are meant to be executed as the deadline of the corresponding email job approaches.

Drawbacks and Solutions

Problem 1: The perform method in the DelayedMailer class contains a long chain of if statements, one for each kind of deadline

The perform method in the DelayedMailer class checks the deadline type with a separate if statement for each case and populates the Email IDs of the target users by querying the database. Auxiliary methods such as mail_signed_up_users are invoked, which in turn call the email_reminder method. This makes the implementation cumbersome and violates the DRY principle, since the same email_reminder method is reached through several different branches.

def perform
  assignment = Assignment.find(self.assignment_id)
  if !assignment.nil? && !assignment.id.nil?
    if self.deadline_type == "metareview"
      mail_metareviewers
      team_mails = find_team_members_email
      email_reminder(team_mails, "teammate review") unless team_mails.empty?
    end
    if self.deadline_type == "review"
      mail_reviewers # to all reviewers
    end
    if self.deadline_type == "submission"
      mail_signed_up_users # to all signed up users
    end
    if self.deadline_type == "drop_topic"
      if sign_up_topics?
        mail_signed_up_users # reminder to signed_up users of the assignment
      end
    end
    if self.deadline_type == "signup"
      if sign_up_topics?
        mail_assignment_participants # reminder to all participants
      end
    end
    if self.deadline_type == "team_formation"
      emails = get_one_member_team
      email_reminder(emails, self.deadline_type)
    end
    if self.deadline_type == "drop_one_member_topics"
      drop_one_member_topics if assignment.team_assignment?
    end
    drop_outstanding_reviews if self.deadline_type == "drop_outstanding_reviews"
    perform_simicheck_comparisons(self.assignment_id) if self.deadline_type == "compare_files_with_simicheck"
  end
end

Solution: The implementation has been changed so that all participants of the assignment under consideration receive the email, irrespective of the deadline type. The recipients are found by querying the Participant table using assignment_id and retrieving each Email ID through the linked User record.

  • The Participant table is queried using assignment_id
  • For each participant, the Email ID is retrieved by linking the Participant and User tables.
  • The email reminders are sent collectively to all the email addresses from the previous step.
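
The three steps above can be sketched without a database. This is only an illustration: the User and Participant Structs below are stand-ins and not the Expertiza ActiveRecord models, and participant_emails and PARTICIPANTS are hypothetical names. The field names parent_id, user, and email follow the code shown later on this page.

```ruby
# Minimal stand-ins for the models (assumption: not the real Expertiza classes).
User = Struct.new(:email)
Participant = Struct.new(:parent_id, :user)

# Collect the email address of every participant of one assignment,
# skipping participants that have no associated user.
def participant_emails(participants, assignment_id)
  participants
    .select { |participant| participant.parent_id == assignment_id }
    .map(&:user)
    .compact
    .map(&:email)
end

PARTICIPANTS = [
  Participant.new(1, User.new('alice@example.com')),
  Participant.new(1, nil),
  Participant.new(2, User.new('bob@example.com'))
].freeze

p participant_emails(PARTICIPANTS, 1)
```

The resulting list can then be passed to the mailer in a single call, so no per-deadline query is needed.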

Problem 2: Scalability issue: Delayed Job stores jobs in an SQL database and processes them in a single-threaded process. It is simple to set up, but its performance and scalability are limited. It is not suitable for processing hundreds of thousands of jobs a day.

Solution: Sidekiq stores jobs in Redis, an in-memory key-value store, and processes them with multiple threads in a single process. This makes job processing faster, but it also means the worker code needs to be thread safe.
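
The thread-safety requirement can be illustrated with core Ruby only. The sketch below is not Sidekiq code: SentCounter and process_concurrently are hypothetical names, and incrementing a counter stands in for sending one email. Several threads drain one shared queue, and a Mutex guards the state they all update.

```ruby
# Thread-safe counter: the Mutex serializes updates to the shared state.
class SentCounter
  def initialize
    @count = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @count += 1 }
  end

  def count
    @lock.synchronize { @count }
  end
end

# Work through all jobs on a pool of threads, similar in spirit to one
# multithreaded worker process. Jobs must be non-nil values.
def process_concurrently(jobs, thread_count, counter)
  queue = Queue.new # core Ruby queue that is safe to share between threads
  jobs.each { |job| queue << job }
  queue.close # once closed and empty, pop returns nil

  threads = Array.new(thread_count) do
    Thread.new do
      # Each popped job stands in for one email to send.
      counter.increment while queue.pop
    end
  end
  threads.each(&:join)
  counter.count
end

puts process_concurrently((1..1000).to_a, 5, SentCounter.new)
```

A worker that only reads its arguments and writes to the database or the mailer, without touching shared mutable objects, avoids the need for such locking altogether.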

New Implementation

  • In the new implementation, a new class MailWorker is created in app/mailers/mail_worker.rb. It is a Sidekiq worker that uses the mailers queue. The same three instance variables (assignment id, deadline type, and due date) are used in this class as well.
  • The perform method takes these three parameters and calls the auxiliary method find_participant_emails. This method queries the Participant table to extract the Email IDs of the users participating in the assignment and returns the email list.
  • The email_reminder method is unchanged from the previous implementation.
require 'sidekiq'

class MailWorker
  include Sidekiq::Worker
  # ActionMailer in Rails 4 submits jobs in mailers queue instead of default queue. Rails 5 and onwards
  # ActionMailer will submit mailer jobs to default queue. We need to remove the line below in that case!
  sidekiq_options queue: 'mailers'
  attr_accessor :assignment_id, :deadline_type, :due_at

  def perform(assignment_id, deadline_type, due_at)
    self.assignment_id = assignment_id
    self.deadline_type = deadline_type
    self.due_at = due_at
    assignment = Assignment.find(self.assignment_id)
    participant_mails = find_participant_emails
    if ["drop_one_member_topics", "drop_outstanding_reviews", "compare_files_with_simicheck"].include?(self.deadline_type)
      # These deadline types trigger maintenance work instead of a reminder email
      drop_one_member_topics if self.deadline_type == "drop_one_member_topics" && assignment.team_assignment?
      drop_outstanding_reviews if self.deadline_type == "drop_outstanding_reviews"
      perform_simicheck_comparisons(self.assignment_id) if self.deadline_type == "compare_files_with_simicheck"
    else
      # Can we rename deadline_type "metareview" to "teammate review"? If yes, we do not need the conditional below.
      deadline_text = if self.deadline_type == "metareview"
                        "teammate review"
                      else
                        self.deadline_type
                      end
      email_reminder(participant_mails, deadline_text) unless participant_mails.empty?
    end
  end

  # Builds the reminder message and sends it to all addresses as BCC recipients
  def email_reminder(emails, deadline_type)
    assignment = Assignment.find(self.assignment_id)
    subject = "Message regarding #{deadline_type} for assignment #{assignment.name}"
    body = "This is a reminder to complete #{deadline_type} for assignment #{assignment.name}. " \
           "Deadline is #{self.due_at}. If you have already done the #{deadline_type}, please ignore this mail."
    emails.each do |mail|
      Rails.logger.info mail
    end
    @mail = Mailer.delayed_message(bcc: emails, subject: subject, body: body)
    @mail.deliver_now
  end

  # Returns the email addresses of all participants of the assignment
  def find_participant_emails
    emails = []
    participants = Participant.where(parent_id: self.assignment_id)
    participants.each do |participant|
      emails << participant.user.email unless participant.user.nil?
    end
    emails
  end
end
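
The comment in perform asks whether the "metareview" special case could be avoided. One possible alternative, shown here only as a sketch and not as the project's code, is a lookup table from deadline type to display text that falls back to the type itself. The names DEADLINE_DISPLAY_NAMES, MAINTENANCE_DEADLINE_TYPES, and reminder_text_for are hypothetical.

```ruby
# Hypothetical mapping from a stored deadline type to the wording used in emails.
DEADLINE_DISPLAY_NAMES = { 'metareview' => 'teammate review' }.freeze

# Deadline types that trigger maintenance work instead of a reminder email,
# as listed in MailWorker#perform.
MAINTENANCE_DEADLINE_TYPES = %w[
  drop_one_member_topics
  drop_outstanding_reviews
  compare_files_with_simicheck
].freeze

# Return the text to use in the reminder, or nil when no reminder is sent.
def reminder_text_for(deadline_type)
  return nil if MAINTENANCE_DEADLINE_TYPES.include?(deadline_type)

  DEADLINE_DISPLAY_NAMES.fetch(deadline_type, deadline_type)
end

puts reminder_text_for('metareview')
puts reminder_text_for('submission')
```

With such a table, adding another renamed deadline type would mean adding one hash entry instead of another conditional.
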
  • app/models/assignment_form.rb is modified to queue the job through Sidekiq so that it is processed when the deadline approaches.
  • The add_delayed_job method now uses Sidekiq's perform_in method to store the email job in the job queue. The job is executed once the interval given in the first argument (in seconds) has elapsed.
  • The corresponding job id is returned for further processing.
  • The get_time_diff_btw_due_date_and_now method calculates the time after which the email has to be sent; its behavior is unchanged.
def add_to_delayed_queue
  duedates = AssignmentDueDate.where(parent_id: @assignment.id)
  duedates.each do |due_date|
    deadline_type = DeadlineType.find(due_date.deadline_type_id).name
    diff_btw_time_left_and_threshold, min_left = get_time_diff_btw_due_date_and_now(due_date)
    next unless diff_btw_time_left_and_threshold > 0
    delayed_job_id = add_delayed_job(@assignment, deadline_type, due_date, diff_btw_time_left_and_threshold)
    due_date.update_attribute(:delayed_job_id, delayed_job_id)
    # If the deadline type is review, add a job to drop outstanding reviews
    add_delayed_job(@assignment, "drop_outstanding_reviews", due_date, min_left) if deadline_type == "review"
    # If the deadline type is team_formation, add a job to drop one-member teams
    next unless deadline_type == "team_formation" && @assignment.team_assignment?
    add_delayed_job(@assignment, "drop_one_member_topics", due_date, min_left)
  end
end

def get_time_diff_btw_due_date_and_now(due_date)
  due_at = Time.parse(due_date.due_at.to_s(:db))
  time_left_in_min = find_min_from_now(due_at)
  # threshold is multiplied by 60 so that it can be subtracted from the minutes left
  diff_btw_time_left_and_threshold = time_left_in_min - due_date.threshold * 60
  [diff_btw_time_left_and_threshold, time_left_in_min]
end

# Enqueue a Sidekiq job that runs after min_left minutes and return its job id
def add_delayed_job(assignment, deadline_type, due_date, min_left)
  delayed_job_id = MailWorker.perform_in(min_left * 60, due_date.parent_id, deadline_type, due_date.due_at)
  change_item_type(delayed_job_id)
  delayed_job_id
end
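
The delay passed to perform_in follows from the arithmetic in get_time_diff_btw_due_date_and_now. The worked example below is a sketch under two assumptions: the threshold is expressed in hours (inferred from the multiplication by 60), and find_min_from_now, which is not shown on this page, returns the whole minutes between now and the due time. The helper name reminder_delay_seconds is hypothetical, and the current time is passed in explicitly so the result is reproducible.

```ruby
# Hypothetical helper mirroring the delay arithmetic.
# Returns the delay in seconds, or nil when the reminder time has already passed.
def reminder_delay_seconds(due_at, threshold_hours, now)
  minutes_left = ((due_at - now) / 60).to_i         # minutes until the deadline
  delay_minutes = minutes_left - threshold_hours * 60
  return nil unless delay_minutes.positive?

  delay_minutes * 60                                # perform_in expects seconds
end

now = Time.utc(2018, 12, 1, 0, 0, 0)
due = Time.utc(2018, 12, 3, 0, 0, 0)

# 48 hours left, reminder 24 hours before the deadline:
# 2880 - 1440 = 1440 minutes, which is 86400 seconds.
puts reminder_delay_seconds(due, 24, now)
```

When the deadline is closer than the threshold, the difference is not positive and no reminder job is queued, which matches the `next unless diff_btw_time_left_and_threshold > 0` guard.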

Test Plan

Testing using RSpec

We modified the existing test cases to replace the delayed job queue with Sidekiq, so that they cover all the changes made to the assignment_form class. We also added an RSpec test for the Sidekiq mailer.

require 'sidekiq/testing'

describe MailWorker do
  let(:assignment) { build(:assignment, id: 1, name: "no assignment", participants: [participant], teams: [team]) }
  let(:participant) { build(:participant, id: 1, parent_id: 1, user: user) }
  let(:team) { build(:assignment_team, id: 1, name: 'no team', users: [user], parent_id: 1) }
  let(:user) { build(:student, id: 1, email: 'psingh22@ncsu.edu') }
  
  
  before(:each) do
    allow(Assignment).to receive(:find).with('1').and_return(assignment)
    allow(Participant).to receive(:where).with(parent_id: '1').and_return([participant])
  end

  describe 'Tests mailer with sidekiq' do
    
    it "should send email to required email address with proper content" do
      Sidekiq::Testing.inline!
      MailWorker.perform_async("1", "metareview", "2018-12-31 00:00:01")
      email = ActionMailer::Base.deliveries.first
      expect(email.from[0]).to eq("expertiza.development@gmail.com")
      expect(email.bcc[0]).to eq(user.email)
      expect(email.subject).to eq('Message regarding teammate review for assignment '+ assignment.name)
    end

    it "should expect the queue size of one" do
      Sidekiq::Testing.fake!
      MailWorker.perform_in(3.hours, "1", "metareview", "2018-12-31 00:00:01")
      queue = Sidekiq::Queues["mailers"]
      expect(queue.size).to eq(1)
    end
  end
end

The tests can be executed with the rspec command as shown below:

user-expertiza $ rspec spec/sidekiq_mail_worker_spec.rb

Finished in 3.19 seconds
2 examples, 0 failures

Testing from UI

1. Open the following URL and log in as an instructor: http://152.46.19.3:8080

2. Create a new assignment, fill in the rubric and other required fields, and add a late policy: http://152.46.19.3:8080/assignments/new?private=0

3. The newly scheduled jobs are listed here: http://152.46.19.3:8080/sidekiq/scheduled
