CSC/ECE 517 Spring 2020/E2012. refactor lottery controller.rb: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
(RSpec Testing)
(Future Improvements)
Line 218: Line 218:
# Assigning open topics to teams that do not have a topic yet
# Assigning open topics to teams that do not have a topic yet
# Matching teams to topics based on their weighted bids
# Matching teams to topics based on their weighted bids
= Future Improvements =
# Improving the RSpec test for assigning open topics for teams without topics. The new test only checks for 1 open assignment and 1 unassigned team. The method is capable of assigning multiple open topics for multiple unassigned teams, but testing for this would require more Factory objects for assignments and teams. While the new test provides code coverage for the entire method, the nested loops currently only run through a single time before completion, which is not always the case in a realistic setting.

Revision as of 00:13, 24 March 2020

Introduction

Expertiza Background

Expertiza is an open-source project written using the Ruby on Rails framework. The Expertiza platform aims to improve student learning by using active and cooperative learning, allowing distance education students to participate in active learning, and by discouraging plagiarism through the use of multiple deadlines. Expertiza helps to improve teaching and resource utilization by having students contribute to the creation of course materials and through peer-graded assignments. Expertiza manages objects related to a course such as instructors, TAs, students, and other users. It also manages assignments, teams, signup topics, rubrics, peer grading, and peer evaluations.

Lottery Controller Background

An assignment can have a list of signup topics. The teams associated with the assignment can bid on the signup topics to try and get assigned one they find more favorable. At a high level, the lottery controller assigns teams to topics based on the priorities the team gave to each signup topic during the bidding process.

In more detail, each student starts off on a team of size 1. They can bid on signup topics by arranging them in priority order. A team can invite other users to join their team. If student2 from Team B joins Team A which includes student1, student2 will lose their bids and take on the bids of TeamA and student1. When the lottery controller is called to run its method run_intelligent_assignment, a web service takes the bidding data from each team and returns a new list of teams, each of which is closer to the maximum team size specified for the assignment. The web service combines teams together that have similar bid data and then assigns those new teams to topics, giving each team their top bid on a topic that hasn't been assigned yet. Teams with larger team sizes and more bids are assigned their topics first.

Problem Statement

Code Climate currently grades lottery_controller.rb at a C level, finding issues with the controller's long methods and complex statements. Most methods are around 40 lines and hard to decipher with a quick look. Also, the file currently has only 10% test coverage. Our goal is to resolve the Code Climate issues, make methods simpler to understand and write new tests to improve test coverage.

Tasks Identified

  1. Cut down on code in run_intelligent_assignment method by moving some code to helper methods
  2. Cut down on code in create_new_teams_for_bidding_response by moving some code to helper methods
  3. Cut down on code in match_new_teams_to_topics by moving some code to helper methods
  4. Reduce cognitive complexity of merge_bids_from_different_previous_teams method
  5. Add additional comments to clarify newly written methods
  6. Add additional RSpec tests to increase the coverage percentage

Implementation

We did the following refactoring to the LotteryController class:

  1. Add 6 helper methods (construct_users_bidding_info, construct_teams_bidding_info, remove_user_from_previous_team, remove_empty_teams, assign_available_slots) that each represents a specific and meaningful step
  2. Replace large chunks of code with a small number of method calls
  3. Reduce the cognitive complexity by reducing the level of nested loops
  4. Optimize the performance by placing conditional checking and input filtering outside of the loop
  5. Rename some variable names to make it self-revealing
  6. Refactor the parameter list to give each method the least amount of information to complete its task
  7. Fix some typos in the comments

Here are some examples of the problems we identified and the solutions we found to these problems.

Problem 1: Long Methods

The code climate analysis tool identified two long methods exceeding 25 lines of code per method, run_intelligent_assignment and match_new_teams_to_topics. While some chunks of code are short and intuitive, others can be pulled out to distinct methods and be assigned with self-revealing method names.

Solutions

run_intelligent_assignment

1. Pull line 19-29 into a separate method called construct_users_bidding_info that is responsible for generating user bidding information hash based on students who haven't signed up yet

view code changes

2. Place the code closest to where it is needed. e.g. place line 32 (the url variable initialization) inside the begin/rescue block because it is considered an integral part of the REST API call.

view code changes

match_new_teams_to_topics

1. Conclude line 147-157 with a new method construct_teams_bidding_info that is responsible for generating team bidding information needed for the assign_available_slots method.

2. Conclude line 161-169 with a new method assign_available_slots that uses the result returned from construct_teams_bidding_info to match unassigned teams to available topics.

These two steps greatly shorten the match_new_teams_to_topics method and make the workflow more intuitive to code maintainers.

view code changes

Problem 2: Verbose Code

Verbosity contributes to more lines of codes and increases the cognitive complexity, which is the measure of how difficult a unit of code is to intuitively understand. Method extraction can't be applied to every long method as some methods are invested in doing only one thing. For these methods, one can keep them simple with alternative language syntaxes and other ways of thinking of the behavior.

Solutions

Example 1
   # Before
   bidding_matrix = {}
   assignment.sign_up_topics.each_with_index do |topic, index|	
     bidding_matrix[topic.id] = [] unless bidding_matrix[topic.id]	
     bidding_matrix[topic.id] << bids[index]
   end

The above code segment ensures that bidding_matrix[topic.id] returns an array object before pushing new value to it. It is inelegant both in its readability and the time wasted to visit each bidding_matrix[topic.id] twice. The same behavior can be accomplished by constructing a Hash object that knows what default value to return when the key has not been seen before.

   # After
   bidding_matrix = Hash.new {|hash, key| hash[key] = [] }
   sign_up_topics.each_with_index do |topic, index|
     bidding_matrix[topic.id] << bids[index]
   end
Example 2
   # Before
   teams.each do |team|
     next if SignedUpTeam.where(team_id: team.id, is_waitlisted: 0).any?
     ... (rest of code omitted)
   end

This block is extracted from the original implementation that like many places in the LotteryController, uses next to skip any elements that don't fulfill the imposed condition. It's more reasonable to eliminate out unwanted elements outside of the loop using either select or reject call. Below is the modification made to this block.

   # After
   teams_not_signed_up = teams.reject {|team| SignedUpTeam.where(team_id: team.id, is_waitlisted: 0).any? }
   teams_not_signed_up.each do |team|
     ... (rest of code omitted)
   end
Example 3

Same as example 1, the following block uses next to skip unwanted team_users until the one that meets the condition, that is, who belongs to the specified assignment. This code does the filtering twice, which can be combined with one query. ​

   # Before
   TeamsUser.where(user_id: user_id).each do |team_user|	
     next unless team_user.team.parent_id == assignment.id	
     team_user.team_user_node.destroy rescue nil	
     team_user.destroy rescue nil	
     break	
   end

We first tried to use find_by with two parameters. How find_by works are pulling column values from the model object. Since the TeamsUser model does not have a team attribute (it only stores team_id), this attempt is syntactically illegal.

   team_user = TeamsUser.find_by(user_id: user_id, team.parent_id: assignment_id)
   team_user.team_user_node.destroy rescue nil	
   team_user.destroy rescue nil

The way we refactored the code is to use the find method from the Enumerable mixin to get the first element for which the block is not false. In this way, the loop will no longer be necessary since the only thing it does is finding the first matching object.

   # After
   team_user = TeamsUser.where(user_id: user_id).find{|team_user| team_user.team.parent_id == assignment_id }
   team_user.team_user_node.destroy rescue nil
   team_user.destroy rescue nil
Example 4

Besides pulling unwanted elements out before entering the loop, one can further reduce the verbosity of the code by replacing next if with unless and adding this checking step to the only statement that's inside the loop.

   # Before
   bidding_matrix.each do |topic_id, value|
     next if value.inject(:+).zero?
     bidding_matrix_summary << [value.count {|i| i != 0 }, value.inject(:+), topic_id]
   end
   # After
   bidding_matrix.each do |topic_id, value|
     bidding_matrix_summary << [value.count {|i| i != 0 }, value.inject(:+), topic_id] unless value.inject(:+).zero?
   end

Problem 3: Ambiguous Identifiers

There were many puzzling variable names in the original lottery_controller.rb file. For example, copied from line 30, the slot with the label users accepts an input of priority_info.

   bidding_data = {users: priority_info, max_team_size: assignment.max_team_size}

The priority_info is an array of hashes, each hash stores a pair of user id and bids that the user has made so far. This variable name is not descriptive enough to visualize its content. It is refactored into a new name, users_bidding_info, and the corresponding construction method is therefore named construct_users_bidding_info. ​ There is another array that has a very similar structure to the users_bidding_info, except that it holds bidding information for teams rather than users. To follow the good naming consistency, this array is granted a new name called teams_bidding_info (previously named team_bids) and the name of the corresponding construction method is changed to construct_teams_bidding_info.

Notes

We originally considered to change the label ranks to bids because the same identity has been presented in three different forms throughout the file: rank, bid, and priority.

   |user| priority_info << {pid: user.id, ranks: bids}

However, such a change could have an impact on the later REST API call to the web service that uses students' bidding data to build teams automatically. We are uncertain about how the bidding data is actually used at the other end so it is safe to not apply such change.

Problem 4: Multi-Purposed Methods

The create_new_teams_for_bidding_response method does more than 1 thing, both creating new teams and deleting residual teams that none of their team members remained in their teams. Since it is doing beyond the name implies, it is reasonable to extract the removing part of the code to a new method and place the method call after the call to the create_new_teams_for_bidding_response method.

   # Before
   create_new_teams_for_bidding_response(teams, assignment, priority_info)
   # ^^ create_new_teams_for_bidding_response is responsible for both creation and deletion of AssignmentTeam objects
   match_new_teams_to_topics(assignment)
   # After
   create_new_teams_for_bidding_response(teams, assignment, users_bidding_info)
   remove_empty_teams(assignment)
   # ^^ remove_empty_teams becomes a separate method
   match_new_teams_to_topics(assignment)

Problem 5: Unnecessary Looping

Many unnecessary loopings can be avoided by moving the checking condition out. For example, still consider this piece of code.

   #Before
   bids = []	
   topics.each do |topic|	
     bid_record = Bid.find_by(team_id: team.id, topic_id: topic.id)	
     bids << (bid_record.nil? ? 0 : bid_record.priority ||= 0)	
   end
   team.users.each {|user| priority_info << {pid: user.id, ranks: bids} if bids.uniq != [0] }

The last line shows a looping on the team users that for each user in the team, pushes its user id along with the generated bids to the priority_info array. If there are 3 users in a team, then the comparison if bids.uniq != [0] always gets executed 3 times while it should only be executed once because up to this point the bids variable will not be changed anymore. The solution to this performance drawback would be to change the last line to the code below:

   # After
   team.users.each {|user| users_bidding_info << {pid: user.id, ranks: bids} } unless bids.uniq == [0]
   # Note that priority_info has been renamed to users_bidding_info

Other fixes on the unnecessary looping are demonstrated in examples 2 and 3 of the problem 2. Instead of moving the checking condition out, they each move the filter out so the loop runs in a minimum number of times.

RSpec Testing

Existing Tests

There were existing RSpec tests for the Lottery Controller, but the tests only coverage about 10% of the code in the controller. The test coverage was determined by the Coveralls gem. The existing tests were only testing small parts of the existing long methods. The current tests included the following:

  1. Testing the webservice call and the JSON response type for the users' bidding data
  2. Checking if an intelligent assignment is truly intelligent
  3. Creating new team for a bid
  4. Sorting unassigned teams

After Refactoring

After refactoring the Lottery Controller, the long methods were broken down into smaller, more simpler methods. This required a new RSpec test for each of the new methods, but it allowed for increased code coverage. In order to effectively test the rest of the controller, testing objects that mimic the expected application objects were created. The testing objects were created using the Factory gem and the double feature of the RSpec gem. The objects are also recreated before each test so that the tests are independent from one another to truly validate the different testing sections. The added tests were able to expand the code coverage from about 10% to 97%. The new tests that were added for each of the newly condensed methods include testing for the following:

  1. Constructing the hash table for the users' bidding information
  2. A more exhaustive running Intelligent Assignment method
  3. Generate the hash for a whole team's bidding information
  4. Match new teams to a topic based on their bids
  5. Remove a user from a team
  6. Destroying teams that have no users
  7. Merging the bids from teams that have merged together so that they have a weighted uniform bid
  8. Assigning open topics to teams that do not have a topic yet
  9. Matching teams to topics based on their weighted bids


Future Improvements

  1. Improving the RSpec test for assigning open topics for teams without topics. The new test only checks for 1 open assignment and 1 unassigned team. The method is capable of assigning multiple open topics for multiple unassigned teams, but testing for this would require more Factory objects for assignments and teams. While the new test provides code coverage for the entire method, the nested loops currently only run through a single time before completion, which is not always the case in a realistic setting.