CSC/ECE 517 Spring 2020/E2012. refactor lottery controller.rb
Introduction
Expertiza Background
Expertiza is an open-source project written using the Ruby on Rails framework. The Expertiza platform aims to improve student learning by using active and cooperative learning, allowing distance education students to participate in active learning, and by discouraging plagiarism through the use of multiple deadlines. Expertiza helps to improve teaching and resource utilization by having students contribute to the creation of course materials and through peer-graded assignments. Expertiza manages objects related to a course such as instructors, TAs, students, and other users. It also manages assignments, teams, signup topics, rubrics, peer grading, and peer evaluations.
Lottery Controller Background
An assignment can have a list of signup topics. The teams associated with the assignment can bid on the signup topics to try and get assigned one they find more favorable. At a high level, the lottery controller assigns teams to topics based on the priorities the team gave to each signup topic during the bidding process.
In more detail, each student starts off on a team of size 1. They can bid on signup topics by arranging them in priority order. A team can invite other users to join their team. If student2 from Team B joins Team A which includes student1, student2 will lose their bids and take on the bids of TeamA and student1. When the lottery controller is called to run its method run_intelligent_assignment, a web service takes the bidding data from each team and returns a new list of teams, each of which is closer to the maximum team size specified for the assignment. The web service combines teams together that have similar bid data and then assigns those new teams to topics, giving each team their top bid on a topic that hasn't been assigned yet. Teams with larger team sizes and more bids are assigned their topics first.
Problem Statement
Code Climate currently grades `lottery_controller.rb` at a C level, finding issues with the controller's long methods and complex statements. Most methods are around 40 lines and hard to decipher with a quick look. Also, the file currently has only 10% test coverage. Our goal is to resolve the Code Climate issues, make methods simpler to understand and write new tests to improve test coverage.
Tasks Identified
- Cut down on code in run_intelligent_assignment method by moving some code to helper methods
- Cut down on code in create_new_teams_for_bidding_response by moving some code to helper methods
- Cut down on code in match_new_teams_to_topics by moving some code to helper methods
- Reduce cognitive complexity of merge_bids_from_different_previous_teams method
- Add additional comments to clarify newly written methods
- Add additional RSpec tests to increase the coverage percentage
Implementation
We did the following refactoring to the `LotteryController` class:
1. Add 6 helper methods (`construct_users_bidding_info`, `construct_teams_bidding_info`, `remove_user_from_previous_team`, `remove_empty_teams`, `assign_available_slots` and `log`) that each represents a specific and meaningful step
2. Replace large chunks of code with a small number of method calls
3. Reduce the cognitive complexity by reducing the level of nested loops
4. Optimize the performance by placing conditional checking and input filtering outside of the loop
5. Rename some variable names to make it self-revealing
6. Refactor the parameter list to give each method the least amount of information to complete its task
7. Fix some typos in the comments
Here are some examples of the problems we identified and the solutions we found to these problems.
Problem 1: Long Methods
The code climate analysis tool identified two long methods exceeding 25 lines of code per methods, `run_intelligent_assignment` and `match_new_teams_to_topics`. While some chunks of code are short and intuitive, others can be pulled out to distinct methods and be assigned with self-revealing method names.
Solutions
run_intelligent_assignment
1. Pull line 19-29 into a separate method called `construct_users_bidding_info` that is responsible for generating user bidding infomation hash based on students who haven't signed up yet
2. Place the code closest to where it is needed. e.g. place line 32 (the `url` variable initialization) inside the `begin/rescue` block because it is considered an integral part of the REST API call.
match_new_teams_to_topics
1. Conclude line 147-157 with a new method `construct_teams_bidding_info` that is responsible for generating team bidding information needed for the `assign_available_slots` method.
2. Conclude line 161-169 with a new method `assign_available_slots` that uses the result returned from `construct_teams_bidding_info` to match unassigned teams to available topics.
These two steps greatly shorten the `match_new_teams_to_topics` method and make the workflow more intuitive to code maintainers.
Problem 2: Verbose Code
Verbosity contributes to more lines of codes and increases the cognitive complexity, which is the measure of how difficult a unit of code is to intuitively understand. Method extraction can't be applied to every long method as some methods are invested in doing only one thing. For these methods, one can keep them simple with alternative language syntaxes and other ways of thinking of the behavior.
Solutions
Example 1
# Before bidding_matrix = {} assignment.sign_up_topics.each_with_index do |topic, index| bidding_matrix[topic.id] = [] unless bidding_matrix[topic.id] bidding_matrix[topic.id] << bids[index] end
The above code segment ensures that `bidding_matrix[topic.id]` returns an array object before pushing new value to it. It is inelegant both in its readability and the time wasted to visit each `bidding_matrix[topic.id] ` twice. The same behavior can be accomplished by constructing a Hash object that knows what default value to return when the key has not been seen before.
# After bidding_matrix = Hash.new {|hash, key| hash[key] = [] } sign_up_topics.each_with_index do |topic, index| bidding_matrix[topic.id] << bids[index] end
Example 2
# Before teams.each do |team| next if SignedUpTeam.where(team_id: team.id, is_waitlisted: 0).any? ... (rest of code omitted) end
This block is extracted from the original implementation that like many places in the `LotteryController`, uses `next` to skip any elements that don't fulfill the imposed condition. It's more reasonable to eliminate out unwanted elements outside of the loop using either `select` or `reject` call. Below is the modifiication made to this block.
# After teams_not_signed_up = teams.reject {|team| SignedUpTeam.where(team_id: team.id, is_waitlisted: 0).any? } teams_not_signed_up.each do |team| ... (rest of code omitted) end
Example 3
Same as example 1, the following block uses `next` to skip unwanted `team_user`s until the one that meets the condition, that is, who belongs to the specified assignment. This code does the filtering twice, which can be combined to one query.
# Before TeamsUser.where(user_id: user_id).each do |team_user| next unless team_user.team.parent_id == assignment.id team_user.team_user_node.destroy rescue nil team_user.destroy rescue nil break end
We first tried to use `find_by` with two parameters. How `find_by` works is pulling column values from the model object. Since the `TeamsUser` model does not have a `team` attribute (it only stores `team_id`), this attempt is syntaxically illegal.
team_user = TeamsUser.find_by(user_id: user_id, team.parent_id: assignment_id) team_user.team_user_node.destroy rescue nil team_user.destroy rescue nil
The way we refactored the code is to use `find` method from the `Enumerable` mixin to get the first element for which the block is not false. In this way, the loop will no longer be necessary since the the only thing it does is finding the first matching object.
# After team_user = TeamsUser.where(user_id: user_id).find{|team_user| team_user.team.parent_id == assignment_id } team_user.team_user_node.destroy rescue nil team_user.destroy rescue nil
Example 4
Besides pulling unwanted elements out before entering the loop, one can further reduce the verbosity of the code by replacing `next if` with `unless` and adding this checking step to the only statement that's inside the loop.
# Before bidding_matrix.each do |topic_id, value| next if value.inject(:+).zero? bidding_matrix_summary << [value.count {|i| i != 0 }, value.inject(:+), topic_id] end
# After bidding_matrix.each do |topic_id, value| bidding_matrix_summary << [value.count {|i| i != 0 }, value.inject(:+), topic_id] unless value.inject(:+).zero? end
Problem 3: Ambiguous Identifiers
There were many puzzling variable names in the original `lottery_controller.rb` file. For example, copied from line 30, the slot with the label `users` accepts an input of `priority_info`.
bidding_data = {users: priority_info, max_team_size: assignment.max_team_size}
The `priority_info` is an array of hashes, each hash stores a pair of user id and bids that the user has made so far. This variable name is not descriptive enough to visualize its content. It is refactored into a new name, `users_bidding_info`, and the corresponding construction method is therefore named `construct_users_bidding_info`. There is another array that has a very similar structure to the `users_bidding_info`, except that it holds bidding information for teams rather than users. To follow the good naming consistency, this array is granted a new name called `teams_bidding_info` (previously named `team_bids`) and the name of the corresponding construction method is changed to `construct_teams_bidding_info`.
Notes
We originally considered to change the label `ranks` to `bids` because the same identity has been presented in three different forms throughout the file: `rank`, `bid`, and `priority`.
|user| priority_info << {pid: user.id, ranks: bids}
However, such a change could have an impact on the later REST API call to the web service that uses students' bidding data to build teams automatically. We are uncertain about how the bidding data is actually used at the other end so it is safe to not apply such change.
Problem 4: Multi-Purposed Methods
The `create_new_teams_for_bidding_response` method does more than 1 thing, both creating new teams and deleting residual teams which none of their team members remained in their teams. Since it is doing beyond the name implies, it is reasonable to extract the removing part of the code to a new method and place the method call after the call to the `create_new_teams_for_bidding_response` method.
# Before create_new_teams_for_bidding_response(teams, assignment, priority_info) # ^^ `create_new_teams_for_bidding_response` is responsible for both creation and deletion of AssignmentTeam objects match_new_teams_to_topics(assignment)
# After create_new_teams_for_bidding_response(teams, assignment, users_bidding_info) remove_empty_teams(assignment) # ^^ `remove_empty_teams` becomes a separate method match_new_teams_to_topics(assignment)
Problem 5: Unnecessary Looping
Many unnecessary loopings can be avoided by moving the checking condition out. For example, still consider this piece of code.
#Before bids = [] topics.each do |topic| bid_record = Bid.find_by(team_id: team.id, topic_id: topic.id) bids << (bid_record.nil? ? 0 : bid_record.priority ||= 0) end team.users.each {|user| priority_info << {pid: user.id, ranks: bids} if bids.uniq != [0] }
The last line shows a looping on the team users that for each user in the team, pushes its user id along with the generated bids to the `priority_info` array. If there are 3 users in a team, then the comparison `if bids.uniq != [0]` always gets executed 3 times while it should only be executed once because up to this point the `bids` variable will not be changed anymore. The solution to this performance drawback would be change the last line to the code below:
# After team.users.each {|user| users_bidding_info << {pid: user.id, ranks: bids} } unless bids.uniq == [0] # Note that priority_info has been renamed to users_bidding_info
Other fixes on the unnecessary looping are demonstrated in examples 2 and 3 of the problem 2. Instead of moving the checking condition out, they each move the filter out so the loop runs in a minimum number of times.