CSC/ECE 517 Spring 2020/E2012. refactor lottery controller.rb: Difference between revisions
(RSpec Testing) |
(Future Improvements) |
||
Line 218: | Line 218: | ||
# Assigning open topics to teams that do not have a topic yet | # Assigning open topics to teams that do not have a topic yet | ||
# Matching teams to topics based on their weighted bids | # Matching teams to topics based on their weighted bids | ||
= Future Improvements = | |||
# Improving the RSpec test for assigning open topics for teams without topics. The new test only checks for 1 open assignment and 1 unassigned team. The method is capable of assigning multiple open topics for multiple unassigned teams, but testing for this would require more Factory objects for assignments and teams. While the new test provides code coverage for the entire method, the nested loops currently only run through a single time before completion, which is not always the case in a realistic setting. |
Revision as of 00:13, 24 March 2020
Introduction
Expertiza Background
Expertiza is an open-source project written using the Ruby on Rails framework. The Expertiza platform aims to improve student learning by using active and cooperative learning, allowing distance education students to participate in active learning, and by discouraging plagiarism through the use of multiple deadlines. Expertiza helps to improve teaching and resource utilization by having students contribute to the creation of course materials and through peer-graded assignments. Expertiza manages objects related to a course such as instructors, TAs, students, and other users. It also manages assignments, teams, signup topics, rubrics, peer grading, and peer evaluations.
Lottery Controller Background
An assignment can have a list of signup topics. The teams associated with the assignment can bid on the signup topics to try and get assigned one they find more favorable. At a high level, the lottery controller assigns teams to topics based on the priorities the team gave to each signup topic during the bidding process.
In more detail, each student starts off on a team of size 1. They can bid on signup topics by arranging them in priority order. A team can invite other users to join their team. If student2 from Team B joins Team A which includes student1, student2 will lose their bids and take on the bids of TeamA and student1. When the lottery controller is called to run its method run_intelligent_assignment, a web service takes the bidding data from each team and returns a new list of teams, each of which is closer to the maximum team size specified for the assignment. The web service combines teams together that have similar bid data and then assigns those new teams to topics, giving each team their top bid on a topic that hasn't been assigned yet. Teams with larger team sizes and more bids are assigned their topics first.
Problem Statement
Code Climate currently grades lottery_controller.rb
at a C level, finding issues with the controller's long methods and complex statements. Most methods are around 40 lines and hard to decipher with a quick look. Also, the file currently has only 10% test coverage. Our goal is to resolve the Code Climate issues, make methods simpler to understand and write new tests to improve test coverage.
Tasks Identified
- Cut down on code in run_intelligent_assignment method by moving some code to helper methods
- Cut down on code in create_new_teams_for_bidding_response by moving some code to helper methods
- Cut down on code in match_new_teams_to_topics by moving some code to helper methods
- Reduce cognitive complexity of merge_bids_from_different_previous_teams method
- Add additional comments to clarify newly written methods
- Add additional RSpec tests to increase the coverage percentage
Implementation
We did the following refactoring to the LotteryController
class:
- Add 6 helper methods (
construct_users_bidding_info
,construct_teams_bidding_info
,remove_user_from_previous_team
,remove_empty_teams
,assign_available_slots
) that each represents a specific and meaningful step - Replace large chunks of code with a small number of method calls
- Reduce the cognitive complexity by reducing the level of nested loops
- Optimize the performance by placing conditional checking and input filtering outside of the loop
- Rename some variable names to make it self-revealing
- Refactor the parameter list to give each method the least amount of information to complete its task
- Fix some typos in the comments
Here are some examples of the problems we identified and the solutions we found to these problems.
Problem 1: Long Methods
The code climate analysis tool identified two long methods exceeding 25 lines of code per method, run_intelligent_assignment
and match_new_teams_to_topics
. While some chunks of code are short and intuitive, others can be pulled out to distinct methods and be assigned with self-revealing method names.
Solutions
run_intelligent_assignment
1. Pull line 19-29 into a separate method called construct_users_bidding_info
that is responsible for generating user bidding information hash based on students who haven't signed up yet
2. Place the code closest to where it is needed. e.g. place line 32 (the url
variable initialization) inside the begin/rescue
block because it is considered an integral part of the REST API call.
match_new_teams_to_topics
1. Conclude line 147-157 with a new method construct_teams_bidding_info
that is responsible for generating team bidding information needed for the assign_available_slots
method.
2. Conclude line 161-169 with a new method assign_available_slots
that uses the result returned from construct_teams_bidding_info
to match unassigned teams to available topics.
These two steps greatly shorten the match_new_teams_to_topics
method and make the workflow more intuitive to code maintainers.
Problem 2: Verbose Code
Verbosity contributes to more lines of codes and increases the cognitive complexity, which is the measure of how difficult a unit of code is to intuitively understand. Method extraction can't be applied to every long method as some methods are invested in doing only one thing. For these methods, one can keep them simple with alternative language syntaxes and other ways of thinking of the behavior.
Solutions
Example 1
# Before bidding_matrix = {} assignment.sign_up_topics.each_with_index do |topic, index| bidding_matrix[topic.id] = [] unless bidding_matrix[topic.id] bidding_matrix[topic.id] << bids[index] end
The above code segment ensures that bidding_matrix[topic.id]
returns an array object before pushing new value to it. It is inelegant both in its readability and the time wasted to visit each bidding_matrix[topic.id]
twice. The same behavior can be accomplished by constructing a Hash object that knows what default value to return when the key has not been seen before.
# After bidding_matrix = Hash.new {|hash, key| hash[key] = [] } sign_up_topics.each_with_index do |topic, index| bidding_matrix[topic.id] << bids[index] end
Example 2
# Before teams.each do |team| next if SignedUpTeam.where(team_id: team.id, is_waitlisted: 0).any? ... (rest of code omitted) end
This block is extracted from the original implementation that like many places in the LotteryController
, uses next
to skip any elements that don't fulfill the imposed condition. It's more reasonable to eliminate out unwanted elements outside of the loop using either select
or reject
call. Below is the modification made to this block.
# After teams_not_signed_up = teams.reject {|team| SignedUpTeam.where(team_id: team.id, is_waitlisted: 0).any? } teams_not_signed_up.each do |team| ... (rest of code omitted) end
Example 3
Same as example 1, the following block uses next
to skip unwanted team_user
s until the one that meets the condition, that is, who belongs to the specified assignment. This code does the filtering twice, which can be combined with one query.
# Before TeamsUser.where(user_id: user_id).each do |team_user| next unless team_user.team.parent_id == assignment.id team_user.team_user_node.destroy rescue nil team_user.destroy rescue nil break end
We first tried to use find_by
with two parameters. How find_by
works are pulling column values from the model object. Since the TeamsUser
model does not have a team
attribute (it only stores team_id
), this attempt is syntactically illegal.
team_user = TeamsUser.find_by(user_id: user_id, team.parent_id: assignment_id) team_user.team_user_node.destroy rescue nil team_user.destroy rescue nil
The way we refactored the code is to use the find
method from the Enumerable
mixin to get the first element for which the block is not false. In this way, the loop will no longer be necessary since the only thing it does is finding the first matching object.
# After team_user = TeamsUser.where(user_id: user_id).find{|team_user| team_user.team.parent_id == assignment_id } team_user.team_user_node.destroy rescue nil team_user.destroy rescue nil
Example 4
Besides pulling unwanted elements out before entering the loop, one can further reduce the verbosity of the code by replacing next if
with unless
and adding this checking step to the only statement that's inside the loop.
# Before bidding_matrix.each do |topic_id, value| next if value.inject(:+).zero? bidding_matrix_summary << [value.count {|i| i != 0 }, value.inject(:+), topic_id] end
# After bidding_matrix.each do |topic_id, value| bidding_matrix_summary << [value.count {|i| i != 0 }, value.inject(:+), topic_id] unless value.inject(:+).zero? end
Problem 3: Ambiguous Identifiers
There were many puzzling variable names in the original lottery_controller.rb
file. For example, copied from line 30, the slot with the label users
accepts an input of priority_info
.
bidding_data = {users: priority_info, max_team_size: assignment.max_team_size}
The priority_info
is an array of hashes, each hash stores a pair of user id and bids that the user has made so far. This variable name is not descriptive enough to visualize its content. It is refactored into a new name, users_bidding_info
, and the corresponding construction method is therefore named construct_users_bidding_info
.
There is another array that has a very similar structure to the users_bidding_info
, except that it holds bidding information for teams rather than users. To follow the good naming consistency, this array is granted a new name called teams_bidding_info
(previously named team_bids
) and the name of the corresponding construction method is changed to construct_teams_bidding_info
.
Notes
We originally considered to change the label ranks
to bids
because the same identity has been presented in three different forms throughout the file: rank
, bid
, and priority
.
|user| priority_info << {pid: user.id, ranks: bids}
However, such a change could have an impact on the later REST API call to the web service that uses students' bidding data to build teams automatically. We are uncertain about how the bidding data is actually used at the other end so it is safe to not apply such change.
Problem 4: Multi-Purposed Methods
The create_new_teams_for_bidding_response
method does more than 1 thing, both creating new teams and deleting residual teams that none of their team members remained in their teams. Since it is doing beyond the name implies, it is reasonable to extract the removing part of the code to a new method and place the method call after the call to the create_new_teams_for_bidding_response
method.
# Before
create_new_teams_for_bidding_response(teams, assignment, priority_info)
# ^^ create_new_teams_for_bidding_response
is responsible for both creation and deletion of AssignmentTeam objects
match_new_teams_to_topics(assignment)
# After
create_new_teams_for_bidding_response(teams, assignment, users_bidding_info)
remove_empty_teams(assignment)
# ^^ remove_empty_teams
becomes a separate method
match_new_teams_to_topics(assignment)
Problem 5: Unnecessary Looping
Many unnecessary loopings can be avoided by moving the checking condition out. For example, still consider this piece of code.
#Before bids = [] topics.each do |topic| bid_record = Bid.find_by(team_id: team.id, topic_id: topic.id) bids << (bid_record.nil? ? 0 : bid_record.priority ||= 0) end team.users.each {|user| priority_info << {pid: user.id, ranks: bids} if bids.uniq != [0] }
The last line shows a looping on the team users that for each user in the team, pushes its user id along with the generated bids to the priority_info
array. If there are 3 users in a team, then the comparison if bids.uniq != [0]
always gets executed 3 times while it should only be executed once because up to this point the bids
variable will not be changed anymore. The solution to this performance drawback would be to change the last line to the code below:
# After team.users.each {|user| users_bidding_info << {pid: user.id, ranks: bids} } unless bids.uniq == [0] # Note that priority_info has been renamed to users_bidding_info
Other fixes on the unnecessary looping are demonstrated in examples 2 and 3 of the problem 2. Instead of moving the checking condition out, they each move the filter out so the loop runs in a minimum number of times.
RSpec Testing
Existing Tests
There were existing RSpec tests for the Lottery Controller, but the tests only coverage about 10% of the code in the controller. The test coverage was determined by the Coveralls gem. The existing tests were only testing small parts of the existing long methods. The current tests included the following:
- Testing the webservice call and the JSON response type for the users' bidding data
- Checking if an intelligent assignment is truly intelligent
- Creating new team for a bid
- Sorting unassigned teams
After Refactoring
After refactoring the Lottery Controller, the long methods were broken down into smaller, more simpler methods. This required a new RSpec test for each of the new methods, but it allowed for increased code coverage. In order to effectively test the rest of the controller, testing objects that mimic the expected application objects were created. The testing objects were created using the Factory gem and the double feature of the RSpec gem. The objects are also recreated before each test so that the tests are independent from one another to truly validate the different testing sections. The added tests were able to expand the code coverage from about 10% to 97%. The new tests that were added for each of the newly condensed methods include testing for the following:
- Constructing the hash table for the users' bidding information
- A more exhaustive running Intelligent Assignment method
- Generate the hash for a whole team's bidding information
- Match new teams to a topic based on their bids
- Remove a user from a team
- Destroying teams that have no users
- Merging the bids from teams that have merged together so that they have a weighted uniform bid
- Assigning open topics to teams that do not have a topic yet
- Matching teams to topics based on their weighted bids
Future Improvements
- Improving the RSpec test for assigning open topics for teams without topics. The new test only checks for 1 open assignment and 1 unassigned team. The method is capable of assigning multiple open topics for multiple unassigned teams, but testing for this would require more Factory objects for assignments and teams. While the new test provides code coverage for the entire method, the nested loops currently only run through a single time before completion, which is not always the case in a realistic setting.