E1920 Fix Code Climate Issues: Difference between revisions
Line 14: | Line 14: | ||
We conclude with Testing strategies we followed and conclusion on the project. | We conclude with Testing strategies we followed and conclusion on the project. | ||
=== Expertiza === | === Expertiza === | ||
Revision as of 23:16, 25 March 2019
This wiki page describes the changes are made in the Expertiza source code to implement E1920 OSS Project, "Fix Code Climate issues in models with names beginning with 'Se' thru 'Z'" in the Spring 2019 semester, CSC/ECE 517.
Introduction
This Wiki page is created per CSC/ECE 517: "Object-Oriented Design and Development", OSS project 3 requirements (Spring, 2019).
Our team was assigned with "E1920. Fix Code Climate issues in models with names beginning with Se thru Z" Expertiza project. In this page we describe what work has been completed by our team to improve overall quality, readability, maintainability and efficiency of the Expertiza source code. What approaches we used to resolve issues reported by Code Climate, what are the remaining issues in the Expertiza code, our suggestions for further improvements and our testing strategy used to ensure integrity of the code. Each section below provides clear and descriptive review on the subject we cover.
We start with introduction to Expertiza application and Code Climate tool. How Code Climate is used to improve Expertiza source code.
We continue with the project description where we explain what has been done and what files are getting effected by the changes/improvements we have made.
Next, we cover project implementation section where we explain all our work in details with code snippets as examples.
We conclude with Testing strategies we followed and conclusion on the project.
Expertiza
Expertiza is a peer to peer application whereby students gain knowledge through submitting their work into the system and review the work of others. This is system diverges from the customary systems where a instructor or body of teaching staff would grade the papers/assignments and provide feedback to the students. This process misses out on a learning opportunity for students performing reviews. Allowing students to see the work of others allows them to see more code examples and from different aspects. This exercise during the review process forces the student to think through lecture material and apply during the process. The Expertiza application can also be used as an iterative process from class to class. The examples of the current semester can be used by students of subsequent semesters to identify shortcomings and suggest further improvements.
Code Climate
Code Climate is an analysis software used to improve the quality and security of software. The Code Climate generated results for the Expertiza project can be found here.
Code Climate utilizes various engines to analyze the code to look for issues that may concern potential security issues, (.i.e. marking code to html_safe), or add to maintainability issues (.i.e number of lines in method exceeding 25 lines). Some of the engines include: RuboCop, Brakeman, Bundler-Audit, etc.
Code Climate can be ran from either a 3-rd party remote server of via the local application using CLI. The Expertiza project uses the 3-rd party remote server to analyze the code. In order to use Code Climate via remote server, desired repository must be registered with Code Climate. Following this process allows to test the code for quality issues after each git commit/push to a registered repository.
Project Description
Some of the Expertiza source code violates many of the Ruby on Rails best practices and must be rectified to improve code readability, robustness, efficiency, and maintainability of the Expertiza application.
Our team was assigned to fix all code smells detected by Code Climate in the given set of Expertiza source code files. In this section we specify what files we have changed to improve overall software of the Expertiza project. We also list and describe the issues that we were asked to ignore during implementation of our project.
Files Involved
All files in the app/models directory with names beginning with Se (e.g. section_header.rb) through Z, except users_controller.rb file were targeted by our team to be Code Climate issue free files. Therefore, our team was assigned exactly with 50 (fifty) files to fix in total. Some of the files had no issues reported by the Code Climate to begin with, and hence were not modified. However, changes/improvements performed on other files had an affect globally on the files outside of the app/models scope and also had to be modified to adapt changes we implemented. For example, renaming method is_leaf as follows:
def leaf?
in the team_user_node.rb file require to change all callers to this method in different files accordingly. Such changes had to be done in the following files (for this particular example):
- models/assignment_node.rb
- models/node.rb
- models/questionnaire_node.rb
- views/tree_display/_row_header.html.erb
These files are outside of the requirements specifications, but we had to modify them in order to maintain correct behavior of the App. There were few such cases when we had to modify multiple files to fix one or more Code Climate issue(s). The list of all of files changed can be found in the Pull Request discussed in this article.
Ignored Code Climate Smells
There are some Code Climate issues that were ignored and not resolved intentionally.
The following issues are exempt based the given requirements specifications and were not fixed:
- Assignment Branch Condition size for [method name] is too high.
- Perceived complexity for [method name] is too high.
- Cyclomatic complexity for [method name] is too high.
- Method [method name] has a Cognitive Complexity of XX (exceeds 5 allowed). Consider refactoring.
- File [file name] has XXX lines of code (exceeds 250 allowed). Consider refactoring.
- Class [class name] has XX methods (exceeds 20 allowed). Consider refactoring.
- Method [method name] has XX lines of code (exceeds 25 allowed). Consider refactoring.
There are also two Code Climate issues that our team decided to ignore after consulting our mentor and discussing their impact on the existing code and Expertiza project. It was determined that the following Code Climate issues reported as False Positive and may not need to be resolved:
- Mass assignment is not restricted using attr_accessible.
- Potentially dangerous attribute available for mass assignment.
Finally, there was a duplicate code issue that was ignored due to a future project. The following files have a duplicate method, self.export_fields(options):
- user.rb
- participant.rb
For further clarification with regard to ignored Code Climate issues please contact our mentor.
Project Implementation
In this section we talk about implementation of our project. We start with the most common issues reported by Code Climate and what approach we used to resolve them. We provide some code samples taken from the files and show their improvements. These changes/improvements are not trivial and require understanding of the code. Therefore, we provide implementation details of such issues along with detailed explanations.
Furthermore, we list all of the issues reported by Code Climate that we fixed. These changes/improvements are trivial and require no explanation. Therefore, any implementation details are excluded for these type of issues.
Next, we cover section with additional improvements that we have done along with fixes for Code Climate issues. These improvements do not directly relate to Code Climate issues; however, they improve overall quality of the code, its readability, maintenance and efficiency.
We also list and talk about remaining issues reported by Code Climate that were intentionally ignored and left in the project. We provide further suggestions based on our experience to the Expertiza software developers.
Finally, we provide out pull request that our team created as we started to work on this project.
Most Common Issues Reported by Code Climate
As we worked on the Code Climate issues for the Expertiza project, we noticed a set of common code smells and issues made by software developers. Since these issues can easily be avoided and most importantly, avoiding them improves code quality, we decided to warn all Expertiza developers about them by providing the list of these issues, their descriptions, how they can be avoided, and how the code quality can be improved.
- Do not use Time.now or Time.parse without zone.
This is very common issue reported by Code Climate. Since time is relative and may have different values (UTC, EST, and etc), it is a good practice to specify the time zone, whenever the Time class used.
Note that Time.parse and Time.zone.parse behave differently for invalid or unknown time. While returns an error for invalid or unknown times, returns nil for those types.
- Time.parse - returns an error for invalid or unknown time. Generally, can be handled by rescue
- Time.zone.parse - returns nil for invalid or unknown time. It must be handled differently.
Therefore, always handle nil appropriately if invalid or unknown time is given (with if/else block, return, or or statements, and etc.) Do not assume that time will always be valid instance. For example, (code sample is used from app/models/student_task.rb file):
# Bad - use Time without zone and rescue if time is invalid or unknown stage_deadline: (Time.parse(participant.stage_deadline) rescue Time.now + 1.year) # Good - use Time with zone and using or statement to handle nil case stage_deadline: (Time.zone.parse(participant.stage_deadline) || (Time.zone.now + 1.year))
Above example is good practice how Time class should be used.
- Method x has n arguments (exceeds 4 allowed). Consider refactoring:
Any method that takes more than 4 arguments must be refactored with accordance to good RoR coding standards. There are multiple ways to refactor signature of such methods and there is no single standard way of resolving this issue. Each method must be considered separately and requires unique approach to refactor, utilizing hash, array and other Ruby's built-in libraries and tools. Below we provide one sample of how fixed such issue utilizing Ruby's built-in hash (code sample is used from app/models/vm_question_response_row.rb file):
# Bad - initialize method has 5 arguments def initialize(question_text, question_id, weight, question_max_score, seq) @question_text = question_text @weight = weight @question_id = question_id @question_seq = seq @question_max_score = question_max_score # Good - initialize method has 1 argument, hash def initialize(question_data) @question_text = question_data[:text] @question_id = question_data[:id] @weight = question_data[:weight] @question_max_score = question_data[:max_score] @question_seq = question_data[:seq]
Note that since method signature is changed, it of the callers must be changed accordingly to adapt new method signature. For example, the following is illustration of such change to adapt above refactored method signature (code sample is used from app/models/vm_question_response.rb file):
# Bad - passing in 5 arguments to the constructor row = VmQuestionResponseRow.new(question.txt, question.id, question.weight, question_max_score, question.seq) # Good - passing in 1 argument to the constructor, hash row = VmQuestionResponseRow.new(text: question.txt, id: question.id, weight: question.weight, max_score: question_max_score, seq: question.seq)
- Prefer each over for:
We noticed there is a lot of iteration in the code using for loop instead of each. Generally, you want to avoid any for loops and use each, each_index, each_with_index, upto and etc. For example, (code sample is used from app/models/sign_up_sheet.rb file):
# Bad - use of for loop for user_signup_topic in user_signup return false if user_signup_topic.is_waitlisted == false # Good - use of each user_signup.each do |user_signup_topic| return false if user_signup_topic.is_waitlisted == false
Similarly, if you need to iterate the given block, passing in integer as an index, utilize each_index, each_with_index, upto instead of for. For example, (code sample is used from app/models/sign_up_sheet.rb file):
# Bad - use of for loop for round in 1..@review_rounds process_review_round(assignment_id, due_date, round, topic) # Good - use of upto 1.upto(@review_rounds) do |r| process_review_round(assignment_id, due_date, r, topic)
- Specify an :inverse_of option:
Whenever there are two models defined with many|one-to-one relationship (has_many|has_one and belongs_to association) out of scope of each other or with certain options used (e.g. foreign_key: and etc.), Active Record is unable to determine the inverse association of them. This would cause redundant look up operation performed on the database for the same object. It is not really an issue that could cause significant problems, but for such massive application like Experiza, it could affect performance and cause delay in response. Therefore, the :inverse_of option must be specified manually to use the associated object in memory instead of loading it from database twice. For example, (code sample is used from app/models/teams_user.rb and app/models/team_user_node.rb files respectively):
class TeamsUser < ActiveRecord::Base has_one :team_user_node, foreign_key: 'node_object_id', dependent: :destroy, inverse_of: 'node_object' end class TeamUserNode < Node belongs_to :node_object, class_name: 'TeamsUser', inverse_of: :team_user_node end
- Specify a :dependent option:
The has_many association should always be specified with one of the :dependent options, which specifies the behavior of the associated objects whenever their owner is destroyed. This is common practice and should be utilized adequately for the Expertiza app to work correctly. No specifying :dependent option could lead to run time errors, unpredictable behavior of the app, and creating orphan objects that will never be referenced again. Specify a :dependent option whenever utilizing has_many association. For example, (code sample is used from app/models/site_controller.rb file):
# Bad - not specifying a :dependent option class SiteController < ActiveRecord::Base has_many :controller_actions # Good - specifying a :dependent option class SiteController < ActiveRecord::Base has_many :controller_actions, dependent: :destroy
- Tagging a string as html safe may be a security risk. This Code Climate issue
For example,
html.html_safe # George can you please talk about it how you fixed it
- Use a guard clause instead of wrapping the code inside a conditional expression.
This is very common issue reported by Code Climate that can be easily avoided by properly utilizing `if` statements applying some logic. In general, avoid using bulk of if/else statements whenever possible. For example, (code sample is used from app/models/sign_up_topic.rb file):
# Bad - too many conditional expressions that also contain nested ones if !no_of_students_who_selected_the_topic.nil? if topic.max_choosers > no_of_students_who_selected_the_topic.size return true else return false end else return true end # Good - use guard close return true if no_of_students_who_selected_the_topic.nil? topic.max_choosers > no_of_students_who_selected_the_topic.size
Above is typical example when 9 lines of code (LoC) can be reduced to 2 LoC and overall code quality improved.
- Use find_by instead of where.first:
Always use find_by with an attempt to find the first record matching the specified conditions.
Note that find_by and where.first behave differently if no record was found based on specified conditions.
- Model.find_by - if there is a record match, returns the object. If there is no record match, returns nil
- Model.where.first - if there is a record match, returns first ActiveRecord::Relation matched. If there is no record match, returns an empty ActiveRecord::Relation
Therefore, always handle nil appropriately if there is no record match (with if/else block, return statement and etc.) Do not assume that the record will always be found (since empty relation skips iteration and nil throws an error if attempt to iterate). For example,
user = User.find_by(name: row_hash[index.to_sym].to_s) raise ImportError, "The user, " + row_hash[index.to_sym].to_s.strip + ", was not found." if user.nil?
Similar to this issue Code Climate reported "Use find_by instead of dynamic find_by_name" issue. It is resolved in the same manner as the above issue by changing find_by_name to find_by query and properly handling returned edge-cases (e.g. nil).
- Use snake_case for method names and for variable names:
There are a lot of instances where developers use camelCase instead of snake_case in the Expertiza project naming methods and variables. Since Expertiza project is based on RoR, it is preferred to use snake_case for all method and variable names. For example, (code sample is used from app/models/sign_up_sheet.rb and app/models/sign_up_topic.rb files receptively):
# Bad method name - camelCase def self.slotAvailable?(topic_id) # Good method name - snake_case def self.slot_available?(topic_id) # Bad variable name - camelCase dropDate = AssignmentDueDate.where(parent_id: assignment.id, deadline_type_id: '6').first # Good method name - snake_case drop_date = AssignmentDueDate.where(parent_id: assignment.id, deadline_type_id: '6').first
Resolved Code Climate Issues
Along with Most Common Issues Reported by Code Climate we also fixed and verified numerous other code smells. The following is the list of all issues we have resolved. We do not provide solutions for them since most of them were trivial code changes.
- Argument x was shadowed by a local variable before it was used.
- Avoid more than 3 levels of block nesting.
- Avoid too many return statements within the method.
- Avoid using update_attribute because it skips validations.
- Combine _type" type="text"> and " into a single string literal, rather than using implicit string concatenation.
- Consider simplifying this complex logical expression.
- Convert if nested inside else to elsif.
- Do not introduce global variables.
- Do not place comments on the same line as the end keyword.
- Do not prefix reader method names with get_.
- Do not prefix writer method names with set_.
- Do not use DateTime.parse.strftime without zone. Use one of Time.zone.parse.strftime, DateTime.current, DateTime.parse.strftime.in_time_zone, DateTime.parse.strftime.utc, DateTime.parse.strftime.getlocal, DateTime.parse.strftime.iso8601, DateTime.parse.strftime.jisx0301, DateTime.parse.strftime.rfc3339, DateTime.parse.strftime.to_i, DateTime.parse.strftime.to_f instead.
- Extra empty line detected at method body beginning.
- Extra empty line detected at method body end.
- end at xxx, xx is not aligned with unless (or def) at xxx, xx.
- Identical blocks of code found in n locations. Consider refactoring.
- Line is too long.
- Method x is defined at both app/models/y.rb and app/models/z.rb
- Optional arguments should appear at the end of the argument list.
- Place the condition on the same line as elsif.
- Prefer Date or Time over DateTime.
- Redundant use of Object#to_s in interpolation.
- Rename has_x to x?.
- Rename is_x to x?.
- Replace class var @@variable_name with a class instance var.
- Similar blocks of code found in n locations. Consider refactoring.
- Unnecessary spacing detected.
- Unused method argument - x. If it's necessary, use _ or _x as an argument name to indicate that it won't be used.
- Use find_by instead of where.first.
- Use x.zero? instead of x == 0.
- Use == if you meant to do a comparison or wrap the expression in parentheses to indicate you meant to assign in a condition.
- Useless assignment to variable - x.
- Variable x used in void context.
Additional Improvements
While our primary task was to fix code smells reported by Code Climate, we also attempted to improve overall quality of code along the way. We introduced several minor changes that we believe improve the code, and encourage all software engineers working on the Expertiza source code to maintain clean and efficient software. All additional improvements are listed and described below.
- Meaningful names for class and instance variables.
There were several files where names for class or instance variables did not meet good naming conventions. For example, in the app/models/vm_question_response_row.rb file, instance variable
no_of_columns
indicating number of columns that refer to count of reviews, was re-named with
count_columns
Above is only an example of how we attempted to improve readability of the code.
- More comments where is needed.
In general there is sufficient amount of comments to fully understand the code and follow the flow. However, some of the functionalities are not as trivial and additional comments help a lot faster understand what the code does. We added such comments in all the places where we fixed Climate Code issues. We also ensured to provide only sufficient comments to keep code as clean as possible and at the same time to clarify any complexity.
- Eliminate DRY code.
We tried to combine all of the duplicated code into a single functionality and eliminate any repetitive statements. Most of these issues were taken care by fixing "Similar blocks of code found in n locations" and "Identical blocks of code found in n locations" Code Climate issues. However, in some files Code Climate did not report such issues, although there were obvious repetitive statements. For example, add_reviews method in the app/models/vm_question_response.rb file had 4 places of the repetitive code and it was different only by data retrieved from various tables as follows:
if @questionnaire_type == 'ReviewQuestionnaire' reviews = ... reviews.each do |review| review_mapping = ReviewResponseMap.find_by(id: review.map_id) # Repetitive code as everywhere else elsif @questionnaire_type == "AuthorFeedbackQuestionnaire" reviews = participant.feedback reviews.each do |review| review_mapping = FeedbackResponseMap.find_by(id: review.map_id) # Repetitive code as everywhere else elsif @questionnaire_type == "TeammateReviewQuestionnaire" reviews = participant.teammate_reviews reviews.each do |review| review_mapping = TeammateReviewResponseMap.find_by(id: review.map_id) # Repetitive code as everywhere else elsif @questionnaire_type == "MetareviewQuestionnaire" reviews = participant.metareviews reviews.each do |review| review_mapping = MetareviewResponseMap.find_by(id: review.map_id) # Repetitive code as everywhere else
The above duplicate code was refactored to eliminate all repetitive statements and make the code more efficient as follows:
if @questionnaire_type == 'ReviewQuestionnaire' reviews = ... response_map = 'ReviewResponseMap'.constantize elsif @questionnaire_type == 'AuthorFeedbackQuestionnaire' reviews = participant.feedback response_map = 'FeedbackResponseMap'.constantize elsif @questionnaire_type == 'TeammateReviewQuestionnaire' reviews = participant.teammate_reviews response_map = 'TeammateReviewResponseMap'.constantize elsif @questionnaire_type == 'MetareviewQuestionnaire' reviews = participant.metareviews response_map = 'MetareviewResponseMap'.constantize end # All repetitive code was extracted and now is common unless reviews.nil? reviews.each do |review| review_mapping = response_map.find_by(id: review.map_id) # Repetitive code
Again this is only single sample where improvements was made.
Remaining Issues
Unfortunately, Expertiza has some remaining issues reported by Code Climate in the files under app/models directory (names beginning with Se through Z) that we were asked to ignore in this project. These issues described in the Ignored Code Climate Smells section above and why there were ignored. Although these code smells do not impose any problems into existing code and application, we strongly believe that all of remaining issues must be fixed.
We did not find any further issues besides what we already reported and fixed it.
Further Suggestions
We suggest and encourage to fix all of the remaining issues that were intentionally ignored per the projects requirements in all of the files under app/models directory.
We also suggest to maintain clean, efficient, well-commented, and well-tested code in the Expertiza project. Prior delivering any changes to the master or beta branches, we recommend to test your software in the following manner:
- Code inspection
- Passing existing Rspec tests
- Adding new Rspec test that cover changes
- Running through a live demo in the Expertiza instance through the UI
Furthermore, we noticed that currently Expertiza project has ~53% test coverage, which is relatively small test coverage for such a massive project. Remember, "Debugging - sucks, and Testing - rocks!". We encourage Expertiza software developers to reach at least (minimum) 85% test coverage. That must cover all vital functionalities of the application and reduce any possibilities of introducing new issues into the system.
Pull Request
Let's add some info about pull request
Testing
There we 2 types of frame works that we used to verify our code modifications/creation within the models: RSpec and Travis CI.
RSpec was used to locally test the changes we had implemented without having to Push the change to the GitHub, this is discussed more in-depth in the section below.
Travis CI is a more comprehensive testing used by the Expetiza team to verify code changes from multiple sources, (.i.e forks from students). Travis CI will analyze the code once you Push your changes to the GitHub server for your given repository. Travis CI will need to be setup independently for your repository
Talk about Testing we performed
Rspec Testing
To verify the code changes we made did not break the existing application, we used the RSpec Testing Framework. Several of the models already had existing RSpecs Tests already created. For those, we just simply ran the corresponding test via the terminal window. Example
rspec spec/models/tag_prompt_spec.rb *rspec path/to/spec/file.rb*
This would execute the RSpec test for the tag_prompt.rb file located in the app/models folder. As you can see, there is a particular format that is used when naming the RSpec file:
model_name_spec.rb
Once the execution of the RSpec model testing is completed, results will be display
$ rspec spec/models/tag_prompt_spec.rb [10:45:15] [Coveralls] Set up the SimpleCov formatter. [Coveralls] Using SimpleCov's 'rails' settings. Randomized with seed 19443 ......... Finished in 3.35 seconds (files took 23.23 seconds to load) 9 examples, 0 failures Randomized with seed 19443 Coverage report generated for RSpec to /home/expertiza_developer/RubymineProjects/expertiza_testbed/coverage. 1254 / 17313 LOC (7.24%) covered. [Coveralls] Outside the CI environment, not sending data.
In addition to running the individual RSpec Test, we also executed broader in scope testing. To achieve this, we executed the following command in the terminal window:
$ bundle exec rspec --seed 5114
This produces the following results:
Finished in 1 minute 11.23 seconds (files took 23.19 seconds to load) 131 examples, 1 failure, 1 pending Failed examples: rspec ./spec/features/assignment_creation_spec.rb:707 # assignment function check to find if the assignment can be added to a course Randomized with seed 5114 Coverage report generated for RSpec to /home/expertiza_developer/RubymineProjects/expertiza_testbed/coverage. 2998 / 14031 LOC (21.37%) covered. [Coveralls] Outside the CI environment, not sending data. FAIL: 1
As you can see from the examples above there are failures. However, we are only looking for failures caused by the code we modified/created in a given model. In this case, all code we have created/modified is not causing failures.
UI Testing
We concentrated our UI testing based on how changes to Code Climate issues affected the interface between callers and methods. Two main culprits were found.
- Change of Time.parse to Time.zone.parse. While Time.parse returns an error for invalid or unknown times, Time.zone.parse returns nil for those types.
- Change of where().first to find_by() query. Using where().first returns empty records if record is not found, while find_by() returns nil if there was no match.
Note both of the above changes are well described in the Most Common Issues Reported by Code Climate section above. The section describes how such cases must be handled properly.
The user interface testing concentrated on three type of user stories.
- General functionality, could users still access the basic menu items from before.
- Test time functionality, such as sorting based on timestamp.
- Test searching functionality (involving all changes of where().first to find_by()).
UI testing helped us revealed all the problems that were introduced along with our fixes and running through UI live demos ensured that we have fully functioning and properly working Expertiza app.
Conclusion
Talk about conclusion.
Team Contacts and Contribution
We have 3 (three) members in our team and 1 (one) mentor was assigned to us. For any project-related questions/concerns please contact us:
- Christopher Adkins - cladkins@ncsu.edu
- George Hugh - gshugh@ncsu.edu
- Nikolay G. Titov - ngtitov@ncu.edu
- Zachariah Mathews (mentor) - zmathew@ncsu.edu
If you have questions or concerns related to a specific file that was improved/modified by our team, please contact an engineer who worked on the file based on the Assignment Grid we created to keep track of our progress and contribution. Please CC in email all team members and our mentor.
References
- Ruby [1]
- CodeClimate [2]
- Expertiza [3]
- E. F. Gehringer, L. M. Ehresman, S. G. Conger and P. A. Wagle, "Work in Progress: Reusable Learning Objects Through Peer Review: The Expertiza Approach," Proceedings. Frontiers in Education. 36th Annual Conference, San Diego, CA, 2006, pp. 1-2. [4]
- The has_many Association [5]
- Module ActiveRecord::Associations::ClassMethods Options [6]
- TimeZone Class [7]