CSC/ECE 517 Fall 2018- Project E1846. OSS Project Navy: Character Issues
E1846. OSS Project Navy: Character Issues Fall 2018, CSC/ECE 517.
Problem Statement
As part of the project we were given task to fix following two issues.
1. Issue 1: In the existing Expertiza setup, the database supports only UTF-8 characters. Hence, if a user enters a non UTF-8 character, the database throws an error. This further leads to loss of data while refreshing or going back to the input page as data wasn't saved in database, effectively leading to loss of entire review if there's even a single non UTF-8 character. We need to solve the problem by removing such unsupported characters.
2. Issue 2: The existing expertiza stores the HTML formatting tags (Like <b> for bold) as a string. However, while rendering the string these tags are not escaped, resulting in no formatting. We need to solve the issue and display proper formatting.
These are important issues from usability point of view and need to be fixed.
Problem as experienced by the user
Issue 1
While trying to save utf8 chars the application was throwing up error as following
Issue 2
If a user looks at the review score table, he/she can see html tags along with the review comments. There are other instances such as the self review page where the problem is repeated.
Solution : Approach and Fix
Issue 1
One of the solutions proposed was filtering out the non-UTF8 characters before saving the input in the database. Since the non-UTF8 input can come from any view, we implemented a filter_non_UTF8 method in application controller to do just that and adhere to DRY principle. An alternative approach will be removing the problematic characters within individual functions, but this leads to repetitive code and a violation of DRY principle.
However, while experimenting with the fix, we found out that not all tables support UTF8 formatting. For E.g., the versions table which has the latin charset.
mysql> show create table versions; | Table | Create Table | | versions | CREATE TABLE `versions` ( `id` int(11) NOT NULL AUTO_INCREMENT, `item_type` varchar(255) NOT NULL, `item_id` int(11) NOT NULL, `event` varchar(255) NOT NULL, `whodunnit` varchar(255) DEFAULT NULL, `object` mediumtext, `created_at` datetime DEFAULT NULL, PRIMARY KEY (`id`), KEY `index_versions_on_item_type_and_item_id` (`item_type`,`item_id`) ) ENGINE=InnoDB AUTO_INCREMENT=142423 DEFAULT CHARSET=latin1 | 1 row in set (0.01 sec)
We changed the charset with the command:
mysql> ALTER TABLE versions CONVERT TO CHARACTER SET utf8;
Output of show create table now is
mysql> show create table versions; | Table | Create Table | versions | CREATE TABLE `versions` ( `id` int(11) NOT NULL AUTO_INCREMENT, `item_type` varchar(255) NOT NULL, `item_id` int(11) NOT NULL, `event` varchar(255) NOT NULL, `whodunnit` varchar(255) DEFAULT NULL, `object` mediumtext, `created_at` datetime DEFAULT NULL, PRIMARY KEY (`id`), KEY `index_versions_on_item_type_and_item_id` (`item_type`,`item_id`) ) ENGINE=InnoDB AUTO_INCREMENT=142423 DEFAULT CHARSET=utf8 | 1 row in set (0.01 sec)
This solved the problem. Therefore, we created the migration VersionTableSupportUTF8 to change version's characterset.
rails g migration VersionTableSupportUTF8
def change execute "ALTER TABLE versions CONVERT TO CHARACTER SET utf8" end
If we want to fix this in all tables, we can do following for each database via script or add migration for each table.
SELECT CONCAT('ALTER TABLE ', TABLE_NAME, ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;') FROM information_schema.TABLES WHERE TABLE_SCHEMA = 'expertiza_development';
Thus we see that after lot of exploration and careful debugging we were able to zero down on the root cause. This approach is the best possible outcome as it is not a point fix and solves the root cause. Fix also avoids any duplication of code / point fixes.
Issue 2
The HTML template issue was caused due to a security feature of Ruby , which by default does not evaluate strings. To resolve the HTML template issue, we used the sanitize() function. This strips all the tags that aren't whitelisted, thus ruby now renders the standard HTML tags. We have sanitized required pages. sanitize
We had to carefully test all flows where the html tags were getting rendered. We discovered about more than 2 places where this could be done.
However there were additional screens such as tooltip in grade scores which too were showing html tags. Therefore we added code to strip tags in html.
For tooltip changes, sanitize does not work, therefore we escaped all html form the text.
<td class="<%= score.color_code %>" align="center" data-toggle="tooltip" title="<%= strip_tags(score.comment).html_safe%>">
Files Created or Refactored
The following files were modified for this project namely:
1. Refactored application_controller
2. Refactored self_review_popup
3. Created a new migration - VersionTableSupportUTF8
4. Created Rspec file application_controller_spec.rb
5. Changed app/views/grades/view_team.html.erb
6. Changed db/schema.rb
Test Plan
Manual UI Tests
To check if our fix for rendering html tags works we relied on manual setup and verification. We reran the steps that were causing the issue in first place and checked if it is fixed by inspection.
Test Case 1
Steps for verification :
1. Log in as instructor6
2. Create a copy of “Final Project (and design doc)”
3. Change all due dates to today’s dates. Enable “Use signup deadline” as well and set that date to day’s date
4. Save
5. Click on add participants, then import all participants from course (The following steps assume student7488 and student7489 were added to the course during this step. Double-check to make sure they were added, or use some other students that were added)
6. Login as student7488 – sign up for the first topic.
7. Login as student7489 – sign up for second topic.
8. Login as instructor6 – set signup date to yesterday’s date (so we are now past the sign up deadline)
9. Login as student7488, submit any example hyperlink for this assignment. Perform a self review under “Your scores” (Add bold, italics, lists etc.. in this review as well)
10. Do the same as student7489
11. Login as instructor6 – set submission date to yesterday’s date (so we are now past the submission deadline)
12. Login as student7488 – go to “Other’s work” and click “Request new submission for review”. Perform review.
13. Do the same as student7489.
14. Login as instructor6 – set review1 date to yesterday’s date
15. Navigate back to “Manage -> Assignments”. Click on “View review report”. On the top-left dropdown, select “Self-review report”. Click on “View”. Click on student7488 in the “Self Review” column. You will not see HTML tags if you had entered them in step 9 instead appropriate html will be rendered.
Video of running output is available here: [1] -- As you can see in the later part of the video, HTML tags have been rendered correctly.
Test Case 2
1. Repeat 1-8, create assignments and add reviewers
2. Login as student2 and give review that contains html tags, logout
3. Login as student1 and check reviews
After the fix ( refer to section 3.2 )
Following is the output with html escaped.
Test Case 3
1. Edit any text field and enter a supported UTF8 character. 2. Save. 3. Try to edit it again and try to save. Earlier, this step was failing as it tried entering UTF8 character to the versions table which did not support UTF8.
After the fix, editing it works as seen here: [2]
Automated Tests
We have added Rspec tests to test that the given non-UTF8 character is removed and a valid UTF-8 character is not removed. This ensures that the functionality is exhaustive. The video is available here: [3]
Both the Rspec tests pass: