CSC/ECE 517 Spring 2015 E1527 SWAR

Latest revision as of 03:05, 9 April 2015

E1527. Refactor Autometareviews gem and migration to Web-Service

Introduction to Autometareview project <ref>https://github.com/lramach/autometareviews0.1</ref>

This project is developed as part of the Expertiza project <ref>http://wikis.lib.ncsu.edu/index.php/Expertiza</ref>.
The automated metareview tool assesses the quality of a review using natural language processing and machine learning techniques, with no manual step. It gives reviewers feedback on the following metrics:

  1. Review relevance: How relevant the review is to the content of the author's submission. Numeric feedback on a scale of 0 to 1 indicates the review's relevance.
  2. Review content type: Whether the review contains 'summative content' (positive feedback), 'problem detection content' (problems the reviewer identified in the author's work), or 'advisory content' (suggestions or advice from the reviewer). A numeric score on a scale of 0 to 1 is provided for each content type to indicate whether the review contains that type of content.
  3. Review coverage: The extent to which a review covers the main points of a submission, given as a numeric value in the range 0 to 1.
  4. Plagiarism<ref>http://www.plagiarism.org/plagiarism-101/what-is-plagiarism/</ref>: Indicates the presence of plagiarism in the review text.
  5. Tone: Whether the review has a positive, negative, or neutral tone.
  6. Quantity: The number of unique words used by the reviewer in the review.
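
As an illustration of the simplest metric above, the quantity score can be sketched as a count of distinct words. This is only a sketch: the method name `unique_word_count` and the tokenization rule (lowercased runs of letters, digits, and apostrophes) are assumptions, and the gem's actual tokenizer may differ.

```ruby
# Hypothetical helper, not taken from the gem: counts the distinct words
# in a review, ignoring case and punctuation.
def unique_word_count(review_text)
  review_text.downcase.scan(/[a-z0-9']+/).uniq.size
end

sample = "Good work. The work is good, but the examples are thin."
puts unique_word_count(sample)
```

Repeated words such as "good" and "work" are counted once, so a long but repetitive review scores lower than its raw word count would suggest.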


Problem Statement

Currently, the Autometareviews project is used as a gem<ref>http://guides.rubygems.org/what-is-a-gem/</ref> in the Expertiza project. The purpose of this project is to migrate this gem to a web service<ref>http://en.wikipedia.org/wiki/Web_service</ref> and expose its methods on the web, so that any application can consume them. The older gem depended on old libraries<ref>http://en.wikipedia.org/wiki/Library_%28computing%29</ref> such as Stanford-core-nlp, rwordnet, etc. We will migrate them to newer libraries without breaking the existing feature set. We will also refactor the gem's source code to improve readability, reduce complexity, and remove redundant code. We will fix any bug or bottleneck we find in order to improve the performance of this service. We will not add any new feature to the existing feature set provided by the gem. Before modifying any existing feature, we will present the proposed change to Dr. Gehringer and his Expertiza team.

Scope

This project has three separate scope items:

  • Migration of the existing gem application to a web service
  • Refactoring the existing Ruby classes
  • Migration to newer libraries, wherever possible

The classes that we propose to refactor are:

  • tone.rb
  • degree_of_relevance.rb
  • wordnet_based_similarity.rb
  • sentence_state.rb
  • cluster_generation.rb
  • plagiarism_check.rb
  • graph_generator.rb
  • predict_class.rb
  • review_coverage.rb

No new feature will be developed as part of this project. Any major code change caused by the move to newer libraries will be communicated to the Expertiza project team. Existing code will be tested to ensure that its functionality does not change.

Standards

All developed code will adhere to the Ruby on Rails coding guidelines<ref>https://docs.google.com/document/d/1qQD7fcypFk77nq7Jx7ZNyCNpLyt1oXKaq5G-W7zkV3k/edit</ref>.

List of Tasks

The tasks we will perform as part of this project are listed below.<ref>https://docs.google.com/document/d/10JTdEjCiRTre3nO4j_czBzhcxkqOSzmAT8jyiJHNz4c/edit#</ref>

The system is still at a nascent stage and has many performance-related issues. Generating a single meta-review takes about 2 minutes, which is unacceptable for Expertiza. We propose to refactor the code and identify the areas that affect the overall performance of the system. A few areas we identified in a preliminary review are:

  • Reading seed data from CSV files on every pass takes a lot of time. We can move this data into MySQL and use ActiveRecord to speed up data fetches.
  • WordNet-based semantic matching takes a lot of time. We will review the method used and present our findings about areas of concern.
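
The cost of re-reading seed data on every pass can be illustrated with a small sketch. It is not the MySQL/ActiveRecord approach proposed above. It uses plain in-memory memoization instead, so that it runs without a database. The class name `SeedData`, the file contents, and the column names are all hypothetical.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical loader: parses the CSV file on first access only and
# serves every later call from memory.
class SeedData
  attr_reader :load_count

  def initialize(path)
    @path = path
    @load_count = 0
    @rows = nil
  end

  def rows
    @rows ||= begin
      @load_count += 1 # counts how often the file is actually parsed
      CSV.read(@path, headers: true).map(&:to_h)
    end
  end
end

# Demo with a throwaway file; the columns are invented for illustration.
file = Tempfile.new(['seed', '.csv'])
file.write("phrase,label\nwell done,positive\nunclear,negative\n")
file.close

seed = SeedData.new(file.path)
3.times { seed.rows }        # three "passes" over the seed data
puts seed.load_count         # number of times the file was parsed
puts seed.rows.first['phrase']
```

Moving the data into a database table, as proposed, serves the same goal of avoiding repeated file parsing, and additionally allows indexed lookups.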

1. Refactor Code

  1. Efficient loop constructs
    Description: Many loops over models are implemented using generic “for” loops. Solution: As specified by the Ruby guidelines, we plan to use efficient Ruby loop constructs such as “each” and “find_each”.
  2. Very large methods
    Description: Several methods contain a large amount of code, which makes them difficult to understand and debug. Solution: In most cases, large methods can be shortened by extracting smaller helper methods, which can then be reused across components.
  3. Ambiguous method names
    Description: The names of many methods do not clearly match the feature they implement. Solution: We will rename such methods so that each name clearly states the feature implemented.
  4. Legacy code
    Description: As the system has been modified for bug fixes and enhancements, unnecessary code has accumulated. Solution: Isolate and remove all dead code.
  5. Code beautification
    Description: The coding style used in the gem does not follow Ruby on Rails style, which makes the code difficult for Ruby programmers to read. Solution: Beautify the code with a consistent standard of documentation and style.
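
The first refactoring item can be sketched as follows. The data and variable names are invented for illustration and do not come from the gem. `find_each` is an ActiveRecord method that loads records in batches, so it is not shown here because it needs a database-backed model.

```ruby
# Invented sample data standing in for review texts.
reviews = ["Nice intro.", "The design section lacks detail.", "Fix the typos."]

# Before: a generic for loop. The loop variable `review` stays visible
# after the loop ends.
lengths_for = []
for review in reviews
  lengths_for << review.split.size
end

# After: block-based iteration keeps `text` local to the block.
lengths_each = []
reviews.each { |text| lengths_each << text.split.size }

# A loop that only builds a new array usually reduces to `map`.
lengths_map = reviews.map { |text| text.split.size }

puts lengths_map.inspect
```

All three variants produce the same word counts, so a change of this kind can be checked against the existing test suite without altering behavior.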

2. Upgrade system to use the latest versions of dependent Ruby gems

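The libraries used by the gem are very old. We plan to migrate the dependent libraries to their latest versions.
The libraries we have identified are:

  • stanford-core-nlp <ref>http://nlp.stanford.edu/software/corenlp.shtml</ref>
  • rwordnet <ref>https://rubygems.org/gems/rwordnet</ref>
  • rjb <ref>https://rubygems.org/gems/rjb</ref>
  • bind-it<ref>https://rubygems.org/gems/bind-it</ref>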

We will also migrate the project to use Java 8<ref>http://java.com/en/download/whatis_java.jsp</ref>.

3. Migrate gem to Web service

The Expertiza system evaluates each review using an automated meta-review system, which is packaged as a library and used by Expertiza. The automated metareview system is an independent entity, however, and can be used by other peer review systems as well. Many other peer review systems could benefit from it if it were available to them for evaluating their rubrics. We are therefore migrating this system from a library to a web service.

3.1 Design

The Metareview system is a natural language processing based system that compares the reviews written with the original article. The web service will expose the "AutomatedMetareview" method.

Figure: Web Service Design


The request JSON object passed to the method will have the following parameters:

  • original article
  • review written for this article
  • rubric used during article review.

The web service will return the meta-review as a JSON object. The response JSON object will have the parameters mentioned below:

  • plagiarism
  • relevance
  • content_summative
  • content_problem
  • content_advisory
  • coverage
  • tone_positive
  • tone_negative
  • tone_neutral
  • quantity

Figure: Interaction between Client and Web Service
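
The request and response shapes described above can be sketched in Ruby. This is a sketch under stated assumptions, not the service's implementation. The response keys are the ones listed above, but the request key names (`article`, `review`, `rubric`), the method name `automated_metareview`, the error format, and all values are hypothetical placeholders.

```ruby
require 'json'

# Assumed request key names; the document only describes their meaning.
REQUIRED_KEYS = %w[article review rubric].freeze

# Hypothetical handler: validates the JSON request and returns a JSON
# response body with the documented metric names.
def automated_metareview(request_body)
  params = JSON.parse(request_body)
  missing = REQUIRED_KEYS - params.keys
  unless missing.empty?
    return JSON.generate('error' => "missing: #{missing.join(', ')}")
  end

  # Placeholder values only; a real service would compute each metric.
  JSON.generate(
    'plagiarism' => false,
    'relevance' => 0.0,
    'content_summative' => 0.0,
    'content_problem' => 0.0,
    'content_advisory' => 0.0,
    'coverage' => 0.0,
    'tone_positive' => 0.0,
    'tone_negative' => 0.0,
    'tone_neutral' => 0.0,
    'quantity' => 0
  )
end

request = JSON.generate(
  'article' => 'Text of the original article',
  'review' => 'Text of the review',
  'rubric' => 'Rubric used during the review'
)
puts automated_metareview(request)
```

Keeping validation and response construction in a plain method like this lets the same logic be called from any HTTP framework and tested without starting a server.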

3.2 Assumptions

For this project, the code being modified is assumed to be correct and to meet all feature requirements of the system. Interactions modified by refactoring will not change the underlying system definitions.

4. Testing

We will use the gem's existing test suite to test every code modification. We will write new test cases for the web service implementation and for any new public method exposed by existing classes.

References

<references/>