CSC/ECE 517 Fall 2013/oss E816 cyy

Revision as of 18:17, 30 October 2013

Introduction to Refactoring plagiarism_check.rb and sentence_state.rb

Project description

Design

sentence_state.rb

plagiarism_check_test.rb

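To see the original code, please go to this link: https://github.com/expertiza/expertiza/blob/master/app/models/automated_metareview/plagiarism_check.rb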

The main responsibility of Plagiarism_Check is to determine whether the reviews are just copied from other sources.


There are four kinds of plagiarism that need to be checked:

1. whether the review is copied from the submissions of the assignment

2. whether the review is copied from the review questions

3. whether the review is copied from other reviews

4. whether the review is copied from the Internet or other sources; this may be detected through a Google search


For example, in the test file expertiza/test/unit/automated_metareview/plagiarism_check_test.rb, the first test is:

test "check for plagiarism true match" do
   review_text = ["The sweet potatoes in the vegetable bin are green with mold. These sweet potatoes in the vegetable bin are fresh."]
   subm_text = ["The sweet potatoes in the vegetable bin are green with mold. These sweet potatoes in the vegetable bin are fresh."]
  
   instance = PlagiarismChecker.new
   assert_equal(true, instance.check_for_plagiarism(review_text, subm_text))
end

The check_for_plagiarism method compares the review text with the submission text. In this case, the reviewer copies the author's sentences verbatim without quoting them, so the check reports plagiarism.
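To make the idea concrete, here is a minimal sketch of a verbatim-sentence comparison. It is not the Expertiza implementation: the class name NaivePlagiarismChecker and the sentences helper are hypothetical, and the handling of properly quoted text is left out entirely.

```ruby
# Hypothetical stand-in for PlagiarismChecker; NOT the Expertiza implementation.
class NaivePlagiarismChecker
  # Splits each text into sentences at ., ! or ? followed by whitespace,
  # then normalizes case and surrounding whitespace.
  def sentences(texts)
    texts.flat_map { |text| text.split(/(?<=[.!?])\s+/) }
         .map { |sentence| sentence.strip.downcase }
         .reject(&:empty?)
  end

  # Returns true when at least one review sentence appears verbatim
  # in the submission. Quotation handling is deliberately omitted.
  def check_for_plagiarism(review_text, subm_text)
    submitted = sentences(subm_text)
    sentences(review_text).any? { |sentence| submitted.include?(sentence) }
  end
end

checker = NaivePlagiarismChecker.new
review = ["The sweet potatoes in the vegetable bin are green with mold."]
subm   = ["The sweet potatoes in the vegetable bin are green with mold. They smell bad."]
puts checker.check_for_plagiarism(review, subm)
```

Like the test above, the sketch takes both arguments as arrays of strings, so a review copied sentence for sentence from the submission yields true.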

From this point of view, the refactoring should produce four fundamental methods, each doing exactly one thing. In the original plagiarism_check.rb, however, the compare_reviews_with_questions_responses method has two responsibilities: comparing reviews with the review questions and comparing reviews with other reviewers' responses, which is confusing. The refactoring splits these two responsibilities apart to remove this code smell.

The first step, then, is to define four methods, one for each kind of check: compare_reviews_with_submissions, compare_reviews_with_questions, compare_reviews_with_responses, and compare_reviews_with_google_search. Each method has a single, specific responsibility.


As shown above, we have to split the method compare_reviews_with_questions_responses into two methods:

def compare_reviews_with_questions(auto_metareview, map_id)
…...
end
def compare_reviews_with_responses(auto_metareview, map_id)
…...
end
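One way such a split could look is sketched below: the comparison logic moves into a private helper, and each public method handles only its own source of text. The names SplitChecker and matches_any? are hypothetical, and the methods take plain arrays of strings for illustration; the real methods take (auto_metareview, map_id) and load their texts from the database, which is not reproduced here.

```ruby
# Hypothetical illustration of the split; signatures differ from Expertiza's,
# which take (auto_metareview, map_id) and load the texts themselves.
class SplitChecker
  # Compares reviews only with the review questions.
  def compare_reviews_with_questions(reviews, questions)
    matches_any?(reviews, questions)
  end

  # Compares reviews only with other reviewers' responses.
  def compare_reviews_with_responses(reviews, responses)
    matches_any?(reviews, responses)
  end

  private

  # Shared helper: true when some review equals some source text,
  # ignoring case and surrounding whitespace.
  def matches_any?(reviews, sources)
    normalized = sources.map { |source| source.strip.downcase }
    reviews.any? { |review| normalized.include?(review.strip.downcase) }
  end
end

checker = SplitChecker.new
reviews = ["Is the writeup clear?"]
puts checker.compare_reviews_with_questions(reviews, ["Is the writeup clear?"])
puts checker.compare_reviews_with_responses(reviews, ["Good work overall."])
```

With the two responsibilities separated, each method can be tested and changed independently, while the shared helper avoids duplicating the comparison.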

Test Our Code

Link to VCL

The purpose of the VCL servers is to let you verify that Expertiza still works properly with our refactored code. The first VCL link is seeded with the expertiza-scrubbed.sql file, which includes questionnaires, courses, and assignments, so it is easy to verify that reviews work: you only need to create users and have them review one another. The second link uses only the test.sql file, but you can still verify that Expertiza's functionality works. If neither link works, please do not rush your review; send us an email and we will fix it as soon as possible (yhuang25@ncsu.edu, ysun6@ncsu.edu, grimes.caroline@gmail.com). Thank you so much!

1. http://152.46.20.30:3000/ Username: admin, password: password

2. If the first one does not work, please use this one. http://vclv99-129.hpc.ncsu.edu:3000 Username: admin, password: admin

Git Forked Repository URL

https://github.com/shanfangshuiyuan/expertiza

Steps to Set Up Project

1. Use the git repository shown above.

2. Use Ruby 1.9.3

3. Set up MySQL and start the server

4. Command line: bundle install

5. Download http://dev.mysql.com/get/Downloads/Connector-C/mysql-connector-c-noinstall-6.0.2-win32.zip/from/pick and copy all files from the lib folder of the download into <Ruby193>\bin

6. Change /config/database.yml according to your MySQL root password and MySQL port.

7. Command line: rake db:create:all

8. Command line: mysql -u root -p<YOUR_PASSWORD> pg_development < expertiza-scrubbed_2013_07_10.sql (note that there is no space between -p and the password)

9. Command line: rake db:migrate

10. Command line: rails server

Test Our Code

1. Set up the project following the steps above.

2. Command line: rake db:test:prepare

3. Run plagiarism_check_test.rb and sentence_state_test.rb, which are under /test/unit/automated_metareview. After refactoring, all tests pass without error.

4. sentence_state.rb and plagiarism_check.rb are under /app/models/automated_metareview.

Files Changed

text_preprocessing.rb

plagiarism_check.rb

sentence_state.rb

tagged_sentence.rb

constants.rb

negations.rb

plagiarism_check_test.rb

sentence_state_test.rb

Future work

References