<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.expertiza.ncsu.edu/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Semhatr2</id>
	<title>Expertiza_Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.expertiza.ncsu.edu/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Semhatr2"/>
	<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=Special:Contributions/Semhatr2"/>
	<updated>2026-04-16T14:52:06Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82237</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82237"/>
		<updated>2013-10-31T04:12:00Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g., deciding a student’s grade or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &lt;ref&gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&lt;/ref&gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation ensures that a review that is not relevant to the submission is considered invalid and does not affect students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long, complex methods and a lot of duplicated code. It has been assigned a grade of &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective, or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
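&lt;br /&gt;
As a toy illustration of the first step described above (our own sketch, not the Expertiza implementation), splitting text into segments on a predefined punctuation list might look like this in Ruby; the punctuation list here is a hypothetical placeholder:&lt;br /&gt;
&lt;br /&gt;
```ruby
# Illustrative sketch only: split input text into segments on a predefined
# punctuation list, as the first step of word-order graph generation.
# The punctuation list below is a hypothetical placeholder.
PUNCTUATION = /[.,;:!?]/

def segment(text)
  text.split(PUNCTUATION).map(&:strip).reject(&:empty?)
end

segment("Good work, but the tests are missing.")
# => ["Good work", "but the tests are missing"]
```
&lt;br /&gt;
Each segment would then be POS-tagged and grouped into phrase vertices.&lt;br /&gt;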
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical], and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&lt;sub&gt;r&lt;/sub&gt; and E&lt;sub&gt;s&lt;/sub&gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges, etc.). The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &gt; beta &gt; gamma.&lt;br /&gt;
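&lt;br /&gt;
The weighted formula can be sketched in Ruby as follows (a minimal illustration with the example weights above; the method name and signature are ours, not Expertiza's):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Hypothetical sketch: scaled relevance as a weighted sum of the three match
# types, assuming alpha + beta + gamma == 1 and alpha > beta > gamma.
def weighted_relevance(phrase_match, context_match, structure_match,
                       alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end

weighted_relevance(0.8, 0.6, 0.4).round(2)  # => 0.69
```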
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex of one graph is compared with every vertex of the other to obtain the comparison results.&lt;br /&gt;
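&lt;br /&gt;
A greatly simplified sketch of this pairwise comparison (ours, not the Expertiza code; the similarity function is a toy stand-in for the WordNet-based relatedness score):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Simplified sketch of the pairwise comparison compare_vertices performs:
# every review vertex is compared with every submission vertex, and each
# review vertex's best match contributes to the overall average.
def compare_vertices(review_vertices, subm_vertices)
  best = review_vertices.map do |rv|
    subm_vertices.map { |sv| similarity(rv, sv) }.max || 0.0
  end
  best.sum / best.size.to_f
end

# Toy stand-in for the WordNet-based relatedness score used in Expertiza.
def similarity(a, b)
  a.downcase == b.downcase ? 1.0 : 0.0
end

compare_vertices(%w[code tests], %w[Code review])  # => 0.5
```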
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB or VERB-VERB and OBJECT-OBJECT comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. Unlike ''compare_edges_non_syntax_diff'', it compares edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, and VERB-OBJECT edges are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template method pattern] to improve the design of the class. Although we did not directly implement this design pattern, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the appropriate metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to obtain the appropriate metrics required for evaluating relevance. This delegates functionality to the helper classes, making a clear segregation of each class's role.&lt;br /&gt;
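&lt;br /&gt;
The delegation structure can be sketched in Ruby as follows. The class and method names follow the design described above, but the bodies are placeholders of our own, not Expertiza code:&lt;br /&gt;
&lt;br /&gt;
```ruby
# Minimal sketch of the delegation structure: the main class holds helper
# objects and delegates each metric calculation to them.
class CompareGraphVertices
  def compare_vertices(review_graph, subm_graph)
    # ...pairwise vertex comparison; returns a vertex-match metric
    1.0
  end
end

class DegreeOfRelevance
  def initialize
    @vertex_comparer = CompareGraphVertices.new
  end

  # get_relevance delegates each metric to a helper class and would combine
  # the results using the relevance formula.
  def get_relevance(review_graph, subm_graph)
    @vertex_comparer.compare_vertices(review_graph, subm_graph)
  end
end

DegreeOfRelevance.new.get_relevance(nil, nil)  # => 1.0
```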
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move common code into a re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and call that method with the appropriate parameters wherever required. It checks that an edge of the graph is not nil and that the ids of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
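&lt;br /&gt;
For illustration, the extracted check can be exercised with lightweight stand-ins for the graph classes (the Struct definitions below are hypothetical, not Expertiza's):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Hypothetical Struct stand-ins for the graph classes, for illustration only.
Vertex = Struct.new(:node_id, :name, :type)
Edge = Struct.new(:in_vertex, :out_vertex, :label)

# The re-usable check extracted during refactoring: an edge is usable when it
# is present and neither of its vertices has the invalid id -1.
def edge_nil_cond_check(edge)
  !edge.nil? && edge.in_vertex.node_id != -1 && edge.out_vertex.node_id != -1
end

good = Edge.new(Vertex.new(3), Vertex.new(7))
bad  = Edge.new(Vertex.new(-1), Vertex.new(7))
edge_nil_cond_check(good)  # => true
edge_nil_cond_check(bad)   # => false
edge_nil_cond_check(nil)   # => false
```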
&lt;br /&gt;
* '''Reduce the number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit within one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we re-wrote to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label.nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil?)) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;&amp; !edge2.label.nil?) &amp;&amp; (edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? &amp;&amp; edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
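&lt;br /&gt;
The refactored predicate could be compressed further. A minimal sketch, assuming EQUAL and DISTINCT carry the &amp;quot;divide by 1&amp;quot;/&amp;quot;divide by 2&amp;quot; values suggested by the comments above (the constant values and the LabeledEdge stand-in are our assumptions):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Assumed constant values, per the "divide by 1"/"divide by 2" comments above.
EQUAL = 1.0
DISTINCT = 2.0

# Sketch of an even terser equivalent: labels match when both are nil, or
# both are present and equal case-insensitively; anything else is DISTINCT.
def compare_labels(edge1, edge2)
  a, b = edge1.label, edge2.label
  (a.nil? && b.nil?) || (a && b && a.downcase == b.downcase) ? EQUAL : DISTINCT
end

# Hypothetical stand-in for an edge carrying only a label.
LabeledEdge = Struct.new(:label)
compare_labels(LabeledEdge.new("nsubj"), LabeledEdge.new("NSUBJ"))  # => 1.0
compare_labels(LabeledEdge.new("nsubj"), LabeledEdge.new(nil))      # => 2.0
```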
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace it with alternatives such as RSpec and Capybara. Testing is pretty important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&amp;lt;ref&amp;gt;http://www.agiledata.org/essays/tdd.html&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that the extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD with practices that enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside the development and testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] &lt;ref&gt;http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber&lt;/ref&gt; for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. Step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''', located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions, and we have written tests covering all six.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values and the actual values for the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices, and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method|&lt;br /&gt;
   actual = qty&lt;br /&gt;
   if(assert_equal(qty, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks &lt;ref&gt;https://github.com/cucumber/cucumber/wiki/Hooks&lt;/ref&gt; which allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it is run for, but we can use tagged hooks for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks are run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around, or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set = Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the test we simply need to execute, &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available, and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
Even simple steps such as submitting an assignment and performing reviews took a lot of debugging and bug fixing in various other files not related to our class DegreeOfRelevance.&lt;br /&gt;
Many routes were missing from the routes.rb file, which we fixed. Here is one of the routing errors we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
There were many errors related to the Aspell gem, which is used as a spell-checker when computing relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was 'F' before refactoring and is 'C' afterwards.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven development tool.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82232</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82232"/>
		<updated>2013-10-31T04:07:49Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student's grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation is important so that a review which is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long, complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph, which captures both governance and ordering information that is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same syntax, edges with different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in another. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges etc.). The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the formula above. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
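&lt;br /&gt;
As a minimal sketch of this weighted combination: weighted_relevance below is a hypothetical helper written for illustration, and the three match scores passed to it are made-up sample values, not actual Expertiza output.&lt;br /&gt;

```ruby
# Hypothetical helper illustrating the weighted relevance formula above.
# The default weights follow the example alpha, beta, gamma values.
def weighted_relevance(phrase_match, context_match, structure_match,
                       alpha = 0.55, beta = 0.35, gamma = 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end

# Sample scores for the phrase, context and structure matches.
puts weighted_relevance(0.8, 0.6, 0.4).round(2)
```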
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex of one graph is compared with every vertex of the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and edges of type VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. In contrast to the previous method, it compares edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled, and the relevant results are returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison done is between edges of same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
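&lt;br /&gt;
The distinction among these comparison methods comes down to which vertex types are matched against each other. A minimal sketch, using a simplified Edge stand-in rather than the actual Expertiza graph classes:&lt;br /&gt;

```ruby
# Simplified sketch (not the actual Expertiza code) of the type checks that
# separate same-syntax from cross-syntax edge comparisons. Edge is a
# stand-in whose in_type and out_type name the vertex types at either end.
Edge = Struct.new(:in_type, :out_type)

# Same-syntax comparison: e.g. SUBJECT-VERB matched with SUBJECT-VERB.
def same_syntax?(rev, subm)
  rev.in_type == subm.in_type and rev.out_type == subm.out_type
end

# Cross-syntax comparison: the vertex types at the two ends are swapped,
# e.g. SUBJECT-VERB matched with VERB-SUBJECT.
def cross_syntax?(rev, subm)
  rev.in_type == subm.out_type and rev.out_type == subm.in_type
end

subject_verb = Edge.new(:subject, :verb)
verb_subject = Edge.new(:verb, :subject)

puts same_syntax?(subject_verb, subject_verb)   # same syntax on both sides
puts cross_syntax?(subject_verb, verb_subject)  # swapped vertex types
```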
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the application calls the ''new'' method in ''response_controller.rb'', which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement the pattern directly, its idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design, which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code into methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above. It calculates the relevance based on comparisons done between submission and review graphs. We have grouped together similar functionality in one class. Following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metrics to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of the role of each class.&lt;br /&gt;
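&lt;br /&gt;
This delegation can be sketched as follows; the class and method names here are simplified stand-ins for illustration, not the actual Expertiza classes, and the vertex-overlap metric is a made-up placeholder for the real comparison logic.&lt;br /&gt;

```ruby
# Simplified sketch of the main class delegating to a helper class.
class VertexComparer
  # Placeholder metric: fraction of review vertices that also appear
  # in the submission (the real compare_vertices is far more involved).
  def compare_vertices(review_vertices, submission_vertices)
    common = review_vertices.select { |v| submission_vertices.include?(v) }
    common.length.to_f / review_vertices.length
  end
end

class RelevanceCalculator
  def initialize
    @vertex_comparer = VertexComparer.new
  end

  # The main class only combines metrics; the comparison work is delegated.
  def get_relevance(review_vertices, submission_vertices)
    @vertex_comparer.compare_vertices(review_vertices, submission_vertices)
  end
end

puts RelevanceCalculator.new.get_relevance(['model', 'view'], ['model', 'controller'])
```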
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the main types of refactoring we did in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it into a separate method and simply call that method with the appropriate parameters wherever required. It checks that an edge is not nil and that neither of its vertex ids is invalid (negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we re-wrote to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace it with alternatives such as Capybara or RSpec. Testing is important in Rails, yet many people developing in Rails either do not test their projects at all, or at best only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&amp;lt;ref&amp;gt;http://www.agiledata.org/essays/tdd.html&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that the extra work of testing seems unimportant. Or maybe there is a perceived time constraint: spending time on writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining &amp;quot;testing&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
There are two main approaches to testing:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD in a way that enables software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in a natural language, which makes them more accessible to people outside of the development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] &amp;lt;ref&amp;gt;http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber&amp;lt;/ref&amp;gt; for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber does not know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''', located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected and actual values for the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber loads any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method|&lt;br /&gt;
  actual = qty&lt;br /&gt;
  if(assert_equal(qty,@inst.compare_vertices(@vertex_match,@pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller) ))&lt;br /&gt;
  step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
  else&lt;br /&gt;
  step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
  end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup method defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews, which is then used to exercise the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
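A minimal sketch of what such a setup class might look like, using Struct-based stand-ins for Expertiza's actual vertex and edge models (the field names and sample data here are assumptions for illustration only, not the real implementation):&lt;br /&gt;

```ruby
# Hypothetical stand-ins for Expertiza's graph models (illustration only).
Vertex = Struct.new(:node_id, :name, :type)
Edge   = Struct.new(:label, :in_vertex, :out_vertex)

class DegRelSetup
  attr_reader :subm_vertices, :review_vertices, :subm_edges, :review_edges

  # Builds a tiny submission graph and review graph with known contents,
  # so the six comparison methods can be exercised against expected values.
  def setup
    s1 = Vertex.new(0, "student", "SUBJECT")
    s2 = Vertex.new(1, "submits", "VERB")
    s3 = Vertex.new(2, "assignment", "OBJECT")
    r1 = Vertex.new(0, "student", "SUBJECT")
    r2 = Vertex.new(1, "uploads", "VERB")

    @subm_vertices   = [s1, s2, s3]
    @review_vertices = [r1, r2]
    @subm_edges      = [Edge.new("nsubj", s1, s2), Edge.new("dobj", s2, s3)]
    @review_edges    = [Edge.new("nsubj", r1, r2)]
    self
  end
end
```

Each scenario can then call setup in a Before hook and compare the methods' results against values computed by hand from this small, fixed graph.&lt;br /&gt;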
&lt;br /&gt;
*Cucumber provides a number of hooks &amp;lt;ref&amp;gt;https://github.com/cucumber/cucumber/wiki/Hooks &amp;lt;/ref&amp;gt; which allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed before every scenario tagged @one.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions	&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available, and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
The original Expertiza code had many bugs, and the application threw errors at each step.&lt;br /&gt;
Even simple steps like submitting an assignment and performing reviews took a lot of debugging and bug fixing in various files unrelated to our class (degree_of_relevance.rb).&lt;br /&gt;
Many routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
There were also several errors related to the Aspell gem, which provides the spell-checker used when checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class using ideas from the Template Method design pattern, adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was previously 'F'; after refactoring, the grade is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven development tool.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82230</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82230"/>
		<updated>2013-10-31T04:05:19Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce, to assess the quality of products on sites like Amazon and eBay. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation is important to ensure that a review which is not relevant to the submission is considered invalid and does not impact a student's grade.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long, complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be refactored to reduce complexity and duplication and to introduce coding best practices. Our task in this project is to refactor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph, to capture both the governance and the ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. The graph edges created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
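The segmentation step described above can be roughly sketched as follows (the punctuation list and the grouping are simplified assumptions, not Expertiza's exact implementation; the real pipeline then runs the Stanford POS tagger on each segment):&lt;br /&gt;

```ruby
# Simplified sketch of the first step of word-order graph generation:
# divide input text into segments on a predefined punctuation list.
PUNCTUATION = /[.,;:!?]/

def segment(text)
  text.split(PUNCTUATION).map { |s| s.strip }.reject { |s| s.empty? }
end

segment("The review is helpful, but it misses the point.")
# => ["The review is helpful", "but it misses the point"]
```

Each resulting segment would then be POS-tagged and grouped into noun, verb, adjective or adverbial vertices.&lt;br /&gt;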
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this case, only single and double edges are considered, and not more contiguous edges (triple edges etc.), for text matching. The matching captures similarity across segments and it captures voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the formula above. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
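A minimal sketch of the weighted combination, using the example weights above (the three match scores are placeholder inputs standing in for the results of the phrase, context and structure matching steps):&lt;br /&gt;

```ruby
# Weighted relevance as in the modified Expertiza formula:
# relevance = alpha*phrase + beta*context + gamma*structure,
# with alpha, beta, gamma chosen so that alpha exceeds beta, which exceeds gamma.
def relevance(phrase_match, context_match, structure_match,
              alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end

relevance(0.8, 0.6, 0.4).round(2)  # => 0.69
```

With equal weights of 0.33 this reduces to the fixed-weight formula shown earlier.&lt;br /&gt;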
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex of one graph is compared with every vertex of the other to obtain the comparison results.&lt;br /&gt;
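A simplified sketch of this all-pairs comparison (the similarity measure here is a naive exact-match placeholder; the real method consults WordNet, the POS tagger and a spell-checker, and works on vertex objects rather than plain strings):&lt;br /&gt;

```ruby
# Naive all-pairs comparison: each review vertex is compared with every
# submission vertex, the best match per review vertex is kept, and the
# average of those best matches is returned.
def compare_vertices_sketch(review_vertices, subm_vertices)
  return 0.0 if review_vertices.empty? or subm_vertices.empty?
  best = review_vertices.map do |rv|
    subm_vertices.map { |sv| rv == sv ? 1.0 : 0.0 }.max
  end
  best.sum / best.length
end

compare_vertices_sketch(%w[student submits work], %w[student uploads work]).round(2)
# => 0.67
```

Two of the three review tokens have exact matches in the submission, so the averaged best-match score is two thirds.&lt;br /&gt;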
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges, i.e. edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows this flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, its idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. Following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the appropriate metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for making comparisons between edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, and this separation improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to the helper classes, making a clear segregation of each class's role.&lt;br /&gt;
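The delegation can be sketched as follows (the class and method names follow the design described above, but the method bodies are placeholder metrics rather than Expertiza's real logic, and the combination step is simplified):&lt;br /&gt;

```ruby
# Sketch of the refactored design: the main class delegates each kind of
# comparison to a dedicated helper class and combines the returned metrics.
class CompareGraphVertices
  def compare_vertices(*)
    0.8 # placeholder metric
  end
end

class CompareGraphEdges
  def compare_edges_non_syntax_diff(*)
    0.6 # placeholder
  end

  def compare_edges_syntax_diff(*)
    0.5 # placeholder
  end
end

class CompareGraphSVOEdges
  def compare_SVO_edges(*)
    0.4 # placeholder
  end

  def compare_SVO_diff_syntax(*)
    0.3 # placeholder
  end
end

class DegreeOfRelevanceSketch
  # Combines the helper-class metrics with example weights alpha=0.55,
  # beta=0.35, gamma=0.10, mirroring the weighted relevance formula.
  def get_relevance
    vertex_match = CompareGraphVertices.new.compare_vertices
    edges        = CompareGraphEdges.new
    svo          = CompareGraphSVOEdges.new
    edge_match = (edges.compare_edges_non_syntax_diff + edges.compare_edges_syntax_diff) / 2.0
    svo_match  = (svo.compare_SVO_edges + svo.compare_SVO_diff_syntax) / 2.0
    0.55 * vertex_match + 0.35 * edge_match + 0.10 * svo_match
  end
end
```

The point of the sketch is the shape of the design: the main class never computes a comparison itself, it only orchestrates the helpers and combines their results.&lt;br /&gt;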
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it to a separate method and simply call that method with the appropriate parameters wherever required. It checks that an edge is not nil and that neither of its vertex ids is invalid (negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
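With stand-in structures (hypothetical, for illustration only), the extracted guard clause behaves as:&lt;br /&gt;

```ruby
# Stand-ins for Expertiza's vertex/edge models, to demonstrate the guard clause.
GuardVertex = Struct.new(:node_id)
GuardEdge   = Struct.new(:in_vertex, :out_vertex)

# True only when the edge exists and both of its vertices have valid ids.
def edge_nil_cond_check(edge)
  !edge.nil? and edge.in_vertex.node_id != -1 and edge.out_vertex.node_id != -1
end

edge_nil_cond_check(GuardEdge.new(GuardVertex.new(3), GuardVertex.new(7)))   # => true
edge_nil_cond_check(GuardEdge.new(GuardVertex.new(-1), GuardVertex.new(7)))  # => false
edge_nil_cond_check(nil)                                                     # => false
```

Extracting this check removed the same three-part condition from every loop that iterates over review or submission edges.&lt;br /&gt;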
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-written some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample code that we have re-written to make the logic more simple and maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
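The refactored predicate can be exercised against the main label combinations. The constant values and edge structure below are stand-ins (the original comments suggest EQUAL divides by 1 and DISTINCT by 2), and the condition is written as an explicit case split for readability:&lt;br /&gt;

```ruby
# Stand-ins for the constants and the edge structure (illustration only).
EQUAL    = 1
DISTINCT = 2
LabeledEdge = Struct.new(:label)

# Same behavior as the refactored compare_labels, written as a case split:
# equal when both labels match case-insensitively or both are absent,
# distinct otherwise (including when only one label is present).
def compare_labels(edge1, edge2)
  if edge1.label.nil? and edge2.label.nil?
    EQUAL
  elsif edge1.label.nil? or edge2.label.nil?
    DISTINCT
  elsif edge1.label.downcase == edge2.label.downcase
    EQUAL
  else
    DISTINCT
  end
end

compare_labels(LabeledEdge.new("nsubj"), LabeledEdge.new("NSUBJ"))  # => EQUAL (1)
compare_labels(LabeledEdge.new("nsubj"), LabeledEdge.new(nil))      # => DISTINCT (2)
compare_labels(LabeledEdge.new(nil), LabeledEdge.new(nil))          # => EQUAL (1)
```

Enumerating the four label combinations like this also documents the behavior that the original nested version obscured.&lt;br /&gt;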
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, which can be replaced or supplemented with tools such as Capybara and RSpec. Testing is important in Rails, yet many people developing in Rails either do not test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&amp;lt;ref&amp;gt;http://www.agiledata.org/essays/tdd.html&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining &amp;quot;testing&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD with ideas that allow software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside the development and testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] &amp;lt;ref&amp;gt;http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber&amp;lt;/ref&amp;gt; for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber does not know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. The step definitions are written in Ruby, which gives great flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The main feature tests the degree of relevance by comparing the expected values with the actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object triples.&lt;br /&gt;
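The same actual-versus-expected pattern can be written as a single table-driven check over all six methods. The sketch below is illustrative only: ''RelevanceStub'' and its hard-coded return values are hypothetical stand-ins for the real class and setup data, not the actual Expertiza code.&lt;br /&gt;

```ruby
# Hypothetical table-driven version of the actual-vs-expected checks.
# RelevanceStub stands in for DegreeOfRelevance with pre-built setup data;
# the hard-coded values are placeholders for the metrics the setup produces.
class RelevanceStub
  def compare_vertices
    0.9
  end

  def compare_edges
    0.7
  end

  def compare_svo_edges
    0.5
  end
end

# One entry per method under test, paired with its expected value.
EXPECTED = {
  compare_vertices:  0.9,
  compare_edges:     0.7,
  compare_svo_edges: 0.5
}

def all_match?(instance)
  EXPECTED.all? { |meth, expected| instance.public_send(meth) == expected }
end

all_match?(RelevanceStub.new)  # => true
```

Adding a seventh method then only requires a new entry in the table, rather than a new scenario.&lt;br /&gt;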
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber will load any files in the features/step_definitions folder as steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   # @expected_value is assumed to be populated by the setup code&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if(assert_equal(@expected_value, actual))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks &amp;lt;ref&amp;gt;https://github.com/cucumber/cucumber/wiki/Hooks&amp;lt;/ref&amp;gt; which allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it runs for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, and hence the following code snippet will be executed every time before a scenario tagged @one.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the test we simply need to execute, &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that the tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application was throwing errors at every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between the submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class DegreeOfRelevance by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was 'F' before refactoring, while after refactoring it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82229</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82229"/>
		<updated>2013-10-31T04:04:15Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Testing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate], on metrics such as code complexity, duplication, lines of code per method etc. Since this file is important for the research on Expertiza, it should be re-factored to reduce complexity and duplication issues and introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results.&lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this case, only single and double edges are considered, and not more contiguous edges (triple edges etc.), for text matching. The matching captures similarity across segments and it captures voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the above formula.&lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
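As a quick sanity check, the weighted formula can be sketched in plain Ruby. The method name and keyword defaults below are illustrative, not the actual Expertiza implementation:&lt;br /&gt;

```ruby
# Illustrative sketch of the weighted relevance formula. phrase, context and
# structure are the three match averages described in the theory section;
# alpha carries the largest weight, gamma the smallest, and they sum to 1.
def weighted_relevance(phrase, context, structure, alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase + beta * context + gamma * structure
end

# With identical match scores the result equals that common score, and a
# strong phrase match outweighs an equally strong structure match.
weighted_relevance(1.0, 1.0, 1.0)
```

The keyword arguments make it easy to experiment with different weight assignments while keeping the default rule alpha largest and gamma smallest.&lt;br /&gt;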
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
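The exhaustive pairing described above can be sketched as follows. ''Vertex'' and ''match_score'' are hypothetical stand-ins for the real graph vertex class and the WordNet-based similarity computation:&lt;br /&gt;

```ruby
# Hypothetical sketch of exhaustive vertex comparison: every review vertex is
# scored against every submission vertex, and the best score per review vertex
# is kept. match_score is a placeholder for the real similarity metric.
Vertex = Struct.new(:name)

def match_score(a, b)
  a.name == b.name ? 1.0 : 0.0  # placeholder similarity metric
end

def best_vertex_matches(review_vertices, submission_vertices)
  review_vertices.map do |rv|
    submission_vertices.map { |sv| match_score(rv, sv) }.max
  end
end

rev  = [Vertex.new('code'), Vertex.new('review')]
subm = [Vertex.new('review'), Vertex.new('design')]
best_vertex_matches(rev, subm)  # => [0.0, 1.0]
```

The real method also carries the POS tagger, spell-checker and vertex counts as parameters, but the nested iteration over all pairs is the core of it.&lt;br /&gt;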
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. Unlike ''compare_edges_non_syntax_diff'', it compares edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to perform some setup. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above. It calculates the relevance based on comparisons done between submission and review graphs. We have grouped together similar functionality in one class. Following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods ''compare_edges_non_syntax_diff'' and ''compare_edges_syntax_diff'' are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, ''compare_vertices''. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods ''compare_SVO_edges'' and ''compare_SVO_diff_syntax'' are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
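The delegation in the figure can be sketched with stripped-down classes. The class names mirror the refactored design, but the method bodies and the equal-weight combination are placeholders, not the real matching logic:&lt;br /&gt;

```ruby
# Stripped-down sketch of the refactored design: DegreeOfRelevance delegates
# each comparison to a helper class and combines the returned metrics.
# The helper bodies below are placeholders, not the real matching logic.
class CompareGraphVertices
  def compare_vertices(review, submission)
    0.75  # placeholder metric
  end
end

class CompareGraphEdges
  def compare_edges(review, submission)
    0.5   # placeholder metric
  end
end

class CompareGraphSVOEdges
  def compare_svo_edges(review, submission)
    0.25  # placeholder metric
  end
end

class DegreeOfRelevance
  def get_relevance(review, submission)
    v = CompareGraphVertices.new.compare_vertices(review, submission)
    e = CompareGraphEdges.new.compare_edges(review, submission)
    s = CompareGraphSVOEdges.new.compare_svo_edges(review, submission)
    (v + e + s) / 3.0  # equal weights, for illustration only
  end
end

DegreeOfRelevance.new.get_relevance(:review_graph, :submission_graph)  # => 0.5
```

Each helper can now be changed or tested in isolation, while DegreeOfRelevance only orchestrates the calls.&lt;br /&gt;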
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it to a separate method and call that method with appropriate parameters wherever required. It checks that an edge of the graph is not nil and that the node ids of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
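A quick way to see the helper's behaviour is to call it with throwaway structs. ''Vertex'' and ''Edge'' below are hypothetical stand-ins for the real graph classes; a node_id of -1 marks an invalid vertex, as in the original code.&lt;br /&gt;

```ruby
# Usage sketch of edge_nil_cond_check with stand-in structs.
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

def edge_nil_cond_check(edge)
  return false if edge.nil?
  edge.in_vertex.node_id != -1 and edge.out_vertex.node_id != -1
end

edge_nil_cond_check(Edge.new(Vertex.new(3), Vertex.new(7)))   # => true
edge_nil_cond_check(Edge.new(Vertex.new(-1), Vertex.new(7)))  # => false
edge_nil_cond_check(nil)                                      # => false
```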
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic.''' &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we re-wrote to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
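A self-contained version of the same logic can be exercised directly. ''Edge'' is a hypothetical stand-in, and the EQUAL and DISTINCT values follow the ''divide by 1'' / ''divide by 2'' comments in the original code; the guard-clause style is an alternative way to keep the nil cases explicit:&lt;br /&gt;

```ruby
# Self-contained check of the compare_labels logic with a stand-in Edge.
# EQUAL and DISTINCT follow the 'divide by 1' / 'divide by 2' comments
# in the original code.
EQUAL    = 1
DISTINCT = 2
Edge = Struct.new(:label)

def compare_labels(edge1, edge2)
  a = edge1.label
  b = edge2.label
  return EQUAL if a.nil? and b.nil?    # both labels missing
  return DISTINCT if a.nil? or b.nil?  # only one label missing
  a.downcase == b.downcase ? EQUAL : DISTINCT
end

compare_labels(Edge.new('NMOD'), Edge.new('nmod'))  # => 1
compare_labels(Edge.new('NMOD'), Edge.new(nil))     # => 2
compare_labels(Edge.new(nil), Edge.new(nil))        # => 1
```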
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace it with Capybara, RSpec etc. Testing is pretty important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&amp;lt;ref&amp;gt;http://www.agiledata.org/essays/tdd.html&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time on writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD, enabling software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in natural language, making them more accessible to people outside the development and testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] &amp;lt;ref&amp;gt;http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber&amp;lt;/ref&amp;gt; for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives us great flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''', located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values and the actual values of the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all the six major modules of degree of relevance like comparing edges, vertices and subject-verb objects.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for undefined steps, provided by Cucumber. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   # @expected is prepared for the method under test in the setup&lt;br /&gt;
   if actual == @expected&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return (?:true|false)$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks &amp;lt;ref&amp;gt;https://github.com/cucumber/cucumber/wiki/Hooks &amp;lt;/ref&amp;gt; which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it is run for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks will be run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel_set.feature	   #This will execute the feature along with its step definitions	&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. Overall, the effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class DegreeOfRelevance by using ideas from the Template design pattern and adapting its implementation according to our needs.&lt;br /&gt;
According to Code Climate, the grade for the previous class was 'F', while after refactoring the grade is 'C'.&lt;br /&gt;
Also, we tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82227</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82227"/>
		<updated>2013-10-31T04:02:38Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Testing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important because a review that is not relevant to the submission is considered invalid and should not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph to capture both the governance and ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. The graph edges that are subsequently created are labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
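&lt;br /&gt;
The graph-construction step can be sketched in Ruby as follows. This is a minimal, hypothetical sketch: the Vertex/Edge structures, the phrase-merging rule, and the method name are illustrative assumptions, not the actual Expertiza implementation.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Hypothetical sketch of word-order graph construction.
# Vertex, Edge and build_word_order_graph are illustrative names,
# not the real Expertiza classes.
Vertex = Struct.new(:name, :type)
Edge   = Struct.new(:in_vertex, :out_vertex)

# tagged_tokens is assumed to be POS-tagged input, e.g. [["the", :noun], ...]
def build_word_order_graph(tagged_tokens)
  vertices = []
  edges = []
  tagged_tokens.each do |word, pos|
    if !vertices.empty? && vertices.last.type == pos
      # contiguous words with the same part of speech are grouped into one phrase vertex
      vertices.last.name += " #{word}"
    else
      vertices << Vertex.new(word.dup, pos)
      # an edge to the previous vertex preserves word order
      edges << Edge.new(vertices[-2], vertices[-1]) if vertices.size > 1
    end
  end
  [vertices, edges]
end
```
&lt;br /&gt;
For the tagged input "the dog barks", this sketch would produce a phrase vertex for "the dog", a vertex for "barks", and one ordering edge between them.&lt;br /&gt;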
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalizations] (the noun form of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this case, only single and double edges are considered for text matching, and not longer runs of contiguous edges (triple edges etc.). The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is the single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form, along with other important parameters. The algorithm is broken down into different parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to different types of matches, instead of the fixed value of 0.33 in the formula above. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
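&lt;br /&gt;
As a worked sketch, the weighted formula can be written directly in Ruby. The method name and the keyword defaults below are assumptions for illustration; the real ''get_relevance'' obtains the three match scores from the graph comparisons described next.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Illustrative sketch of the weighted relevance formula (not the actual
# Expertiza method). The default weights follow the example values above.
def scaled_relevance(phrase_match, context_match, structure_match,
                     alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end
```
&lt;br /&gt;
With alpha &amp;gt; beta &amp;gt; gamma, phrase/token matches contribute most to the final score, and structure matches contribute least.&lt;br /&gt;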
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges, i.e. edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the application calls the ''new'' method in ''response_controller.rb'', which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design, which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes to segregate functionality, making a logical separation of the code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on the comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. Following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metrics to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices, whose sole responsibility is comparing vertices and calculating the appropriate metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for making comparisons between edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of creating such a design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the appropriate metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of the role of each class.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code which was present in many methods. We moved it to a separate method and simply call this method with appropriate parameters wherever required. It checks that an edge is not nil and that the ids of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
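&lt;br /&gt;
A quick way to see the helper in action is with stand-in objects. The Struct stubs below are assumptions for illustration; in Expertiza the real Vertex and Edge objects come from the models.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Struct stand-ins for the real model objects (illustrative only).
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

# the extracted helper from the refactored class
def edge_nil_cond_check(edge)
  !edge.nil? && edge.in_vertex.node_id != -1 && edge.out_vertex.node_id != -1
end

valid_edge   = Edge.new(Vertex.new(3), Vertex.new(7))   # both vertex ids valid
invalid_edge = Edge.new(Vertex.new(-1), Vertex.new(7))  # in_vertex id marks it invalid
```
&lt;br /&gt;
The helper returns true only for valid_edge; nil edges and edges with a -1 vertex id are rejected, exactly the condition that was previously duplicated inline.&lt;br /&gt;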
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
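&lt;br /&gt;
To convince ourselves that the simplified method preserves the original truth table, it can be exercised with stub edges. The Edge stub and the constant values below (taken from the &amp;quot;divide by 1&amp;quot;/&amp;quot;divide by 2&amp;quot; comments in the original) are assumptions for illustration.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stub setup for exercising compare_labels in isolation (illustrative only;
# EQUAL/DISTINCT values are assumed from the "divide by 1"/"divide by 2" comments).
EQUAL    = 1
DISTINCT = 2
Edge = Struct.new(:label)

def compare_labels(edge1, edge2)
  # labels are EQUAL when both are present and match case-insensitively,
  # or when both are missing; every other combination is DISTINCT
  if ((!edge1.label.nil? && !edge2.label.nil?) &&
      (edge1.label.downcase == edge2.label.downcase)) ||
     (edge1.label.nil? && edge2.label.nil?)
    EQUAL
  else
    DISTINCT
  end
end
```
&lt;br /&gt;
Matching labels (ignoring case) and two missing labels both yield EQUAL; a single missing label or differing labels yield DISTINCT, the same behavior as the longer original.&lt;br /&gt;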
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Reduced the maximum lines of code per method from 75 to 25, making the code more readable and maintainable.&lt;br /&gt;
* Improved the design of the classes by making a logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace or augment it with tools such as RSpec and Capybara. Testing is pretty important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done using two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD in a way that enables software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] &amp;lt;ref&amp;gt;http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber&amp;lt;/ref&amp;gt; for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions explaining what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by finding the comparison between the expected values and the actual values of the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all the six major modules of degree of relevance like comparing edges, vertices and subject-verb objects.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for undefined steps, provided by Cucumber. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method_name|&lt;br /&gt;
  actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
  if(qty.to_i == actual)&lt;br /&gt;
   step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
  else&lt;br /&gt;
   step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
  end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
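The Given/When/Then patterns above are ordinary Ruby regular expressions. As a quick illustration (outside Cucumber, with a hypothetical pattern matching the feature step shown earlier), the capture group picks the quoted method name out of the step text:&lt;br /&gt;

```ruby
# Hypothetical step pattern; Cucumber passes each capture group to the block.
step_pattern = /^I compare actual value and expected value of "(\S+)"$/
step_text = 'I compare actual value and expected value of "compare_vertices"'
match = step_pattern.match(step_text)
method_name = match[1]
```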
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the 6 different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks &amp;lt;ref&amp;gt;https://github.com/cucumber/cucumber/wiki/Hooks &amp;lt;/ref&amp;gt; which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it is run for, but tagged hooks can be used when we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the same order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
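Registration order is part of the contract: Before hooks fire in the order they were defined. A tiny plain-Ruby illustration of that behavior, standing in for Cucumber's hook registry:&lt;br /&gt;

```ruby
# Simulate Cucumber running Before hooks in registration order.
order = []
hooks = []
hooks.push(lambda { order.push(:first) })
hooks.push(lambda { order.push(:second) })
hooks.each { |hook| hook.call }
```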
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, and hence the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs and the application was throwing a lot of errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and fixing bugs in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class DegreeOfRelevance by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was previously 'F'; after refactoring, it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82226</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82226"/>
		<updated>2013-10-31T04:01:36Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Testing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate], on metrics such as code complexity, duplication, lines of code per method etc. Since this file is important for the research on Expertiza, it should be re-factored to reduce complexity and duplication issues and introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
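The segmentation step can be sketched in a few lines; the punctuation list below is an assumption for illustration, not Expertiza's actual list:&lt;br /&gt;

```ruby
# Split text into segments on a predefined punctuation list (assumed here),
# dropping empty fragments and surrounding whitespace.
PUNCTUATION = /[.,;:!?]/
def segment(text)
  text.split(PUNCTUATION).map { |s| s.strip }.reject { |s| s.empty? }
end
```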
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this case only single and double edges are considered, not longer runs of contiguous edges (triple edges etc.), for text matching. The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
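The weighted combination can be written directly; the weights below follow the example values above, and the method name scaled_relevance is illustrative rather than Expertiza's:&lt;br /&gt;

```ruby
# Weighted relevance: alpha, beta and gamma scale the phrase, context and
# sentence-structure match scores respectively (defaults follow the text).
def scaled_relevance(phrase, context, structure, alpha = 0.55, beta = 0.35, gamma = 0.1)
  alpha * phrase + beta * context + gamma * structure
end
```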
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex of one graph is compared with every vertex of the other to obtain the comparison results.&lt;br /&gt;
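The all-pairs structure of this comparison can be sketched as below; the similarity function is a stub (case-insensitive string equality) rather than the WordNet-based similarity the class actually uses:&lt;br /&gt;

```ruby
# For each review vertex, find the best match score against all submission
# vertices. The real code scores pairs with WordNet similarity; this stub
# scores 1.0 on a case-insensitive exact match and 0.0 otherwise.
def best_matches(review_vertices, subm_vertices)
  review_vertices.map do |rv|
    subm_vertices.map { |sv| rv.downcase == sv.downcase ? 1.0 : 0.0 }.max
  end
end
```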
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. Unlike the previous method, it compares edges whose syntactic types differ.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison is between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look into what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user reviews an existing piece of work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not directly implement this design pattern, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in one class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of submission graph with every vertex of review graph. This class contains only one method compare_vertices. Its sole responsibility is comparing vertices and calculating the appropriate metrics required by the main class(DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the appropriate metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of the role of each class.&lt;br /&gt;
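The delegation can be sketched as follows; the method bodies are stubs and the real classes take graph, tagger and speller arguments, so this only shows the shape of the design:&lt;br /&gt;

```ruby
# Skeleton of the main class deferring comparisons to helper classes.
class CompareGraphVertices
  def compare_vertices(rev, subm)
    0.5 # stub metric; the real method compares all vertex pairs
  end
end

class CompareGraphEdges
  def compare_edges_non_syntax_diff(rev, subm)
    0.4 # stub metric
  end
end

class DegreeOfRelevance
  def get_relevance(rev, subm)
    vertices = CompareGraphVertices.new.compare_vertices(rev, subm)
    edges = CompareGraphEdges.new.compare_edges_non_syntax_diff(rev, subm)
    (vertices + edges) / 2.0 # stub combination, not the actual weighted formula
  end
end
```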
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and now call that method with appropriate parameters wherever required. It checks that an edge of the graph is not nil and that the ids of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
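The extracted guard can be exercised on its own with hypothetical stand-ins for the vertex and edge classes:&lt;br /&gt;

```ruby
# Vertex and GraphEdge are hypothetical Structs used only for illustration;
# Expertiza's own classes carry node_id, in_vertex and out_vertex the same way.
Vertex = Struct.new(:node_id)
GraphEdge = Struct.new(:in_vertex, :out_vertex)

def edge_nil_cond_check(edge)
  return false if edge.nil?
  edge.in_vertex.node_id != -1 and edge.out_vertex.node_id != -1
end
```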
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label.nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil?)) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
&lt;br /&gt;
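The refactored predicate can be checked in isolation. In this sketch the ''Edge'' struct and the EQUAL/DISTINCT values are stand-ins for Expertiza's real classes and constants (the comments in the original code suggest EQUAL divides a match score by 1 and DISTINCT by 2, so the values 1 and 2 are assumed here); an equivalent guard-style cascade makes the four cases explicit:&lt;br /&gt;

```ruby
# Stand-ins (assumed) for Expertiza's edge objects and label constants.
Edge = Struct.new(:label)
EQUAL = 1     # divide match score by 1
DISTINCT = 2  # divide match score by 2

def compare_labels(edge1, edge2)
  if edge1.label.nil? and edge2.label.nil?
    EQUAL       # both labels missing: treat as equal
  elsif edge1.label.nil? or edge2.label.nil?
    DISTINCT    # exactly one label missing
  elsif edge1.label.downcase == edge2.label.downcase
    EQUAL       # case-insensitive match
  else
    DISTINCT
  end
end

compare_labels(Edge.new("nsubj"), Edge.new("NSUBJ"))  # EQUAL
compare_labels(Edge.new(nil), Edge.new(nil))          # EQUAL
compare_labels(Edge.new("nsubj"), Edge.new(nil))      # DISTINCT
```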
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation Automated Testing] go hand in hand. Rails ships with a built-in test framework, and if we want we can replace it with alternatives such as Capybara or RSpec. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all or, at best, only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that the extra work of testing seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining &amp;quot;test&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But, it also ensures that the developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD, enabling software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] &amp;lt;ref&amp;gt;http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;https://github.com/cucumber/cucumber/wiki/Hooks &amp;lt;/ref&amp;gt; for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber cannot interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. Step definitions are written in Ruby, which gives a lot of flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major methods. We have written tests covering all six.&lt;br /&gt;
The major feature is to test the degree of relevance by finding the comparison between the expected values and the actual values of the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major methods of degree of relevance, such as comparing edges, vertices and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the steps they contain. Cucumber will load any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 &lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   # @expected is assumed to be populated by the setup code&lt;br /&gt;
   if assert_equal(@expected, actual)&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 &lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates submission and review graphs and then executes the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
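A minimal sketch of what such a setup could look like, assuming simple Struct stand-ins for Expertiza's vertex objects (the real setup builds full word-order graphs, so every name and shape below beyond the class name is an assumption):&lt;br /&gt;

```ruby
# Hypothetical fixture; Vertex is a stand-in for Expertiza's vertex class.
Vertex = Struct.new(:node_id, :name, :type)

class Deg_rel_setup
  attr_reader :review_vertices, :subm_vertices, :vertex_match

  def setup
    @review_vertices = [Vertex.new(0, "code", "NOUN"),
                        Vertex.new(1, "clean", "ADJ")]
    @subm_vertices   = [Vertex.new(0, "code", "NOUN")]
    # vertex_match[i][j] caches the similarity of review vertex i to
    # submission vertex j; real values come from WordNet-based matching.
    @vertex_match = Array.new(@review_vertices.length) do
      Array.new(@subm_vertices.length, 0.0)
    end
  end
end

setup_obj = Deg_rel_setup.new
setup_obj.setup
setup_obj.vertex_match.length  # one row per review vertex
```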
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it runs for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute: &lt;br /&gt;
 rake db:migrate                           #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature         #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs and the application was throwing a lot of errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and fixing bugs in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which the spell-checker needs in order to check relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. Overall, the effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class before refactoring was 'F', while after refactoring it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82225</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82225"/>
		<updated>2013-10-31T04:00:53Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon and eBay. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student's grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate], on metrics such as code complexity, duplication, lines of code per method etc. Since this file is important for the research on Expertiza, it should be re-factored to reduce complexity and duplication issues and introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
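As an illustration of the segmentation step described above, here is a minimal sketch; the punctuation list is an assumption, and the algorithm in the figure covers the subsequent phrase grouping:&lt;br /&gt;

```ruby
# Illustrative only: split text into segments on a predefined punctuation
# list, the first step of word-order graph generation. The actual list
# used by Expertiza may differ.
PUNCTUATION = [".", ",", ";", ":", "!", "?"]

def segment(text)
  text.split(Regexp.union(PUNCTUATION))  # union escapes "." and "?" safely
      .map { |part| part.strip }
      .reject { |part| part.empty? }
end

segment("The cat sat. It purred, loudly.")
# ["The cat sat", "It purred", "loudly"]
```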
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same and with different syntax, as well as edges of different types, across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in another. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges etc.). The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
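The weighted combination above can be sketched as follows; the method name and keyword defaults are illustrative, and Expertiza's actual implementation lives in ''get_relevance'':&lt;br /&gt;

```ruby
# Illustrative sketch of the weighted-relevance formula. phrase_match,
# context_match and structure_match are the averages produced by the three
# matching stages; alpha, beta and gamma are their weights and should sum to 1.
def scaled_relevance(phrase_match, context_match, structure_match,
                     alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end

scaled_relevance(0.8, 0.6, 0.4).round(2)  # 0.69
```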
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices from across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. Unlike the previous method, it compares edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison done is between edges of same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the application calls the ''new'' method in ''response_controller.rb'', which renders the page where the user fills in the review. On submission, it calls the ''create'' method of the same file, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, its idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and call this method with appropriate parameters wherever required. It checks that an edge is not nil and that the ids of its vertices are valid (not -1).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
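The guard can be exercised on its own with Struct stand-ins for the graph classes; their shapes below are assumptions inferred from the call sites in this article:&lt;br /&gt;

```ruby
# Stand-ins (assumed) for Expertiza's graph vertex/edge classes.
Vertex = Struct.new(:node_id, :name, :type)
GraphEdge = Struct.new(:in_vertex, :out_vertex, :label)

def edge_nil_cond_check(edge)
  return false if edge.nil?
  edge.in_vertex.node_id != -1 and edge.out_vertex.node_id != -1
end

valid_edge   = GraphEdge.new(Vertex.new(3, "cat", "SUBJECT"),
                             Vertex.new(7, "sat", "VERB"), nil)
invalid_edge = GraphEdge.new(Vertex.new(-1, "", "SUBJECT"),
                             Vertex.new(7, "sat", "VERB"), nil)

edge_nil_cond_check(valid_edge)    # true
edge_nil_cond_check(invalid_edge)  # false
edge_nil_cond_check(nil)           # false
```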
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
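The extracted helpers such as sum_cou_symm_calculation are referenced above but not shown in this excerpt. Below is a hedged, self-contained sketch of what one of these accumulation helpers might look like; the method name, argument shapes, and the vertex_match matrix layout are assumptions inferred from the pre-refactoring code, not the actual implementation.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Illustrative sketch only: sums the non-nil vertex-match scores for
# the given vertex-id pairings and counts how many contributed.
# vertex_match is assumed to be a matrix indexed by node ids.
def sum_cou_calculation(vertex_match, rev_ids, subm_ids)
  scores = rev_ids.zip(subm_ids).map { |r, s| vertex_match[r][s] }.compact
  [scores.size, scores.sum(0.0)]
end

# Example with a tiny 2x2 match matrix:
vm = [[nil, 0.5], [0.3, nil]]
cou, sum = sum_cou_calculation(vm, [0, 1], [1, 0])
# cou is 2; sum is 0.5 + 0.3
```
&lt;br /&gt;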
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we re-wrote to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label.nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil?)) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
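The refactored predicate can be exercised stand-alone. In the sketch below the Struct stand-in and the numeric values of EQUAL and DISTINCT are assumptions made purely for illustration; only the combined condition mirrors the refactored method.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stand-ins so the predicate runs outside Expertiza (values assumed).
EQUAL = 1
DISTINCT = 2
LabeledEdge = Struct.new(:label)

def compare_labels(edge1, edge2)
  same_text = (!edge1.label.nil? and !edge2.label.nil? and
               edge1.label.downcase == edge2.label.downcase)
  both_nil  = (edge1.label.nil? and edge2.label.nil?)
  (same_text or both_nil) ? EQUAL : DISTINCT
end

compare_labels(LabeledEdge.new("Subject"), LabeledEdge.new("subject")) # EQUAL
compare_labels(LabeledEdge.new(nil), LabeledEdge.new("verb"))          # DISTINCT
compare_labels(LabeledEdge.new(nil), LabeledEdge.new(nil))             # EQUAL
```
&lt;br /&gt;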
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete refactoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace or supplement it with tools such as RSpec and Capybara. Testing is pretty important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that the extra work of testing seems unimportant. Or maybe there is a perceived time constraint: spending time on writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining "test" as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But, it also ensures that the developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in natural language, making them more accessible to people outside the development and testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn't know how to interpret the features by itself, the next step is to create step definitions that tell it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests exercising all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by finding the comparison between the expected values and the actual values of the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices, and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for undefined steps, provided by Cucumber. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method|&lt;br /&gt;
   actual = qty&lt;br /&gt;
   if(assert_equal(qty, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it is run for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged with @one.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the test we simply need to execute, &lt;br /&gt;
 rake db:migrate                                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature                 #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had many bugs, and the application threw errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. Overall, the effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class DegreeOfRelevance by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class before refactoring was 'F', while after refactoring it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven development tool.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82224</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82224"/>
		<updated>2013-10-31T03:59:12Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and reviews (or meta-reviews) for an assignment. This evaluation is important to ensure that a review which is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be refactored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective, or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
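As a small illustration of the segmentation step described above, a minimal sketch in Ruby might look like the following; the punctuation list here is an assumption for illustration, and the actual list used by Expertiza may differ.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Split raw text into segments on a small, assumed punctuation list,
# mirroring the first step of word-order graph generation.
PUNCTUATION = /[.,;:!?]/

def segment(text)
  text.split(PUNCTUATION).map { |s| s.strip }.reject { |s| s.empty? }
end

segment("The reviewer liked the design, but the tests were missing.")
# ["The reviewer liked the design", "but the tests were missing"]
```
&lt;br /&gt;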
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective, or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical], and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
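The averaging over best edge matches can be sketched as follows. The score matrix is hypothetical; in the actual code the scores come from the vertex and label comparisons, and the best-match-then-average shape is inferred from the compare_edges code shown later on this page.&lt;br /&gt;
&lt;br /&gt;
```ruby
# For each review edge, take its best available match against all
# submission edges, then average those best scores. nil entries stand
# for pairs with no computed match (e.g. frequent words).
def average_best_match(scores)
  best = scores.map { |row| row.compact.max }.compact
  return 0.0 if best.empty?
  best.sum(0.0) / best.size
end

average_best_match([[0.2, 0.8], [0.4, nil]]) # (0.8 + 0.4) / 2
```
&lt;br /&gt;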
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges etc.). This matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
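A minimal worked example of the weighted combination, using the example weight values above; the match scores passed in are made up purely for illustration.&lt;br /&gt;
&lt;br /&gt;
```ruby
# relevance = alpha * phrase + beta * context + gamma * structure
ALPHA = 0.55
BETA  = 0.35
GAMMA = 0.10

def relevance(phrase_match, context_match, structure_match)
  ALPHA * phrase_match + BETA * context_match + GAMMA * structure_match
end

relevance(0.6, 0.4, 0.2) # 0.55*0.6 + 0.35*0.4 + 0.10*0.2 = 0.49
```
&lt;br /&gt;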
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
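The all-pairs comparison can be pictured with a toy similarity metric. The real metric uses WordNet-based relatedness; the case-insensitive equality below is purely a stand-in.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Score every review vertex against every submission vertex,
# producing a match matrix like the one the edge comparisons consume.
def similarity(a, b)
  a.downcase == b.downcase ? 1.0 : 0.0 # trivial stand-in metric
end

def all_pairs_match(review_vertices, subm_vertices)
  review_vertices.map do |r|
    subm_vertices.map { |s| similarity(r, s) }
  end
end

all_pairs_match(["Design", "tests"], ["design", "code"])
# [[1.0, 0.0], [0.0, 0.0]]
```
&lt;br /&gt;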
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. The edges compared contain the same vertex types, but in a different syntactic order.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside models and to bring control to this file, the tester needs to undertake some tasks. The diagram below gives a quick look into what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, it calls ''new'' method in ''response_controller.rb'' which renders the page where user fills up the review. On submission, it calls ''create'' method of the same file, which then redirects to ''saving'' method in the same file again. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'' which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining a skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above. It calculates the relevance based on comparisons done between submission and review graphs. We have grouped together similar functionality in one class. Following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of creating such a design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move common code out into re-usable methods:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it to a separate method and call this method with the appropriate parameters wherever required. It implements a nil check on an edge of the graph and verifies that the vertex ids are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
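The extracted guard can be exercised on its own with Struct stand-ins for Expertiza's vertex and edge objects; the stand-ins below are illustrative, not the real classes.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Struct stand-ins so the guard runs outside Expertiza.
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

def edge_nil_cond_check(edge)
  !edge.nil? and edge.in_vertex.node_id != -1 and edge.out_vertex.node_id != -1
end

edge_nil_cond_check(Edge.new(Vertex.new(3), Vertex.new(7)))  # true
edge_nil_cond_check(Edge.new(Vertex.new(-1), Vertex.new(7))) # false
edge_nil_cond_check(nil)                                     # false
```
&lt;br /&gt;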
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
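&lt;br /&gt;
For reference, here is one plausible reading of the extracted helper ''sum_cou_asymm_calculation'', reconstructed from the pre-refactoring branch it replaces; the Vertex and Edge structs below are stand-ins for Expertiza's real graph classes, so treat this as a sketch rather than the actual implementation.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stand-in structs approximating the shape of Expertiza's graph classes.
Vertex = Struct.new(:node_id, :name, :type)
Edge   = Struct.new(:in_vertex, :out_vertex, :label)

# Sums the available vertex-match scores for an S-V with S-V style pairing,
# skipping nil entries (frequent words have no vertex_match value).
def sum_cou_asymm_calculation(vertex_match, rev_edge, subm_edge)
  sum = 0.0
  cou = 0
  [[rev_edge.in_vertex, subm_edge.out_vertex],
   [rev_edge.out_vertex, subm_edge.in_vertex]].each do |r, s|
    m = vertex_match[r.node_id][s.node_id]
    next if m.nil?
    sum += m
    cou += 1
  end
  return cou, sum
end
```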
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is sample code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
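&lt;br /&gt;
The same predicate could be collapsed even further; the sketch below is not part of the project code, and it assumes labels are plain strings or nil (to_s turns nil into an empty string, so two nil labels still compare as equal), with EQUAL and DISTINCT given the divisor values suggested by the comments above.&lt;br /&gt;
&lt;br /&gt;
```ruby
EQUAL    = 1 # divide by 1
DISTINCT = 2 # divide by 2

# Both labels nil, or both present and equal ignoring case, count as EQUAL;
# every other combination is DISTINCT.
def compare_labels(edge1, edge2)
  if edge1.label.to_s.downcase == edge2.label.to_s.downcase
    EQUAL
  else
    DISTINCT
  end
end
```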
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation Automated Testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace or augment it with tools such as RSpec and Capybara. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that the extra work of testing seems unimportant. Or maybe there is a perceived time constraint: writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of developing carelessly.&lt;br /&gt;
Testing can be done with two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] applies the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside the development and testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it encounters each step in a scenario. Step definitions are written in Ruby, which gives a lot of flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six of them.&lt;br /&gt;
The major feature is to test the degree of relevance by finding the comparison between the expected values and the actual values of the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber loads any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   #@expected is prepared for the method under test by the setup code&lt;br /&gt;
   if actual == @expected&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return (true|false)/ do |result|&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates graphs of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
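&lt;br /&gt;
A hypothetical sketch of the fixture data such a setup might build is shown below; the Vertex and Edge structs and their field names are our assumptions, standing in for Expertiza's real graph classes.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stand-in structs for the graph classes used by degree_of_relevance.rb.
Vertex = Struct.new(:node_id, :name, :type)
Edge   = Struct.new(:in_vertex, :out_vertex, :label)

class DegRelSetup
  attr_reader :review_edges, :submission_edges

  # Build a tiny review graph and submission graph sharing some structure.
  def setup
    subj = Vertex.new(0, "student", :SUBJECT)
    verb = Vertex.new(1, "submits", :VERB)
    obj  = Vertex.new(2, "assignment", :OBJECT)
    @review_edges     = [Edge.new(subj, verb, nil), Edge.new(verb, obj, nil)]
    @submission_edges = [Edge.new(subj, verb, nil)]
  end
end
```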
&lt;br /&gt;
*Cucumber provides a number of hooks that allow us to run blocks of code at various points in the test cycle. We can put them in the '''support/env.rb''' file or in any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed before every scenario tagged @one.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute:&lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. Overall, the effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class, DegreeOfRelevance, by borrowing ideas from the Template Method design pattern and adapting them to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was previously 'F'; after refactoring it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;ref&amp;gt;http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber&amp;lt;/ref&amp;gt;&lt;br /&gt;
&amp;lt;ref&amp;gt;https://github.com/cucumber/cucumber/wiki/Hooks &amp;lt;/ref&amp;gt;&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82223</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82223"/>
		<updated>2013-10-31T03:58:52Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and reviews (or metareviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph, to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
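&lt;br /&gt;
The segmentation step described above can be sketched as follows; the punctuation list here is illustrative, not necessarily the one Expertiza uses.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Split input text into segments on a predefined punctuation list.
PUNCTUATION = /[.,;:!?]/

def segment(text)
  text.split(PUNCTUATION).map { |s| s.strip }.reject { |s| s.empty? }
end
```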
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this case only single and double edges, not longer runs of contiguous edges (triple edges etc.), are considered for text matching. The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to different types of matches, instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
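&lt;br /&gt;
In code, the weighted combination amounts to something like the sketch below; the method and parameter names are ours, and the three inputs are the phrase, context and structure match averages described earlier.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Weighted relevance: alpha, beta and gamma weight the three match types.
# The fixed-weight version of the formula uses 0.33 for each.
def scaled_relevance(phrase_match, context_match, structure_match,
                     alpha = 0.55, beta = 0.35, gamma = 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end
```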
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT with VERB-VERB edges, i.e. edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison is between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, it calls ''new'' method in ''response_controller.rb'' which renders the page where user fills up the review. On submission, it calls ''create'' method of the same file, which then redirects to ''saving'' method in the same file again. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'' which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not directly implement this design pattern, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design, which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class, DegreeOfRelevance, calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together here. This ensures the class is responsible only for comparing graph edges and returning the appropriate metric to the main class, DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices; its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together here.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class (DegreeOfRelevance) calls the helper methods in the classes to the right to obtain the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
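&lt;br /&gt;
The delegation can also be pictured as a toy sketch; the class and method names follow the text above, but the bodies are placeholders rather than the real metrics.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Each helper class owns one kind of comparison; values are placeholders.
class CompareGraphVertices
  def compare_vertices(rev, subm)
    0.5 # placeholder metric
  end
end

class CompareGraphEdges
  def compare_edges_non_syntax_diff(rev, subm)
    0.4 # placeholder metric
  end
end

class CompareGraphSVOEdges
  def compare_SVO_edges(rev, subm)
    0.3 # placeholder metric
  end
end

# The main class only combines the metrics its helpers return.
class DegreeOfRelevance
  def get_relevance(rev, subm)
    v = CompareGraphVertices.new.compare_vertices(rev, subm)
    e = CompareGraphEdges.new.compare_edges_non_syntax_diff(rev, subm)
    s = CompareGraphSVOEdges.new.compare_SVO_edges(rev, subm)
    (v + e + s) / 3.0
  end
end
```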
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Moved common code into a re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it into a separate method and call that method with the appropriate parameters wherever required. It checks that an edge is not nil and that its vertex ids are valid (non-negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduced the number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
 end #end of the method&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a code sample that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label.nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil?)) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
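The refactored predicate can be exercised standalone. In the sketch below, EQUAL and DISTINCT are given the placeholder values 1 and 2 (matching the &amp;quot;divide by 1&amp;quot;/&amp;quot;divide by 2&amp;quot; comments above), and the Edge struct is a hypothetical stand-in for the real edge class:&lt;br /&gt;

```ruby
EQUAL    = 1  # divide by 1 (placeholder value for illustration)
DISTINCT = 2  # divide by 2 (placeholder value for illustration)
Edge = Struct.new(:label)  # hypothetical stand-in for the real edge class

def compare_labels(edge1, edge2)
  if (!edge1.label.nil? && !edge2.label.nil? &&
      edge1.label.downcase == edge2.label.downcase) ||
     (edge1.label.nil? && edge2.label.nil?)
    EQUAL
  else
    DISTINCT
  end
end

compare_labels(Edge.new("Subj"), Edge.new("subj"))  # => EQUAL (case-insensitive)
compare_labels(Edge.new("subj"), Edge.new(nil))     # => DISTINCT (one label nil)
compare_labels(Edge.new(nil),    Edge.new(nil))     # => EQUAL (both labels nil)
```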
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete refactoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, which can be replaced or augmented with tools such as RSpec and Capybara. Testing is quite important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that the extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining &amp;quot;test&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But, it also ensures that the developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD with ideas that allow software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in natural language, making them more accessible to people outside the development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
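The cycle can be sketched in plain Ruby with a hypothetical Review class (illustration only, not Expertiza code), using bare assertions instead of a framework:&lt;br /&gt;

```ruby
# Illustrative red-green-refactor cycle with a hypothetical Review class.
class Review
  THRESHOLD = 0.5  # extracted during the "refactor" step

  def initialize(score)
    @score = score
  end

  # GREEN: the simplest implementation that makes the checks pass.
  def relevant?
    @score >= THRESHOLD
  end
end

# RED: these checks are written first, and fail until relevant? exists.
raise "expected relevant"     unless Review.new(0.8).relevant?
raise "expected not relevant" if Review.new(0.2).relevant?
```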
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn't know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests exercising all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by finding the comparison between the expected values and the actual values of the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, covering the comparison of edges, vertices and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for the steps Cucumber reports as undefined. Cucumber loads any files in the features/step_definitions folder to find steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method_name|&lt;br /&gt;
   # method_name names the method under test; qty is the expected value&lt;br /&gt;
   expected = qty.to_i&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if assert_equal(expected, actual)&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it is run for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks are run before the first step of each scenario, in the same order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the test we simply need to execute, &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that the tests are reliable and understandable. Even if they're not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available, and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and fixing of bugs in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by borrowing ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class improved from 'F' before refactoring to 'C' afterwards.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, following a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;ref&amp;gt;http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber &amp;lt;/ref&amp;gt;&lt;br /&gt;
&amp;lt;ref&amp;gt;https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks&amp;lt;/ref&amp;gt;&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82221</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82221"/>
		<updated>2013-10-31T03:56:02Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications, to assess the quality of products on sites like Amazon, e-bay etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product) it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates the relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation is important to ensure that a review which is not relevant to the submission is considered invalid and does not impact the student's grade.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be refactored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs], called the word-order graph, to capture both the governance and ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are labeled with dependency (word-modifier) information.&lt;br /&gt;
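As a rough, runnable sketch of the segmentation step only (the punctuation list and method name here are illustrative, not the Expertiza implementation):&lt;br /&gt;

```ruby
# Illustrative only: split input text into segments on a small,
# predefined punctuation list (hypothetical choice of punctuation).
PUNCTUATION = /[.,;:!?]/

def segments(text)
  text.split(PUNCTUATION).map(&:strip).reject(&:empty?)
end

segments("The code was clear, but the tests were missing.")
# => ["The code was clear", "but the tests were missing"]
```

Each resulting segment would then be POS-tagged and turned into vertices and edges by the algorithm shown below.&lt;br /&gt;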
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject-verb-object), across graphs. Only single and double edges are considered for text matching, not longer contiguous chains (triple edges etc.). This matching captures similarity across segments as well as across voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into different parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the formula above. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
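The weighted combination can be sketched as follows; the weights and match scores are illustrative values, and the method name is hypothetical:&lt;br /&gt;

```ruby
# Example weights following the rule alpha > beta > gamma.
ALPHA, BETA, GAMMA = 0.55, 0.35, 0.1

# Hypothetical sketch of the weighted-relevance formula described above:
# each match type contributes in proportion to its weight.
def scaled_relevance(phrase_match, context_match, structure_match)
  ALPHA * phrase_match + BETA * context_match + GAMMA * structure_match
end

scaled_relevance(0.8, 0.6, 0.4)  # approximately 0.69
```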
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
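The all-pairs comparison can be sketched schematically; ''match_score'' below is a trivial placeholder for the WordNet-based similarity actually used, and the method shape is illustrative rather than Expertiza's real signature:&lt;br /&gt;

```ruby
# Schematic all-pairs vertex comparison (illustration only).
def match_score(v1, v2)
  v1 == v2 ? 1.0 : 0.0  # trivial placeholder for the real similarity metric
end

# For each review vertex, find its best match among the submission vertices.
def compare_vertices(review_vertices, submission_vertices)
  review_vertices.map do |rv|
    submission_vertices.map { |sv| match_score(rv, sv) }.max
  end
end

compare_vertices(%w[code tests], %w[tests code docs])  # => [1.0, 1.0]
```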
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. Unlike compare_edges_non_syntax_diff, it compares edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison done is between edges of same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside models and to bring control to this file, the tester needs to undertake some tasks. The diagram below gives a quick look into what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, it calls ''new'' method in ''response_controller.rb'' which renders the page where user fills up the review. On submission, it calls ''create'' method of the same file, which then redirects to ''saving'' method in the same file again. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'' which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template method pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design, which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in it, so this class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices; its sole responsibility is comparing vertices and calculating the appropriate metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the appropriate metrics required for evaluating relevance. Functionality is thus delegated to helper classes, clearly segregating the role of each class.&lt;br /&gt;
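Schematically, the delegation looks like this; the class and method bodies below are placeholders, not the actual Expertiza implementations:&lt;br /&gt;

```ruby
# Placeholder helper classes standing in for the real ones (illustration only).
class CompareGraphVertices
  def compare_vertices(rev, sub) 0.6 end  # placeholder metric
end
class CompareGraphEdges
  def compare_edges(rev, sub) 0.4 end     # placeholder metric
end

class DegreeOfRelevance
  def initialize
    @vertices = CompareGraphVertices.new
    @edges    = CompareGraphEdges.new
  end

  # The main entry point delegates each comparison to a helper class
  # and combines the returned metrics (a plain average here).
  def get_relevance(rev, sub)
    (@vertices.compare_vertices(rev, sub) + @edges.compare_edges(rev, sub)) / 2.0
  end
end

DegreeOfRelevance.new.get_relevance(:review_graph, :submission_graph)  # => 0.5
```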
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it into a separate method and simply call that method with the appropriate parameters wherever it is required. It performs a nil check on an edge of the graph and verifies that the IDs of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit within one screen length (no scrolling needed).&lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
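As a small illustration of the first point above, nested nil checks like the ones in the original code can be collapsed into a single condition. This is a generic sketch, not code taken verbatim from the class; Edge and its attribute names are stand-ins.&lt;br /&gt;

```ruby
# Edge is a stand-in for the graph edge objects used in Expertiza.
Edge = Struct.new(:in_id, :out_id)

# Before: conditions nested inside one another
def valid_edge_nested?(edge)
  if !edge.nil?
    if edge.in_id != -1
      if edge.out_id != -1
        return true
      end
    end
  end
  false
end

# After: one combined condition (same behavior)
def valid_edge?(edge)
  !edge.nil? and edge.in_id != -1 and edge.out_id != -1
end
```

Both versions return the same result for every input, but the combined form reads as a single sentence.&lt;br /&gt;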
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
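The helper calculate_avg_match used on the return line is assumed to average the cumulative edge match over the number of matched edges, returning 0.0 when nothing matched; the following is a minimal sketch under that assumption, not the actual Expertiza implementation.&lt;br /&gt;

```ruby
# Assumed behavior of calculate_avg_match: average the cumulative match
# score over the number of matched edges, returning 0.0 when nothing matched.
def calculate_avg_match(cum_edge_match, count)
  count > 0 ? cum_edge_match.to_f / count : 0.0
end
```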
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''&lt;br /&gt;
&lt;br /&gt;
Below is sample code that we re-wrote to make the logic simpler and more maintainable.&lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
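Since both branches only select between two constants, the method could be condensed one step further. This is a possible sketch, not part of the project; the constant values 1 and 2 are assumptions matching the &amp;quot;divide by 1&amp;quot;/&amp;quot;divide by 2&amp;quot; comments above, and Edge is a stand-in for the real edge objects.&lt;br /&gt;

```ruby
EQUAL = 1     # assumed value ("divide by 1")
DISTINCT = 2  # assumed value ("divide by 2")
Edge = Struct.new(:label)

def compare_labels(edge1, edge2)
  # labels match when both are nil, or both are present and equal ignoring case
  if edge1.label.nil? ? edge2.label.nil? : edge1.label.casecmp?(edge2.label.to_s)
    EQUAL
  else
    DISTINCT
  end
end
```

The behavior is unchanged: labels are equal when both are nil, or when both are present and match case-insensitively.&lt;br /&gt;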
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete refactoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace it with Capybara, RSpec, etc. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining &amp;quot;testing&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done using two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD with practices that enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of development or testing teams. &lt;br /&gt;
Start by writing a failing test (red). Implement the simplest solution that will cause the test to pass (green). Search for duplication and remove it (refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn't know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by finding the comparison between the expected values and the actual values of the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and SUBJECT-VERB-OBJECT edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber loads any files in the features/step_definitions folder for step definitions. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method_name|&lt;br /&gt;
   expected = qty.to_i&lt;br /&gt;
   if assert_equal(expected, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel_set.feature	   #This will execute the feature along with its step definitions	&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they're not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had many bugs, and the application threw errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which the spell-checker needs in order to check relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was previously 'F'; after refactoring, it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
4. [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber]&lt;br /&gt;
&lt;br /&gt;
5. [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82220</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82220"/>
		<updated>2013-10-31T03:55:45Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications, to assess the quality of products on sites like Amazon, e-bay etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product) it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be refactored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph to capture both governance and ordering information, which is crucial for obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are labeled with dependency (word-modifier) information.&lt;br /&gt;
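The segmentation step can be pictured with a short sketch; the punctuation list below is an illustrative assumption, not necessarily the list Expertiza uses.&lt;br /&gt;

```ruby
# Split input text into segments on a predefined punctuation list
# (illustrative list; the real implementation's list may differ).
PUNCTUATION = /[.,;:!?]/

def segment(text)
  text.split(PUNCTUATION).map { |s| s.strip }.reject { |s| s.empty? }
end
```

Each resulting segment is then POS-tagged and turned into vertices and edges by the algorithm shown next.&lt;br /&gt;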
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges etc.). The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the formula above. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
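In code, the weighted combination is a simple linear blend of the three match scores. The variable names here are ours, and the default weights are just the example values above.&lt;br /&gt;

```ruby
# Weighted relevance: alpha, beta and gamma weight the phrase, context and
# sentence-structure match scores respectively (weights should sum to 1).
def relevance(phrase_match, context_match, structure_match,
              alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end
```

For instance, relevance(0.8, 0.6, 0.4) blends the three scores into a single value, weighting the phrase match most heavily.&lt;br /&gt;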
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex of one graph is compared with every vertex of the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the application calls the ''new'' method in ''response_controller.rb'', which renders the page where the user fills in the review. On submission, it calls the ''create'' method of the same file, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes to segregate the functionality, making a logical separation of the code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in one class each. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for making comparisons between graph edges and returning the appropriate metric to the main class, DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices; its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The purpose of this design is a clear separation of logic into different classes. Each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, and this separation improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, giving each class a clearly segregated role.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the main types of refactoring we performed in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move common code into re-usable methods:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it into a separate method and called that method with the appropriate parameters wherever required. It checks that an edge of the graph is not nil and that the IDs of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
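To see the guard in action, here is a minimal runnable illustration; the Vertex and Edge structs below are stand-ins for the Expertiza graph classes, not the real implementations.&lt;br /&gt;

```ruby
# Minimal stand-ins for the Expertiza graph classes (illustration only).
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

# The extracted guard: true only for a non-nil edge whose vertices
# both carry valid (non-negative) node ids.
def edge_nil_cond_check(edge)
  !edge.nil? && edge.in_vertex.node_id != -1 && edge.out_vertex.node_id != -1
end

good = Edge.new(Vertex.new(3), Vertex.new(7))
bad  = Edge.new(Vertex.new(-1), Vertex.new(7))
```

With this helper, the long inline condition on rev[i] in the original loops collapses to a single call, edge_nil_cond_check(rev[i]).&lt;br /&gt;
&lt;br /&gt;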
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit within one screen length (no need to scroll).&lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
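The helper names referenced above (sum_cou_symm_calculation, sum_cou_asymm_calculation, max_calculation) are where the duplicated sum/count blocks went. As a sketch of the idea only — the method name and the stand-in Vertex/Edge structs below are ours, not the Expertiza code — the repeated pattern for the cross pairing of vertices can be extracted like this:&lt;br /&gt;

```ruby
# Minimal stand-ins for the Expertiza graph classes (illustration only).
# vertex_match is the 2-D array of vertex similarity scores, with nil
# entries where no match exists (e.g. frequent words).
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

# Collects the available similarity scores for the two cross pairings
# (review in-vertex vs. submission out-vertex, and vice versa),
# skipping nil entries, and returns [count, sum].
def sum_cou_cross_calculation(vertex_match, rev_edge, subm_edge)
  pairs  = [[rev_edge.in_vertex.node_id,  subm_edge.out_vertex.node_id],
            [rev_edge.out_vertex.node_id, subm_edge.in_vertex.node_id]]
  scores = pairs.map { |r, s| vertex_match[r][s] }.compact
  [scores.length, scores.sum]
end
```

The caller then computes the best match as sum divided by count when count is positive and 0.0 otherwise, exactly as in the original duplicated blocks.&lt;br /&gt;
&lt;br /&gt;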
* '''Re-wrote some methods containing complicated logic.'''&lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we re-wrote to make the logic simpler and more maintainable.&lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
&lt;br /&gt;
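As a sanity check, the rewritten logic can be exercised against all four nil/non-nil label combinations. The snippet below is a self-contained restatement with stub EQUAL/DISTINCT constants and a minimal Edge stand-in (the real values and classes live in the Expertiza code):&lt;br /&gt;

```ruby
# Self-contained restatement of the refactored compare_labels; the
# constant values and the Edge struct are illustrative stand-ins.
EQUAL    = 1
DISTINCT = 2
Edge = Struct.new(:label)

def compare_labels(edge1, edge2)
  same_text = !edge1.label.nil? && !edge2.label.nil? &&
              edge1.label.downcase == edge2.label.downcase
  both_nil  = edge1.label.nil? && edge2.label.nil?
  (same_text || both_nil) ? EQUAL : DISTINCT
end
```

This covers the four cases of the original method: matching labels and two nil labels give EQUAL; a single nil label or differing labels give DISTINCT.&lt;br /&gt;
&lt;br /&gt;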
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &quot;F&quot; to &quot;C&quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, which can be replaced or supplemented with tools such as RSpec and Capybara. Testing is important in Rails, yet many people developing in Rails either do not test their projects at all or, at best, only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or a web framework is novel enough that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
There are two main approaches to testing:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD, enabling software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in natural language, making them more accessible to people outside the development or testing teams.&lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &quot;RED-GREEN-REFACTOR&quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber does not know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''', located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected and actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major methods of degree of relevance, such as comparing edges, vertices and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber loads any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &quot;(\S+)&quot;$/ do |method|&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if actual == @expected  # @expected is prepared by the setup&lt;br /&gt;
     step(&quot;It will return true if both are equal.&quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&quot;It will return false&quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return (true if both are equal\.|false)$/ do |outcome|&lt;br /&gt;
   # reached with the outcome of the comparison&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup method defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks that allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the same order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed every time before a scenario tagged @one.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute:&lt;br /&gt;
 rake db:migrate                           #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature         #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were many errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by using ideas from the Template method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was previously 'F'; after refactoring it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
 4. [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber]&lt;br /&gt;
&lt;br /&gt;
 5. [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82219</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82219"/>
		<updated>2013-10-31T03:55:15Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce, to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates the relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long, complex methods and a lot of duplicated code. It has been assigned a grade of &quot;F&quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph, to capture both the governance and ordering information crucial to obtaining more accurate results.&lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same and with different syntax, and edges of different types, across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in another. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this case only single and double edges, and not longer runs of contiguous edges (triple edges etc.), are considered for text matching. The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method, ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the above formula.&lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
&lt;br /&gt;
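Assuming the weights multiply the averages of the phrase, context and structure matches respectively (our reading of the formula above; the method and parameter names below are ours), the weighted combination is a one-liner:&lt;br /&gt;

```ruby
# Sketch of the weighted relevance combination. The mapping of
# alpha/beta/gamma to the three match types is our reading of the
# Expertiza formula, and the default weights are the example values
# from the text; the input scores are made up.
def scaled_relevance(phrase, context, structure,
                     alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase + beta * context + gamma * structure
end
```

With all three match scores at 1.0 and the example weights, the result is 1.0, since the weights sum to one.&lt;br /&gt;
&lt;br /&gt;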
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and edges of type VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. The edges compared are of the same type but differ in syntax.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled, and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user reviews an existing piece of work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method, again in the same file. ''saving'' calls ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template method pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in one class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together here. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together here.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of creating such a design is to make a clear separation of logic across classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of the role of each class.&lt;br /&gt;
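&lt;br /&gt;
The delegation described above can be sketched as follows. This is a minimal illustration, not Expertiza's actual code: the helper method bodies, return values and argument lists here are placeholders, and only the class names follow the design described in the text.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Sketch of the delegation design: the main class only combines metrics
# returned by the helper classes (bodies are illustrative stand-ins).
class CompareGraphVertices
  def compare_vertices(rev_vertices, subm_vertices)
    0.8 # placeholder for the real all-pairs vertex match metric
  end
end

class CompareGraphEdges
  def compare_edges_non_syntax_diff(rev_edges, subm_edges)
    0.6 # placeholder for the real edge match metric
  end
end

class DegreeOfRelevance
  # The main class delegates each comparison to a helper class
  # and combines the returned metrics into one relevance value.
  def get_relevance(review, submission)
    vertex_match = CompareGraphVertices.new.compare_vertices(review, submission)
    edge_match   = CompareGraphEdges.new.compare_edges_non_syntax_diff(review, submission)
    (vertex_match + edge_match) / 2.0
  end
end
```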
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and called this method with appropriate parameters wherever required. It performs a nil check on an edge of the graph and verifies that the ids of its vertices are valid (non-negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
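The refactored code shown later also relies on a companion guard, wnet_frequent_word_check, which is not reproduced in the samples. A plausible sketch is below; the Vertex, Edge and wordnet objects here are minimal stand-ins, not Expertiza's real model classes.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stand-ins for the real graph model and WordNet helper.
Vertex = Struct.new(:name, :node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

class StubWordnet
  FREQUENT = %w[the is a and].freeze
  def is_frequent_word(word)
    FREQUENT.include?(word.downcase)
  end
end

# True when both vertices of the edge hold frequent (stop) words,
# in which case the edge is skipped during matching.
def wnet_frequent_word_check(wnet, edge)
  wnet.is_frequent_word(edge.in_vertex.name) &&
    wnet.is_frequent_word(edge.out_vertex.name)
end
```
&lt;br /&gt;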
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll).&lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
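The refactored method ends with a call to calculate_avg_match, which is not shown in the sample. A plausible implementation (assumed for illustration, not Expertiza's exact code) simply averages the accumulated best matches, guarding against division by zero when no review edge found a match:&lt;br /&gt;
&lt;br /&gt;
```ruby
# Average of accumulated best edge matches; 0.0 when nothing matched.
def calculate_avg_match(cum_match, count)
  return 0.0 if count.zero?
  cum_match.to_f / count
end
```
&lt;br /&gt;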
* '''Re-wrote some methods containing complicated logic'''&lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we re-wrote to make the logic simpler and more maintainable.&lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
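&lt;br /&gt;
The refactored logic reduces to a single boolean test: labels are EQUAL when both are nil or both present and case-insensitively identical, and DISTINCT otherwise. The self-contained sketch below exercises that truth table; EQUAL, DISTINCT and the edge shape are stand-ins for the real constants and model objects.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stand-ins for the real constants and edge model.
EQUAL    = 1 # divide by 1
DISTINCT = 2 # divide by 2
LEdge = Struct.new(:label)

def compare_labels(edge1, edge2)
  if ((!edge1.label.nil? && !edge2.label.nil?) &&
      (edge1.label.downcase == edge2.label.downcase)) ||
     (edge1.label.nil? && edge2.label.nil?)
    EQUAL
  else
    DISTINCT
  end
end
```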
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace it with Capybara, RSpec, etc. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all or, at best, only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done with two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, making them more accessible to people outside the development and testing teams.&lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions, and we have written tests covering all six.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values and the actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for undefined steps, provided by Cucumber. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices,&lt;br /&gt;
                                   @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if actual == @expected  # @expected is prepared by the setup code&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   # the actual and expected values matched&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates submission and review graphs on which the six different methods are exercised.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
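&lt;br /&gt;
The setup body is elided above. A sketch of what it might build is shown below; the GVertex/GEdge structs and the sample words are illustrative stand-ins, not Expertiza's real graph classes. The point is only that setup hands the six methods a tiny, known submission/review pair.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Illustrative stand-ins for the real graph model classes.
GVertex = Struct.new(:name, :node_id, :type)
GEdge   = Struct.new(:in_vertex, :out_vertex, :label)

class DegRelSetup
  attr_reader :subm_vertices, :review_vertices

  # Build a minimal submission graph and review graph to compare.
  def setup
    s1 = GVertex.new('code', 0, :SUBJECT)
    s2 = GVertex.new('works', 1, :VERB)
    r1 = GVertex.new('program', 0, :SUBJECT)
    r2 = GVertex.new('runs', 1, :VERB)
    @subm_vertices   = [s1, s2]
    @review_vertices = [r1, r2]
    @subm_edges   = [GEdge.new(s1, s2, 'nsubj')]
    @review_edges = [GEdge.new(r1, r2, 'nsubj')]
  end
end
```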
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks of code at various points in the test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the test we simply need to execute, &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our applications.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files unrelated to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review another's work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by using ideas from the Template method pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was previously 'F', while after refactoring it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
4. [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber]&lt;br /&gt;
5. [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82218</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82218"/>
		<updated>2013-10-31T03:54:14Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Testing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce, to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, lines of code per method, etc. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results.&lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
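&lt;br /&gt;
The segmentation step described above can be sketched as follows. The punctuation list here is assumed for illustration; the predefined list actually used in Expertiza may differ.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Split input text into segments on a (hypothetical) punctuation list,
# dropping surrounding whitespace and empty segments.
PUNCT = /[.,;:!?]/

def segment(text)
  text.split(PUNCT).map(&:strip).reject(&:empty?)
end
```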
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same and with different syntax, as well as edges of different types, across two text graphs. The match is referred to as context matching because contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in another. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges, etc.). This matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the formula above.&lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
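&lt;br /&gt;
The weighted combination can be sketched as below. The weight values are the example ones given above, and the three inputs are assumed to be the averaged phrase, context and structure match scores; this is an illustration of the formula, not Expertiza's exact code.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Example weights (alpha > beta > gamma), per the sample values above.
ALPHA = 0.55
BETA  = 0.35
GAMMA = 0.10

# Combine the three averaged match scores into one relevance value.
def scaled_relevance(phrase_match, context_match, structure_match)
  ALPHA * phrase_match + BETA * context_match + GAMMA * structure_match
end
```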
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex of one graph is compared with every vertex of the other to obtain the comparison results.&lt;br /&gt;
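&lt;br /&gt;
The all-pairs loop can be sketched generically: for each review vertex, take the best similarity against every submission vertex. The similarity measure is passed in as a block here, since the real WordNet-based one is not shown; this is an illustration of the pattern, not the actual method.&lt;br /&gt;
&lt;br /&gt;
```ruby
# For each review vertex, keep the best match score against all
# submission vertices; the block supplies the similarity measure.
def best_vertex_matches(rev_vertices, subm_vertices)
  rev_vertices.map do |rv|
    subm_vertices.map { |sv| yield(rv, sv) }.max
  end
end
```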
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB or VERB-VERB and OBJECT-OBJECT comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e., SUBJECT-VERB edges are matched only with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and edges of type VERB-OBJECT. It also compares SUBJECT-OBJECT edges with VERB-VERB edges. In contrast to ''compare_edges_non_syntax_diff'', the edges compared here differ in syntax.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code (filenames and the corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to subclasses allowed us to come up with a similar design, which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons between the submission and review graphs. We have grouped similar functionality together in one class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metrics to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices, whose sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class (in ''degree_of_relevance.rb'') calls the helper methods in the classes to its right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
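&lt;br /&gt;
The delegation can be sketched as follows. The class and method names match the design described above, but the bodies are placeholders we wrote for illustration, not the real matching logic or the real relevance formula:&lt;br /&gt;

```ruby
# Structural sketch of the delegation: the main class asks each helper
# class for its metric. Bodies are placeholders, not the real logic.
class CompareGraphVertices
  def compare_vertices(rev, subm)
    0.5  # placeholder metric
  end
end

class CompareGraphEdges
  def compare_edges_non_syntax_diff(rev, subm)
    0.25 # placeholder metric
  end
end

class DegreeOfRelevance
  def get_relevance(rev, subm)
    v = CompareGraphVertices.new.compare_vertices(rev, subm)
    e = CompareGraphEdges.new.compare_edges_non_syntax_diff(rev, subm)
    (v + e) / 2.0 # placeholder combination, not the real formula
  end
end
```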
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it into a separate method and call that method with the appropriate parameters wherever required. It checks that an edge is not nil and that neither of its vertices has an invalid (negative) id.&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
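&lt;br /&gt;
The refactored method ends by returning ''calculate_avg_match'', which is not shown above. A plausible sketch, reconstructed from how the original code accumulated cum_edge_match and count (our guess, not the verbatim Expertiza code), is:&lt;br /&gt;

```ruby
# Plausible sketch of the averaging helper: the cumulative best-match
# total divided by the number of matched review edges, guarding
# against division by zero. This is a reconstruction, not the
# verbatim Expertiza code.
def calculate_avg_match(cum_edge_match, count)
  count.positive? ? cum_edge_match.to_f / count : 0.0
end
```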
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is sample code that we re-wrote to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation Automated Testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace it with Capybara, RSpec, etc. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time on writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done with two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests for all six functions.&lt;br /&gt;
The major feature tests the degree of relevance by comparing the expected and actual values computed from the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for the undefined steps that Cucumber reports. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |expected, method_name|&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if actual == expected.to_i&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the 6 different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks, which allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it runs for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks will be run before the first step of each scenario, in the same order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute:&lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. Overall, the effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We successfully refactored the given DegreeOfRelevance class using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was 'F' before refactoring and 'C' afterwards.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
# [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber ]&lt;br /&gt;
# [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82217</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82217"/>
		<updated>2013-10-31T03:53:28Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Testing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications, to assess the quality of products on sites like Amazon, e-bay etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product) it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be refactored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph, to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. The graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives), etc.&lt;br /&gt;
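&lt;br /&gt;
The first step of graph generation (segmentation on a punctuation list) can be sketched as follows; the punctuation list shown is illustrative, not necessarily the one Expertiza uses:&lt;br /&gt;

```ruby
# Sketch of the segmentation step: split the input text into segments
# on a predefined punctuation list. The list below is an assumption
# for illustration; it is not necessarily the one Expertiza uses.
PUNCTUATION = /[.,;:!?]/
def segment(text)
  text.split(PUNCTUATION).map { |s| s.strip }.reject { |s| s.empty? }
end
```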
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
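&lt;br /&gt;
In code, the averaging in this formula could look like the simplified sketch below, shown for a single match type (the real formula averages the three types separately; the method name is ours):&lt;br /&gt;

```ruby
# Simplified sketch of the averaging in the formula above, for one
# match type: each review edge contributes its best submission-edge
# match score, and the scores are averaged over the review edges.
def average_best_match(best_scores)
  return 0.0 if best_scores.empty?
  best_scores.sum / best_scores.size.to_f
end
```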
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges) that constitute a complete segment (e.g. subject–verb–object) across graphs. Only single and double edges, and not longer runs of contiguous edges (triple edges etc.), are considered for text matching. The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
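As a quick sketch, the weighted combination above can be expressed as follows. The method name and keyword arguments are ours, for illustration only; they are not the actual Expertiza code.&lt;br /&gt;

```ruby
# Hypothetical sketch of the weighted relevance combination described above.
# phrase_match, context_match and structure_match stand for the three match
# averages; the default weights follow the alpha > beta > gamma rule.
def weighted_relevance(phrase_match, context_match, structure_match,
                       alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end
```

With alpha = beta = gamma = 0.33, this reduces to the original fixed-weight formula.&lt;br /&gt;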
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex of one graph is compared with every vertex of the other to obtain the comparison results.&lt;br /&gt;
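The all-pairs comparison can be sketched as below. This is a simplification: the real compare_vertices also consults WordNet, the POS tagger and a spell-checker, and the similarity helper here is a naive stand-in of our own.&lt;br /&gt;

```ruby
# Simplified sketch of comparing every review vertex with every submission
# vertex, keeping the best match score found for each review vertex.
def best_vertex_matches(review_vertices, submission_vertices)
  review_vertices.map do |rv|
    submission_vertices.map { |sv| similarity(rv, sv) }.max
  end
end

# Naive stand-in for the actual WordNet-based relatedness metric.
def similarity(a, b)
  a.downcase == b.downcase ? 1.0 : 0.0
end
```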
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e., SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. The edges compared are of the same type, but their syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows this flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, its idea of defining the skeleton of an algorithm in one class, and deferring some steps to subclasses, allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes, making a logical separation of the code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class, DegreeOfRelevance, calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together here. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metrics to the main class, DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together here.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class in ''degree_of_relevance.rb'' calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
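A minimal skeleton of this delegation might look like the following. The class names mirror the design above, but the method signatures and return values are placeholders of our own, not the actual Expertiza code.&lt;br /&gt;

```ruby
# Hypothetical skeleton: the main class delegates each kind of comparison
# to a dedicated helper class and combines the returned metrics.
class CompareGraphVertices
  def compare_vertices(review_graph, submission_graph)
    1.0 # placeholder metric
  end
end

class CompareGraphEdges
  def compare_edges(review_graph, submission_graph)
    0.5 # placeholder metric
  end
end

class DegreeOfRelevance
  def get_relevance(review_graph, submission_graph, alpha: 0.55, beta: 0.35)
    vertex_score = CompareGraphVertices.new.compare_vertices(review_graph, submission_graph)
    edge_score = CompareGraphEdges.new.compare_edges(review_graph, submission_graph)
    alpha * vertex_score + beta * edge_score
  end
end
```

Each helper class owns one kind of comparison, so the main class stays a thin coordinator.&lt;br /&gt;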
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the main types of refactoring we performed in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it into a separate method and call that method with the appropriate parameters wherever required. It checks that an edge is not nil and that neither of its vertex ids is invalid (negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
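To see the guard in action, it can be exercised with minimal stand-in structs. The structs are ours, for illustration; in Expertiza, Edge and Vertex are model objects.&lt;br /&gt;

```ruby
# Minimal stand-ins so the extracted nil/validity check can be run alone.
Vertex = Struct.new(:node_id)
Edge = Struct.new(:in_vertex, :out_vertex)

# Rejects nil edges and edges whose vertices carry the invalid id -1.
def edge_nil_cond_check(edge)
  !edge.nil? && edge.in_vertex.node_id != -1 && edge.out_vertex.node_id != -1
end
```

Every call site then reduces to a single readable condition instead of the repeated three-part test.&lt;br /&gt;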
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit on one screen (no scrolling needed). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Moved code that can be re-used or logically separated into separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic.''' &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we re-wrote to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
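The refactored logic can be spot-checked with a minimal stand-in edge. The struct and constant values below are ours, for illustration; the real EQUAL and DISTINCT divisors live in the Expertiza code base.&lt;br /&gt;

```ruby
# Stand-in constants and edge type, only so the method can run in isolation.
EQUAL = 1
DISTINCT = 2
LabeledEdge = Struct.new(:label)

# Labels are EQUAL when both are present and match case-insensitively,
# or when both are absent; otherwise they are DISTINCT.
def compare_labels(edge1, edge2)
  both_present = !edge1.label.nil? && !edge2.label.nil?
  both_nil = edge1.label.nil? && edge2.label.nil?
  if (both_present && edge1.label.downcase == edge2.label.downcase) || both_nil
    EQUAL
  else
    DISTINCT
  end
end
```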
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Reduced the lines of code per method from 75 to 25, making the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, which can be replaced or supplemented with tools such as RSpec and Capybara. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that the extra work of testing seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] builds on the general techniques and principles of TDD, enabling software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, making them more accessible to people outside the development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will make the test pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''', located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected and actual values computed from the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber will load any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method_name|&lt;br /&gt;
   expected = qty.to_i&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if(assert_equal(expected, actual))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed before every scenario tagged with it.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions	&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that the tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of fully automated test suites and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which provides the spell-checker used when checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class, DegreeOfRelevance, using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class improved from 'F' before refactoring to 'C' after.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven development tool.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
# [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber ]&lt;br /&gt;
# [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82216</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82216"/>
		<updated>2013-10-31T03:52:54Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Testing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs], called a word-order graph, to capture both the governance and the ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same and with different syntax, as well as edges of different types, across two text graphs. The match is referred to as context matching because contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in the other. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject-verb-object), across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges etc.). The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
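&lt;br /&gt;
To make the notion of a double edge concrete, the sketch below (a hypothetical helper, not Expertiza code) collects pairs of contiguous edges, i.e. pairs where the first edge ends at the vertex where the second begins:&lt;br /&gt;
&lt;br /&gt;
```ruby
# A double edge is a pair of contiguous edges, e.g. subject-to-verb
# followed by verb-to-object, forming a complete segment.
Edge = Struct.new(:in_vertex, :out_vertex)

def double_edges(edges)
  edges.combination(2).select { |a, b| a.out_vertex == b.in_vertex }
end

sv = Edge.new(:subject, :verb)
vo = Edge.new(:verb, :object)
pairs = double_edges([sv, vo])
# pairs contains exactly the one [sv, vo] double edge
```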
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
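&lt;br /&gt;
The weighted combination can be sketched as follows (the match scores passed in below are illustrative placeholders, not values computed by Expertiza):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Weighted relevance per the modified formula: separate weights for
# phrase, context, and sentence-structure matches. The defaults are
# the example weights above; the inputs are hypothetical scores.
def weighted_relevance(phrase, context, structure, alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase + beta * context + gamma * structure
end

score = weighted_relevance(0.8, 0.6, 0.4)
# score is approximately 0.69, dominated by the phrase match
```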
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and edges of type VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. Although the syntax differs, the edges compared are of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file, the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user reviews an existing piece of work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code (filenames and corresponding methods called)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not directly implement this design pattern, its idea of defining the skeleton of an algorithm in one class and deferring some steps to subclasses allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in one class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metrics to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this design improves the maintainability and readability of the code for any programmer new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class DegreeOfRelevance calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and simply call this method with the appropriate parameters wherever required. It checks that an edge is not nil and that neither of its vertex ids is invalid (negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
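&lt;br /&gt;
A self-contained harness showing the helper in use (the Vertex and Edge structs here are stand-ins for Expertiza's graph classes):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stand-in graph types; in Expertiza these come from the graph model.
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

# The extracted helper: true only for a non-nil edge whose vertices
# both have valid ids (-1 is the invalid-id sentinel).
def edge_nil_cond_check(edge)
  !edge.nil? and edge.in_vertex.node_id != -1 and edge.out_vertex.node_id != -1
end

edge_nil_cond_check(nil)                                      # false: edge missing
edge_nil_cond_check(Edge.new(Vertex.new(-1), Vertex.new(3)))  # false: invalid vertex id
edge_nil_cond_check(Edge.new(Vertex.new(2), Vertex.new(3)))   # true
```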
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic.''' &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
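&lt;br /&gt;
The same truth table can also be expressed as a cascade, which makes each case explicit. The harness below is self-contained: Edge and the constant values are stand-ins, with EQUAL = 1 and DISTINCT = 2 taken from the divide-by comments in the original code.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stand-ins for the Expertiza constants and edge class.
EQUAL    = 1  # divide by 1
DISTINCT = 2  # divide by 2
Edge = Struct.new(:label)

# Equivalent cascade form of compare_labels: labels match when both
# are nil, or when both are present and equal ignoring case.
def compare_labels(edge1, edge2)
  if edge1.label.nil? and edge2.label.nil?
    EQUAL
  elsif edge1.label.nil? or edge2.label.nil?
    DISTINCT
  elsif edge1.label.downcase == edge2.label.downcase
    EQUAL
  else
    DISTINCT
  end
end

compare_labels(Edge.new("NMOD"), Edge.new("nmod"))  # EQUAL
compare_labels(Edge.new("NMOD"), Edge.new(nil))     # DISTINCT
compare_labels(Edge.new(nil), Edge.new(nil))        # EQUAL
```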
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Reduced the lines of code per method from 75 to 25, making the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace or supplement it with tools like RSpec and Capybara. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that the extra work seems unimportant. Or maybe there is a perceived time constraint: spending time on writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining "test" as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But, it also ensures that the developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png|frame]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] builds on the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in a natural language, making them more accessible to people outside the development and testing teams. &lt;br /&gt;
Start by writing a failing test (red), implement the simplest solution that makes the test pass (green), then search for duplication and remove it (refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png|frame|&amp;lt;ref&amp;gt;http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. Step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''', located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six.&lt;br /&gt;
The major feature tests the degree of relevance by comparing the expected and actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber loads any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method_name|&lt;br /&gt;
   actual = qty&lt;br /&gt;
   if assert_equal(actual, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   # print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but you can use tagged hooks if you want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute:&lt;br /&gt;
 rake db:migrate                       #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature     #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable; even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had many bugs, and the application threw errors at every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug fixing in various other files unrelated to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which provides the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the DegreeOfRelevance class by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class improved from 'F' before refactoring to 'C' after.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
# [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber ]&lt;br /&gt;
# [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82215</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82215"/>
		<updated>2013-10-31T03:51:14Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications, to assess the quality of products on sites like Amazon, e-bay etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product) it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long, complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for the research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task in this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph, to capture both the governance and ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
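To make the algorithm concrete, here is a simplified, hypothetical Ruby sketch of vertex and edge creation. It is not the actual Expertiza implementation: the real code obtains tags from the Stanford POS tagger and uses more sophisticated phrase grouping.&lt;br /&gt;

```ruby
# Simplified, hypothetical sketch of word-order graph construction.
# POS tags are supplied by hand as [word, tag] pairs; the real code
# obtains them from the Stanford POS tagger.
Vertex = Struct.new(:name, :type, :node_id)
Edge   = Struct.new(:in_vertex, :out_vertex, :label)

def build_word_order_graph(tagged_tokens)
  vertices = []
  # Group consecutive tokens with the same tag into one phrase vertex,
  # preserving word order.
  tagged_tokens.chunk_while { |(_, t1), (_, t2)| t1 == t2 }.each do |run|
    phrase = run.map { |word, _| word }.join(' ')
    vertices.push(Vertex.new(phrase, run.first.last, vertices.size))
  end
  # Connect adjacent phrases to retain ordering information.
  edges = vertices.each_cons(2).map { |a, b| Edge.new(a, b, nil) }
  [vertices, edges]
end

vertices, edges = build_word_order_graph(
  [['the', :noun], ['student', :noun], ['submitted', :verb],
   ['good', :adj], ['work', :noun]]
)
# vertices: "the student", "submitted", "good", "work"; edges: 3
```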
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across noun, verb, adjective or adverbial forms, including cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (e.g. the noun form of a verb or adjective).&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same or different syntax, as well as edges of different types, across two text graphs. The match is referred to as context matching because contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in the other. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching; longer runs of contiguous edges (triple edges etc.) are not. This matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
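As an illustration, extracting double edges from an ordered edge list can be sketched as follows (the structures and string vertices here are hypothetical stand-ins; Expertiza's real graph objects are richer):&lt;br /&gt;

```ruby
# Hypothetical sketch: a "double edge" is a pair of contiguous edges
# (e.g. subject-verb followed by verb-object) used in structure matching.
Edge = Struct.new(:in_vertex, :out_vertex)

def double_edges(edges)
  # Two edges are contiguous when the first ends where the second begins.
  edges.each_cons(2).select { |a, b| a.out_vertex == b.in_vertex }
end

sv = Edge.new('student', 'submitted')   # subject-verb
vo = Edge.new('submitted', 'work')      # verb-object
double_edges([sv, vo]).size  # => 1
```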
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the formula above. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
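The weighted combination itself is straightforward; here is a minimal sketch (the method name and keyword defaults are our own illustration, not Expertiza's API):&lt;br /&gt;

```ruby
# Hypothetical sketch of the weighted relevance formula:
# relevance = alpha * phrase + beta * context + gamma * structure,
# where alpha, beta and gamma sum to 1 and alpha carries the most weight.
def weighted_relevance(phrase, context, structure,
                       alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase + beta * context + gamma * structure
end

# Phrase matches dominate: a strong phrase match outweighs structure.
score = weighted_relevance(0.8, 0.5, 0.2)  # approximately 0.635
```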
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex of one graph is compared with every vertex of the other to obtain the comparison results.&lt;br /&gt;
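In outline, the comparison looks like the following sketch. The similarity function here is a trivial stand-in invented for illustration; the real method uses WordNet-based relatedness and returns several metrics rather than a single average.&lt;br /&gt;

```ruby
# Hypothetical sketch: compare every review vertex with every submission
# vertex, keep the best match per review vertex, and average the results.
def best_vertex_match_average(review_vertices, subm_vertices)
  best = review_vertices.map do |rv|
    subm_vertices.map { |sv| similarity(rv, sv) }.max
  end
  best.sum / best.size.to_f
end

# Trivial stand-in similarity: 1.0 for identical tokens, else 0.0.
# Expertiza uses WordNet-based semantic relatedness instead.
def similarity(a, b)
  a.downcase == b.downcase ? 1.0 : 0.0
end

best_vertex_match_average(%w[good work], %w[Work bad])  # => 0.5
```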
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method performs SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges, i.e. edges whose syntactic ordering differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled, and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, here the comparison is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes to segregate functionality, creating a logical separation of the code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class, DegreeOfRelevance, calculates the scaled relevance using the formula described above, based on comparisons between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together here. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metrics to the main class, DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices; its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, the design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the main types of refactoring we performed in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move common code out into reusable methods:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it into a separate method and called that method with the appropriate parameters wherever required. The extracted method checks that an edge is not nil and that neither of its vertices has an invalid (negative) id.&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
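The guard can be exercised with lightweight stand-ins for Expertiza's edge and vertex objects (the Structs below are illustrative only; the real objects come from the graph generator):&lt;br /&gt;

```ruby
# Illustrative stand-ins for the graph generator's vertex/edge objects.
Vertex = Struct.new(:node_id, :name)
Edge   = Struct.new(:in_vertex, :out_vertex)

# Same check as the extracted helper: the edge exists and neither of
# its vertices carries the invalid id -1.
def edge_nil_cond_check(edge)
  return false if edge.nil?
  [edge.in_vertex, edge.out_vertex].none? { |v| v.node_id == -1 }
end

valid   = Edge.new(Vertex.new(0, 'student'), Vertex.new(1, 'submitted'))
invalid = Edge.new(Vertex.new(-1, ''), Vertex.new(1, 'submitted'))
edge_nil_cond_check(valid)    # => true
edge_nil_cond_check(invalid)  # => false
edge_nil_cond_check(nil)      # => false
```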
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit within one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Rewrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is sample code that we rewrote to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
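One way to gain confidence in such a rewrite is to check it against the full truth table of label combinations. The sketch below is our own, behaviour-equivalent reformulation using hypothetical Struct edges; it assumes EQUAL = 1 and DISTINCT = 2, matching the &amp;quot;divide by 1&amp;quot;/&amp;quot;divide by 2&amp;quot; comments above, and the 'agt' labels are invented for the example.&lt;br /&gt;

```ruby
# Assumed constants (the "divide by 1"/"divide by 2" comments suggest these).
EQUAL    = 1
DISTINCT = 2
Edge = Struct.new(:label)

# Behaviour-equivalent reformulation of compare_labels: EQUAL when both
# labels are nil, or both are present and case-insensitively equal;
# DISTINCT otherwise.
def compare_labels(edge1, edge2)
  present = [edge1.label, edge2.label].compact
  return EQUAL if present.empty?          # both labels nil
  return DISTINCT if present.size == 1    # exactly one label nil
  present.map { |l| l.downcase }.uniq.size == 1 ? EQUAL : DISTINCT
end

compare_labels(Edge.new('AGT'), Edge.new('agt'))  # => EQUAL
compare_labels(Edge.new(nil),   Edge.new('agt'))  # => DISTINCT
compare_labels(Edge.new(nil),   Edge.new(nil))    # => EQUAL
```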
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete refactoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, which can be replaced or supplemented with tools such as RSpec and Capybara. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that the extra work of testing seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining &amp;quot;testing&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done using two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in a natural language, making them more accessible to people outside the development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will make the test pass (Green). Then search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
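A minimal, generic illustration of one red-green cycle using Minitest, Ruby's bundled test framework (this is not Expertiza code; the relevant? method and its 0.5 threshold are invented for the example):&lt;br /&gt;

```ruby
require 'minitest/autorun'

# Green step: the simplest implementation that satisfies the expectation
# written first. The relevant? method and 0.5 threshold are invented
# purely for this illustration.
def relevant?(score, threshold = 0.5)
  score >= threshold
end

# Red step (historically): this test was written before relevant? existed
# and failed; the implementation above turned it green.
class RelevanceTest < Minitest::Test
  def test_scores_are_classified_against_the_threshold
    assert relevant?(0.7)
    refute relevant?(0.3)
  end
end
```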
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider its syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber does not know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. Step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major methods. We have written tests covering all six.&lt;br /&gt;
The major feature tests the degree of relevance by comparing the expected and actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber loads any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 &lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   # @expected is assumed to be prepared by the setup helper&lt;br /&gt;
   @actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices,&lt;br /&gt;
                                    @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
 end&lt;br /&gt;
 &lt;br /&gt;
 Then /^It will return true if both are equal\.$/ do&lt;br /&gt;
   assert_equal(@expected, @actual)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup method defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews that is then used when exercising the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the test cycle. We can put them in the '''support/env.rb''' file or in any other file under the support directory, for example '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed before every scenario tagged @one.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute: &lt;br /&gt;
 rake db:migrate                         #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel_set.feature   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the automated test suites available, and use tests to drive development and to ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is shown below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were many errors related to the Aspell gem, which provides the spell checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even contacted Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We successfully refactored the DegreeOfRelevance class by using ideas from the Template Method design pattern and adapting them to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class before refactoring was 'F'; after refactoring it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
# [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber ]&lt;br /&gt;
# [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82214</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82214"/>
		<updated>2013-10-31T03:49:52Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce, to assess the quality of products on sites such as Amazon and eBay. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important: if a review is not relevant to the submission, it is considered invalid and does not impact the student's grade.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long, complex methods and a lot of duplicated code. It has been assigned a grade of &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be refactored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' calculates the [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph, which captures both the governance and the ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalization] (noun forms of adjectives), etc.&lt;br /&gt;
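The first graph-generation step described above, splitting the input text into segments on a punctuation list, can be sketched in Ruby. This is a minimal illustration only; the method name and the punctuation list are assumptions, not Expertiza's actual implementation.&lt;br /&gt;

```ruby
# Illustrative sketch: divide input text into segments on a predefined
# punctuation list, as the first step of word-order graph generation.
# PUNCTUATION here is a stand-in; the real list is defined in Expertiza.
PUNCTUATION = /[.,;:!?]/

def segment_text(text)
  text.split(PUNCTUATION).map { |s| s.strip }.reject { |s| s.empty? }
end
```

Each resulting segment would then be POS-tagged and turned into vertices and edges by the algorithm below.&lt;br /&gt;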
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same and with different syntax, and edges of different types, across two text graphs. The match is referred to as context matching because contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in another. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
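The aggregation pattern underlying these formulas (also visible later in the refactored code's cum_edge_match/count variables) is an average of best matches: each review edge keeps its best score against any submission edge, and those best scores are averaged. The sketch below is a hypothetical illustration of that pattern, not Expertiza's actual method.&lt;br /&gt;

```ruby
# Illustrative "average of best matches" aggregation: for each review edge
# we take its best match score against every submission edge, then average
# those best scores over all review edges.
def average_best_match(review_scores)
  # review_scores: one array of match scores per review edge
  best = review_scores.map { |scores| scores.max }
  best.sum / best.size.to_f
end
```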
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching; longer runs of contiguous edges (triple edges etc.) are not. This matching captures similarity across segments, including voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to different types of matches, instead of the fixed value of 0.33 in the formula above. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
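The weighted formula above can be sketched as a small Ruby helper. The method name and keyword arguments are illustrative, not Expertiza's actual API; the default weights are the example values from the text.&lt;br /&gt;

```ruby
# Illustrative helper for the weighted relevance formula: phrase, context
# and structure are the three match averages described above, and the
# weights decrease from alpha to gamma (the rule the article states).
def scaled_relevance(phrase, context, structure,
                     alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase + beta * context + gamma * structure
end
```

With equal perfect matches the result is 1.0; with the fixed 0.33 weights of the original formula the helper reduces to a plain average (up to rounding).&lt;br /&gt;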
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e., SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and edges of type VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. In other words, it compares edges of the same type but with differing syntax.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons between SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to subclasses allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in one class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class, DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
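The delegation shown in the figure can be sketched as follows. The class names follow the article, but the metric methods, their bodies and the 50/50 weighting are stand-ins for illustration only, not the real Expertiza code.&lt;br /&gt;

```ruby
# Hypothetical sketch of the delegation design: the main class owns the
# relevance formula, while helper classes own the individual comparisons.
class CompareGraphVertices
  def compare_vertices(review, submission)
    0.6 # stand-in for the real vertex-match metric
  end
end

class CompareGraphEdges
  def compare_edges(review, submission)
    0.4 # stand-in for the real edge-match metric
  end
end

class DegreeOfRelevance
  def initialize
    @vertices = CompareGraphVertices.new
    @edges = CompareGraphEdges.new
  end

  # Combines the helper metrics; the equal weighting is illustrative.
  def get_relevance(review, submission)
    0.5 * @vertices.compare_vertices(review, submission) +
      0.5 * @edges.compare_edges(review, submission)
  end
end
```

Each helper can now be tested in isolation, which is what makes the Cucumber tests described later feasible.&lt;br /&gt;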
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move common code out into a re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it into a separate method and call that method with the appropriate parameters wherever required. It checks that an edge is not nil and that the ids of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
  #...remaining aggregation and return statement omitted in this excerpt&lt;br /&gt;
 end #end of the method&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is sample code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
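The simplified logic can be exercised with stub edges. The Struct stand-in for the edge class and the constant values are assumptions: the original comments suggest EQUAL means &amp;quot;divide by 1&amp;quot; and DISTINCT means &amp;quot;divide by 2&amp;quot;, so 1 and 2 are used here for illustration.&lt;br /&gt;

```ruby
EQUAL = 1    # assumed value ("divide by 1" in the original comments)
DISTINCT = 2 # assumed value ("divide by 2" in the original comments)
Edge = Struct.new(:label) # minimal stand-in for the graph edge class

# Same behavior as the refactored method above: labels are EQUAL when both
# are nil or both are the same word (case-insensitively), else DISTINCT.
def compare_labels(edge1, edge2)
  if edge1.label.nil? and edge2.label.nil?
    EQUAL
  elsif edge1.label.nil? or edge2.label.nil?
    DISTINCT
  elsif edge1.label.downcase == edge2.label.downcase
    EQUAL
  else
    DISTINCT
  end
end
```

Spelling out the four cases this way also makes it easy to see that the buggy-looking &amp;quot;only one label is null&amp;quot; branch of the original version is covered correctly.&lt;br /&gt;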
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete refactoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, which can be replaced with tools such as Capybara or RSpec. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that the extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “testing” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be approached in two ways:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will make the test pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in the application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values and the actual values for the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all the six major modules of degree of relevance like comparing edges, vertices and subject-verb objects.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for undefined steps, provided by Cucumber. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method|&lt;br /&gt;
   actual = qty&lt;br /&gt;
   if(assert_equal(qty, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup method defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks of code at various points in the test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available, and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. Overall, the effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We successfully refactored the DegreeOfRelevance class by borrowing ideas from the Template Method design pattern and adapting them to our needs.&lt;br /&gt;
According to Code Climate, the class's grade improved from 'F' before refactoring to 'C' after.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
# [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber ]&lt;br /&gt;
# [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;br /&gt;
# [http://agilefeedback.blogspot.com/2012/07/tdd-do-you-speak.html]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82198</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82198"/>
		<updated>2013-10-31T03:40:59Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation is important because a review that is not relevant to the submission should be considered invalid and should not impact the student's grade.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] based on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph, which captures both governance and ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
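As a rough illustration, the segmentation step can be sketched in Ruby. The punctuation list and method name below are assumptions for illustration, not Expertiza's actual implementation.&lt;br /&gt;

```ruby
# Illustrative sketch of the first graph-generation step: splitting input
# text into segments on a predefined punctuation list. The punctuation set
# here is an assumption; Expertiza's actual list may differ.
PUNCTUATION = /[.,;:!?]/

def segments(text)
  text.split(PUNCTUATION).map(&:strip).reject(&:empty?)
end

puts segments("The code works, but the tests fail.").inspect
```

Each segment would then be POS-tagged and turned into phrase vertices by the algorithm below.&lt;br /&gt;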
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges etc.). This matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form, along with other important parameters. The algorithm is broken down into different parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to different types of matches instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
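With such weights, the scaled relevance is simply a weighted sum of the three match scores. A minimal sketch, assuming the match scores have already been computed (the method name and sample values are illustrative, not Expertiza's actual API):&lt;br /&gt;

```ruby
# Hypothetical sketch of the weighted relevance formula used in Expertiza:
# relevance = alpha * phrase match + beta * context match + gamma * structure match.
# The weights follow the rule alpha > beta > gamma; the sample scores are made up.
def scaled_relevance(phrase, context, structure, alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase + beta * context + gamma * structure
end

puts scaled_relevance(0.8, 0.6, 0.4)  # weighted sum of the three match scores
```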
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
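The all-pairs comparison can be sketched as follows. A toy string-equality similarity stands in for the WordNet- and spell-checker-based matching the real method performs; all names here are illustrative.&lt;br /&gt;

```ruby
# Toy sketch of the all-pairs comparison performed by compare_vertices.
# A trivial case-insensitive string match stands in for the real
# WordNet-based relatedness measure (an assumption for illustration).
def similarity(a, b)
  a.downcase == b.downcase ? 1.0 : 0.0
end

# For each review vertex, record the best match among all submission vertices.
def best_vertex_matches(review_vertices, subm_vertices)
  review_vertices.map do |rv|
    subm_vertices.map { |sv| similarity(rv, sv) }.max
  end
end

puts best_vertex_matches(%w[code Review], %w[review process]).inspect
```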
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. The edges compared differ in syntax.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled, and the relevant results are returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not directly implement this design pattern, its idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in one class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metrics to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the appropriate metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is to make a clear separation of logic across different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the appropriate metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of the role of each class.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and called this method with appropriate parameters wherever required. It checks that an edge of the graph is not nil and that the ids of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
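The refactored logic can be exercised standalone with a minimal stand-in for the edge objects. The Edge struct and the EQUAL/DISTINCT values below are assumptions for illustration, not Expertiza's real classes.&lt;br /&gt;

```ruby
# Standalone version of the refactored compare_labels logic. The Edge struct
# and the EQUAL/DISTINCT constant values are stand-ins for Expertiza's own.
EQUAL = 1
DISTINCT = 2
Edge = Struct.new(:label)

def compare_labels(edge1, edge2)
  both_nil  = edge1.label.nil? && edge2.label.nil?
  both_same = !edge1.label.nil? && !edge2.label.nil? &&
              edge1.label.downcase == edge2.label.downcase
  (both_nil || both_same) ? EQUAL : DISTINCT
end

puts compare_labels(Edge.new("nsubj"), Edge.new("NSUBJ"))  # labels match
puts compare_labels(Edge.new("nsubj"), Edge.new(nil))      # only one label set
```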
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace or augment it with tools such as Capybara and RSpec. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that the extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining &amp;quot;testing&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done with two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD in a way that allows software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
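A tiny red-green illustration with Ruby's built-in Minitest (a generic sketch, not Expertiza code): the expectations are written first, then the simplest implementation that satisfies them.&lt;br /&gt;

```ruby
# Minimal red-green example: the tests below were (notionally) written first,
# then relevant? was implemented as the simplest code that makes them pass.
# The relevant? helper and its 0.5 threshold are illustrative assumptions.
require "minitest/autorun"

def relevant?(score, threshold = 0.5)
  score >= threshold
end

class RelevanceTest < Minitest::Test
  def test_review_above_threshold_is_relevant
    assert relevant?(0.69)
  end

  def test_review_below_threshold_is_not_relevant
    refute relevant?(0.2)
  end
end
```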
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''', located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values and the actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all the six major modules of degree of relevance like comparing edges, vertices and subject-verb objects.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber will load any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   # @expected is assumed to be populated by the setup code for the method under test&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices,&lt;br /&gt;
                                   @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if actual == @expected&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   # reached only when the actual and expected values match&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it is run for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named one, so the following code snippet will be executed before every scenario tagged @one.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions	&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which the spell-checker needs in order to check relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class, DegreeOfRelevance, by using ideas from the Template design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was previously 'F'; after refactoring, the grade is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
 # [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber]&lt;br /&gt;
 # [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82197</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82197"/>
		<updated>2013-10-31T03:40:34Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important to ensure that if a review is not relevant to the submission, it is considered invalid and does not impact the student's grade.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs], called a word-order graph, to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. The vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges, and not longer runs of contiguous edges (triple edges etc.), are considered for text matching. This matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
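The three matching formulas above share a common shape: each review element keeps the best score it achieves against any submission element, and the overall score is the average of those per-element best scores. A minimal, hypothetical Ruby sketch of that pattern (the method name and the scoring block are ours, not Expertiza's):&lt;br /&gt;

```ruby
# Hypothetical sketch: each review item keeps the best score it achieves
# against any submission item; the overall match is the average of those
# per-item best scores (the block supplies the pairwise score).
def average_best_match(review_items, submission_items)
  return 0.0 if review_items.empty? || submission_items.empty?
  best_scores = review_items.map do |r|
    submission_items.map { |s| yield(r, s) }.max
  end
  best_scores.sum / best_scores.size
end
```

&lt;br /&gt;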
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the formula above. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
&lt;br /&gt;
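The weighted combination above can be sketched in a few lines of Ruby. This is only an illustration: the method name and default weights are ours, and the three match scores themselves come from the matching formulas described earlier.&lt;br /&gt;

```ruby
# Hypothetical sketch of the weighted relevance combination.
# alpha, beta and gamma weight the phrase, context and sentence-structure
# matches; they are generally chosen so that alpha exceeds beta, which
# exceeds gamma, and they sum to 1.
def weighted_relevance(phrase, context, structure, alpha = 0.55, beta = 0.35, gamma = 0.1)
  alpha * phrase + beta * context + gamma * structure
end
```

&lt;br /&gt;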
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
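The exhaustive comparison can be pictured as building a matrix of scores, one entry per (review vertex, submission vertex) pair. A hypothetical sketch (the names and the scoring block are ours, not Expertiza's):&lt;br /&gt;

```ruby
# Hypothetical sketch of the all-pairs vertex comparison: the result is
# a matrix in which entry [i][j] scores review vertex i against
# submission vertex j, as computed by the given block.
def build_vertex_match(review_vertices, subm_vertices)
  review_vertices.map do |rv|
    subm_vertices.map { |sv| yield(rv, sv) }
  end
end
```

&lt;br /&gt;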
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. It compares edges of the same type but with differing syntax.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons between SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template design pattern] to improve the design of the class. Although we did not directly implement this design pattern, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes to segregate the functionality, making a logical separation of the code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class, DegreeOfRelevance, calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class, DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the appropriate metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods, compare_SVO_edges and compare_SVO_diff_syntax, are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the appropriate metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of the role of each class.&lt;br /&gt;
&lt;br /&gt;
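The delegation described above can be sketched as follows. The class names match the design; the stub metrics and method bodies are purely illustrative, not the real Expertiza implementation:&lt;br /&gt;

```ruby
# Illustrative skeleton only: stub metrics stand in for the real graph
# comparisons performed by the helper classes.
class CompareGraphVertices
  def compare_vertices
    0.9 # stub vertex-match metric
  end
end

class CompareGraphEdges
  def compare_edges_non_syntax_diff
    0.8 # stub edge-match metric
  end
end

class CompareGraphSVOEdges
  def compare_SVO_edges
    0.7 # stub SVO-match metric
  end
end

class DegreeOfRelevance
  # The main class only combines the metrics; the comparisons themselves
  # are delegated to the helper classes.
  def get_relevance
    scores = [CompareGraphVertices.new.compare_vertices,
              CompareGraphEdges.new.compare_edges_non_syntax_diff,
              CompareGraphSVOEdges.new.compare_SVO_edges]
    scores.sum / scores.size
  end
end
```

&lt;br /&gt;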
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code which was present in many methods. We moved it to a separate method and simply called this method with the appropriate parameters wherever required. It checks that an edge of the graph is not nil and that the ids of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit within one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
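The refactored method ends by calling calculate_avg_match, which is not shown in the sample above. A plausible minimal sketch (our guess at its behavior, not the actual Expertiza code) is:&lt;br /&gt;

```ruby
# Hypothetical sketch: average the cumulative best-match score over the
# number of review edges that found a match, guarding against division
# by zero.
def calculate_avg_match(cum_edge_match, count)
  return 0.0 if count == 0
  cum_edge_match.to_f / count
end
```

&lt;br /&gt;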
* '''Re-wrote some methods containing complicated logic.''' &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace it with Capybara, RSpec, etc. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time on writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining a &amp;quot;test&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of the development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values and the actual values obtained from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber loads any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method|&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if(assert_equal(@expected, actual)) # @expected is prepared in the setup&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks of code at various points in the test cycle. We can put them in the '''support/env.rb''' file or in any other file under the support directory, for example '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but tagged hooks give us more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed before every scenario tagged with @one.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions	&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our applications.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent on debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We succeeded in refactoring the given class DegreeOfRelevance by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the previous version of the class was 'F', while after refactoring it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven development tool.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
 [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber ]&lt;br /&gt;
 [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82196</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82196"/>
		<updated>2013-10-31T03:40:18Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon and eBay. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student's grade, choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important because a review that is not relevant to the submission is considered invalid and should not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step in graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining word order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
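The first step above, dividing the input text into segments on a predefined punctuation list, can be sketched as follows. This is an illustrative sketch only; the punctuation list and the method name are our own, not the actual Expertiza implementation.&lt;br /&gt;

```ruby
# Illustrative sketch only: split text into segments using a predefined
# punctuation list (the actual list used by Expertiza may differ).
SEGMENT_PUNCTUATION = [".", ",", ";", ":", "?", "!"]

def split_into_segments(text)
  # Build an alternation pattern from the punctuation list, split on it,
  # trim surrounding whitespace, and drop empty segments.
  pattern = Regexp.union(SEGMENT_PUNCTUATION)
  text.split(pattern).map { |segment| segment.strip }
      .reject { |segment| segment.empty? }
end
```

For example, a sentence containing one comma is split into two segments, each stripped of surrounding whitespace.&lt;br /&gt;
&lt;br /&gt;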
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this case, only single and double edges are considered, and not more contiguous edges (triple edges etc.), for text matching. The matching captures similarity across segments and it captures voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to different types of matches instead of a fixed value of 0.33 in above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma&lt;br /&gt;
&lt;br /&gt;
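With these weights, the scaled relevance reduces to a weighted sum of the three match averages. The sketch below is illustrative; the method name and the mapping of arguments to the phrase, context and sentence-structure matches are our own, not the actual code in degree_of_relevance.rb.&lt;br /&gt;

```ruby
# Illustrative weighted-sum sketch of the relevance formula: alpha, beta
# and gamma weight the phrase, context and sentence-structure match
# averages, with alpha being the largest.
ALPHA = 0.55
BETA  = 0.35
GAMMA = 0.10

def weighted_relevance(phrase_match, context_match, structure_match)
  ALPHA * phrase_match + BETA * context_match + GAMMA * structure_match
end
```

A perfect score on all three match types yields a relevance of 1.0, since the three weights sum to one.&lt;br /&gt;
&lt;br /&gt;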
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. Unlike the previous method, it compares edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled and the relevant results are returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison done is between edges of same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
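The distinction these methods draw between same-type and cross-type edges can be sketched with two small predicates over (in-vertex type, out-vertex type) pairs. The names and pair representation below are illustrative; the real helpers operate on full edge objects.&lt;br /&gt;

```ruby
# Illustrative predicates for edge-type comparison; rev and subm are
# [in_vertex_type, out_vertex_type] pairs standing in for real edges.
def same_syntax?(rev, subm)
  # e.g. SUBJECT-VERB matched with SUBJECT-VERB
  rev[0] == subm[0] && rev[1] == subm[1]
end

def crossed_syntax?(rev, subm)
  # e.g. SUBJECT-VERB matched with VERB-SUBJECT (in/out roles swapped)
  rev[0] == subm[1] && rev[1] == subm[0]
end

sv = [:subject, :verb]
vs = [:verb, :subject]
same_syntax?(sv, sv)    # => true
crossed_syntax?(sv, vs) # => true
```
&lt;br /&gt;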
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to perform some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user reviews an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement this design pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in it. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metrics to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in it.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
&lt;br /&gt;
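A minimal sketch of this delegation is shown below. The class names come from the refactoring, but the method bodies, placeholder metrics, and the unweighted average are stand-ins for the real implementations, which take graph and tagger arguments and apply the weighted formula described earlier.&lt;br /&gt;

```ruby
# Sketch of the refactored layout: DegreeOfRelevance delegates each
# kind of comparison to a helper class. Bodies are placeholders.
class CompareGraphVertices
  def compare_vertices(review, submission)
    0.8 # placeholder vertex-match metric
  end
end

class CompareGraphEdges
  def compare_edges(review, submission)
    0.6 # placeholder edge-match metric
  end
end

class CompareGraphSVOEdges
  def compare_svo_edges(review, submission)
    0.4 # placeholder SVO-match metric
  end
end

class DegreeOfRelevance
  def get_relevance(review, submission)
    v = CompareGraphVertices.new.compare_vertices(review, submission)
    e = CompareGraphEdges.new.compare_edges(review, submission)
    s = CompareGraphSVOEdges.new.compare_svo_edges(review, submission)
    (v + e + s) / 3.0 # the real class applies alpha/beta/gamma weights
  end
end
```
&lt;br /&gt;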
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it into a separate method and simply call that method with appropriate parameters wherever required. It checks that an edge is not nil and that neither of its vertex ids is invalid (negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
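For illustration, the extracted check can be exercised with stubbed edges; the Struct-based vertices and edges below are stand-ins for Expertiza's real graph classes, which carry more state.&lt;br /&gt;

```ruby
# Stand-in structures for demonstration only.
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

def edge_nil_cond_check(edge)
  # an edge is usable when it exists and neither endpoint carries the
  # invalid node id -1
  !edge.nil? && edge.in_vertex.node_id != -1 && edge.out_vertex.node_id != -1
end

good_edge = Edge.new(Vertex.new(3), Vertex.new(7))
bad_edge  = Edge.new(Vertex.new(-1), Vertex.new(7))
edge_nil_cond_check(good_edge) # => true
edge_nil_cond_check(bad_edge)  # => false
edge_nil_cond_check(nil)       # => false
```
&lt;br /&gt;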
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic.''' &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
&lt;br /&gt;
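The two versions can be checked for equivalence against all four nil/non-nil label combinations. The Struct and the EQUAL/DISTINCT values below are stand-ins for demonstration; the divisors 1 and 2 follow the comments in the original code.&lt;br /&gt;

```ruby
# Stand-ins for demonstration; EQUAL and DISTINCT mirror the divisors
# (1 and 2) mentioned in the comments of the original method.
EQUAL = 1
DISTINCT = 2
LabeledEdge = Struct.new(:label)

def compare_labels(edge1, edge2)
  # labels match when both are present and equal (case-insensitively),
  # or when both are absent; any other combination is distinct
  if (!edge1.label.nil? && !edge2.label.nil? && edge1.label.downcase == edge2.label.downcase) ||
     (edge1.label.nil? && edge2.label.nil?)
    EQUAL
  else
    DISTINCT
  end
end

compare_labels(LabeledEdge.new("NMOD"), LabeledEdge.new("nmod")) # => EQUAL
compare_labels(LabeledEdge.new("NMOD"), LabeledEdge.new("OBJ"))  # => DISTINCT
compare_labels(LabeledEdge.new(nil),    LabeledEdge.new(nil))    # => EQUAL
compare_labels(LabeledEdge.new(nil),    LabeledEdge.new("NMOD")) # => DISTINCT
```
&lt;br /&gt;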
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace or supplement it with tools such as Capybara and RSpec. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining &amp;quot;test&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be approached in two ways:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] builds on the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in a natural language, making them more accessible to people outside the development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber doesn't know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests exercising all six.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected and actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices, and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber loads any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   # method_name selects the method under test; compare_vertices is shown here&lt;br /&gt;
   @actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true if both are equal\.$/ do&lt;br /&gt;
   # @expected is prepared by the setup code run from the Before hook&lt;br /&gt;
   assert_equal(@expected, @actual)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
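&lt;br /&gt;
As an illustrative sketch (the class, method, and vertex names below are stand-ins, not the real Expertiza classes), the setup method might build small dummy graphs like this:&lt;br /&gt;

```ruby
# Hypothetical sketch of a setup helper in the spirit of Deg_rel_setup.
# Vertex and the arrays below are illustrative stand-ins, not the real
# Expertiza graph classes.
Vertex = Struct.new(:name, :type, :node_id)

class DegRelSetup
  attr_reader :review_vertices, :subm_vertices

  # Build tiny "graphs" as arrays of labelled vertices that the step
  # definitions can hand to the comparison methods.
  def setup
    @review_vertices = [Vertex.new("method", "NOUN", 0), Vertex.new("compares", "VERB", 1)]
    @subm_vertices   = [Vertex.new("method", "NOUN", 0), Vertex.new("checks", "VERB", 1)]
  end
end
```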
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or in any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it runs for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around, or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed before every scenario tagged @one.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute: &lt;br /&gt;
 rake db:migrate                          #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel_set.feature    #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at almost every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which the spell checker needs when checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. Overall, the effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the previous class was 'F', while after refactoring the grade is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
# [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber ]&lt;br /&gt;
# [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82195</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82195"/>
		<updated>2013-10-31T03:39:56Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important because a review that is not relevant to the submission is considered invalid and should not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned the grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate], based on metrics such as code complexity, duplication, lines of code per method, etc. Since this file is important for research on Expertiza, it should be refactored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective, or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective, or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalizations] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical], and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this case, only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges, etc.). The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the formula above. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
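&lt;br /&gt;
With those example weights, the scaled relevance is just a weighted sum of the three match scores. A minimal sketch, assuming the three score arguments are already computed by the matching steps:&lt;br /&gt;

```ruby
# Sketch of the weighted relevance combination with the example weights
# above. The three match scores (phrase/token, context, and sentence
# structure) are assumed to be computed elsewhere.
ALPHA = 0.55
BETA  = 0.35
GAMMA = 0.10

def scaled_relevance(phrase_match, context_match, structure_match)
  ALPHA * phrase_match + BETA * context_match + GAMMA * structure_match
end
```

Since the weights sum to 1, three perfect scores of 1.0 give a relevance of 1.0.&lt;br /&gt;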
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and edges of type VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. In other words, it compares edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons involving SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, and VERB-OBJECT edges are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file, the tester needs to undertake some tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the application calls the ''new'' method in ''response_controller.rb'', which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not directly implement this design pattern, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above. It calculates the relevance based on comparisons done between submission and review graphs. We have grouped together similar functionality in one class. Following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of creating such a design is to make a clear separation of logic across different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegation of functionality to helper classes makes a clear segregation of each class's role.&lt;br /&gt;
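&lt;br /&gt;
The delegation can be sketched as follows; the stub classes and fixed metric values are purely illustrative, not the real Expertiza implementations:&lt;br /&gt;

```ruby
# Illustrative sketch of the delegation structure: the main class composes
# the helper classes and combines the metrics they return. The stub
# metric values below are placeholders.
class CompareGraphVertices
  def metric; 0.9; end  # stub: vertex-match score
end

class CompareGraphEdges
  def metric; 0.6; end  # stub: edge-match score
end

class CompareGraphSVOEdges
  def metric; 0.3; end  # stub: SVO-edge-match score
end

class DegreeOfRelevance
  def initialize
    @helpers = [CompareGraphVertices.new, CompareGraphEdges.new, CompareGraphSVOEdges.new]
  end

  # Average the helpers' metrics, mirroring the equal-weight formula.
  def get_relevance
    scores = @helpers.map { |h| h.metric }
    scores.sum / scores.length
  end
end
```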
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code which was present in many methods. We moved it to a separate method and simply call this method with the appropriate parameters wherever required. It checks that an edge is not nil and that neither of its vertices has an invalid (negative) id.&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
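&lt;br /&gt;
To see the check in isolation, here is a self-contained sketch; Vertex and Edge are Struct stand-ins for the real graph classes, and the helper is written with Ruby's 'and' operator but is otherwise equivalent:&lt;br /&gt;

```ruby
# Self-contained sketch of the extracted nil/validity check. Vertex and
# Edge are Struct stand-ins for the real Expertiza graph classes.
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

def edge_nil_cond_check(edge)
  return false if edge.nil?
  # Both endpoints must carry a valid (non-negative) node id.
  edge.in_vertex.node_id != -1 and edge.out_vertex.node_id != -1
end

valid_edge   = Edge.new(Vertex.new(3), Vertex.new(7))
invalid_edge = Edge.new(Vertex.new(-1), Vertex.new(7))
```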
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
   end #end of for loop for review edges&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-written some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample code that we have re-written to make the logic more simple and maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
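&lt;br /&gt;
The simplified logic can be exercised on its own. The sketch below uses a Struct stand-in for the edge class, with EQUAL and DISTINCT following the &amp;quot;divide by 1&amp;quot; / &amp;quot;divide by 2&amp;quot; convention from the original comments:&lt;br /&gt;

```ruby
# Stand-alone sketch of the simplified label comparison. LabeledEdge is
# a Struct stand-in for the real edge class.
EQUAL    = 1  # matching labels: divide by 1
DISTINCT = 2  # differing labels: divide by 2
LabeledEdge = Struct.new(:label)

def compare_labels(edge1, edge2)
  return EQUAL if edge1.label.nil? and edge2.label.nil?
  return DISTINCT if edge1.label.nil? or edge2.label.nil?
  edge1.label.downcase == edge2.label.downcase ? EQUAL : DISTINCT
end
```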
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete refactoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation Automated Testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace it with Capybara, RSpec, etc. Testing is quite important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time on writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done with two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD, enabling software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in natural language, making them more accessible to people outside of the development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red.) Implement the simplest solution that will cause the test to pass (Green.) Search for duplication and remove it (Refactor.) &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree_of_relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature tests the degree of relevance by comparing the expected and actual values for the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree_of_relevance, such as comparing edges, vertices and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber will load any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 &lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |_method|&lt;br /&gt;
   # @expected and the graph instance variables are prepared in the tagged Before hook&lt;br /&gt;
   @actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices,&lt;br /&gt;
                                    @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
 end&lt;br /&gt;
 &lt;br /&gt;
 Then /^It will return true if both are equal\.$/ do&lt;br /&gt;
   assert_equal(@expected, @actual)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates graphs of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
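As a sketch of what such a setup might build (the Vertex and Edge structs below are simplified stand-ins for Expertiza's real graph classes, whose attributes differ):&lt;br /&gt;

```ruby
# Simplified stand-ins for Expertiza's graph classes; assumption:
# the real classes carry more attributes than shown here.
Vertex = Struct.new(:node_id, :name, :type)
Edge   = Struct.new(:in_vertex, :out_vertex, :label)

class Deg_rel_setup
  attr_reader :subm_vertices, :review_vertices, :subm_edges

  def setup
    # Two tiny graphs: a submission and a review sharing one noun.
    @subm_vertices   = [Vertex.new(0, 'code', 'NOUN'), Vertex.new(1, 'works', 'VERB')]
    @review_vertices = [Vertex.new(0, 'code', 'NOUN'), Vertex.new(1, 'fails', 'VERB')]
    @subm_edges      = [Edge.new(@subm_vertices[0], @subm_vertices[1], 'nsubj')]
    self
  end
end
```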
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the same order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute: &lt;br /&gt;
 rake db:migrate                          #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature        #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our applications.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs and the application was throwing a lot of errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and fixing bugs in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class before refactoring was 'F', while after refactoring the grade is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven approach.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
# [http://net.tutsplus.com/tutorials/ruby/ruby-for-newbies-testing-web-apps-with-capybara-and-cucumber Testing with Cucumber ]&lt;br /&gt;
# [https://github.com/cucumber/cucumber/wiki/Hooks Cucumber Hooks]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82185</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82185"/>
		<updated>2013-10-31T03:34:19Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Conclusion */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce, to assess the quality of products on sites like Amazon, eBay etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and reviews (or metareviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, lines of code per method etc. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step in graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalizations] (noun forms of verbs or adjectives), etc.&lt;br /&gt;
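A toy sketch of the idea (the synonym table and method below are purely illustrative; Expertiza's WordnetBasedSimilarity is far richer, handling hypernyms, nominalizations, spelling correction etc.):&lt;br /&gt;

```ruby
# Illustrative stand-in for a WordNet lookup: two tokens are treated
# as related if they are identical or appear in the same (hard-coded)
# synonym set. Real WordNet lookups traverse much larger synsets.
SYNSETS = [
  %w[error bug defect],
  %w[good fine great]
].freeze

def related?(a, b)
  a = a.downcase
  b = b.downcase
  return true if a == b
  SYNSETS.any? { |set| set.include?(a) && set.include?(b) }
end
```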
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges etc.). This matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
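Under that rule, the weighted combination can be sketched as follows (the weight values and the method itself are illustrative, not Expertiza's committed code):&lt;br /&gt;

```ruby
# Illustrative weighted version of the relevance formula: the three
# scores come from the phrase, context and structure matching steps
# described above, each in [0, 1]; the weights default to the sample
# values from the text and should satisfy alpha > beta > gamma.
def weighted_relevance(phrase, context, structure, alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase + beta * context + gamma * structure
end
```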
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex is compared with every other vertex to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. Unlike the previous method, it compares edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison done is between edges of same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the application calls the ''new'' method in ''response_controller.rb'', which renders the page where the user fills in the review. On submission, it calls the ''create'' method of the same file, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement the pattern directly, its core idea of defining the skeleton of an algorithm in one class and deferring some steps elsewhere allowed us to come up with a similar design which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together here. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metrics to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the appropriate metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together here.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the appropriate metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move common code out into re-usable methods:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code which was present in many methods. We moved it into a separate method and called that method with the appropriate parameters wherever required. It checks that an edge is not nil and that the ids of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
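The small helpers extracted during this refactoring are mostly trivial; for instance, calculate_avg_match could plausibly be implemented like this (a sketch under our naming assumptions, not the committed code):&lt;br /&gt;

```ruby
# Hypothetical sketch of the extracted helper: average the cumulative
# best-match score over the number of review edges that found a match,
# guarding against division by zero when nothing matched.
def calculate_avg_match(cum_match, count)
  return 0.0 if count.zero?
  cum_match.to_f / count.to_f
end
```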
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
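For comparison, the same truth table can be collapsed further; this is an alternative sketch (EQUAL and DISTINCT stand in for the class constants, and LabelEdge is a hypothetical minimal edge class used only for illustration):&lt;br /&gt;

```ruby
# EQUAL/DISTINCT stand in for the constants used in the real class.
EQUAL    = 1
DISTINCT = 2

# Hypothetical minimal edge carrying only a label.
LabelEdge = Struct.new(:label)

# Labels are equal when both are nil, or both present and equal
# ignoring case; every other combination is distinct.
def compare_labels(edge1, edge2)
  a = edge1.label
  b = edge2.label
  return EQUAL if a.nil? && b.nil?
  return DISTINCT if a.nil? || b.nil?
  a.casecmp(b).zero? ? EQUAL : DISTINCT
end
```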
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace or augment it with tools such as RSpec and Capybara. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best add only a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that the extra work of testing seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be approached in two ways:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD with ideas from domain-driven design, enabling software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in natural language, making them more accessible to people outside the development and testing teams. &lt;br /&gt;
Start by writing a failing test (red). Implement the simplest solution that will cause the test to pass (green). Search for duplication and remove it (refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in the application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. Step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
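&lt;br /&gt;
Under the hood, the matching between a scenario step and its definition is plain regular-expression matching; the snippet below illustrates the idea outside Cucumber (the step text and pattern are hypothetical).&lt;br /&gt;
&lt;br /&gt;
```ruby
# How Cucumber-style step matching works, illustrated in plain Ruby.
# The pattern's capture group becomes the step definition's block argument.
step_text = %q{I compare actual value and expected value of 'compare_vertices'}
pattern   = /^I compare actual value and expected value of '(\S+)'$/

match = pattern.match(step_text)
method_name = match[1] if match
```
&lt;br /&gt;
Here method_name captures the quoted method name from the step text, just as a capture group is passed into a Cucumber step definition's block.&lt;br /&gt;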
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by finding the comparison between the expected values and the actual values of the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all the six major modules of degree of relevance like comparing edges, vertices and subject-verb objects.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for undefined steps, provided by Cucumber. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   # @expected holds the known-good value for method_name, prepared during setup&lt;br /&gt;
   @actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true if both are equal\.$/ do&lt;br /&gt;
   assert_equal(@expected, @actual)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
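&lt;br /&gt;
A minimal sketch of what such a setup might construct is shown below; the Vertex and Edge structs and the sample words are illustrative stand-ins, not Expertiza's actual classes or data.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Illustrative stand-ins for the graph model that setup builds; these
# Structs are hypothetical, not Expertiza's actual Vertex/Edge classes.
Vertex = Struct.new(:node_id, :name, :type)
Edge   = Struct.new(:in_vertex, :out_vertex, :label)

class Deg_rel_setup
  attr_reader :review_edges, :subm_edges

  def setup
    # Build a tiny submission graph and review graph for the tests to compare.
    code  = Vertex.new(0, 'code', 'SUBJECT')
    works = Vertex.new(1, 'works', 'VERB')
    @subm_edges   = [Edge.new(code, works, 'nsubj')]
    @review_edges = [Edge.new(Vertex.new(0, 'code', 'SUBJECT'),
                              Vertex.new(1, 'runs', 'VERB'), 'nsubj')]
  end
end
```
&lt;br /&gt;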
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks will run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around, or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before('@one') do&lt;br /&gt;
   set = Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
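&lt;br /&gt;
The tagged-hook mechanism can be pictured as a simple tag-to-blocks registry. The toy sketch below is not the real Cucumber internals, just an illustration of the idea: a block registered under a tag runs only before scenarios carrying that tag.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Toy model of tagged hooks (not the real Cucumber API): blocks
# registered under a tag run only before scenarios carrying that tag.
HOOKS = Hash.new { |hash, tag| hash[tag] = [] }

def register_before(tag, hook)
  HOOKS[tag].push(hook)
end

def run_scenario(tags, log)
  # Fire every hook registered for each of the scenario's tags,
  # then run the scenario body itself.
  tags.each { |tag| HOOKS[tag].each { |hook| hook.call(log) } }
  log.push('scenario body')
end

register_before('@one', lambda { |log| log.push('setup for @one') })
```
&lt;br /&gt;
Running a scenario tagged @one would execute the registered setup block first; a scenario with any other tag would skip it.&lt;br /&gt;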
&lt;br /&gt;
*To run the tests, we simply need to execute:&lt;br /&gt;
 rake db:migrate                        #Initialize the test database&lt;br /&gt;
 cucumber features/deg_rel_set.feature  #Execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs and the application was throwing a lot of errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and fixing bugs in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which provides the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class, DegreeOfRelevance, by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class before refactoring was 'F', while after refactoring it is 'C'.&lt;br /&gt;
We also tested the functionality of the class using Cucumber, a behavior-driven development tool.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82184</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82184"/>
		<updated>2013-10-31T03:32:39Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and reviews (or metareviews) for an assignment. This evaluation is important to ensure that a review which is not relevant to the submission is considered invalid and does not impact the student's grade.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate], on metrics such as code complexity, duplication, lines of code per method etc. Since this file is important for the research on Expertiza, it should be re-factored to reduce complexity and duplication issues and introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph, capturing both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective, or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation divides the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below, and graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
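&lt;br /&gt;
The first step, splitting the text into segments on a punctuation list, can be sketched as follows; the exact punctuation set used by Expertiza is not shown here, so the list below is an illustrative guess.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Sketch of the segmentation step: break input text into segments on a
# predefined punctuation list (this particular list is illustrative).
PUNCTUATION = /[.,;:!?]/

def segments(text)
  text.split(PUNCTUATION).map { |s| s.strip }.reject { |s| s.empty? }
end
```
&lt;br /&gt;
Each resulting segment is then POS-tagged and turned into vertices and edges by the algorithm above.&lt;br /&gt;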
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalizations] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same or different syntax, and edges of different types, across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer contiguous sequences (triple edges etc.). The matching captures similarity across segments, including voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
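&lt;br /&gt;
With those sample weights, the weighted combination can be written as a one-line function; the function and argument names below are illustrative, not Expertiza's actual identifiers.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Weighted relevance with separate weights per match type (names are
# illustrative). The three weights are expected to sum to 1.0.
def relevance(phrase_match, context_match, structure_match,
              alpha = 0.55, beta = 0.35, gamma = 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end
```
&lt;br /&gt;
With all three match scores at 1.0 the result is 1.0, and because alpha is largest, phrase matching contributes most to the final score.&lt;br /&gt;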
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex is compared with every other vertex to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison done is between edges of same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside models and to bring control to this file, the tester needs to undertake some tasks. The diagram below gives a quick look into what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, it calls ''new'' method in ''response_controller.rb'' which renders the page where user fills up the review. On submission, it calls ''create'' method of the same file, which then redirects to ''saving'' method in the same file again. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'' which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not implement the pattern directly, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on the comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. Following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the appropriate metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code which was present in many methods. We moved it to a separate method and called that method with appropriate parameters wherever required. It checks that an edge is not nil and that the ids of its vertices are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
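&lt;br /&gt;
The effect of this guard clause can be seen with minimal stand-in objects; the Vertex and Edge structs below are illustrative, and only the nil/sentinel behaviour is exercised.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stand-in structs to exercise the guard clause; node_id -1 marks an
# invalid vertex, mirroring the check in the extracted method above.
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

def edge_nil_cond_check(edge)
  return false if edge.nil?
  edge.in_vertex.node_id != -1 and edge.out_vertex.node_id != -1
end
```
&lt;br /&gt;
A nil edge, or an edge touching a vertex with id -1, is rejected before any comparison work is done.&lt;br /&gt;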
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
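&lt;br /&gt;
Stripped of the graph details, the loop above follows a &amp;quot;best match per review edge, then average&amp;quot; pattern. The sketch below shows that core pattern with plain score arrays; the helper names mirror the refactored code, but the bodies are illustrative guesses, not the Expertiza implementations.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Core pattern of the edge comparison: each row holds one review edge's
# match scores against every submission edge; keep the best score per
# review edge and average the results. Helper bodies are illustrative.
def calculate_avg_match(cum_edge_match, count)
  count.zero? ? 0.0 : cum_edge_match / count
end

def average_best_match(score_rows)
  best = score_rows.map { |row| row.max }   # best submission match per review edge
  calculate_avg_match(best.sum, best.length)
end
```
&lt;br /&gt;
For two review edges whose best submission matches score 0.8 and 0.5, the method returns their average, 0.65.&lt;br /&gt;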
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic''' &lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we re-wrote to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label.nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil?)) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
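The same logic can be condensed a step further in plain Ruby; the following is a sketch, with EQUAL and DISTINCT standing in for the class's constants (their values follow the &amp;quot;divide by&amp;quot; comments in the original code, since the result is used as a divisor):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Labels match when both are nil, or when both are present and equal
# ignoring case. Per the "divide by" comments, EQUAL is 1 and DISTINCT
# is 2 (the result of compare_labels is used as a divisor).
EQUAL = 1
DISTINCT = 2

def compare_labels(edge1, edge2)
  both_nil  = (edge1.label.nil? and edge2.label.nil?)
  both_same = (!edge1.label.nil? and !edge2.label.nil? and
               edge1.label.downcase == edge2.label.downcase)
  (both_nil or both_same) ? EQUAL : DISTINCT
end
```
&lt;br /&gt;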
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework and, if we want, we can replace or augment it with tools such as RSpec and Capybara. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that the extra work of testing seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining &amp;quot;test&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of developing carelessly.&lt;br /&gt;
Testing can be done using two main approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD with ideas that enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in natural language, making them more accessible to people outside the development and testing teams.&lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree-of-relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature tests the degree of relevance by comparing the expected and actual values for the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices, and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber loads any files in the features/step_definitions folder to find steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method_name|&lt;br /&gt;
   expected = qty.to_i&lt;br /&gt;
   if(assert_equal(expected, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
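&lt;br /&gt;
As a self-contained illustration, the kind of dummy vertices and edges setup might build can be sketched with Structs (these are stand-ins, not Expertiza's real models; attribute names follow the code samples in this article):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Vertex and Edge here are illustrative Structs standing in for
# Expertiza's real models; attribute names follow the article's code.
Vertex = Struct.new(:node_id, :name, :type)
Edge   = Struct.new(:in_vertex, :out_vertex, :label)

# Build a single subject-verb edge of the kind setup could add to a graph.
def build_sample_edge
  subject = Vertex.new(0, "reviewer", "subject")
  verb    = Vertex.new(1, "explains", "verb")
  Edge.new(subject, verb, "nsubj")
end
```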
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks of code at various points in the test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the same order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute:&lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available, and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs and the application was throwing a lot of errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and fixing bugs in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. A lot of routes were missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which provides the spell checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class improved from 'F' before refactoring to 'C' after.&lt;br /&gt;
We also successfully tested the functionality using Cucumber.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82179</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82179"/>
		<updated>2013-10-31T03:30:00Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Cucumber */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation is important because a review that is not relevant to the submission is considered invalid and should not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned a grade of &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be refactored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs], called the word-order graph, to capture both governance and ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are labeled with dependency (word-modifier) information.&lt;br /&gt;
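&lt;br /&gt;
The segmentation step described above can be sketched in Ruby as follows (the punctuation list here is illustrative, not Expertiza's actual list):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Minimal sketch of the first graph-generation step: splitting input
# text into segments on a predefined punctuation list (this particular
# list is illustrative, not the one Expertiza uses).
PUNCTUATION_LIST = /[.,;:!?]/

def segment(text)
  text.split(PUNCTUATION_LIST).map { |s| s.strip }.reject { |s| s.empty? }
end
```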
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same and different syntax, and edges of different types, across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in another. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges) that constitute a complete segment (e.g. subject–verb–object) across graphs. Only single and double edges are considered for text matching, not longer runs of contiguous edges (triple edges etc.). The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to ''degree_of_relevance.rb''. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
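&lt;br /&gt;
Using these example weights, the weighted combination can be sketched as follows (parameter names are illustrative, not Expertiza's actual signature):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Sketch of the weighted relevance formula; the keyword defaults use the
# example values from the article (parameter names are illustrative).
def scaled_relevance(phrase_match, context_match, structure_match,
                     alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end
```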
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
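&lt;br /&gt;
The all-pairs comparison can be sketched as follows (a simplified illustration; the real method also consults the POS tagger, WordNet and a spell checker):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Illustrative all-pairs comparison: every review vertex name is checked
# against every submission vertex name. The real compare_vertices also
# consults a POS tagger, WordNet and a spell checker.
def matching_vertex_pairs(review_names, subm_names)
  pairs = []
  review_names.each do |rv|
    subm_names.each do |sv|
      pairs.push([rv, sv]) if rv.downcase == sv.downcase
    end
  end
  pairs
end
```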
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges. It compares edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is done between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT. However, the comparison is done between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not directly implement this design pattern, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons done between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges across the two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of creating such a design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of the role of each class.&lt;br /&gt;
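&lt;br /&gt;
The delegation described above can be illustrated with a minimal runnable sketch (the scoring body is a placeholder fraction of shared tokens, not Expertiza's real metric, and the simplified signatures are assumptions):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Runnable sketch of the delegation shown in the figure: the main class
# forwards vertex comparison to a helper class. The scoring body is a
# placeholder fraction of shared tokens, not Expertiza's real metric.
class CompareGraphVertices
  def compare_vertices(review_tokens, subm_tokens)
    return 0.0 if review_tokens.empty?
    subm = subm_tokens.map { |t| t.downcase }
    shared = review_tokens.count { |t| subm.include?(t.downcase) }
    shared.to_f / review_tokens.length
  end
end

class DegreeOfRelevance
  def get_relevance(review_tokens, subm_tokens)
    CompareGraphVertices.new.compare_vertices(review_tokens, subm_tokens)
  end
end
```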
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and call that method with the appropriate parameters wherever required. It implements a nil check on a graph edge and verifies that the vertex ids are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
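&lt;br /&gt;
The refactored loop shown earlier also calls a companion helper, ''wnet_frequent_word_check''; the following is a sketch of what it might look like, based on the duplicated frequent-word checks in the original code (the real Expertiza helper may differ in detail):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Sketch of the companion helper wnet_frequent_word_check used in the
# refactored loop: true when both vertex names of an edge are frequent
# words (the real Expertiza helper may differ in detail).
def wnet_frequent_word_check(wnet, edge)
  wnet.is_frequent_word(edge.in_vertex.name) and
    wnet.is_frequent_word(edge.out_vertex.name)
end
```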
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic.''' &lt;br /&gt;
&lt;br /&gt;
Below is sample code that we have re-written to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
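The simplified predicate is easy to check in isolation. Below is a minimal, self-contained sketch (the Edge struct and the EQUAL/DISTINCT values are stand-ins of ours for the real Expertiza classes, not the actual implementation) that exercises all four label combinations:&lt;br /&gt;
&lt;br /&gt;
```ruby
# Stand-ins for the real Expertiza types (hypothetical, for illustration).
EQUAL = 1
DISTINCT = 2
Edge = Struct.new(:label)

# Same logic as the refactored compare_labels: labels are EQUAL when both
# are nil, or when both are present and match case-insensitively.
def compare_labels(edge1, edge2)
  if ((!edge1.label.nil? && !edge2.label.nil?) &&
      (edge1.label.downcase == edge2.label.downcase)) ||
     (edge1.label.nil? && edge2.label.nil?)
    EQUAL
  else
    DISTINCT
  end
end

compare_labels(Edge.new('nsubj'), Edge.new('NSUBJ'))  # EQUAL
compare_labels(Edge.new('nsubj'), Edge.new('dobj'))   # DISTINCT
compare_labels(Edge.new(nil), Edge.new(nil))          # EQUAL
compare_labels(Edge.new('nsubj'), Edge.new(nil))      # DISTINCT
```
Collapsing the nested conditionals into a single boolean expression makes the only-one-label-nil case fall through to DISTINCT by default, which is what the original nested version was (incorrectly) trying to spell out.&lt;br /&gt;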
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation Automated Testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace or supplement it with tools such as RSpec and Capybara. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs for model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that the extra work seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done using two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD with practices that enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
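As a minimal sketch of the cycle (the method and the 0.5 threshold are invented examples of ours, not Expertiza code): write an assertion for a method that does not yet exist (Red), add the simplest implementation that passes (Green), then clean up while the assertion keeps passing (Refactor):&lt;br /&gt;
&lt;br /&gt;
```ruby
# Green: the simplest implementation that makes the assertions below pass
# (the 0.5 threshold is an arbitrary example, not Expertiza's value).
def relevant?(score)
  score >= 0.5
end

# Red: these assertions fail (NoMethodError) until relevant? is defined.
raise 'red' unless relevant?(0.8) == true
raise 'red' unless relevant?(0.2) == false
# Refactor: internals can now be simplified while the same assertions
# keep guarding the behavior.
```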
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. The step definitions are written in Ruby, which gives a lot of flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests for all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values against the actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps that Cucumber reports. Cucumber will load any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   @actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true if both are equal\.$/ do&lt;br /&gt;
   assert_equal(@expected, @actual) #@expected is prepared along with the dummy data during setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
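As a rough sketch of what such a setup might build, the snippet below constructs a tiny submission graph by hand. The Vertex and Edge structs here are simplified stand-ins of ours for Expertiza's graph classes, and the sentence is an invented example:&lt;br /&gt;
&lt;br /&gt;
```ruby
# Simplified stand-ins for Expertiza's graph classes (hypothetical shapes).
Vertex = Struct.new(:node_id, :name, :type)
Edge   = Struct.new(:in_vertex, :out_vertex, :label)

class Deg_rel_setup
  attr_reader :subm_vertices, :subm_edges

  def setup
    # a tiny submission graph for "students submit assignments"
    s = Vertex.new(0, 'students', :subject)
    v = Vertex.new(1, 'submit', :verb)
    o = Vertex.new(2, 'assignments', :object)
    @subm_vertices = [s, v, o]
    @subm_edges    = [Edge.new(s, v, 'nsubj'), Edge.new(v, o, 'dobj')]
    # the review graph would be built the same way
  end
end
```
In the real setup the corresponding review graph and the expected metric values are prepared similarly before each tagged scenario runs.&lt;br /&gt;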
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but tagged hooks can be used for more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks will be run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed before each scenario tagged @one.&lt;br /&gt;
&lt;br /&gt;
 Before('@one') do&lt;br /&gt;
   set = Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute: &lt;br /&gt;
 rake db:migrate                          #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel_set.feature    #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable; even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original expertiza code had a lot of bugs and the application was throwing a lot of errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and fixing bugs in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. There were a lot of routes missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given DegreeOfRelevance class by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was previously 'F', while after refactoring it is 'C'.&lt;br /&gt;
We also successfully tested the functionality using Cucumber.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82175</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82175"/>
		<updated>2013-10-31T03:29:02Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Testing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications, to assess the quality of products on sites like Amazon, e-bay etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product) it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important to ensure that a review which is not relevant to the submission is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be re-factored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. Vertices and edges are created using the algorithm below; graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (noun forms of adjectives), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges, not longer runs of contiguous edges (triple edges, etc.), are considered for text matching. This matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
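As a sketch, the weighted combination can be computed as follows. The method name and keyword arguments are ours for illustration, not the exact Expertiza identifiers; the three inputs stand for the phrase, context and structure match averages described above:&lt;br /&gt;
&lt;br /&gt;
```ruby
# Weighted relevance from the three match averages; alpha, beta and
# gamma are expected to sum to 1, with alpha > beta > gamma.
def weighted_relevance(phrase, context, structure,
                       alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase + beta * context + gamma * structure
end

weighted_relevance(0.8, 0.6, 0.4)  # ~0.69
```
With alpha = beta = gamma = 0.33 this reduces to the fixed-weight formula shown earlier.&lt;br /&gt;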
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices from across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e., SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges, i.e., edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT; however, the comparison is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the application calls the ''new'' method in ''response_controller.rb'', which renders the page where the user fills in the review. On submission, it calls the ''create'' method of the same file, which then redirects to the ''saving'' method, again in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not directly implement this design pattern, the idea of defining the skeleton of an algorithm in one class and deferring some steps to other classes allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above, based on comparisons between the submission and review graphs. We have grouped similar functionality together in each class. The following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges from across two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in it. This ensures that the class is responsible only for comparing graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices; its sole responsibility is comparing vertices and calculating the metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for comparing edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in it.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of this design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the appropriate metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code which was present in many methods. We moved it into a separate method and call that method with appropriate parameters wherever required. It checks that an edge of the graph is not nil and that neither of its vertex ids is invalid (negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
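To see the extracted guard in action, here is a self-contained sketch; the Vertex and Edge structs are simplified stand-ins of ours for Expertiza's graph classes:&lt;br /&gt;
&lt;br /&gt;
```ruby
# Simplified stand-ins for Expertiza's graph classes (hypothetical shapes).
Vertex = Struct.new(:node_id, :name)
Edge   = Struct.new(:in_vertex, :out_vertex)

# Extracted guard: an edge is usable only if it exists and both of its
# vertices carry valid (non-negative) node ids.
def edge_nil_cond_check(edge)
  !edge.nil? && edge.in_vertex.node_id != -1 && edge.out_vertex.node_id != -1
end

good = Edge.new(Vertex.new(0, 'students'), Vertex.new(1, 'submit'))
bad  = Edge.new(Vertex.new(-1, ''), Vertex.new(1, 'submit'))

edge_nil_cond_check(good)  # true
edge_nil_cond_check(bad)   # false
edge_nil_cond_check(nil)   # false
```
Replacing the repeated inline nil-and-id checks with one named guard makes each caller's condition read as intent rather than mechanics.&lt;br /&gt;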
&lt;br /&gt;
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
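The refactored method above relies on several small extracted helpers. As an illustration, the following is a minimal, self-contained sketch of what a helper such as ''sum_cou_asymm_calculation'' (plus the averaging step) might look like, reconstructed from the pre-refactoring logic; the ''Vertex'' and ''Edge'' structs here are stand-ins for Expertiza's graph classes, not the real ones.&lt;br /&gt;

```ruby
# Illustrative reconstruction of one helper extracted during refactoring.
# Vertex and Edge are stand-ins for Expertiza's graph classes.
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

# Each score is added separately because a frequent word has no entry in
# vertex_match, so the lookup returns nil and must be skipped.
def sum_cou_asymm_calculation(vertex_match, rev_edge, subm_edge)
  sum = 0.0
  cou = 0
  [[rev_edge.in_vertex.node_id,  subm_edge.out_vertex.node_id],
   [rev_edge.out_vertex.node_id, subm_edge.in_vertex.node_id]].each do |r, s|
    score = vertex_match[r][s]
    next if score.nil?
    sum += score
    cou += 1
  end
  [cou, sum]
end

# Averages the accumulated score, guarding against division by zero,
# as the if(cou > 0) branch did before refactoring.
def average_match(cou, sum)
  cou > 0 ? sum / cou : 0.0
end
```

Extracting the nil-guarded accumulation this way removes the four near-identical `if(!@vertex_match[...].nil?)` blocks that the pre-refactoring method repeated for each branch.&lt;br /&gt;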
* '''Re-written some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is a sample code that we have re-written to make the logic more simple and maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label.nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil?)) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
&lt;br /&gt;
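The behavior of the refactored method can be checked with a small self-contained sketch. The EQUAL/DISTINCT values below follow the ''divide by 1'' / ''divide by 2'' comments in the original code, and ''Edge'' is a stand-in struct, not Expertiza's real edge class.&lt;br /&gt;

```ruby
# Stand-in constants: the caller divides a match score by this value,
# per the "#divide by 1" / "#divide by 2" comments in the original code.
EQUAL    = 1
DISTINCT = 2
Edge = Struct.new(:label)

# Refactored compare_labels: EQUAL when both labels match
# case-insensitively or both are nil; DISTINCT otherwise.
def compare_labels(edge1, edge2)
  both_nil   = edge1.label.nil? && edge2.label.nil?
  both_match = !edge1.label.nil? && !edge2.label.nil? &&
               edge1.label.downcase == edge2.label.downcase
  (both_nil || both_match) ? EQUAL : DISTINCT
end
```

Collapsing the four branches into one boolean expression keeps the two EQUAL cases (identical labels, or no labels at all) together and makes DISTINCT the single fall-through result.&lt;br /&gt;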
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation automated testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace it with Capybara, RSpec etc. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all, or at best only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that adding extra work seems unimportant. Or maybe there is a perceived time constraint: spending time on writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Test-driven_development Test-driven development] is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*[http://en.wikipedia.org/wiki/Behavior-driven_development Behavior-driven development] combines the general techniques and principles of TDD to enable software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in natural language, making them more accessible to people outside the development or testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used [http://en.wikipedia.org/wiki/Cucumber_(software) Cucumber] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by finding the comparison between the expected values and the actual values of the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all the six major modules of degree of relevance like comparing edges, vertices and subject-verb objects.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for undefined steps, provided by Cucumber. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method|&lt;br /&gt;
   expected = qty&lt;br /&gt;
   if(assert_equal(expected, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it is run for, but tagged hooks can be used if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks will be run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, and hence the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the test we simply need to execute, &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original expertiza code had a lot of bugs and the application was throwing a lot of errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and fixing bugs in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. There were a lot of routes missing from the routes.rb file, which we fixed. Below is a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used for checking relevance between the submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent in debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class DegreeOfRelevance by using ideas from the Template design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class before refactoring was 'F', while after refactoring it is 'C'.&lt;br /&gt;
We also successfully tested the functionality using Cucumber.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82170</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=82170"/>
		<updated>2013-10-31T03:27:07Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications, to assess the quality of products on sites like Amazon, e-bay etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, choosing to purchase a product) it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or meta-reviews) for an assignment. This evaluation is important: if a review is not relevant to the submission, it is considered invalid and does not impact students' grades.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long and complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate], on metrics such as code complexity, duplication, lines of code per method etc. Since this file is important for the research on Expertiza, it should be re-factored to reduce complexity and duplication issues and introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called a word-order graph to capture both the governance and the ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
In a word-order graph, vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. The vertices and edges are created using the algorithm below. Graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions would involve checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Text_normalization normalizations] (noun form of adjectives) etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical] and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this case, only single and double edges are considered, and not more contiguous edges (triple edges etc.), for text matching. The matching captures similarity across segments and it captures voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to different types of matches instead of a fixed value of 0.33 in above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
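Assuming the three component scores (phrase/token, context and sentence-structure matches) are combined linearly with these weights, which is our reading of the formula figure above rather than a verbatim copy of the Expertiza code, the weighted relevance can be sketched as:&lt;br /&gt;

```ruby
# Sketch of the weighted combination described above. The weight values are
# the example set from the text; the method name is ours, not Expertiza's.
ALPHA = 0.55  # weight on the strongest signal
BETA  = 0.35
GAMMA = 0.10  # ALPHA > BETA > GAMMA

def scaled_relevance(token_match, context_match, structure_match)
  ALPHA * token_match + BETA * context_match + GAMMA * structure_match
end
```

Because the weights sum to 1, a perfect score of 1.0 on all three components yields a relevance of 1.0, matching the fixed-0.33 version of the formula.&lt;br /&gt;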
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices from across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB or VERB-VERB and OBJECT-OBJECT comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also does comparisons between SUBJECT-OBJECT and VERB-VERB edges. Unlike ''compare_edges_non_syntax_diff'', the compared edges differ in syntax.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of type SUBJECT-VERB-OBJECT. The comparison done is between edges of same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also does comparison between edges of type SUBJECT-VERB-OBJECT. However, the comparison done is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, so to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, it calls ''new'' method in ''response_controller.rb'' which renders the page where user fills up the review. On submission, it calls ''create'' method of the same file, which then redirects to ''saving'' method in the same file again. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'' which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Re-factoring =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template method pattern] to improve the design of the class. Although we did not directly implement this design pattern, the idea of defining the skeleton of an algorithm in one class and deferring some steps to subclasses allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into 4 classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** CompareGraphEdges&lt;br /&gt;
** CompareGraphSVOEdges&lt;br /&gt;
** CompareGraphVertices&lt;br /&gt;
** DegreeOfRelevance&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
The main class DegreeOfRelevance calculates the scaled relevance using the formula described above. It calculates the relevance based on comparisons done between submission and review graphs. We have grouped together similar functionality in one class. Following is a brief description of the classes.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphEdges:&lt;br /&gt;
This class compares edges from across two graphs. The related methods compare_edges_non_syntax_diff and compare_edges_syntax_diff are grouped together in this class. This ensures that the class is responsible only for making comparisons between graph edges and returning the appropriate metric to the main class DegreeOfRelevance.&lt;br /&gt;
&lt;br /&gt;
* Class  CompareGraphVertices:&lt;br /&gt;
This class is responsible for comparing every vertex of the submission graph with every vertex of the review graph. It contains only one method, compare_vertices. Its sole responsibility is comparing vertices and calculating the appropriate metrics required by the main class (DegreeOfRelevance).&lt;br /&gt;
&lt;br /&gt;
* Class CompareGraphSVOEdges:&lt;br /&gt;
This class is responsible for making comparisons between edges of type SUBJECT-VERB-OBJECT. The related methods compare_SVO_edges and compare_SVO_diff_syntax are grouped together in this class.&lt;br /&gt;
&lt;br /&gt;
The whole purpose of creating such a design is to make a clear separation of logic into different classes. Although each of these classes contains helper methods used by the main method (get_relevance) to calculate relevance, this kind of design improves the maintainability and readability of the code for any programmer who is new to it. The figure below illustrates the concept used in the design.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to get the appropriate metrics required for evaluating relevance. This delegates functionality to helper classes, making a clear segregation of each class's role.&lt;br /&gt;
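The delegation described above can be sketched as a skeleton. The class and method names follow the article, while the method bodies are placeholders rather than the real implementation.&lt;br /&gt;

```ruby
# Skeleton of the delegation between the main class and its helpers.
# Bodies are placeholders, not the real Expertiza implementation.
class CompareGraphVertices
  def compare_vertices(*_args)
    0.0 # placeholder score
  end
end

class CompareGraphEdges
  def compare_edges_non_syntax_diff(*_args)
    0.0
  end

  def compare_edges_syntax_diff(*_args)
    0.0
  end
end

class CompareGraphSVOEdges
  def compare_SVO_edges(*_args)
    0.0
  end

  def compare_SVO_diff_syntax(*_args)
    0.0
  end
end

class DegreeOfRelevance
  def initialize
    @vertices  = CompareGraphVertices.new
    @edges     = CompareGraphEdges.new
    @svo_edges = CompareGraphSVOEdges.new
  end

  # get_relevance remains the single entry point; each kind of comparison
  # is delegated to the matching helper class.
  def get_relevance
    [@vertices.compare_vertices,
     @edges.compare_edges_non_syntax_diff,
     @svo_edges.compare_SVO_edges]
  end
end
```

Keeping ''get_relevance'' as the only entry point means callers such as ''automated_metareview.rb'' are untouched by the split into helper classes.&lt;br /&gt;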
&lt;br /&gt;
=== Types of Re-factoring ===&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and call this method with appropriate parameters wherever required. It implements the nil-condition check on an edge of the graph and verifies that neither vertex id is invalid (negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
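To illustrate the guard's behavior, here is a self-contained version of the same check; the ''Vertex'' and ''Edge'' structs are stand-ins for Expertiza's graph classes.&lt;br /&gt;

```ruby
# Stand-ins for Expertiza's graph classes, for illustration only.
Vertex = Struct.new(:node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

# Same guard as in the article: reject a nil edge or an edge whose
# vertices carry the invalid (-1) node id.
def edge_nil_cond_check(edge)
  !edge.nil? && edge.in_vertex.node_id != -1 && edge.out_vertex.node_id != -1
end
```

Every loop over review or submission edges previously repeated this three-part condition inline; the helper makes the intent explicit at each call site.&lt;br /&gt;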
* '''Reduce number of lines of code per method'''&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
&lt;br /&gt;
* '''Miscellaneous'''&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
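The refactored method above ends by delegating to a ''calculate_avg_match'' helper. Below is a minimal sketch of what such a helper might look like, assuming it simply averages the accumulated scores; this is our hypothetical reading, not the actual Expertiza implementation:&lt;br /&gt;

```ruby
# Hypothetical sketch of the calculate_avg_match helper referenced above:
# it averages the accumulated best-match scores over the number of review
# edges that found a match, guarding against division by zero.
def calculate_avg_match(cum_edge_match, count)
  return 0.0 if count.zero?
  cum_edge_match.to_f / count
end
```
&lt;br /&gt;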
* '''Re-wrote some methods containing complicated logic'''. &lt;br /&gt;
&lt;br /&gt;
Below is sample code that we re-wrote to make the logic simpler and more maintainable. &lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
 end # end of method&lt;br /&gt;
&lt;br /&gt;
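The refactored predicate can be exercised in isolation. The sketch below covers all four label combinations; the ''Edge'' struct and the EQUAL/DISTINCT divisor constants (divide by 1 and divide by 2, per the comments in the original code) are local stand-ins we define for illustration:&lt;br /&gt;

```ruby
# Stand-in constants and Edge struct, assumed from the original code's
# comments (EQUAL means divide by 1, DISTINCT means divide by 2).
EQUAL = 1
DISTINCT = 2
Edge = Struct.new(:label)

# Same truth table as the refactored compare_labels above.
def compare_labels(edge1, edge2)
  a = edge1.label
  b = edge2.label
  if a.nil? and b.nil?
    EQUAL          # both labels missing: treated as a match
  elsif a.nil? or b.nil?
    DISTINCT       # exactly one label missing: no match
  else
    a.downcase == b.downcase ? EQUAL : DISTINCT
  end
end
```
&lt;br /&gt;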
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* After complete re-factoring, the grade for the class DegreeOfRelevance changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Ruby_on_Rails Ruby on Rails] and [http://en.wikipedia.org/wiki/Test_automation Automated Testing] go hand in hand. Rails ships with a built-in test framework, and if we want, we can replace or augment it with tools such as RSpec and Capybara. Testing is important in Rails, yet many people developing in Rails either don't test their projects at all or, at best, only add a few test specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that the extra work of testing seems unimportant. Or maybe there is a perceived time constraint: spending time writing tests takes time away from writing the features demanded by clients or bosses. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*Test Driven Development[http://en.wikipedia.org/wiki/Test-driven_development] is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring it to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*Behaviour Driven Development[http://en.wikipedia.org/wiki/Behavior-driven_development] combines the general techniques and principles of TDD, enabling software developers and business analysts to collaborate on productive software development.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, making them more accessible to people outside the development and testing teams. &lt;br /&gt;
Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used Cucumber[http://en.wikipedia.org/wiki/Cucumber_(software)] for testing.&lt;br /&gt;
The Given, When, Then syntax used by Cucumber is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in the application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber cannot interpret the features by itself, the next step is to create step definitions telling it what to do when it comes across each step in a scenario. Step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature tests the degree of relevance by comparing the expected and the actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices, and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber will load any files in the features/step_definitions folder to find steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method|&lt;br /&gt;
   expected = qty.to_i&lt;br /&gt;
   if(assert_equal(expected, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks of code at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they were registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests, we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
[[File:cucumber.png|frame|none]]&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they're not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available, using tests to drive development and ferret out potential bugs in our applications.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files unrelated to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. There were a lot of routes missing from the routes.rb file, which we fixed. We have attached a sample routing error we faced.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which is needed by the spell checker used to check relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class, DegreeOfRelevance, by using ideas from the Template Method design pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class before refactoring was 'F', while after refactoring the grade is 'C'.&lt;br /&gt;
We also successfully tested the functionality using Cucumber.&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=File:Cucumber.png&amp;diff=82164</id>
		<title>File:Cucumber.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=File:Cucumber.png&amp;diff=82164"/>
		<updated>2013-10-31T03:26:15Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: Cucumber test&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Cucumber test&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82160</id>
		<title>File:Testscenario.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82160"/>
		<updated>2013-10-31T03:25:03Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: uploaded a new version of &amp;amp;quot;File:Testscenario.png&amp;amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Test cases execution&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82159</id>
		<title>File:Testscenario.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82159"/>
		<updated>2013-10-31T03:24:21Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: uploaded a new version of &amp;amp;quot;File:Testscenario.png&amp;amp;quot;: Reverted to version as of 03:22, 31 October 2013&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Test cases execution&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82158</id>
		<title>File:Testscenario.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82158"/>
		<updated>2013-10-31T03:23:45Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: uploaded a new version of &amp;amp;quot;File:Testscenario.png&amp;amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Test cases execution&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82155</id>
		<title>File:Testscenario.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82155"/>
		<updated>2013-10-31T03:22:02Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: uploaded a new version of &amp;amp;quot;File:Testscenario.png&amp;amp;quot;: Test cases executionsss&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Test cases execution&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82154</id>
		<title>File:Testscenario.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82154"/>
		<updated>2013-10-31T03:20:47Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: uploaded a new version of &amp;amp;quot;File:Testscenario.png&amp;amp;quot;: Test cases execution&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Test cases execution&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82153</id>
		<title>File:Testscenario.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=File:Testscenario.png&amp;diff=82153"/>
		<updated>2013-10-31T03:20:20Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: Test cases execution&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Test cases execution&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81597</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81597"/>
		<updated>2013-10-30T23:36:38Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Conclusion */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance &amp;lt;ref&amp;gt;http://www.smartinsights.com/wp-content/uploads/2012/05/relevance-rankmaniac.jpg&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
Reviews are text-based feedback provided by a reviewer to the author of a submission. Reviews are used not only in education to assess student work, but also in e-commerce applications to assess the quality of products on sites like Amazon, eBay, etc. Since reviews play a crucial role in providing feedback to people who make assessment decisions (e.g. deciding a student’s grade, or choosing to purchase a product), it is important to find out how relevant reviews are to a submission &amp;lt;ref&amp;gt;http://repository.lib.ncsu.edu/ir/handle/1840.16/8813&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The class ''DegreeOfRelevance'' in Expertiza is used to calculate how relevant one piece of text is to another. It calculates relevance between the submitted work and the reviews (or metareviews) for an assignment. This evaluation is important to ensure that a review that is not relevant to the submission is considered invalid and does not impact the student's grade.&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' presently contains long, complex methods and a lot of duplicated code. It has been assigned grade &amp;quot;F&amp;quot; by [https://codeclimate.com/ Code Climate] on metrics such as code complexity, duplication, and lines of code per method. Since this file is important for research on Expertiza, it should be refactored to reduce complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly.&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The ''degree_of_relevance.rb'' file calculates [http://en.wikipedia.org/wiki/Relevance relevance] between submission and review [http://en.wikipedia.org/wiki/Graph_%28data_structure%29 graphs]. The algorithm present in the file implements a variant of [http://en.wikipedia.org/wiki/Dependency_graph dependency graphs] called the word-order graph, which captures both the governance and the ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
Vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step in graph generation is dividing the input text into segments on the basis of a predefined punctuation list. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The [http://nlp.stanford.edu/software/tagger.shtml Stanford NLP POS tagger] is used to generate the tagged text. The vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Wordnet WordNet] is used to identify [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of [http://en.wikipedia.org/wiki/Nominalization nominalization] (e.g. noun forms of adjectives), etc.&lt;br /&gt;
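As a small illustration of the first step described above, splitting the input text into segments on a predefined punctuation list, consider the following sketch. The punctuation list here is an illustrative assumption, not Expertiza's actual list:&lt;br /&gt;

```ruby
# Split input text into segments on a predefined punctuation list, as in
# the first step of word-order graph generation. PUNCTUATION below is an
# illustrative assumption, not Expertiza's actual list.
PUNCTUATION = [".", ",", ";", ":", "!", "?"]

def segment_text(text)
  pattern = Regexp.union(PUNCTUATION)
  text.split(pattern).map { |s| s.strip }.reject { |s| s.empty? }
end
```
&lt;br /&gt;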
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing [http://en.wikipedia.org/wiki/Semantic_relatedness semantic relatedness] between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. The match is referred to as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, [http://en.wikipedia.org/wiki/Metadata_discovery#Lexical_Matching lexical], and [http://en.wikipedia.org/wiki/Nominalization nominalized]. E&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and E&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. Only single and double edges are considered for text matching, not longer contiguous sequences (triple edges etc.). This matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Structurem.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and T&amp;lt;sub&amp;gt;r&amp;lt;/sub&amp;gt; and T&amp;lt;sub&amp;gt;s&amp;lt;/sub&amp;gt; are the number of double edges in the review and submission texts respectively. match&amp;lt;sub&amp;gt;ord&amp;lt;/sub&amp;gt; and match&amp;lt;sub&amp;gt;voice&amp;lt;/sub&amp;gt; are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to ''degree_of_relevance.rb''. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the above formula. &lt;br /&gt;
&lt;br /&gt;
[[File:ModifiedEq.png|frame|none|Formula used in Expertiza‎]]&lt;br /&gt;
&lt;br /&gt;
For example, a possible set of values could be:&lt;br /&gt;
 alpha = 0.55&lt;br /&gt;
 beta = 0.35&lt;br /&gt;
 gamma = 0.1 &lt;br /&gt;
&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
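With those example weights, the weighted combination can be sketched in plain Ruby. The match scores passed in below are illustrative values, not output from Expertiza:&lt;br /&gt;

```ruby
# Hedged sketch of the weighted relevance formula: a weighted sum of the
# three match averages, with alpha, beta, gamma defaulting to the example
# values above (alpha greater than beta greater than gamma).
def weighted_relevance(phrase_match, context_match, structure_match,
                       alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end

weighted_relevance(0.8, 0.6, 0.4)  # about 0.69
```
&lt;br /&gt;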
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices from across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex is compared with every other vertex to obtain the comparison results.&lt;br /&gt;
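The all-pairs comparison can be illustrated with a small sketch; ''vertex_similarity'' here is a trivial stand-in for Expertiza's WordNet-based relatedness measure:&lt;br /&gt;

```ruby
# For every review vertex, compare against every submission vertex and
# keep the best score, mirroring the exhaustive comparison described above.
def best_vertex_matches(review_vertices, subm_vertices)
  review_vertices.map do |rv|
    subm_vertices.map { |sv| vertex_similarity(rv, sv) }.max || 0.0
  end
end

# Trivial placeholder similarity: 1.0 on a case-insensitive exact match,
# standing in for the WordNet-based measure used in Expertiza.
def vertex_similarity(a, b)
  a.downcase == b.downcase ? 1.0 : 0.0
end
```
&lt;br /&gt;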
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e., SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT, in which SUBJECT-OBJECT and VERB-VERB comparisons are done.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of type SUBJECT-VERB-OBJECT. The comparison is between edges of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method also compares edges of type SUBJECT-VERB-OBJECT; however, the comparison is between edges of different types.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
''degree_of_relevance.rb'' is buried deep inside the models, and to bring control to this file, the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in ''response_controller.rb'' is called, which renders the page where the user fills out the review. On submission, the ''create'' method of the same file is called, which then redirects to the ''saving'' method in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in ''automated_metareview.rb'', which eventually calls ''get_relevance'' in ''degree_of_relevance.rb''. The figure below shows the flow in the code.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
= Implementation =&lt;br /&gt;
&lt;br /&gt;
=== Design Changes ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the [http://en.wikipedia.org/wiki/Template_method_pattern Template Method design pattern] to improve the design of the class. Although we did not directly implement this design pattern, the idea of defining the skeleton of an algorithm in one class and deferring some steps to subclasses allowed us to come up with a similar design, which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** compare_graph_edges.rb&lt;br /&gt;
** compare_graph_svo_edges.rb&lt;br /&gt;
** compare_graph_vertices.rb&lt;br /&gt;
** degree_of_relevance.rb&lt;br /&gt;
* Extracted common code in methods that can be re-used.&lt;br /&gt;
&lt;br /&gt;
After complete refactoring, the grade according to Code Climate for the main class is &amp;quot;C&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Refactoring ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==== Design of classes ====&lt;br /&gt;
&lt;br /&gt;
The main class, degree_of_relevance.rb, calculates the scaled relevance using the formula described above, based on a comparison of the submission and review graphs. As described above, the following types of comparison are made between the graphs, and the various metrics obtained are used to calculate the relevance:&lt;br /&gt;
&lt;br /&gt;
* Class  compare_graph_edges.rb:&lt;br /&gt;
** Comparing edges with no syntax difference: SUBJECT-VERB edges are compared with SUBJECT-VERB edges, i.e. SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done.&lt;br /&gt;
** Comparing edges with a syntax difference: compares edges across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB edges are compared with VERB-OBJECT edges and vice versa, i.e. SUBJECT-OBJECT and VERB-VERB comparisons are done (same-type comparisons).&lt;br /&gt;
** Comparing edges of different types: compares edges across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB edges are compared with VERB-OBJECT edges and vice versa, i.e. SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT comparisons are done (different-type comparisons).&lt;br /&gt;
&lt;br /&gt;
All of the above functions are grouped in one class: compare_graph_edges.rb.&lt;br /&gt;
&lt;br /&gt;
* Class  compare_graph_vertices.rb:&lt;br /&gt;
** Comparing vertices of the two graphs: every vertex in one graph is compared with every vertex in the other to identify matches and quantify various metrics.&lt;br /&gt;
&lt;br /&gt;
This method is factored out into the class compare_graph_vertices.rb.&lt;br /&gt;
&lt;br /&gt;
* Class compare_graph_svo_edges.rb:&lt;br /&gt;
** Comparing SVO edges.&lt;br /&gt;
** Comparing SVO edges with different syntax.&lt;br /&gt;
&lt;br /&gt;
These methods are grouped in the compare_graph_svo_edges.rb class.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png|frame|none|Main class calling helper classes]]&lt;br /&gt;
&lt;br /&gt;
As shown in the figure above, the main class degree_of_relevance.rb calls the helper methods in the classes to the right to obtain the metrics required for evaluating relevance. This delegates functionality to helper classes, giving a clear segregation of the role of each class.&lt;br /&gt;
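The delegation described above can be sketched as follows. This is a simplified, hypothetical outline rather than the actual Expertiza code: the class names follow the article, but the method parameters, metric values and the combination step are placeholders.

```ruby
# Hypothetical sketch of the delegation shown in the figure: the main class
# keeps the skeleton of the relevance calculation and defers each comparison
# to a helper class. Metric values and the combination step are placeholders.

class CompareGraphVertices
  # Would compare every review vertex with every submission vertex.
  def compare_vertices(review, submission)
    0.6 # placeholder metric
  end
end

class CompareGraphEdges
  def compare_edges_non_syntax_diff(review, submission)
    0.5 # placeholder metric
  end
end

class CompareGraphSvoEdges
  def compare_svo_edges(review, submission)
    0.4 # placeholder metric
  end
end

class DegreeOfRelevance
  def initialize
    @vertices = CompareGraphVertices.new
    @edges    = CompareGraphEdges.new
    @svo      = CompareGraphSvoEdges.new
  end

  # Skeleton of get_relevance: collect the metrics, then combine them.
  def get_relevance(review, submission)
    metrics = [
      @vertices.compare_vertices(review, submission),
      @edges.compare_edges_non_syntax_diff(review, submission),
      @svo.compare_svo_edges(review, submission)
    ]
    metrics.sum / metrics.size # placeholder combination formula
  end
end
```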
&lt;br /&gt;
==== Types of Refactoring ====&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Moved common code into re-usable methods:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of duplicated code that was present in many methods. We moved it into a separate method and called that method with the appropriate parameters wherever required. It checks that an edge of the graph is not nil and that the node ids of its vertices are valid (not -1).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduced the number of lines of code per method.'''&lt;br /&gt;
&lt;br /&gt;
We ensured that each method is short enough to fit within one screen length (no need to scroll).&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
  # ... method continues: the cumulative edge match is averaged and returned&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic.'''&lt;br /&gt;
&lt;br /&gt;
Below is sample code that we have rewritten to make the logic simpler and more maintainable.&lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
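For reference, the same truth table (EQUAL when both labels are nil or both are present and equal, DISTINCT otherwise) can be expressed even more compactly with Ruby's safe navigation operator. This is only a sketch: the EQUAL and DISTINCT values of 1 and 2 are our assumption, based on the &amp;quot;divide by 1&amp;quot;/&amp;quot;divide by 2&amp;quot; comments in the code above.

```ruby
# Sketch of a behavior-equivalent one-liner. EQUAL = 1 and DISTINCT = 2 are
# assumptions based on the "divide by" comments in the original code.
EQUAL    = 1
DISTINCT = 2

def compare_labels(edge1, edge2)
  # &. yields nil for a nil label, and nil == nil covers the both-nil case
  edge1.label&.downcase == edge2.label&.downcase ? EQUAL : DISTINCT
end
```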
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Reduced the lines of code per method from 75 to 25, making the code more readable and maintainable.&lt;br /&gt;
* Improved the design by logically separating the code into different classes.&lt;br /&gt;
* Improved the grade for the refactored class from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
Ruby on Rails and automated testing go hand in hand. Rails ships with a built-in test framework; if it is not to your liking, you can replace it with another such as RSpec or Capybara. Testing is important in Rails, yet many people developing in Rails either do not test their projects at all or, at best, only add a few token specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that adding an extra layer of work seems like just that: extra work. Or maybe there is a perceived time constraint; spending time writing tests takes time away from writing the features our clients or bosses demand. Or maybe the habit of defining &amp;quot;testing&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of developing carelessly.&lt;br /&gt;
Testing can follow two main approaches:&lt;br /&gt;
*Test-Driven Development:&lt;br /&gt;
Test-Driven Development (TDD) is a way of driving the design of code by writing a test that expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*Behaviour-Driven Development:&lt;br /&gt;
Behavior-driven development (abbreviated BDD) is a software development process based on test-driven development (TDD). Behavior-driven development combines the general techniques and principles of TDD to provide software developers and business analysts with shared tools and a shared process to collaborate on software development, with the aim of delivering &amp;quot;software that matters&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in a natural language, making them more accessible to people outside the development or testing teams.&lt;br /&gt;
Tests are small, automated Ruby programs that exercise different parts of your application. They can apply at multiple levels, e.g. customer tests, integration tests and unit tests.&lt;br /&gt;
In this approach, we follow a rigorous cycle: start by writing a failing test (Red), implement the simplest solution that makes the test pass (Green), then search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
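As a toy illustration of one pass through that cycle (this example is ours, not from the project):

```ruby
# RED: this assertion was written first, while relevance_label did not yet
# exist, so the test failed.
# GREEN: the minimal method below makes the test pass.
def relevance_label(score)
  score >= 0.5 ? "relevant" : "not relevant"
end

# REFACTOR: with the test green, the implementation can be reshaped (for
# example, extracting the 0.5 threshold into a named constant) while the
# assertions keep guarding the behavior.
raise "test failed" unless relevance_label(0.7) == "relevant"
raise "test failed" unless relevance_label(0.2) == "not relevant"
```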
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used Cucumber for testing. Cucumber lets software development teams describe how software should behave in plain text. The text is written in a business-readable domain-specific language and serves as documentation, automated tests and a development aid, all rolled into one format.&lt;br /&gt;
&lt;br /&gt;
Cucumber is a testing framework that helps to bridge the gap between software developers and business managers.&lt;br /&gt;
Because behaviour-driven development is partly about understanding what the client wants before you begin coding, Cucumber aims to make its tests readable by clients (i.e. non-programmers). Tests are written in a plain language called Gherkin, in the form Given, When, Then, which any layperson can understand. Test cases are then placed into feature files that cover one or more test scenarios.&lt;br /&gt;
&lt;br /&gt;
The BDD Given, When, Then syntax is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber does not know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it finds each step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree-of-relevance file that we are dealing with consists of six major functions, and we have written tests covering all six.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values with the actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as comparing edges, vertices and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps Cucumber reports. Cucumber loads any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if actual == @expected_value # @expected_value is prepared by the scenario setup&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   # the actual and expected values matched&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
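To make this concrete, the kind of dummy data such a setup method might build can be sketched as below. This is a hypothetical illustration: the Vertex and Edge fields mirror those the comparison code reads (node_id, name, type; in_vertex, out_vertex, label), while the actual setup builds full submission and review graphs.

```ruby
# Hypothetical sketch of the dummy data a setup method like this might build.
# Field names mirror those read by the comparison code; the real setup
# constructs full submission and review graphs.
Vertex = Struct.new(:node_id, :name, :type)
Edge   = Struct.new(:in_vertex, :out_vertex, :label)

class Deg_rel_setup
  attr_reader :review_edges, :subm_edges

  def setup
    s = Vertex.new(0, "student", "SUBJECT")
    v = Vertex.new(1, "submits", "VERB")
    o = Vertex.new(2, "assignment", "OBJECT")
    # a tiny review graph (S-V and V-O edges) and submission graph (S-V only)
    @review_edges = [Edge.new(s, v, "nsubj"), Edge.new(v, o, "dobj")]
    @subm_edges   = [Edge.new(s, v, "nsubj")]
  end
end
```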
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but tagged hooks give us more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute:&lt;br /&gt;
 rake db:migrate                           #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel_set.feature     #This executes the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original expertiza code had a lot of bugs and the application was throwing a lot of errors at each step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews took a lot of debugging and fixing bugs in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. There were a lot of routes missing from the routes.rb file, which we fixed. A sample routing error we faced is attached below.&lt;br /&gt;
&lt;br /&gt;
[[File:12.png|frame|none|Routing error‎]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We encountered the following error when trying to give a user permission to review others' work.&lt;br /&gt;
&lt;br /&gt;
[[File:16.png|frame|none|Error while trying to assign review permission]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. There were a lot of errors related to the Aspell gem, which the spell-checker needs for checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
5. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems; they even contacted Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
&lt;br /&gt;
We were successful in refactoring the given class, degree_of_relevance.rb, by using ideas from the Template method pattern and adapting its implementation to our needs.&lt;br /&gt;
According to Code Climate, the grade for the class was previously 'F'; after refactoring it is 'C'.&lt;br /&gt;
We also successfully tested the functionality using Cucumber.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=References=&lt;br /&gt;
http://sourcemaking.com/design_patterns/template_method&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81383</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81383"/>
		<updated>2013-10-30T21:42:15Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
The class degree_of_relevance.rb is used to determine how relevant one piece of text is to another. It is used to evaluate the relevance between the submitted work and a review (or metareview) for an assignment. It is important to evaluate reviews of a submitted work: if a review is not relevant to the submission, it is considered invalid and does not impact the student's grade. &lt;br /&gt;
This class contains a lot of duplicated code and has long and complex methods. It has been assigned the grade &amp;quot;F&amp;quot; according to metrics like code complexity, duplication and lines of code per method. Since this class is important for research on Expertiza, it should be refactored to reduce its complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly. The class can be found at the following location in the Expertiza source code: Expertiza\expertiza\app\models\automated_metareview&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The degree_of_relevance.rb file calculates the relevance between submission and review graphs. The algorithm in the file implements a variant of dependency graphs, called a word-order graph, to capture both the governance and the ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
Vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined list of punctuation. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The Stanford NLP POS tagger is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
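The first step of this pipeline, splitting the input text into segments on punctuation before POS tagging, can be sketched roughly as follows; the punctuation list here is our assumption for illustration, not the one Expertiza actually uses.

```ruby
# Rough sketch of the segmentation step: split the input text into segments
# on punctuation. The punctuation list is an assumption for illustration.
def segment(text)
  text.split(/[.,;:!?]/).map(&:strip).reject(&:empty?)
end
```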
&lt;br /&gt;
WordNet is used to identify semantic relatedness between two terms. Comparing reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of nominalization (e.g. the noun form of an adjective), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing semantic relatedness between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. We refer to the match as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches: ordered, lexical and nominalized. Er and Es represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject-verb-object), across graphs. In this work we consider only single and double edges, and not longer contiguous sequences (triple edges etc.), for text matching. The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and Tr and Ts are the number of double edges in the review and submission texts respectively. matchord and matchvoice are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to degree_of_relevance.rb. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 for each in the formula above. For example, a possible set of values could be:&lt;br /&gt;
    alpha = 0.55&lt;br /&gt;
    beta = 0.35&lt;br /&gt;
    gamma = 0.1 &lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
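The weighted combination then looks like the sketch below. The method and parameter names are ours; the default weights are the example values above, and the three arguments stand for the phrase, context and sentence-structure match averages described earlier.

```ruby
# Sketch of the weighted relevance combination. The names are illustrative;
# the default weights are the example values from the text.
def scaled_relevance(phrase_match, context_match, structure_match,
                     alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end
```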
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares subject-verb-object (SVO) double edges across the two graphs.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method compares SVO double edges that differ in syntax, for example across voice changes.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
degree_of_relevance.rb is buried deep under the models, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
Once a review is submitted, the further flow is as follows:&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
When a user reviews an existing work, the ''new'' method in response_controller.rb is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to ''saving'' in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in automated_metareview.rb, which eventually calls ''get_relevance'' in degree_of_relevance.rb.&lt;br /&gt;
&lt;br /&gt;
= Implementation =&lt;br /&gt;
&lt;br /&gt;
== New Design and Refactoring ==&lt;br /&gt;
&lt;br /&gt;
=== New Design ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the Template Method design pattern to improve the design of the class. Although we did not implement the pattern directly, its core idea of defining the skeleton of an algorithm in one class and deferring some steps elsewhere led us to a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** compare_graph_edges.rb&lt;br /&gt;
** compare_graph_svo_edges.rb&lt;br /&gt;
** compare_graph_vertices.rb&lt;br /&gt;
** degree_of_relevance.rb&lt;br /&gt;
&lt;br /&gt;
* Extracted common code into methods that can be re-used.&lt;br /&gt;
* After refactoring, the grade for the class according to Code Climate is &amp;quot;C&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Refactoring ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==== Design of classes ====&lt;br /&gt;
&lt;br /&gt;
The main class degree_of_relevance.rb calculates the scaled relevance using the formula described above, based on a comparison of the submission and review graphs. As described above, the following types of comparison are made between the graphs, and the various metrics obtained are used to calculate the relevance:&lt;br /&gt;
&lt;br /&gt;
* Class  compare_graph_edges.rb:&lt;br /&gt;
** Comparing edges with no syntax difference: SUBJECT-VERB edges are compared with SUBJECT-VERB edges, where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done.&lt;br /&gt;
** Comparing edges with a syntax difference: compares edges from across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB edges are compared with VERB-OBJECT edges and vice versa, where SUBJECT-OBJECT and VERB-VERB comparisons are done (same-type comparisons).&lt;br /&gt;
** Comparing edges of different types: compares edges from across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT comparisons are done (different-type comparisons).&lt;br /&gt;
&lt;br /&gt;
All the above functions are grouped in one class - compare_graph_edges.rb.&lt;br /&gt;
&lt;br /&gt;
* Class  compare_graph_vertices.rb:&lt;br /&gt;
** Comparing vertices of the two graphs: every vertex in one graph is compared with every vertex in the other to identify matches and quantify various metrics.&lt;br /&gt;
&lt;br /&gt;
This method is factored out into the class compare_graph_vertices.rb.&lt;br /&gt;
&lt;br /&gt;
* Class compare_graph_SVO_edges.rb:&lt;br /&gt;
&lt;br /&gt;
** Comparing SVO (SUBJECT-VERB-OBJECT) edges.&lt;br /&gt;
** Comparing SVO edges with different syntax.&lt;br /&gt;
&lt;br /&gt;
These methods are grouped in the compare_graph_SVO_edges.rb class.&lt;br /&gt;
&lt;br /&gt;
The main class degree_of_relevance.rb calls each of these methods to get the appropriate metrics required for evaluating relevance.&lt;br /&gt;
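This delegation can be sketched in outline form. The sketch below is illustrative only: the class name, constructor and the equal weighting of the three scores are our assumptions, not Expertiza's actual code, which applies the weighted formula described earlier.&lt;br /&gt;

```ruby
# Illustrative sketch: the main class delegates metric computation to
# the three helper objects and combines the results. Equal weighting
# is used here purely for demonstration.
class DegreeOfRelevanceOutline
  def initialize(vertex_comparer, edge_comparer, svo_comparer)
    @comparers = [vertex_comparer, edge_comparer, svo_comparer]
  end

  # Each comparer is assumed to expose a compare method returning a score.
  def get_relevance
    scores = @comparers.map { |c| c.compare }
    scores.sum / scores.size.to_f
  end
end
```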
&lt;br /&gt;
[[File:class design.png]]&lt;br /&gt;
&lt;br /&gt;
==== Types of Refactoring ====&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and called this method with the appropriate parameters wherever required. It implements a nil check on an edge of the graph and verifies that the ids of its vertices are valid (non-negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
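The refactored code shown later also relies on a similar extracted predicate for the frequent-word test. The sketch below is a hypothetical reconstruction for illustration: only the method name mirrors the refactored sample, while the struct definitions and the FakeWordnet stand-in are our assumptions.&lt;br /&gt;

```ruby
# Hypothetical reconstruction of the frequent-word predicate; the real
# WordnetBasedSimilarity class lives in Expertiza. Returns true only
# when BOTH vertex names are frequent (stop) words, so callers can
# skip the edge, matching the original nested condition.
Vertex = Struct.new(:name, :node_id)
Edge   = Struct.new(:in_vertex, :out_vertex)

def wnet_frequent_word_check(wnet, edge)
  wnet.is_frequent_word(edge.in_vertex.name) and
    wnet.is_frequent_word(edge.out_vertex.name)
end

# Minimal stand-in for WordnetBasedSimilarity, for illustration only.
class FakeWordnet
  STOP_WORDS = %w[the is a of and].freeze
  def is_frequent_word(word)
    STOP_WORDS.include?(word)
  end
end
```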
&lt;br /&gt;
* '''Reduce number of lines of code per method'''.&lt;br /&gt;
&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Rewrote some methods containing complicated logic'''.&lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we rewrote to make the logic simpler and more maintainable.&lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label.nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil?)) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
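The truth table of the refactored method can be checked with a small self-contained harness. In the sketch below the EQUAL and DISTINCT values and the Edge struct are stand-ins for the real Expertiza definitions:&lt;br /&gt;

```ruby
# Stand-in constants and struct; the real definitions live in Expertiza.
EQUAL    = 1
DISTINCT = 2
Edge = Struct.new(:label)

# Behaviour-equivalent restatement of the refactored compare_labels:
# EQUAL when both labels match case-insensitively or both are missing,
# DISTINCT otherwise (including when exactly one label is missing).
def compare_labels(edge1, edge2)
  labels = [edge1.label, edge2.label]
  if labels.all? { |l| l.nil? }
    EQUAL
  elsif labels.none? { |l| l.nil? } and labels.map { |l| l.downcase }.uniq.size == 1
    EQUAL
  else
    DISTINCT
  end
end
```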
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Reduced lines of code per method from 75 to 25, making the code more readable and maintainable.&lt;br /&gt;
* Improved the design by making a logical separation of code into different classes.&lt;br /&gt;
* The grade for the refactored class changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
Ruby on Rails and automated testing go hand in hand. Rails ships with a built-in test framework; if it is not to your liking, you can replace it with another such as RSpec or Capybara. Testing is important in Rails, yet many people developing in Rails either do not test their projects at all, or at best add a few token specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or a web framework is novel enough that adding an extra layer of work seems like just that: extra work. Or maybe there is a perceived time constraint, since time spent writing tests feels like time taken away from writing the features our clients or bosses demand. Or maybe the habit of defining &amp;quot;testing&amp;quot; as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications, but it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*Test Driven Development:&lt;br /&gt;
Test-Driven Development (TDD) is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*Behaviour Driven Development&lt;br /&gt;
Behavior-driven development (abbreviated BDD) is a software development process based on test-driven development (TDD). Behavior-driven development combines the general techniques and principles of TDD to provide software developers and business analysts with shared tools and a shared process to collaborate on software development, with the aim of delivering &amp;quot;software that matters&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
We have used BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of development or testing teams. &lt;br /&gt;
Tests are small, automated Ruby programs that exercise different parts of your application. They can apply at multiple levels, e.g. customer tests, integration tests and unit tests.&lt;br /&gt;
In this approach, we follow a rigorous cycle: start by writing a failing test (red), implement the simplest solution that makes the test pass (green), then search for duplication and remove it (refactor). &amp;quot;Red-green-refactor&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used Cucumber for testing. Cucumber lets software development teams describe how software should behave in plain text. The text is written in a business-readable domain-specific language and serves as documentation, automated tests and development aid, all rolled into one format.&lt;br /&gt;
&lt;br /&gt;
Cucumber is a testing framework that helps to bridge the gap between software developers and business managers. &lt;br /&gt;
Because behaviour-driven development is partly about understanding what the client wants before you begin coding, Cucumber aims to make its tests readable by clients (i.e. non-programmers). Tests are written in a plain language called Gherkin in the form of Given, When, Then, which any layperson can understand. Test cases are then placed into feature files that cover one or more test scenarios.&lt;br /&gt;
&lt;br /&gt;
The BDD Given, When, Then syntax is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
Since Cucumber does not know how to interpret the features by itself, the next step is to create step definitions explaining what to do when each step is found in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Steps for Testing==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree-of-relevance file that we are dealing with consists of six major functions. We have written tests covering all six.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values and the actual values obtained from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance: comparing edges, vertices and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the steps they contain. Cucumber loads any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method|&lt;br /&gt;
   actual = qty&lt;br /&gt;
   if assert_equal(actual, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario or step it runs for, but you can use tagged hooks if you want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the same order in which they were registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute: &lt;br /&gt;
 rake db:migrate		                   #To migrate the database&lt;br /&gt;
 cucumber features/deg_rel_set.feature	   #This will execute the feature along with its step definitions	&lt;br /&gt;
&lt;br /&gt;
&amp;lt;output screenshot&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
=Issues=&lt;br /&gt;
&lt;br /&gt;
1. The original Expertiza code had a lot of bugs, and the application threw errors at every step.&lt;br /&gt;
&lt;br /&gt;
2. Even simple steps like submitting an assignment and performing reviews required a lot of debugging and bug-fixing in various other files not related to our class (degree_of_relevance.rb).&lt;br /&gt;
&lt;br /&gt;
3. There were a lot of routes missing from the routes.rb file, which we fixed.&lt;br /&gt;
&lt;br /&gt;
4. The overall effort spent debugging and fixing the existing code was almost twice that spent on refactoring the class.&lt;br /&gt;
&lt;br /&gt;
5. There were a lot of errors related to the Aspell gem, which is needed by the spell-checker used when checking relevance between submissions and reviews.&lt;br /&gt;
&lt;br /&gt;
6. We were in touch with another group facing similar problems, and they even tried to contact Lakshmi to resolve the issues.&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
= References =&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81372</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81372"/>
		<updated>2013-10-30T21:30:12Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
The class degree_of_relevance.rb is used to determine how relevant one piece of text is to another. It is used to evaluate the relevance between the submitted work and a review (or metareview) for an assignment. It is important to evaluate the reviews for a submitted work so that a review that is not relevant to the submission is considered invalid and does not impact the student's grade. &lt;br /&gt;
This class contains a lot of duplicated code and has long, complex methods. It has been assigned grade &amp;quot;F&amp;quot; according to metrics like code complexity, duplication and lines of code per method. Since this class is important for research on Expertiza, it should be refactored to reduce its complexity and duplication and to introduce coding best practices. Our task for this project is to refactor this class and test it thoroughly. The class can be found at the following location in the Expertiza source code: Expertiza\expertiza\app\models\automated_metareview&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The degree_of_relevance.rb file calculates the relevance between submission and review graphs. The algorithm in the file implements a variant of dependency graphs, called a word-order graph, to capture both the governance and the ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
Vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined list of punctuation. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The Stanford NLP POS tagger is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are labeled with dependency (word-modifier) information.&lt;br /&gt;
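The segmentation step described above can be sketched as follows; the punctuation list here is an assumption for illustration, not the exact list used by Expertiza:&lt;br /&gt;

```ruby
# Illustrative first step of word-order graph generation: split the
# input text into segments on a predefined punctuation list.
SEGMENT_PUNCTUATION = ['.', ',', ';', ':', '!', '?'].freeze

def segment_text(text)
  pattern = Regexp.union(SEGMENT_PUNCTUATION)
  # Split on any listed punctuation mark, then drop empty fragments.
  text.split(pattern).map { |s| s.strip }.reject { |s| s.empty? }
end
```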
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
WordNet is used to identify semantic relatedness between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of nominalization (the noun form of a verb or adjective), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing semantic relatedness between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same and with different syntax, as well as edges of different types, across the two text graphs. We refer to this match as context matching since contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in another. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the above three types of matches: ordered, lexical and nominalized. Er and Es represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject-verb-object), across graphs. In this work we consider only single and double edges, and not longer runs of contiguous edges (triple edges etc.), for text matching. The matching captures similarity across segments, including voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and Tr and Ts are the number of double edges in the review and submission texts respectively. matchord and matchvoice are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to degree_of_relevance.rb. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the formula above. For example, a possible set of values could be:&lt;br /&gt;
    alpha = 0.55&lt;br /&gt;
    beta = 0.35&lt;br /&gt;
    gamma = 0.1 &lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
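As a sketch, the weighted combination might look like the following (the method and parameter names are our assumptions for illustration; Expertiza's actual implementation differs):&lt;br /&gt;

```ruby
# Illustrative only: combine the three match averages with
# configurable weights instead of a fixed 0.33 for each.
def scaled_relevance(phrase, context, structure, alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase + beta * context + gamma * structure
end
```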
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e. SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT. It also compares SUBJECT-OBJECT and VERB-VERB edges, i.e. edges whose syntax differs.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of different types. Comparisons between SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT edges are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares SUBJECT-VERB-OBJECT double edges (two contiguous edges) across the two graphs to identify matches and quantify various metrics.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method compares SUBJECT-VERB-OBJECT double edges whose syntax differs across the two graphs, which helps capture changes of voice.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
degree_of_relevance.rb is buried deep in the models directory, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
Once a review is submitted, the further flow is as follows:&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
When a user reviews an existing piece of work, the ''new'' method in response_controller.rb is called, which renders the page where the user fills in the review. On submission, the ''create'' method in the same file is called, which redirects to ''saving'' in the same file. ''saving'' calls ''calculate_metareview_metrics'' in automated_metareview.rb, which eventually calls ''get_relevance'' in degree_of_relevance.rb.&lt;br /&gt;
&lt;br /&gt;
= Implementation =&lt;br /&gt;
&lt;br /&gt;
== New Design and Refactoring ==&lt;br /&gt;
&lt;br /&gt;
=== New Design ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the Template Method design pattern to improve the design of the class. Although we did not implement the pattern directly, its idea of defining the skeleton of an algorithm in one class and deferring some steps elsewhere allowed us to come up with a similar design, which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes to segregate the functionality, making a logical separation of code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** compare_graph_edges.rb&lt;br /&gt;
** compare_graph_svo_edges.rb&lt;br /&gt;
** compare_graph_vertices.rb&lt;br /&gt;
** degree_of_relevance.rb&lt;br /&gt;
&lt;br /&gt;
* Extracted common code into methods that can be re-used.&lt;br /&gt;
* After refactoring, the grade assigned to the class by Code Climate is &amp;quot;C&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Refactoring ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==== Design of classes ====&lt;br /&gt;
&lt;br /&gt;
The main class degree_of_relevance.rb calculates the scaled relevance using the formula described above, based on a comparison of the submission and review graphs. As described above, the following types of comparisons are made between the graphs, and the resulting metrics are used to calculate the relevance:&lt;br /&gt;
&lt;br /&gt;
* Class  compare_graph_edges.rb:&lt;br /&gt;
** Comparing edges with no syntax difference: SUBJECT-VERB edges are compared with SUBJECT-VERB edges, where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done.&lt;br /&gt;
** Comparing edges with a syntax difference: compares the edges from across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB edges are compared with VERB-OBJECT edges and vice versa, where SUBJECT-OBJECT and VERB-VERB comparisons are done (same-type comparisons).&lt;br /&gt;
** Comparing edges of different types: compares the edges from across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT comparisons are done (different-type comparisons).&lt;br /&gt;
&lt;br /&gt;
All the above functions are grouped in one class - compare_graph_edges.rb.&lt;br /&gt;
&lt;br /&gt;
* Class  compare_graph_vertices.rb:&lt;br /&gt;
** Comparing vertices of the two graphs: every vertex in one graph is compared with every vertex in the other to identify matches and quantify various metrics.&lt;br /&gt;
&lt;br /&gt;
This method is factored out into the class compare_graph_vertices.rb.&lt;br /&gt;
&lt;br /&gt;
* Class compare_graph_SVO_edges.rb:&lt;br /&gt;
&lt;br /&gt;
** Comparing SVO edges.&lt;br /&gt;
** Comparing SVO edges with different syntax.&lt;br /&gt;
&lt;br /&gt;
These methods are grouped in the compare_graph_SVO_edges.rb class.&lt;br /&gt;
&lt;br /&gt;
The main class degree_of_relevance.rb calls each of these methods to get the appropriate metrics required for evaluating relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png]]&lt;br /&gt;
&lt;br /&gt;
==== Types of Refactoring ====&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move common code out into a re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it into a separate method and called that method with the appropriate parameters wherever required. It checks that an edge of the graph is not nil and that the ids of its vertices are valid (non-negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
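&lt;br /&gt;
A quick illustration of the extracted guard in action, using minimal stand-ins for the real model classes (''Edge'' and ''Vertex'' here are illustrative structs, not the Expertiza models):&lt;br /&gt;

```ruby
# Minimal stand-ins for the real graph model classes.
Vertex = Struct.new(:node_id, :name)
Edge   = Struct.new(:in_vertex, :out_vertex)

def edge_nil_cond_check(edge)
  !edge.nil? && edge.in_vertex.node_id != -1 && edge.out_vertex.node_id != -1
end

good = Edge.new(Vertex.new(3, "code"), Vertex.new(4, "works"))
bad  = Edge.new(Vertex.new(-1, "?"),   Vertex.new(4, "works"))

edge_nil_cond_check(good) # => true  (both vertex ids are valid)
edge_nil_cond_check(bad)  # => false (a placeholder vertex id of -1)
edge_nil_cond_check(nil)  # => false (edge missing entirely)
```

Calling the helper replaces the three-clause condition that was previously repeated inline in every comparison loop.&lt;br /&gt;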
&lt;br /&gt;
* '''Reduce the number of lines of code per method.'''&lt;br /&gt;
&lt;br /&gt;
We ensured that each method is short enough to fit within one screen length (no need to scroll).&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
* '''Re-wrote some methods containing complicated logic.'''&lt;br /&gt;
&lt;br /&gt;
Below is a code sample that we re-wrote to make the logic simpler and more maintainable.&lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? &amp;amp;&amp;amp; edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
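&lt;br /&gt;
The four label combinations can be exercised directly. The snippet below reproduces the refactored method; EQUAL and DISTINCT are illustrative constant values inferred from the &amp;quot;divide by 1&amp;quot;/&amp;quot;divide by 2&amp;quot; comments above, and ''LabeledEdge'' is a stand-in for the real edge class:&lt;br /&gt;

```ruby
# Assumed constant values, based on the "divide by 1"/"divide by 2" comments.
EQUAL = 1
DISTINCT = 2

# Minimal stand-in carrying only the label used by compare_labels.
LabeledEdge = Struct.new(:label)

def compare_labels(edge1, edge2)
  if (!edge1.label.nil? && !edge2.label.nil? &&
      edge1.label.downcase == edge2.label.downcase) ||
     (edge1.label.nil? && edge2.label.nil?)
    EQUAL
  else
    DISTINCT
  end
end

compare_labels(LabeledEdge.new("nsubj"), LabeledEdge.new("NSUBJ")) # => EQUAL
compare_labels(LabeledEdge.new("nsubj"), LabeledEdge.new("dobj"))  # => DISTINCT
compare_labels(LabeledEdge.new("nsubj"), LabeledEdge.new(nil))     # => DISTINCT
compare_labels(LabeledEdge.new(nil),     LabeledEdge.new(nil))     # => EQUAL
```

The condensed condition gives the same result for all four nil/label combinations as the original nested version.&lt;br /&gt;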
&lt;br /&gt;
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* The grade for the refactored class changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to codeclimate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
Ruby on Rails and automated testing go hand in hand. Rails ships with a built-in test framework; if it’s not to your liking, you can replace it with another, such as RSpec (often paired with Capybara). Testing is important in Rails, yet many people developing in Rails are either not testing their projects at all or, at best, only adding a few token specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that adding an extra layer of work seems like just that: extra work. Or maybe there is a perceived time constraint: spending time on writing tests takes time away from writing the features our clients or bosses demand. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But, it also ensures that the developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can be done in two approaches:&lt;br /&gt;
*Test Driven Development:&lt;br /&gt;
Test-Driven Development (TDD) is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*Behaviour Driven Development&lt;br /&gt;
Behavior-driven development (abbreviated BDD) is a software development process based on test-driven development (TDD). Behavior-driven development combines the general techniques and principles of TDD to provide software developers and business analysts with shared tools and a shared process to collaborate on software development, with the aim of delivering &amp;quot;software that matters&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of the development or testing teams. &lt;br /&gt;
Tests are small, automated Ruby programs that exercise different parts of your application. They can apply at multiple levels, e.g., customer tests, integration tests and unit tests.&lt;br /&gt;
In this approach, we follow a rigorous cycle: start by writing a failing test (Red), implement the simplest solution that will cause the test to pass (Green), then search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used Cucumber for testing. Cucumber lets software development teams describe how software should behave in plain text. The text is written in a business-readable domain-specific language and serves as documentation, automated tests and development aid, all rolled into one format.&lt;br /&gt;
&lt;br /&gt;
Cucumber is a testing framework that helps to bridge the gap between software developers and business managers. &lt;br /&gt;
Because behaviour-driven development is partly about understanding what the client wants before you begin coding, Cucumber aims to make its tests readable by clients (i.e., non-programmers). Tests are therefore written in a plain language called Gherkin, in the form of Given, When, Then, which any layperson can understand. Test cases are then placed into feature files that cover one or more test scenarios.&lt;br /&gt;
&lt;br /&gt;
The BDD Given, When, Then syntax is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber doesn’t know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it finds a step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Procedure==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by finding the comparison between the expected values and the actual values of the Submission and Review Graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, such as those comparing edges, vertices and subject-verb-object structures.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions for the undefined steps reported by Cucumber. Cucumber will load any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   @actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true if both are equal\.$/ do&lt;br /&gt;
   assert_equal(@expected, @actual)  # @expected is prepared by the test setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates submission and review graphs that are then fed to the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
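&lt;br /&gt;
The graph construction itself is elided above. As a hedged sketch of what such a setup might build (the structs, field names and sample sentence are purely illustrative, not the real Expertiza model classes):&lt;br /&gt;

```ruby
# Illustrative stand-ins for the real graph model classes.
Vertex = Struct.new(:node_id, :name, :type)
Edge   = Struct.new(:in_vertex, :out_vertex, :label)

class Deg_rel_setup
  attr_reader :subm_vertices, :subm_edges, :review_vertices, :review_edges

  def setup
    # A one-segment submission: "student writes code"
    s = Vertex.new(0, "student", :SUBJECT)
    v = Vertex.new(1, "writes",  :VERB)
    o = Vertex.new(2, "code",    :OBJECT)
    @subm_vertices = [s, v, o]
    @subm_edges    = [Edge.new(s, v, nil), Edge.new(v, o, nil)]

    # A short review that reuses one of the submission's tokens
    rs = Vertex.new(0, "code",  :SUBJECT)
    rv = Vertex.new(1, "works", :VERB)
    @review_vertices = [rs, rv]
    @review_edges    = [Edge.new(rs, rv, nil)]
  end
end
```

The real setup would hand these vertex and edge arrays, plus their counts, to the comparison methods under test.&lt;br /&gt;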
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it runs for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the same order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, so the following code snippet is executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute:&lt;br /&gt;
 rake db:migrate                        # initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature      # execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
&amp;lt;output screenshot&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they’re not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
= References =&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81369</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81369"/>
		<updated>2013-10-30T21:27:39Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
The class degree_of_relevance.rb is used to determine how relevant one piece of text is to another. It is used to evaluate the relevance between the submitted work and a review (or metareview) for an assignment. It is important to evaluate the reviews for a submitted work so that a review that is not relevant to the submission is considered invalid and does not impact the student's grade. &lt;br /&gt;
This class contains a lot of duplicated code and has long, complex methods. It has been assigned grade &amp;quot;F&amp;quot; according to metrics like code complexity, duplication, lines of code per method, etc. Since this class is important for research on Expertiza, it should be re-factored to reduce its complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly. The class can be found at the following location in the Expertiza source code: Expertiza\expertiza\app\models\automated_metareview&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The degree_of_relevance.rb file calculates the relevance between the submission and review graphs. The algorithm in the file implements a variant of dependency graphs, called a word-order graph, to capture both the governance and the ordering information crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
Vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined list of punctuation. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The Stanford NLP POS tagger is used to generate the tagged text. Vertices and edges are created using the algorithm below; the edges are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
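&lt;br /&gt;
A drastically simplified sketch of the idea above (the real generation groups words into phrases using the POS tags; here, purely for illustration, every word becomes a vertex and consecutive words within a segment are linked to preserve order):&lt;br /&gt;

```ruby
# Illustrative stand-ins for the real graph model classes.
Vertex = Struct.new(:node_id, :name)
Edge   = Struct.new(:in_vertex, :out_vertex)

# Split the text into segments on simple punctuation, then chain each
# segment's tokens into an ordered path of edges.
def build_word_order_graph(text)
  vertices = []
  edges = []
  text.split(/[.,;!?]+/).each do |segment|
    seg = segment.split.map do |token|
      v = Vertex.new(vertices.length, token)
      vertices << v
      v
    end
    # Link consecutive tokens; no edge crosses a segment boundary.
    seg.each_cons(2) { |a, b| edges << Edge.new(a, b) }
  end
  [vertices, edges]
end

vertices, edges = build_word_order_graph("good code, clean tests")
# One vertex per word; edges link consecutive words within each segment.
```

In the actual algorithm a vertex may hold a whole phrase, and the edges additionally carry the dependency labels used later by compare_labels.&lt;br /&gt;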
&lt;br /&gt;
WordNet is used to identify semantic relatedness between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of nominalization (the noun form of an adjective), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing semantic relatedness between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with same and different syntax, and edges of different types across two text graphs. We refer to the match as context matching since contiguous phrases, which capture additional context information, are chosen from a graph for comparison with those in another. Edge labels capture grammatical relations, and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges, and Er and Es represent the sets of review and submission edges. The formula calculates the average for each of the three types of matches above: ordered, lexical and nominalized.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this work we consider only single and double edges, and not longer chains of contiguous edges (triple edges etc.), for text matching. The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and Tr and Ts are the number of double edges in the review and submission texts respectively. matchord and matchvoice are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'' which is a single entry point to degree_of_relevance.rb. It takes as input submission and review graphs in array form along with other important parameters. The algorithm is broken down into different parts each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches instead of the fixed value of 0.33 in the above formula. For example, a possible set of values could be:&lt;br /&gt;
    alpha = 0.55&lt;br /&gt;
    beta = 0.35&lt;br /&gt;
    gamma = 0.1 &lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
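&lt;br /&gt;
Assuming the three per-type match scores have already been computed, the weighted combination is a straightforward weighted sum; the method name and argument names below are illustrative, and the default weights are the example values above:&lt;br /&gt;

```ruby
# alpha weights phrase/token matches, beta weights context matches and
# gamma weights sentence-structure matches, with alpha > beta > gamma.
def relevance(phrase_match, context_match, structure_match,
              alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end

relevance(0.8, 0.5, 0.2) # roughly 0.635
```

With the fixed 0.33 weights from the pictured formula, all three match types would contribute equally instead.&lt;br /&gt;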
&lt;br /&gt;
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices from across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method compares edges of the same type, i.e., SUBJECT-VERB edges are matched with SUBJECT-VERB edges. Here SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, vertex comparisons are done to identify matches and quantify various metrics.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT, where SUBJECT-OBJECT and VERB-VERB vertex comparisons are done.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparison between edges of different types. Comparisons related to SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB, VERB-OBJECT are handled and relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares SVO (subject-verb-object) double edges, which constitute a complete segment, across the two graphs.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method compares SVO double edges whose syntax differs across the two graphs, e.g. due to voice changes.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
degree_of_relevance.rb is buried deep in the models directory, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
Once a review is submitted, the further flow is as follows:&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code (filenames and corresponding methods called)]]&lt;br /&gt;
&lt;br /&gt;
When a user reviews an existing piece of work, the ''new'' method in response_controller.rb is called, which renders the page where the user fills in the review. On submission, the ''create'' method in the same file is called, which redirects to ''saving'' in the same file. ''saving'' calls ''calculate_metareview_metrics'' in automated_metareview.rb, which eventually calls ''get_relevance'' in degree_of_relevance.rb.&lt;br /&gt;
&lt;br /&gt;
= Implementation =&lt;br /&gt;
&lt;br /&gt;
== New Design and Refactoring ==&lt;br /&gt;
&lt;br /&gt;
=== New Design ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the Template Method design pattern to improve the design of the class. Although we did not directly implement this pattern on the class, its idea of defining the skeleton of an algorithm in one class and deferring some steps to subclasses allowed us to come up with a similar design that segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes, segregating the functionality and making a logical separation of the code.&lt;br /&gt;
* These classes are -&lt;br /&gt;
** compare_graph_edges.rb&lt;br /&gt;
** compare_graph_svo_edges.rb&lt;br /&gt;
** compare_graph_vertices.rb&lt;br /&gt;
** degree_of_relevance.rb&lt;br /&gt;
&lt;br /&gt;
* Extracted common code into methods that can be re-used.&lt;br /&gt;
* After refactoring, the grade for the class according to Code Climate is &amp;quot;C&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Refactoring ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==== Design of classes ====&lt;br /&gt;
&lt;br /&gt;
The main class degree_of_relevance.rb calculates the scaled relevance using the formula described above, based on a comparison of the submission and review graphs. As described above, the following types of comparisons are made between the graphs, and various metrics are calculated and used to compute the relevance:&lt;br /&gt;
&lt;br /&gt;
* Class compare_graph_edges.rb:&lt;br /&gt;
** Comparing edges with no syntax difference: SUBJECT-VERB edges are compared with SUBJECT-VERB edges, in which SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done.&lt;br /&gt;
** Comparing edges with a syntax difference: Compares edges across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB edges are compared with VERB-OBJECT edges and vice versa, in which SUBJECT-OBJECT and VERB-VERB comparisons are done (same-type comparisons).&lt;br /&gt;
** Comparing edges of different types: Compares edges across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT comparisons are done (different-type comparisons).&lt;br /&gt;
&lt;br /&gt;
All of the above methods are grouped in one class - compare_graph_edges.rb.&lt;br /&gt;
&lt;br /&gt;
* Class compare_graph_vertices.rb:&lt;br /&gt;
** Comparing vertices of the two graphs: Every vertex in one graph is compared with every vertex in the other to identify matches and quantify various metrics.&lt;br /&gt;
&lt;br /&gt;
This method is factored out into the class compare_graph_vertices.rb.&lt;br /&gt;
&lt;br /&gt;
* Class compare_graph_svo_edges.rb:&lt;br /&gt;
** Comparing SVO (SUBJECT-VERB-OBJECT) double edges.&lt;br /&gt;
** Comparing SVO edges with different syntax.&lt;br /&gt;
&lt;br /&gt;
These methods are grouped in the compare_graph_svo_edges.rb class.&lt;br /&gt;
&lt;br /&gt;
The main class degree_of_relevance.rb calls each of these methods to get the appropriate metrics required for evaluating relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png]]&lt;br /&gt;
&lt;br /&gt;
==== Types of Refactoring ====&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move common code out into re-usable methods:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code which was present in many methods. We moved it into a separate method and simply call this method with the appropriate parameters wherever required. It performs a nil check on an edge of the graph and verifies that the vertex ids are valid (not negative).&lt;br /&gt;
&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduce the number of lines of code per method.'''&lt;br /&gt;
&lt;br /&gt;
We ensured that each method is short enough to fit within one screen length (no need to scroll).&lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
  #average the cumulative match over the number of matched edges&lt;br /&gt;
  if(count &amp;gt; 0)&lt;br /&gt;
    return cum_edge_match.to_f/count.to_f&lt;br /&gt;
  end&lt;br /&gt;
  return 0.0&lt;br /&gt;
 end #end of the method&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring :&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
&lt;br /&gt;
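For reference, the following is a sketch of how a few of the extracted helper methods referenced in the refactored sample could look. The method names come from that sample; the bodies are our reconstruction and may differ from the actual implementation.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Hypothetical sketches of the extracted helper methods referenced in
# the refactored sample; names match the sample, bodies are reconstructed.

# Averages the cumulative edge match over the number of matched edges,
# guarding against division by zero.
def calculate_avg_match(cum_match, count)
  return 0.0 if count.zero?
  cum_match.to_f / count.to_f
end

# True when both vertices of an edge are frequent words, so the edge
# can be skipped during matching.
def wnet_frequent_word_check(wnet, edge)
  wnet.is_frequent_word(edge.in_vertex.name) and
    wnet.is_frequent_word(edge.out_vertex.name)
end

# True when the review and submission edges have the same vertex types
# at the same positions (S-V with S-V, or V-O with V-O).
def edge_symm_vertex_type_check(rev_edge, subm_edge)
  rev_edge.in_vertex.type == subm_edge.in_vertex.type and
    rev_edge.out_vertex.type == subm_edge.out_vertex.type
end
```
&lt;br /&gt;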
* '''Re-wrote some methods containing complicated logic.'''&lt;br /&gt;
&lt;br /&gt;
Below is a sample of code that we have re-written to make the logic simpler and more maintainable.&lt;br /&gt;
&lt;br /&gt;
Before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    result = EQUAL&lt;br /&gt;
    if(!edge1.label.nil? and !edge2.label .nil?)&lt;br /&gt;
      if(edge1.label.downcase == edge2.label.downcase)&lt;br /&gt;
        result = EQUAL #divide by 1&lt;br /&gt;
      else&lt;br /&gt;
        result = DISTINCT #divide by 2&lt;br /&gt;
      end&lt;br /&gt;
    elsif((!edge1.label.nil? and !edge2.label.nil?) or (edge1.label.nil? and !edge2.label.nil? )) #if only one of the labels was null&lt;br /&gt;
        result = DISTINCT&lt;br /&gt;
    elsif(edge1.label.nil? and edge2.label.nil?) #if both labels were null!&lt;br /&gt;
        result = EQUAL&lt;br /&gt;
    end  &lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
After Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_labels(edge1, edge2)&lt;br /&gt;
    if(((!edge1.label.nil? &amp;amp;&amp;amp; !edge2.label.nil?) &amp;amp;&amp;amp;(edge1.label.downcase == edge2.label.downcase)) ||&lt;br /&gt;
        (edge1.label.nil? and edge2.label.nil?))&lt;br /&gt;
      result = EQUAL&lt;br /&gt;
    else&lt;br /&gt;
      result = DISTINCT&lt;br /&gt;
    end&lt;br /&gt;
    return result&lt;br /&gt;
  end # end of method&lt;br /&gt;
&lt;br /&gt;
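The simplified predicate can also be restated standalone and sanity-checked. Here LabeledEdge is a stub in place of the real edge objects, and EQUAL and DISTINCT take the values implied by the original comments (divide by 1 and divide by 2).&lt;br /&gt;
&lt;br /&gt;
```ruby
# Standalone restatement of the refactored compare_labels logic.
# LabeledEdge is a stub; EQUAL and DISTINCT mirror the class constants
# (values assumed from the "divide by 1"/"divide by 2" comments).
EQUAL = 1
DISTINCT = 2

LabeledEdge = Struct.new(:label)

# Labels are EQUAL when both are nil, or both present and identical
# ignoring case; any other combination is DISTINCT.
def compare_labels(edge1, edge2)
  labels = [edge1.label, edge2.label]
  if labels.all? { |l| l.nil? }
    EQUAL
  elsif labels.none? { |l| l.nil? } and labels[0].downcase == labels[1].downcase
    EQUAL
  else
    DISTINCT
  end
end
```
&lt;br /&gt;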
We have achieved the following after refactoring:&lt;br /&gt;
* Reduced duplication significantly.&lt;br /&gt;
* Lines of code per method reduced from 75 to 25. This makes the code more readable and maintainable.&lt;br /&gt;
* Improved the design of classes by making logical separation of code into different classes.&lt;br /&gt;
* The grade for the refactored class changed from &amp;quot;F&amp;quot; to &amp;quot;C&amp;quot; according to Code Climate.&lt;br /&gt;
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
Ruby on Rails and automated testing go hand in hand. Rails ships with a built-in test framework; if it is not to your liking, you can replace it with an alternative such as RSpec, used together with tools like Capybara. Testing is important in Rails, yet many people developing in Rails are either not testing their projects at all or, at best, only adding a few token specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is novel enough that adding testing seems like just that: extra work. Or maybe there is a perceived time constraint, since writing tests takes time away from writing the features our clients or bosses demand. Or maybe the habit of defining “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. It also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of developing carelessly.&lt;br /&gt;
Testing can be done with two approaches:&lt;br /&gt;
*Test-Driven Development:&lt;br /&gt;
Test-Driven Development (TDD) is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&lt;br /&gt;
[[File:Test-driven development.png]]&lt;br /&gt;
&lt;br /&gt;
*Behaviour-Driven Development:&lt;br /&gt;
Behavior-driven development (BDD) is a software development process based on test-driven development (TDD). It combines the general techniques and principles of TDD with shared tools and a shared process that let software developers and business analysts collaborate on software development, with the aim of delivering &amp;quot;software that matters&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which tests are described in a natural language, making them more accessible to people outside of the development or testing teams.&lt;br /&gt;
Tests are small, automated Ruby programs that exercise different parts of your application. They can apply at multiple levels, e.g., customer tests, integration tests and unit tests.&lt;br /&gt;
In this approach, we follow a rigorous cycle: start by writing a failing test (red), implement the simplest solution that will cause the test to pass (green), then search for duplication and remove it (refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
[[File:TestDrivenGameDevelopment1.png]]&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used Cucumber with Capybara for testing. Cucumber lets software development teams describe how software should behave in plain text. The text is written in a business-readable domain-specific language and serves as documentation, automated tests and development aid, all rolled into one format.&lt;br /&gt;
&lt;br /&gt;
Cucumber is a testing framework that helps to bridge the gap between software developers and business managers.&lt;br /&gt;
Because behaviour-driven development is partly about understanding what the client wants before you begin coding, Cucumber aims to make its tests readable by clients (i.e., non-programmers). Tests are written in a plain language called Gherkin, in the form of Given, When, Then, which any layperson can understand. Test cases are then placed into feature files that cover one or more test scenarios.&lt;br /&gt;
&lt;br /&gt;
The BDD Given, When, Then syntax is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application at which the test occurs, as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber does not know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it finds that step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Capybara==&lt;br /&gt;
&lt;br /&gt;
Capybara is an acceptance-testing library that simulates how a real user would interact with a web application. It exposes a DSL (with methods such as ''visit'', ''fill_in'' and ''click_button'') that can be called from Cucumber step definitions, so the plain-text scenarios described above can drive the application through its actual pages.&lt;br /&gt;
&lt;br /&gt;
==Procedure==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''', located in the features folder.&lt;br /&gt;
The degree_of_relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values and the actual values produced from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all six major modules of degree of relevance, covering the comparison of edges, vertices and subject-verb-object edges.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable by non-programmers, so we have to implement step definitions that tell Cucumber how to execute each step. Cucumber will load any files in the features/step_definitions folder for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst = DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare actual value and expected value of &amp;quot;(\S+)&amp;quot;$/ do |method_name|&lt;br /&gt;
   #@expected holds the expected value prepared by the setup data&lt;br /&gt;
   actual = @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller)&lt;br /&gt;
   if(assert_equal(@expected, actual))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #reached when the actual and expected values match&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return false$/ do&lt;br /&gt;
   #reached when the actual and expected values differ&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup, present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then exercises the six different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
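As an illustration of the kind of dummy data such a setup could build, here is a sketch with stand-in Structs. The real setup constructs full word-order graphs; these Structs only model the fields the comparison methods read, and all names besides ''setup'' are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```ruby
# Illustrative stand-in for the dummy data built in Deg_rel_setup#setup.
# Vertex and Edge are hypothetical Structs modeling only the fields the
# comparison methods read (node_id, name, type, label).
Vertex = Struct.new(:node_id, :name, :type)
Edge = Struct.new(:in_vertex, :out_vertex, :label)

class DegRelSetupSketch
  attr_reader :review_vertices, :subm_vertices, :review_edges, :subm_edges

  # Builds a tiny review graph and submission graph by hand.
  def setup
    @review_vertices = [Vertex.new(0, "code", :subject), Vertex.new(1, "works", :verb)]
    @subm_vertices   = [Vertex.new(0, "program", :subject), Vertex.new(1, "runs", :verb)]
    @review_edges = [Edge.new(@review_vertices[0], @review_vertices[1], "nsubj")]
    @subm_edges   = [Edge.new(@subm_vertices[0], @subm_vertices[1], "nsubj")]
  end
end
```
&lt;br /&gt;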
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where the hook is defined and which scenario/step it is run for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks will be run before the first step of each scenario, in the same order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, and hence the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before ('@one') do&lt;br /&gt;
   set= Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the tests we simply need to execute:&lt;br /&gt;
 rake db:migrate                            #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel_set.feature      #This will execute the feature along with its step definitions&lt;br /&gt;
&lt;br /&gt;
&amp;lt;output screenshot&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our applications.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
= References =&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=File:TestDrivenGameDevelopment1.png&amp;diff=81366</id>
		<title>File:TestDrivenGameDevelopment1.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=File:TestDrivenGameDevelopment1.png&amp;diff=81366"/>
		<updated>2013-10-30T21:24:16Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: Scenarios&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Scenarios&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=File:Test-driven_development.png&amp;diff=81365</id>
		<title>File:Test-driven development.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=File:Test-driven_development.png&amp;diff=81365"/>
		<updated>2013-10-30T21:23:23Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: Flow chart&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Flow chart&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81325</id>
		<title>CSC/ECE 517 Fall 2013/oss ssv</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/oss_ssv&amp;diff=81325"/>
		<updated>2013-10-30T20:58:36Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Refactoring and testing — degree_of_relevance.rb =&lt;br /&gt;
[[File:RelevanceJoke.png|frame|Relevance]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
= Introduction  =&lt;br /&gt;
&lt;br /&gt;
The class degree_of_relevance.rb is used to evaluate how relevant one piece of text is to another. It is used to evaluate the relevance between the submitted work and a review (or metareview) for an assignment. It is important to evaluate the reviews for a submitted work to ensure that a review that is not relevant to the submission is considered invalid and does not impact the student's grade. &lt;br /&gt;
This class contains a lot of duplicated code and has long, complex methods. It has been assigned grade &amp;quot;F&amp;quot; according to metrics like code complexity, duplication, lines of code per method, etc. Since this class is important for research on Expertiza, it should be re-factored to reduce its complexity and duplication and to introduce coding best practices. Our task for this project is to re-factor this class and test it thoroughly. The class can be found at the following location in the Expertiza source code - Expertiza\expertiza\app\models\automated_metareview&lt;br /&gt;
&lt;br /&gt;
= Theory of Relevance =&lt;br /&gt;
&lt;br /&gt;
Relevance between two texts can be defined as:&lt;br /&gt;
&lt;br /&gt;
[[File:Relevance.png|frame|none|Definition of Relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The degree_of_relevance.rb file calculates the relevance between submission and review graphs. The algorithm in the file implements a variant of dependency graphs, called a word-order graph, to capture both governance and ordering information, which is crucial to obtaining more accurate results. &lt;br /&gt;
&lt;br /&gt;
== Word-order graph generation ==&lt;br /&gt;
&lt;br /&gt;
Vertices represent noun, verb, adjective or adverbial words or phrases in a text, and edges represent relationships between vertices. The first step of graph generation is dividing the input text into segments on the basis of a predefined list of punctuation. The text is then tagged with part-of-speech information, which is used to determine how to group words into phrases while still maintaining their order. The Stanford NLP POS tagger is used to generate the tagged text. Vertices and edges are created using the algorithm below. Graph edges that are subsequently created are then labeled with dependency (word-modifier) information.&lt;br /&gt;
&lt;br /&gt;
[[File:MainAlgo.png|frame|none|Algorithm to create vertices and edges‎]]&lt;br /&gt;
&lt;br /&gt;
WordNet is used to identify semantic relatedness between two terms. The comparison between reviews and submissions involves checking for relatedness across verb, adjective or adverbial forms, checking for cases of nominalization (the noun form of an adjective), etc.&lt;br /&gt;
&lt;br /&gt;
== Lexico-Semantic Graph-based Matching ==&lt;br /&gt;
&lt;br /&gt;
=== Phrase or token matching ===&lt;br /&gt;
&lt;br /&gt;
In phrase or token matching, vertices containing phrases or tokens are compared across graphs. This matching succeeds in capturing semantic relatedness between single or compound words. &lt;br /&gt;
&lt;br /&gt;
[[File:Phrase.png|frame|none|Phrase-matching‎]]&lt;br /&gt;
&lt;br /&gt;
=== Context matching ===&lt;br /&gt;
&lt;br /&gt;
Context matching compares edges with the same and with different syntax, and edges of different types, across the two text graphs. We refer to this match as context matching since contiguous phrases, which capture additional context information, are chosen from one graph for comparison with those in the other. Edge labels capture grammatical relations and play an important role in matching.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Context-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(e) and s(e) refer to review and submission edges. The formula calculates the average for each of the three types of matches: ordered, lexical and nominalized. Er and Es represent the sets of review and submission edges.&lt;br /&gt;
&lt;br /&gt;
=== Sentence structure matching ===&lt;br /&gt;
&lt;br /&gt;
Sentence structure matching compares double edges (two contiguous edges), which constitute a complete segment (e.g. subject–verb–object), across graphs. In this work only single and double edges are considered for text matching, not longer contiguous sequences (triple edges, etc.). The matching captures similarity across segments as well as voice changes.&lt;br /&gt;
&lt;br /&gt;
[[File:Context.png|frame|none|Structure-matching‎]]&lt;br /&gt;
&lt;br /&gt;
r(t) and s(t) refer to double edges, and Tr and Ts are the number of double edges in the review and submission texts respectively. matchord and matchvoice are the averages of the best ordered and voice change matches.&lt;br /&gt;
&lt;br /&gt;
= Existing Design =&lt;br /&gt;
&lt;br /&gt;
The current design has a method ''get_relevance'', which is the single entry point to degree_of_relevance.rb. It takes as input the submission and review graphs in array form, along with other important parameters. The algorithm is broken down into different parts, each of which is handled by a different helper method. The results obtained from these methods are used in the following formula to obtain the degree of relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:RelevanceFormula.png|frame|none|Formula to calculate relevance‎]]&lt;br /&gt;
&lt;br /&gt;
The implementation in Expertiza leaves room for applying separate weights to the different types of matches, instead of the fixed value of 0.33 in the above formula. For example, a possible set of values could be:&lt;br /&gt;
    alpha = 0.55&lt;br /&gt;
    beta = 0.35&lt;br /&gt;
    gamma = 0.1&lt;br /&gt;
The rule generally followed is alpha &amp;gt; beta &amp;gt; gamma.&lt;br /&gt;
&lt;br /&gt;
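A sketch of how such a weighted combination could look in Ruby. The method name and parameters are illustrative, not the actual signature in degree_of_relevance.rb.&lt;br /&gt;
&lt;br /&gt;
```ruby
# A sketch of the weighted combination of match scores. alpha, beta and
# gamma are illustrative weights for the phrase, context and sentence-
# structure matches; they should sum to 1, with alpha the largest.
def scaled_relevance(phrase_match, context_match, structure_match,
                     alpha: 0.55, beta: 0.35, gamma: 0.1)
  alpha * phrase_match + beta * context_match + gamma * structure_match
end
```
&lt;br /&gt;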
== Helper Methods ==&lt;br /&gt;
&lt;br /&gt;
The helper methods used by ''get_relevance'' are:&lt;br /&gt;
&lt;br /&gt;
=== compare_vertices ===&lt;br /&gt;
&lt;br /&gt;
This method compares the vertices across the two graphs (submission and review) to identify matches and quantify various metrics. Every vertex in one graph is compared with every vertex in the other to obtain the comparison results.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_non_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This is the method where SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done to identify matches and quantify various metrics. It compares edges of the same type, i.e., SUBJECT-VERB edges are matched with SUBJECT-VERB edges.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_syntax_diff ===&lt;br /&gt;
&lt;br /&gt;
This method handles comparisons between edges of type SUBJECT-VERB and VERB-OBJECT, in which SUBJECT-OBJECT and VERB-VERB comparisons are made. Although the edges differ in syntax, the underlying comparisons are still between vertices of the same type.&lt;br /&gt;
&lt;br /&gt;
=== compare_edges_diff_types ===&lt;br /&gt;
&lt;br /&gt;
This method does comparisons between edges of different types. SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT comparisons are handled and the relevant results returned.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_edges ===&lt;br /&gt;
&lt;br /&gt;
This method compares SUBJECT-VERB-OBJECT double edges (two contiguous edges that form a complete segment) across the two graphs.&lt;br /&gt;
&lt;br /&gt;
=== compare_SVO_diff_syntax ===&lt;br /&gt;
&lt;br /&gt;
This method compares SUBJECT-VERB-OBJECT double edges that differ in syntax across the two graphs, which helps capture voice changes between the review and the submission.&lt;br /&gt;
&lt;br /&gt;
== Program Flow ==&lt;br /&gt;
&lt;br /&gt;
degree_of_relevance.rb is buried deep in the models directory, and to bring control to this file the tester needs to undertake some setup tasks. The diagram below gives a quick look at what needs to be done.&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow1.png|frame|none|Initial setup‎]]&lt;br /&gt;
&lt;br /&gt;
Once a review is submitted, the further flow is as follows:&lt;br /&gt;
&lt;br /&gt;
[[File:ProgramFlow2.png|frame|none|Flow in the code(filename and corresponding methods called‎)]]&lt;br /&gt;
&lt;br /&gt;
When a user tries to review an existing work, the ''new'' method in response_controller.rb is called, which renders the page where the user fills in the review. On submission, the ''create'' method of the same file is called, which then redirects to ''saving'' in the same file. ''saving'' makes a call to ''calculate_metareview_metrics'' in automated_metareview.rb, which eventually calls ''get_relevance'' in degree_of_relevance.rb.&lt;br /&gt;
&lt;br /&gt;
= Implementation =&lt;br /&gt;
&lt;br /&gt;
== New Design and Refactoring ==&lt;br /&gt;
&lt;br /&gt;
=== New Design ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
We have taken ideas from the Template Method design pattern to improve the design of the class. Although we did not directly implement this design pattern, its idea of defining the skeleton of an algorithm in one class and deferring some steps to subclasses allowed us to come up with a similar design, which segregates different functionality into different classes. The following is a brief outline of the changes made:&lt;br /&gt;
&lt;br /&gt;
* Divided the code into four classes to segregate the functionality, making a logical separation of the code.&lt;br /&gt;
* These classes are:&lt;br /&gt;
** compare_graph_edges.rb&lt;br /&gt;
** compare_graph_svo_edges.rb&lt;br /&gt;
** compare_graph_vertices.rb&lt;br /&gt;
** degree_of_relevance.rb&lt;br /&gt;
&lt;br /&gt;
* Extracted common code into methods that can be re-used.&lt;br /&gt;
* After refactoring, the Code Climate grade for the class is &amp;quot;C&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Refactoring ===&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==== Design of classes ====&lt;br /&gt;
&lt;br /&gt;
The main class degree_of_relevance.rb calculates the scaled relevance using the formula described above, based on a comparison of the submission and review graphs. As described above, the following types of comparisons are made between the graphs, and various metrics are calculated and used to compute the relevance:&lt;br /&gt;
&lt;br /&gt;
* Class  compare_graph_edges.rb:&lt;br /&gt;
** Comparing edges with no syntax difference: SUBJECT-VERB edges are compared with SUBJECT-VERB edges, in which SUBJECT-SUBJECT and VERB-VERB, or VERB-VERB and OBJECT-OBJECT, comparisons are done.&lt;br /&gt;
** Comparing edges with a syntax difference: Compares the edges from across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB edges are compared with VERB-OBJECT edges, and vice versa, in which SUBJECT-OBJECT and VERB-VERB comparisons are done (same-type comparisons).&lt;br /&gt;
** Comparing edges of different types: Compares the edges from across the two graphs to identify matches and quantify various metrics. SUBJECT-VERB, VERB-SUBJECT, OBJECT-VERB and VERB-OBJECT comparisons are done (different-type comparisons).&lt;br /&gt;
&lt;br /&gt;
All the above functions are grouped in one class - compare_graph_edges.rb.&lt;br /&gt;
&lt;br /&gt;
* Class  compare_graph_vertices.rb:&lt;br /&gt;
** Comparing vertices of the corresponding graphs: Every vertex in one graph is compared with every vertex in the other to identify matches and quantify various metrics.&lt;br /&gt;
&lt;br /&gt;
This method is factored out into the class compare_graph_vertices.rb.&lt;br /&gt;
&lt;br /&gt;
* Class compare_graph_SVO_edges.rb:&lt;br /&gt;
&lt;br /&gt;
** Comparing SVO edges.&lt;br /&gt;
** Comparing SVO edges with different syntax.&lt;br /&gt;
&lt;br /&gt;
These methods are grouped in the compare_graph_SVO_edges.rb class.&lt;br /&gt;
&lt;br /&gt;
The main class degree_of_relevance.rb calls each of these methods to get the appropriate metrics required for evaluating relevance.&lt;br /&gt;
&lt;br /&gt;
[[File:class design.png]]&lt;br /&gt;
&lt;br /&gt;
==== Types of Refactoring ====&lt;br /&gt;
&lt;br /&gt;
The following are the three main types of refactoring we have done in our project.&lt;br /&gt;
&lt;br /&gt;
* '''Move out common code in re-usable method:'''&lt;br /&gt;
&lt;br /&gt;
The following is a sample of the duplicated code that was present in many methods. We moved it to a separate method and simply call this method with the appropriate parameters wherever required. It checks that an edge is not nil and that neither of its vertex ids is invalid (negative).&lt;br /&gt;
&lt;br /&gt;
 # true when the edge exists and both of its vertices have valid node ids&lt;br /&gt;
 def edge_nil_cond_check(edge)&lt;br /&gt;
   return (!edge.nil? &amp;amp;&amp;amp; edge.in_vertex.node_id != -1 &amp;amp;&amp;amp; edge.out_vertex.node_id != -1)&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
* '''Reduce number of lines of code per method'''.&lt;br /&gt;
&lt;br /&gt;
We ensured that each method is short enough to fit in less than one screen length (no need to scroll). &lt;br /&gt;
** Combined conditions together in one statement instead of nesting them inside one another.&lt;br /&gt;
** Took out code that can be re-used or logically separated in separate methods.&lt;br /&gt;
** Removed commented debug statements.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample before Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
  # puts(&amp;quot;*****Inside compareEdgesDiffTypes :: numRevEdg :: #{num_rev_edg} numSubEdg:: #{num_sub_edg}&amp;quot;)   &lt;br /&gt;
  best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
  cum_edge_match = 0.0&lt;br /&gt;
  count = 0&lt;br /&gt;
  max = 0.0&lt;br /&gt;
  flag = 0&lt;br /&gt;
  wnet = WordnetBasedSimilarity.new  &lt;br /&gt;
  for i in (0..num_rev_edg - 1)&lt;br /&gt;
    if(!rev[i].nil? and rev[i].in_vertex.node_id != -1 and rev[i].out_vertex.node_id != -1)&lt;br /&gt;
      #skipping edges with frequent words for vertices&lt;br /&gt;
      if(wnet.is_frequent_word(rev[i].in_vertex.name) and wnet.is_frequent_word(rev[i].out_vertex.name))&lt;br /&gt;
        next&lt;br /&gt;
      end&lt;br /&gt;
      #identifying best match for edges&lt;br /&gt;
      for j in (0..num_sub_edg - 1) &lt;br /&gt;
        if(!subm[j].nil? and subm[j].in_vertex.node_id != -1 and subm[j].out_vertex.node_id != -1)&lt;br /&gt;
          #checking if the subm token is a frequent word&lt;br /&gt;
          if(wnet.is_frequent_word(subm[j].in_vertex.name) and wnet.is_frequent_word(subm[j].out_vertex.name))&lt;br /&gt;
            next&lt;br /&gt;
          end &lt;br /&gt;
          #for S-V with S-V or V-O with V-O&lt;br /&gt;
          if(rev[i].in_vertex.type == subm[j].in_vertex.type and rev[i].out_vertex.type == subm[j].out_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] = 0.0&lt;br /&gt;
            end&lt;br /&gt;
            #-- Vertex and SRL&lt;br /&gt;
            best_SV_VS_match[i][j] = best_SV_VS_match[i][j]/ compare_labels(rev[i], subm[j])&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          #for S-V with V-O or V-O with S-V&lt;br /&gt;
          elsif(rev[i].in_vertex.type == subm[j].out_vertex.type and rev[i].out_vertex.type == subm[j].in_vertex.type)&lt;br /&gt;
            #taking each match separately because one or more of the terms may be a frequent word, for which no @vertex_match exists!&lt;br /&gt;
            sum = 0.0&lt;br /&gt;
            cou = 0&lt;br /&gt;
            if(!@vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].in_vertex.node_id][subm[j].in_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end&lt;br /&gt;
            if(!@vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id].nil?)&lt;br /&gt;
              sum = sum + @vertex_match[rev[i].out_vertex.node_id][subm[j].out_vertex.node_id]&lt;br /&gt;
              cou +=1&lt;br /&gt;
            end  &lt;br /&gt;
            if(cou &amp;gt; 0)&lt;br /&gt;
              best_SV_VS_match[i][j] = sum.to_f/cou.to_f&lt;br /&gt;
            else&lt;br /&gt;
              best_SV_VS_match[i][j] =0.0&lt;br /&gt;
            end&lt;br /&gt;
            flag = 1&lt;br /&gt;
            if(best_SV_VS_match[i][j] &amp;gt; max)&lt;br /&gt;
              max = best_SV_VS_match[i][j]&lt;br /&gt;
            end&lt;br /&gt;
          end&lt;br /&gt;
        end #end of the if condition&lt;br /&gt;
      end #end of the for loop for submission edges&lt;br /&gt;
        &lt;br /&gt;
      if(flag != 0) #if the review edge had any submission edges with which it was matched, since not all S-V edges might have corresponding V-O edges to match with&lt;br /&gt;
        # puts(&amp;quot;**** Best match for:: #{rev[i].in_vertex.name} - #{rev[i].out_vertex.name} -- #{max}&amp;quot;)&lt;br /&gt;
        cum_edge_match = cum_edge_match + max&lt;br /&gt;
        count+=1&lt;br /&gt;
        max = 0.0 #re-initialize&lt;br /&gt;
        flag = 0&lt;br /&gt;
      end&lt;br /&gt;
    end #end of if condition&lt;br /&gt;
  end #end of for loop for review edges&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code sample after Refactoring:&lt;br /&gt;
&lt;br /&gt;
 def compare_edges_diff_types(vertex_match, rev, subm, num_rev_edg, num_sub_edg)&lt;br /&gt;
    best_SV_VS_match = Array.new(num_rev_edg){Array.new}&lt;br /&gt;
    cum_edge_match = max = 0.0&lt;br /&gt;
    count = flag = 0&lt;br /&gt;
    wnet = WordnetBasedSimilarity.new&lt;br /&gt;
    for i in (0..num_rev_edg - 1)&lt;br /&gt;
      if(edge_nil_cond_check(rev[i]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, rev[i]))&lt;br /&gt;
        #identifying best match for edges&lt;br /&gt;
        for j in (0..num_sub_edg - 1)&lt;br /&gt;
          if(edge_nil_cond_check(subm[j]) &amp;amp;&amp;amp; !wnet_frequent_word_check(wnet, subm[j]))&lt;br /&gt;
            #for S-V with S-V or V-O with V-O&lt;br /&gt;
            if(edge_symm_vertex_type_check(rev[i],subm[j]) || edge_asymm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
              if(edge_symm_vertex_type_check(rev[i],subm[j]))&lt;br /&gt;
                cou, sum = sum_cou_asymm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              else&lt;br /&gt;
                cou, sum = sum_cou_symm_calculation(vertex_match, rev[i], subm[j])&lt;br /&gt;
              end&lt;br /&gt;
              max, best_SV_VS_match[i][j] = max_calculation(cou,sum, rev[i], subm[j], max)&lt;br /&gt;
              flag = 1&lt;br /&gt;
            end&lt;br /&gt;
          end #end of the if condition&lt;br /&gt;
        end #end of the for loop for submission edges&lt;br /&gt;
        flag_cond_var_set(:cum_edge_match, :max, :count, :flag, binding)&lt;br /&gt;
      end&lt;br /&gt;
    end #end of 'for' loop for the review's edges&lt;br /&gt;
    return calculate_avg_match(cum_edge_match, count)&lt;br /&gt;
  end #end of the method&lt;br /&gt;
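The averaging helper called at the end, ''calculate_avg_match'', is one of the extracted methods. A plausible sketch of it (the body shown here is an assumption, not the exact Expertiza code) simply averages the accumulated best matches:&lt;br /&gt;

```ruby
# Plausible sketch of the extracted averaging helper: it averages the
# cumulative best-match score over the number of matched review edges.
def calculate_avg_match(cum_edge_match, count)
  return 0.0 if count == 0  # no review edge found any match
  cum_edge_match.to_f / count.to_f
end
```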
&lt;br /&gt;
= Testing =&lt;br /&gt;
&lt;br /&gt;
Ruby on Rails and automated testing go hand in hand. Rails ships with a built-in test framework; if it is not to your liking, you can replace it with an alternative such as RSpec or Capybara. Testing is important in Rails, yet many people developing in Rails are either not testing their projects at all, or at best only adding a few token specs on model validations.&lt;br /&gt;
 &lt;br /&gt;
 If it's worth building, it's worth testing.&lt;br /&gt;
 If it's not worth testing, why are you wasting your time working on it?&lt;br /&gt;
&lt;br /&gt;
There are several reasons for this. Perhaps working with Ruby or web frameworks is a novel enough concept that adding an extra layer of work seems like just that: extra work. Or maybe there is a perceived time constraint, since spending time on writing tests takes time away from writing the features our clients or bosses demand. Or maybe the habit of defining a “test” as clicking links in the browser is too hard to break.&lt;br /&gt;
Testing helps us find defects in our applications. But it also ensures that developers follow the release plan and the rule-set of necessary requirements, instead of just developing carelessly.&lt;br /&gt;
Testing can follow one of two approaches:&lt;br /&gt;
*Test Driven Development:&lt;br /&gt;
Test-Driven Development (TDD) is a way of driving the design of code by writing a test which expresses what you intend the code to do, making that test pass, and continuously refactoring to keep the design as simple as possible.&lt;br /&gt;
&amp;lt;&amp;lt;image of the flowchart&amp;gt;&amp;gt;&lt;br /&gt;
*Behaviour Driven Development&lt;br /&gt;
Behavior-driven development (abbreviated BDD) is a software development process based on test-driven development (TDD). Behavior-driven development combines the general techniques and principles of TDD to provide software developers and business analysts with shared tools and a shared process to collaborate on software development, with the aim of delivering &amp;quot;software that matters&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
We have used the BDD approach, in which the tests are described in a natural language, which makes them more accessible to people outside of the development or testing teams. &lt;br /&gt;
Tests are small, automated Ruby programs that exercise different parts of your application. Testing can apply at multiple levels, e.g., customer tests, integration tests and unit tests.&lt;br /&gt;
In this approach, we have to follow a rigorous cycle. Start by writing a failing test (Red). Implement the simplest solution that will cause the test to pass (Green). Search for duplication and remove it (Refactor). &amp;quot;RED-GREEN-REFACTOR&amp;quot; has become almost a mantra for many TDD practitioners.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;&amp;lt;image&amp;gt;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Cucumber==&lt;br /&gt;
&lt;br /&gt;
Of the various tools available, we have used Cucumber with Capybara for testing. Cucumber lets software development teams describe how software should behave in plain text. The text is written in a business-readable domain-specific language and serves as documentation, automated tests and development aid, all rolled into one format.&lt;br /&gt;
&lt;br /&gt;
Cucumber is a testing framework that helps to bridge the gap between software developers and business managers. &lt;br /&gt;
Because behaviour-driven development is partly about understanding what the client wants before you begin coding, Cucumber aims to make its tests readable by clients (i.e., non-programmers). So tests are written in a plain language called Gherkin, in the form of Given, When, Then, which any layperson can understand. Test cases are then placed into feature files that cover one or more test scenarios.&lt;br /&gt;
&lt;br /&gt;
The BDD Given, When, Then syntax is designed to be intuitive. Consider the syntax elements:&lt;br /&gt;
&lt;br /&gt;
*'''Given''' provides context for the test scenario about to be executed, such as the point in your application that the test occurs as well as any prerequisite data.&lt;br /&gt;
*'''When''' specifies the set of actions that triggers the test, such as user or subsystem actions.&lt;br /&gt;
*'''Then''' specifies the expected result of the test.&lt;br /&gt;
&lt;br /&gt;
As Cucumber does not know how to interpret the features by itself, the next step is to create step definitions telling it what to do when it finds that step in one of the scenarios. The step definitions are written in Ruby, which gives much flexibility in how the test steps are executed.&lt;br /&gt;
&lt;br /&gt;
==Capybara==&lt;br /&gt;
&lt;br /&gt;
Capybara is an acceptance-test framework that simulates the way a real user would interact with a web application. It provides a DSL with methods such as ''visit'', ''fill_in'' and ''click_button'' for driving the application through a driver (rack-test by default), along with matchers for making assertions about the content of the resulting pages. Cucumber step definitions commonly use Capybara to exercise the application through its web interface.&lt;br /&gt;
&lt;br /&gt;
== Procedure ==&lt;br /&gt;
*First of all, we have to include the required gems for our tests in the '''Gemfile'''.&lt;br /&gt;
 group :test do&lt;br /&gt;
   gem 'capybara'&lt;br /&gt;
   gem 'cucumber-rails', require: false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Then, write the actual tests in the feature file, '''deg_rel_set.feature''' located in the features folder.&lt;br /&gt;
The degree of relevance file that we are dealing with consists of six major functions. We have written tests covering all six functions.&lt;br /&gt;
The major feature is to test the degree of relevance by comparing the expected values against the actual values computed from the submission and review graphs.&lt;br /&gt;
&lt;br /&gt;
 Feature: To check degree of Relevance&lt;br /&gt;
   In order to check the various methods of the class&lt;br /&gt;
   As an Administrator&lt;br /&gt;
   I want to check if the actual and expected values match.&lt;br /&gt;
&lt;br /&gt;
 @one&lt;br /&gt;
  Scenario: Check actual v/s expected values&lt;br /&gt;
    Given Instance of the class is created&lt;br /&gt;
    When I compare actual value and expected value of &amp;quot;compare_vertices&amp;quot;&lt;br /&gt;
    Then It will return true if both are equal.&lt;br /&gt;
&lt;br /&gt;
Similarly, we have tested all the six major modules of degree of relevance like comparing edges, vertices and subject-verb objects.&lt;br /&gt;
&lt;br /&gt;
*Cucumber feature files are written to be readable to non-programmers, so we have to implement step definitions for undefined steps, provided by Cucumber. Cucumber will load any files in the folder features/step_definitions for steps. We have created a step file, '''deg_relev_step.rb'''.&lt;br /&gt;
&lt;br /&gt;
 Given /^Instance of the class is created$/ do&lt;br /&gt;
   @inst=DegreeOfRelevance.new&lt;br /&gt;
 end&lt;br /&gt;
 When /^I compare (\d+) and &amp;quot;(\S+)&amp;quot;$/ do |qty, method_name|&lt;br /&gt;
   actual = qty&lt;br /&gt;
   if assert_equal(qty, @inst.compare_vertices(@vertex_match, @pos_tagger, @review_vertices, @subm_vertices, @num_rev_vert, @num_sub_vert, @speller))&lt;br /&gt;
     step(&amp;quot;It will return true&amp;quot;)&lt;br /&gt;
   else&lt;br /&gt;
     step(&amp;quot;It will return false&amp;quot;)&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
 Then /^It will return true$/ do&lt;br /&gt;
   #print true or false&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*The dummy model/data is created in the setup function defined in the class Deg_rel_setup present in '''lib/deg_rel_setup.rb'''. It generates a graph of submissions and reviews and then executes the 6 different methods.&lt;br /&gt;
&lt;br /&gt;
 class Deg_rel_setup&lt;br /&gt;
   def setup&lt;br /&gt;
    #set up nodes and graphs&lt;br /&gt;
   end&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*Cucumber provides a number of hooks which allow us to run blocks at various points in the Cucumber test cycle. We can put them in the '''support/env.rb''' file or any other file under the support directory, for example in a file called '''support/hooks.rb'''. There is no association between where a hook is defined and which scenario/step it is run for, but we can use tagged hooks if we want more fine-grained control.&lt;br /&gt;
&lt;br /&gt;
All defined hooks are run whenever the relevant event occurs.&lt;br /&gt;
Before hooks run before the first step of each scenario, in the same order in which they are registered.&lt;br /&gt;
&lt;br /&gt;
 Before do &lt;br /&gt;
   # Do something before each scenario. &lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
Sometimes we may want a certain hook to run only for certain scenarios. This can be achieved by associating a Before, After, Around or AfterStep hook with one or more tags. We have used a tag named @one, and hence the following code snippet will be executed every time before a scenario tagged @one runs.&lt;br /&gt;
&lt;br /&gt;
 Before('@one') do&lt;br /&gt;
   set = Deg_rel_setup.new&lt;br /&gt;
   set.setup&lt;br /&gt;
 end&lt;br /&gt;
&lt;br /&gt;
*To run the test we simply need to execute:&lt;br /&gt;
 rake db:migrate		                   #To initialize the test database&lt;br /&gt;
 cucumber features/deg_rel.feature	   #This will execute the feature along with its step definitions	&lt;br /&gt;
&lt;br /&gt;
&amp;lt;output screenshot&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In the end, though, the most important thing is that the tests are reliable and understandable. Even if they are not quite as optimized as they could be, they are a great way to start an application. We should keep taking advantage of the fully automated test suites available and use tests to drive development and ferret out potential bugs in our application.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Conclusion =&lt;br /&gt;
= References =&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79857</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79857"/>
		<updated>2013-10-07T21:08:14Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)] is a programming technique which associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when they are updated.&lt;br /&gt;
ORM essentially creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. There are both free and commercial packages available that perform object-relational mapping, although some programmers opt to create their own ORM tools. In this wiki, we concentrate mainly on the top Ruby ORMs, namely ActiveRecord, Sequel and DataMapper. We also include a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose due to a major problem called the object-relational impedance mismatch.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch object-relational impedance mismatch] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schema. It implies that object models do not work seamlessly with relational models: an RDBMS represents data in tabular format, while object-oriented languages represent data as an interconnected graph of objects.&lt;br /&gt;
The mismatches occur in the following object-oriented concepts: &lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping private object representations to database tables is difficult, because there are fewer constraints on the design of a private representation than on public data in an RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts such as classes of objects, inheritance and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. Object-oriented languages, however, often define both object identity and object equality; in Java, for example, identity is a==b and equality is a.equals(b).&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need bidirectional relationships in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, for example, you navigate from one association to another, walking the object network. This is not an efficient way of retrieving data from a relational database: you typically want to minimize the number of SQL queries, and thus load several entities via JOINs and select the targeted entities before you start walking the object network.&lt;br /&gt;
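The identity mismatch above has a direct analogue in Ruby, where ''equal?'' tests object identity and ''=='' tests value equality:&lt;br /&gt;

```ruby
# Value equality vs. object identity in Ruby.
a = "bob"
b = a.dup        # a distinct object with the same contents

a == b           # value equality: the strings have equal contents
a.equal?(b)      # object identity: they are different objects
a.equal?(a)      # an object is always identical to itself
```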
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. For example, consider a geometric object, an entity that has a shape and a color. In an object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes. A shape itself can contain objects, such as triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by the programming language, and its attributes can be accessed through object methods. However, many popular database products, such as structured query language database management systems (SQL DBMSs), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or use only simple scalar values within the program. Object-relational mapping implements the first approach. The key is to translate the logical representation of the objects into an atomized form that is capable of being stored in the database, while somehow preserving the properties of the objects and their relationships so that they can be reloaded as objects whenever required. If this storage and retrieval functionality is implemented, the objects are said to be persistent.&lt;br /&gt;
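A minimal hand-rolled sketch (a hypothetical class, with no real database involved) illustrates this flattening and rebuilding; a real ORM generates this plumbing automatically:&lt;br /&gt;

```ruby
# Hypothetical example: flatten an object to a row of scalar values
# and rebuild it on retrieval. A real ORM does this for you.
class GeometricObject
  attr_reader :shape, :color

  def initialize(shape, color)
    @shape = shape
    @color = color
  end

  # "Persist": reduce the object to scalar column values.
  def to_row
    { shape: @shape.to_s, color: @color.to_s }
  end

  # "Retrieve": rebuild the object from the stored scalars.
  def self.from_row(row)
    new(row[:shape], row[:color])
  end
end

row = GeometricObject.new("triangle", "red").to_row
obj = GeometricObject.from_row(row)
```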
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** Example, :has_many, :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)&lt;br /&gt;
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes database access from within the application extremely easy, as it provides the connection from the program's models to the database. By treating database rows as objects, a program can access the information in a more consistent manner. It also gives programmers the flexibility to manipulate the data in the language itself, rather than working directly on the attributes retrieved from the database.&lt;br /&gt;
&lt;br /&gt;
When applied to Ruby, implementations of ORM often leverage the language's metaprogramming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With Rails, the ORM becomes much more important, as it is necessary to have an ORM to connect the models of the MVC (model-view-controller) stack used by Ruby on Rails with the application's database. Since the models are Ruby objects, the ORM allows modifications to the database to be made through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM-independent. With this change, it is much easier to use the ORM the programmer prefers, rather than being corralled into one particular choice. To allow this, the ORM needs to be extracted from the model and the database, and act as a pure mediator between the two. Then any ORM can be used, as long as it can successfully understand the model and the database used.&lt;br /&gt;
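As a taste of the metaprogramming involved, a toy sketch (not ActiveRecord's actual implementation) can use ''method_missing'' to synthesize ''find_by_*'' finders over an in-memory table:&lt;br /&gt;

```ruby
# Toy illustration of dynamic finders via method_missing. ActiveRecord's
# real implementation is far more involved; this only shows the idea.
class TinyModel
  ROWS = [
    { name: "Bob",   email: "bob@example.com" },
    { name: "Alice", email: "alice@example.com" }
  ]

  # Intercept calls like find_by_name("Bob") or find_by_email(...)
  def self.method_missing(name, *args)
    if name.to_s.start_with?("find_by_")
      column = name.to_s.sub("find_by_", "").to_sym
      ROWS.find { |row| row[column] == args.first }
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end

user = TinyModel.find_by_name("Bob")
```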
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to use raw SQL to find database records. ActiveRecord facilitates the creation and use of business objects whose data requires persistent storage in a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other, less popular ORMs have been released since ActiveRecord first took the stage; for example, DataMapper and Sequel offer improvements over the original ActiveRecord framework. As a response to their release and adoption by the Rails community, Ruby on Rails v3.0 is independent of any ORM system, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord makes use of a migration class rather than including the table definition in the actual class being modeled. As an example, the following code creates a table ''users'' to store a collection of class ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
      t.references :cheer&lt;br /&gt;
      t.references :post&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that in the preceding example, ActiveRecord's migration scheme provides a means of backing out changes via the ''self.down'' method.  In addition, a primary key ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.  &lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between tables and provides the means to define such associations in the model definition.  For example, the ''User'' class shown below declares a ''has_many'' (one-to-many) association with both the cheers and posts tables; by Rails convention, the corresponding foreign key (user_id) is stored in each of those tables.  Also note ActiveRecord's integrated support for validating data before it is saved.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To query the table, one calls a dynamically generated finder method. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'), and then read Bob's e-mail with @user.email. Users can be searched for by any attribute in this way, since finder methods are generated for every combination of attributes.&lt;br /&gt;
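These dynamically created finders rely on Ruby metaprogramming. The following is a purely illustrative, self-contained sketch (the ToyUser class and its in-memory records are hypothetical, not part of ActiveRecord) of how find_by_* methods can be synthesized with method_missing:&lt;br /&gt;

```ruby
# Illustrative sketch only: a toy in-memory "model" showing how
# ActiveRecord-style find_by_* finders can be built on method_missing.
class ToyUser
  attr_reader :name, :email

  def initialize(name, email)
    @name  = name
    @email = email
  end

  # Pretend table: an in-memory array standing in for the users table.
  RECORDS = [
    new('Bob',   'bob@example.com'),
    new('Alice', 'alice@example.com')
  ]

  # find_by_name, find_by_email, ... are never defined explicitly;
  # method_missing parses the method name and filters the records.
  def self.method_missing(name, *args)
    if name.to_s.start_with?('find_by_')
      attribute = name.to_s.sub('find_by_', '')
      RECORDS.find { |r| r.public_send(attribute) == args.first }
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

user = ToyUser.find_by_name('Bob')
```

A real ORM would translate the parsed attribute into a SQL WHERE clause instead of filtering an in-memory array, but the method_missing dispatch is the same idea.&lt;br /&gt;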
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent a class hierarchy, its base scheme is single-table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this choice of simplicity over efficiency.  &lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
&lt;br /&gt;
'''Pros'''&lt;br /&gt;
*Integrated with the popular Rails development framework.&lt;br /&gt;
*Dynamically created database search methods (e.g. User.find_by_address) ease database queries and make them independent of database syntax.&lt;br /&gt;
*Database creation/management using the migration scheme provides a means of backing out unwanted table changes.&lt;br /&gt;
'''Cons'''&lt;br /&gt;
*Database creation/management is decoupled from the model, requiring a separate utility (rake/migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle out of connecting to databases and manipulating them. Sequel deals with the boring stuff like maintaining connections, formatting SQL correctly and fetching records, so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially Sequel had three core modules (sequel, sequel_core and sequel_model); starting from version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plugin and helpers.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUser &amp;lt; Sequel::Migration&lt;br /&gt;
  def up&lt;br /&gt;
    create_table(:user) do&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def down&lt;br /&gt;
    drop_table(:user)&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&lt;br /&gt;
# Apply with: CreateUser.apply(DB, :up)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced for the User table created above. It enforces one-to-many relationships between the user table and the cheers and posts tables, and validates the presence, uniqueness and length of the attribute '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  plugin :validation_helpers&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
&lt;br /&gt;
  def validate&lt;br /&gt;
    super&lt;br /&gt;
    validates_presence [:name, :age]&lt;br /&gt;
    validates_unique :name&lt;br /&gt;
    validates_length_range 3..20, :name&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access data in Sequel, one can use '''where''' clauses. For instance, to find Bob again one would use DB[:users].where(:name =&amp;gt; 'Bob').first. While the ability to use SQL-like where clauses is quite flexible, it is not as purely object-oriented an approach as ActiveRecord's dynamically created find methods.&lt;br /&gt;
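The lazy, chainable behavior of such dataset interfaces can be sketched in plain Ruby. The ToyDataset class below is purely illustrative (it is not Sequel itself): each where call returns a new object, and SQL text is only rendered on demand, mirroring how Sequel datasets stay unevaluated until the query actually runs:&lt;br /&gt;

```ruby
# Illustrative sketch only, not Sequel: a minimal chainable dataset
# that accumulates conditions and renders SQL lazily, on demand.
class ToyDataset
  def initialize(table, conditions = [])
    @table = table
    @conditions = conditions
  end

  # Each where call returns a NEW dataset, so calls chain freely
  # without mutating or executing anything.
  def where(clause)
    ToyDataset.new(@table, @conditions + [clause])
  end

  # SQL is only built here, when the query is actually needed.
  def sql
    return "SELECT * FROM #{@table}" if @conditions.empty?
    "SELECT * FROM #{@table} WHERE " + @conditions.join(' AND ')
  end
end

ds = ToyDataset.new('users').where("name = 'Bob'").where('age = 30')
```

Because chaining never mutates the receiver, partial queries can be stored and reused, which is one of the main attractions of the dataset style.&lt;br /&gt;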
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an object-relational mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature-rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly SQL-based, following Martin Fowler's enterprise Data Mapper pattern. As a result, DataMapper adapters have been built for non-SQL databases such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,    Serial    # primary key&lt;br /&gt;
  property :name,  String, :required =&amp;gt; true, :unique =&amp;gt; true, :length =&amp;gt; 3..20&lt;br /&gt;
  property :age,   String, :required =&amp;gt; true&lt;br /&gt;
  property :email, String&lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, &amp;quot;:required =&amp;gt; true&amp;quot; is an example of auto-validation. Unlike ActiveRecord and Sequel, DataMapper supports auto-validations: property options in turn call the validation helpers to enforce basic validations such as length, uniqueness, format and presence.&lt;br /&gt;
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[http://rubydoc.info/gems/squeel/1.1.1/frames Squeel] unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings.&lt;br /&gt;
As a simple example, Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
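Under the hood, a block-based syntax like this can be built with instance_eval and a method_missing proxy. The sketch below is purely illustrative and is not Squeel's actual implementation; the ColumnProxy and WhereDSL classes are made up for this example:&lt;br /&gt;

```ruby
# Illustrative sketch only, not Squeel: a block-based condition DSL that
# turns bare names into column objects and comparisons into SQL text.
class ColumnProxy
  def initialize(name)
    @name = name
  end

  def >=(value)
    "#{@name} >= #{quote(value)}"
  end

  def ==(value)
    "#{@name} = #{quote(value)}"
  end

  private

  def quote(value)
    value.is_a?(String) ? "'#{value}'" : value.to_s
  end
end

class WhereDSL
  # Any bare identifier inside the block becomes a column reference.
  def method_missing(name, *args)
    ColumnProxy.new(name)
  end

  def respond_to_missing?(name, include_private = false)
    true
  end
end

# Evaluating the block against the DSL object captures the condition.
clause = WhereDSL.new.instance_eval { created_at >= 1000 }
```

Squeel itself builds Arel nodes rather than raw SQL strings, but the block-capturing technique is the same basic idea.&lt;br /&gt;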
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis/iBatis] is a persistence framework which allows easy access to a database from an application without being a full ORM. iBATIS was created in 2002 and later developed under the Apache Software Foundation, and is available for several platforms, including Ruby (the Ruby release is known as RBatis). On 6/16/2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, renamed the project MyBatis, and stopped supporting Ruby.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor. To use the MyBatis data mapper, we make use of our own objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or annotations. The following MyBatis mapper consists of a Java interface with some MyBatis annotations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed directly using the MyBatis API.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file, like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes, provided by the MyBatis schema migration system&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported by scanning the SQL fragments&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes; this can be enabled or disabled by setting the lazyLoadingEnabled flag&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses method_missing&lt;br /&gt;
| No. An alternative is Model.find_or_create(:name =&amp;gt; &amp;quot;John&amp;quot;)&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=NoSQL databases=&lt;br /&gt;
Instead of an ORM, we can use an object-oriented database management system (OODBMS) or a document-oriented database (e.g. XML stores), in which the database is designed to store object-oriented values, eliminating the need to convert data to and from its SQL form. Document-oriented databases also eliminate the need to retrieve objects as data rows; query languages like XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
One issue is that we cannot create application-independent queries for retrieving data without restrictions on the access path, and an OODBMS limits the extent of SQL query processing. &lt;br /&gt;
A relational database allows concurrent access to data, locking and indexing (fast search), as well as many other features that are not available when using XML files. &lt;br /&gt;
&lt;br /&gt;
Other OODBMS (such as RavenDB) provide replication to SQL databases, as a means of addressing the need for ad-hoc queries, while preserving the increased performance and reduced complexity that may be achieved with an OODBMS for an application that has well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
NoSQL databases don't provide a mechanism for maintaining relationships between tables, yet real-life business objects and entities do have relationships among them. ORM solutions may allow you to define these relationships on business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates the Domain Model pattern, which lets you model entities based on real business concepts rather than on the database structure. ORM tools provide this functionality through mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM automatically uses the updated structure for retrievals and updates; there are no SQL Update, Delete and Insert statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object-oriented query language. This allows application developers to focus on the object model rather than on the database structure or SQL semantics; the ORM tool translates the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can navigate object relationships transparently; related objects are loaded automatically as needed. For example, if you load a post and want its comments, you can simply access post.comments and the ORM will load the data for you without any effort on your part.&lt;br /&gt;
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might load a list of objects without any of their child/related objects, while in another you can load an object together with all of its children.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur within the scope of a transaction, and the entire transaction can be either committed or rolled back. Multiple transactions can be active in memory at the same time, with each transaction's changes isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
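The cache-management point above can be illustrated with a toy identity map. This is a pure-Ruby sketch under stated assumptions (the ToyIdentityMap class and its simulated fetch are made up, not taken from any ORM): repeated lookups for the same primary key return the same in-memory object instead of hitting the database again.&lt;br /&gt;

```ruby
# Illustrative sketch only: a toy identity map demonstrating how an
# ORM cache can reduce load on the database.
class ToyIdentityMap
  attr_reader :db_hits

  def initialize
    @cache = {}
    @db_hits = 0
  end

  # Simulated database fetch; a real ORM would issue a SELECT here.
  def fetch_from_db(id)
    @db_hits += 1
    { id: id, name: "user-#{id}" }
  end

  # Repeated finds for the same key reuse the cached object.
  def find(id)
    @cache[id] ||= fetch_from_db(id)
  end
end
```

A side benefit of an identity map is that two finds for the same key return the very same object, so changes made through one reference are visible through the other.&lt;br /&gt;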
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*A high level of abstraction obscures what is actually happening in the code implementation.&lt;br /&gt;
*Loss of developer productivity while they learn to program with the ORM. Developers can lose sight of what the code is actually doing; the developer is more in control when using SQL.&lt;br /&gt;
*Heavy reliance on ORM can also lead to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**Façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also its biggest disadvantage: queries are generated automatically.&lt;br /&gt;
**Queries can’t be optimized.&lt;br /&gt;
**Queries select more data than needed, things get slower, more latency.&lt;br /&gt;
**Compiling queries from ORM code is slow.&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction forbids vendor specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different types of ORMs available for Ruby, please see the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Spring_2013/ch1_1d_zk CSC/ECE 517 Spring 2013/ch1 1d zk] wiki page.&lt;br /&gt;
&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done in comparing the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs. A detailed study of alternatives to ORM, such as NoSQL databases and the document-oriented approach, could also be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia , Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Matching Problem]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://guides.rubyonrails.org/active_record_querying.html  ActiveRecord Query Interface]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/DataMapper_(Ruby)  DataMapper]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;br /&gt;
# [http://blogs.msdn.com/b/gblock/archive/2006/10/26/ten-advantages-of-an-orm.aspx Advantages of ORM]&lt;br /&gt;
# [http://www.codefutures.com/weblog/andygrove/archives/2005/02/data_access_obj.html Disadvantages of ORM]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79856</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79856"/>
		<updated>2013-10-07T21:07:19Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)] is a programming technique which associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back to the relational database when they are updated.&lt;br /&gt;
ORM essentially creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. Both free and commercial packages are available that perform object-relational mapping, although some programmers opt to create their own ORM tools. In this wiki, we concentrate mainly on the top Ruby ORMs: ActiveRecord, Sequel and DataMapper. It also includes a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose from a major problem called the object-relational impedance mismatch.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schemas. In essence, object models and relational models do not line up: an RDBMS represents data in tabular form, while object-oriented languages represent data as an interconnected graph of objects.&lt;br /&gt;
The mismatches occur across the following object-oriented concepts: &lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping a private object representation to database tables is difficult, because fewer constraints apply to the design of a private representation than to public data in an RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts for classes of objects, inheritance and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. However, object-oriented languages such as Java define both object identity (a == b) and object equality (a.equals(b)).&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need a bidirectional relationship in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, you navigate from one association to another, walking the object network. This is not an efficient way of retrieving data from a relational database: you typically want to minimize the number of SQL queries, and thus load several entities via JOINs and select the targeted entities before you start walking the object network.&lt;br /&gt;
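The identity mismatch above can be demonstrated directly in Ruby. The RowUser class below is a made-up example: two objects loaded from the same row compare equal by value (the primary key), yet remain distinct objects in memory:&lt;br /&gt;

```ruby
# Illustrative sketch of the identity mismatch: the database has one
# notion of sameness (the primary key), while Ruby distinguishes
# object identity (equal?) from value equality (==).
class RowUser
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  # Value equality: two objects loaded from the same row compare equal,
  # mirroring the database's primary-key notion of sameness.
  def ==(other)
    other.is_a?(RowUser) and other.id == id
  end
end

a = RowUser.new(1, 'Bob')   # the same row loaded by two
b = RowUser.new(1, 'Bob')   # separate queries
```

An ORM must decide how to reconcile the two notions, for example by keeping an identity map so that one row is only ever materialized as one object.&lt;br /&gt;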
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. For example, consider a geometric object, an entity which can have a shape and a color. In an object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes; shape itself can contain objects, for example for triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by the programming language, and its attributes can be accessed through object methods. However, many popular database products, such as SQL database management systems (SQL DBMSs), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or use only simple scalar values within the program. Object-relational mapping implements the first approach. The key is to translate the logical representation of the objects into an atomized form that can be stored in the database, while preserving the properties of the objects and their relationships so that they can be reloaded as objects whenever required. If this storage and retrieval functionality is implemented, the objects are said to be persistent.&lt;br /&gt;
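The atomize-and-reload idea above can be sketched in a few lines of Ruby. The Point and PointMapper names below are hypothetical, chosen just to illustrate the round trip between an object and a row of scalar values:&lt;br /&gt;

```ruby
# Illustrative sketch: a toy data mapper that flattens an object into a
# row of scalars (as a SQL table could store it) and rebuilds an
# equivalent object from that row later.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

module PointMapper
  # Object to "row": only scalar values remain.
  def self.dump(point)
    { x: point.x, y: point.y }
  end

  # "Row" back to object: the persistence round trip.
  def self.load(row)
    Point.new(row[:x], row[:y])
  end
end

row = PointMapper.dump(Point.new(3, 4))
restored = PointMapper.load(row)
```

A full ORM adds the missing pieces on top of this round trip: SQL generation, associations, caching and transactions.&lt;br /&gt;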
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** Example, :has_many, :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)&lt;br /&gt;
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes database access from the application extremely easy, as it connects the program's models to the database. By treating database rows as objects, the program can access the information in a more consistent manner. It also gives programmers the flexibility to manipulate the data within the language itself, rather than working on raw attributes retrieved from the database.&lt;br /&gt;
&lt;br /&gt;
When applied to Ruby, ORM implementations often leverage the language's metaprogramming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With Rails, the ORM becomes even more important: it connects the models of the MVC (model-view-controller) stack used by Ruby on Rails to the application's database. Since the models are Ruby objects, the ORM allows the database to be modified through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM-independent, making it much easier to use the ORM the programmer prefers rather than being locked into one particular choice. To allow this, the ORM must be decoupled from both the model and the database and act as a pure mediator between the two; any ORM can then be used as long as it understands both the model and the database.&lt;br /&gt;
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to write raw SQL to find database records. It facilitates the creation and use of business objects whose data requires persistent storage in a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other, less popular ORMs have been released since ActiveRecord first took the stage; DataMapper and Sequel, for example, are often cited as improving on the original ActiveRecord framework. Partly in response to their release and adoption by the Rails community, Ruby on Rails 3.0 is independent of any one ORM, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord uses a migration class rather than including the table definition in the class being modeled.  As an example, the following code creates a table ''users'' to store a collection of ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
    # The user_id foreign keys for the has_many associations live in the&lt;br /&gt;
    # cheers and posts tables (t.references :user in their migrations)&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note in the preceding example that ActiveRecord’s migration scheme provides a means for backing out changes via the ''self.down'' method.  In addition, a table key ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.  &lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between table elements and provides the means to define such associations in the model definition.  For example, the ''User'' class shown below defines a ''has_many'' (one-to-many) association with both the cheers and posts tables. These associations are represented in the migration class with the ''references'' calls.  Also note ActiveRecord’s integrated support for validation of table information when attempting an update.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in the table, one calls a dynamically generated finder method for the relevant attribute. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'), and could then read Bob's e-mail as @user.email. Users can be searched for by any attribute in this way, since finder methods are generated for any attribute and even for combinations of attributes (e.g., find_by_name_and_email).&lt;br /&gt;
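These dynamic finders are not defined ahead of time; ActiveRecord synthesizes them when they are first called, via Ruby's method_missing hook. The following plain-Ruby sketch is a toy in-memory table, not ActiveRecord's actual implementation, but it illustrates the mechanism:&lt;br /&gt;

```ruby
# Toy sketch of ActiveRecord-style dynamic finders (illustrative only):
# find_by_* calls are intercepted by method_missing instead of being
# defined ahead of time.
class ToyTable
  def initialize(rows)
    @rows = rows   # each row is a plain Hash standing in for a DB record
  end

  def method_missing(name, *args)
    if name.to_s =~ /\Afind_by_(\w+)\z/
      attribute = Regexp.last_match(1).to_sym
      @rows.find { |row| row[attribute] == args.first }
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

users = ToyTable.new([
  { name: 'Bob',   email: 'bob@example.com' },
  { name: 'Alice', email: 'alice@example.com' }
])
bob = users.find_by_name('Bob')   # returns Bob's row Hash
```

A real implementation would typically also define the method after its first use, so that later calls bypass method_missing entirely.&lt;br /&gt;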
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent a class hierarchy, its base scheme is single-table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this concept of simplicity over efficiency.  &lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
&lt;br /&gt;
'''Pros''' –&lt;br /&gt;
*Integrated with the popular Rails development framework.&lt;br /&gt;
*Dynamically created database search methods (e.g., User.find_by_address) ease DB queries and make them independent of a particular database's syntax.&lt;br /&gt;
*DB creation/management using the migrate scheme provides a means for backing out unwanted table changes.&lt;br /&gt;
'''Cons''' –&lt;br /&gt;
*DB creation/management is decoupled from the model, requiring a separate utility (rake/migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle away from connecting to databases and manipulating them. Sequel deals with all the boring stuff like maintaining connections, formatting SQL correctly and fetching records so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially Sequel had three core modules (sequel, sequel_core and sequel_model); starting from version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plug-in and helpers.&lt;br /&gt;
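The chainable-dataset idea can be sketched in plain Ruby: each query method returns a new dataset object that accumulates clauses, and the SQL text is only rendered on demand. The sketch below is purely illustrative and is not Sequel's actual implementation:&lt;br /&gt;

```ruby
# Toy chainable dataset: every call returns a new dataset that
# accumulates clauses; the SQL text is rendered lazily by #sql.
class ToyDataset
  def initialize(table, wheres = [], order_col = nil)
    @table = table
    @wheres = wheres
    @order_col = order_col
  end

  def where(condition)
    ToyDataset.new(@table, @wheres + [condition], @order_col)
  end

  def order(column)
    ToyDataset.new(@table, @wheres, column)
  end

  def sql
    text = "SELECT * FROM #{@table}"
    text += " WHERE " + @wheres.join(" AND ") unless @wheres.empty?
    text += " ORDER BY #{@order_col}" if @order_col
    text
  end
end

ds = ToyDataset.new(:users).where("age = 30").order(:name)
ds.sql   # "SELECT * FROM users WHERE age = 30 ORDER BY name"
```

Because each call returns a fresh object, partially built queries can be shared and extended without side effects, which is what makes the DSL composable.&lt;br /&gt;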
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUser &amp;lt; Sequel::Migration&lt;br /&gt;
  def up&lt;br /&gt;
    create_table(:user) {&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email}&lt;br /&gt;
  end&lt;br /&gt;
  def down&lt;br /&gt;
    drop_table(:user)&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&lt;br /&gt;
# apply the migration with: CreateUser.apply(DB, :up)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced on the User model created above. It defines one-to-many relationships between the user table and the cheers and posts tables, and validates the presence, uniqueness and length of the attribute '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  plugin :validation_helpers&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
  &lt;br /&gt;
  # validation helpers are instance methods, called from #validate&lt;br /&gt;
  def validate&lt;br /&gt;
    super&lt;br /&gt;
    validates_presence [:name, :age]&lt;br /&gt;
    validates_unique :name&lt;br /&gt;
    validates_length_range 3..20, :name&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in Sequel, one uses datasets with '''where''' clauses. For instance, to find Bob again one would use DB[:users].where(:name =&amp;gt; 'Bob').first. While the ability to use SQL-like where clauses is quite flexible, it is not quite as pure an object-oriented approach as Active Record's dynamically created find methods.&lt;br /&gt;
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
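The benefit of eager loading in the first bullet can be made concrete by counting queries: fetching N parent rows and then each child list one at a time issues N+1 queries, while eager loading issues two. The following plain-Ruby sketch simulates this with a fake query counter (illustrative only; not any real ORM's code):&lt;br /&gt;

```ruby
# A fake database that counts "queries": the lazy loop issues one
# query per post (N+1 total), while the eager version batches all
# children into a single query (2 total).
class FakeDB
  attr_reader :queries

  def initialize(posts, comments)
    @posts = posts
    @comments = comments
    @queries = 0
  end

  def all_posts
    @queries += 1
    @posts
  end

  def comments_for(post_id)          # one query per parent row
    @queries += 1
    @comments.select { |c| c[:post_id] == post_id }
  end

  def comments_for_all(post_ids)     # one batched query for all parents
    @queries += 1
    @comments.select { |c| post_ids.include?(c[:post_id]) }
  end
end

posts    = [{ id: 1 }, { id: 2 }, { id: 3 }]
comments = [{ post_id: 1 }, { post_id: 2 }]

lazy = FakeDB.new(posts, comments)
lazy.all_posts.each { |p| lazy.comments_for(p[:id]) }
lazy.queries    # 4 queries: 1 for posts, then 1 per post

eager = FakeDB.new(posts, comments)
ids = eager.all_posts.map { |p| p[:id] }
eager.comments_for_all(ids)
eager.queries   # 2 queries total
```

With thousands of parent rows the lazy pattern issues thousands of round trips, which is exactly the cost eager loading avoids.&lt;br /&gt;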
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly tied to SQL, based on Martin Fowler's Data Mapper enterprise pattern. As a result, DataMapper adapters have been built for non-SQL databases such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,         Serial    # key&lt;br /&gt;
  property :name,       String, :required =&amp;gt; true, :unique =&amp;gt; true     &lt;br /&gt;
  property :age,        String, :required =&amp;gt; true, :length =&amp;gt; 3..20 &lt;br /&gt;
  property :email,      String &lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, &amp;quot;:required =&amp;gt; true&amp;quot; is an example of auto validation. Unlike ActiveRecord and Sequel, DataMapper supports auto validations: the options declared on each property in turn invoke the validation helpers to enforce basic validations such as length, uniqueness, format and presence.&lt;br /&gt;
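The auto-validation style can be sketched in plain Ruby with a class macro that records per-property rules and checks them on demand. The sketch below (including the ToyValidations and ToyUser names) is a toy illustration, not DataMapper's real machinery:&lt;br /&gt;

```ruby
# Toy auto-validation in the spirit of DataMapper: options declared
# next to each property are recorded and checked by #valid?.
module ToyValidations
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def rules
      @rules ||= {}
    end

    def property(name, opts = {})
      rules[name] = opts       # remember the validation options
      attr_accessor name
    end
  end

  def valid?
    self.class.rules.all? do |name, opts|
      value = send(name)
      if opts[:required]
        next false if value.nil?
      end
      if opts[:length]
        next false unless opts[:length].include?(value.to_s.length)
      end
      true
    end
  end
end

class ToyUser
  include ToyValidations
  property :name,  required: true, length: 3..20
  property :email                  # no options, so always valid
end

u = ToyUser.new
u.name = 'Bob'
u.valid?   # true: name present and within 3..20 characters
```

The key idea is that validation rules live next to the property declaration, so no separate validates_* calls are needed.&lt;br /&gt;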
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[http://rubydoc.info/gems/squeel/1.1.1/frames Squeel] unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing any SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings.&lt;br /&gt;
A simple example is as follows:&lt;br /&gt;
Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis/iBatis] was a persistence framework which allowed easy access to the database from a Rails application without being a full ORM. The project began in 2002 and was later developed under the Apache Software Foundation, and it was available for several platforms, including Ruby (the Ruby release was known as RBatis). On June 16, 2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and stopped supporting Ruby.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor. To use the MyBatis data mapper, we make use of our own objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or annotations. The following is a MyBatis mapper, consisting of a Java interface with some MyBatis annotations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed using the MyBatis API directly:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file, like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes; provided by the MyBatis Schema Migration System&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported via ''includes''/'':include'' on queries (joins vs. preloads are chosen by scanning the SQL fragments)&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes; it can be enabled or disabled by setting the lazyLoadingEnabled flag&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses 'Method Missing'&lt;br /&gt;
| No. An alternative is Model.find_or_create(:name =&amp;gt; &amp;quot;John&amp;quot;)&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=NonSQL databases=&lt;br /&gt;
Instead of an ORM, we can also use an object-oriented database management system (OODBMS) or a document-oriented database (e.g., an XML store), in which the database is designed to store object-oriented values directly, eliminating the need to convert data to and from an SQL form. Document-oriented databases also eliminate the need to retrieve objects as data rows; query languages like XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
One issue is that we cannot create application-independent queries for retrieving data without restricting the access path, and an OODBMS limits the extent to which SQL queries can be processed. &lt;br /&gt;
A relational database allows concurrent access to data, locking, and indexing (fast search), as well as many other features that are not available when using XML files. &lt;br /&gt;
&lt;br /&gt;
Other OODBMS (such as RavenDB) provide replication to SQL databases, as a means of addressing the need for ad-hoc queries, while preserving the increased performance and reduced complexity that may be achieved with an OODBMS for an application that has well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
NonSQL databases don’t provide a mechanism for maintaining relationships between tables. In real life, though, business objects and entities do have relationships among them. ORM solutions allow you to define these relationships in business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates implementing the Domain Model pattern, which lets you model entities based on real business concepts rather than on your database structure. ORM tools provide this functionality through mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM will automatically use the updated structure for retrievals and updates. There are no SQL Update, Delete and Insert statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object oriented query language. This allows application developers to focus on the object model and not to have to be concerned with the database structure or SQL semantics. The ORM tool itself will translate the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can navigate object relationships transparently; related objects are automatically loaded as needed. For example, if you load a post and want to access its comments, you can simply access post.comments and the ORM will take care of loading the data for you without any effort on your part.&lt;br /&gt;
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might want to load a list of objects without any of their child/related objects, while in another you can specify to load an object with all of its children, etc.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur within the scope of a transaction, and the entire transaction can be either committed or rolled back. Multiple transactions can be active in memory at the same time, and each transaction's changes are isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*The high level of abstraction obscures what is actually happening in the code implementation.&lt;br /&gt;
*Some loss in developer productivity while they learn to program with the ORM. Developers may also lose their understanding of what the code is actually doing; a developer is more in control when using SQL directly.&lt;br /&gt;
*Heavy reliance on ORM also leads to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**Façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also the biggest disadvantage: queries are generated automatically -&lt;br /&gt;
**Queries can’t be optimized.&lt;br /&gt;
**Queries select more data than needed, things get slower, more latency.&lt;br /&gt;
**Compiling queries from ORM code adds overhead.&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction forbids vendor specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different types of ORMs available for Ruby, please see the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Spring_2013/ch1_1d_zk CSC/ECE 517 Spring 2013/ch1 1d zk] wiki page.&lt;br /&gt;
&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done in the area of comparison between the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs. Also, a detailed study of alternatives to ORM, such as NoSQL databases and the document-oriented approach, could be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia , Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch Problem]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://ar.rubyonrails.org/  ActiveRecord]&lt;br /&gt;
# [http://guides.rubyonrails.org/active_record_querying.html  ActiveRecord Query Interface]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/DataMapper_(Ruby)  DataMapper]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;br /&gt;
# [http://blogs.msdn.com/b/gblock/archive/2006/10/26/ten-advantages-of-an-orm.aspx Advantages of ORM]&lt;br /&gt;
# [http://www.codefutures.com/weblog/andygrove/archives/2005/02/data_access_obj.html Disadvantages of ORM]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79855</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79855"/>
		<updated>2013-10-07T21:07:00Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)] is a programming technique which associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when they are updated.&lt;br /&gt;
ORM essentially creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. Both free and commercial packages are available that perform object-relational mapping, although some programmers opt to create their own ORM tools. In this wiki, we concentrate mainly on the leading Ruby ORMs: Active Record, Sequel and DataMapper. It also includes a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose from a major problem called the object-relational impedance mismatch.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch Problem] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schemas. It implies that object models do not fit neatly onto relational models: an RDBMS represents data in tabular format, while object-oriented languages represent data as an interconnected graph of objects.&lt;br /&gt;
The mismatches with object-oriented concepts are as follows: &lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping a private object representation to database tables is difficult, because private representations are subject to fewer design constraints than public data in an RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts for classes of objects, inheritance and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. Object-oriented languages, however, often define both object identity (a==b) and object equality (a.equals(b)), as Java does.&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need bidirectional relationships in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, for example, you navigate from one association to another, walking the object network. This is not an efficient way of retrieving data from a relational database: you typically want to minimize the number of SQL queries, and thus load several entities via JOINs and select the targeted entities before you start walking the object network.&lt;br /&gt;
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. For example, consider a geometric object, an entity which can have a shape and a color. In an object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes. The shape itself can contain further objects, such as triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by the programming language, and its attributes can be accessed by various object methods. However, many popular database products, such as structured query language database management systems (SQL DBMS), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or only use simple scalar values within the program. Object-relational mapping is used to implement the first approach. The key is to translate the logical representation of the objects into an atomized form that is capable of being stored in the database, while somehow preserving the properties of the objects and their relationships so that they can be reloaded as an object whenever required. If this storage and retrieval functionality is implemented, the objects are then said to be persistent.&lt;br /&gt;
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** For example, the :has_many and :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)&lt;br /&gt;
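At its core, the mapping in the first two bullets is a two-way translation between instance variables and column values. The following plain-Ruby sketch (with hypothetical RowMapper and PlainUser names, and in-memory row hashes standing in for table rows) shows the essence of that translation:&lt;br /&gt;

```ruby
# Minimal sketch of class-to-table mapping: instance variables become
# column values in a row Hash, and a row Hash becomes an object again.
class RowMapper
  def self.to_row(object)
    object.instance_variables.each_with_object({}) do |ivar, row|
      column = ivar.to_s.delete('@')             # :@name becomes "name"
      row[column] = object.instance_variable_get(ivar)
    end
  end

  def self.from_row(klass, row)
    object = klass.new
    row.each do |column, value|
      object.instance_variable_set("@#{column}", value)
    end
    object
  end
end

class PlainUser
  attr_accessor :name, :email
end

u = PlainUser.new
u.name  = 'Bob'
u.email = 'bob@example.com'
row  = RowMapper.to_row(u)         # a Hash ready to be written as a table row
back = RowMapper.from_row(PlainUser, row)
back.name   # "Bob"
```

A real ORM layers much more on top of this round trip (type coercion, identity maps, dirty tracking), but the object-to-row translation is the foundation.&lt;br /&gt;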
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table:&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes database access through the application extremely easy, as it provides the connection from the program's models to the database. By treating database rows as objects, the program can access information in a more consistent manner. It also gives programmers the flexibility to manipulate the data within the language itself, rather than working directly on attributes retrieved from the database.&lt;br /&gt;
&lt;br /&gt;
When applied to Ruby, ORM implementations often leverage the language’s metaprogramming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With Rails, the ORM becomes even more important, because an ORM is needed to connect the models of the MVC (model-view-controller) stack used by Ruby on Rails with the application's database. Since the models are Ruby objects, the ORM allows modifications to the database to be made through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM-independent, so programmers can use whichever ORM they prefer rather than being locked into one particular choice. To allow this, the ORM must be decoupled from both the model and the database and act as a pure mediator between the two; any ORM can then be used as long as it understands the model and the database in use.&lt;br /&gt;
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to use raw SQL to find database records. Active Record facilitates the creation and use of business objects whose data requires persistent storage to a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived by David Heinemeier Hansson and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other, less widely used ORMs have been released since ActiveRecord first took the stage. DataMapper and Sequel, for example, improve on the original ActiveRecord framework in several respects. Partly in response to their release and adoption by the Rails community, Ruby on Rails v3.0 is independent of any one ORM system, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord makes use of a migration class rather than including the table definition in the actual class being modeled.  As an example, the following code creates a table ''users'' to store a collection of class ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
      t.references :cheer&lt;br /&gt;
      t.references :post&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note in the preceding example that ActiveRecord’s migration scheme provides a means for backing out changes via the ''self.down'' method.  In addition, a primary key ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.  &lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between table elements and provides the means to define such associations in the model definition.  For example, the ''User'' class shown below defines a ''has_many'' (one-to-many) association with both the cheers and posts tables. These associations are represented in the migration class with the ''references'' operator.  Also note ActiveRecord’s integrated support for validating table data when attempting an update.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in the table, one calls a finder method named after the relevant column. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'); Bob's e-mail is then available as @user.email. Users can be searched for by any attribute in this way, since find methods are generated for every combination of attributes.&lt;br /&gt;
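As an illustration of how such finders can arise from Ruby metaprogramming, here is a self-contained sketch. MiniModel and its in-memory record store are invented for illustration; this is not ActiveRecord's actual implementation, which is considerably more sophisticated.

```ruby
# Illustrative sketch only: ActiveRecord-style find_by_* finders built with
# method_missing over an in-memory record store (MiniModel is invented).
class MiniModel
  RECORDS = []  # stand-in for a database table

  def self.create(attrs)
    RECORDS << attrs
    attrs
  end

  def self.method_missing(name, *args)
    if name.to_s.start_with?('find_by_')
      # 'find_by_name_and_age' -> [:name, :age]
      keys = name.to_s.sub('find_by_', '').split('_and_').map(&:to_sym)
      RECORDS.find { |rec| keys.zip(args).all? { |k, v| rec[k] == v } }
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

MiniModel.create(name: 'Bob', email: 'bob@example.com', age: '42')
MiniModel.create(name: 'Ann', email: 'ann@example.com', age: '35')

MiniModel.find_by_name('Bob')[:email]        #=> "bob@example.com"
MiniModel.find_by_name_and_age('Ann', '35')  # matches on both attributes
```

Because the finder names are decoded at call time, any combination of attributes works without defining each method in advance.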
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent a class hierarchy, its base scheme is single-table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this concept of simplicity over efficiency.  &lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
&lt;br /&gt;
'''Pros'''–&lt;br /&gt;
* Integrated with popular Rails development framework.&lt;br /&gt;
*Dynamically created database search methods (e.g., User.find_by_address) ease database queries and make them independent of database syntax.&lt;br /&gt;
*DB creation/management using the migrate scheme provides a means for backing out unwanted table changes.&lt;br /&gt;
'''Cons''' –&lt;br /&gt;
*DB creation/management is decoupled from the model, requiring a separate utility (rake/migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle away from connecting to databases and manipulating them. Sequel deals with all the boring stuff like maintaining connections, formatting SQL correctly and fetching records so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially, Sequel had three core modules: sequel, sequel_core and sequel_model. Starting from version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plug-in and helpers.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUser &amp;lt; Sequel::Migration&lt;br /&gt;
  def up&lt;br /&gt;
    create_table(:user) {&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email}&lt;br /&gt;
  end&lt;br /&gt;
  def down&lt;br /&gt;
    drop_table(:user)&lt;br /&gt;
  end&lt;br /&gt;
end # CreateUser.apply(DB, :up)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced in the User table created above. It enforces one-to-many relationships between the user table and the cheers and posts tables. It also validates the presence, uniqueness and length of the attribute '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence [:name, :age]&lt;br /&gt;
  validates_unique(:name)&lt;br /&gt;
  validates_length_range 3..20, :name&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in Sequel, one can use '''where''' clauses. For instance, to find Bob again one would use DB[:user].where(Sequel.like(:name, 'Bob')). While the ability to use SQL-like where clauses is quite flexible, it is not quite as pure an object-oriented approach as Active Record's dynamically created find methods.&lt;br /&gt;
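The dataset chaining described above can be sketched in plain Ruby. MiniDataset is an invented stand-in, not Sequel's real Dataset class: each where call returns a new dataset, and the SQL text is only assembled when it is actually needed.

```ruby
# Illustrative sketch of Sequel's chainable, lazy datasets (MiniDataset is
# invented): where returns a new dataset; SQL is built only on demand.
class MiniDataset
  def initialize(table, filters = [])
    @table = table
    @filters = filters
  end

  def where(clause)
    MiniDataset.new(@table, @filters + [clause])  # chainable, non-destructive
  end

  def sql
    query = "SELECT * FROM #{@table}"
    query += " WHERE #{@filters.join(' AND ')}" unless @filters.empty?
    query
  end
end

ds = MiniDataset.new(:users).where("name LIKE 'Bob'").where('age > 18')
ds.sql  #=> "SELECT * FROM users WHERE name LIKE 'Bob' AND age > 18"
```

Because each link in the chain returns a fresh dataset, a base query can be shared and refined in several directions without the callers interfering with each other.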
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly tied to SQL, based on Martin Fowler's Data Mapper enterprise pattern. As a result, DataMapper adapters have been built for non-SQL databases such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,         Serial    # key&lt;br /&gt;
  property :name,       String, :required =&amp;gt; true, :unique =&amp;gt; true, :length =&amp;gt; 3..20&lt;br /&gt;
  property :age,        String, :required =&amp;gt; true &lt;br /&gt;
  property :email,      String &lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, &amp;quot;:required =&amp;gt; true&amp;quot; is an example of auto-validation. Unlike ActiveRecord and Sequel, DataMapper supports auto-validations, i.e. the property options in turn call the validation helpers to enforce basic validations such as length, uniqueness, format and presence.&lt;br /&gt;
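The auto-validation idea can be sketched in plain Ruby. MiniResource is invented for illustration; DataMapper's real property and validation machinery is far richer. The point is that declaring a property with options such as :required is enough to generate a matching check.

```ruby
# Sketch of DataMapper-style auto-validation (MiniResource is invented):
# property options such as :required and :length generate matching checks.
class MiniResource
  def self.properties
    @properties ||= {}
  end

  def self.property(name, opts = {})
    properties[name] = opts
    attr_accessor name
  end

  def errors
    errs = []
    self.class.properties.each do |name, opts|
      value = send(name)
      errs << "#{name} is required" if opts[:required] && (value.nil? || value == '')
      if opts[:length] && value && !opts[:length].cover?(value.length)
        errs << "#{name} length must be within #{opts[:length]}"
      end
    end
    errs
  end

  def valid?
    errors.empty?
  end
end

class User < MiniResource
  property :name,  required: true, length: 3..20
  property :email
end

user = User.new
user.valid?       #=> false (name is required)
user.name = 'Bob'
user.valid?       #=> true
```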
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[http://rubydoc.info/gems/squeel/1.1.1/frames Squeel] unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing any SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings.&lt;br /&gt;
A simple example is as follows:&lt;br /&gt;
Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
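A rough idea of how a block like this can be turned into a condition without SQL strings, using instance_eval and method_missing. ConditionDSL, Col and build_condition are invented names; Squeel's real implementation builds Arel nodes and is far more complete.

```ruby
# Sketch of the block-to-condition idea (all names invented): the block is
# instance_eval'd against a DSL object, so bare identifiers become column
# stubs whose comparison operators record an SQL fragment.
class Col
  def initialize(name)
    @name = name
  end

  def >=(other)
    "#{@name} >= '#{other}'"
  end

  def ==(other)
    "#{@name} = '#{other}'"
  end
end

class ConditionDSL
  def method_missing(name, *_args)
    Col.new(name)  # any bare identifier becomes a column reference
  end

  def respond_to_missing?(*)
    true
  end
end

def build_condition(&block)
  ConditionDSL.new.instance_eval(&block)
end

build_condition { name == 'Bob' }              #=> "name = 'Bob'"
build_condition { created_at >= '2013-10-01' } #=> "created_at >= '2013-10-01'"
```

Evaluating the block against a DSL object is what lets column names appear as bare Ruby identifiers rather than strings.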
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis/iBatis] is a persistence framework that allows easy access to the database from a Rails application without being a full ORM. iBATIS originated in 2002 and was later maintained as an Apache project; it is available for several platforms, including Ruby (the Ruby release is known as RBatis). On 6/16/2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and stopped supporting Ruby.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor. To use the MyBatis data mapper, we make use of our own objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or annotations. The following MyBatis mapper consists of a Java interface with a MyBatis annotation:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed directly through the MyBatis API:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes, via the MyBatis schema migration system&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported by scanning the SQL fragments&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes. This can be enabled or disabled by setting the lazyLoadingEnabled flag&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses 'Method Missing'&lt;br /&gt;
| No. Alternative is to use &amp;lt;Model&amp;gt;.FindOrCreate(:name=&amp;gt;&amp;quot;John&amp;quot;)&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=Non-SQL Databases=&lt;br /&gt;
Instead of ORM, we can also use an object-oriented database management system (OODBMS) or a document-oriented database (storing XML, for example), in which the database is designed to store object-oriented values directly, eliminating the need to convert data to and from a relational form. Document-oriented databases also eliminate the need to retrieve objects as data rows, and query languages like XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
The main issue is that we cannot create application-independent queries for retrieving data without restricting the access path, and an OODBMS limits the extent of SQL query processing. &lt;br /&gt;
A relational database allows concurrent access to data, locking and indexing (fast search), as well as many other features that are not available when using XML files. &lt;br /&gt;
&lt;br /&gt;
Some such systems (RavenDB, for example) provide replication to SQL databases as a means of addressing the need for ad-hoc queries, while preserving the increased performance and reduced complexity that may be achieved with an OODBMS for an application that has well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
Non-SQL databases also do not provide a mechanism for maintaining relationships between tables. In real life, though, business objects or entities do have relationships among them. ORM solutions let you define these relationships on business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates implementing the Domain Model pattern, which allows you to model entities based on real business concepts rather than on the database structure. ORM tools provide this functionality through mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM will automatically use the updated structure for retrievals and updates. There are no SQL UPDATE, DELETE and INSERT statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object-oriented query language. This allows application developers to focus on the object model without being concerned with the database structure or SQL semantics. The ORM tool itself translates the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can navigate object relationships transparently. Related objects are automatically loaded as needed. For example, if you load a Post and want to access its comments, you can simply access Post.comments and the ORM will take care of loading the data for you without any effort on your part.&lt;br /&gt;
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might want to load a list of objects without any of their child/related objects, while in another you can specify that an object be loaded together with all its children.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur scoped to a transaction. The entire transaction can either be committed or rolled back. Multiple transactions can be active in memory at the same time, and each transaction's changes are isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
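The navigation point above (related objects loaded on demand) can be sketched in plain Ruby. Post and the fake comment store are invented for illustration; a real ORM would issue a SELECT where the sketch touches its in-memory hash.

```ruby
# Sketch of transparent lazy loading (Post and the fake store are invented):
# the comments association is fetched only on first access, then memoized.
COMMENTS_BY_POST = { 1 => ['Nice post!', 'Thanks for sharing'] }

class Post
  attr_reader :id, :db_hits

  def initialize(id)
    @id = id
    @db_hits = 0  # counts simulated database round trips
  end

  def comments
    @comments ||= begin
      @db_hits += 1  # a real ORM would issue a SELECT here
      COMMENTS_BY_POST.fetch(@id, [])
    end
  end
end

post = Post.new(1)
post.db_hits   #=> 0   (nothing loaded yet)
post.comments  #=> ["Nice post!", "Thanks for sharing"]
post.comments  # second access served from memory
post.db_hits   #=> 1
```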
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*The high level of abstraction obscures what is actually happening in the implementation.&lt;br /&gt;
*Loss in developer productivity while they learn to program with the ORM. Developers can lose sight of what the code is actually doing; the developer is more in control when using SQL directly.&lt;br /&gt;
*Heavy reliance on ORM also leads to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**Façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also the biggest disadvantage: queries are generated automatically -&lt;br /&gt;
**Queries can’t be optimized.&lt;br /&gt;
**Queries select more data than needed, things get slower, more latency.&lt;br /&gt;
**Compiling queries from ORM code adds its own overhead.&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction precludes vendor-specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different types of ORMs available for Ruby, please see the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Spring_2013/ch1_1d_zk CSC/ECE 517 Spring 2013/ch1 1d zk] wiki page.&lt;br /&gt;
&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done on comparing the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs. A detailed study of alternatives to ORM, such as NoSQL databases and the document-oriented approach, could also be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch Problem]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://ar.rubyonrails.org/  ActiveRecord]&lt;br /&gt;
# [http://guides.rubyonrails.org/active_record_querying.html  ActiveRecord Query Interface]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/DataMapper_(Ruby)  DataMapper]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;br /&gt;
# [http://blogs.msdn.com/b/gblock/archive/2006/10/26/ten-advantages-of-an-orm.aspx Advantages of ORM]&lt;br /&gt;
# [http://www.codefutures.com/weblog/andygrove/archives/2005/02/data_access_obj.html Disadvantages of ORM]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79854</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79854"/>
		<updated>2013-10-07T21:06:19Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)] is a programming technique that associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when they are updated.&lt;br /&gt;
ORM essentially creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. Both free and commercial packages are available that perform object-relational mapping, although some programmers opt to create their own ORM tools. In this wiki, we concentrate mainly on the top Ruby ORMs: Active Record, Sequel and DataMapper. It also includes a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose due to a major problem called the Object-Relational Impedance Mismatch.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schemas. It implies that object models do not line up with relational models: an RDBMS represents data in tabular form, while object-oriented languages represent data as an interconnected graph of objects.&lt;br /&gt;
The Mismatches that occur in Object-Oriented concepts are as follows: &lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping private object representations to database tables is difficult because private representations are designed under far fewer constraints than the public data in an RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts for classes of objects, inheritance and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. Object-oriented languages, however, often define both object identity (a == b in Java) and object equality (a.equals(b)).&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need bidirectional relationships in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, for instance, you navigate from one association to another, walking the object network. This is not an efficient way of retrieving data from a relational database: you typically want to minimize the number of SQL queries, and thus load several entities via JOINs and select the targeted entities before you start walking the object network.&lt;br /&gt;
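The identity mismatch above can be illustrated in Ruby terms. The Record class is invented for illustration: the language distinguishes object identity (equal?) from value equality (==), while a database row has only its primary key, so ORMs typically redefine == to compare keys.

```ruby
# Sketch of the identity mismatch (Record is invented): two loaded copies of
# the same row are distinct Ruby objects but the "same" database record.
class Record
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  # database-style sameness: compare primary keys, not object identity
  def ==(other)
    other.is_a?(Record) && other.id == id
  end
end

a = Record.new(1, 'Bob')
b = Record.new(1, 'Bob')   # same row, loaded twice

a.equal?(b)  #=> false  (distinct Ruby objects)
a == b       #=> true   (same primary key)
```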
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. For example, consider a geometric object, an entity which can have a shape and a color. In an object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes. Shape itself can contain further objects, for example triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by the programming language, and its attributes can be accessed by various object methods. However, many popular database products, such as structured query language database management systems (SQL DBMS), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or only use simple scalar values within the program. Object-relational mapping implements the first approach. The key is to translate the logical representation of the objects into an atomized form that can be stored in the database, while preserving the properties of the objects and their relationships so that they can be reloaded as objects whenever required. If this storage and retrieval functionality is implemented, the objects are said to be persistent.&lt;br /&gt;
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** For example, the :has_many and :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)&lt;br /&gt;
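The first feature above, mapping instance variables to columns and instances to rows, can be sketched in plain Ruby. RowMappable is an invented helper, not any real ORM: it converts an object's instance variables to a column/value hash and rebuilds an object from such a hash.

```ruby
# Sketch of the basic mapping feature (RowMappable is invented): instance
# variables become column/value pairs, and a row hash becomes an object.
class RowMappable
  def to_row
    instance_variables.to_h do |ivar|
      [ivar.to_s.delete('@').to_sym, instance_variable_get(ivar)]
    end
  end

  def self.from_row(row)
    obj = new
    row.each { |col, val| obj.instance_variable_set("@#{col}", val) }
    obj
  end
end

class User < RowMappable
  attr_accessor :name, :email
end

u = User.new
u.name  = 'Bob'
u.email = 'bob@example.com'
u.to_row  #=> {name: "Bob", email: "bob@example.com"}

copy = User.from_row(name: 'Ann', email: 'ann@example.com')
copy.name  #=> "Ann"
```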
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table.&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes database access from the application extremely easy, as it provides the connection from the program's model to the database. By treating database rows as objects, the program can access the information in a more consistent manner. It also gives programmers the flexibility to manipulate the data within the language itself, rather than working on raw attributes retrieved from the database.&lt;br /&gt;
&lt;br /&gt;
When applied to Ruby, implementations of ORM often leverage the language’s Meta-programming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With the addition of Rails, the ORM becomes much more important, as it is necessary to have an ORM to connect the models of the MVC (model-view-controller) stack used by Ruby on Rails with the application's database. Since the models are Ruby objects, the ORM allows modifications to the database to be done through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM independent. With this change, it is much easier to use the ORM most preferred by the programmer, rather than being corralled into using one particular one. To allow this, the ORM needs to be extracted from the model and the database, and be a pure mediator between the two. Then any ORM can be used as long as it can successfully understand the model and the database used.&lt;br /&gt;
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to use raw SQL to find database records. Active Record facilitates the creation and use of business objects whose data requires persistent storage in a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other, less popular ORMs have been released since ActiveRecord first took the stage. For example, DataMapper and Sequel offer designs that many developers consider improvements over the original ActiveRecord framework. Partly in response to their adoption by the Rails community, Ruby on Rails v3.0 is independent of any one ORM system, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord makes use of a migration class rather than including the table definition in the actual class being modeled.  As an example, the following code creates a table ''users'' to store a collection of class ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
      t.references :cheer&lt;br /&gt;
      t.references :post&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note in the preceding example that ActiveRecord’s migration scheme provides a means for backing out changes via the ''self.down'' method. In addition, a table key ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between table elements and provides the means to define such associations in the model definition. For example, the ''User'' class shown below defines a ''has_many'' (one-to-many) association with both the cheers and posts tables. These associations are represented in the migration class with the references operator. Also note ActiveRecord’s integrated support for validation of table information when attempting an update.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in the table, one calls a class method related to the column. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'). Then one could find Bob's e-mail with @user.email. Users can be searched for by any attribute in this way, as finder methods are generated dynamically for any combination of attributes.&lt;br /&gt;
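&lt;br /&gt;
Such dynamic finders are commonly implemented with Ruby's method_missing hook. The following is a minimal, hypothetical sketch of that technique; the TinyModel class and its in-memory RECORDS array are invented for illustration, and this is not ActiveRecord's actual implementation:&lt;br /&gt;

```ruby
# Hypothetical in-memory model -- a sketch of the method_missing
# technique behind dynamic finders, NOT ActiveRecord's real code.
class TinyModel
  RECORDS = []  # stand-in for a database table

  def self.create(attrs)
    RECORDS.push(attrs)
    attrs
  end

  # Intercept calls shaped like find_by_attribute and turn them
  # into a lookup on that attribute.
  def self.method_missing(name, *args)
    match = name.to_s.match(/\Afind_by_(\w+)\z/)
    return super unless match
    key = match[1].to_sym
    RECORDS.find { |record| record[key] == args.first }
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end

TinyModel.create(name: "Bob", email: "bob@example.com")
user = TinyModel.find_by_name("Bob")
user[:email]  # "bob@example.com"
```

Any finder name of that shape, such as find_by_email, is routed through method_missing in the same way, which is why no finder has to be defined ahead of time.&lt;br /&gt;
&lt;br /&gt;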
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent class hierarchy, its base scheme is a single table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this concept of simplicity over efficiency.  &lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
&lt;br /&gt;
'''Pros''' –&lt;br /&gt;
* Integrated with the popular Rails development framework.&lt;br /&gt;
* Dynamically created database search methods (e.g., User.find_by_address) ease DB queries and make queries independent of database syntax.&lt;br /&gt;
* DB creation/management using the migrate scheme provides a means for backing out unwanted table changes.&lt;br /&gt;
'''Cons''' –&lt;br /&gt;
* DB creation/management is decoupled from the model, requiring a separate utility (rake/migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle out of connecting to databases and manipulating them. Sequel handles routine work such as maintaining connections, formatting SQL correctly, and fetching records, so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially, Sequel had three core modules: sequel, sequel_core, and sequel_model. Starting with version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plug-in and helpers.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling, and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUser &amp;lt; Sequel::Migration&lt;br /&gt;
  def up&lt;br /&gt;
    create_table(:user) {&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email}&lt;br /&gt;
  end&lt;br /&gt;
  def down&lt;br /&gt;
    drop_table(:user)&lt;br /&gt;
  end&lt;br /&gt;
end # CreateUser.apply(DB, :up)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced in the User table created above. It enforces one-to-many relationships between the user table and the cheers and posts tables. It also validates the presence, uniqueness, and length of the attribute '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence [:name, :age]&lt;br /&gt;
  validates_unique(:name)&lt;br /&gt;
  validates_length_range 3..20, :name&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in Sequel, one can use '''where''' clauses. For instance, to find Bob again one would use DB[:user].where(Sequel.like(:name, 'Bob')). While the ability to use SQL-like where clauses is quite flexible, it is not quite as pure an object-oriented approach as Active Record's dynamically created find methods.&lt;br /&gt;
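&lt;br /&gt;
The chainable, lazy dataset behaviour described above can be sketched in plain Ruby. The ToyDataset class below is a hypothetical stand-in: the real Sequel::Dataset composes SQL rather than filtering arrays, but the shape of the API, where each call returns a new dataset and nothing runs until records are requested, is the same:&lt;br /&gt;

```ruby
# Toy dataset -- a sketch of Sequel's chainable, lazy dataset idea.
# The real Sequel::Dataset builds SQL; this one just filters arrays.
class ToyDataset
  def initialize(rows, filters = [])
    @rows = rows
    @filters = filters
  end

  # Each where call returns a NEW dataset; nothing is evaluated yet.
  def where(conditions)
    ToyDataset.new(@rows, @filters + [conditions])
  end

  # Evaluation happens only when records are actually requested.
  def all
    @filters.reduce(@rows) do |remaining, conditions|
      remaining.select do |row|
        conditions.all? { |key, value| row[key] == value }
      end
    end
  end
end

people = ToyDataset.new([
  { name: "Bob", age: 30 },
  { name: "Ann", age: 30 }
])
query = people.where(age: 30).where(name: "Bob")  # still lazy
query.all  # evaluates now: [{ name: "Bob", age: 30 }]
```

Because each where call returns a fresh object, partially built queries can be stored, passed around, and refined further before any evaluation takes place.&lt;br /&gt;
&lt;br /&gt;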
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly tied to SQL, based on Martin Fowler's Data Mapper enterprise pattern. As a result, DataMapper adapters have been built for non-SQL databases such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,         Serial    # key&lt;br /&gt;
  property :name,       String, :required =&amp;gt; true, :unique =&amp;gt; true     &lt;br /&gt;
  property :age,        String, :required =&amp;gt; true, :length =&amp;gt; 3..20 &lt;br /&gt;
  property :email,      String &lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, &amp;quot;:required =&amp;gt; true&amp;quot; is an example of auto-validation. Unlike ActiveRecord and Sequel, DataMapper supports auto-validations; these in turn call the validation helpers to enforce basic validations such as length, uniqueness, format, presence, etc.&lt;br /&gt;
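&lt;br /&gt;
The auto-validation idea, where a single property declaration also registers the checks for that property, can be sketched in plain Ruby. TinyResource and Person below are invented for illustration; this is not DataMapper's actual code:&lt;br /&gt;

```ruby
# Sketch of DataMapper-style auto-validation: declaring a property
# with required: true also registers a presence check. Hypothetical.
module TinyResource
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def properties
      @properties ||= {}
    end

    # One declaration records both the attribute and its validations.
    def property(name, options = {})
      properties[name] = options
      attr_accessor name
    end
  end

  # Auto-validation: every required property must be non-nil.
  def valid?
    self.class.properties.all? do |name, options|
      value = public_send(name)
      options[:required] ? !value.nil? : true
    end
  end
end

class Person
  include TinyResource
  property :name,  required: true
  property :email
end

bob = Person.new
bob.name = "Bob"
bob.valid?  # true; a Person with no name would be invalid
```

The point of the pattern is that validation rules live next to the property they protect, so model definitions stay in one place.&lt;br /&gt;
&lt;br /&gt;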
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[http://rubydoc.info/gems/squeel/1.1.1/frames Squeel] unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing any SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings.&lt;br /&gt;
A simple example is as follows:&lt;br /&gt;
Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis/iBatis] is a persistence framework which allows easy access to the database from an application without being a full ORM. It originated in 2002 and was developed under the Apache Foundation, with ports for several platforms, including Ruby (the Ruby release is known as RBatis). On 6/16/2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and stopped supporting Ruby.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor. To use the MyBatis data mapper, we make use of our own objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or annotations. The following is a MyBatis mapper, consisting of a Java interface with some MyBatis annotations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed using the MyBatis API:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file, like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes; migrations are provided by the MyBatis schema migration system&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported by scanning the SQL fragments&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes. Lazy versus eager loading can be toggled with the lazyLoadingEnabled setting&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses 'Method Missing'&lt;br /&gt;
| No. An alternative is to use &amp;lt;Model&amp;gt;.find_or_create(:name =&amp;gt; 'John')&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=NonSQL databases=&lt;br /&gt;
Instead of ORM, we can also use an Object-Oriented Database Management System (OODBMS) or a document-oriented database such as XML, in which databases are designed to store object-oriented values, thus eliminating the need to convert data to and from its SQL form. Document-oriented databases also eliminate the need to retrieve objects as data rows. Query languages like XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
The main issue is that we won't be able to create application-independent queries for retrieving data without restrictions on the access path. An OODBMS also limits the extent of SQL query processing.&lt;br /&gt;
A relational database allows concurrent access to data, locking, and indexing (fast search), as well as many other features that are not available when using XML files.&lt;br /&gt;
&lt;br /&gt;
Other OODBMS (such as RavenDB) provide replication to SQL databases, as a means of addressing the need for ad-hoc queries, while preserving the increased performance and reduced complexity that may be achieved with an OODBMS for an application that has well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
NonSQL databases don’t provide a mechanism to maintain relationships between tables. In real life, though, business objects or entities do have relationships among them. ORM solutions may allow you to define these relationships in business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates implementing the Domain Model pattern, which allows you to model entities based on real business concepts rather than on the database structure. ORM tools provide this functionality through mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM will automatically use the updated structure for retrievals and updates. There are no SQL Update, Delete, and Insert statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object-oriented query language. This allows application developers to focus on the object model without being concerned with the database structure or SQL semantics. The ORM tool itself will translate the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can navigate object relationships transparently. Related objects are automatically loaded as needed. For example, if you load a Post and you want to access its comments, you can simply access Post.Comment and the ORM will take care of loading the data for you without any effort on your part.&lt;br /&gt;
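&lt;br /&gt;
This transparent navigation can be sketched as lazy, memoized association loading. The Post class and its COMMENTS stand-in table below are hypothetical; a real ORM would issue a database query on first access instead:&lt;br /&gt;

```ruby
# Sketch of transparent association navigation: related records are
# fetched lazily, on first access. Hypothetical, not a real ORM.
class Post
  # Stand-in "comments table", indexed by post id.
  COMMENTS = [nil, ["Nice post!", "Agreed."], []]

  def initialize(id)
    @id = id
  end

  # ORM-style accessor: the "query" runs at most once per object,
  # and only if the comments are actually asked for.
  def comments
    @comments ||= COMMENTS[@id] || []
  end
end

post = Post.new(1)
post.comments  # ["Nice post!", "Agreed."]
```

The caller simply navigates from the post to its comments; whether and when the underlying data is fetched is the ORM's concern, not the application's.&lt;br /&gt;
&lt;br /&gt;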
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might want to load a list of objects without any of their child/related objects, while in other scenarios you can specify loading an object with all of its children, etc.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur scoped to a transaction. The entire transaction can either be committed or rolled back. Multiple transactions can be active in memory at the same time, and each transaction's changes are isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*The high level of abstraction obscures what is actually happening in the implementation code.&lt;br /&gt;
*Loss in developer productivity while they learn to program with ORM. Developers can lose sight of what the code is actually doing; the developer is more in control using SQL.&lt;br /&gt;
*Heavy reliance on ORM can also lead to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**Façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also its biggest disadvantage: queries are generated automatically -&lt;br /&gt;
**Queries can’t always be hand-optimized.&lt;br /&gt;
**Queries may select more data than needed, so things get slower and latency increases.&lt;br /&gt;
**Compiling queries from ORM code can be slow in some implementations.&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction forbids vendor-specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different types of ORMs available for Ruby, please look at the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Spring_2013/ch1_1d_zk CSC/ECE 517 Spring 2013/ch1 1d zk] wiki page.&lt;br /&gt;
&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done in the area of comparison between the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs. Also, a detailed study of alternatives to ORM, like NoSQL databases and the document-oriented approach, could be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia, Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch Problem]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://ar.rubyonrails.org/ ActiveRecord]&lt;br /&gt;
# [http://guides.rubyonrails.org/active_record_querying.html ActiveRecord Query Interface]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/DataMapper_(Ruby) DataMapper]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;br /&gt;
# [http://blogs.msdn.com/b/gblock/archive/2006/10/26/ten-advantages-of-an-orm.aspx Advantages of ORM]&lt;br /&gt;
# [http://www.codefutures.com/weblog/andygrove/archives/2005/02/data_access_obj.html Disadvantages of ORM]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79852</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79852"/>
		<updated>2013-10-07T21:00:51Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Further Reading */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)] is a programming technique which associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when updated.&lt;br /&gt;
ORM basically creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. There are both free and commercial packages available that perform object-relational mapping, although some programmers opt to create their own ORM tools. In this wiki, we concentrate mainly on the top Ruby ORMs: Active Record, Sequel, and DataMapper. It also includes a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose due to a major problem called the Object-Relational Impedance Mismatch Problem.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch Problem] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schemas. It implies that object models don't fit neatly onto relational models; i.e., an RDBMS represents data in tabular format, while object-oriented languages represent data as an interconnected graph of objects.&lt;br /&gt;
The mismatches that occur with object-oriented concepts are as follows:&lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping private object representations to database tables is difficult, because there are fewer constraints on the design of a private representation than on public data in an RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts such as classes of objects, inheritance, and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. However, object-oriented languages, for example Java, define both object identity (a==b) and object equality (a.equals(b)).&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need bidirectional relationships in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, you navigate from one association to another, walking the object network. This is not an efficient way of retrieving data from a relational database: you typically want to minimize the number of SQL queries, and thus load several entities via JOINs and select the targeted entities before you start walking the object network.&lt;br /&gt;
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. For example, consider a geometric object, an entity which can have a shape and a color. In the object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes. Shape itself can contain objects, for example triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by a programming language, and its attributes can be accessed through object methods. However, many popular database products, such as structured query language database management systems (SQL DBMS), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or only use simple scalar values within the program. Object-relational mapping is used to implement the first approach. The key is to translate the logical representation of the objects into an atomized form that is capable of being stored in the database, while somehow preserving the properties of the objects and their relationships so that they can be reloaded as objects whenever required. If this storage and retrieval functionality is implemented, the objects are then said to be persistent.&lt;br /&gt;
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** For example, the :has_many and :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)&lt;br /&gt;
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes access to the database through the application extremely easy, as it provides a connection from the program's model to the database. By treating database rows as objects, a program can access the information in a more consistent manner. It also gives programmers the flexibility to manipulate the data within the language itself, rather than working on raw attributes retrieved from the database.&lt;br /&gt;
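&lt;br /&gt;
The row-to-object mapping in the diagram above can be sketched in a few lines of Ruby. RowMapper below is a hypothetical, in-memory illustration of the idea (one class per table, one instance per row, one attribute per column), not any real ORM's API:&lt;br /&gt;

```ruby
# Minimal sketch of the core ORM mapping: ordered row values map to
# named object attributes and back. In-memory illustration only.
class RowMapper
  def initialize(columns)
    @columns = columns
  end

  # Raw row (ordered column values) to an attribute Hash "object".
  def to_object(row)
    @columns.zip(row).to_h
  end

  # Object back to the ordered row the table expects.
  def to_row(object)
    @columns.map { |column| object[column] }
  end
end

users = RowMapper.new([:id, :name, :email])
obj = users.to_object([1, "Bob", "bob@example.com"])
obj[:name]         # "Bob"
users.to_row(obj)  # [1, "Bob", "bob@example.com"]
```

A real ORM layers much more on top of this round trip, such as type coercion, associations, and change tracking, but the translation between rows and objects is the heart of it.&lt;br /&gt;
&lt;br /&gt;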
&lt;br /&gt;
When applied to Ruby, implementations of ORM often leverage the language’s metaprogramming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With the addition of Rails, the ORM becomes much more important, as an ORM is needed to connect the models of the MVC (model-view-controller) stack used by Ruby on Rails with the application's database. Since the models are Ruby objects, the ORM allows modifications to the database to be made through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM independent. With this change, it is much easier to use the ORM the programmer prefers, rather than being corralled into one in particular. To allow this, the ORM needs to be extracted from the model and the database, and act as a pure mediator between the two. Then any ORM can be used, as long as it can successfully understand the model and the database used.&lt;br /&gt;
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to use raw SQL to find database records. Active Record facilitates the creation and use of business objects whose data requires persistent storage to a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other, less popular ORMs have been released since ActiveRecord first took the stage. For example, DataMapper and Sequel each claim improvements over the original ActiveRecord framework. Partly in response to their release and adoption by the Rails community, Ruby on Rails v3.0 is independent of any one ORM system, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord makes use of a migration class rather than including the table definition in the actual class being modeled. As an example, the following code creates a table ''users'' to store a collection of class ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
      t.references :cheer&lt;br /&gt;
      t.references :post&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note in the preceding example that ActiveRecord’s migration scheme provides a means for backing out changes via the ''self.down'' method.  In addition, a table key ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between table elements and provides the means to define such associations in the model definition.  For example, the ''User'' class shown below defines a ''has_many'' (one-to-many) association with both the cheers and posts tables. These associations are represented in the migration class with the ''references'' operator.  Also note ActiveRecord’s integrated support for validating table information when attempting an update.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in the table, one calls a dynamically generated finder method named after the column. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'), and then read Bob's e-mail with @user.email. Users can be searched for by any attribute in this way, since finder methods are also available for combinations of attributes (e.g. find_by_name_and_age).&lt;br /&gt;
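These dynamic finders are not defined ahead of time; ActiveRecord intercepts unknown method calls and builds the query from the method name. The following plain-Ruby sketch illustrates the idea with a hypothetical TinyModel class backed by an in-memory array; it is an illustration of the technique, not ActiveRecord's actual implementation.&lt;br /&gt;

```ruby
# Simplified sketch of ActiveRecord-style dynamic finders built with
# method_missing. TinyModel and its in-memory "table" are hypothetical.
class TinyModel
  def self.records
    @records ||= []
  end

  def self.create(attrs)
    records << attrs
    attrs
  end

  # Intercept calls such as find_by_name or find_by_name_and_age and
  # turn the method name into an attribute lookup.
  def self.method_missing(name, *args)
    if name.to_s.start_with?('find_by_')
      keys = name.to_s.sub('find_by_', '').split('_and_').map(&:to_sym)
      records.find { |r| keys.zip(args).all? { |k, v| r[k] == v } }
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

class User < TinyModel; end

User.create(name: 'Bob', email: 'bob@example.com', age: '42')
User.create(name: 'Ann', email: 'ann@example.com', age: '35')

puts User.find_by_name('Bob')[:email]                # => bob@example.com
puts User.find_by_name_and_age('Ann', '35')[:name]   # => Ann
```

Because the finders are synthesized on demand, any new attribute automatically gains find_by_* support without extra code.&lt;br /&gt;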
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent a class hierarchy, its base scheme is single-table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this preference for simplicity over efficiency.  &lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
&lt;br /&gt;
'''Pros:'''&lt;br /&gt;
*Integrated with the popular Rails development framework.&lt;br /&gt;
*Dynamically created database search methods (e.g. User.find_by_address) ease DB queries and make them independent of database syntax.&lt;br /&gt;
*DB creation/management using the migration scheme provides a means for backing out unwanted table changes.&lt;br /&gt;
'''Cons:'''&lt;br /&gt;
*DB creation/management is decoupled from the model, requiring a separate utility (rake/migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle out of connecting to databases and manipulating them. Sequel handles chores such as maintaining connections, formatting SQL correctly and fetching records, so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chaining, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially Sequel had three core modules: sequel, sequel_core and sequel_model. Starting from version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plug-in and helpers.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUser &amp;lt; Sequel::Migration&lt;br /&gt;
  def up&lt;br /&gt;
    create_table(:user) {&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email}&lt;br /&gt;
  end&lt;br /&gt;
  def down&lt;br /&gt;
    drop_table(:user)&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
# Run with: CreateUser.apply(DB, :up)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced in the User table created above. It enforces one-to-many relationships between the user table and the cheers and posts tables. It also validates the presence of '''name''' and '''age''', and the uniqueness and length of '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence [:name, :age]&lt;br /&gt;
  validates_unique(:name)&lt;br /&gt;
  validates_length_range 3..20, :name&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access data in Sequel, one can use '''where''' clauses. For instance, to find Bob again one would use DB[:users].where(:name =&amp;gt; 'Bob'). While the ability to use SQL-like where clauses is quite flexible, it is not quite as purely object-oriented an approach as ActiveRecord's dynamically created find methods.&lt;br /&gt;
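The dataset chaining described above can be mimicked in a few lines of plain Ruby. The hypothetical TinyDataset below is only a sketch of the idea behind Sequel's Dataset (each chained call returns a new query object, and nothing is evaluated until the rows are enumerated); it is not Sequel's real API.&lt;br /&gt;

```ruby
# Plain-Ruby sketch of a chainable, lazily evaluated dataset.
class TinyDataset
  include Enumerable

  def initialize(rows, filters = [])
    @rows = rows
    @filters = filters
  end

  # Chainable: returns a NEW dataset with one more filter, leaving the
  # receiver untouched (like Sequel's where).
  def where(conditions)
    filter = ->(row) { conditions.all? { |k, v| row[k] == v } }
    TinyDataset.new(@rows, @filters + [filter])
  end

  # Evaluation happens only here, when the rows are enumerated.
  def each(&block)
    @rows.select { |r| @filters.all? { |f| f.call(r) } }.each(&block)
  end
end

users = TinyDataset.new([
  { name: 'Bob', age: '42' },
  { name: 'Ann', age: '35' },
  { name: 'Bob', age: '17' }
])

query = users.where(name: 'Bob').where(age: '42')  # nothing evaluated yet
puts query.map { |u| u[:age] }.inspect             # => ["42"]
```

Each intermediate dataset is reusable: the same base query can be refined in different directions without re-issuing or copying work, which is what makes the chainable DSL both concise and efficient.&lt;br /&gt;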
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly SQL-based, following Martin Fowler's enterprise Data Mapper pattern. As a result, DataMapper adapters have been built for non-SQL databases such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,         Serial    # key&lt;br /&gt;
  property :name,       String, :required =&amp;gt; true, :unique =&amp;gt; true, :length =&amp;gt; 3..20&lt;br /&gt;
  property :age,        String, :required =&amp;gt; true&lt;br /&gt;
  property :email,      String &lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, &amp;quot;:required =&amp;gt; true&amp;quot; is an example of auto-validation. Unlike ActiveRecord and Sequel, DataMapper supports auto-validations: the property options in turn invoke the validation helpers to enforce basic validations such as length, uniqueness, format, presence, etc.&lt;br /&gt;
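The auto-validation idea can be sketched in plain Ruby: declaring a property with options also registers the matching validation. The hypothetical TinyResource below mirrors DataMapper's behaviour in spirit only; the real library does considerably more.&lt;br /&gt;

```ruby
# Sketch of auto-validation: property options register validations.
class TinyResource
  def self.validations
    @validations ||= []
  end

  # Declaring a property creates the accessor AND, based on the options,
  # registers the matching validation checks.
  def self.property(name, opts = {})
    attr_accessor name
    if opts[:required]
      validations << ->(obj) { !obj.send(name).nil? && obj.send(name) != '' }
    end
    if opts[:length]
      validations << ->(obj) { opts[:length].cover?(obj.send(name).to_s.length) }
    end
  end

  def valid?
    self.class.validations.all? { |v| v.call(self) }
  end
end

class User < TinyResource
  property :name, required: true, length: 3..20
  property :email
end

u = User.new
u.name = 'Bob'
puts u.valid?   # => true
u.name = 'Al'   # too short for 3..20
puts u.valid?   # => false
```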
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[http://rubydoc.info/gems/squeel/1.1.1/frames Squeel] unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing any SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings.&lt;br /&gt;
A simple example is as follows:&lt;br /&gt;
Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis/iBatis] is a persistence framework that allows easy access to the database from an application without being a full ORM. iBATIS was created in 2002 and developed under the Apache Software Foundation, and is available for several platforms, including Ruby (the Ruby release is known as RBatis). In June 2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and stopped supporting Ruby.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor. To use the MyBatis data mapper, we make use of our own objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or annotations. The following MyBatis mapper consists of a Java interface with a MyBatis annotation:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed using the MyBatis API:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file like this.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes, provided by the MyBatis Schema Migration System&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported by scanning the SQL fragments&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes; can be enabled or disabled by setting the lazyLoadingEnabled flag&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses 'Method Missing'&lt;br /&gt;
| No. An alternative is to use &amp;lt;Model&amp;gt;.find_or_create(:name =&amp;gt; &amp;quot;John&amp;quot;)&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=Non-SQL databases=&lt;br /&gt;
Instead of an ORM, we can also use an object-oriented database management system (OODBMS) or a document-oriented database (e.g. XML), in which the database is designed to store object-oriented values directly, eliminating the need to convert data to and from an SQL form. Document-oriented databases also eliminate the need to retrieve objects as data rows; query languages like XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
One issue is that we cannot create application-independent queries for retrieving data without restricting the access path, and an OODBMS limits the extent of SQL query processing. &lt;br /&gt;
A relational database allows concurrent access to data, locking and indexing (fast search), as well as many other features that are not available when using XML files. &lt;br /&gt;
&lt;br /&gt;
Some OODBMSs (such as RavenDB) provide replication to SQL databases as a means of addressing the need for ad-hoc queries, while preserving the increased performance and reduced complexity that may be achieved with an OODBMS for an application that has well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
Non-SQL databases don’t provide a mechanism for maintaining relationships between tables. In real life, though, business objects or entities do have relationships among them. ORM solutions let you define these relationships between business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates implementing the Domain Model pattern, which lets you model entities based on real business concepts rather than on your database structure. ORM tools provide this functionality through mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM will automatically use the updated structure for retrievals and updates. There are no SQL Update, Delete and Insert statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object oriented query language. This allows application developers to focus on the object model and not to have to be concerned with the database structure or SQL semantics. The ORM tool itself will translate the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can navigate object relationships transparently; related objects are automatically loaded as needed. For example, if you load a Post and you want to access its Comments, you can simply access Post.Comments and the ORM will take care of loading the data for you without any effort on your part.&lt;br /&gt;
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might load a list of objects without any of their child/related objects, while in another you can specify loading an object with all its children.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur scoped to a transaction; the entire transaction can either be committed or rolled back. Multiple transactions can be active in memory at the same time, and each transaction's changes are isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*The high level of abstraction obscures what is actually happening in the code.&lt;br /&gt;
*Developer productivity drops while developers learn to program with the ORM, and they can lose sight of what the code is actually doing; a developer is more in control when writing SQL directly.&lt;br /&gt;
*Heavy reliance on ORM also leads to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**Façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also its biggest disadvantage: queries are generated automatically.&lt;br /&gt;
**Queries can’t be optimized.&lt;br /&gt;
**Queries select more data than needed, things get slower, more latency.&lt;br /&gt;
**Compiling queries from ORM code into SQL adds overhead.&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction precludes vendor-specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different types of ORMs available for Ruby, please see the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Spring_2013/ch1_1d_zk CSC/ECE 517 Spring 2013/ch1 1d zk] wiki page.&lt;br /&gt;
&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done on comparing the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs. A detailed study of alternatives to ORM, such as NoSQL databases and the document-oriented approach, could also be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia, Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Matching Problem]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79850</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79850"/>
		<updated>2013-10-07T21:00:09Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Further Reading */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)], or ORM, is a programming technique which associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when updated.&lt;br /&gt;
ORM essentially creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. There are both free and commercial packages available that perform object-relational mapping, although some programmers opt to create their own ORM tools. In this wiki, we concentrate mainly on the top Ruby ORMs: Active Record, Sequel and DataMapper. It also includes a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose from a major problem called the object-relational impedance mismatch.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch object-relational impedance mismatch] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schemas. It implies that object models do not fit neatly onto relational models: an RDBMS represents data in tabular form, while object-oriented languages represent data as an interconnected graph of objects.&lt;br /&gt;
The mismatches arise in the following object-oriented concepts: &lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping of private object representation to database tables is difficult due to availability of fewer constraints for the design of private representation as opposed to public data in RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts for classes of objects, inheritance and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. Object-oriented languages, however, often define both object identity and object equality; Java, for example, distinguishes a == b from a.equals(b).&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need bidirectional relationships in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, for example, you navigate from one association to another, walking the object network. This is not an efficient way of retrieving data from a relational database: you typically want to minimize the number of SQL queries, so you load several entities via JOINs and select the targeted entities before you start walking the object network.&lt;br /&gt;
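The cost difference can be made concrete with a toy query counter. The hypothetical FakeDB below simulates the two access patterns: walking the object network issues one query per object (the classic N+1 problem), while a JOIN-style batched fetch issues a single query.&lt;br /&gt;

```ruby
# Toy illustration of N+1 vs. batched loading; FakeDB is hypothetical
# and simply counts the queries each access pattern would issue.
class FakeDB
  attr_reader :query_count

  def initialize(comments_by_post)
    @comments_by_post = comments_by_post
    @query_count = 0
  end

  # One query per call, like lazily walking the object graph.
  def comments_for(post_id)
    @query_count += 1
    @comments_by_post.fetch(post_id, [])
  end

  # One query for all posts at once, like a JOIN / eager load.
  def comments_for_all(post_ids)
    @query_count += 1
    post_ids.flat_map { |id| @comments_by_post.fetch(id, []) }
  end
end

data = { 1 => ['nice'], 2 => ['+1', 'agreed'], 3 => [] }

lazy = FakeDB.new(data)
[1, 2, 3].each { |id| lazy.comments_for(id) }   # N separate queries
puts lazy.query_count                           # => 3

eager = FakeDB.new(data)
eager.comments_for_all([1, 2, 3])               # one batched query
puts eager.query_count                          # => 1
```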
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. For example, consider a geometric object, an entity which has a shape and a color. In an object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes; the shape itself may contain further objects, such as triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by the programming language, and its attributes can be accessed through object methods. However, many popular database products, such as SQL database management systems (SQL DBMSs), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or use only simple scalar values within the program. Object-relational mapping implements the first approach. The key is to translate the logical representation of the objects into an atomized form that can be stored in the database, while preserving the properties of the objects and their relationships so that they can be reloaded as objects whenever required. Once this storage and retrieval functionality is implemented, the objects are said to be persistent.&lt;br /&gt;
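A minimal plain-Ruby sketch of this flattening, with illustrative names only: the mapper converts an object into a row of scalar values for storage and rebuilds the object on retrieval.&lt;br /&gt;

```ruby
# Sketch of object <-> row mapping: non-scalar state is flattened into
# scalar columns for storage and reconstructed on load.
GeometricObject = Struct.new(:shape, :color)

module GeometryMapper
  # Flatten the object into a row of scalar values.
  def self.to_row(obj)
    { 'shape' => obj.shape.to_s, 'color' => obj.color.to_s }
  end

  # Rebuild the object from a stored row.
  def self.from_row(row)
    GeometricObject.new(row['shape'], row['color'])
  end
end

table = []  # stands in for a database table of scalar rows

triangle = GeometricObject.new('triangle', 'red')
table << GeometryMapper.to_row(triangle)        # "INSERT"

loaded = GeometryMapper.from_row(table.first)   # "SELECT" and rebuild
puts loaded.shape         # => triangle
puts loaded == triangle   # => true (Structs compare by value)
```

If the round trip preserves the object's state, the object is persistent in exactly the sense described above.&lt;br /&gt;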
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** Example, :has_many, :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)&lt;br /&gt;
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table:&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes database access through the application extremely easy, as it provides the connection from the program's model to the database. By treating database rows as objects, a program can access the information in a more consistent manner. It also gives programmers the flexibility to manipulate the data within the language itself, rather than working on raw attributes retrieved from the database.&lt;br /&gt;
&lt;br /&gt;
When applied to Ruby, ORM implementations often leverage the language’s metaprogramming strengths to create intuitive, application-specific methods and otherwise extend classes to support database functionality. With Rails, the ORM becomes even more important: it connects the models of the MVC (model-view-controller) stack used by Ruby on Rails to the application's database. Since the models are Ruby objects, the ORM allows the database to be modified through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM-independent, making it much easier to use whichever ORM the programmer prefers rather than being locked into one particular choice. To allow this, the ORM must be decoupled from both the model and the database, acting as a pure mediator between the two; any ORM can then be used as long as it understands both the model and the database.&lt;br /&gt;
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to use raw SQL to find database records. ActiveRecord facilitates the creation and use of business objects whose data requires persistent storage in a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other, less popular ORMs have been released since ActiveRecord first took the stage. For example, DataMapper and Sequel offer improvements over the original ActiveRecord framework. In response to their release and adoption by the Rails community, Ruby on Rails v3.0 is independent of any one ORM system, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord uses a migration class rather than including the table definition in the actual class being modeled.  As an example, the following code creates a table ''users'' to store a collection of class ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
      t.references :cheer&lt;br /&gt;
      t.references :post&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that in the preceding example, ActiveRecord’s migration scheme provides a means of backing out changes via the ''self.down'' method.  In addition, a primary key ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.  &lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between tables and provides the means to define such associations in the model definition.  For example, the ''User'' class shown below defines a ''has_many'' (one-to-many) association with both the cheers and posts tables. These associations are represented in the migration class with the ''references'' method.  Also note ActiveRecord’s integrated support for validating table data when attempting an update.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in the table, one calls the dynamically generated finder method for the relevant column. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'), and then read Bob's e-mail with @user.email. Users can be searched for by any attribute in this way, since finder methods are generated for every combination of attributes.&lt;br /&gt;
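These dynamic finders are typically implemented with Ruby's method_missing hook (the comparison table below notes that ActiveRecord uses 'Method Missing'): calls such as find_by_name are intercepted, parsed, and turned into lookups. The following toy sketch shows the core idea; TinyModel and its in-memory ROWS are hypothetical illustrations, not ActiveRecord's actual implementation.&lt;br /&gt;

```ruby
# Toy illustration of ActiveRecord-style dynamic finders.
# TinyModel is a hypothetical stand-in, not ActiveRecord's real code.
class TinyModel
  ROWS = [
    { name: 'Bob',   email: 'bob@example.com',   age: '42' },
    { name: 'Alice', email: 'alice@example.com', age: '37' }
  ]

  # Respond to find_by_ATTRIBUTE calls that were never defined.
  def self.method_missing(name, *args)
    if name.to_s.start_with?('find_by_')
      attribute = name.to_s.sub('find_by_', '').to_sym
      ROWS.find { |row| row[attribute] == args.first }
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

user = TinyModel.find_by_name('Bob')
```

Here TinyModel.find_by_name('Bob') returns Bob's row even though no such method was ever defined.&lt;br /&gt;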
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent a class hierarchy, its base scheme is single-table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this choice of simplicity over efficiency.  &lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
&lt;br /&gt;
'''Pros''' -&lt;br /&gt;
*Integrated with the popular Rails development framework.&lt;br /&gt;
*Dynamically created finder methods (e.g., User.find_by_address) ease database queries and make them independent of database syntax.&lt;br /&gt;
*Database creation/management via the migration scheme provides a means of backing out unwanted table changes.&lt;br /&gt;
'''Cons''' -&lt;br /&gt;
*Database creation/management is decoupled from the model, requiring a separate utility (rake db:migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle out of connecting to databases and manipulating them. Sequel deals with chores such as maintaining connections, formatting SQL correctly and fetching records, so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chaining, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially Sequel had three core modules - sequel, sequel_core and sequel_model; starting from version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plugin and helpers.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUser &amp;lt; Sequel::Migration&lt;br /&gt;
  def up&lt;br /&gt;
    create_table(:user) {&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email}&lt;br /&gt;
  end&lt;br /&gt;
  def down&lt;br /&gt;
    drop_table(:user)&lt;br /&gt;
  end&lt;br /&gt;
end # CreateUser.apply(DB, :up)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced on the User table created above. It enforces one-to-many relationships between the user table and the cheers and posts tables, validates the presence of the attributes '''name''' and '''age''', and validates the uniqueness and length of '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  plugin :validation_helpers&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
  &lt;br /&gt;
  # Sequel's validation helpers are instance methods, called from #validate&lt;br /&gt;
  def validate&lt;br /&gt;
    super&lt;br /&gt;
    validates_presence [:name, :age]&lt;br /&gt;
    validates_unique :name&lt;br /&gt;
    validates_length_range 3..20, :name&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access data in Sequel, one can use '''where''' clauses. For instance, to find Bob again one would use DB[:user].where(:name =&amp;gt; 'Bob').first. While the ability to use SQL-like where clauses is quite flexible, it is not as purely object-oriented an approach as ActiveRecord's dynamically created find methods.&lt;br /&gt;
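The chainable-dataset idea can be sketched in plain Ruby: each where call returns a new dataset object carrying one more pending condition, and nothing is evaluated until the rows are actually requested. (ToyDataset below is a hypothetical illustration, not Sequel's implementation.)&lt;br /&gt;

```ruby
# Toy sketch of a Sequel-style chainable dataset (not Sequel itself).
class ToyDataset
  def initialize(rows, filters = [])
    @rows = rows
    @filters = filters
  end

  # Each call returns a NEW dataset with one more pending condition,
  # so queries compose without touching the data yet.
  def where(conditions)
    filter = lambda { |row| conditions.all? { |key, value| row[key] == value } }
    ToyDataset.new(@rows, @filters + [filter])
  end

  # Only here are the accumulated filters actually applied.
  def all
    @rows.select { |row| @filters.all? { |f| f.call(row) } }
  end

  def first
    all.first
  end
end

DB_USERS = ToyDataset.new([
  { name: 'Bob', age: '42' },
  { name: 'Bob', age: '17' },
  { name: 'Eve', age: '42' }
])

bob = DB_USERS.where(name: 'Bob').where(age: '42').first
```

Because where never mutates the receiver, intermediate datasets can be stored, reused and further refined, which is exactly what makes the chaining style convenient.&lt;br /&gt;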
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
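The "lazy array" behavior listed above - build the query now, run it only when data is demanded - is ordinary lazy evaluation, which Ruby's built-in Enumerator::Lazy can demonstrate. This is a sketch of the principle only, not DataMapper's implementation; fetch_count stands in for the number of rows actually pulled from a database.&lt;br /&gt;

```ruby
# Sketch of query laziness using Ruby's built-in Enumerator::Lazy.
# fetch_count tracks how many "rows" were actually materialized.
fetch_count = 0

rows = (1..1000).lazy.map do |id|
  fetch_count += 1          # pretend this is an expensive row fetch
  { id: id, name: "user#{id}" }
end

query = rows.select { |row| row[:id].even? }  # still nothing fetched

first_three = query.first(3)  # only now does work happen, and only enough of it
```

Only six of the thousand potential rows are ever touched: the pipeline stops as soon as three matching rows have been produced.&lt;br /&gt;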
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly tied to SQL, based on Martin Fowler's Data Mapper enterprise pattern. As a result, DataMapper adapters have been built for non-SQL datastores such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,         Serial    # key&lt;br /&gt;
  property :name,       String, :required =&amp;gt; true, :unique =&amp;gt; true, :length =&amp;gt; 3..20&lt;br /&gt;
  property :age,        String, :required =&amp;gt; true &lt;br /&gt;
  property :email,      String &lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, &amp;quot;:required =&amp;gt; true&amp;quot; is an example of auto-validation. Unlike ActiveRecord and Sequel, DataMapper supports auto-validations: property options such as these call the validation helpers behind the scenes to enforce basic validations such as length, uniqueness, format and presence.&lt;br /&gt;
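The auto-validation idea - a property declaration that also registers the matching validation rules - can be sketched in a few lines of plain Ruby. ToyResource and ToyUser below are hypothetical illustrations, not DataMapper's implementation.&lt;br /&gt;

```ruby
# Toy sketch of DataMapper-style auto validation: declaring a property
# with options also registers the matching validation rule.
# (Hypothetical illustration, not DataMapper's code.)
module ToyResource
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    def validations
      @validations ||= []
    end

    # Defining a property creates the accessor AND, when :required is
    # given, a validation check for that attribute.
    def property(name, options = {})
      attr_accessor name
      if options[:required]
        validations.push(lambda { |obj| obj.send(name).nil? ? "#{name} is required" : nil })
      end
    end
  end

  def errors
    self.class.validations.map { |check| check.call(self) }.compact
  end

  def valid?
    errors.empty?
  end
end

class ToyUser
  include ToyResource
  property :name, required: true
  property :email
end

user = ToyUser.new
user.valid?        # false until name is set
user.name = 'Bob'
```

The property call does double duty, so the model author never writes a separate validates_presence_of line.&lt;br /&gt;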
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[http://rubydoc.info/gems/squeel/1.1.1/frames Squeel] unlocks the power of Arel in a Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing any SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings.&lt;br /&gt;
A simple example is as follows:&lt;br /&gt;
Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
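Block-based query DSLs like Squeel's generally work by evaluating the block in a special context where bare attribute names become query nodes rather than raising NameError. The following minimal pure-Ruby sketch shows that mechanism; it is a hypothetical illustration of the technique, not Squeel's internals (the constant 14 stands in for 2.weeks.ago).&lt;br /&gt;

```ruby
# Minimal sketch of a Squeel-like block DSL (not Squeel's real internals).
# Inside the block, bare method calls are captured by method_missing and
# turned into predicate objects instead of raising NameError.
class Predicate
  attr_reader :attribute, :operator, :value
  def initialize(attribute, operator, value)
    @attribute = attribute
    @operator = operator
    @value = value
  end
end

class AttributeStub
  def initialize(name)
    @name = name
  end

  def >=(other)
    Predicate.new(@name, :gteq, other)
  end
end

class QueryContext
  # Any unknown name becomes an attribute stub awaiting a comparison.
  def method_missing(name, *args)
    AttributeStub.new(name)
  end
end

def build_where(&block)
  QueryContext.new.instance_exec(&block)
end

predicate = build_where { created_at >= 14 }   # 14 stands in for 2.weeks.ago
```

The block yields a Predicate capturing attribute, operator and value, which a real library would then compile to SQL.&lt;br /&gt;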
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis] (formerly iBATIS) is a persistence framework that allows easy access to the database from an application without being a full ORM. It was created at the Apache Software Foundation in 2002 and has been available for several platforms, including Ruby (the Ruby release is known as RBatis). On 6/16/2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and stopped supporting Ruby.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor; to use the data mapper, you supply your own objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or annotations. The following MyBatis mapper consists of a Java interface with a MyBatis annotation:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed using the MyBatis API directly:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
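Although MyBatis itself dropped Ruby support, its core idea - SQL statements stored in an external registry under names, with placeholders bound at call time - can be sketched in a few lines of Ruby. This is a hypothetical illustration of the approach, not RBatis or MyBatis code.&lt;br /&gt;

```ruby
# Sketch of the MyBatis idea in Ruby: SQL lives in a registry keyed by
# mapper-and-statement name, and #{...} placeholders are bound with
# supplied parameters at call time. (Hypothetical illustration only.)
STATEMENTS = {
  'BlogMapper.selectBlog' => 'select * from Blog where id = #{id}'
}

def bind_statement(name, params)
  sql = STATEMENTS.fetch(name)
  # Replace each #{param} placeholder with the matching supplied value.
  sql.gsub(/\#\{(\w+)\}/) { params.fetch(Regexp.last_match(1).to_sym).to_s }
end

sql = bind_statement('BlogMapper.selectBlog', id: 101)
```

A real implementation would bind parameters safely through prepared statements rather than interpolating text, but the name-to-SQL indirection is the same.&lt;br /&gt;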
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes, via the MyBatis Schema Migration System.&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported by scanning the SQL fragments&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes; it can be enabled or disabled by setting the lazyLoadingEnabled flag&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses 'Method Missing'&lt;br /&gt;
| No. The alternative is to use &amp;lt;Model&amp;gt;.find_or_create(:name =&amp;gt; &amp;quot;John&amp;quot;)&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=NonSQL databases=&lt;br /&gt;
Instead of ORM, we can also use an object-oriented database management system (OODBMS) or a document-oriented database (for example, one based on XML), in which the database is designed to store object-oriented values directly, eliminating the need to convert data to and from an SQL representation. Document-oriented databases also eliminate the need to retrieve objects as data rows, and query languages like XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
One issue is that we cannot create application-independent queries for retrieving data without restricting the access path, and an OODBMS limits the extent to which SQL queries can be processed. &lt;br /&gt;
A relational database allows concurrent access to data, locking and indexing (fast search), as well as many other features that are not available when using XML files. &lt;br /&gt;
&lt;br /&gt;
Other OODBMSs (such as RavenDB) provide replication to SQL databases as a means of addressing the need for ad-hoc queries, while preserving the increased performance and reduced complexity that an OODBMS may achieve for an application with well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
NonSQL databases do not provide a mechanism for maintaining relationships between tables. In real life, though, business objects or entities do have relationships among them. ORM solutions may allow you to define these relationships in business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates implementing the Domain Model pattern, which allows you to model entities based on real business concepts rather than on the database structure. ORM tools provide this functionality through mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM will automatically use the updated structure for retrievals and updates. There are no SQL UPDATE, DELETE and INSERT statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object oriented query language. This allows application developers to focus on the object model and not to have to be concerned with the database structure or SQL semantics. The ORM tool itself will translate the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can navigate object relationships transparently; related objects are loaded automatically as needed. For example, if you load a Post and want to access its Comment, you can simply access Post.Comment and the ORM will take care of loading the data without any effort on your part.&lt;br /&gt;
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might want to load a list of objects without any of their child/related objects, while in another you can specify loading an object together with all of its children.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur scoped to a transaction, and the entire transaction can either be committed or rolled back. Multiple transactions can be active in memory at the same time, and each transaction's changes are isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
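The cache-management advantage listed above usually takes the form of an identity map: a per-session lookup table that returns the already-loaded object instead of querying the database again. The Session class below is a hypothetical toy sketch; real ORMs add invalidation and scoping rules on top of this.&lt;br /&gt;

```ruby
# Toy identity-map cache: each id is "loaded" at most once per session.
# (Hypothetical sketch; Session is not any real ORM's class.)
class Session
  def initialize(loader)
    @loader = loader          # stands in for a real database read
    @identity_map = {}
    @db_hits = 0
  end

  attr_reader :db_hits

  def find(id)
    # Return the cached object if present; otherwise load and remember it.
    @identity_map[id] ||= begin
      @db_hits += 1
      @loader.call(id)
    end
  end
end

loader = lambda do |id|
  { id: id, name: "user#{id}" }
end

session = Session.new(loader)
a = session.find(1)
b = session.find(1)    # served from cache: same object, no second hit
```

Beyond saving queries, the identity map guarantees that two lookups of the same row yield the very same in-memory object, which keeps updates consistent within a session.&lt;br /&gt;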
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*The high level of abstraction obscures what is actually happening in the code.&lt;br /&gt;
*Loss in developer productivity while developers learn to program with ORM. Developers may lose sight of what the code is actually doing; the developer is more in control when using SQL directly.&lt;br /&gt;
*Heavy reliance on ORM can also lead to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**A façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs, and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also its biggest disadvantage: queries are generated automatically -&lt;br /&gt;
**Generated queries cannot always be optimized.&lt;br /&gt;
**Queries may select more data than needed, so things get slower and latency increases.&lt;br /&gt;
**Translating ORM code into queries adds overhead.&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction forbids vendor-specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different types of ORMs available for Ruby, please see the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Spring_2013/ch1_1d_zk] wiki page.&lt;br /&gt;
&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done in the area of comparison between the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences. A detailed study of alternatives to ORM, such as NoSQL databases and the document-oriented approach, could also be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia, Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Matching Problem]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79849</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79849"/>
		<updated>2013-10-07T20:58:35Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* NonSQL databases */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)] is a programming technique that associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when they are updated.&lt;br /&gt;
ORM essentially creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. Both free and commercial packages are available that perform object-relational mapping, although some programmers opt to create their own ORM tools. This wiki concentrates mainly on the top Ruby ORMs - ActiveRecord, Sequel and DataMapper - and includes a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose from a major problem called the object-relational impedance mismatch.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch object-relational impedance mismatch] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schemas. Object models do not fit neatly onto relational models: an RDBMS represents data in tabular form, while object-oriented languages represent it as an interconnected graph of objects.&lt;br /&gt;
The mismatch shows up in the following object-oriented concepts: &lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping a private object representation to database tables is difficult, because the design of a private representation is subject to far fewer constraints than the public data in an RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts for classes of objects, inheritance and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. Object-oriented languages, however, define both object identity (a==b) and object equality (a.equals(b)), as Java does.&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need bidirectional relationships in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, for example, you navigate from one association to another, walking the object network. This is not an efficient way of retrieving data from a relational database: you typically want to minimize the number of SQL queries, and thus load several entities via JOINs and select the targeted entities before you start walking the object network.&lt;br /&gt;
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. For example, consider a geometric object, an entity which has a shape and a color. In an object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes; the shape itself can contain further objects, such as triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by the programming language, and its attributes can be accessed through object methods. However, many popular database products, such as structured query language database management systems (SQL DBMS), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or use only simple scalar values within the program. Object-relational mapping implements the first approach. The key is to translate the logical representation of the objects into an atomized form that can be stored in the database, while preserving the properties of the objects and their relationships so that they can be reloaded as objects whenever required. If this storage and retrieval functionality is implemented, the objects are said to be persistent.&lt;br /&gt;
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** For example, the :has_many and :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)&lt;br /&gt;
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table.&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes database access through the application extremely easy, as it connects the program's model to the database. By treating database rows as objects, the program can access the information in a more consistent manner. ORM also gives programmers the flexibility to manipulate the data within the language itself, rather than working on raw attributes retrieved from the database.&lt;br /&gt;
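The class-to-table mapping in the diagram can be reduced to a pair of conversions: object state flattened into a row hash on save, and a row hash re-inflated into an object on load. The following toy class is a hypothetical sketch of that round trip, not taken from any real ORM.&lt;br /&gt;

```ruby
# Toy sketch of the core ORM mapping: instance variables become a row
# hash (one column per attribute) and back. Not any real ORM's code.
class UserRecord
  attr_accessor :name, :email, :age

  def initialize(attributes = {})
    attributes.each { |key, value| send("#{key}=", value) }
  end

  # Object state flattened into a "row" suitable for a table.
  def to_row
    { name: name, email: email, age: age }
  end

  # A fetched row re-inflated into an object.
  def self.from_row(row)
    new(row)
  end
end

row = UserRecord.new(name: 'Bob', email: 'bob@example.com', age: '42').to_row
bob = UserRecord.from_row(row)
```

Everything else an ORM does - associations, validations, query building - is layered on top of this basic two-way conversion.&lt;br /&gt;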
&lt;br /&gt;
When applied to Ruby, ORM implementations often leverage the language’s metaprogramming strengths to create intuitive, application-specific methods and otherwise extend classes to support database functionality. With Rails, the ORM becomes even more important: it connects the models of the MVC (model-view-controller) stack used by Ruby on Rails to the application's database. Since the models are Ruby objects, the ORM allows the database to be modified through changes to those models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM-independent, making it much easier to use whichever ORM the programmer prefers rather than being locked into one particular choice. To allow this, the ORM must be decoupled from both the model and the database and act as a pure mediator between the two; any ORM can then be used, as long as it understands both the model and the database.&lt;br /&gt;
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to write raw SQL to find database records. It facilitates the creation and use of business objects whose data requires persistent storage in a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other ORMs have been released since ActiveRecord first took the stage; DataMapper and Sequel, for example, improve on the original ActiveRecord framework in several areas. Partly in response to their release and adoption by the Rails community, Ruby on Rails 3.0 is independent of any one ORM system, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord uses a migration class rather than including the table definition in the class being modeled.  As an example, the following code creates a table ''users'' to store a collection of class ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
      t.references :cheer&lt;br /&gt;
      t.references :post&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that in the preceding example, ActiveRecord’s migration scheme provides a means of backing out changes via the ''self.down'' method.  In addition, a primary key ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.  &lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between table elements and provides the means to define such associations in the model definition.  For example, the ‘’User’’ class shown below defines a ‘’has_many’’ (one-to-many) association with both the cheers and posts tables. These associations are represented in the migration class with the references operator.  Also note ActiveRecord’s integrated support for validation of table information when attempting an update.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To query the table, one calls a dynamically generated finder for the relevant column. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'); Bob's e-mail is then available as @user.email. Records can be searched by any attribute in this way, since ActiveRecord synthesizes finder methods for individual attributes and for combinations of attributes (e.g. find_by_name_and_email).&lt;br /&gt;
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent a class hierarchy, its default scheme is single-table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this trade-off of simplicity over efficiency.&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
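The dynamic finders mentioned above can be sketched with Ruby's method_missing, which is the mechanism ActiveRecord uses. This is a hedged, database-free illustration: MiniModel and its in-memory RECORDS array are hypothetical stand-ins, not ActiveRecord's actual implementation.&lt;br /&gt;

```ruby
# Sketch of how ActiveRecord-style find_by_* finders can be synthesized via
# method_missing. Pure Ruby, no database: a hypothetical in-memory "table".
class MiniModel
  RECORDS = [
    { name: "Bob",   email: "bob@example.com" },
    { name: "Alice", email: "alice@example.com" }
  ]

  def self.method_missing(name, *args)
    if name.to_s.start_with?("find_by_")
      attr = name.to_s.sub("find_by_", "").to_sym
      RECORDS.find { |r| r[attr] == args.first }   # first record matching the attribute
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end

puts MiniModel.find_by_name("Bob")[:email]   # prints bob@example.com
```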
&lt;br /&gt;
'''Pros'''&lt;br /&gt;
* Integrated with the popular Rails development framework.&lt;br /&gt;
* Dynamically created database search methods (e.g. User.find_by_address) ease DB queries and make them independent of database syntax.&lt;br /&gt;
* DB creation/management using the migration scheme provides a means for backing out unwanted table changes.&lt;br /&gt;
'''Cons'''&lt;br /&gt;
* DB creation/management is decoupled from the model, requiring a separate utility (rake db:migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle out of connecting to databases and manipulating them. Sequel handles the tedious details, such as maintaining connections, formatting SQL correctly and fetching records, so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially Sequel had three core modules: sequel, sequel_core and sequel_model; starting with version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plugin and helpers.&lt;br /&gt;
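The chainable, lazy dataset idea can be illustrated in a few lines of plain Ruby. MiniDataset below is a hypothetical stand-in, not Sequel's actual implementation: each where call returns a new dataset, and no SQL is produced until the query is finally rendered.&lt;br /&gt;

```ruby
# Hedged sketch of Sequel-style dataset chaining (pure Ruby, no gem needed).
# Clauses accumulate immutably; SQL is only "rendered" at the end, mirroring
# how Sequel datasets stay lazy until the data is actually fetched.
class MiniDataset
  def initialize(table, clauses = [])
    @table, @clauses = table, clauses
  end

  def where(condition)
    MiniDataset.new(@table, @clauses + [condition])  # chainable: returns a new dataset
  end

  def sql
    q = "SELECT * FROM #{@table}"
    q += " WHERE " + @clauses.join(" AND ") unless @clauses.empty?
    q
  end
end

ds = MiniDataset.new(:users).where("age > 21").where("name = 'Bob'")
puts ds.sql   # prints SELECT * FROM users WHERE age > 21 AND name = 'Bob'
```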
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUser &amp;lt; Sequel::Migration&lt;br /&gt;
  def up&lt;br /&gt;
    create_table(:user) do&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def down&lt;br /&gt;
    drop_table(:user)&lt;br /&gt;
  end&lt;br /&gt;
end # apply with: CreateUser.apply(DB, :up)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced for the User table created above. It enforces one-to-many relationships between the user table and the cheers and posts tables, and validates the presence of '''name''' and '''age''' as well as the uniqueness and length of '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
&lt;br /&gt;
  plugin :validation_helpers&lt;br /&gt;
  def validate&lt;br /&gt;
    super&lt;br /&gt;
    validates_presence [:name, :age]&lt;br /&gt;
    validates_unique :name&lt;br /&gt;
    validates_length_range 3..20, :name&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access data in Sequel, one uses '''where''' clauses on a dataset. For instance, to find Bob again one would use DB[:users].where(:name =&amp;gt; 'Bob').first. While SQL-like where clauses are quite flexible, this is not as purely object-oriented an approach as ActiveRecord's dynamically created find methods.&lt;br /&gt;
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly SQL-based, following Martin Fowler's Data Mapper enterprise pattern. As a result, DataMapper adapters have been built for non-SQL databases such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,         Serial    # key&lt;br /&gt;
  property :name,       String, :required =&amp;gt; true, :unique =&amp;gt; true, :length =&amp;gt; 3..20&lt;br /&gt;
  property :age,        String, :required =&amp;gt; true&lt;br /&gt;
  property :email,      String &lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, :required =&amp;gt; true is an example of auto-validation. Unlike ActiveRecord and Sequel, DataMapper supports auto-validations: property options in turn invoke the validation helpers to enforce basic validations such as length, uniqueness, format, and presence.&lt;br /&gt;
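The auto-validation idea can be sketched in plain Ruby. MiniResource and its property method below are hypothetical stand-ins for DataMapper's real machinery, showing only the core trick: declaring a property option implicitly registers a validation rule.&lt;br /&gt;

```ruby
# Hedged pure-Ruby sketch of DataMapper-style auto-validation: each property
# option (:required, :length) expands into a validation checked on the instance.
class MiniResource
  def self.validations
    @validations ||= []
  end

  def self.property(name, _type, opts = {})
    attr_accessor name
    # Auto-validation: property options register validation rules.
    validations << [name, :required] if opts[:required]
    validations << [name, :length, opts[:length]] if opts[:length]
  end

  def errors
    self.class.validations.each_with_object([]) do |(attr, rule, arg), errs|
      value = send(attr)
      case rule
      when :required then errs << "#{attr} is required" if value.nil?
      when :length   then errs << "#{attr} length out of range" if value && !arg.cover?(value.length)
      end
    end
  end
end

class User < MiniResource
  property :name, String, :required => true, :length => 3..20
end

u = User.new
u.name = "Al"
puts u.errors.inspect   # prints ["name length out of range"]
```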
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[http://rubydoc.info/gems/squeel/1.1.1/frames Squeel] unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings.&lt;br /&gt;
A simple example is as follows:&lt;br /&gt;
Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis/iBatis] is a persistence framework that allows easy access to the database from an application without being a full ORM. It originated in 2002 as iBATIS, an Apache project, and was released for several platforms, including Ruby (the Ruby release is known as RBatis). On 6/16/2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and dropped Ruby support.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor. To use the MyBatis data mapper, one works with plain objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or as annotations. The following MyBatis mapper consists of a Java interface with MyBatis annotations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed using the MyBatis API directly:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file like this.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes, provided by the MyBatis Schema Migration System&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported by scanning the SQL fragments&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes; can be enabled or disabled via the lazyLoadingEnabled flag&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses 'Method Missing'&lt;br /&gt;
| No. Alternative is to use &amp;lt;Model&amp;gt;.FindOrCreate(:name=&amp;gt;&amp;quot;John&amp;quot;)&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=NoSQL databases=&lt;br /&gt;
Instead of an ORM, we can also use an object-oriented database management system (OODBMS) or a document-oriented database (an XML store, for example), where the database is designed to store object-oriented values directly, eliminating the need to convert data to and from a relational form. Document-oriented databases also eliminate the need to retrieve objects as data rows; query languages like XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
One issue is that we cannot write application-independent, ad-hoc queries without being tied to predefined access paths, and an OODBMS limits the extent to which SQL queries can be processed.&lt;br /&gt;
A relational database allows concurrent access to data, locking and indexing (fast search), as well as many other features that are not available when using plain XML files.&lt;br /&gt;
&lt;br /&gt;
Some object databases (such as RavenDB) provide replication to SQL databases as a means of addressing the need for ad-hoc queries, while preserving the increased performance and reduced complexity that can be achieved with an OODBMS for an application with well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
NoSQL databases often don’t provide a mechanism for maintaining relationships between records. In real life, though, business objects or entities do have relationships among them. ORM solutions let you define these relationships on business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates implementing the Domain Model pattern, which lets you model entities based on real business concepts rather than on your database structure. ORM tools provide this functionality through a mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM automatically uses the updated structure for retrievals and updates. There are no SQL Update, Delete and Insert statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object oriented query language. This allows application developers to focus on the object model and not to have to be concerned with the database structure or SQL semantics. The ORM tool itself will translate the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can traverse object relationships transparently; related objects are loaded automatically as needed. For example, if you load a Post and want its Comments, you can simply access post.comments and the ORM will load the data for you without any effort on your part.&lt;br /&gt;
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might load a list of objects without any of their child or related objects, while in another you can specify that an object be loaded together with all of its children.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur scoped to a transaction, and the entire transaction can be either committed or rolled back. Multiple transactions can be active in memory at the same time, and each transaction's changes are isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*The high level of abstraction obscures what is actually happening in the implementation.&lt;br /&gt;
*Loss in developer productivity while they learn to program with the ORM. Developers can lose their understanding of what the code is actually doing; the developer is more in control when writing SQL directly.&lt;br /&gt;
*Heavy reliance on ORM also leads to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**Façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also the biggest disadvantage: queries are generated automatically -&lt;br /&gt;
**Queries can’t be optimized.&lt;br /&gt;
**Queries select more data than needed, things get slower, more latency.&lt;br /&gt;
**Compiling queries from ORM code adds its own overhead.&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction forbids vendor specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different types of ORMs available for Ruby, please see the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Fall_2010/ch3_3j_KS Object-relational mapping Fall 2010] and [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Fall_2010/ch3_3f_ac Fall 2010 detailed ORM explanation] wiki pages.&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done on comparing the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs. A detailed study of alternatives to ORM, such as NoSQL databases and the document-oriented approach, could also be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia, Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Matching Problem]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79848</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79848"/>
		<updated>2013-10-07T20:54:57Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* MyBatis/iBatis */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)], or ORM, is a programming technique which associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when updated.&lt;br /&gt;
ORM basically creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. There are both free and commercial packages available that perform object-relational mapping, although some programmers opt to create their own ORM tools. In this wiki, we concentrate mainly on the top Ruby ORMs: ActiveRecord, Sequel and DataMapper. It also includes a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose from a major problem called the object-relational impedance mismatch.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch object-relational impedance mismatch] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schemas. In essence, object models and relational models do not align: an RDBMS represents data in tabular form, while object-oriented languages represent data as an interconnected graph of objects.&lt;br /&gt;
The mismatches that occur with object-oriented concepts are as follows: &lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping a private object representation to database tables is difficult, because the design of the private representation is subject to fewer constraints than public data in an RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts for classes of objects, inheritance and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. Object-oriented languages such as Java, however, define both object identity (a==b) and object equality (a.equals(b)).&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need a bidirectional relationship in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, for example, you navigate from one association to another by walking the object network. This is not an efficient way of retrieving data from a relational database: you typically want to minimize the number of SQL queries, and thus load several entities via JOINs and select the targeted entities before you start walking the object network.&lt;br /&gt;
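The identity mismatch above can be demonstrated in a few lines of Ruby. The list uses Java's a==b versus a.equals(b); Ruby's analogues are equal? for object identity and == for value equality.&lt;br /&gt;

```ruby
# Object identity vs. object equality in Ruby, mirroring the Java a==b / a.equals(b)
# distinction. An RDBMS, by contrast, has only one notion of sameness: the primary key.
a = "bob"
b = "bob"

puts a == b        # prints true  (equality: same value)
puts a.equal?(b)   # prints false (identity: two distinct objects in memory)
```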
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. For example, consider a geometric object, an entity that has a shape and a color. In an object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes; shape itself can contain objects, for example triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by the programming language, and its attributes can be accessed through object methods. However, many popular database products, such as SQL database management systems (SQL DBMSs), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or use only simple scalar values within the program. Object-relational mapping implements the first approach. The key is to translate the logical representation of the objects into an atomized form that can be stored in the database, while preserving the properties of the objects and their relationships so that they can be reloaded as objects whenever required. When this storage and retrieval functionality is implemented, the objects are said to be persistent.&lt;br /&gt;
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** Example, :has_many, :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)&lt;br /&gt;
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes database access from the application straightforward, since it connects the program's model to the database. By treating database rows as objects, a program can access the information in a more consistent manner. It also gives programmers the flexibility to manipulate data within the language itself rather than working on raw attributes retrieved from the database.&lt;br /&gt;
&lt;br /&gt;
When applied to Ruby, ORM implementations often leverage the language’s metaprogramming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With Rails, the ORM becomes even more important, as it connects the models of the MVC (model-view-controller) stack with the application's database. Since the models are Ruby objects, the ORM allows modifications to the database to be made through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM-independent, making it much easier to use the ORM the programmer prefers rather than being locked into one particular choice. To allow this, the ORM is extracted from the model and the database and acts as a pure mediator between the two; any ORM can then be used as long as it understands both the model and the database.&lt;br /&gt;
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to use raw SQL to find database records. Active Record facilitates the creation and use of business objects whose data requires persistent storage in a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other, less popular ORMs have been released since ActiveRecord first took the stage. For example, DataMapper and Sequel address some limitations of the original ActiveRecord design. In response to their release and adoption by the Rails community, Ruby on Rails 3.0 was decoupled from any single ORM, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord uses a migration class rather than embedding the table definition in the class being modeled.  As an example, the following code creates a table ''users'' to store a collection of ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note in the preceding example that ActiveRecord’s migration scheme provides a means for backing out changes via the ''self.down'' method.  In addition, a primary-key column ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between tables and lets you define them in the model.  For example, the ''User'' class shown below declares a ''has_many'' (one-to-many) association with both the cheers and posts tables; the foreign keys for these associations live on the child tables, so the ''cheers'' and ''posts'' migrations would each add a ''user_id'' column (e.g. with t.references :user).  Also note ActiveRecord’s integrated support for validating records before they are saved.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in the table, one calls the dynamically generated finder method for the relevant column. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'); one could then read Bob's e-mail with @user.email. Users can be searched for by any attribute in this way, since finder methods are created for every combination of attributes.&lt;br /&gt;
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent a class hierarchy, its base scheme is single-table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this concept of simplicity over efficiency.&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
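These dynamically created finders are not defined anywhere in advance; the call is intercepted and the query derived from the method name (the feature table below notes that this uses method_missing). The following is a minimal sketch in plain Ruby, with no Rails and no database, of how such finders can be implemented; the User class and its in-memory records here are hypothetical stand-ins, not ActiveRecord internals.&lt;br /&gt;

```ruby
# Minimal sketch of ActiveRecord-style dynamic finders built on
# method_missing. Records are plain hashes standing in for table rows.
class User
  @@records = []

  def self.create(attrs)
    @@records.push(attrs)
    attrs
  end

  # find_by_name, find_by_email, ... are never defined explicitly;
  # method_missing parses the attribute name out of the method name.
  def self.method_missing(name, *args)
    if name.to_s.start_with?('find_by_')
      attr = name.to_s.sub('find_by_', '').to_sym
      @@records.find { |r| r[attr] == args.first }
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

User.create(name: 'Bob', email: 'bob@example.com')
user = User.find_by_name('Bob')   # resolved dynamically
user[:email]                      # => "bob@example.com"
```

Real ActiveRecord generates such finders from the model's column names; here the record hashes stand in for table rows.&lt;br /&gt;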
&lt;br /&gt;
'''Pros'''&lt;br /&gt;
*Integrated with the popular Rails development framework.&lt;br /&gt;
*Dynamically created database search methods (e.g., User.find_by_address) ease database queries and make them independent of database-specific syntax.&lt;br /&gt;
*DB creation/management using the migration scheme provides a means for backing out unwanted table changes.&lt;br /&gt;
'''Cons'''&lt;br /&gt;
*DB creation/management is decoupled from the model, requiring a separate utility (rake/migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle away from connecting to databases and manipulating them. Sequel deals with all the boring stuff like maintaining connections, formatting SQL correctly and fetching records so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially Sequel had three core modules: sequel, sequel_core and sequel_model. Starting from version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plug-in and helpers.&lt;br /&gt;
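The chainable-dataset idea can be sketched in plain Ruby. This toy Dataset class is illustrative only, not the real Sequel API: each call returns a new query object, and the SQL text is generated only when asked for.&lt;br /&gt;

```ruby
# Illustrative sketch of the dataset idea: each chained call returns a
# new immutable query object; SQL is only rendered when finally needed.
class Dataset
  def initialize(table, wheres = [], order_col = nil)
    @table = table
    @wheres = wheres
    @order_col = order_col
  end

  def where(cond)
    Dataset.new(@table, @wheres + [cond], @order_col)   # chainable
  end

  def order(col)
    Dataset.new(@table, @wheres, col)                   # chainable
  end

  def sql
    text = "SELECT * FROM #{@table}"
    text += " WHERE #{@wheres.join(' AND ')}" unless @wheres.empty?
    text += " ORDER BY #{@order_col}" if @order_col
    text
  end
end

query = Dataset.new(:users).where("age > 21").where("name = 'Bob'").order(:name)
query.sql   # the SQL string is built only at this point
```

Because where and order each return a fresh Dataset, queries can be built up incrementally and shared without side effects, which is the property that makes Sequel's datasets chainable.&lt;br /&gt;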
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUser &amp;lt; Sequel::Migration&lt;br /&gt;
  def up&lt;br /&gt;
    create_table(:user) {&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email}&lt;br /&gt;
  end&lt;br /&gt;
  def down&lt;br /&gt;
    drop_table(:user)&lt;br /&gt;
  end&lt;br /&gt;
end # CreateUser.apply(DB, :up)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced in the User table created above. It enforces one-to-many relationships between the user table and the cheers and posts tables, and validates the presence, uniqueness and length of the attribute '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence [:name, :age]&lt;br /&gt;
  validates_unique(:name)&lt;br /&gt;
  validates_length_range 3..20, :name&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in Sequel, one can use '''where''' clauses. For instance, to find Bob again one would use DB[:users].where(Sequel.like(:name, 'Bob')). While the ability to use SQL-like where clauses is quite flexible, it is not as purely object-oriented an approach as ActiveRecord's dynamically created find methods.&lt;br /&gt;
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly SQL-based, following Martin Fowler's enterprise pattern. As a result, DataMapper adapters have been built for non-SQL databases such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,         Serial    # key&lt;br /&gt;
  property :name,       String, :required =&amp;gt; true, :unique =&amp;gt; true     &lt;br /&gt;
  property :age,        String, :required =&amp;gt; true, :length =&amp;gt; 3..20 &lt;br /&gt;
  property :email,      String &lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, &amp;quot;:required =&amp;gt; true&amp;quot; is an example of auto-validation. Unlike ActiveRecord and Sequel, DataMapper supports auto-validations; these in turn call the validation helpers to enforce basic validations such as length, uniqueness, format and presence.&lt;br /&gt;
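How a property macro might register such auto-validations can be sketched in plain Ruby. This TinyResource module is a hypothetical illustration, not DataMapper's actual implementation.&lt;br /&gt;

```ruby
# Hypothetical sketch of a property macro that registers
# auto-validations alongside the schema declaration, so declaring
# :required or :length implies the matching validation.
module TinyResource
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def properties
      @properties ||= {}
    end

    def property(name, _type, opts = {})
      properties[name] = opts      # remember the validation options
      attr_accessor name
    end
  end

  def valid?
    self.class.properties.all? do |name, opts|
      value = send(name)
      next false if opts[:required] && (value.nil? || value.to_s.empty?)
      next false if opts[:length] && !opts[:length].include?(value.to_s.length)
      true
    end
  end
end

class User
  include TinyResource
  property :name,  String, required: true, length: 3..20
  property :email, String
end

user = User.new
user.name = 'Bo'
user.valid?        # => false ('Bo' violates length: 3..20)
user.name = 'Bob'
user.valid?        # => true
```

Declaring the property is enough; the matching presence and length checks come for free in valid?, which mirrors how DataMapper's auto-validations derive checks from property options.&lt;br /&gt;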
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[http://rubydoc.info/gems/squeel/1.1.1/frames Squeel] unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing any SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings.&lt;br /&gt;
A simple example is as follows:&lt;br /&gt;
Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
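One way such a block-based syntax can work is to evaluate the block in a special scope where bare names become column stubs that record comparisons instead of performing them. The following plain-Ruby sketch illustrates the idea; it is not the actual Squeel/Arel implementation.&lt;br /&gt;

```ruby
# Sketch of a block-based query DSL: bare names in the block resolve
# via method_missing to column stubs, and operators on the stubs
# record the comparison as SQL text instead of evaluating it.
class ColumnStub
  def initialize(name)
    @name = name
  end

  def >=(other)
    "#{@name} >= '#{other}'"   # record the condition as SQL text
  end
end

class QueryScope
  def method_missing(name, *_args)
    ColumnStub.new(name)       # any bare name becomes a column stub
  end

  def respond_to_missing?(*_args)
    true
  end
end

def where(&block)
  condition = QueryScope.new.instance_exec(&block)
  "SELECT * FROM articles WHERE #{condition}"
end

where { created_at >= '2013-10-01' }
# => "SELECT * FROM articles WHERE created_at >= '2013-10-01'"
```

The block never compares real values; it builds a description of the comparison, which is what lets such a DSL emit SQL instead of evaluating Ruby booleans.&lt;br /&gt;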
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis/iBatis] was a persistence framework that allowed easy access to the database from a Rails application without being a full ORM. It was created under the Apache Foundation in 2002 and was made available for several platforms, including Ruby (the Ruby release is known as RBatis). On June 16, 2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, renamed the project MyBatis, and stopped supporting Ruby.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor. To use the MyBatis data mapper, we make use of our own objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or annotations. The following is a MyBatis mapper consisting of a Java interface with some MyBatis annotations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed using the MyBatis API:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file, like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes; migrations are provided by the MyBatis schema migration system.&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported by scanning the SQL fragments&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes; this can be enabled or disabled by setting the lazyLoadingEnabled flag.&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses 'Method Missing'&lt;br /&gt;
| No. The alternative is to use &amp;lt;Model&amp;gt;.find_or_create(:name =&amp;gt; &amp;quot;John&amp;quot;)&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=NonSQL databases=&lt;br /&gt;
Instead of an ORM, we can also use an OODBMS or a document-oriented database (such as an XML store), wherein the database is designed to store object-oriented values directly, eliminating the need to convert data to and from its SQL form.&lt;br /&gt;
Document-oriented databases also free the user from retrieving objects as data rows. Query languages such as XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
One issue is that we cannot create application-independent queries for retrieving data without restricting the access path. An OODBMS also limits the extent to which SQL queries can be processed. A relational database allows concurrent access to data, locking, and indexing (fast search), as well as many other features that are absent from an XML file.&lt;br /&gt;
&lt;br /&gt;
Some OODBMSs (such as RavenDB) provide replication to SQL databases as a means of addressing the need for ad hoc queries, while preserving the increased performance and reduced complexity that an OODBMS may achieve for an application with well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
NonSQL databases do not provide a mechanism for maintaining relationships between tables. In real life, though, business objects or entities do have relationships among them. ORM solutions let you define these relationships in business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates implementing the Domain Model pattern, which allows you to model entities based on real business concepts rather than on the database structure. ORM tools provide this functionality by mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM will automatically use the updated structure for retrievals and updates. There are no SQL Update, Delete and Insert statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object oriented query language. This allows application developers to focus on the object model and not to have to be concerned with the database structure or SQL semantics. The ORM tool itself will translate the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can navigate object relationships transparently. Related objects are automatically loaded as needed. For example, if you load a Post and want to access its Comment, you can simply access Post.Comment and the ORM will take care of loading the data for you without any effort on your part.&lt;br /&gt;
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might want to load a list of objects without any of their child/related objects, while in other scenarios you can specify loading an object with all its children, etc.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur scoped to a transaction. The entire transaction can be either committed or rolled back. Multiple transactions can be active in memory at the same time, and each transaction's changes are isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*The high level of abstraction obscures what is actually happening in the implementation.&lt;br /&gt;
*Loss of developer productivity while they learn to program with the ORM. Developers can lose sight of what the code is actually doing; the developer has more control when using SQL directly.&lt;br /&gt;
*Heavy reliance on ORM can also lead to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**Façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also the biggest disadvantage: queries are generated automatically -&lt;br /&gt;
**Queries can’t be optimized.&lt;br /&gt;
**Queries select more data than needed, things get slower, more latency.&lt;br /&gt;
**Compiling queries from ORM code adds overhead.&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction precludes vendor-specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different types of ORMs available for Ruby, please see the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Fall_2010/ch3_3j_KS Object-relational mapping Fall 2010] and [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Fall_2010/ch3_3f_ac Fall 2010 detailed ORM explanation] wiki pages.&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done in the area of comparison between the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs. A detailed study of alternatives to ORM, such as NoSQL databases and the document-oriented approach, could also be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia, Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Matching Problem]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79847</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79847"/>
		<updated>2013-10-07T20:53:51Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* Squeel */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)] is a programming technique that associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when they are updated.&lt;br /&gt;
ORM essentially creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. Both free and commercial packages are available that perform object-relational mapping, although some programmers opt to create their own ORM tools. This wiki concentrates mainly on the leading Ruby ORMs: ActiveRecord, Sequel and DataMapper. It also compares various ORM techniques and discusses the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose from a major problem: the object-relational impedance mismatch.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch object-relational impedance mismatch] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schemas. Object models do not line up neatly with relational models: an RDBMS represents data in tabular form, while object-oriented languages represent it as an interconnected graph of objects.&lt;br /&gt;
The mismatch shows up in the following object-oriented concepts:&lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping private object state to database tables is difficult, because the design of an object's private representation is far less constrained than that of public data in an RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, uses derived relation variables (views) to provide varying perspectives and constraints that ensure integrity. Similarly, essential OOP concepts such as classes, inheritance and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. However, object-oriented languages such as Java define both object identity (a==b) and object equality (a.equals(b)).&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need bidirectional relationships in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, for example, you navigate from one association to another by walking the object network. This is not an efficient way to retrieve data from a relational database, where you typically want to minimize the number of SQL queries, loading several entities via JOINs and selecting the targeted entities before you start walking the object network.&lt;br /&gt;
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. For example, consider a geometric object, an entity with a shape and a color. In an object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes. The shape itself can contain further objects, such as triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by the programming language, and its attributes can be accessed through its methods. However, many popular database products, such as SQL database management systems (SQL DBMSs), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or use only simple scalar values within the program. Object-relational mapping implements the first approach. The key is to translate the logical representation of the objects into an atomized form that can be stored in the database, while preserving the properties of the objects and their relationships so that they can be reloaded as objects whenever required. If this storage and retrieval functionality is implemented, the objects are said to be persistent.&lt;br /&gt;
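The first approach can be sketched in a few lines of Ruby: flatten the object into scalar column values for storage, and rebuild it on retrieval. The GeometricObject here is purely illustrative, not from any library.&lt;br /&gt;

```ruby
# Toy sketch of object persistence by flattening: an object is reduced
# to scalar column values for storage and reconstructed on retrieval.
GeometricObject = Struct.new(:shape, :color)

def to_row(obj)
  { shape: obj.shape, color: obj.color }          # object -> scalar columns
end

def from_row(row)
  GeometricObject.new(row[:shape], row[:color])   # scalar columns -> object
end

row = to_row(GeometricObject.new('triangle', 'red'))
restored = from_row(row)
restored.shape    # => "triangle"
```

An ORM automates exactly this translation, including the bookkeeping needed to reload nested objects and their relationships.&lt;br /&gt;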
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** For example, the :has_many and :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc.)&lt;br /&gt;
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table.&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes database access from the application extremely easy, as it connects the program's model to the database. By treating database rows as objects, the program can access information in a more consistent manner. It also gives programmers the flexibility to manipulate data within the language itself rather than working on raw attributes retrieved from the database.&lt;br /&gt;
&lt;br /&gt;
When applied to Ruby, implementations of ORM often leverage the language's metaprogramming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With the addition of Rails, the ORM becomes much more important, as an ORM is needed to connect the models of the MVC (model-view-controller) stack used by Ruby on Rails with the application's database. Since the models are Ruby objects, the ORM allows modifications to the database to be made through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM independent, making it much easier to use the ORM the programmer prefers rather than being corralled into one in particular. To allow this, the ORM needs to be extracted from the model and the database and act as a pure mediator between the two. Then any ORM can be used as long as it can successfully understand the model and the database used.&lt;br /&gt;
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to use raw SQL to find database records. ActiveRecord facilitates the creation and use of business objects whose data requires persistent storage in a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other, less popular ORMs have been released since ActiveRecord first took the stage. For example, DataMapper and Sequel offer notable improvements over the original ActiveRecord framework. Partly in response to their release and adoption by the Rails community, Ruby on Rails v3.0 is independent of any particular ORM, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord makes use of a migration class rather than including the table definition in the actual class being modeled. As an example, the following code creates a table ''users'' to store a collection of class ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
      t.references :cheer&lt;br /&gt;
      t.references :post&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note in the preceding example that ActiveRecord's migration scheme provides a means for backing out changes via the ''self.down'' method. In addition, a table key ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between table elements and provides the means to define such associations in the model definition. For example, the ''User'' class shown below defines a ''has_many'' (one-to-many) association with both the cheers and posts tables. These associations are represented in the migration class with the references operator. Also note ActiveRecord's integrated support for validating table information when attempting an update.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in the table, one calls a dynamically generated finder method for the relevant column. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'); Bob's e-mail is then @user.email. Users can be searched for by any attribute, or combination of attributes (e.g. find_by_name_and_email), since these finder methods are generated on demand.&lt;br /&gt;
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent class hierarchy, its base scheme is a single table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this concept of simplicity over efficiency.  &lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
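ActiveRecord generates these finders through Ruby's ''method_missing'' hook. The idea can be sketched in a few lines of plain Ruby; ''TinyRecord'' and its in-memory ROWS array below are purely illustrative stand-ins, not ActiveRecord's actual implementation:&lt;br /&gt;

```ruby
# Illustrative sketch only: how find_by_<attribute> methods can be
# generated at call time with method_missing. Not ActiveRecord's code.
class TinyRecord
  ROWS = []                       # stand-in for a database table

  def self.create(attrs)
    ROWS << attrs
  end

  def self.method_missing(name, *args)
    if name.to_s =~ /\Afind_by_(.+)\z/
      keys = Regexp.last_match(1).split('_and_').map(&:to_sym)
      ROWS.find { |row| keys.zip(args).all? { |k, v| row[k] == v } }
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

TinyRecord.create(:name => 'Bob', :email => 'bob@example.com', :age => '42')
TinyRecord.create(:name => 'Ann', :email => 'ann@example.com', :age => '35')

bob = TinyRecord.find_by_name('Bob')
# bob[:email]  => "bob@example.com"
```

Because the method name itself carries the query, no finder has to be defined ahead of time; ActiveRecord additionally translates the match into SQL rather than scanning rows as this sketch does.&lt;br /&gt;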
&lt;br /&gt;
'''Pros''' –&lt;br /&gt;
* Integrated with the popular Rails development framework.&lt;br /&gt;
* Dynamically created database search methods (e.g. User.find_by_address) ease queries and make them independent of database syntax.&lt;br /&gt;
* DB creation/management using the migration scheme provides a means for backing out unwanted table changes.&lt;br /&gt;
'''Cons''' –&lt;br /&gt;
* DB creation/management is decoupled from the model, requiring a separate utility (rake db:migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle away from connecting to databases and manipulating them. Sequel deals with all the boring stuff like maintaining connections, formatting SQL correctly and fetching records so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially Sequel had three core modules: sequel, sequel_core and sequel_model; starting from version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plug-in and helpers.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Sequel.migration do&lt;br /&gt;
  up do&lt;br /&gt;
    create_table(:users) do&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  down do&lt;br /&gt;
    drop_table(:users)&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced for the users table created above. It defines one-to-many relationships between the users table and the cheers and posts tables, and validates the presence of name and age as well as the uniqueness and length of the attribute '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  plugin :validation_helpers&lt;br /&gt;
&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
&lt;br /&gt;
  def validate&lt;br /&gt;
    super&lt;br /&gt;
    validates_presence [:name, :age]&lt;br /&gt;
    validates_unique :name&lt;br /&gt;
    validates_length_range 3..20, :name&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access data in Sequel, one uses '''where''' clauses on a dataset. For instance, to find Bob again one would use DB[:users].where(:name =&amp;gt; 'Bob').first. While the ability to use SQL-like where clauses is quite flexible, it is not quite as pure an object-oriented approach as ActiveRecord's dynamically created find methods.&lt;br /&gt;
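The chainable-dataset idea above can be sketched in plain Ruby. The ''Dataset'' class below is a hypothetical illustration that only renders an SQL string; real Sequel datasets also execute the query and support far more operators:&lt;br /&gt;

```ruby
# Hypothetical sketch of Sequel-style dataset chaining: each call
# returns a new dataset, and SQL is only rendered on demand.
class Dataset
  def initialize(table, wheres = [], order = nil)
    @table, @wheres, @order = table, wheres, order
  end

  def where(conditions)
    clauses = conditions.map { |col, val| "#{col} = '#{val}'" }
    Dataset.new(@table, @wheres + clauses, @order)   # chainable copy
  end

  def order(column)
    Dataset.new(@table, @wheres, column)
  end

  def sql                                            # lazy rendering
    s = "SELECT * FROM #{@table}"
    s += " WHERE #{@wheres.join(' AND ')}" unless @wheres.empty?
    s += " ORDER BY #{@order}" if @order
    s
  end
end

users = Dataset.new(:users)
users.where(:name => 'Bob').order(:age).sql
# => "SELECT * FROM users WHERE name = 'Bob' ORDER BY age"
```

Each call returns a fresh immutable dataset, which is why a base dataset can be shared and refined in different directions without the refinements interfering.&lt;br /&gt;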
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly SQL-based, following Martin Fowler's Data Mapper enterprise pattern. As a result, DataMapper adapters have been built for non-SQL databases such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,         Serial    # key&lt;br /&gt;
  property :name,       String, :required =&amp;gt; true, :unique =&amp;gt; true, :length =&amp;gt; 3..20&lt;br /&gt;
  property :age,        String, :required =&amp;gt; true&lt;br /&gt;
  property :email,      String &lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, &amp;quot;:required =&amp;gt; true&amp;quot; is an example of auto-validation. Unlike ActiveRecord and Sequel, DataMapper supports auto-validations: the property options in turn call the validation helpers to enforce basic validations such as length, uniqueness, format and presence.&lt;br /&gt;
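The auto-validation mechanism can be sketched in plain Ruby. ''SketchResource'' below is purely illustrative (not DataMapper's implementation): each ''property'' declaration records its options, and ''valid?'' replays them as checks before a save would be allowed:&lt;br /&gt;

```ruby
# Illustrative sketch of auto-validation: declaring a property with
# :required / :length options records a rule that valid? later checks.
class SketchResource
  def self.rules
    @rules ||= {}
  end

  def self.property(name, opts = {})
    rules[name] = opts          # remember the declared options
    attr_accessor name
  end

  def valid?
    self.class.rules.all? do |name, opts|
      value = send(name)
      next false if opts[:required] && (value.nil? || value.empty?)
      next false if opts[:length] && !opts[:length].include?(value.to_s.length)
      true
    end
  end
end

class User < SketchResource
  property :name, :required => true, :length => 3..20
  property :email
end

u = User.new
u.name = 'Bo'          # too short: violates :length => 3..20
u.valid?               # => false
u.name = 'Bob'
u.valid?               # => true
```

The point is that the property declaration is the single source of truth: no separate validates_* calls are needed, which is what the DataMapper documentation means by auto-validation.&lt;br /&gt;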
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[http://rubydoc.info/gems/squeel/1.1.1/frames Squeel] unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing any SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings.&lt;br /&gt;
A simple example is as follows:&lt;br /&gt;
Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
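The trick behind this block syntax is that the block is evaluated against a special object, so bare identifiers like ''created_at'' become column references instead of raising NameError. A minimal, hypothetical sketch follows (a literal date string stands in for 2.weeks.ago, which needs ActiveSupport; this is not Squeel's real implementation):&lt;br /&gt;

```ruby
# Hypothetical sketch of Squeel's block trick: the block is run with
# instance_eval against a stub whose method_missing turns bare names
# into column references, so `created_at >= x` builds an SQL fragment.
class ColumnRef
  def initialize(name)
    @name = name
  end

  def >=(value)                       # operator call becomes SQL text
    "#{@name} >= '#{value}'"
  end
end

class WhereDSL
  def method_missing(name, *args)     # bare identifier -> column ref
    ColumnRef.new(name)
  end
end

def where(&block)
  condition = WhereDSL.new.instance_eval(&block)
  "SELECT * FROM articles WHERE #{condition}"
end

where { created_at >= '2013-10-17' }
# => "SELECT * FROM articles WHERE created_at >= '2013-10-17'"
```

Real Squeel builds Arel nodes rather than strings, which is what lets it compose subqueries and RDBMS functions safely, but the capture mechanism is the same instance_eval idea.&lt;br /&gt;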
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis/iBatis] is a persistence framework that allows easy access to the database from an application without being a full ORM. The project started under the Apache Foundation in 2002, and implementations exist for several platforms, including Ruby (the Ruby release is known as RBatis). On 6/16/2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and stopped supporting Ruby.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor. To use the MyBatis data mapper, we make use of our own objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or annotations. Below is a MyBatis mapper, which consists of a Java interface with some MyBatis annotations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed using the MyBatis API.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file like this.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
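The same data-mapper idea can be sketched in Ruby, in the spirit of the discontinued RBatis port. The registry and ''bind'' helper below are hypothetical illustrations, not MyBatis/RBatis code: SQL lives outside the application code, is looked up by ''namespace.id'', and #{...} placeholders are bound at call time:&lt;br /&gt;

```ruby
# Illustrative Ruby sketch of the data-mapper idea: SQL lives in an
# external registry (standing in for the XML file) and is looked up
# by "namespace.id". This mimics session.selectOne(...); it is not
# MyBatis/RBatis code.
STATEMENTS = {
  # single quotes keep #{id} literal instead of interpolating it
  'BlogMapper.selectBlog' => 'select * from Blog where id = #{id}'
}

def bind(statement_id, params)
  sql = STATEMENTS.fetch(statement_id)
  # replace each #{name} placeholder with the matching parameter
  sql.gsub(/\#\{(\w+)\}/) { params.fetch(Regexp.last_match(1).to_sym).to_s }
end

bind('BlogMapper.selectBlog', :id => 101)
# => "select * from Blog where id = 101"
```

Keeping the SQL in a registry keyed by id is what lets MyBatis hand-tune each statement independently of the objects it populates, the opposite trade-off from an ORM that generates SQL for you. (A real driver would use bound parameters rather than string substitution.)&lt;br /&gt;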
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes, provides migration by Mybatis schema migration system.&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported using the :include option on finder methods&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes; this can be enabled or disabled by setting the lazyLoadingEnabled flag&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses 'Method Missing'&lt;br /&gt;
| No. The alternative is to use &amp;lt;Model&amp;gt;.find_or_create(:name =&amp;gt; &amp;quot;John&amp;quot;)&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=NonSQL databases=&lt;br /&gt;
Instead of an ORM we can also use an OODBMS or a document-oriented database (an XML store, for example), wherein the database is designed to store object-oriented values, eliminating the need to convert data to and from its SQL form.&lt;br /&gt;
Document-oriented databases also free the user from retrieving objects as data rows; query languages like XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
One issue is that we cannot create application-independent queries for retrieving data without restricting the access path. An OODBMS also limits the extent to which SQL queries can be processed. A relational database allows concurrent access to data, locking and indexing (fast search), as well as many other features that are not available in an XML file. &lt;br /&gt;
&lt;br /&gt;
Other OODBMSs (such as RavenDB) provide replication to SQL databases as a means of addressing the need for ad hoc queries, while preserving the increased performance and reduced complexity that may be achieved with an OODBMS for an application that has well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
NonSQL databases don’t provide a mechanism for maintaining relationships between tables. In real life, though, business objects or entities do have relationships among them. ORM solutions may allow you to define these relationships in business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates implementing the Domain Model pattern, which allows you to model entities based on real business concepts rather than on your database structure. ORM tools provide this functionality through mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM will automatically use the updated structure for retrievals and updates. There are no SQL Update, Delete and Insert statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object oriented query language. This allows application developers to focus on the object model and not to have to be concerned with the database structure or SQL semantics. The ORM tool itself will translate the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can navigate object relationships transparently; related objects are automatically loaded as needed. For example, if you load a Post and want to access its Comments, you can simply access post.comments and the ORM will take care of loading the data for you without any effort on your part.&lt;br /&gt;
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might want to load a list of objects without any of their child/related objects, while in another you can specify loading an object with all its children, etc.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur scoped to a transaction; the entire transaction can either be committed or rolled back. Multiple transactions can be active in memory at the same time, and each transaction's changes are isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*High level of abstraction obscures as to what is actually happening in the code implementation.&lt;br /&gt;
*Loss in developer productivity while they learn to program with the ORM. Developers can lose their understanding of what the code is actually doing; the developer is more in control when using SQL directly.&lt;br /&gt;
*Heavy reliance on ORM also leads to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**Façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also the biggest disadvantage: queries are generated automatically -&lt;br /&gt;
**Queries can’t be optimized.&lt;br /&gt;
**Queries select more data than needed, things get slower, more latency.&lt;br /&gt;
**Compiling queries from ORM code is slow (ORM compiler written in PHP).&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction forbids vendor specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different types of ORMs available for Ruby, please see the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Fall_2010/ch3_3j_KS Object-relational mapping Fall 2010] and [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Fall_2010/ch3_3f_ac Fall 2010 detailed ORM explanation] wiki pages.&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done in the area of comparison between the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs. Also, a detailed study of alternatives to ORM, such as NoSQL databases and the document-oriented approach, could be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia, Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch Problem]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
	<entry>
		<id>https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79841</id>
		<title>CSC/ECE 517 Fall 2013/ch1 1w43 sm</title>
		<link rel="alternate" type="text/html" href="https://wiki.expertiza.ncsu.edu/index.php?title=CSC/ECE_517_Fall_2013/ch1_1w43_sm&amp;diff=79841"/>
		<updated>2013-10-07T20:47:12Z</updated>

		<summary type="html">&lt;p&gt;Semhatr2: /* DataMapper */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Object Relational Mapping=&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Introduction=&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Object-relational_mapping Object-relational mapping (ORM)] is a programming technique that associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when they are updated.&lt;br /&gt;
ORM essentially creates a &amp;quot;virtual object database&amp;quot; that can be used from within the programming language. Both free and commercial packages are available that perform object-relational mapping, although some programmers opt to create their own ORM tools. In this wiki, we concentrate mainly on the top Ruby ORMs: ActiveRecord, Sequel and DataMapper. The wiki also includes a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.&lt;br /&gt;
&lt;br /&gt;
=Need For ORM=&lt;br /&gt;
The need for ORM arose from a major problem called the Object-Relational Impedance Mismatch.&lt;br /&gt;
The [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch] is a set of conceptual and technical difficulties encountered when a relational database management system (RDBMS) is used by an object-oriented program, particularly when object/class definitions are mapped directly to database tables or schemas. In short, object models and relational models do not fit together naturally: an RDBMS represents data in tabular form, while object-oriented languages represent data as an interconnected graph of objects.&lt;br /&gt;
The mismatches between object-oriented concepts and the relational model are as follows: &lt;br /&gt;
*Encapsulation:&lt;br /&gt;
Mapping private object representations to database tables is difficult, because there are fewer constraints on the design of a private representation than on public data in an RDBMS.&lt;br /&gt;
*Interface, Class, Inheritance and polymorphism:&lt;br /&gt;
Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts for classes of objects, inheritance and polymorphism are not supported by relational database systems.&lt;br /&gt;
*Identity:&lt;br /&gt;
An RDBMS defines exactly one notion of 'sameness': the primary key. Object-oriented languages, however, often define both object identity (a==b) and object equality (a.equals(b)), as in Java.&lt;br /&gt;
*Associations:&lt;br /&gt;
Associations are represented as unidirectional references in object-oriented languages, whereas an RDBMS uses the notion of foreign keys. For example, if you need bidirectional relationships in Java, you must define the association twice. Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.&lt;br /&gt;
*Data navigation:&lt;br /&gt;
The way you access data in object-oriented languages is fundamentally different from the way you do it in a relational database. In Java, for example, you navigate from one association to another by walking the object network. This is not an efficient way of retrieving data from a relational database: you typically want to minimize the number of SQL queries, and thus load several entities via JOINs and select the targeted entities before you start walking the object network.&lt;br /&gt;
&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects, which are almost always non-scalar values. For example, consider a geometric object, an entity that has a shape and a color. In an object-oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes; shape itself can contain further objects, for example triangles (equilateral or isosceles), and so on. Each geometric object can be treated as a single object by the programming language, and its attributes can be accessed through object methods. However, many popular database products, such as structured query language database management systems (SQL DBMS), can only store and manipulate scalar values such as integers and strings organized within tables.&lt;br /&gt;
&lt;br /&gt;
Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or only use simple scalar values within the program. Object-relational mapping is used to implement the first approach. The key is to translate the logical representation of the objects into an atomized form that is capable of being stored in the database, while somehow preserving the properties of the objects and their relationships so that they can be reloaded as objects whenever required. If this storage and retrieval functionality is implemented, the objects are said to be persistent.&lt;br /&gt;
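The first approach can be made concrete with a small hand-written mapper. The sketch below (all class and method names are illustrative) atomizes an object into a flat row of scalars and rebuilds it on retrieval; this is exactly the plumbing an ORM generates for you:&lt;br /&gt;

```ruby
# Hand-rolled persistence sketch: a GeometricObject is atomized into a
# flat row of scalars for storage and reassembled on retrieval.
# Illustrative only; an ORM automates this mapping.
class GeometricObject
  attr_reader :shape, :color

  def initialize(shape, color)
    @shape, @color = shape, color
  end

  def to_row                       # object -> scalar columns
    { :shape => shape.to_s, :color => color.to_s }
  end

  def self.from_row(row)           # scalar columns -> object
    new(row[:shape], row[:color])
  end
end

table = []                         # stand-in for a database table
table << GeometricObject.new('triangle', 'red').to_row

restored = GeometricObject.from_row(table.first)
restored.shape   # => "triangle"
```

If the round trip preserves the object's properties and relationships, the object is persistent in the sense described above; an ORM additionally handles nested objects, identity and associations, which this sketch omits.&lt;br /&gt;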
&lt;br /&gt;
Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
Typical ORM features:&lt;br /&gt;
* Automatic mapping from classes to database tables&lt;br /&gt;
** Class instance variables to database columns&lt;br /&gt;
** Class instances to table rows&lt;br /&gt;
* Aggregation and association relationships between mapped classes are managed&lt;br /&gt;
** Example, :has_many, :belongs_to associations in ActiveRecord&lt;br /&gt;
** Inheritance cases are mapped to tables&lt;br /&gt;
* Validation of data prior to table storage&lt;br /&gt;
* Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records&lt;br /&gt;
* Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)&lt;br /&gt;
&lt;br /&gt;
The diagram below depicts a simple mapping of an object to a database table&lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic1.jpg]]&lt;br /&gt;
&lt;br /&gt;
ORM makes database access through the application extremely easy, as it provides a connection from the program's models to the database. By treating database rows as objects, the program can access the information in a more consistent manner. It also gives programmers the flexibility to manipulate the data within the language itself, rather than working on raw attributes retrieved from the database.&lt;br /&gt;
&lt;br /&gt;
When applied to Ruby, implementations of ORM often leverage the language’s Meta-programming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With the addition of Rails, the ORM becomes much more important, as it is necessary to have an ORM to connect the models of the MVC (model-view-controller) stack used by Ruby on Rails with the application's database. Since the models are Ruby objects, the ORM allows modifications to the database to be done through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM independent. With this change, it is much easier to use the ORM most preferred by the programmer, rather than being corralled into using one particular one. To allow this, the ORM needs to be extracted from the model and the database, and be a pure mediator between the two. Then any ORM can be used as long as it can successfully understand the model and the database used.&lt;br /&gt;
&lt;br /&gt;
=Active Record=&lt;br /&gt;
[http://ar.rubyonrails.org/ ActiveRecord] insulates you from the need to use raw SQL to find database records. Active Record facilitates the creation and use of business objects whose data requires persistent storage in a database.&lt;br /&gt;
&lt;br /&gt;
ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.&lt;br /&gt;
&lt;br /&gt;
Other, less popular ORMs have been released since ActiveRecord first took the stage. For example, DataMapper and Sequel offer improvements over the original ActiveRecord framework. As a response to their release and adoption by the Rails community, Ruby on Rails v3.0 is independent of any one ORM system, so Rails users can easily plug in DataMapper or Sequel as their ORM of choice.&lt;br /&gt;
&lt;br /&gt;
To create a table, ActiveRecord makes use of a migration class rather than including the table definition in the actual class being modeled.  As an example, the following code creates a table ''users'' to store a collection of class ''User'' objects:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUsers &amp;lt; ActiveRecord::Migration&lt;br /&gt;
  def self.up&lt;br /&gt;
    create_table :users do |t|&lt;br /&gt;
      t.string :name&lt;br /&gt;
      t.string :email&lt;br /&gt;
      t.string :age&lt;br /&gt;
&lt;br /&gt;
      t.timestamps     # add creation and modification timestamps&lt;br /&gt;
    end&lt;br /&gt;
  end&lt;br /&gt;
&lt;br /&gt;
  def self.down        # undo the table creation&lt;br /&gt;
    drop_table :users&lt;br /&gt;
  end&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note in the preceding example that ActiveRecord’s migration scheme provides a means for backing out changes via the ''self.down'' method.  In addition, a primary key ''id'' is added to the new table without an explicit definition, and record creation and modification timestamps are included via the ''timestamps'' method provided by ActiveRecord.  &lt;br /&gt;
&lt;br /&gt;
ActiveRecord manages associations between tables and provides the means to define such associations in the model definition.  For example, the ''User'' class shown below defines a ''has_many'' (one-to-many) association with both the cheers and posts tables. In the database, each such association is represented by a foreign key column on the child table, so the migrations for cheers and posts would each declare ''t.references :user''.  Also note ActiveRecord’s integrated support for validating table data when attempting an update.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; ActiveRecord::Base&lt;br /&gt;
  has_many :cheers&lt;br /&gt;
  has_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence_of :name&lt;br /&gt;
  validates_presence_of :age&lt;br /&gt;
  validates_uniqueness_of :name&lt;br /&gt;
  validates_length_of :name, :within =&amp;gt; 3..20&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in the table, one calls a dynamically generated finder method for the relevant column. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'); Bob's e-mail is then @user.email. Users can be searched for by any attribute, or combination of attributes (e.g. find_by_name_and_email), since these finder methods are generated on demand.&lt;br /&gt;
While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent class hierarchy, its base scheme is a single table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this concept of simplicity over efficiency.  &lt;br /&gt;
&lt;br /&gt;
[[Image:3j_ks_pic2.jpg]]&lt;br /&gt;
&lt;br /&gt;
Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same.&lt;br /&gt;
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.&lt;br /&gt;
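These dynamically created finders are typically built on Ruby's method_missing hook. The mechanism can be sketched in plain Ruby; the Record class and its in-memory ROWS store below are hypothetical stand-ins, not real ActiveRecord API:&lt;br /&gt;

```ruby
# Minimal plain-Ruby sketch of ActiveRecord-style dynamic finders.
# The Record class and its in-memory ROWS store are illustrative only;
# real ActiveRecord generates finders against database columns.
class Record
  ROWS = [
    { name: 'Bob',   email: 'bob@example.com' },
    { name: 'Alice', email: 'alice@example.com' }
  ]

  def self.method_missing(name, *args)
    # Intercept calls such as find_by_name('Bob') or find_by_email(...).
    if name.to_s.start_with?('find_by_')
      attribute = name.to_s.sub('find_by_', '').to_sym
      ROWS.find { |row| row[attribute] == args.first }
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

user = Record.find_by_name('Bob')
puts user[:email]   # prints bob@example.com
```

Because the finder name itself carries the attribute, no finder has to be written ahead of time, which is how ActiveRecord can offer a finder for every attribute combination.&lt;br /&gt;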
&lt;br /&gt;
'''Pros'''&lt;br /&gt;
*Integrated with the popular Rails development framework.&lt;br /&gt;
*Dynamically created database search methods (e.g. User.find_by_address) ease database queries and make them independent of the database's syntax.&lt;br /&gt;
*Database creation/management using the migration scheme provides a means for backing out unwanted table changes.&lt;br /&gt;
'''Cons'''&lt;br /&gt;
*Database creation/management is decoupled from the model, requiring a separate utility (rake/migrate) that must be kept in sync with the application.&lt;br /&gt;
&lt;br /&gt;
=Sequel=&lt;br /&gt;
[http://sequel.rubyforge.org Sequel] is designed to take the hassle away from connecting to databases and manipulating them. Sequel deals with all the boring stuff like maintaining connections, formatting SQL correctly and fetching records so you can concentrate on your application.&lt;br /&gt;
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible. The current Sequel version is 4.2.0. Initially Sequel had three core modules: sequel, sequel_core and sequel_model; starting from version 1.4, sequel and sequel_model were merged. Sequel handles validations using a validation plug-in and helpers.&lt;br /&gt;
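The chainability just described can be sketched in plain Ruby: each call returns a new object that accumulates conditions, and the SQL text is only assembled when asked for. ToyDataset below is a hypothetical illustration, not Sequel's real implementation:&lt;br /&gt;

```ruby
# Toy sketch of a chainable, lazily-evaluated dataset in the spirit of
# Sequel::Dataset. Nothing here is real Sequel API.
class ToyDataset
  def initialize(table, filters = [])
    @table = table
    @filters = filters
  end

  def where(condition)
    # Chaining returns a NEW dataset; no SQL is generated yet.
    ToyDataset.new(@table, @filters + [condition])
  end

  def sql
    # The query text is only assembled on demand.
    query = "SELECT * FROM #{@table}"
    query += " WHERE " + @filters.join(" AND ") unless @filters.empty?
    query
  end
end

ds = ToyDataset.new(:users).where("age > 18").where("name = 'Bob'")
puts ds.sql   # SELECT * FROM users WHERE age > 18 AND name = 'Bob'
```

Because each `where` call is cheap and side-effect free, conditions can be composed freely before any database work happens.&lt;br /&gt;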
&lt;br /&gt;
Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.&lt;br /&gt;
&lt;br /&gt;
Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior.&lt;br /&gt;
&lt;br /&gt;
'''Features''' -&lt;br /&gt;
*Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL queries and table schemas.&lt;br /&gt;
*Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.&lt;br /&gt;
*Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.&lt;br /&gt;
*Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class CreateUser &amp;lt; Sequel::Migration&lt;br /&gt;
  def up&lt;br /&gt;
    create_table(:user) {&lt;br /&gt;
      primary_key :id&lt;br /&gt;
      String :name&lt;br /&gt;
      String :age&lt;br /&gt;
      String :email}&lt;br /&gt;
  end&lt;br /&gt;
  def down&lt;br /&gt;
    drop_table(:user)&lt;br /&gt;
  end&lt;br /&gt;
end # CreateUser.apply(DB, :up)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Sequel supports associations and validations similar to ActiveRecord's. The following example shows how validations and associations can be enforced on the User table created above. It enforces one-to-many relationships between the user table and the cheers and posts tables, validates the presence of '''name''' and '''age''', and validates the uniqueness and length of '''name'''.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User &amp;lt; Sequel::Model&lt;br /&gt;
  one_to_many :cheers&lt;br /&gt;
  one_to_many :posts&lt;br /&gt;
  &lt;br /&gt;
  validates_presence [:name, :age]&lt;br /&gt;
  validates_unique(:name)&lt;br /&gt;
  validates_length_range 3..20, :name&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To access the data in Sequel, one can use '''where''' clauses. For instance, to find Bob again one would use DB[:users].where(Sequel.like(:name, 'Bob')). While the ability to use SQL-like where clauses is quite flexible, it is not quite as pure an object-oriented approach as ActiveRecord's dynamically created find methods.&lt;br /&gt;
&lt;br /&gt;
=DataMapper=&lt;br /&gt;
[http://datamapper.org/why.html  DataMapper] is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich.&lt;br /&gt;
DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.&lt;br /&gt;
&lt;br /&gt;
'''Features'''-&lt;br /&gt;
*Eager loading of child associations to avoid (N+1) queries.&lt;br /&gt;
*Lazy loading of select properties, e.g., larger fields.&lt;br /&gt;
*Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).&lt;br /&gt;
*An API not too heavily oriented to SQL databases.&lt;br /&gt;
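The (N+1) query problem that eager loading avoids can be illustrated with a toy query counter. The numbers are simulated; DataMapper's real strategic eager loading operates through its datastore adapters:&lt;br /&gt;

```ruby
# Toy illustration of the N+1 query problem that eager loading avoids.
# Each 'query' is simulated and counted; no real database is involved.
posts = [1, 2, 3]                  # pretend these are post ids

# Lazy, one-at-a-time loading: 1 query for the posts, then 1 per post.
lazy_queries = 1
posts.each { lazy_queries += 1 }   # e.g. SELECT * FROM comments WHERE post_id = ...

# Eager loading: 1 query for the posts, 1 batched query for ALL comments.
eager_queries = 1 + 1              # e.g. SELECT * FROM comments WHERE post_id IN (1,2,3)

puts "lazy: #{lazy_queries} queries, eager: #{eager_queries} queries"
# lazy: 4 queries, eager: 2 queries
```

With N parent records, the lazy approach issues N+1 queries while the eager approach always issues 2, which is why eager loading matters for large collections.&lt;br /&gt;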
&lt;br /&gt;
DataMapper was designed to be a more abstract ORM, not strictly SQL, based on Martin Fowler's Data Mapper enterprise pattern. As a result, DataMapper adapters have been built for non-SQL databases such as CouchDB and Apache Solr, and for web services such as Salesforce.&lt;br /&gt;
&lt;br /&gt;
Example of table definition in the model:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class User&lt;br /&gt;
  include DataMapper::Resource&lt;br /&gt;
&lt;br /&gt;
  property :id,         Serial    # key&lt;br /&gt;
  property :name,       String, :required =&amp;gt; true, :unique =&amp;gt; true, :length =&amp;gt; 3..20&lt;br /&gt;
  property :age,        String, :required =&amp;gt; true&lt;br /&gt;
  property :email,      String &lt;br /&gt;
  &lt;br /&gt;
  has n, :posts          # one to many association&lt;br /&gt;
  has n, :cheers          # one to many association&lt;br /&gt;
end&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice that in the example above, &amp;quot;:required =&amp;gt; true&amp;quot; is an example of auto-validation. Unlike ActiveRecord and Sequel, DataMapper supports auto-validations, i.e. property options that in turn call the validation helpers to enforce basic validations such as length, uniqueness, format and presence.&lt;br /&gt;
&lt;br /&gt;
=Squeel=&lt;br /&gt;
[https://www.ruby-toolbox.com/categories/orm Squeel] unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, all without writing SQL strings. Squeel acts as an add on to Active Record and makes it easier to write queries with fewer strings.&lt;br /&gt;
A simple example is as follows:&lt;br /&gt;
Squeel lets you rewrite&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where ['created_at &amp;gt;= ?', 2.weeks.ago]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Article.where{created_at &amp;gt;= 2.weeks.ago}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
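The block form works because the block is evaluated in a context where bare method calls build condition objects instead of raising NameError. A plain-Ruby sketch of that technique follows; AttributeStub and ConditionContext are hypothetical names, not Squeel internals:&lt;br /&gt;

```ruby
# Sketch of a Squeel-style block DSL. Inside instance_eval, unknown
# method calls (like created_at) become attribute stubs whose comparison
# operators build SQL fragments. Hypothetical classes, not Squeel internals.
class AttributeStub
  def initialize(name)
    @name = name
  end

  def >=(value)
    "#{@name} >= '#{value}'"
  end
end

class ConditionContext
  def method_missing(name, *args)
    AttributeStub.new(name)   # created_at becomes AttributeStub.new(:created_at)
  end
end

fragment = ConditionContext.new.instance_eval { created_at >= '2013-10-01' }
puts fragment   # created_at >= '2013-10-01'
```

Real Squeel builds Arel nodes rather than strings, but the dispatch trick (method_missing plus operator methods) is the same idea.&lt;br /&gt;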
&lt;br /&gt;
=MyBatis/iBatis=&lt;br /&gt;
[http://en.wikipedia.org/wiki/MyBatis MyBatis/iBatis] is a persistence framework that allows easy access to the database without being a full ORM. It was created by the Apache Foundation in 2002 and was made available for several platforms, including Ruby (the Ruby release is known as RBatis). On 6/16/2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and stopped supporting Ruby.&lt;br /&gt;
&lt;br /&gt;
The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using an XML descriptor. To use the MyBatis data mapper, we make use of our own objects, XML, and SQL.&lt;br /&gt;
&lt;br /&gt;
SQL statements are stored in XML files or annotations. Below is a MyBatis mapper, which consists of a Java interface with MyBatis annotations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
package org.mybatis.example;&lt;br /&gt;
 &lt;br /&gt;
public interface BlogMapper {&lt;br /&gt;
    @Select(&amp;quot;select * from Blog where id = #{id}&amp;quot;)&lt;br /&gt;
    Blog selectBlog(int id);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The statement is executed as follows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
BlogMapper mapper = session.getMapper(BlogMapper.class);&lt;br /&gt;
Blog blog = mapper.selectBlog(101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can also be executed using the MyBatis API.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Blog blog = session.selectOne(&amp;quot;org.mybatis.example.BlogMapper.selectBlog&amp;quot;, 101);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SQL statements and mappings can also be externalized to an XML file like this.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot; ?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE mapper PUBLIC &amp;quot;-//mybatis.org//DTD Mapper 3.0//EN&amp;quot; &amp;quot;http://mybatis.org/dtd/mybatis-3-mapper.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;mapper namespace=&amp;quot;org.mybatis.example.BlogMapper&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;select id=&amp;quot;selectBlog&amp;quot; parameterType=&amp;quot;int&amp;quot; resultType=&amp;quot;Blog&amp;quot;&amp;gt;&lt;br /&gt;
        select * from Blog where id = #{id}&lt;br /&gt;
    &amp;lt;/select&amp;gt;&lt;br /&gt;
&amp;lt;/mapper&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
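In both the annotation and XML forms, a named statement holds a parameterized SQL template whose placeholders are bound at execution time. MyBatis itself is Java, but the binding concept can be sketched in this article's language, Ruby (STATEMENTS and execute are illustrative names only):&lt;br /&gt;

```ruby
# Toy sketch of MyBatis-style statement mapping: a named SQL template
# with placeholders, bound to parameters at call time. Not real MyBatis;
# a real implementation would use bound variables, not text substitution.
STATEMENTS = {
  'selectBlog' => 'select * from Blog where id = #{id}'
}

def execute(statement_id, params)
  sql = STATEMENTS.fetch(statement_id)
  # Replace each #{name} placeholder with its parameter value.
  params.reduce(sql) do |query, (key, value)|
    query.gsub('#{' + key.to_s + '}', value.to_s)
  end
end

puts execute('selectBlog', id: 101)
# select * from Blog where id = 101
```

Keeping the SQL in a named map is what lets MyBatis externalize statements to XML without changing calling code.&lt;br /&gt;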
&lt;br /&gt;
=Comparison of ORM Features=&lt;br /&gt;
&lt;br /&gt;
{|cellspacing=&amp;quot;0&amp;quot; border=&amp;quot;1&amp;quot;&lt;br /&gt;
!style=&amp;quot;width:8%&amp;quot;|Features &lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|ActiveRecord&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|Sequel&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|DataMapper&lt;br /&gt;
!style=&amp;quot;width:23%&amp;quot;|MyBatis&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Databases &amp;lt;/p&amp;gt;&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
| ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3&lt;br /&gt;
| SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API&lt;br /&gt;
| MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Migrations &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes&lt;br /&gt;
| Yes, but optional&lt;br /&gt;
| Yes, provided by the MyBatis Schema Migration System.&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; EagerLoading &amp;lt;/p&amp;gt;&lt;br /&gt;
| Supported by scanning the SQL fragments&lt;br /&gt;
| Supported using eager (preloading) and eager_graph (joins) &lt;br /&gt;
| Strategic Eager Loading and by using :summary&lt;br /&gt;
| Yes; this can be enabled or disabled by setting the lazyLoadingEnabled flag.&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Flexible Overriding &amp;lt;/p&amp;gt;&lt;br /&gt;
| No. Overriding is done using alias methods.&lt;br /&gt;
| Using methods and by calling 'super'&lt;br /&gt;
| Using methods&lt;br /&gt;
| No&lt;br /&gt;
|-&lt;br /&gt;
! &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt; Dynamic Finders &amp;lt;/p&amp;gt;&lt;br /&gt;
| Yes. Uses 'Method Missing'&lt;br /&gt;
| No. Alternative is to use &amp;lt;Model&amp;gt;.FindOrCreate(:name=&amp;gt;&amp;quot;John&amp;quot;)&lt;br /&gt;
| Yes. Using the dm_ar_finders plugin&lt;br /&gt;
| Similar feature is supported in the form of Dynamic SQL&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=NonSQL databases=&lt;br /&gt;
Instead of an ORM, we can also use an OODBMS or a document-oriented database such as XML, wherein the database is designed to store object-oriented values, eliminating the need to convert data to and from its SQL form.&lt;br /&gt;
Document-oriented databases also free the user from retrieving objects as data rows. Query languages like XQuery can be used to retrieve data sets.&lt;br /&gt;
&lt;br /&gt;
The main issue is that we cannot create application-independent queries for retrieving data without restrictions on the access path. An OODBMS also limits the extent of SQL query processing. A relational database allows concurrent access to data, locking and indexing (fast search), as well as many other features that are not available in an XML file.&lt;br /&gt;
&lt;br /&gt;
Other OODBMS (such as RavenDB) provide replication to SQL databases, as a means of addressing the need for ad hoc queries, while preserving the increased performance and reduced complexity that may be achieved with an OODBMS for an application that has well-known query patterns.&lt;br /&gt;
&lt;br /&gt;
NonSQL databases don't provide a mechanism to maintain relationships between tables. In real life, though, business objects or entities do have relationships among them. ORM solutions may allow you to define these relationships in business objects and handle their storage and retrieval behind the scenes.&lt;br /&gt;
&lt;br /&gt;
=Advantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*Facilitates implementing the Domain Model pattern, which allows you to model entities based on real business concepts rather than on the database structure. ORM tools provide this functionality through mapping between the logical business model and the physical storage model.&lt;br /&gt;
&lt;br /&gt;
*Huge reduction in code. Ease of use, faster development, increased productivity.&lt;br /&gt;
&lt;br /&gt;
*ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.&lt;br /&gt;
&lt;br /&gt;
*Changes to the object model are made in one place. Once you update your object definitions, the ORM will automatically use the updated structure for retrievals and updates. There are no SQL Update, Delete and Insert statements strewn throughout different layers of the application that need modification.&lt;br /&gt;
&lt;br /&gt;
*Rich query capability. ORM tools provide an object oriented query language. This allows application developers to focus on the object model and not to have to be concerned with the database structure or SQL semantics. The ORM tool itself will translate the query language into the appropriate syntax for the database.&lt;br /&gt;
&lt;br /&gt;
*Navigation. You can navigate object relationships transparently. Related objects are automatically loaded as needed. For example, if you load a Post and want to access its comments, you can simply access Post.comments and the ORM will take care of loading the data for you without any effort on your part.&lt;br /&gt;
&lt;br /&gt;
*Data loads are completely configurable, allowing you to load the data appropriate for each scenario. For example, in one scenario you might want to load a list of objects without any of their child/related objects, while in other scenarios you can specify loading an object with all its children.&lt;br /&gt;
&lt;br /&gt;
*Concurrency support. Support for multiple users updating the same data simultaneously.&lt;br /&gt;
&lt;br /&gt;
*Cache management. Entities are cached in memory thereby reducing load on the database.&lt;br /&gt;
&lt;br /&gt;
*Transaction management and isolation. All object changes occur scoped to a transaction; the entire transaction can either be committed or rolled back. Multiple transactions can be active in memory at the same time, and each transaction's changes are isolated from one another.&lt;br /&gt;
&lt;br /&gt;
*Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.&lt;br /&gt;
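The transparent navigation described in the list above is typically implemented with lazy loading plus memoization: a proxy fetches the associated records on first access and caches them. A minimal sketch with hypothetical Post/comments names (not any real ORM's internals):&lt;br /&gt;

```ruby
# Sketch of transparent association loading: comments are fetched only
# when first accessed, then memoized. Illustrative, not a real ORM.
class Post
  def initialize(id)
    @id = id
  end

  def comments
    # Lazy load on first access; subsequent calls reuse the cached value.
    @comments ||= load_comments
  end

  private

  def load_comments
    # Stand-in for: SELECT * FROM comments WHERE post_id = @id
    ["first comment on post #{@id}", "second comment on post #{@id}"]
  end
end

post = Post.new(7)
puts post.comments.length   # 2
```

The caller never sees the load happen, which is exactly the "no effort on your part" behavior the navigation point describes.&lt;br /&gt;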
&lt;br /&gt;
=Disadvantages of ORM=&lt;br /&gt;
&lt;br /&gt;
*High level of abstraction obscures as to what is actually happening in the code implementation.&lt;br /&gt;
*Loss in developer productivity while they learn to program with the ORM. Developers lose understanding of what the code is actually doing; the developer is more in control using SQL.&lt;br /&gt;
*Heavy reliance on ORM also leads to poorly designed databases.&lt;br /&gt;
*Data and behaviour are not separated.&lt;br /&gt;
**Façade needs to be built if the data model is to be made available in a distributed architecture.&lt;br /&gt;
**Each ORM technology/product has a different set of APIs and porting code between them is not easy.&lt;br /&gt;
*The biggest advantage of ORM is also its biggest disadvantage: queries are generated automatically.&lt;br /&gt;
**Queries can’t be optimized.&lt;br /&gt;
**Queries select more data than needed, things get slower, more latency.&lt;br /&gt;
**Compiling queries from ORM code adds overhead.&lt;br /&gt;
**SQL is more powerful than ORM query languages.&lt;br /&gt;
**Database abstraction forbids vendor specific optimizations.&lt;br /&gt;
&lt;br /&gt;
=Further Reading=&lt;br /&gt;
To get more information on ORM and the different type of ORM's available for Ruby , please look into the [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Fall_2010/ch3_3j_KS Object-relational mapping Fall 2010] and [http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_517_Fall_2010/ch3_3f_ac Fall 2010 detailed ORM explanation] wiki pages.&lt;br /&gt;
=Future Work=&lt;br /&gt;
More work can be done in the area of comparison between the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs. Also, a detailed study of alternatives to ORM, such as NoSQL databases and the document-oriented approach, could be included.&lt;br /&gt;
=References=&lt;br /&gt;
&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_mapping Wikipedia, Object-relational mapping (ORM)]&lt;br /&gt;
# [http://docforge.com/wiki/Object-relational_mapping  Object-relational mapping]&lt;br /&gt;
# [http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://www.hibernate.org/about/orm  The Object-Relational Impedance Mismatch]&lt;br /&gt;
# [http://edgeguides.rubyonrails.org/active_record_basics.html#object-relational-mapping  Active Record]&lt;br /&gt;
# [http://loianegroner.com/2011/02/introduction-to-ibatis-mybatis-an-alternative-to-hibernate-and-jdbc/ MyBatis/iBatis]&lt;br /&gt;
# [http://rubydoc.info/gems/squeel/1.1.1/frames Squeel]&lt;/div&gt;</summary>
		<author><name>Semhatr2</name></author>
	</entry>
</feed>