CSC/ECE 517 Fall 2010/ch5 5b RR: Difference between revisions

From Expertiza_Wiki
<h1>Variable Naming Conventions</h1>
 
<p>Almost all programming languages allow the programmer a great deal of freedom when naming variables in a program's source code.  But choosing good names is hard. This is a tutorial on how to choose good variable names! </p>
 
==Introduction==
<p>[http://en.wikipedia.org/wiki/Naming_conventions_(programming)/ Naming conventions] are a topic in programming that has probably not been emphasized enough. Their importance, and the effort needed to learn and follow them, has long been debated. Naming conventions make programs easier to read and therefore easier to understand. They can also convey information about the role of an identifier (for example, whether it is a constant, package, or class), which helps in understanding the code and reduces development time, especially when the code base is large.</p>
<p>
As an example, say a variable holds the average age of the students in a class. <b>int averageAge</b> or <b>int AverageAge</b> tells the programmer far more about what the variable holds than <b>int a</b> does. Capitalizing the first letter of each new word also makes the name more readable than <b>int averageage</b>.</p>
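<p>The averageAge example can be sketched in C++ as follows. The function name averageAgeOf, its parameter, and the choice to return 0 for an empty class are illustrative assumptions, not part of any particular code base.</p>

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: the names alone tell the reader what is computed.
double averageAgeOf(const std::vector<int>& studentAges) {
    if (studentAges.empty()) {
        return 0.0;  // assumption: an empty class has an average age of 0
    }
    const int totalAge =
        std::accumulate(studentAges.begin(), studentAges.end(), 0);
    return static_cast<double>(totalAge) / studentAges.size();
}
```

<p>Compare this with a version that uses a, t, and v for the same values: the logic is identical, but the reader has to reconstruct the meaning from the arithmetic.</p>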
<p>
In this chapter, we are mainly going to look at some rules and guidelines to choose good variable names.
</p>
 
==Advantages of using naming conventions==
<ol>
<li>Good names give programmers additional information (metadata) about what the variables are used for. As in the example above, a good name goes a long way toward conveying the specific use of a variable, which makes the code more readable.</li>
<li>A convention or standard gives the code a consistent look. When the development team is large and different people work on different parts of a project, code written by someone else should not look very different from one's own.</li>
<li>Conventions usually include rules or guidelines for naming variables and other structures in a way that avoids potential ambiguity.</li>
<li>Naming conventions push us toward meaningful, professional names rather than cute or funny ones. Good names give the program a professional appearance.</li>
<li>When code is reused after a long interval, a well-defined naming convention helps the reader understand it again.</li>
<li>Well-chosen names make it significantly easier for subsequent generations of analysts and developers to understand what the system is doing and how to fix or extend the source code for new business needs.</li>
 
</ol>
 
<p>In this tutorial, we look at some rules and guidelines to pick good variable names.</p>
 
==Guidelines to choose good variable names==
 
===Length of variables:===
 
<p>Length is the most fundamental element of any naming convention. Some conventions fix a maximum number of characters, while others offer only heuristics.</p>
<p>Variable names should be short yet meaningful. The choice of a variable name should be mnemonic, that is, designed to indicate to the casual observer the intent of its use. One-character variable names should be avoided except for temporary "throwaway" variables. Common names for temporary variables are i, j, k, m, and n for integers; c, d, and e for characters.</p>
<p>When choosing the length of a name, we have to make sure it is not so short that search-and-replace tools cannot identify it uniquely, and not so long that the code looks cluttered. The name should be short enough to look neat, yet long enough to describe its use and to be found easily.</p>
<p>Example: suppose you declare an array that holds the IDs of all students in a class. You could name it <b>int IdsOfStudentsInClass[]</b>, but such a long name is rarely necessary. A name such as <b>int StudentIds[]</b> gives enough information about what the variable is used for and is short enough not to clutter the code.</p>
 
<p>The names of variables declared class constants and of ANSI constants should be all uppercase with words separated by underscores ("_"). (ANSI constants should be avoided, for ease of debugging.)</p>
 
===Using Multiple-words in a variable:===
 
<p>Using multiple words in a variable name can be helpful because a single word might not describe its use entirely and clearly. If we use multiple words, however, we need a way to delimit them.</p>

<p>Two ways are commonly used to delimit multiple words. The first is a separator character: an underscore or, in languages that allow it in identifiers, a dash or some other non-alphanumeric character that reads well.</p>

<p>Some conventions for object-oriented languages do not allow underscores within names, because they reserve the underscore for separating a module prefix from the identifier.</p>
 
          Forbidden: Read_Master_Data()<br>
          Allowed: ReadMasterData()<br>
                        Accnt_ReadMasterData()<br>
 
<p>If a variable is constant throughout the code, it should be named in all upper-case letters. Underscores can be used between the words for readability.</p>
 
            For example, MAX_WIDTH, RADIUS_EARTH etc.
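<p>A short C++ sketch of constants named this way. The value of MAX_WIDTH and the helper clampWidth are hypothetical and serve only to show the names in use.</p>

```cpp
// The names follow the ALL_CAPS convention described above.
constexpr int MAX_WIDTH = 80;            // hypothetical limit, in characters
constexpr double RADIUS_EARTH = 6371.0;  // mean radius of the Earth, in km

// Hypothetical helper: clamp a requested width to the allowed maximum.
int clampWidth(int requestedWidth) {
    return requestedWidth > MAX_WIDTH ? MAX_WIDTH : requestedWidth;
}
```

<p>Because the constant stands out visually, a reader of clampWidth can tell at a glance which operand is fixed and which one varies.</p>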
 
<p>The second way is capitalization: we capitalize the first letter of every word, or the first letter of every word except the first. This is usually called camel casing.</p>
 
===camelCasing and CamelCasing:===
 
<p>[http://en.wikipedia.org/wiki/CamelCase/ Camel case] is the use of capitalization to add clarity to compound names in code. It became a common standard because the obvious delimiters, such as hyphens (-) or underscores (_), already had a meaning in the syntax of some languages (the hyphen, for example, is usually the subtraction operator). Two variants of camel case are in common use.</p>
<p>
Lower camel case (or camelCase) starts the first word of the name with a lower-case letter and every subsequent word with an upper-case letter.</p>
 
         void setVariable() {
         }

<p>Upper camel case (or CamelCase) starts the first word and all subsequent words with an upper-case letter. Upper camel case is sometimes referred to as PascalCase.</p>

         void SetVariable() {
         }
 
<p>The same convention can be used for variable names. camelCase or CamelCase often serves as the foundation for language-specific naming conventions.</p>
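<p>The two variants are often combined in one code base. The sketch below assumes a style in which types use upper camel case while functions and variables use lower camel case; that split is a common choice, not a universal rule, and StudentRecord and fullName are made-up names.</p>

```cpp
#include <string>

// UpperCamelCase (PascalCase) for the type name.
struct StudentRecord {
    std::string firstName;  // lowerCamelCase for fields
    std::string lastName;
};

// lowerCamelCase for the function name and its parameter.
std::string fullName(const StudentRecord& studentRecord) {
    return studentRecord.firstName + " " + studentRecord.lastName;
}
```

<p>With this split, StudentRecord and studentRecord can appear side by side and the capitalization alone distinguishes the type from the variable.</p>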
 
===Prefixing:===
<p>It can be useful to use prefixes for certain kinds of data to remind you what they are: for instance, prefixing a pointer with "p_" tells you that it is a pointer. If you see an assignment between a variable starting with "p_" and one that does not, you immediately know that something fishy is going on. A prefix is also useful for global or static variables, because each of these behaves differently from a normal local variable. For global variables a prefix is especially useful because it prevents naming collisions with local variables (which can lead to confusion). For example, g can be used as the prefix for any global variable and l for local scope. To encode data types, we can use i for integers, f for floating-point values (single or double precision), and so on.</p>
 
    Example: if you have an integer pointer, you can declare it as int *p_studentIds; if it were a global variable, you could declare it as
    int *gp_studentIds. The prefix gp clearly shows that this is a global pointer. It also avoids a collision with the local variable
    p_studentIds in case you want to have both.
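<p>A minimal C++ sketch of the example above. The array contents and the function firstStudentId are hypothetical; only the names gp_studentIds and p_studentIds come from the text.</p>

```cpp
// Hypothetical data for illustration.
int studentIds[] = {101, 102, 103};

// "g" marks a global, "p" marks a pointer.
int* gp_studentIds = studentIds;

int firstStudentId() {
    // The local pointer has its own prefix, so it cannot be confused
    // with (or accidentally shadow) the global gp_studentIds.
    int* p_studentIds = gp_studentIds;
    return *p_studentIds;
}
```

<p>An assignment such as int id = p_studentIds; would now look wrong on sight, because a "p_" name is being assigned to a name without the prefix.</p>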
 
 
<p>A common convention is to prefix the private fields and methods of a class with an underscore, e.g., _private_data. This makes it easier to tell where in the body of a class a declaration lives, and it helps keep straight what you should and should not do with a variable. A common rule, for instance, is to avoid returning non-const references to fields of a class from functions that are more public than the field. If _age is a private field, the public getAge function probably should not return a non-const reference, since doing so effectively grants write access to the field.</p>
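<p>The _age example might look like this in C++. The class name Student and the setter setAge are assumptions added for illustration; the text only mentions _age and getAge.</p>

```cpp
class Student {
public:
    explicit Student(int age) : _age(age) {}

    // Returns a copy, so callers cannot modify the private field.
    int getAge() const { return _age; }

    // Hypothetical setter: the one deliberate way to change the field.
    void setAge(int age) { _age = age; }

private:
    int _age;  // leading underscore marks a private field
};
```

<p>Had getAge been declared as int& getAge(), a caller could write student.getAge() = -5 and bypass any checks in setAge. Note that in C++ a leading underscore followed by an upper-case letter, or any double underscore, is reserved for the implementation, so this convention should stick to a lower-case letter after the underscore.</p>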
 
===Using Parts of Speech:===
<p>Parts of speech can guide the choice of names, whether for a class, a variable, or a method. The usual forms are:</p>
 
<ol>
<li>Singular nouns describe a class, interface, record, variable, field, accessor method, or exception. For example: JButton, JWindow.</li>
<li>Plural nouns describe a variable or field holding a collection. For example: Items, Classes.</li>
<li>Verbs describe methods. For example: dispose, getItems, delete.</li>
<li>Adjectives describe interfaces and booleans. For example: Cloneable, IsEnabled, Comparable.</li>
 
 
</ol>
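<p>The list above can be sketched in a single small C++ class. Course, addStudent, isEmpty, and studentNames are made-up names chosen to show one part of speech each.</p>

```cpp
#include <string>
#include <vector>

// Singular noun for a class.
class Course {
public:
    // Verb for a method that acts on the object.
    void addStudent(const std::string& studentName) {
        studentNames.push_back(studentName);
    }

    // Adjective-style name for a boolean query.
    bool isEmpty() const { return studentNames.empty(); }

    // Accessor method named after the field it exposes (read-only).
    const std::vector<std::string>& getStudentNames() const {
        return studentNames;
    }

private:
    // Plural noun for a field holding a collection.
    std::vector<std::string> studentNames;
};
```

<p>Read aloud, calls such as course.addStudent("Ada") and course.isEmpty() form short phrases, which is the practical benefit of matching names to parts of speech.</p>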
<p>Avoid names that are reserved words or keywords in any commonly used language, as this can lead to ambiguity.</p>
 
===Abbreviations:===
 
<p>Abbreviations are dangerous: vowels are useful and speed up code reading. Still, abbreviating can help when the full name is extremely long, because names that are too long can be as hard to read as names that are too short. Be consistent about the abbreviations you use, and restrict yourself to a small number of them.</p>

<p>Common abbreviations include "itr" for "iterator" and "ptr" for "pointer". Even names like i, j, and k are perfectly fine for loop counters (primarily because they are so common). Bad abbreviations include things like cmptRngFrmRng, which saves only a few letters but loses a great deal of readability. If you don't like typing long names, look into the auto-complete facilities of your text editor. You should rarely need to type out a full identifier. (In fact, you rarely want to do this: typos can be incredibly hard to spot.)</p>

<p>The above are guidelines for picking good variable names. They are not strict rules, but following them helps in writing readable programs. The most important thing is to be consistent: once we choose a guideline, we should follow it throughout the program. One disadvantage is that conventions which encode type information in names may defeat the purpose of encapsulation: the "black box" aspect is lost when the programmer needs to know what specific type is being used or returned.</p>
 
===Hungarian Notation:===
 
<p>[http://en.wikipedia.org/wiki/Hungarian_notation/ Hungarian Notation] is a language-independent identifier naming convention. The name explicitly indicates the intended use or the type of the variable or function. It starts with a single letter or a group of letters (the mnemonic prefix), followed by the name chosen by the programmer. The camel case convention can be applied here as well: capitalizing the first letter of the chosen name separates it from the type-indicating prefix.</p>

  The following examples illustrate the use of Hungarian Notation
  1. bCondition : Boolean
  2. fCondition : boolean (flag)
  3. chState : char
  4. cItems : count of items
  5. dwLightYears : double word
  6. nItem : integer or count
  7. iSize : integer or index
  8. fpCost: floating-point
  9. dbPi : double
  10. pMemory : pointer
  11. rgMonth : array, or range
  12. szLastName : zero-terminated string
  13. u16Identifier : unsigned 16-bit integer
  14. stClassroom : clock time structure
  15. fnFunction : function name
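<p>A few of these prefixes in a C++ sketch. The function countChars and the variable values are hypothetical; the prefixes sz, c, p, b, and u16 are taken from the list above.</p>

```cpp
#include <cstdint>

// Counts the characters in a zero-terminated string.
// sz = zero-terminated string, c = count, p = pointer.
int countChars(const char* szLastName) {
    int cChars = 0;
    for (const char* pCurrent = szLastName; *pCurrent != '\0'; ++pCurrent) {
        ++cChars;
    }
    return cChars;
}

bool bEnabled = true;              // b   = boolean
std::uint16_t u16Identifier = 42;  // u16 = unsigned 16-bit integer
```

<p>The prefix tells the reader that szLastName must be zero-terminated for the loop to stop, a fact the type const char* alone does not state.</p>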
 
<p>The scope of a variable can also be encoded with the following prefixes. This extension is often used on its own, without the Hungarian type prefix.</p>

 1. g_nColour : member of a global namespace, integer
 2. m_nColour : member of a structure/class, integer
 3. m_colours : member of a structure/class
 4. s_colours : static member of a class
 5. _colours : local variable
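<p>A C++ sketch of the scope prefixes. The class Palette and the counter s_nPalettes are made up for illustration; g_nColour and m_nColour come from the list above.</p>

```cpp
int g_nColour = 0;  // g_ : global, n : integer

class Palette {
public:
    explicit Palette(int nColour) : m_nColour(nColour) { ++s_nPalettes; }

    int colour() const { return m_nColour; }
    static int paletteCount() { return s_nPalettes; }

private:
    int m_nColour;           // m_ : member of the class
    static int s_nPalettes;  // s_ : static member, shared by all objects
};

// Definition of the static member (hypothetical instance counter).
int Palette::s_nPalettes = 0;
```

<p>Inside the constructor, the prefixes make it obvious that nColour is the parameter, m_nColour belongs to this object, and s_nPalettes is shared across all objects.</p>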
 
<p>Hungarian notation has two main advantages: the name shows the type of the variable, and several variables can share a base name while differing in data type, such as iLength (integer) and dwLength (double word). Its disadvantages are that it can make the code look ugly and that it can make editor auto-completion less convenient.</p>
 
==Conclusion:==
 
<p>Many software projects today are developed as open source, and most involve several teams working on different parts of the project. Readability is therefore one of the most important properties of the code, and it depends on using proper naming conventions throughout the project. The advantages of naming conventions outweigh their disadvantages. Without proper conventions, the overall cost of software development can increase considerably. Documenting the convention in use can save a lot of time and cost later, and it makes the code easier to maintain. Adopting one of the conventions described above, and applying it consistently, is a good way to build sound software development habits.</p>
 
==Links(Language specific Conventions)==
<p>This chapter is not specific to any one language. For language-specific naming styles, see the following links.</p>
<ul>
Language specific conventions:
<li>C++: GeoSoft's C++ Programming Style Guidelines[http://geosoft.no/development/cppstyle.html]</li>
<li>C#: Coding Standard: C# (Philips Medical Systems)[http://www.tiobe.com/standards/gemrcsharpcs.pdf]</li>
<li>D: The D Style[http://www.digitalmars.com/d/1.0/dstyle.html]</li>
<li>Erlang: Erlang Programming Rules and Conventions[http://www.erlang.se/doc/programming_rules.shtml]</li>
<li>Java: Sun official Java coding style[http://java.sun.com/docs/codeconv/]</li>
<li>Lisp: Riastradh's Lisp Style Rules[http://mumble.net/~campbell/scheme/style.txt]</li>
<li>Mono: Programming style for Mono[http://www.mono-project.com/Coding_Guidelines]</li>
<li>Perl: Perl Style Guide[http://perldoc.perl.org/perlstyle.html]</li>
<li>PHP::PEAR: PHP::PEAR Coding Standards[http://pear.php.net/manual/en/standards.php]</li>
<li>Python: Style Guide for Python Code[http://www.python.org/peps/pep-0008.html]</li>
<li>Ruby: Ruby and Rails Naming Conventions[http://itsignals.cascadia.com.au/?p=7]</li>
<li>ActionScript(Flex): Flex SDK coding conventions and best practices[http://opensource.adobe.com/wiki/display/flexsdk/Coding+Conventions]</li>
</ul>
 
==Other useful links on naming conventions==
<ul>
<li>Hungarian notation[http://en.wikipedia.org/wiki/Hungarian_notation]</li>
<li>CamelCase[http://en.wikipedia.org/wiki/CamelCase]</li>
<li>How to Improve the Readability of your software code[http://www.wikihow.com/Improve-the-Readability-of-Your-Software-Code]</li>
<li>Program Comprehension During Software Maintenance and Evolution[http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/mags/co/&toc=comp/mags/co/1995/08/r8toc.xml&DOI=10.1109/2.402076]</li>
<li>What is code readability?[http://www.perlmonks.org/?node_id=592616]</li>
<li>Variable and Function Naming Convention[http://faculty.ed.umuc.edu/~jrugg/policy/vrbl_naming_convention.htm] </li>
 
</ul>

Latest revision as of 02:11, 4 November 2010
