CSC/ECE 517 Fall 2011/ch3 3h rr
3h. Primitive objects. At the beginning of Lecture 11, we discovered that Fixnums and Bignums are handled differently behind the scenes in Ruby. Other languages, like Java, have made similar distinctions. By contrast, languages such as C# and Eiffel try to hide these implementation differences from users. Answer two questions: (1) How have different o-o languages implemented primitive objects? E.g., how are they represented in memory, how are they tested for, do comparisons do anything different than for class objects, etc. (2) What are the advantages and disadvantages of treating primitives differently from class objects in source code?
Introduction
Programming languages, whether statically or dynamically typed, support certain built-in data types. These data types, known as primitive types, are the basic representation of information in programs and have certain fixed attributes for a specific language[1]. Statically typed languages such as C++, Java and C# support primitive data types, whereas in dynamically typed languages such as Ruby, Smalltalk and Lisp they take the form of primitive objects. Primitive types store the basic kinds of information that a computer can store and manipulate, and can also be used as building blocks for creating more complex data types. This article explains how different primitive types are implemented in certain object oriented languages. It also presents an analysis of the benefits and drawbacks of such types and of the methods used to operate on them.
Primitive Types
The primitive types commonly included in most programming languages are:
- Boolean
- Character
- Integer
- Floating-point number
- Fixed-point number
- Reference
Boolean
A Boolean is a primitive data type used to store one of two logical values: true or false. Boolean data types are most commonly used as input parameters to a conditional statement (such as an ‘if’ statement), or as the output of a comparison between two comparable data types. Booleans can be implemented in languages as either a discrete logical type, or implicitly as a numerical type. In many languages, booleans can be implicitly converted to and from integer types.
Character
A character is a data type that represents an element of a written language, such as a letter, number, or symbol. A character can also represent a control character, such as a carriage return or newline, which does not have a written meaning but controls how other characters are stored or displayed. Characters are commonly stored as integers, and encoded using a character map.
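As a small illustration of characters being stored as integers, the following Ruby sketch maps characters to their code points and back. The helper names `code_point` and `character_for` are hypothetical, and UTF-8 is assumed as the character map.

```ruby
# Code point (integer) that the character map assigns to a one-character string.
def code_point(char)
  char.ord
end

# Character for a given code point, assuming UTF-8 as the encoding.
def character_for(code)
  code.chr(Encoding::UTF_8)
end

puts code_point("A")     # 65
puts code_point("\n")    # 10, the newline control character
puts character_for(97)   # a
```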
Integer
An integer is a data type that represents one element of a finite subset of mathematical integers. Integer, or integral, data types can be either unsigned (able to store only non-negative whole numbers) or signed (able to store either positive or negative whole numbers). The range of values that can be represented by an integer depends on the number of bits used to store the integer, whether or not it is a signed integer, and the encoding scheme (if it is signed). Typically, an integer has a minimum and maximum value, and can store any integer in the range between those values. The minimum value for unsigned integers is typically 0, and the maximum value is typically determined by the amount of memory used to store the integer. For example, an unsigned 8-bit number can store 2^8 (or 256) possible integral values, and would typically store any value from 0 to 255. More generally, an n-bit unsigned integer can store from 0 to (2^n)-1. For signed integers, modern computers use the two’s complement encoding scheme. This allows for a range of −2^(n−1) through 2^(n−1)−1. For example, an 8-bit signed integer could store any whole number in the range from -128 through +127.
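The range formulas above can be checked with a short Ruby sketch. The method names `unsigned_range`, `signed_range` and `to_signed` are hypothetical helpers written for this illustration; they are not part of any library.

```ruby
# Range of an n-bit unsigned integer: 0 .. 2^n - 1
def unsigned_range(bits)
  [0, 2**bits - 1]
end

# Range of an n-bit two's complement integer: -2^(n-1) .. 2^(n-1) - 1
def signed_range(bits)
  [-(2**(bits - 1)), 2**(bits - 1) - 1]
end

# Reinterpret an n-bit unsigned bit pattern as a two's complement value.
def to_signed(value, bits)
  value >= 2**(bits - 1) ? value - 2**bits : value
end

p unsigned_range(8)       # [0, 255]
p signed_range(8)         # [-128, 127]
puts to_signed(0xFF, 8)   # -1
```

The last line shows why the encoding matters: the same eight bits read as 255 when unsigned and as -1 in two's complement.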
Floating-Point Number
A floating point number is a data type used to represent real numbers over a large range of magnitudes with limited precision. In this representation, a number is stored as a fixed number of significant digits (the significand) scaled by an exponent, so the radix point can "float" relative to those digits.
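A consequence of the limited precision is that some decimal fractions cannot be stored exactly. The Ruby sketch below shows the classic case; `nearly_equal?` and its default tolerance are assumptions chosen for illustration, not a recommended universal comparison.

```ruby
# True when two Floats differ by less than a caller-chosen tolerance.
# The default tolerance is an arbitrary value picked for this example.
def nearly_equal?(a, b, tolerance = 1e-9)
  (a - b).abs < tolerance
end

sum = 0.1 + 0.2
puts(sum == 0.3)               # false: 0.1 and 0.2 have no exact binary representation
puts(nearly_equal?(sum, 0.3))  # true
```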
Fixed-Point number
A fixed-point number is a data type used to represent real numbers. Fixed-point numbers are called fixed-point because they have a set number of digits before and after a decimal mark. In this regard, fixed-point numbers are represented as an integer, but are scaled by a predetermined factor.
Fixed-point numbers are commonly used in microprocessors that do not have a floating-point unit, or in systems in which computational efficiency is critical. Fixed-point numbers can be treated as integers by an arithmetic logic unit (ALU) and scaled after a result is obtained, which can significantly lower the amount of time needed for a processor to obtain the result for some algorithms.
Implementing algorithms using fixed-point arithmetic requires great care because of the potential for information loss. Fixed-point arithmetic operations, multiplication in particular, have the potential to cause overflow. Algorithms must be written to ensure that each term of an equation has a similar range and that the result will not overflow.
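The scaling and overflow concerns described above can be sketched in Ruby. This is a minimal illustration that assumes a binary format with 8 fractional bits; all method names are hypothetical. Ruby integers never overflow on their own, so `fits_in?` only simulates the check a fixed-width register would impose.

```ruby
# Convert a real number to a scaled integer with the given number of fractional bits.
def to_fixed(x, fraction_bits = 8)
  (x * (1 << fraction_bits)).round
end

# Convert a scaled integer back to a Float.
def from_fixed(f, fraction_bits = 8)
  f.to_f / (1 << fraction_bits)
end

# The raw product carries twice the fractional bits, so shift it back down.
def fixed_mul(a, b, fraction_bits = 8)
  (a * b) >> fraction_bits
end

# Would this value fit in a signed two's complement register of the given width?
def fits_in?(value, bits)
  value >= -(1 << (bits - 1)) && value <= (1 << (bits - 1)) - 1
end

a = to_fixed(1.5)    # 384
b = to_fixed(2.25)   # 576
puts from_fixed(a + b)             # 3.75, addition works directly on the scaled integers
puts from_fixed(fixed_mul(a, b))   # 3.375
puts fits_in?(a * b, 16)           # false: the intermediate product exceeds 16 bits
```

Both operands and the final result fit comfortably in 16 bits, yet the intermediate product does not, which is the multiplication hazard mentioned above.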
Reference
A reference is a data type that enables a program to access another item in memory. A reference differs from other primitive data types in that it does not store data itself; instead it stores a value referring to another data object. References are commonly used to refer to objects of large non-primitive data types. References commonly store the memory address of the data that they are referring to. Accessing the data referred to by a reference is called dereferencing.
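Ruby variables hold references, so the difference between copying a reference and copying the data can be sketched as follows. The method name `shared_reference_demo` and the sample strings are made up for this example.

```ruby
# Shows that assignment copies a reference, while dup copies the data.
def shared_reference_demo
  original  = String.new("primitive")
  alias_ref = original       # second reference to the same object
  copy      = original.dup   # distinct object with equal contents

  alias_ref << " object"     # mutate the object through one reference

  {
    original: original,
    copy: copy,
    same_object: original.equal?(alias_ref),
    copy_is_same_object: original.equal?(copy)
  }
end

result = shared_reference_demo
puts result[:original]             # primitive object
puts result[:copy]                 # primitive
puts result[:same_object]          # true
puts result[:copy_is_same_object]  # false
```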
Primitive Data Types in C++
C++ is a statically-typed object oriented language. C++ is based on the C programming language, which is procedural, and adds support for object-oriented code.
These data types are defined in C++; the sizes shown are typical of common platforms rather than guaranteed by the language standard: [2]
Name | Size | Description | Range |
---|---|---|---|
void | N/A | the void data type is used to explicitly indicate that no type or value is present | N/A |
int | typically 32 bits (at least 16) | simple numerical type | See [[3]] |
float | typically 32 bits | single-precision floating point, IEEE 754 on most platforms | See [[4]] |
double | typically 64 bits | double-precision floating point, IEEE 754 on most platforms | See [[5]] |
bool | implementation-defined, typically 8 bits | boolean | false, true |
char | typically 8 bits | a char is a single character, commonly holding an ASCII code | ASCII defines 0x00 through 0x7F; an 8-bit char can hold 256 distinct values |
C++ supports pointers to all of the types listed in the table above, as well as to more complex data types (such as structs). A pointer in C++ is a data type that stores the memory address of some other data. A pointer type is declared with the * declarator, and the same symbol used as a unary operator dereferences the pointer. For example, a double* can point to the first element of an array of double-precision floating point numbers. C++ also supports function pointers, which hold the address of a function in memory. They are commonly used to implement callback functions. [6]
Primitive Data Types in Java
Java is a statically-typed object oriented programming language. Primitive types are defined in the language. Widening conversions between numeric types (for example, int to long) happen implicitly, while narrowing conversions (for example, double to int) require an explicit cast. Each primitive data type is declared using a keyword, which is also the name of the data type. These data types are defined in Java: [7]
Name | Size | Description | Range |
---|---|---|---|
byte | 8 bits | signed two's complement integer | -128 to 127 |
short | 16 bits | signed two's complement integer | -32,768 to 32,767 |
int | 32 bits | signed two's complement integer | -2,147,483,648 to 2,147,483,647 |
long | 64 bits | signed two's complement integer | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
float | 32 bits | single-precision IEEE 754 floating point | See [[8]] |
double | 64 bits | double-precision IEEE 754 floating point | See [[9]] |
boolean | not precisely defined (represents 1 bit of information) | boolean | false, true |
char | 16 bits | a char is a single 16-bit character encoded using Unicode | Unicode character \u0000 through unicode character \uffff |
Java also defines a String class, whose objects hold sequences of chars. The String class provides functionality commonly implemented using arrays of chars in other languages, such as C.
Java does not define an 'unsigned' keyword. The integral types byte, short, int and long are always signed, and char is the only unsigned integral type, with a range of 0 to 65,535.
Java can compare any two values of the same primitive data type directly with the == operator. Java also defines a class for each primitive data type. Most of these classes have the same name as the primitive with a capitalized first letter (e.g. Float instead of float); the exceptions are Integer for int and Character for char. These classes, called wrapper classes, provide a series of methods that can manipulate their associated primitive data type, as well as convert to and from other data types.
Primitive Data Types in C#
C# is a statically-typed object oriented programming language. Primitive types are defined in the language. As in Java, widening numeric conversions happen implicitly, while narrowing conversions require an explicit cast. Each primitive data type is declared using a keyword, which is also the name of the data type. C# has all of the data types that are available in Java, as well as some additional ones.
Similar to Java, C# defines a String class, whose objects hold sequences of chars. These data types are defined in C#: [10]
Name | .NET Class | Size | Description | Range |
---|---|---|---|---|
byte | Byte | 8 bits | unsigned integer | 0 to 255 |
sbyte | SByte | 8 bits | signed two's complement integer | -128 to 127 |
short | Int16 | 16 bits | signed two's complement integer | -32,768 to 32,767 |
ushort | UInt16 | 16 bits | unsigned integer | 0 to 65,535 |
int | Int32 | 32 bits | signed two's complement integer | -2,147,483,648 to 2,147,483,647 |
uint | UInt32 | 32 bits | unsigned integer | 0 to 4,294,967,295 |
long | Int64 | 64 bits | signed two's complement integer | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
ulong | UInt64 | 64 bits | unsigned integer | 0 to 18,446,744,073,709,551,615 |
float | Single | 32 bits | single-precision IEEE 754 floating point | -3.402823e38 to 3.402823e38 |
double | Double | 64 bits | double-precision IEEE 754 floating point | -1.79769313486232e308 to 1.79769313486232e308 |
bool | Boolean | 8 bits | boolean | false, true |
char | Char | 16 bits | a char is a single 16-bit character encoded using Unicode | Unicode character \u0000 through unicode character \uffff |
object | Object | N/A | Object is the base type of all other types | N/A |
string | String | N/A | String represents an immutable sequence of chars | N/A |
decimal | Decimal | 128 bits | Decimal is a high-precision decimal floating-point type with 28 to 29 significant digits | ±1.0 × 10^−28 to ±7.9 × 10^28 |
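Like Java, each primitive data type in C# has an associated type in the class library. Unlike Java's wrapper classes, however, each C# keyword is simply an alias for the corresponding .NET type (int is System.Int32, for example), so the keyword and the type name are interchangeable. These types provide methods for comparing values, as well as for converting to and from other similar types.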
Primitive Objects in Ruby
Ruby is a pure object oriented language, in contrast to languages such as Java or C#, which use a more hybrid approach. In Ruby, all data types are represented as objects. Ruby provides a number of built-in classes, but only some of them are basic building blocks for forming other types. The subset shown below lists the primitive objects that can be used for data representation and manipulation:
Name | Description | Range |
---|---|---|
TrueClass | Class whose single instance is true | true |
FalseClass | Class whose single instance is false | false |
Integer [11] | Abstract class that forms the basis for Fixnum and Bignum | See Fixnum and Bignum |
Fixnum [12] | Integer representations that fit in native machine word | Machine architecture dependent. -2^30 to 2^30-1 on 32-bit machines. |
Bignum [13] | Integer representations that do not fit in Fixnum width | Machine architecture dependent. Values outside the Fixnum range. |
Float [14] | Real numbers using double precision representation | Range of the platform's native double-precision type |
String [15] | Contains sequence of characters | No fixed limit; bounded by available memory |
One interesting observation from the above table is that Ruby does not have a Boolean class; instead it has a separate TrueClass and FalseClass [16].
puts true.class   # => TrueClass
puts false.class  # => FalseClass
Although types such as Array and Hash are also built-in, they are composed of elements that are themselves represented by one of the primitive types. Hence, we do not treat them as primitive objects in the traditional sense of the term. Each of the primitive objects listed above also provides certain convenience methods that are applicable to the underlying type. For example, the Fixnum, Bignum and Float types support arithmetic operations such as addition (+), subtraction (-), multiplication (*) and so on. As with all other classes in Ruby, users can add functionality to existing primitive objects by reopening classes. The amount of memory required to implement the primitive objects in Ruby is machine dependent in some cases.
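Reopening a class can be sketched as follows. The method name `double` is hypothetical. The example reopens Integer, the common parent of Fixnum and Bignum, so that it runs on both old and new interpreters; note that Ruby 2.4 and later unify Fixnum and Bignum into Integer, so the class names printed depend on the Ruby version in use.

```ruby
# Reopen the built-in Integer class to add a convenience method.
# `double` is a made-up name used only for this illustration.
class Integer
  def double
    self * 2
  end
end

small = 2**30 - 1   # fits in a machine word
large = 2**64       # does not fit in a 64-bit machine word

puts small.double   # 2147483646
puts large.double   # 36893488147419103232
puts large.class    # Bignum before Ruby 2.4, Integer on 2.4 and later
```

Arithmetic moves between the word-sized and arbitrary-precision representations automatically; the caller never converts between them.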
Merit Analysis of Primitive Types
This section deals with a brief analysis of the relative merits and demerits of primitive data types. While we focus on Java or Ruby for this purpose, most of these points are applicable across all object oriented languages.
Advantages
Primitive types in object oriented languages have certain advantages over their class object counterparts.
- Simplicity: Primitive types/objects provide users a simple mechanism of manipulating data without relying on additional objects to achieve the same functionality. Operations on primitive types are more intuitive.
- Efficiency: This advantage applies if the underlying primitive object definition is not modified (something that languages such as Ruby allow users to do). Because the representation in memory is designed to make the most efficient use of the underlying data type, using primitives can save time and space compared with using class objects to store the same data.
E.g. Java provides wrappers [17] for certain primitive types. There is a certain performance and space cost associated with these. So, to maximize efficiency, direct use of the primitive types would provide the most benefit.
- Ability to use built-in methods: Depending on the primitive type, languages such as Ruby provide methods that can be used specifically to probe or manipulate objects.
E.g. the String primitive object provides convenience methods such as upcase, which converts the entire string to upper case, and capitalize, which converts only the first character to upper case.
- Ease of testing for comparison: With primitive types, the equality testing operators such as == can be used. These essentially compare the values stored in the primitive types. Regular objects also offer the eql? method for testing equality. However, the following are not equivalent:
a = 10         # => 10
a == 10        # => true
a == 10.0      # => true
a.eql?(10.0)   # => false
The eql? call returns false because this method tests that both the value and the type are the same: 10 is of type Fixnum and 10.0 is of type Float. eql? can be overridden to behave like == for primitive objects if you wish to compare only the values, but that can have a negative impact on performance [18].
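The three levels of equality in Ruby (value, value plus type, and object identity) can be compared side by side. `equality_report` is a hypothetical helper written for this sketch.

```ruby
# Compare two objects by value (==), by value and type (eql?),
# and by object identity (equal?).
def equality_report(a, b)
  { value: a == b, value_and_type: a.eql?(b), identity: a.equal?(b) }
end

int_vs_float = equality_report(10, 10.0)
puts int_vs_float[:value]            # true
puts int_vs_float[:value_and_type]   # false

strings = equality_report(String.new("abc"), String.new("abc"))
puts strings[:value_and_type]        # true
puts strings[:identity]              # false: equal contents, two separate objects
```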
Disadvantages
- Lack of inheritance capability: The primitive data types in languages such as Java cannot be subclassed to create further subtypes.
- Unexpected results due to method overriding: There are certain examples such as [19], which show that overriding inbuilt methods such as == and eql? can lead to unexpected results.
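One way such surprises arise can be sketched in Ruby. The `Money` class below is a made-up example, not taken from the cited article: it overrides == but leaves eql? and hash at their defaults, so operators and hash-based collections disagree about whether two objects are equal.

```ruby
# Hypothetical value class that overrides == only.
class Money
  attr_reader :cents

  def initialize(cents)
    @cents = cents
  end

  def ==(other)
    other.is_a?(Money) && cents == other.cents
  end
end

a = Money.new(500)
b = Money.new(500)
prices = { a => "lunch" }

puts(a == b)               # true
puts(a.eql?(b))            # false: eql? still uses the identity-based default
puts(prices[b].inspect)    # nil: Hash lookup relies on hash and eql?, not ==
puts([a, b].uniq.size)     # 2
```

Defining eql? and hash consistently with == would make all four lines agree.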
Conclusion
Object oriented languages have varying levels of support for primitive data types and objects. Whether they are beneficial or not depends to a great extent on the application. If handled correctly, they can make object oriented programs more efficient. However, the user needs to be aware of the underlying representation of these types to handle any unexpected results.
References
- http://en.wikipedia.org/wiki/Primitive_data_type
- http://sparkcharts.sparknotes.com/cs/cplusplus/section2.php
- http://www.jk-technology.com/c/inttypes.html
- http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html
- http://newty.de/fpt/intro.html
- http://download.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
- http://msdn.microsoft.com/en-us/library/ms228360%28v=vs.80%29.aspx
- http://ruby-doc.org/docs/ProgrammingRuby/html/builtins.html
- http://blog.vishnuiyengar.com/2009/09/primitive-obsession-in-ruby-aka-not.html
- http://www.glenmccl.com/tip_016.htm
- http://www.skorks.com/2009/09/ruby-equality-and-object-comparison/