CSC/ECE 517 Fall 2013/ch1 1w44 s

From Expertiza_Wiki
Revision as of 23:42, 7 October 2013 by Sagarn (talk | contribs)

Non-relational Databases in Rails Applications

Active Record is central to Rails: it lets programs treat records in relational databases as objects. It employs Object-Relational Mapping (ORM), a technique that connects the rich objects of an application to tables in a relational database management system. While Active Record is efficient at what it does, the complexities of ORM have motivated a different approach to databases in Rails: moving to other database models, namely object-oriented databases and NoSQL databases.

Overview

Rails, as a web development framework, and Ruby, as a programming language, can be used with a choice of operating systems (Linux, Mac OS X, Windows), databases (SQLite, MySQL, PostgreSQL, and others), and web servers (Apache, Nginx, and others). The Rails stack can also include a variety of software libraries (gems) that add features to a website or make development easier. Sometimes the choice of components in a stack is driven by the requirements of an application.

Rails uses the Active Record pattern which describes the mapping of an object instance to a row in a relational database table, using accessor methods to retrieve columns/properties, and the ability to create, update, read, and delete entities from the database.

Active Record is the layer of the system responsible for representing business data and logic. It facilitates the creation and use of business objects whose data requires persistent storage in a database. It employs Object-Relational Mapping (ORM), a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be stored in and retrieved from a database without writing SQL statements directly and with less overall database access code.

Active Record gives us several mechanisms, the most important being the ability to:

Represent models and their data.
Represent associations between these models.
Represent inheritance hierarchies through related models.
Validate models before they get persisted to the database.
Perform database operations in an object-oriented fashion.

Object-relational impedance mismatch

The relational model on which Active Record relies takes data and separates it into many interrelated tables that contain rows and columns. Tables reference each other through foreign keys, which are also stored in columns. When looking up data, the desired information needs to be collected from many tables (often hundreds in today’s enterprise applications) and combined before it can be provided to the application. Similarly, when writing data, the write needs to be coordinated and performed on many tables.
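The lookup cost described above can be illustrated with a minimal plain-Ruby sketch. The two "tables" and the helper story_with_author are hypothetical stand-ins (arrays of hashes rather than a real database); they only show how a foreign key has to be followed to reassemble one piece of information.

```ruby
# Two hypothetical "tables", each an array of rows (hashes).
USERS = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Matz" }
].freeze

STORIES = [
  { id: 10, title: "MongoDB on Rails", user_id: 2 },
  { id: 11, title: "CouchDB on Rails", user_id: 1 }
].freeze

# Combine a story row with the user row its foreign key points to,
# roughly what a SQL JOIN does on behalf of the application.
def story_with_author(story_id)
  story = STORIES.find { |row| row[:id] == story_id }
  return nil unless story

  author = USERS.find { |row| row[:id] == story[:user_id] }
  { title: story[:title], author: author && author[:name] }
end

p story_with_author(10)
```

Every additional related table adds another lookup of this kind, which is the cost an ORM hides but does not remove.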

The Active Record pattern has numerous limitations, collectively referred to as the object-relational impedance mismatch. Some of these technical difficulties are structural. In OO programming, objects may be composed of other objects. The Active Record pattern maps these sub-objects to separate tables, which raises issues concerning the representation of relationships and encapsulated data. Active Record uses macros to create relationships between objects and single table inheritance to represent inheritance.

Active Record is a workaround for the object-relational impedance mismatch, but only under the assumption that the datastore is relational. It is also, fundamentally, a hack.

Non-relational Databases

Object Database

An object database, also called an object-oriented database management system (OODBMS), is a database management system (DBMS) that allows information to be represented in the form of objects as used in object-oriented programming. OODBMSs provide an integrated application development environment by joining object-oriented programming with database technology. They support object-oriented programming concepts such as encapsulation, polymorphism and inheritance, as well as database management concepts such as atomicity, consistency, isolation and durability. Since both the programming language and the OODBMS use the same object-oriented model, programmers can easily maintain consistency between the two environments.

Comparing RDBMS with Object Database

Criteria | RDBMS | OODBMS
Object-oriented programming support | Poor | Extensive
Ease of use | Table schemas easy to understand for everyone | Good for programmers only
Ease in development | Independent from application | Objects are a natural way to model, can incorporate types and relationships
Extensibility | Low | High
Data relationships | Hard to model | Easy to handle
Maturity | Very mature | Relatively mature

No-SQL Databases

Comparing No-SQL DB with RDBMS

Relational and NoSQL database models are very different. For example, a document-oriented database takes the data you want to store and aggregates it into documents using the JSON format. Each JSON document can be thought of as an object to be used by your application. A JSON document might, for example, take all the data stored in a row that spans 20 tables of a relational database and aggregate it into a single document/object. Aggregating this information may lead to duplication of information, but since storage is no longer cost-prohibitive, the resulting data model flexibility, the ease of efficiently distributing the resulting documents, and the read and write performance improvements make it an easy trade-off for web-based applications.
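As a rough illustration of such aggregation, the sketch below builds one document that embeds data a relational design would spread over several tables, then serializes it with Ruby's standard json library. The field names and the helper round_trip are assumptions made for this example only.

```ruby
require "json"

# A hypothetical story aggregated into one document: the author and
# the comments are embedded instead of living in separate tables.
STORY_DOCUMENT = {
  "title"    => "MongoDB on Rails",
  "author"   => { "name" => "Matz" },
  "comments" => [
    { "body" => "Revelatory! Loved it!", "username" => "Matz" }
  ]
}.freeze

# Serialize the whole aggregate as one JSON document and read it back.
def round_trip(document)
  JSON.parse(JSON.generate(document))
end

restored = round_trip(STORY_DOCUMENT)
puts restored["comments"].first["body"]
```

The application gets the complete object back in a single read, at the price of duplicating the author's name in every document that embeds it.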

Document Object Model equivalent to Relational tables

Another major difference is that relational technologies have rigid schemas. Relational technology requires strict definition of a schema prior to storing any data into a database. Changing the schema once data is inserted is a big deal, extremely disruptive and frequently avoided. In comparison, document databases are schemaless, allowing you to freely add fields to JSON documents without having to first define changes. The format of the data being inserted can be changed at any time, without application disruption.
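A small plain-Ruby sketch of what this flexibility means for application code: documents in one collection may differ in their fields, so the reader decides how to treat a missing one. The COURSES data and the tags_for helper are hypothetical.

```ruby
# Hypothetical documents from one collection; the second one gained a
# "tags" field later, without any schema change or migration.
COURSES = [
  { "name" => "CSC 517", "credits" => 3 },
  { "name" => "ECE 517", "credits" => 3, "tags" => ["ruby", "rails"] }
].freeze

# Application code decides what a missing field means; here, no tags.
def tags_for(document)
  document.fetch("tags", [])
end

COURSES.each { |doc| puts "#{doc['name']}: #{tags_for(doc).join(', ')}" }
```

The trade-off is that checks a relational schema would enforce centrally move into the application.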

Examples in Rails Apps

MongoDB

MongoDB is a document database that provides high performance, high availability, and easy scalability. Key features:

  • Flexibility

MongoDB stores data in JSON documents (which are serialized to BSON). JSON provides a rich data model that seamlessly maps to native programming language types, and the dynamic schema makes it easier to evolve your data model than with a system with enforced schemas such as an RDBMS.

  • Power

MongoDB provides a lot of the features of a traditional RDBMS such as secondary indexes, dynamic queries, sorting, rich updates, upserts (update if document exists, insert if it doesn’t), and easy aggregation. This gives you the breadth of functionality that you are used to from an RDBMS, with the flexibility and scaling capability that the non-relational model allows.

  • Speed/Scaling

By keeping related data together in documents, queries can be much faster than in a relational database where related data is separated into multiple tables and then needs to be joined later. MongoDB also makes it easy to scale out your database. Autosharding allows you to scale your cluster linearly by adding more machines. It is possible to increase capacity without any downtime, which is very important on the web when load can increase suddenly and bringing down the website for extended maintenance can cost your business large amounts of revenue.

  • Ease of use

MongoDB works hard to be very easy to install, configure, maintain, and use. To this end, MongoDB provides few configuration options, and instead tries to automatically do the “right thing” whenever possible. This means that MongoDB works right out of the box, and you can dive right into developing your application, instead of spending a lot of time fine-tuning obscure database configurations.
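The upsert behaviour mentioned under "Power" (update if the document exists, insert if it doesn’t) can be sketched in plain Ruby. TinyCollection below is a hypothetical in-memory stand-in used only to show the semantics; it is not the MongoDB driver API.

```ruby
# A hypothetical in-memory stand-in for a collection, keyed by _id.
# It only illustrates upsert semantics; it is not the MongoDB driver API.
class TinyCollection
  def initialize
    @documents = {}
  end

  # Update the document if the id exists, insert it otherwise.
  def upsert(id, fields)
    existing = @documents[id] || { "_id" => id }
    @documents[id] = existing.merge(fields)
  end

  def find(id)
    @documents[id]
  end

  def count
    @documents.size
  end
end

items = TinyCollection.new
items.upsert("pencil", "qty" => 500)                   # no such id yet: inserts
items.upsert("pencil", "qty" => 450, "type" => "no.2") # id exists: updates
p items.find("pencil")
```

The caller does not need to check for existence first, which removes a read and a race between the check and the write.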

A MongoDB deployment hosts a number of databases. A database holds a set of collections. A collection holds a set of documents. A document is a set of key-value pairs. Documents have dynamic schema. Dynamic schema means that documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection’s documents may hold different types of data.

MongoDB stores all data in documents, which are JSON-style data structures composed of field-and-value pairs:

{ "item": "pencil", "qty": 500, "type": "no.2" }

MongoDB stores documents on disk in the BSON serialization format. BSON is a binary representation of JSON documents, though it contains more data types than JSON does.

MongoMapper and Mongoid are the two leading gems that make it possible to use MongoDB as a datastore with Rails. MongoMapper is a simple ORM for MongoDB. Mongoid’s goal is to provide an API familiar to Active Record users, while still leveraging MongoDB’s schema flexibility, document design, atomic modifiers, and rich query interface.

MongoDB in Rails

First, to create a new Rails app using MongoDB, we need to create it without Active Record and install the necessary gems outlined above.

> rails new my_app --skip-active-record
> cd my_app
> gem install mongo
> gem install bson
> gem install bson_ext
> gem install mongo_mapper
> gem install mongoid
> rails server

Rails model files change in three ways: the classes no longer inherit from ActiveRecord::Base, they include the MongoMapper::Document module, and they define the schema in the model file itself. Database migrations are not necessary.

class Story
 include MongoMapper::Document

 key :title,     String
 key :url,       String
 key :voters,    Array
 key :votes,     Integer, :default => 0

 # Cached values.
 key :comment_count, Integer, :default => 0
 key :username,      String

 # Note this: ids are of class ObjectId.
 key :user_id,   ObjectId
 timestamps!

 # Relationships.
 belongs_to :user

 # Validations.
 validates_presence_of :title, :url, :user_id
end

Additionally, a number of configuration options are available, such as Mongoid’s allow_dynamic_fields, which allows you to define attributes on an object that aren’t in the model file’s schema. You can then add logic to the model file if something different needs to happen depending on the presence or absence of such a field.

For fast lookups, we can create an index on a field that is queried often, here voters. In the MongoDB shell:

db.courses.ensureIndex('voters');

To find all the Courses by name:

Course.all(:conditions => {:name => @subject.name})

In a relational database, comments are usually given their own table, related by foreign key to some parent table. This approach is occasionally necessary in MongoDB; however, it’s always best to try to embed first, as this will achieve greater query efficiency.

  • Linear Comments

class Story
  include MongoMapper::Document
  many :comments
end

class Comment
  include MongoMapper::EmbeddedDocument
  key :body, String

  belongs_to :story
end

If we were using the Ruby driver alone, we could save our structure like so:

@stories  = @db.collection('stories')
@document = {:title => "MongoDB on Rails",
             :comments => [{:body     => "Revelatory! Loved it!",
                            :username => "Matz"
                           }
                          ]
            }
@stories.save(@document)

  • Nested Comments

To build threaded comments:

@stories  = @db.collection('stories')
@document = {:title => "MongoDB on Rails",
             :comments => [{:body     => "Revelatory! Loved it!",
                            :username => "Matz",
                            :comments => [{:body     => "Agreed.",
                                           :username => "rubydev29"
                                          }
                                         ]
                           }
                          ]
            }
@stories.save(@document)
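Reading such a threaded document back usually means walking the nested comments recursively. The following is a minimal sketch in plain Ruby, assuming the document has been loaded as a hash with string keys; the helper flatten_comments is hypothetical and not part of MongoMapper or the Ruby driver.

```ruby
# The threaded document from above, written with string keys.
THREADED_STORY = {
  "title"    => "MongoDB on Rails",
  "comments" => [
    { "body" => "Revelatory! Loved it!", "username" => "Matz",
      "comments" => [
        { "body" => "Agreed.", "username" => "rubydev29" }
      ] }
  ]
}.freeze

# Walk the nested comments depth-first and return [depth, username, body]
# triples, which is enough to render an indented thread.
def flatten_comments(comments, depth = 0)
  comments.flat_map do |comment|
    [[depth, comment["username"], comment["body"]]] +
      flatten_comments(comment.fetch("comments", []), depth + 1)
  end
end

flatten_comments(THREADED_STORY["comments"]).each do |depth, user, body|
  puts "#{'  ' * depth}#{user}: #{body}"
end
```

Because the whole thread lives in one document, this walk needs no further queries once the story has been fetched.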

We can also represent comments as their own collection. Relative to the other options, this incurs a small performance penalty while granting us the greatest flexibility. The tree structure can be represented by storing the unique path for each leaf.

class Comment
  include MongoMapper::Document

  key :body,       String
  key :depth,      Integer, :default => 0
  key :path,       String,  :default => ""

  # Note: we're intentionally storing parent_id as a string
  key :parent_id,  String
  key :story_id,   ObjectId
  timestamps!

  # Relationships.
  belongs_to :story

  # Callbacks.
  after_create :set_path

  private

  # Store the comment's path.
  def set_path
  ....
    end
    save
  end
end
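The body of set_path is elided above. As an illustration of the general idea of storing a path per comment, here is a plain-Ruby sketch under an assumed convention: a comment's path is its parent's path followed by the parent's id, joined with ":". The PathComment struct and build_comment helper are hypothetical and are not the elided MongoMapper code.

```ruby
# A hypothetical plain-Ruby comment, standing in for the MongoMapper model.
PathComment = Struct.new(:id, :body, :parent, :depth, :path)

# Assumed convention: a comment's path is its parent's path plus the
# parent's id, joined with ":"; top-level comments have an empty path.
def build_comment(id, body, parent = nil)
  if parent
    path = [parent.path, parent.id].reject(&:empty?).join(":")
    PathComment.new(id, body, parent, parent.depth + 1, path)
  else
    PathComment.new(id, body, nil, 0, "")
  end
end

root  = build_comment("c1", "Revelatory! Loved it!")
reply = build_comment("c2", "Agreed.", root)
deep  = build_comment("c3", "Same here.", reply)
puts "#{deep.depth} #{deep.path}"
```

With paths stored this way, all descendants of a comment share a common path prefix, so a whole subtree can be selected with a prefix match and ordered by path.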

CouchDB

CouchDB is often categorized as a “NoSQL” database. A CouchDB database lacks a schema, or rigid pre-defined data structures such as tables. A CouchDB server hosts named databases, which store documents. Data in CouchDB is stored as JSON documents. The structure of the data, or documents, can change dynamically to accommodate evolving needs. Each document is uniquely named in the database, and CouchDB provides a RESTful HTTP API for reading and updating (add, edit, delete) database documents. Documents are the primary unit of data in CouchDB and consist of any number of fields and attachments. Documents also include metadata that’s maintained by the database system. Document fields are uniquely named and contain values of varying types (text, number, boolean, lists, etc.), and there is no set limit to text size or element count. Users can send arbitrary JSON objects to CouchDB and it will store them, with no setup and no schemas.


Key Characteristics

  • Documents

A CouchDB document is a JSON object that consists of named fields. Field values may be strings, numbers, dates, or even ordered lists and associative maps. An example of a document would be:

{
    "Subject": "Foo Bar",
    "Author": "John Doe",
    "PostedDate": "10/7/2013",
    "Tags": ["foo", "bar"],
    "Body": "Lorem Ipsum..."
}

A CouchDB database is a flat collection of these documents. Each document is identified by a unique ID.

  • Views

To address the problem of adding structure back to semi-structured data, CouchDB integrates a view model that uses JavaScript for description. Views are the method of aggregating and reporting on the documents in a database, and are built on demand to aggregate, join and report on database documents. Views are built dynamically and don’t affect the underlying documents; you can have as many different view representations of the same data as you like. Incremental updates to documents do not require full re-indexing of views.

  • Schema Free

Unlike SQL databases, which are designed to store and report on highly structured, interrelated data, CouchDB is designed to store and report on large amounts of semi-structured, document oriented data. CouchDB greatly simplifies the development of document oriented applications, such as collaborative web applications.

In an SQL database, the schema and storage of the existing data must be updated as needs evolve. With CouchDB, no schema is required, so new document types with new meaning can be safely added alongside the old. For applications that require robust validation of new documents, custom validation functions are possible. The view engine is designed to easily handle new document types and disparate but similar documents.

  • Distributed

CouchDB is a peer based distributed database system. Any number of CouchDB hosts (servers and offline-clients) can have independent "replica copies" of the same database, where applications have full database interactivity (query, add, edit, delete). When back online or on a schedule, database changes can be replicated bi-directionally.

CouchDB has built-in conflict detection and management and the replication process is incremental and fast, copying only documents changed since the previous replication. Most applications require no special planning to take advantage of distributed updates and replication.
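The view model described under "Views" follows a map and reduce shape: a map step emits key/value pairs per document, and a reduce step combines the values per key. CouchDB views themselves are written in JavaScript; the plain-Ruby sketch below only imitates that shape in memory, and the POSTS data and both helper names are assumptions for illustration.

```ruby
# Hypothetical documents, shaped like the example document above.
POSTS = [
  { "_id" => "p1", "Author" => "John Doe", "Tags" => ["foo", "bar"] },
  { "_id" => "p2", "Author" => "Jane Roe", "Tags" => ["foo"] }
].freeze

# Map step: emit one [key, value] pair per tag of a document.
def map_tags(document)
  document.fetch("Tags", []).map { |tag| [tag, 1] }
end

# Reduce step: sum the emitted values per key.
def tag_counts(documents)
  documents.flat_map { |doc| map_tags(doc) }
           .group_by(&:first)
           .transform_values { |pairs| pairs.sum { |_, value| value } }
end

p tag_counts(POSTS)
```

The documents are left untouched; the view is a derived result, which is why several different views over the same data can coexist.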

CouchDB on Rails

First, install CouchDB:

sudo apt-get install couchdb -y

Alternatively, use one of the precompiled binaries available on the CouchDB website.

After installation, check that everything works with a curl request:

$ curl http://localhost:5984

{"couchdb":"Welcome","version":"1.1.1"}

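Next, install the CouchRest Model gem for Ruby, which provides the CouchRest::Model::Base class used below and depends on the couchrest gem:

gem install couchrest_model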

To create a new model, we need to define a class that inherits from the CouchRest::Model::Base class in much the same way as an ActiveRecord model. Unlike a regular database mapper, CouchRest Model requires you to define the properties or attributes your model needs to use directly in the class. There are no migrations or automatic column detection in a Document Database.

class Person < CouchRest::Model::Base
  use_database 'people'
  property :first_name, String
  property :last_name, String
  timestamps!
end

CouchDB is accessed via HTTP requests, so there is no requirement to create a binary connection to a specific database. The use_database method tells the model which database it should use. This can be provided as a CouchRest::Database object or as a simple string that will be concatenated to the connection prefix. A special property macro called timestamps! is available; it creates the created_at and updated_at accessors.
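Since everything goes over HTTP, a document can in principle be stored without any mapper at all. The sketch below uses only Ruby's standard net/http and json libraries to build a PUT request for a document in the people database. The server address, the document id "john" and the helper build_put_request are assumptions for illustration, and the request is built but deliberately not sent.

```ruby
require "json"
require "net/http"
require "uri"

# Assumed local server address, matching the curl check earlier.
COUCH_URL = "http://localhost:5984"

# Build (but do not send) the PUT request that would store a document
# under the given id in the given database.
def build_put_request(database, id, document)
  uri = URI.join(COUCH_URL, "/#{database}/#{id}")
  request = Net::HTTP::Put.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(document)
  request
end

request = build_put_request("people", "john", "first_name" => "John")
puts "#{request.method} #{request.path}"
# With a CouchDB server running, the request could be sent with:
#   Net::HTTP.start("localhost", 5984) { |http| http.request(request) }
```

A mapper such as CouchRest Model wraps requests of this kind and adds typed properties, defaults and validations on top.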


Summary

Ruby developers should follow certain criteria or guidelines during software development. Coding standards are a set of rules, guidelines and regulations on how to write code; they help programmers and developers quickly read and understand source code that conforms to the style, and help avoid introducing misunderstandings and faults.

In Ruby development in particular, coding standards are extremely important, so Ruby developers should give them due weight. These standards bring greater uniformity and consistency to code written by different programmers. The result is code that is simple to understand and maintain, which reduces the project’s overall expenses.

Some of the benefits of using coding standards are:

  • Easy to understand and maintain
  • Boost the code’s readability
  • Maintainable applications
  • Eradicates complexity
  • Separate documents look more appropriate
