CSC/ECE 517 Fall 2012/ch1b 1w62 rb
Saas 3.10 - Databases And Migrations
A database<ref>http://en.wikipedia.org/wiki/Database</ref> is a coherent collection of data with inherent meaning; a random assortment of data is not a database. Data is organized to model relevant aspects of reality so that it supports the processes that require it. Data migration is the transfer of data between storage types, formats, or computer systems. It is usually performed programmatically so that the repetitive work is automated rather than done by hand.
Database
A database is nothing more than a collection of meaningful information. Databases come in several types, for example distributed databases, relational databases, and flat-file databases. A database can be as simple as a text file with a list of names or as complex as a large relational database management system.
Examples:
- Banking Systems where accounts are maintained and money must not disappear as a result of a system failure.
- Airline Reservation Systems where the plane details, the airport details and the customer details are maintained.
- Hotel Management Systems where the availability of rooms, the rates and the customer details are maintained.
Database Migration
In practice, every application has a database in the back end that stores all of its relevant data. We should not test the application against the production database, because it may contain valuable data such as customer information in a banking system, and editing that data is not a good idea. The solution Rails<ref>http://guides.rubyonrails.org/getting_started.html</ref> provides for this problem is to define three different environments: development, production and testing, each of which has a separate database of an appropriate type.
The development environment is what we use while developing the application. The production environment is the one used when the application is published in the real world. The testing environment is meant for testing tools, which automate the entire testing procedure. Since each environment has its own database, the problem that arises is that changes made in one database are not reflected in the others.
A programmer is also responsible for telling the other developers what changes have been made to the database, and someone has to keep track of which changes need to be run against production machines during deployment. The solution that Rails offers for this problem is migration. Migrations also work when the three environments use different database types. For example, we may use SQLite<ref>http://en.wikipedia.org/wiki/SQLite</ref> in the development environment and still migrate a production environment hosted on Heroku<ref>http://en.wikipedia.org/wiki/Heroku</ref>, which uses a different database. The migration source is portable, and the back end understands which operations to perform on its database.
In effect, Rails migrations are version control for databases. They are needed because a database change requires modifications to both code and data, so a source code version control system such as Subversion<ref>http://betterexplained.com/articles/a-visual-guide-to-version-control/</ref> or SourceSafe<ref>https://wiki.library.ucsf.edu/display/~128507@ucsf.edu/Source+Safe+vs.+Subversion</ref> is not sufficient on its own.
Creating A Migration
A migration is a subclass of ActiveRecord::Migration that implements two methods: ‘up’ and ‘down’. The ‘up’ method performs the required changes or transformations, while the ‘down’ method reverses them, i.e. rolls them back.
A migration can be created using the following command:
rails generate migration CreateCourse
Migration created:
class CreateCourse < ActiveRecord::Migration
  def up
    create_table :courses do |t|
      t.string :name
      t.text :description
      t.timestamps
    end
  end

  def down
    drop_table :courses
  end
end
The above migration, CreateCourse, has been created but not yet applied to the database. When applied, it adds a table called courses with a string column called name and a text column called description. A primary key column called id is also created by default, along with the timestamp columns created_at and updated_at, which ActiveRecord populates automatically. Reversing this migration simply drops the table.
Migrations can also be used to fix bad data in the database or generate new fields.
For Example:
class AddGradesToStudents < ActiveRecord::Migration
  def up
    change_table :students do |t|
      t.boolean :receive_grade, :default => false
    end
    # Existing students are considered to have a grade
    Student.update_all ["receive_grade = ?", true]
  end

  def down
    remove_column :students, :receive_grade
  end
end
The above migration adds a receive_grade column to the students table. We want the default value to be false for new students, but existing students are considered to have a grade, so we use the Student model to set the flag to true for existing students.
ActiveRecord<ref>http://api.rubyonrails.org/classes/ActiveRecord/Base.html</ref> provides methods that perform common data definition tasks in a database. A migration is an ordinary Ruby class, so you are not limited to these methods. For example, after adding a column you can write code to set the value of that column for existing records (using your models if necessary). The object yielded to the block of a method such as create_table or change_table represents the table.
Updating Migrations
If you have already run a migration, you cannot simply edit it and run it again: Rails considers the migration applied, so “rake db:migrate” does nothing. You must first roll the migration back, then edit it and run it again. Editing an existing migration is not recommended, especially if it has already been run on production systems; it is better to write a new migration that performs the required changes. Editing a newly generated migration that has not yet been committed to source control is relatively safe.
Migrations are stored as files in the db/migrate directory, one file per migration class. The file name has the form YYYYMMDDHHMMSS_create_course.rb: a UTC timestamp identifying the migration, followed by an underscore, followed by the name of the migration.
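The naming scheme can be sketched in plain Ruby. The helper below is hypothetical and is not how the Rails generator is implemented; it only illustrates how a UTC timestamp and the snake_case form of the class name combine into a file name.

```ruby
# Hypothetical helper (not part of Rails): builds a migration file name
# from a migration class name and a point in time.
def migration_filename(class_name, time = Time.now)
  # UTC timestamp in the YYYYMMDDHHMMSS form
  version = time.getutc.strftime("%Y%m%d%H%M%S")
  # CamelCase to snake_case, e.g. "CreateCourse" becomes "create_course"
  snake = class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  "#{version}_#{snake}.rb"
end

puts migration_filename("CreateCourse", Time.utc(2012, 10, 3, 14, 30, 0))
# prints 20121003143000_create_course.rb
```

Because the timestamp comes first, sorting the file names alphabetically also sorts the migrations chronologically.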
Anatomy of Migrations
Every migration is a subclass of the Rails class ActiveRecord::Migration. The class must contain at least two methods, up and down.
class CreateCourses < ActiveRecord::Migration
  def self.up
    #...
  end

  def self.down
    #...
  end
end
The ‘up’ method is used to apply the schema changes for this migration and the ‘down’ method is used to undo them. For example, if the ‘up’ method creates a table or adds a column, the ‘down’ method drops that table or removes that column.
class AddRoomNoToCourses < ActiveRecord::Migration
  def self.up
    # add_column takes the table name first, then the column and its type
    add_column :courses, :room_no, :integer
  end

  def self.down
    remove_column :courses, :room_no
  end
end
Relationship between Model and Migration
In Rails, a model<ref>http://www.tutorialspoint.com/ruby-on-rails/rails-active-records.htm</ref> maps itself to a database table. By convention, the table name is the plural form of the model’s class name: if we generate a model called Course, Rails automatically looks for a table called courses in the database. You can use the Rails generator to generate both the model and a corresponding migration. The following command:
rails generate model Course name:string description:text
will create a migration that looks like this:
class CreateCourses < ActiveRecord::Migration
  def change
    create_table :courses do |t|
      t.string :name
      t.text :description
      t.timestamps
    end
  end
end
Creating a Standalone Migration
If you are creating a migration for another purpose, such as changing an existing table, use the migration generator:
$ rails generate migration AddSemesterToCourse
This will create an empty but appropriately named migration:
class AddSemesterToCourse < ActiveRecord::Migration
  def change
  end
end
Applying Migration to Development
Since the CreateCourse migration has been created but not yet applied to the database, the following command is used to apply it to the development database:
rake db:migrate
Every Rails database has a table called schema_migrations, which is maintained by the migration code and has a version column. Each time a migration is applied successfully, a new row is added to this table. When you run rake db:migrate, the task first looks for the schema_migrations table and creates it if it does not exist. The migration code then examines all the migration files in db/migrate and skips any whose version number is already in the table. It applies the remaining migrations in turn, adding a row to schema_migrations for each one.
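The bookkeeping described above can be imitated in plain Ruby. This is a minimal sketch, not the Rails implementation: a Set stands in for the schema_migrations table, an array of made-up file names stands in for db/migrate, and the migrate method is hypothetical.

```ruby
require "set"

# Illustration only: mimics how pending migrations are selected and recorded.
# `files` stands in for db/migrate, `applied` for the schema_migrations table.
def migrate(files, applied)
  pending = files
    .map { |name| name[/\A\d+/] }                    # leading timestamp is the version
    .reject { |version| applied.include?(version) }  # skip versions already recorded
    .sort                                            # apply oldest first
  pending.each do |version|
    # A real migration would run its up (or change) method here.
    applied << version                               # record the applied version
  end
  pending
end

applied = Set.new(["20121001090000"])
files = [
  "20121003120000_add_semester_to_course.rb",
  "20121001090000_create_course.rb",
  "20121002100000_create_students.rb"
]

p migrate(files, applied)  # versions applied on the first run
p migrate(files, applied)  # a second run finds nothing pending
```

Running the method twice shows why rake db:migrate is safe to repeat: versions recorded by the first run are skipped by the second.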
Applying Migration to Production
The following command is used to apply migration to the production database:
heroku run rake db:migrate
In the above example the production database is hosted on Heroku.
Running Specific Migrations
If you need to run a specific migration up or down, the db:migrate:up and db:migrate:down tasks will do that when given the version.
For example,
rake db:migrate:up VERSION=20080906120000
The above command will run the up method from the 20080906120000 migration. These tasks still check whether the migration has already run, so for example db:migrate:up VERSION=20080906120000 will do nothing if Active Record believes that 20080906120000 has already been run.
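The "already run" check can be illustrated with a small in-memory tracker. The class and method names below are hypothetical and only imitate the behaviour described above; they are not Active Record's API.

```ruby
require "set"

# Illustrative tracker of applied versions (hypothetical, not Active Record).
class VersionTracker
  def initialize(applied = [])
    @applied = Set.new(applied)
  end

  # Imitates db:migrate:up VERSION=...: does nothing if already recorded.
  def up(version)
    return :skipped if @applied.include?(version)
    @applied << version
    :migrated
  end

  # Imitates db:migrate:down VERSION=...: does nothing if not recorded.
  def down(version)
    return :skipped unless @applied.include?(version)
    @applied.delete(version)
    :reverted
  end
end

tracker = VersionTracker.new(["20080906120000"])
p tracker.up("20080906120000")    # => :skipped
p tracker.down("20080906120000")  # => :reverted
p tracker.up("20080906120000")    # => :migrated
```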
Rolling Back a Migration
The following command is used to roll back the last migration:
rake db:rollback
A rollback is useful when you have made a mistake: instead of tracking down the version number of the previous migration, you can simply roll back, make your changes, and run the migration again.
If you need to roll back several migrations, use the STEP parameter. For example,
rake db:rollback STEP=2
The above command will roll back the last 2 migrations. The db:migrate:redo task is a shortcut for rolling back and then migrating again. If you need to redo several migrations, use the STEP parameter. For example, in order to redo the last 4 migrations, the command is
rake db:migrate:redo STEP=4
“rake db:reset” is the command for resetting the database. It drops the database, recreates it, and loads the current schema into it.
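The effect of the STEP parameter on rollback and redo can be sketched with an ordered list of applied versions. The two methods below are hypothetical stand-ins for the rake tasks, and the version numbers are made up for illustration.

```ruby
# Imitates rake db:rollback STEP=n: reverts the n most recent versions.
# `applied` is an array of versions, oldest first.
def rollback(applied, step = 1)
  reverted = applied.pop(step)  # removes the last `step` entries
  reverted.reverse              # the newest migration is reverted first
end

# Imitates rake db:migrate:redo STEP=n: roll back, then migrate again.
def redo_migrations(applied, step = 1)
  reverted = rollback(applied, step)
  reverted.reverse.each { |version| applied << version }  # re-apply oldest first
  reverted
end

applied = ["20121001090000", "20121002100000", "20121003120000"]
p rollback(applied, 2)  # => ["20121003120000", "20121002100000"]
p applied               # => ["20121001090000"]
```

After a redo, the list of applied versions is the same as before; only the migrations themselves have been run down and up again.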
Problems with Migration
One major problem with migrations is that most databases do not support rolling back create table or alter table statements, i.e. DDL statements in general. Let’s consider an example where a migration tries to create two tables.
class Example < ActiveRecord::Migration
  def self.up
    create_table :first do ... end
    create_table :second do ... end
  end

  def self.down
    drop_table :second
    drop_table :first
  end
end
In the above example the up method creates the tables first and second, and the down method drops them. Now consider a situation where there is a problem creating the second table. The database will then contain the first table but not the second. If you try to roll back the migration, it won’t work: the original migration failed, so the schema version in the database was never updated and there is nothing to roll back. One solution is to change the schema information manually and drop the table first. In these cases, however, it is recommended to simply drop the whole database, create it again, and apply the migrations to bring it back to the original state.
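The failure scenario can be reproduced with an in-memory stand-in for a database that cannot undo DDL. Everything in this sketch is hypothetical (the FakeDatabase class, the broken: flag, the version number); it only demonstrates why the first table survives while no version is recorded.

```ruby
# In-memory stand-in for a database without transactional DDL (illustration only).
class FakeDatabase
  attr_reader :tables, :versions

  def initialize
    @tables = []    # tables that currently exist
    @versions = []  # stands in for the schema_migrations table
  end

  # `broken: true` simulates an error while creating the table.
  def create_table(name, broken: false)
    raise "could not create #{name}" if broken
    @tables << name
  end
end

def run_migration(db, version)
  db.create_table(:first)
  db.create_table(:second, broken: true)  # simulated failure
  db.versions << version                  # never reached when the step above fails
  :ok
rescue RuntimeError => e
  "failed: #{e.message}"
end

db = FakeDatabase.new
puts run_migration(db, "20121003120000")  # prints failed: could not create second
p db.tables                               # => [:first]
p db.versions                             # => []
```

The leftover first table with an empty version list is exactly the state from which a rollback has nothing to undo.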
Advantages of Migration
- You can identify each migration and know when it has taken place.
- Some migrations can also be rolled back, and we can specify what the rollback procedure is.
- Migrations can be managed with version control.
- Automation: tasks are automated, which makes them reliably repeatable. For example, in Ruby on Rails we use Bundler instead of installing all gems manually. In short, specify what needs to be done and automate it.
Disadvantages of Migration
- One drawback of Rails migrations is that all migrations occur at the database level, not the table level.
- The whole discussion about migrations suggests that they are dangerous to use on a production database. One should back up the database first and only then run migrations on it.
- Most databases do not support rolling back data definition language (DDL)<ref>http://en.wikipedia.org/wiki/Data_definition_language</ref> statements.
Conclusion
In a nutshell, database migration is an efficient way to handle the discrepancies that occur between databases, i.e. changes made in one database not being reflected in another. It is a convenient way to alter the database in a structured and organized manner. Rails migrations are similar to version control for databases and, unlike SQL scripts, they are database independent.
References
<references/>
Additional Reading
- http://guides.rubyonrails.org/migrations.html#anatomy-of-a-migration
- http://www.ibm.com/developerworks/java/library/j-cb08156/index.html
- http://jacqueschirag.wordpress.com/2007/08/12/rants-about-rails-database-migrations
- http://www.oracle.com/technetwork/articles/kern-rails-migrations-100756.html
- Agile Web Development with Rails, Fourth Edition, Sam Ruby, Dave Thomas, David Heinemeier Hansson.