CSC/ECE 517 Spring 2015/ch1a 3 RF
Heroku vs OpenShift vs Digital Ocean
PaaS and Cloud Hosting
Platform as a Service (PaaS) is a service that lets developers host web applications without the hassle of setting up the entire infrastructure. This infrastructure usually includes one or more servers for hosting, load balancers to manage client requests and traffic, a cache, and a database. All of these layers require know-how and time, both for setup and installation and for maintenance. With PaaS, the developer doesn't have to worry about any of the setup or maintenance required to run a successful web service and can instead focus on building the application. Infrastructure management is taken care of by the PaaS provider.
Cloud hosting is the provision of virtual servers in the cloud. It is also referred to as Infrastructure as a Service (IaaS). The service provides vanilla virtual servers with root access, so you can install any packages you need. A server can be used as a load balancer, database server, cache server, or whatever else the application requires.
Heroku, OpenShift, Digital Ocean
Heroku, OpenShift and Digital Ocean are some of the many options for hosting and/or managing your web applications. Below we describe in detail how each of them works, and finish with a comparison along with recommendations based on the type of application you're trying to run and your background.
Heroku
What is it
Heroku is a company that provides what is commonly referred to as Platform as a Service (PaaS). PaaS essentially means that Heroku provides and maintains everything one would ordinarily have to set up to host a website. It turns hosting and managing a web service into a very simple process that requires only a few lines of configuration and a git push to get going.
How does it work
Heroku lets you deploy applications written in a multitude of languages, including Ruby, Node.js, Python and PHP. First you tell Heroku which parts of your application are runnable; this is specified in a Procfile. For established frameworks like Ruby on Rails, Heroku can figure out what it needs to execute (‘rails server’). You then deploy the application to Heroku using git, which requires a single push command. Heroku automatically retrieves any dependencies required to build the application and assembles the code, any generated assets and the dependencies into a bundle called a slug. A slug contains everything that the Heroku nodes need to run the application.
Heroku runs applications using dynos. Dynos are virtual Unix containers that provide the environment required to run an application. They are preloaded with the slug and run the command specified in the Procfile (in the case of Rails, ‘rails server’). To get the most throughput possible from the dynos, all requests first go through Heroku’s routing mesh. This routing system decides which dyno handles each request and makes sure the workload is spread across all dynos at all times.
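To make the Procfile idea concrete, here is a minimal sketch that reads Procfile text and lists the process types it declares. It assumes each line has the form "<process type>: <command>". The function name parse_procfile, the accepted characters in a process type, and the sample contents are illustrative assumptions, not part of Heroku's tooling.

```python
import re

# Assumed line format: "<process type>: <command>".
PROCFILE_LINE = re.compile(r"^([A-Za-z0-9_-]+):\s*(.+)$")


def parse_procfile(text):
    """Return a dict mapping each process type to the command that starts it."""
    processes = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        match = PROCFILE_LINE.match(line)
        if match is None:
            raise ValueError("not a valid Procfile line: %r" % raw_line)
        processes[match.group(1)] = match.group(2).strip()
    return processes


# Hypothetical Procfile for a Rails app with one background worker.
SAMPLE_PROCFILE = """\
web: bundle exec rails server -p $PORT
worker: bundle exec rake jobs:work
"""

if __name__ == "__main__":
    for name, command in parse_procfile(SAMPLE_PROCFILE).items():
        print(name, "->", command)
```

Each key of the returned dict corresponds to a kind of process that dynos can run; the web entry is the one that receives HTTP traffic.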
What does it provide
Hosting
Heroku is hosted on Amazon Web Services (AWS) servers. Heroku hosting is available in two locations: US East and EU West.
Load Balancing
The router spreads all HTTP requests across web dynos, effectively balancing the server load.
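As a rough illustration of spreading requests across dynos, the sketch below assigns each incoming request to a randomly chosen dyno and counts how many requests each one handled. Random selection is only one simple policy, chosen here as an assumption; this is not a description of Heroku's actual router, and the dyno names and the function route_requests are made up for the example.

```python
import random
from collections import Counter


def route_requests(request_count, dyno_names, rng=None):
    """Send each request to a randomly chosen dyno and count the results."""
    if not dyno_names:
        raise ValueError("at least one dyno is required")
    if rng is None:
        rng = random.Random()
    handled = Counter()
    for name in dyno_names:
        handled[name] = 0  # make sure idle dynos appear in the result
    for _ in range(request_count):
        handled[rng.choice(dyno_names)] += 1
    return handled


if __name__ == "__main__":
    # Hypothetical dyno names; a fixed seed keeps the run repeatable.
    print(route_requests(9000, ["web.1", "web.2", "web.3"], random.Random(42)))
```

With enough requests, each dyno ends up with roughly the same share of the load, which is the effect the router is meant to achieve.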
Database
Heroku provides a PostgreSQL database. The database is accessible from all languages supported by the Heroku platform.
Logging
Logging is an important part of running a web application. Heroku has a robust logging service that collects events from the app, from the Heroku system (error pages, restarting processes, waking up a dyno) and from the Heroku API (deploying new code, switching to maintenance mode). You can log from anywhere in your application by writing to standard out or standard error; for example, a puts call in Ruby is written to the log. Fetching the logs is as easy as running a command (heroku logs), which returns the last 100 log lines by default. Each line includes a timestamp, the source of the log (app or heroku), the dyno that wrote the log, and the message itself.
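The four parts of a log line described above can be pulled apart with a short parser. This is a sketch that assumes lines look like "timestamp source[dyno]: message"; the sample lines are invented for illustration, and parse_log_line is a hypothetical helper rather than anything shipped by Heroku.

```python
import re
from collections import namedtuple

LogLine = namedtuple("LogLine", "timestamp source dyno message")

# Assumed layout: "<timestamp> <source>[<dyno>]: <message>"
LOG_PATTERN = re.compile(r"^(\S+) (\w+)\[([^\]]+)\]: (.*)$")


def parse_log_line(line):
    """Split one log line into its four fields, or return None if it does not fit."""
    match = LOG_PATTERN.match(line.rstrip("\n"))
    if match is None:
        return None
    return LogLine(*match.groups())


# Invented sample lines, one from the app and one from the platform.
SAMPLE_LINES = [
    '2015-02-10T15:13:46.677020+00:00 app[web.1]: Started GET "/" for 10.0.0.1',
    '2015-02-10T15:13:47.893231+00:00 heroku[router]: at=info method=GET path="/"',
]

if __name__ == "__main__":
    for raw in SAMPLE_LINES:
        print(parse_log_line(raw))
```

Filtering on the source field is a quick way to separate your application's own output from platform events.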
Scaling
When running an application on Heroku you pay for a certain number of dynos. This means that at any given time, at most that number of dynos handle requests to your application. If your application's traffic requires you to scale up, a simple command (heroku ps:scale web+5) adds five web dynos to your application, increasing your capacity to handle requests.
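The arithmetic behind a scale argument such as web+5 can be sketched as follows. The function models three argument forms (web=3 sets an absolute count, web+5 adds dynos, web-2 removes them) against a local dictionary; it does not call Heroku. The name apply_scale and the choice to clamp at zero are assumptions made for this example.

```python
import re

# One scale argument: process type, operator (=, + or -), and a number.
SCALE_ARG = re.compile(r"^([a-z]+)([=+-])(\d+)$")


def apply_scale(formation, arg):
    """Return a new formation dict after applying one scale argument."""
    match = SCALE_ARG.match(arg)
    if match is None:
        raise ValueError("unrecognised scale argument: %r" % arg)
    process, operator, amount = match.group(1), match.group(2), int(match.group(3))
    current = formation.get(process, 0)
    if operator == "=":
        target = amount
    elif operator == "+":
        target = current + amount
    else:
        # Assumption for this sketch: never go below zero dynos.
        target = max(current - amount, 0)
    updated = dict(formation)  # leave the caller's dict untouched
    updated[process] = target
    return updated


if __name__ == "__main__":
    formation = {"web": 2, "worker": 1}
    print(apply_scale(formation, "web+5"))
```

Starting from two web dynos, web+5 yields seven, and a later web-5 brings the application back to two once the traffic peak has passed.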
Why would you use it?
Heroku makes setup and maintenance easier. Configuring and managing the servers yourself might require installing and configuring a few dozen applications to guarantee speed, security, logging and so on, as well as configuring and deploying a database. Heroku does all of that for you, so your focus can be on developing the website, not on maintaining its infrastructure. It also allows very easy scaling up or down as needed: as website traffic requires, you can scale up to handle more requests, or scale down when requests slow at a particular time. For example, you might scale up right before a major product release or, in the case of e-commerce, around Christmas or Black Friday, to make sure the increased number of requests is served adequately. The scaling can then be reversed once the product release or the shopping holiday is over.
Limitations
Ephemeral file system
Dynos have an ephemeral file system. This means that, as a developer, you never know when files written to disk will disappear: if a dyno stops or is restarted, the files are gone. Applications hosted on Heroku therefore shouldn't rely on any sort of local storage in order to work, since that storage might be gone at the drop of a hat, and there’s nothing you can do about it.
Limited locations
Heroku runs on AWS in US East and EU West. This means that if you want to host an application targeted at a market located far from both of these locations, for example Japan, its users will experience high latency. Another problem with the limited locations is that there is no regional redundancy. If you’re running an application in US East and there’s a US East outage, there are no backup servers to keep your application running, so your application will experience that outage.
Shared databases on the free plan
Heroku’s free plan uses shared databases. This means that your application shares a PostgreSQL database with other applications, and you cannot have live access to the database. This might make manual changes to database data, or debugging that requires manually digging into the database, impossible.
OpenShift
What is it
OpenShift is a Platform as a Service offering from Red Hat. A PaaS is a layer of abstraction above virtual machines such as Amazon EC2. Compared to a vanilla server box, it provides all the packages and services required to host your production application. It also provides auxiliary services such as monitoring, backup and scaling. The software that runs OpenShift is open source and is called OpenShift Origin.
How does it work
OpenShift supports multiple languages and frameworks, such as Ruby on Rails, Python, Java and Node.js. The developer workflow for OpenShift is as follows:
- You create your application using an IDE or the command line. It can be a Ruby on Rails, Django, Node.js or Java servlet application. As soon as an application is created, OpenShift spins up a gear for it.
- You then select the OpenShift services that the application needs. These are called ‘cartridges’ in OpenShift. For example, if you are creating a Ruby on Rails application, you might select Ruby, PostgreSQL/MySQL, Redis, MongoDB and SendGrid.
- Deploying the code is a simple ‘git push’. You can connect your CI server for automated testing.
- When you need to scale, adding gears takes only a few clicks; the added gears are load balanced with HAProxy.
What does it provide
Hosting
OpenShift can be used in three ways.
- OpenShift Online
OpenShift Online runs on Red Hat’s servers. This public version can be used by any developer to host their application.
- OpenShift Enterprise
The entire PaaS can run on-premise. Many enterprises have policies requiring that applications be hosted in their own data centres. This option can be used if OpenShift needs to run on self-owned servers. It provides the same features as the public OpenShift.
- OpenShift Origin
OpenShift is fully open source. Any sysadmin or DevOps engineer can install OpenShift on their own servers. The installation then has to be maintained entirely by the sysadmin.
Load Balancing
An application can easily scale horizontally by adding extra ‘gears’, which are load balanced using HAProxy.
Database
OpenShift provides database support for MySQL and PostgreSQL. With the addition of the OpenShift Store, it has also started supporting other databases such as MongoDB and CouchDB.
Marketplace
OpenShift has a marketplace for cartridges, where external developers can create apps and plugins that integrate with OpenShift. This gives developers using OpenShift a range of choices. For example, you can choose between CloudAMQP and IronMQ as your background messaging service.
Scaling
Scaling on OpenShift is similar to Heroku. Adding gears is as simple as the click of a button. OpenShift also provides auto scaling: it boots up new gears if your application suddenly gets a spike in traffic.
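The auto scaling behaviour can be pictured as a threshold rule: add a gear when the existing gears are close to full, remove one when they are mostly idle. The sketch below is one way to express such a rule. Every number in it (connections per gear, the two thresholds, the gear limits) is a hypothetical value chosen for illustration, not OpenShift's actual configuration, and desired_gears is an invented name.

```python
def desired_gears(current_gears, connections, per_gear_capacity=16,
                  scale_up_at=0.9, scale_down_at=0.4,
                  min_gears=1, max_gears=10):
    """Return how many gears to run next, moving at most one gear at a time.

    All default values are illustrative assumptions.
    """
    if current_gears < 1:
        raise ValueError("an application needs at least one gear")
    capacity = current_gears * per_gear_capacity
    load = connections / capacity  # fraction of total capacity in use
    if load >= scale_up_at and current_gears < max_gears:
        return current_gears + 1
    if load <= scale_down_at and current_gears > min_gears:
        return current_gears - 1
    return current_gears


if __name__ == "__main__":
    # Two gears serving 30 concurrent connections are nearly saturated.
    print(desired_gears(2, 30))
```

Keeping the scale-down threshold well below the scale-up threshold avoids adding and removing the same gear over and over when traffic hovers near a boundary.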
Why would you use it?
OpenShift provides features similar to Heroku’s. Unlike working on a dedicated box, you don’t have to set up the server, packages, dependencies, DNS or web server. OpenShift provides allied services such as backup, security, logging and monitoring. It also has a marketplace of apps that can be integrated with your application. This removes the setup work and helps you focus on your main idea. There is no need for a database administrator or DevOps engineer.
Limitations
Persistent Storage
Whenever a new version of the application is deployed, the files created by the application are lost. Files such as image uploads and data files are therefore deleted on every deployment. To solve this issue, a persistent storage cartridge or Amazon S3 needs to be used.
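One way an application can avoid losing uploads is to write them to a directory outside the deployed code. The sketch below assumes the platform exposes such a persistent directory through an environment variable; OPENSHIFT_DATA_DIR is used as the variable name on the assumption that your OpenShift setup defines it, so verify this before relying on it. The helper names and the fallback location are invented for the example.

```python
import os
import tempfile


def upload_dir(environ=None):
    """Return (and create) the directory where uploads should be stored."""
    environ = os.environ if environ is None else environ
    # Assumption: the platform sets this variable to a persistent location.
    base = environ.get("OPENSHIFT_DATA_DIR")
    if not base:
        # Local development fallback; NOT persistent across deployments.
        base = os.path.join(tempfile.gettempdir(), "app-data")
    path = os.path.join(base, "uploads")
    os.makedirs(path, exist_ok=True)
    return path


def save_upload(filename, data, environ=None):
    """Write uploaded bytes under the upload directory and return the path."""
    # basename() drops any directory parts a client may have sent.
    target = os.path.join(upload_dir(environ), os.path.basename(filename))
    with open(target, "wb") as handle:
        handle.write(data)
    return target
```

Because the location comes from configuration rather than from the code's own directory, the same function could later be pointed at a different store without touching the callers.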
Support for DNS and SSL
OpenShift has very poor support for domain name management and SSL certificates. Setting these up can become an issue with OpenShift.
Digital Ocean
What is it
Digital Ocean is a cloud-based hosting provider. It provides virtual servers with full root-level SSH access and offers various distributions of Linux. Unlike a Platform as a Service, it provides vanilla boxes, on which you can install any package or application as required. It is built for developers and can create a virtual server in less than 55 seconds. Unlike a dedicated server, which is a physical machine and cannot be customized easily, Digital Ocean servers are virtualised with KVM.
How it works
Digital Ocean has a very user-friendly interface. Digital Ocean (DO) calls its virtual servers Droplets. You select the Linux distribution, the configuration (RAM, CPU) and the data center in which you want the Droplet to be hosted, and in less than 55 seconds the Droplet is ready. You can add your own SSH key; otherwise DO sends you the root password via email. Once you log in to your box, you can download and install any packages or applications you require. The box can be used for any purpose: as a load balancer, a WordPress host, a database server or a caching server. You can boot up multiple Droplets and create a cluster of servers.
It also has a developer API, so all of the above tasks can be done programmatically. The orchestration of the entire server cluster can be done using the API.
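As a sketch of what creating a Droplet programmatically might look like, the code below builds (but does not send) an HTTP request. It assumes the v2 REST endpoint at api.digitalocean.com, bearer-token authentication, and a JSON body with name, region, size and image fields; check the current API reference before relying on any of these. The token and the region, size and image slugs in the example are placeholders.

```python
import json
import urllib.request

# Assumed endpoint for creating Droplets in the v2 API.
API_URL = "https://api.digitalocean.com/v2/droplets"


def build_create_droplet_request(token, name, region, size, image, ssh_keys=()):
    """Build a POST request describing the Droplet to create."""
    payload = {
        "name": name,
        "region": region,
        "size": size,
        "image": image,
        "ssh_keys": list(ssh_keys),  # key ids or fingerprints already on the account
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder token and slugs, for illustration only.
    request = build_create_droplet_request(
        "YOUR_API_TOKEN", "web-1", "nyc3", "512mb", "ubuntu-14-04-x64")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen would perform the call. Building one request per server in a loop is the simplest form of the cluster orchestration mentioned above.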
What does it provide
Digital Ocean provides virtual servers running a range of Linux distributions. Servers have SSD disks, and newer features include DNS management.
Limitations
Regional Hosting
Digital Ocean has data centers in the US and Singapore. Hosting an application close to users in other regions is therefore difficult.
Orchestration
Digital Ocean does not provide an orchestration layer. Hence full-fledged, configuration-driven infrastructure management is not possible out of the box.
At a Glance Comparison