CSC/ECE 517 Fall 2013/rails infrastructure management tools

From Expertiza_Wiki
Jump to navigation Jump to search

Introduction

Infrastructure Managment Tools are becoming more widely used today and it is a skill set that is growing rapidly in development. With systems growing in size and complexity automated tools are becoming the norm as opposed to the exception.

Here we shall compare two popular Infrastructure Management Tools-

  • Puppet
  • Chef

We shall begin with an initial overview of both tools to familiarize the reader. We will then take a hypothetical scenarion and try and solve the problems in those scenario using Puppet and Chef.

We will then compare the solutions and compare how each tool handles the same problem. We do not intend to categorically state the superiority or suitability of one tool over the other but to illustrate their usage and let the reader choose them as per her/his need.

Overview of Puppet

Puppet is a Infrastructure Management tool written in Ruby. It's primary use is to manage the setup and configurations of machines or instances. For this reason it is widely used by system administrators and DevOps personnel to setup and maintain systems.

Puppet works in two fundamental modes -


Serverless Puppet

In Serverless Puppet each instance with Puppet installed is standalone i.e is independent and running without external supervision.

Effectively there are two processes-

  1. A master process
  2. An agent process

The master process consumes the configuration requirements fed it via Puppet and the agent applies it. Here the particular instance running Puppet has complete access to it's own configuration information.

Master / Agent Puppet

This is a typical client server pattern.

Here there is a central Master Puppet instance. This master instance contains the configuration information for all the Agent Puppet instances.

The Agent puppets are configured to periodically inform the Master with “Facts” which is basically information about the current state of the Puppet configuration.

The Master Puppet compares the state of each Agent with what it actually should be and sends it a “Catalog” which tells the Agent how it should configure itself.

After the Agent receives the Catalog and does the required changes it Reports back to the Master informing it of the status of the changes.

A point here to note is that no Agent can see the configuration information of other Agents except it's own which it receives via the Catalog from the Master


INSERT DIAGRAM


Resources

Up till now we have been discussing the “configuration” a machine in abstract terms. This is where one of the key features of Puppet comes up.

Puppet treats the configuration of each machine as being made up of atomic units called Resources.

A resource can be -

  • A file that needs to present at a particular location, with particular permissions, having some content etc
  • A package that needs to be installed
  • A service that needs to be running

and so on.

Puppet has a declarative DSL wherein a resource can be described which is in the following format-

resource_type { 'resource_tile': attribute_1 => value_1, attribute_2 => value_2, ….. }


An example would be -

service { 'rails s': ensure => 'running' }


We would include this resource as part of our Rails server configuration to ensure that our server is always running.

The attributes vary depending on the type of the resource and the values belong to range depending on the attribute and the resource type.

A very key feature to be noted here that Puppet focuses on the WHAT instead of the HOW.

What does this mean? Let us take the above example – the command to start a process would be different depending on the OS. But Puppet abstracts that complexity for us – we need only tell it HOW we want the machine to be i.e we want a particular process running and Puppet handles the rest – we tell it the state we want it to be and Puppet makes it so.

Overview of Chef

Opscode Chef is an infrastructure automation and configuration management framework.

Chef is used for configuration, deployment and scalability of servers and applications for eg: it can be used to deploy and manage network in cloud, on-site or a hybrid one with n number of servers. This helps organization to save time, solve mission-critical challenges, accelerate time to market, manage scale and complexity, and safeguard systems.

Chef client works using the abstraction definitions know as cookbooks and recipes that are written in Ruby. Chef client uses this definitions to build and manage servers in an infrastructure.

Elements

Basically an organization consists of following elements with chef configuration:

  1. Node : Any physical, virtual server is referred to as node.
  2. Knife : An interface between the local chef-repo and the server is known as knife. It is a command line tool using which users can manage other elements of the setup.
  3. Workstation : It is a computer used to interact with the single server. It is also used by users to run the knife command.
  4. Repository : usually referred as chef repo is the location wher all data objects are stored like cookbooks, configuration files etc.
  5. Hoster server: The stores manages cookbooks and all the configuration details that need to be applied to nodes.
  6. Chef-client : Chef client acts as an interface between the node and hosted server. Node gets all the configuration details from server via the client.
  7. Cookbook : Cookbook stores the configuration details depending upon the scenario for eg: in order to install Apache Tomcat server it contains details about installation, upgradation, components needed to support the server.

The Scenario

The scenario we will be using for the purposes of this comparison will be one that is a very prevalent one - a large scale web application infrastructure. We shall assume that the web application under discussion is being developed used Rails though this does not overly affect the infrastructure management - it is equally applicable to other frameworks.

The specific requirements of this sample project are as follows-

  • Zero downtime - the website's operation is a critical part of the client's business model. They require that the website should have zero downtime which means the infrastrucutre must handle failures, deployment and maintenance without interrupting the site's operation.
  • Large scale - The client anticipates a very high and dynamic load on the site which means the site should adapt to user demand and the infrastructure should be able to handle the required performance.
  • System Integration - The project is being built to be integrated with the client's legacy systems and need to integrate with them. The infrastructure must support the connection between these systems.

Tasks: Use Cases

To achieve the above goals various tasks need to be accomplished. We shall look at each of these tasks in turn and see how they can be accomplished using Puppet and Chef and compare them in each use case.

Installation and Setup

As part of our example scenario we shall assume that our environment consists of a collection of Amazon Web Services (AWS) instances.

Puppet

Puppet can be installed via packages for debian and yum based systems. The package source need only be configured the normal way and the package to be installed.

A particular instance needs to be chosen as the Puppet Master and the puppet-server package needs to be installed on it.

The remaining instances need to have the puppet package installed on it.

Once that is the Puppet Master needs to be configured with a valid host name. Then the agent nodes need to be configured to point to the host name of the server.

Finally the Puppet Master needs to sign of on the certificate of each Puppet Agent that will communicate with it as all the communication between the Puppet nodes is encrypted and the services of the master and agent nodes need to be started

Chef

  1. Install chef server on the server machine
  2. Setup workstation:
    1. Setup chef-repo in the workstation.
    2. The machine needs to have access to the server, as well one of the chef clients machine which can act as the first node.
  3. Setup chef client: Using knife tool in workstation, chef clients are setup on nodes.
  1. Install chef server on the server machine
  2. Setup workstation:
  • Pre-requisites : The workstation must have a Virtual box , vagrant(command line interface tool), Git( needed for Version Control Systems)
  • Use knife client command to establish an API client identity for the workstation in order to make authenticated requests to the server. Setup chef-repo in the workstation.
  1. Create Cookbooks:
  • In order to use community cookbooks one can just run the following command:
knife cookbook site install <cook_name>
eg: knife cookbook site install apt.
Note: Chef has maintained a list of community cookbooks which are commonly covered under the use cases like apache2, mysql,php. So a user can leverage the community cookbooks instead of creating their own cookbooks.
  • In order to create our own cookbook:
knife cookbook create cookbook_name
The file default.rb in the path chef-repo/cookbooks/aliases/recipes/default.rb can be edited in order to create recipes.
To make these cookbooks available to the nodes, it is first uploaded in the chef server.
  1. Setup chef client: Using knife tool in workstation, chef clients are setup on nodes.

How They Compare

Similarities

  • Separate packages need to be installed for the client/agent and server/master for Chef and Puppet respectively
  • Both use SSL encryption to communicate

Differences

  • Puppet uses hostnames to uniquely identify whereas Chef uses Fully Qualified Domain Names
  • With Puppet each individual node has to be accessed for the installation.However with Chef installation of an agent node can be done remotely
  • With Chef a chef-repo needs to be installed on a workstation to use various Chef tools but with Puppet the puppet package is sufficient

Configure a Node

Our sample project will be using MongoDB. So the next task will be setting up the MongoDB instances.

Puppet

As we have seen in Puppet a machine’s configuration is specified in terms of resources. To tell Puppet to set up those resources on a machine we use a Manifest that consists of various resources.

But setting up a Manifest that covers all the Resources will quickly become unwieldly.

Which is why Puppet uses Classes and Modules.

A Class is a named collection of resources. A Module encloses a Class and is something that can be included in other Puppet files thus enabling reuse. There is a large collection of Modules called the Puppet Forge that has been written covering different type of uses.

Thus setting up a manifest for a MongoDB server is as simple as-

class { ‘mongodb’ : init => ‘sysv’, version => ‘1.1.1’ # installing a specific version of mongodb }

include mongodb

# custom puppet code as per our requirments

Thus different Classes can be written which use different Modules for different uses. We can define a different class for our web server , for our file share and so on.

Once a Class has been defined by us that is tailor made to our uses setting up more instances is simply a question of provisioning the node and running the Puppet class on it.

This supports quick and simple scaling up.

Provision and setup a node

Node management - foreman and chef

Config files deployment

- NFS mounting,network, security-configuration of netowrk restrictions and Ip restrictions.

Authentication and key deployment

Environment setup

Other Infrastructure Management Tools

Conclusion

Further Reading

References