CSC/ECE 517 Fall 2015/ossA1550RAN: Difference between revisions


Revision as of 01:08, 1 November 2015

A1550 - Web Socket Implementation in Apache Ambari

Ambari-Web uses a simple AJAX polling mechanism to fetch data from Ambari-Server. It polls constantly to show the current service status, alerts, service graphs, and so on. On a large cluster with multiple active browser sessions, this continuous stream of requests can degrade the performance of Ambari-Server.

WebSocket is a protocol that provides full-duplex communication channels over a single TCP connection. A WebSocket connection between Ambari-Web and Ambari-Server would let the server push updates to the browser instead of answering repeated polls, which addresses this scenario.
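
To make the difference concrete, the following is a minimal sketch that counts server traffic under the two models. It is not Ambari code: `StatusServer`, `simulate`, and the service and state names are hypothetical, and the in-memory callbacks only stand in for real WebSocket connections.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class StatusServer:
    """Toy stand-in for Ambari-Server (hypothetical, not the real API)."""
    status: Dict[str, str] = field(default_factory=dict)
    requests_served: int = 0   # pull model: polls answered
    messages_pushed: int = 0   # push model: updates sent
    subscribers: List[Callable[[str, str], None]] = field(default_factory=list)

    def get_status(self) -> Dict[str, str]:
        # Pull model: every poll costs a request, whether or not anything changed.
        self.requests_served += 1
        return dict(self.status)

    def subscribe(self, callback: Callable[[str, str], None]) -> None:
        # Push model: a browser session registers once (assumed to map to
        # one open WebSocket connection).
        self.subscribers.append(callback)

    def set_status(self, service: str, state: str) -> None:
        changed = self.status.get(service) != state
        self.status[service] = state
        if changed:
            # Only real changes generate traffic in the push model.
            for notify in self.subscribers:
                notify(service, state)
                self.messages_pushed += 1


def simulate(sessions: int, ticks: int,
             changes: Dict[int, Tuple[str, str]]):
    """Run the same workload against a polled and a pushing server."""
    polled, pushed = StatusServer(), StatusServer()
    views = [dict() for _ in range(sessions)]   # one cached view per session
    for view in views:
        pushed.subscribe(view.__setitem__)
    for tick in range(ticks):
        if tick in changes:
            polled.set_status(*changes[tick])
            pushed.set_status(*changes[tick])
        for _ in range(sessions):               # every session polls every tick
            polled.get_status()
    return polled.requests_served, pushed.messages_pushed, views
```

With 5 sessions, 60 polling intervals, and two status changes, `simulate(5, 60, {10: ("HDFS", "STARTED"), 40: ("HDFS", "STOPPED")})` reports 300 poll requests against 10 pushed messages. In this toy model the polling cost grows with sessions times intervals, while the push cost grows with sessions times actual changes.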

What is Apache Ambari

Apache Ambari is a software project of the Apache Software Foundation that aims to make Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs. Ambari was a sub-project of Hadoop but is now a top-level project in its own right.

An Apache Hadoop cluster consists of a group of nodes (machines). Each node acts as an Ambari Agent that runs various service components (e.g. DataNode for HDFS). One of the nodes acts as the Ambari Server, which handles task allocation and management, gathers service status and other information from the agents, and passes that information to the Ambari UI for display. More information about the Agent-Server-Web flow is available in the following sections.
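
The Agent-Server-Web flow described above can be pictured with a small sketch. It is only an illustration under stated assumptions: `ToyAgent`, `ToyServer`, their methods, and the state names are hypothetical and do not mirror the actual Ambari classes or heartbeat format.

```python
from typing import Dict


class ToyServer:
    """Hypothetical stand-in for Ambari-Server: collects agent reports."""

    def __init__(self) -> None:
        # host name -> {component name -> state}
        self.component_state: Dict[str, Dict[str, str]] = {}

    def handle_heartbeat(self, host: str, report: Dict[str, str]) -> None:
        # Keep the latest report from each agent.
        self.component_state[host] = dict(report)

    def service_summary(self, component: str) -> Dict[str, int]:
        # What a web UI might request: number of hosts per state
        # for one component.
        summary: Dict[str, int] = {}
        for states in self.component_state.values():
            if component in states:
                state = states[component]
                summary[state] = summary.get(state, 0) + 1
        return summary


class ToyAgent:
    """Hypothetical stand-in for an Ambari Agent on one node."""

    def __init__(self, host: str, components: Dict[str, str]) -> None:
        self.host = host
        self.components = dict(components)

    def send_heartbeat(self, server: ToyServer) -> None:
        # The agent reports the state of the components it runs.
        server.handle_heartbeat(self.host, self.components)


server = ToyServer()
agents = [
    ToyAgent("node1", {"DATANODE": "STARTED"}),
    ToyAgent("node2", {"DATANODE": "STARTED"}),
    ToyAgent("node3", {"DATANODE": "INSTALLED"}),
]
for agent in agents:
    agent.send_heartbeat(server)
summary = server.service_summary("DATANODE")
```

Here `summary` ends up as `{"STARTED": 2, "INSTALLED": 1}`: the server aggregates what the agents report, and the UI only ever talks to the server.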

Current Implementation

Project Goals, Benefits and Challenges

Goals:

1. Understand the architecture of Ambari-Server and Ambari-Web

2. Replace the current pull-based mechanism with a push-based mechanism via WebSocket

3. Write test cases for the WebSocket implementation that exercise both the WebSocket client and the WebSocket server

Benefits:

1. WebSocket will allow Ambari-Server to perform robustly on a large cluster with multiple browser sessions, even when continuous heavy requests are being made

Challenges:

1. Ambari-Server uses Jetty 8.x, whereas the WebSocket support targeted by this project became available in Jetty 9.x

2. Ambari has a huge codebase with multiple modules, frameworks, and design patterns. Understanding the flow of the project and managing its dependencies is a major challenge

3. Testing the code requires a cluster of around 3 nodes, and running 3 virtual machines requires a high-performance machine

Learning Outcomes

1. We observed that the Ambari project adopts various design patterns, such as the Singleton pattern in the Data Access Object (DAO) classes

2. We also noticed that one of the classes, HeartBeatHandler, obeys the Law of Demeter

3. For testing, the project uses the Mockito framework to mock the heartbeat and cluster functionality

4. For configuration management, the Puppet configuration management tool is used

5. Ambari-Web is implemented in Ember.js, an open-source JavaScript framework that follows the MVC pattern, similar to the Rails framework

6. Ambari-Server uses Jetty, a web server, to handle all HTTP requests made to Ambari-Server from the UI
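
Outcomes 2 and 3 can be illustrated together with a short sketch. The real HeartBeatHandler is a Java class tested with Mockito; what follows is only a Python analog using `unittest.mock`, and `ToyHeartbeatHandler`, `update_host_state`, and the heartbeat fields are hypothetical names chosen for illustration.

```python
from unittest.mock import Mock


class ToyHeartbeatHandler:
    """Hypothetical Python analog of a heartbeat handler (not Ambari code)."""

    def __init__(self, cluster) -> None:
        self.cluster = cluster

    def handle(self, heartbeat: dict) -> None:
        # Law of Demeter: call the direct collaborator only, instead of
        # reaching through it (e.g. cluster.hosts[name].state = ...).
        self.cluster.update_host_state(heartbeat["host"], heartbeat["state"])


def check_heartbeat_updates_cluster() -> bool:
    # The cluster is mocked, so no real cluster is needed for the test.
    cluster = Mock()
    handler = ToyHeartbeatHandler(cluster)
    handler.handle({"host": "node1", "state": "HEALTHY"})
    # Raises AssertionError if the handler did not make exactly this call.
    cluster.update_host_state.assert_called_once_with("node1", "HEALTHY")
    return True
```

Because the handler only calls a method on the object it was given, replacing that object with a mock is enough to test it in isolation, which is one practical payoff of following the Law of Demeter.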

Github Location

https://github.com/apache/ambari

Forked Repository:

https://github.com/nisarg64/ambari

References

https://ambari.apache.org/

https://en.wikipedia.org/wiki/Apache_Ambari

https://cwiki.apache.org/confluence/display/AMBARI/Ambari

https://issues.apache.org/jira/secure/attachment/12559939/Ambari_Architecture.pdf