CSC/ECE 517 Spring 2014/ch1a 1o sr: Difference between revisions

* http://www.sas.com/en_us/insights/big-data/what-is-big-data.html

Revision as of 12:35, 13 May 2014

This page covers the usage of Big Data with respect to Ruby on Rails.

Background

What is Big Data?

Big data means a massive volume of both structured and unstructured data that is so large that it is difficult to process using traditional database and software techniques. In most enterprise scenarios the data is too big, moves too fast, or exceeds current processing capacity. <ref> http://www.webopedia.com/TERM/B/big_data.html </ref> The term may refer to the volumes of data themselves, as well as to the tools or techniques used to process, manage, and analyze the data.

Usage

Challenges

As storage costs decrease, it becomes trivial to store huge amounts of data, which leads to a much bigger challenge: determining relevance within volumes of data and using analytics to create value from the relevant data.

The real problem is not acquiring huge amounts of data, but making sense of it well enough to draw useful conclusions. For example, if Google records every search query that any of its users makes, indexed by Google account (where the user is signed in) and by IP address where not, the problem is not storage. The problem is predicting browsing habits, optimizing search results, creating profiles of Google users, letting those profiles evolve with additional data (but only relevant data), deciding which data should be considered relevant, and so on.

Big data is often received at a very high speed. Consider the same example as above: not only is data pouring in in huge amounts, but it also has to be structured, categorized, stored, and managed at a very high speed.

Another challenge is that there is no set structure for big data (why would there be? Big data is just data, only in huge volumes). For example, an international car parts supplier can index and store the sale of parts based on their make, model, item number, manufacturing data, location of purchase, etc. This is a very structured form of big data. In the search example above, by contrast, there is only a text string from which to determine all the variables, so extracting variables from unstructured big data can be a challenge.
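To make the contrast concrete, here is a minimal Ruby sketch. The sample sale record, the search string, the `KNOWN_MAKES` list and the `extract_variables` helper are all hypothetical and chosen only for illustration; real extraction from free text would need a far richer vocabulary and parser.

```ruby
# Structured record: every variable already sits in a named field.
structured_sale = {
  make: "Toyota",
  model: "Corolla",
  item_number: "BP-1042", # hypothetical item number
  location: "Raleigh"
}

# Unstructured input: a free-text search string with no fixed fields.
query = "brake pads for 2009 toyota corolla near raleigh"

# Hypothetical vocabulary of known makes (an assumption for this sketch).
KNOWN_MAKES = %w[toyota honda ford].freeze

# Tries to recover a make and a four-digit year from free text.
# Either value is nil when nothing in the text matches.
def extract_variables(text)
  words = text.downcase.split
  {
    make: words.find { |w| KNOWN_MAKES.include?(w) },
    year: text[/\b(19|20)\d{2}\b/]
  }
end

p structured_sale[:make]   # direct lookup, no parsing needed
p extract_variables(query) # values have to be inferred from the text
```

With the structured record a field is simply read; with the search string each variable has to be guessed, and the guess can fail.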

Another challenge of big data, as with all emerging technologies, is elastic scalability. If we are recording huge amounts of data, say page views for a news website or purchases on a shopping portal, the inflow will not be steady. There will be predictable occasions (Christmas for the shopping website) and unpredictable ones (any newsworthy outside event) that cause peaks and troughs in the inflow of data. Any system designed to store, manage, and analyze big data should account for these as well.
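One simple way to picture elastic scaling is to size a pool of workers from the observed inflow. The sketch below is only an illustration: the `EVENTS_PER_WORKER` capacity, the `workers_needed` helper and its `min`/`max` bounds are assumed values, not figures from any real system.

```ruby
# Hypothetical capacity: events one worker can process per second.
EVENTS_PER_WORKER = 500

# Returns how many workers the observed inflow calls for, clamped so the
# pool never shrinks below min (troughs) or grows past max (peaks).
def workers_needed(events_per_second, min: 2, max: 50)
  required = (events_per_second.to_f / EVENTS_PER_WORKER).ceil
  required.clamp(min, max)
end

puts workers_needed(800)     # quiet period
puts workers_needed(12_000)  # predictable peak, e.g. holiday shopping
puts workers_needed(100_000) # unpredictable spike, capped at max
```

The lower bound keeps some capacity ready during troughs, while the upper bound limits cost when an unexpected spike arrives.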

Big Data and Rails

Challenges of big data and rails

Examples

Big Data Usage in Rails Applications

How easy it is to use Big Data in Rails Applications

Are there gems that facilitate it?

Rails versus other frameworks for processing big data

Important Terms

References

<references/>