Thursday, March 7, 2013

Getting Started with Big Data

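From a technology point of view, Hadoop is an open-source implementation of two designs published by Google: HDFS is modeled on the Google File System (GFS), and Hadoop MapReduce is modeled on Google's MapReduce. The rest of the ecosystem builds on or around these two pieces.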

Below are the components that I think are core to Hadoop and its ecosystem:

1) MapReduce framework
2) Storage and query systems: HDFS, HBase, Hive, Cassandra
3) Data-flow scripting and execution framework: Pig
4) Coordination system for distributed applications: ZooKeeper
5) Machine learning and data mining library: Mahout
6) Data serialization framework: Avro
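
To make the first item more concrete, here is a minimal sketch of the MapReduce idea as a word count in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are my own, chosen for illustration, and everything runs in one process rather than across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["big data big ideas", "data at scale"]))
```

In a real Hadoop job the map and reduce steps would run on many machines over data stored in HDFS, and the framework would handle the shuffle; the sketch only shows the shape of the computation.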

Below are a few links that give a better understanding of Hadoop:

http://www.snia.org/sites/default/files2/ABDS2012/Tutorials/RobPeglar_Introduction_Analytics%20_Big%20Data_Hadoop.pdf

http://hadoop.apache.org
