MapReduce

From BC$ MobileTV Wiki

MapReduce is a programming model and software framework for writing applications that rapidly process vast amounts of data in parallel on large clusters of compute nodes.
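As a rough illustration of the model, the sketch below runs a word count through the three stages usually associated with MapReduce: a map step that emits key/value pairs, a shuffle step that groups values by key, and a reduce step that combines each group. It runs in a single process and is not taken from Hadoop or any other framework; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for this example. In a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    A real framework does this between the map and reduce phases;
    here it is a plain in-memory dictionary (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially over a list of strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, one string per input record.
counts = word_count(["the quick brown fox", "the lazy dog", "The end"])
print(counts)
```

Because each map call looks at only one record and each reduce call looks at only one key, both steps can be spread across many machines without coordination, which is what allows the model to scale to large clusters.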


Hadoop

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing, including an open-source implementation of the MapReduce programming model.

[1]

Spark

Apache Spark is a fast, general-purpose engine for large-scale data processing. Spark runs on Hadoop (using YARN), on Mesos, in standalone mode, or in the cloud. It can access diverse data sources, including HDFS, Cassandra, HBase, and S3.






Resources


Tutorials


External Links

See Also

NoSQL | ML | Mahout
  1. Hadoop Security Basics (In Under 5 Minutes): https://blog.dataiku.com/sound-smart-on-hadoop-security