Archive for the ‘Big Data’ Category

Apache Hadoop is an open source framework, licensed under the Apache v2 license, that supports data-intensive distributed applications. It enables applications to work with thousands of computationally independent computers and petabytes of data. Hadoop was derived from Google’s MapReduce and Google File System (GFS) papers.
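To give a rough feel for the MapReduce model that Hadoop builds on, here is a minimal in-memory word count sketch in plain Python. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and a real cluster would spread these steps across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, standing in for the step between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Illustrative input only.
    print(word_count(["big data", "Big clusters"]))
```

Each map call looks at a single line and each reduce call at a single key, so neither depends on the others. That independence is what lets the work be split across many separate machines.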

First we need to set up Java 6. Since I am using Ubuntu 12.04 for my set up, I used for reference.

Once Java 6 is set up, just follow the instructions at the URLs provided below, skipping the Java 6 setup part.

Running Hadoop On Ubuntu Linux (Single-Node Cluster):

Running Hadoop On Ubuntu Linux (Multi-Node Cluster):