Visualizing the MapReduce Algorithm with a WordCount Example:
In this blog post, we will visualize how the MapReduce algorithm performs a word count on a text input.
First of all, for the programmers out there, here is the code (JavaScript):
[sourcecode language="javascript"]
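// Map: split each input line into words and emit (word, 1) for every word.
var map = function (key, value, context) {
    // Any non-letter character is treated as a separator.
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        // Consecutive separators produce empty strings; skip them.
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};

// Reduce: add up all the counts emitted for one word.
var reduce = function (key, values, context) {
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};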
[/sourcecode]
Courtesy: Microsoft Hadoop on Azure Samples
Now, let’s visualize this using an example.
Suppose the text is “Hadoop on Azure sample Hadoop is on Windows Azure Hadoop is on Windows server”. This is how you can think of what happens to the input as it is processed first by the map function and then by the reduce function:
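INPUT                         MAP          REDUCE
Hadoop on Azure sample        hadoop 1     hadoop 3
                              on 1         on 3
                              azure 1      azure 2
                              sample 1     sample 1
Hadoop is on Windows Azure    hadoop 1     is 2
                              is 1         windows 2
                              on 1         server 1
                              windows 1
                              azure 1
Hadoop is on Windows server   hadoop 1
                              is 1
                              on 1
                              windows 1
                              server 1
The map function emits one (word, 1) pair per word, 14 pairs in total, and the reduce function adds up the pairs that share a word. The words appear in lowercase because the map function calls toLowerCase() before writing each one.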
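To try the walkthrough above without a cluster, here is a minimal sketch that runs the same map and reduce functions in plain JavaScript (for example with Node.js). The helpers makeContext, makeIterator and runWordCount are hypothetical stand-ins written for this post, not part of the Hadoop on Azure API. They only imitate the context.write, hasNext and next calls that the functions use, and the line index is passed as the map key as an assumption.

```javascript
// The map and reduce functions from the post.
var map = function (key, value, context) {
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};

var reduce = function (key, values, context) {
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};

// Hypothetical stand-in for the framework's context: it just collects pairs.
function makeContext() {
    var pairs = [];
    return {
        pairs: pairs,
        write: function (key, value) { pairs.push([key, value]); }
    };
}

// Hypothetical stand-in for the values iterator handed to reduce.
function makeIterator(items) {
    var index = 0;
    return {
        hasNext: function () { return index < items.length; },
        next: function () { return items[index++]; }
    };
}

// Hypothetical local driver: map every line, group by word, reduce each group.
function runWordCount(lines) {
    var mapContext = makeContext();
    lines.forEach(function (line, lineIndex) {
        map(lineIndex, line, mapContext);
    });

    // Group the mapped values by key, keeping first-seen order.
    // (A real Hadoop job sorts the keys between map and reduce.)
    var groups = Object.create(null);
    var order = [];
    mapContext.pairs.forEach(function (pair) {
        if (!(pair[0] in groups)) {
            groups[pair[0]] = [];
            order.push(pair[0]);
        }
        groups[pair[0]].push(pair[1]);
    });

    var reduceContext = makeContext();
    order.forEach(function (key) {
        reduce(key, makeIterator(groups[key]), reduceContext);
    });
    return { mapped: mapContext.pairs, reduced: reduceContext.pairs };
}

var result = runWordCount([
    "Hadoop on Azure sample",
    "Hadoop is on Windows Azure",
    "Hadoop is on Windows server"
]);
console.log(result.mapped.length);           // prints 14
console.log(JSON.stringify(result.reduced));
// prints [["hadoop",3],["on",3],["azure",2],["sample",1],["is",2],["windows",2],["server",1]]
```

The grouping step in the middle is the part the framework normally does for you between the map and reduce phases. It is spelled out here only so that the two functions can be run on their own.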
Conclusion:
In this blog post, we visualized how the MapReduce algorithm operates on a WordCount example.
