But as attractive as Hadoop is, there is still a steep learning curve involved in understanding what role Hadoop can play for an organization, and how best to deploy it.
By understanding what Hadoop can and can't do, you can get a clearer picture of how best to implement it in your own data center or cloud. From there, best practices can be laid out for a Hadoop deployment.
What Hadoop can't do
We're not going to spend a lot of time on what Hadoop is, since that's well covered in documentation and media sources. Suffice it to say that it's important to know the two major components of Hadoop: the Hadoop Distributed File System (HDFS) for storage and the MapReduce framework, which lets you perform batch analysis on whatever data you have stored within Hadoop. That data, notably, does not have to be structured -- which makes Hadoop ideal for analyzing and working with data from sources like social media, documents, and graphs: anything that can't easily fit within rows and columns.
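To make the MapReduce idea a little more concrete, here is a minimal sketch of the pattern applied to free-form text. It is a single-process Python simulation, not Hadoop's own API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample posts are made up for illustration, and a real Hadoop job would spread these steps across many machines reading from HDFS.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit a (word, 1) pair for every word in one line of text.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle step: group all values by key, the job a MapReduce
    # framework performs between the map and reduce steps.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce step: collapse the values for one key into a single total.
    return key, sum(values)


def word_count(lines):
    # Run map, shuffle, and reduce in sequence over an iterable of lines.
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical unstructured input, such as short social media posts.
posts = ["Hadoop stores data", "Hadoop analyzes data in batches"]
counts = word_count(posts)
print(counts)
```

Note that the input has no schema at all: each record is just a line of text, and the map step decides how to interpret it at read time.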