What do Walmart, Facebook, Yahoo, Twitter, IBM, Google, eBay, Teradata, LinkedIn, Hulu, The New York Times, MicroStrategy, and P&G have in common? They are all harnessing the power of Hadoop to store, serve, slice, and rationalize Big Data to advance their business. Hadoop lets them peer into volumes of data that used to be too big to work with nimbly, and uncover things about their customers that they never dreamed possible until today.
The biggest downside to standard data warehousing and BI tools today is that you have to know the questions you want to ask ahead of time. This creates a never-ending search for patterns, outliers, and relationships in your data. If you dream up a question your existing architecture doesn’t support, you have to involve IT or software vendors and re-architect the whole data warehouse.
What if you could gaze into a Magic 8-Ball and it would tell you everything you needed to know about your retail category: all of the SKU changes to maximize sales, your out-of-stocks and phantom inventory, your sales by geography or store traits, plus patterns in your data that you did not even know to look for? Welcome to the next generation of BI data warehousing in retail category management: Hadoop!
- Hadoop is powering today’s Big Data initiatives and is gaining more and more acceptance across many different business units. Coupled with Hive, Pig, Sqoop, MapReduce, and numerous other tools, it gives you multiple robust ways to attack and slice your data.
- Your original data formats are unchanged, so you can reuse them in their raw form at a later date. Because nothing is transformed or discarded on load, you lose no detail if you later think of a way to explore your data that you have not thought of today. It also does not lock you into a proprietary third-party data format.
- No up-front ETL is required. Data is loaded into HDFS as-is and then you are done. You then use the coupled tools to unearth the data you are looking for, rather than churning it into a cookie-cutter format that you hope will give you insights.
- Hadoop is scalable using inexpensive hardware. Add nodes to your cluster all day long, using junker PCs you have lying around in the closet. No longer do you need a $50K RAID SAN to house and protect your data, because HDFS keeps multiple copies of each block on different nodes. Running out of space after 5 years of category data? Just load up some more nodes and you will be good for another few years.
- Hadoop couples with several analytics vendors and tools, including MicroStrategy, Pentaho, Zoomdata, SSRS, Tableau, and SAS, as well as other open source products and numerous built-in packages.
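To make the "load the raw data first, ask the questions later" idea in the list above a little more concrete, here is a minimal sketch in plain Python. It is not Hadoop code: on a real cluster this work would be done by Hive, Pig, or a MapReduce job reading from HDFS. The record layout (`store_id,sku,units,sales_dollars`) and the function names are assumptions made up for illustration, not an actual Retail Link export format.

```python
from collections import defaultdict

# Hypothetical raw point-of-sale lines, kept exactly as they were loaded.
# Assumed layout: store_id,sku,units,sales_dollars (illustrative only).
RAW_LINES = [
    "101,SKU-A,3,29.97",
    "101,SKU-B,1,4.50",
    "202,SKU-A,2,19.98",
    "202,SKU-A,0,0.00",
]


def map_line(line):
    """Parse one raw line at read time and emit a (store_id, sales) pair."""
    store_id, _sku, _units, sales = line.split(",")
    yield store_id, float(sales)


def reduce_sales(pairs):
    """Sum the emitted values per key, the way a reducer would."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def sales_by_store(lines):
    """First question: total sales dollars per store."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_sales(pairs)


def zero_unit_rows(lines):
    """A later question over the same untouched lines: rows with zero units."""
    rows = (line.split(",") for line in lines)
    return [(store, sku) for store, sku, units, _sales in rows if int(units) == 0]
```

Both functions read the same raw lines. The second question did not require reloading or reshaping the data, which is the point the bullets make about keeping the original format.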
We are breaking new ground by loading Retail Link or other demand signal data into a Hadoop cluster and applying several analytics packages on top of it, to let this new Big Data platform shine in the category management space like never before.