Creating a Data-Driven Enterprise with DataOps, published by O’Reilly, explains what is required to become a truly data-driven organization that adopts a self-service data culture. It covers everything from the need for democratizing data to infrastructure, process, and cultural considerations. It also outlines the stages of transformation that companies go through on their journey to becoming truly data-insights driven, along with the challenges characteristic of each stage in the model and strategies for getting to the next level. Finally, big data industry pioneers from eBay, LinkedIn, Twitter, and Uber share their journeys, and Ashish and Joy share highlights from the Facebook transformation.
The DataOps book is a great way to learn from other experts about the right use cases for deploying a data lake, as well as their experiences and the by-product technologies that the authors helped to develop as part of these initiatives (Apache Hive, Heron, Presto, and more).
Many of today's leading companies have developed data lakes (on either cloud or hybrid infrastructure) as a critical core of agile innovation at their firms. This has turned the data lake concept into an actionable, proven strategy that is no longer just Silicon Valley jargon, which is what the DataOps book aims to cover, and more.
Here are some highlights from the book:
- Infrastructure lessons learned along the way and best practices from shifting Facebook's infrastructure
- How to handle operations when moving from traditional data warehousing to big data
- What a data lake really is and the business value it could bring to your organization
- Additional tech tales from other industry leaders in analytics and data engineering
Ashish Thusoo, Co-founder and CEO at Qubole
Before co-founding Qubole, Ashish ran Facebook’s Data Infrastructure team (2007-2011). The team, under Ashish’s leadership, built one of the world’s largest data processing and analytics platforms. The platform not only achieved the bold aim of making data accessible to data analysts, engineers, and scientists, but also helped drive the “big data” revolution. In the process of scaling Facebook’s big data infrastructure, Ashish helped drive the creation of a host of tools, technologies, and templates that are used industry-wide today, including Apache Hive.
Joydeep Sen Sarma, Co-founder and Head, Qubole, India
Before co-founding Qubole, Joydeep worked at Facebook, where he bootstrapped the data processing ecosystem based on Hadoop, started the Apache Hive project, and led Facebook’s Data Infrastructure team. Joydeep was also a key contributor on the Facebook Messages architecture team and brought the power of Apache HBase to Facebook and to the transactional and reporting backends for Facebook Credits.