When Aesop created the fable about the shepherd boy who cried wolf, the message was clear:
If analysts like me and the mainstream press (e.g., Forbes, Harvard Business Review, The Economist) keep shouting about Big Data in Data Management and no one sees Big Data change anything, will we run out of credibility?
For 2013, I am saying that Data Management will be the area of the EMA Business Intelligence Continuum most impacted by the “wolf” that is Big Data... and it’s a pretty big wolf.
- Big Data isn’t so big... but it is complex: We have all been inundated with information about Big Data and how big it is. Well, in last year’s EMA study on Big Data, end users across multiple industries said that Big Data wasn’t SOOO big that it couldn’t be handled by firms not named Facebook, Walmart, and Citigroup. What EMA did learn was that Big Data was more complex than most thought, with various data sources and data management platforms playing a role in a new ecosystem.
- Relational isn’t dead: The 2010-2012 news cycle might have you believe that Big Data was the only thing that mattered and that unstructured/multi-structured data was the only data associated with Big Data. In the same way, you might have gotten the impression that relational/structured data was a thing of the past. That simply isn’t true. Structured relational data will continue to have a strong role in data management. However, it will no longer be the center of the data management world. It will be a partner to multi-structured data.
- Data governance MUST be redefined: Data governance is a favorite topic of mine. It can mean many things to many people. It can be Data Quality, as in proper values by row and column. It can be Data Stewardship, in terms of how metadata is managed and expressed. However, with multi-structured data and NoSQL becoming such a prominent force in data management, we need to change how data governance is defined. We need to focus on data lineage (where data came from) and its freshness, as opposed to the values within rows, columns, or records. We need to focus on understanding data as it comes into data management platforms, as opposed to admitting only the “blessed” data and keeping everything else out.
- Schemas must be flexible: Because data will have variable structures when it is created and accepted into data management platforms, the schemas that support SQL and other structured access methods will need to be flexible rather than rigid. Schemas were once used as gatekeeping devices to keep unworthy data out of relational data management platforms. Now schemas must move from the gatekeeper role to the role of analytical lens at the end of the process. We have the ability to process data on the fly and apply schemas later and later in the process. We need the flexibility to make that happen and to speed the implementation of operational and analytical processing.
- Data modelers must move toward productization and away from art: With the previous two trends for data management in 2013, the teams that will have to adapt the most will be our data modelers. Just as data modeling underwent (and is still undergoing) a massive upheaval in the move from 3NF operational schemas to denormalized analytical schemas, we will ask data modelers to forsake a “perfect” living model for a series of “good enough” productized models. This may take some of the art out of data modeling in exchange for faster implementation of an iterative and componentized set of data models.
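To make the governance and schema points above a little more concrete, here is a minimal Python sketch of the idea, not anyone's production design. It assumes raw records are accepted as-is and tagged with where they came from and when they arrived, and that a schema is applied only at read time as an analytical lens. Every name in it (`ingest`, `apply_schema`, `is_fresh`, `ORDER_LENS`, the sample sources and fields) is hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def ingest(payload, source, now=None):
    """Accept any payload unchanged; record its lineage and arrival time."""
    return {
        "payload": payload,
        "source": source,  # lineage: where the data came from
        "ingested_at": now or datetime.now(timezone.utc),  # basis for freshness
    }


def apply_schema(record, schema):
    """Project a raw record through a schema at read time.

    `schema` maps a field name to a (converter, default) pair. Missing or
    unconvertible values fall back to the default instead of being rejected.
    """
    out = {}
    for field, (convert, default) in schema.items():
        raw = record["payload"].get(field)
        try:
            out[field] = convert(raw) if raw is not None else default
        except (TypeError, ValueError):
            out[field] = default
    # Carry lineage along so analysts can judge the row, not just its values.
    out["_source"] = record["source"]
    out["_ingested_at"] = record["ingested_at"]
    return out


def is_fresh(record, max_age, now=None):
    """Return True if the record arrived within `max_age` of `now`."""
    now = now or datetime.now(timezone.utc)
    return now - record["ingested_at"] <= max_age


# Hypothetical sample data: two differently shaped records, both accepted.
store = [
    ingest({"customer": "42", "amount": "19.99"}, source="web_orders"),
    ingest({"customer": 7, "note": "called support"}, source="crm_export"),
]

# One possible "lens"; a different analysis could apply a different one.
ORDER_LENS = {"customer": (int, None), "amount": (float, 0.0)}

rows = [apply_schema(r, ORDER_LENS) for r in store]
fresh_rows = [r for r in store if is_fresh(r, timedelta(hours=1))]
```

The design choice worth noticing is that nothing is turned away at `ingest`. The second record has no `amount` and an extra `note`, yet it still lands in the store, and the lens decides later what an analysis sees. Swapping `ORDER_LENS` for another mapping changes the view without touching the stored data.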
What say the readers?
Has “wolf” been cried about Big Data too many times? Would your data stewards sooner quit than allow uncleansed data in your environments? Will data modelers accept changes to their jobs the way programmers did with object-oriented coding, or the way ETL developers did with the move away from hand coding?
Next week, I will cover Business Analytics... I hope you continue the journey with me.
NOTE – For those unfamiliar with the song “88 Lines About 44 Women” by The Nails, I highly recommend you give it a try. At the very least, it was the inspiration for this series of blogs.