“Back in the day”, Pablo Picasso once said:
Give me a museum and I’ll fill it.
Data Integration in 2013 seems to follow along similar lines – Give us a Platform and we will fill it with data.
This is part two of a series of blog postings on important topics for the world of business intelligence and data warehousing for 2013. While I wouldn’t call it my “predictions” for 2013, it is a listing of important topics and concepts for the domains of the EMA Business Intelligence Continuum.
This week’s installment is on Data Integration, which is the beginning of all environments – operational and analytical. And, as I stated last week – data being the new oil of business – the integration with various data sources will be the key to creating impactful and value-added business intelligence environments. Here are my thoughts for 2013 on bringing that “oil” into BI/DW.
- Data Integration is Becoming Standardized: Before the release of the Porsche Cayenne, I used to opine that “You could in fact tow a boat with a Porsche (then 911), but why would you want to?” The same applies to the world of data integration. In the past, hand coding in Perl and other languages was an option. However, I believe that for most organizations, toolsets that take hand coding off the plates of ETL professionals and others managing data integration environments should become the minimum standard.
- Let All The Data In: As we fully mature into the era of Big Data, the decision to allow only “blessed” data into our business intelligence and data warehousing environments will be a mistake. There will be too many business opportunities missed by using gating factors to keep certain data out of analytical environments. We should let all the data in and let the users, data scientists, and business analysts make the decision on the value of the data.
- Data Virtualization and ETL Need to Work Together: Lately, I have been teaching a class on Data Virtualization. Data Virtualization is nothing new, but it is making a comeback (DV vendors would probably prefer that, like LL Cool J, I don’t call it a comeback…). This comeback has put ETL professionals on edge because they view DV as a threat. However, these are complementary technologies that can and should be used together to maximize the value of data across the enterprise.
- Not Enough Room or Bandwidth is No Longer an Excuse: As mentioned above, all the data should be brought into our analytical environments, and not just the “blessed” versions. Since disk space and connectivity bandwidth prices have fallen and technology hurdles have been overcome, data integration teams can no longer reasonably use “disk space” or “bandwidth” as excuses to limit their activities. They should embrace the capacity they now enjoy to say “yes” rather than “no”.
- If You Thought This Crop of Big Data Was Big…: Back in the day, transactional data from the Walmart point of sale (POS) system or telecom networks was the original “Big Data”. We then linked people together with social platforms and we got “bigger” Big Data. HAH! Those two sources will be relatively small when we link the objects of our lives (e.g. cars, fridges, houses, cell phones, vans) together. Whether you call it “Internet of Things” or “Machine to Machine” (M2M), the number, diversity, and load of those data sources will really make data integration “fun”…errr… challenging.
What say the readers?
Have I missed something with hand coding in the age of Hadoop? Have I put too much stock in Big Data and not enough in Data Quality? Am I over-hyping the information in all that sensor data (noise vs signal)?
Provide your comments below and/or ping me via Twitter at @JohnLMyers44 with the hashtag #100LinesOnBIDW.
Next week, I will cover Data Management. I hope you continue the journey with me.
NOTE – For those unfamiliar with the song “88 Lines About 44 Women” by The Nails, I highly recommend you give it a try. At the very least, it was the inspiration for this series of blog posts.