The Requirements for a Unified Analytics Warehouse

Aug 21, 2020 12:38:29 PM

To assess the likely winners in the race for the unified analytics warehouse (UAW), it is important to understand the requirements of modern analytics programs and of the UAW itself.


Data Requirements

The modern enterprise produces and utilizes a broad range of data types. For example, a customer-facing application may capture semi-structured web stream data, mobile app data, raw text from email, location data, and structured data from both marketing and sales automation systems. Customer engagement requires a single platform to easily capture, combine, and analyze these different streams and sets of data. The support of multi-structured data is vital to the success of the unified analytics warehouse.

Today’s digital enterprise also needs to respond intelligently to real-time business and customer events. Therefore, the UAW must enable organizations to query data while it is in motion, while it is at rest, and across both states at once.
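As a rough illustration of asking one question across both states of data, the sketch below totals customer spend over a stored history and a live event feed. The records, the `in_motion_events` generator, and the `total_spend_by_customer` function are hypothetical stand-ins invented for this example, not part of any particular product.

```python
from typing import Dict, Iterable, Tuple

Order = Tuple[str, float]  # (customer, amount)

# Data at rest: hypothetical historical orders already stored in the warehouse.
at_rest_orders = [("alice", 120.0), ("bob", 75.5)]


def in_motion_events() -> Iterable[Order]:
    """Data in motion: a stand-in for events arriving on a stream."""
    yield ("alice", 30.0)
    yield ("carol", 12.0)


def total_spend_by_customer(
    at_rest: Iterable[Order], in_motion: Iterable[Order]
) -> Dict[str, float]:
    """Answer one question (spend per customer) across both states of data."""
    totals: Dict[str, float] = {}
    for source in (at_rest, in_motion):
        for customer, amount in source:
            totals[customer] = totals.get(customer, 0.0) + amount
    return totals


totals = total_spend_by_customer(at_rest_orders, in_motion_events())
```

The point of the sketch is that the caller asks a single question and does not need to know which records were stored and which were still streaming.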

Enterprise Requirements

Both longstanding companies and digital-first companies require specific enterprise capabilities to meet corporate, regulatory, and competitive requirements. These include, but are not limited to, management, orchestration, security, privacy, performance, scalability, agility, resiliency, and affordability. A successful unified analytics warehouse must be enterprise-ready.

Because of the size and complexity of modern data warehouses and data lakes, plus the modernization of data platforms, AI-enabled automation also becomes a key requirement. Automation can be broken down into recommendations and process automation. The most advanced UAW technology should make recommendations for at least structure, schema, data relationships, and performance tuning. These platforms should also automate actions like metadata generation, tuning, maintenance, elasticity, query routing, data tiering, change management, and code generation.

Infrastructure Requirements

Historically, businesses moved their data from databases to file systems to save money. Now, they are moving from file systems to object storage. In the world of analytics, it is important to remember that cheap storage has limited value on its own: if the data is not accessible for analysis, low cost is not enough. For this reason, the unified analytics warehouse must be able to provide a rich and consistent set of analytical capabilities across all storage tiers.

More advanced UAWs will automate the movement of data in and out of file systems and object storage when needed. In order to make the UAW cost-effective, it should not be tied too tightly to the hardware. Cloud and appliance vendors with tight ties to their underlying infrastructure have little motivation to optimize their software and provide the most cost-effective technology offering.
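Automated data movement between tiers can be pictured as a simple placement policy. The sketch below is one possible rule, assuming hypothetical tier names, dataset names, and access thresholds chosen only for illustration; a real platform would use its own signals and tuning.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical tier labels, ordered from most to least expensive.
HOT, WARM, COLD = "database", "file_system", "object_storage"


@dataclass
class Dataset:
    name: str
    last_accessed: datetime
    reads_last_30_days: int


def choose_tier(ds: Dataset, now: datetime) -> str:
    """Pick a storage tier from how recently and how often data is read.

    The thresholds (7 days, 90 days, 100 reads) are assumptions.
    """
    idle = now - ds.last_accessed
    if idle <= timedelta(days=7) or ds.reads_last_30_days >= 100:
        return HOT
    if idle <= timedelta(days=90):
        return WARM
    return COLD


now = datetime(2020, 8, 21)
datasets = [
    Dataset("web_clickstream", now - timedelta(days=1), 450),
    Dataset("q1_sales_snapshot", now - timedelta(days=45), 3),
    Dataset("email_archive_2015", now - timedelta(days=400), 0),
]
placement = {ds.name: choose_tier(ds, now) for ds in datasets}
```

Whatever the rule, the requirement stated above still applies: data placed on the cheapest tier must remain available to the same analytical capabilities as data on the most expensive one.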

Hybrid Requirements

Recent EMA research shows that 53% of all data is now in the cloud. Because of this massive shift of data, both multi-cloud and hybrid support are essential for the unified analytics warehouse. A legitimate UAW will enable management of these different systems through a single pane of glass for all environments, in the cloud and on-premises.

Modern requirements dictate that certain workloads may need to move from one cloud provider to another, from on-premises to cloud, or from cloud to on-premises. In the best-case scenario, the unified analytics warehouse software should live and act in the same way across different cloud providers and on-premises.

Analytical Processing Requirements

Analytics are the primary use case of the unified analytics warehouse. Therefore, at a minimum, the unified analytics warehouse must include a common set of prebuilt analytical algorithms and functions to address every step of the machine learning process. Those algorithms should be embedded in the platform for ease of use and high performance on the largest data volumes, without any down-sampling of data. Additionally, it must support the rapid development and execution of ad hoc analytics.

Ideally, the UAW should perform well across different types of analytical workloads. The performance spectrum should cover data-intensive workloads with low compute, compute-intensive workloads with low data, data-intensive workloads with high compute, and workloads with a high number of concurrent users or queries.
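To make the four workload profiles concrete, the sketch below labels a workload from three measurements. The `Workload` fields and the cutoff values are hypothetical assumptions for illustration; they are not thresholds taken from the research.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cutoffs, chosen only to make the example concrete.
DATA_HEAVY_GB = 1000.0
COMPUTE_HEAVY_CPU_SECONDS = 3600.0
HIGH_CONCURRENCY = 100


@dataclass
class Workload:
    name: str
    data_scanned_gb: float
    cpu_seconds: float
    concurrent_queries: int


def classify(w: Workload) -> List[str]:
    """Return the performance profiles that a workload falls into."""
    profiles: List[str] = []
    data_heavy = w.data_scanned_gb >= DATA_HEAVY_GB
    compute_heavy = w.cpu_seconds >= COMPUTE_HEAVY_CPU_SECONDS
    if data_heavy and compute_heavy:
        profiles.append("data intensive, high compute")
    elif data_heavy:
        profiles.append("data intensive, low compute")
    elif compute_heavy:
        profiles.append("compute intensive, low data")
    # Concurrency is independent of the data/compute mix.
    if w.concurrent_queries >= HIGH_CONCURRENCY:
        profiles.append("high concurrency")
    return profiles


full_scan_report = Workload("full_scan_report", 5000.0, 600.0, 5)
model_training = Workload("model_training", 50.0, 20000.0, 1)
dashboards = Workload("dashboards", 2.0, 10.0, 800)
```

A workload can carry more than one label, which is why a platform that performs well on only one profile falls short of the requirement.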

User Expectations

At the highest level, users expect the unified analytics warehouse to provide seamless unification of all interactions with data and analytics. Users do not anticipate having to move to different environments to access data or having to use different interfaces to manage diverse data. Users also expect a platform that allows them to put aside religious beliefs about how analytics should be done. With the UAW, data engineers, data scientists, and data analysts no longer need to fight about who is right and who is wrong. They have a single environment where they can collaborate for the greater good of the enterprise.

To support a unified data workforce, the UAW must also support a broad set of different approaches to analytics. The data scientist must be able to use R, Python, and notebooks to execute discovery analytics or advanced analytics like machine learning on multi-structured data. The data analyst must have ready access to multi-structured data using SQL, the lingua franca of analytics. Business users must be able to simply combine data, build reports, or construct dashboards for use in their everyday work. Finally, the platform must enable easy access to high-performance analytics via prebuilt embedded algorithms. It must be straightforward to combine all analytics for greater insight and to ask questions of data in near-real time.

For the rest of the story, read the full white paper, The Emergence of the Unified Analytics Warehouse – Data Lakes and Data Warehouses Merge.

Written by John Santaferraro

John is the research director for analytics, business intelligence, and data management at EMA. His 23 years of experience in the data and analytics market span everything from startups to executive positions at Fortune 50 companies. His deep understanding of the industry comes from years of leadership in product and marketing organizations, along with multiple big data imagineering efforts for finance, communications, retail, manufacturing, healthcare, events, oil and gas, and utilities. John's coverage area also includes data integration, data discovery, metadata management, artificial intelligence, machine learning, data science, digital marketing, and innovation.
