Paid Feature
The cloud has a habit of transforming on-premises technologies that have existed for decades.
It absorbed business applications that used to run exclusively on local servers. It embraced the databases they relied on, presenting an alternative to costly proprietary implementations. And it has also driven new efficiencies into one of the most venerable on-premises data analytics technologies of all: the data warehouse.
Data warehousing is a huge market. Allied Market Research valued it at $21.18bn in 2019 and estimates that it will more than double to $51.18bn by 2028. The projected 10.7 percent CAGR between 2020 and 2028 comes from a raw, unprecedented hunger for data-driven insights.
It isn't as though data warehousing is a new concept. It has been around since the late eighties, when researchers began building systems that funneled operational data through to decision-making systems. They wanted that data to help strategists understand the subtle currents that made a business tick.
This product category initially targeted on-premises installations, with big iron servers capable of handling large computing workloads. Many of these systems were designed to scale up, adding more processors connected by proprietary backplanes. They were expensive to buy, complex to operate, and difficult to maintain. The upshot, AWS claims, was that companies found themselves spending a lot on these implementations and not getting enough value in return.
As companies produced more data, it became harder for these implementations to keep up. Data volumes exploded, driven not just by the increase in structured records but also by an expansion in data types. Unstructured data, ranging from social media posts to streaming IoT data, has sent storage and processing requirements soaring.
Cloud computing evolved around the same time, and AWS argues that it changed data warehousing for the better. Data warehousing has long been popular with customers in sectors like financial services and healthcare, which have been heavy analytics users.
But the cloud has opened up the concept to far more companies thanks to lower prices and better performance, according to AWS. Applications previously restricted to multinational banks and academic labs are now open to smaller businesses. For example, you're able to perform data analytics in the cloud with benefits like scale, elasticity, time to value, cost efficiency and readily available applications.
The numbers bear this out. According to Research and Markets, the global market for data warehouse as a service (DWaaS) products will enjoy a 21.7 percent CAGR between 2021 and 2026, growing from $1.7bn to $4.5bn.
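The DWaaS figures can be sanity-checked with a little arithmetic. The sketch below assumes five compounding years between 2021 and 2026 and uses the rounded dollar figures quoted above, so the implied rate only approximately matches the published 21.7 percent. The function names are illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.217 means 21.7 percent)."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding start_value at the given annual rate."""
    return start_value * (1 + rate) ** years


# Assumption: 2021 -> 2026 is treated as five compounding periods.
YEARS = 5

# Rate implied by the rounded endpoints ($1.7bn -> $4.5bn).
implied_rate = cagr(1.7, 4.5, YEARS)

# 2026 market size implied by the published 21.7 percent CAGR.
projected_2026 = project(1.7, 0.217, YEARS)

print(f"Implied CAGR: {implied_rate:.1%}")
print(f"Projected 2026 market: ${projected_2026:.2f}bn")
```

Both directions land close to the reported numbers, and the small gap is what you would expect from rounding in the source figures.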
The largest cloud players have leaped on this trend, with Microsoft offering its Synapse service and Google running BigQuery. AWS announced Redshift in 2012, describing it as the first cloud data warehouse to address the market. The idea was pretty simple, AWS told us. The company wanted to give customers a scalable solution that uses the flexibility of the cloud to manage data at any scale and velocity while remaining cost effective.
Unlike online transaction processing databases like Amazon Aurora, Redshift targets online analytical processing (OLAP), offering support for fast queries thanks to scalable nodes with massively parallel processing (MPP) in a cluster. The cloud-based data warehouse follows the AWS managed database ethos. Rather than relying on a customer's administrators to take care of maintenance tasks, the company handles them behind the scenes in the cloud.
Aside from standing up hardware, this includes patching the software and handling backups and recovery. That means developers can focus on building applications, from modernizing existing data warehouse strategies through to accelerating analytics workloads. Redshift speeds up those workloads by using back-end parallel processing to spread queries over up to 128 nodes. Companies can use it for everything from analyzing global sales data to crunching through advertising impression metrics.
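As a rough illustration of the parallel processing idea, and not a description of Redshift's internals, the sketch below spreads a sales aggregation over several worker "nodes" and merges their partial results. The round-robin distribution, the thread pool standing in for nodes, and the sample rows are all assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, node_count):
    """Assign rows to node slices in round-robin order."""
    slices = [[] for _ in range(node_count)]
    for index, row in enumerate(rows):
        slices[index % node_count].append(row)
    return slices


def partial_totals(rows):
    """Work done by one node: sum sales per region for its own slice."""
    totals = Counter()
    for region, amount in rows:
        totals[region] += amount
    return totals


def sales_by_region(rows, node_count=4):
    """Scatter the rows, aggregate each slice in parallel, then merge."""
    slices = distribute(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(partial_totals, slices))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # adds per-region totals together
    return dict(merged)


# Hypothetical sample data: (region, sale amount).
sample_rows = [
    ("EMEA", 100),
    ("APAC", 50),
    ("EMEA", 25),
    ("AMER", 70),
    ("APAC", 5),
]
result = sales_by_region(sample_rows, node_count=3)
print(result)
```

The pattern matters more than the details: each node only touches its own slice, so adding nodes shrinks the work per node, and the final merge step handles only small partial results.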
AWS also highlights other applications that can draw on cloud-based data warehouse technology, including predictive analytics, which enable companies to mine historical data for insights that could help to chart future events. Redshift also helps customers with applications that are often time critical, AWS says. These include recommendation and personalization, and fraud detection.
Performance at the right price is key, asserts AWS, which reports that customers' latency requirements for processing and analyzing their data are shortening, with many wanting results in almost real time.
AWS benchmarked Redshift against other market players and found price performance up to three times better than the alternatives. The system's ability to dynamically scale the number of nodes in a cluster helps here, as does its ability to access data in place from various sources across a data lake.
Traditionally, data sharing has been a cumbersome process in which files are uploaded manually from one system and copied to another. This approach, AWS says, does not provide complete and up-to-date views of the data because the manual processes introduce delays, human error and data inconsistencies, resulting in stale data and poor decisions.
In response to feedback from customers who wanted to share data at many levels to enable broad and deep insights but also minimize complexity and cost, AWS has introduced a capability that overcomes this issue.
Announced late last year, Amazon Redshift data sharing lets you avoid making copies. The new capability enables customers to query live data at their convenience and get up-to-date views across organizations, customers and partners as the data is updated. In addition, Redshift integrates with AWS Data Exchange, enabling customers to find and subscribe to third-party data without extracting, transforming and loading it.
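To make the workflow concrete, the sketch below assembles the kind of SQL a producer and a consumer cluster might run to share a table without copying it. The share, schema, table and database names and the namespace placeholders are hypothetical, and the statements reflect our understanding of Redshift's datashare commands, so check the current AWS documentation before relying on them.

```python
from typing import List


def producer_statements(share: str, schema: str, table: str,
                        consumer_namespace: str) -> List[str]:
    """SQL a producer cluster might run to expose one table via a datashare."""
    return [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
        f"ALTER DATASHARE {share} ADD TABLE {schema}.{table};",
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';",
    ]


def consumer_statements(share: str, database: str, producer_namespace: str,
                        schema: str, table: str) -> List[str]:
    """SQL a consumer cluster might run to query the shared table in place."""
    return [
        f"CREATE DATABASE {database} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';",
        f"SELECT COUNT(*) FROM {database}.{schema}.{table};",
    ]


# Hypothetical identifiers; real namespaces are GUIDs issued by AWS.
producer_sql = producer_statements(
    "sales_share", "public", "sales", "<consumer-namespace>")
consumer_sql = consumer_statements(
    "sales_share", "sales_db", "<producer-namespace>", "public", "sales")

for statement in producer_sql + consumer_sql:
    print(statement)
```

Note that no data-loading step appears anywhere: the consumer queries the producer's live table through a database created from the share.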
Amazon Redshift data sharing is already proving a hit with AWS customers, who are finding new use cases such as data marketplaces and workload isolation.
Data lakes have evolved as companies draw in data of different types from multiple sources. When unstructured data such as machine logs, sensor data, or clickstream data from websites comes in, you don't know its quality or what insights you're going to find in it.
AWS told us many customers have asked for data stores where they can break free of data silos and land all of this data quickly, process it, and move it to more SLA-intensive systems for query and reporting like data warehouses and databases.
The cloud is the perfect place to put this data thanks to commodity storage. Storing data in the cloud is cheap thanks to a mixture of economies of scale on the cloud service provider side, and tiered storage that lets you put data in lower-cost tiers such as S3.
Data gravity is the other driver. A lot of data today begins in the cloud, whether it comes from social media, machine logs, or cloud-based business software. It makes little sense to move that data from the cloud to on-premises applications for processing. Instead, AWS asks, why not just shorten the time it takes to get insights from it?
The company designed the data warehouse to share information in the cloud, folding in API support for direct access. Redshift can pull in data from S3's cheap storage layer if necessary for fast, repeated processing, or it can access it in place. It also features different types of nodes optimized for storage or compute. It can interact with data in Amazon's Aurora cloud-native relational database, and with other relational databases via Amazon Relational Database Service (RDS).
It also includes support for other interface types. Developers can import and export data from other data warehousing systems using open data formats like Parquet and Optimized Row Columnar (ORC). Client applications can also access the system via standard SQL, ODBC, or JDBC interfaces, making it easy to connect with business intelligence and analytics tools.
The ability to scale the storage layer separately to the compute nodes makes the system more flexible and eliminates network bottlenecks, the cloud service provider says.
Cloud databases also provide application developers with other services that they can use to enhance those insights. One of the most notable for AWS is its machine learning capability. ML algorithms are good at spotting probabilistic patterns in data, making them useful for analytics applications, but inference - the application of statistical models when processing new data - takes a lot of computing power. Scalable cloud computing power makes that easier, AWS says.
Cloud-based machine learning services are also easy for companies to consume because they are pluggable with data warehouses via application programming interfaces (APIs). AWS makes these available to anyone who knows SQL. Customers can use SQL statements to create and use machine learning models from data warehouse data using Redshift ML, a capability of Redshift that provides integration with Amazon SageMaker, a fully managed machine learning service.
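As a hedged sketch of what "machine learning for anyone who knows SQL" can look like, the code below builds a CREATE MODEL statement and a prediction query of the kind Redshift ML accepts. The model, table, column and function names, the IAM role ARN and the bucket are hypothetical placeholders, and the clause layout follows our reading of the Redshift ML syntax rather than anything stated in this article.

```python
from typing import List


def create_model_sql(model: str, training_query: str, target: str,
                     function: str, iam_role: str, s3_bucket: str) -> str:
    """Build a CREATE MODEL statement that trains on the result of a query."""
    return (
        f"CREATE MODEL {model}\n"
        f"FROM ({training_query})\n"
        f"TARGET {target}\n"
        f"FUNCTION {function}\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"SETTINGS (S3_BUCKET '{s3_bucket}');"
    )


def predict_sql(function: str, feature_columns: List[str], table: str) -> str:
    """Build a query that calls the trained model's SQL function per row."""
    columns = ", ".join(feature_columns)
    return f"SELECT {function}({columns}) AS prediction FROM {table};"


# Hypothetical churn example; every identifier here is a placeholder.
features = ["age", "plan", "monthly_spend"]
train_sql = create_model_sql(
    model="customer_churn",
    training_query="SELECT age, plan, monthly_spend, churned FROM customers",
    target="churned",
    function="predict_churn",
    iam_role="<iam-role-arn>",
    s3_bucket="<bucket-name>",
)
score_sql = predict_sql("predict_churn", features, "new_customers")

print(train_sql)
print(score_sql)
```

Once the model exists, scoring is an ordinary SELECT, which is why existing BI tools and SQL clients can use the predictions without any ML-specific tooling.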
In 2019, Amazon Redshift also introduced support for geospatial data by adding a new data type to Redshift: geometry. That supports coordinate data in table columns, making it possible to handle geospatial polygons for mapping purposes. This makes it possible to combine location information with other data types when making conventional data warehousing queries and building machine learning models for Redshift.
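To show what combining location with other data means in practice, the sketch below does in plain Python what a spatial containment query does conceptually: it tests which store locations fall inside a polygon and sums their revenue. The ray-casting test, the coordinates and the store records are illustrative assumptions; in a warehouse the containment check would be performed by a spatial SQL function rather than hand-written code.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray going right."""
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def revenue_in_region(stores, polygon: List[Point]) -> float:
    """Sum revenue for stores whose coordinates lie inside the polygon."""
    return sum(
        revenue
        for _name, x, y, revenue in stores
        if point_in_polygon(x, y, polygon)
    )


# Hypothetical data: a square sales region and (name, x, y, revenue) rows.
region = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
stores = [
    ("A", 1.0, 1.0, 100.0),
    ("B", 5.0, 5.0, 200.0),
    ("C", 3.0, 2.0, 50.0),
]
revenue_inside = revenue_in_region(stores, region)
print(revenue_inside)
```

The interesting part is the join between geometry and business data: the polygon filters rows, and the aggregate runs over an ordinary numeric column.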
As data warehousing continues its move to the cloud, it shows no sign of slowing down. Customers can choose offerings from the largest cloud service providers or from third-party software vendors alike. Evaluation criteria will depend on each customer's individual strategy, but the need to scale compute and storage capabilities is sure to factor highly in any decision. One thing's for sure: the cloud will help customers as their big data gets bigger still.
This article is sponsored by AWS.
Original article: The rise of the cloud data warehouse - The Register