It seems like a question a child would ask: Why are things the way they are?
It is tempting to answer, because thats the way things have always been. But that would be a mistake. Every tool, system, and practice we encounter was designed at some point in time. They were made in particular ways for particular reasons. And those designs often persist like relics long after the rationale behind them has disappeared. They live on sometimes for better, sometimes for worse.
A famous example is the QWERTY keyboard, devised by inventor Christopher Latham Sholes in the 1870s. According to the common account, Lathams intent with the QWERTY layout was not to make typists faster but to slow them down, as the levers in early typewriters were prone to jam. In a way it was an optimization. A slower typist who never jammed would produce more than a faster one who did.
New generations of typewriters soon eliminated the jamming that plagued earlier models. But the old QWERTY layout remained dominant over the years despite the efforts of countless would-be reformers.
Its a classic example of a network effect at work. Once sufficient numbers of people adopted QWERTY, their habits reenforced themselves. Typists expected QWERTY, and manufacturers made more QWERTY keyboards to fulfill the demand. The more QWERTY keyboards manufacturers created, the more people learned to type on a QWERTY keyboard and the stronger the network effect became.
Psychology also played a role. Were primed to like familiar things. Sayings like better the devil you know and If it aint broke, dont fix it, reflect a principle called the Mere Exposure effect, which states that we tend to gravitate to things weve experienced before simply because weve experienced them. Researchers have found this principle extends to all aspects of life: the shapes we find attractive, the speech we find pleasant, the geography we find comfortable. The keyboard we like to type on.
To that list I would add the software designs we use to build applications. Software is flexible. It ought to evolve with the times. But it doesnt always. We are still designing infrastructure for the hardware that existed decades ago, and in some places the strain is starting to show.
Hadoop offers a good example of how this process plays out. Hadoop, you may recall, is an open-source framework for distributed computing based on white papers published by Google in the early 2000s. At the time, RAM was relatively expensive, magnetic disks were the main storage medium, network bandwidthwas limited, files and datasets were large and it was more efficient to bring compute to the data than the other way around. On top of that, Hadoop expected servers to live in a certain place in a particular rack or data center.
A key innovation of Hadoop was the use of commodity hardware rather than specialized, enterprise-grade servers. That remains the rule today. But between the time Hadoop was designed and the time it was deployed in real-world applications, other facts on the ground changed. Spinning disks gave way to SSD flash memory. The price of RAM decreased and RAM capacity increased exponentially. Dedicated servers were replaced with virtualized instances. Network throughput expanded. Software began moving to the cloud.
To give some idea of the pace of change, in 2003 a typical server would have boasted 2 GB of RAM and a 50 GB hard drive operating at 100 MB/sec, and the network connection could transfer 1Gb/sec. By 2013, when Hadoop came to market, the server would have 32 GB RAM, a 2 TB hard drive transferring data at 150 MB/sec, and a network that could move 10 Gb/sec.
Hadoop was built for a world that no longer existed, and its architecture was already deprecated by the time it came to market. Developers quickly left it behind and moved to Spark (2009), Impala (2013), Presto (2013) instead. In that short time, Hadoop spawned several public companies and received breathless press. It made a substantial albeit brief impact on the tech industry even though by the time it was most famous, it was already obsolete.
Hadoop was conceived, developed, and abandoned within a decade as hardware evolved out from under it. So it might seem incredible that software could last fifty years without significant change, and that a design conceived in the era of mainframes and green-screen monitors could still be with us today. Yet thats exactly what we see with relational databases.
In particular, the persistence is with the Relational Database Management System, or RDBMS for short. By technological standards, RDBMS design is quite old, much older than Hadoop, originating in the 1970s and 1980s. The relational database predates the Internet. It comes from a time before widespread networking, before cheap storage, before the ability to spread workloads across multiple machines, before widespread use of virtual machines, and before the cloud.
To put the age of RDBMS in perspective, the popular open source Postgres is older than the CD-ROM, originally released in 1995. And Postgres is built on top of a project that started in 1986, roughly. So this design is really old. The ideas behind it made sense at the time, but many things have changed since then, including the hardware, the use cases and the very topology of the network,
Here again, the core design of RDBMS assumes that throughput is low, RAM is expensive, and large disks are cost-prohibitive and slow.
Given those factors, RDBMs designers came to certain conclusions. They decided storage and compute should be concentrated in one place with specialized hardware and a great deal of RAM. They also realized it would be more efficient for the client to communicate with a remote server than to store and process results locally.
RDBMS architectures today still embody these old assumptions about the underlying hardware. The trouble is those assumptions arent true anymore. RAM is cheaper than anyone in the 1960s could have imagined. Flash SSDs are inexpensive and incredibly responsive, with latency of around 50 microseconds, compared with roughly 10 milliseonds for the old spinning disks. Network latency hasnt changed as much still around 1 millisecond but bandwidth is 100 times greater.
The result is that even now, in the age of containers, microservices, and the cloud, most RDBMS architectures treat the cloud as a virtual datacenter. And thats not just a charming reminder of the past. It has serious implications for database cost and performance. Both are much worse than they need to be because they are subject to design decisions made 50 years ago in the mainframe era.
One of the reasons relational databases are slower than their NoSQL counterparts is that they invest heavily in keeping data safe. For instance, they avoid caching on the disk layer and employ ACID semantics, writing to disk immediately and holding other requests until the current request has finished. The underlying assumption is that with these precautions in place, if problems crop up, the administrator can always take the disk to forensics and recover the missing data.
But theres little need for that now at least with databases operating in the cloud. Take Amazon Web Services as an example. Its standard Elastic Block Storage system makes backups automatically and replicates freely. Traditional RDBMS architectures assume they are running on a single server with a single point of storage failure, so they go to great lengths to ensure data is stored correctly. But when youre running multiple servers in the cloud as you do if theres a problem with one you just fail over to one of the healthy servers.
RDBMSs go to great lengths to support data durability. But with the modern preference for instant failover, all that effort is wasted. These days youll failover to a replicated server instead of waiting a day to bring the one that crashed back online. Yet RDBMS persists in putting redundancy on top of redundancy. Business and technical requirements often demand this capability even though its no longer needed a good example of how practices and expectations can reinforce obsolete design patterns.
The client/server model made a lot of sense in the pre-cloud era. If your network was relatively fast (which it was) and your disk was relatively slow (which it also was), it was better to run hot data on a tricked-out, specialized server that received queries from remote clients.
For that reason, relational databases originally assumed they had reliable physical disks attached. But once this equation changed, and local SSDs could find data faster than it could be moved over the network, it made more sense for applications to read data locally. But at the moment we cant do this because its not how databases work.
This makes it very difficult to scale RDBMS, even with relatively small datasets, and makes performance with large data sets much worse than it would be with local drives. This in turn makes solutions more complex and expensive, for instance by requiring a caching layer to deliver the speed that could be obtained cheaper and easier with fast local storage.
RAM used to be very expensive. Only specialized servers had lots of it, so that is what databases ran on. Much of classic RDBMS design revolved around moving data between disk and RAM.
But here again, the cloud makes that a moot point. AWS gives you tremendous amounts of RAM for a pittance. But most people running traditional databases cant actually use it. Its not uncommon to see application servers with 8 GB of RAM, while the software running on them can only access 1 GB, which means roughly 90 percent of the capacity is wasted.
That matters because theres a lot you can do with RAM. Databases dont only store data. They also do processing jobs. If you have a lot of RAM on the client, you can use it for caching, or you can use it to hold replicas, which can do a lot of the processing normally done on the server side. But you dont do any of that right now because it violates the design of RDBMS.
Saving energy takes energy. But software developers often choose not to spend it. After all, as the inventor of Perl liked to say, laziness is one of the great programmers virtues. Wed rather build on top of existing knowledge than invent new systems from scratch.
But there is a cost to taking design principles for granted, even if it is not a technology as foundational as RDBMS. We like to think that technology always advances. RDBMS reminds us some patterns persist because of inertia. They become so familiar we dont see them anymore. They are relics hiding in plain sight.
Once you do spot them, the question is what to do about them. Some things persist for a reason. Maturity does matter. You need to put on your accountants hat and do a hard-headed ROI analysis. If your design is based on outdated assumptions, is it holding you back? Is it costing you more money than it would take to modernize? Could you actually achieve a positive return?
Its a real possibility. Amazon created a whole new product the Aurora database by rethinking the core assumptions behind RDBMS storage abstraction.
You might not go that far. But where theres at least a prospect of positive ROI, its a good sign that change is strategic. And thats your best sign that tearing down your own design is worth the cost of building something new in its place.
Avishai Ish-Shalom is developer advocate at ScyllaDB.
Go here to see the original:
The World Has Changed Why Havent Database Designs? - The Next Platform
- Setting up a Virtual Server on Ninefold - Video [Last Updated On: February 26th, 2012] [Originally Added On: February 26th, 2012]
- ScaleXtreme Automates Cloud-Based Patch Management For Virtual, Physical Servers [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Secure Cloud Computing Software manages IT resources. [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Dell unveils new servers, says not a PC company [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Wyse to Launch Client Infrastructure Management Software as a Service, Enabling Simple and Secure Management of Any ... [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- As the App Culture Builds, Dell Accelerates its Shift to Services with New Line of Servers, Flash Capabilities [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Terraria - Cloud In A Ballon - Video [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Ethernet Alliance Interoperability Demo Showcases High-Speed Cloud Connections [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- RSA and Zscaler Teaming Up to Deliver Trusted Access for Cloud Computing [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- [NEC Report from MWC2012] NEC-Cloud-Marketplace - Video [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- IBM SmartCloud Virtualized Server Recovery - Video [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- BeyondTrust Launches PowerBroker Servers Windows Edition [Last Updated On: February 29th, 2012] [Originally Added On: February 29th, 2012]
- Ericsson joins OpenStack cloud infrastructure community [Last Updated On: February 29th, 2012] [Originally Added On: February 29th, 2012]
- ScaleXtreme Cloud-Based Patch Management Open for New Customers [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- RootAxcess - Getting Started - Video [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- How to Create a Terraria Server 1.1.2 (All Links Provided) - Video [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- Dell #1 in Hyperscale Servers (Steve Cumings) - Video [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- Managing SAP on Power Systems with Cloud technologies delivers superior IT economics - Video [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- AMD Acquires Cloud Server Maker SeaMicro for $334M USD [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- Web Host 1&1 Provides More Flexibility with Dynamic Cloud Server [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- Leap Day brings down Microsoft's Azure cloud service [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- RightMobileApps White Label Program - Video [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- bzst server ban #2 - Video [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- “Cloud storage served from an array would cost $2 a gigabyte” [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- More Flexibility with the 1&1 Dynamic Cloud Server [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- Hub’s future jobs may be in cloud [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- Cloud computing growing jobs, says Microsoft [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- TurnKey Internet Launches WebMatrix, a New Application in Partnership with Microsoft [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- Cebit 2012: SAP Cloud Computing Strategy - Introduction - Video [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- Dome9 Security Launches Industry's First Free Cloud Security for Unlimited Number of Servers [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Servers Are Refreshed With Intel's New E5 Chips [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Samsung's AllShare Play pushes pictures from phone to cloud and TV [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Google drops the price of Cloud Storage service [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- New Intel Server Technology: Powering the Cloud to Handle 15 Billion Connected Devices [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Swisscom IT Services Launches Cloud Storage Services Powered by CTERA Networks [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- KineticD Releases Suite of Cloud Backup Offerings for SMBs [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- First Look: Samsung Allshare Play - Video [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Bill The Server Guy Introduces the New Intel XEON e5-2600 (Romley) Server CPU's - Video [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- New Cisco servers have Intel Xeon E5 inside [Last Updated On: March 8th, 2012] [Originally Added On: March 8th, 2012]
- Cisco rolls out UCS servers with Intel Xeon E5 chips [Last Updated On: March 8th, 2012] [Originally Added On: March 8th, 2012]
- From scooters to servers: The best of Launch, Day One [Last Updated On: March 8th, 2012] [Originally Added On: March 8th, 2012]
- Computer Basics: What is the Cloud? - Video [Last Updated On: March 9th, 2012] [Originally Added On: March 9th, 2012]
- Could the digital 'cloud' crash? [Last Updated On: March 10th, 2012] [Originally Added On: March 10th, 2012]
- Dome9 Security Launches Free Cloud Security For Unlimited Number Of Servers [Last Updated On: March 10th, 2012] [Originally Added On: March 10th, 2012]
- Cloud computing 'made in Germany' stirs debate at CeBIT [Last Updated On: March 11th, 2012] [Originally Added On: March 11th, 2012]
- New Key Technology Simplifies Data Encryption in the Cloud [Last Updated On: March 11th, 2012] [Originally Added On: March 11th, 2012]
- Can a private cloud drive energy efficiency in datacentres? [Last Updated On: March 12th, 2012] [Originally Added On: March 12th, 2012]
- Porticor's new key technology simplifies data encryption in the cloud [Last Updated On: March 12th, 2012] [Originally Added On: March 12th, 2012]
- Borders + Gratehouse Adds Three New Clients in Cloud Sector [Last Updated On: March 12th, 2012] [Originally Added On: March 12th, 2012]
- Dell to invest $700 mn in R&D, unveils 12G servers [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- Defiant Kaleidescape To Keep Shipping Movie Servers [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- Data Centre Transformation Master Class 3: Cloud Architecture - Video [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- DotNetNuke Tutorial - Great hosting tool - PowerDNN Control Suite - part 1/3 - Video #310 - Video [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- Cloud Computing - 28/02/12 - Video [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- SYS-CON.tv @ 9th Cloud Expo | Nand Mulchandani, CEO and Co-Founder of ScaleXtreme - Video [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- Oni Launches New Cloud Services for Enterprises Using CA Technologies Cloud Platform [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- SmartStyle Advanced Technology - Video [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- SmartStyle Infrastructure - Video [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- The Hidden Risk of a Meltdown in the Cloud [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- FireHost Launches Secure Cloud Data Center in Phoenix, Arizona [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- Panda Security Launches New Channel Partner Recruitment Campaign: "Security to the Power of the Cloud" [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- NetSTAR, Inc. Announces Safe and Secure Web Browsers for iPhones, iPads, and Android Devices [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- Amazon Cloud Powered by 'Almost 500,000 Servers' [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- NetSTAR Announces Secure Web Browsers For iPhones, iPads, And Android Devices [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Be Prepared For When the Cloud Really Fails [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Dr. Cloud explains dinCloud's hosted virtual server solution - Video [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- New estimate pegs Amazon's cloud at nearly half a million servers [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Amazon’s Web Services Uses 450K Servers [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Saving File On Internet - Cloud Computing - Video [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- DotNetNuke Tutorial - Great hosting tool - PowerDNN Control Suite - part 2/3 - Video #311 - Video [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Linux servers keep growing, Windows & Unix keep shrinking [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Cloud Desktop from Compute Blocks - Video [Last Updated On: March 16th, 2012] [Originally Added On: March 16th, 2012]
- Amazon EC2 cloud is made up of almost half-a-million Linux servers [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- HP trots out new line of “self-sufficient” servers [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- Cloud Web Hosting Reviews - Australian Cloud Hosting Providers - Video [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- Using Porticor to protect data in a snapshot scenario in AWS - Video [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- CDW - Charles Barkley - New Office - Video [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- Nearly a Half Million Servers May Power Amazon Cloud [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- Morphlabs CEO Winston Damarillo talks about their mCloud Rack - Video [Last Updated On: March 20th, 2012] [Originally Added On: March 20th, 2012]
- AMD reaches for the cloud with new server chips [Last Updated On: March 20th, 2012] [Originally Added On: March 20th, 2012]