
June 20th, 2007

Is Relational Relevant?

Posted by John Newton @ 11:09 am

Categories: Database, Information Management, Web 2.0

Tags: Database, Theory, RDBMS, Object-oriented, John Newton

Last week, some friends of mine from Ingres, the early relational database management system, attended a retrospective on relational database systems held at the Computer History Museum in Silicon Valley with other database pioneers from Oracle, Informix, IBM and Sybase. I was an early employee at Ingres, which was the second-best-selling relational database until it unwound itself and eventually got sold. Back in the 1980s when Ingres started, relational was one of the hot topics of computer science. Today the developments covered in the retrospective are treated with the same distance as World War II and viewed with the same level of relevance as assembler code, despite their widespread use.

I was very early in studying relational databases and was fortunate enough to attend the University of California at Berkeley, where a lot of the research in relational theory was happening. I then joined my professors, Mike Stonebraker, Larry Rowe and Gene Wong, as one of the first engineers on the commercial version of Ingres. We competed in very intense deals with a nascent Oracle and Informix and watched in astonishment as Sybase sprang out of the ashes of an early database machine company. In those days, we closely tracked what the researchers in universities and at IBM were doing in the fields of databases, scaling, theory and data distribution. This work became the foundation for running operations in banks and manufacturing, providing the backbone for ERP and CRM, and storing everything from tiny log items to flight plans to battle orders.

(Joke: How many database theoreticians does it take to change a light bulb? Three: one to do the delete, one to do the insert and one to maintain concurrency. Told to me by Gene Wong with typical deadpan delivery.)

By the beginning of the 1990s, Oracle, IBM and Microsoft had hired most of the brains in and coming out of the universities and from each other. A good example is the missing Jim Gray, who provided a lot of the early database material that I read when I was a student in the late 1970s. You can’t blame bright researchers for trying to find a more lucrative opportunity in a red hot software market. The result was that the big database vendors kept everyone else from benefiting from the kind of innovative, basic research breakthroughs that had characterized the previous couple of decades. Some fundamental problems of database management were left unsolved, such as developing efficient and effective models for distributed database systems.

There was always an ongoing war between the various alternative database models like the relational, hierarchical and network models, followed by object-oriented models. In the early 1980s it seemed like the relational model had won, until Rick Cattell of Sun reignited the debate by championing the network model as more of an object-oriented model. Although object databases went nowhere, there has always been an impedance mismatch between object-oriented systems and relational databases. Relational databases smashed object databases with standardization, scalability and integrity. However, object-oriented programmers, even ones who understood the core of relational theory, chose design patterns that suited an object-oriented world.

Thus were born various Object-Relational mapping techniques. It is not widely understood that enterprise systems such as SAP, Siebel and Documentum are object-relational systems at their core, with object-relational development paradigms if not languages. From this perspective the Object-Relational boom was as big as the client/server and enterprise software boom. To simplify new enterprise development, object-relational mapping tools such as TopLink and Hibernate were developed. In the meantime, the database industry created object-oriented extensions in standards like SQL-1999 that nobody uses. SQL, with its COBOL-like language, was increasingly relegated to being more of an access method for storing information or a Byzantine reconstructor of stored objects.

In the Web 2.0 world, populated with unstructured content and XML in huge stores of information flung across the globe, the ultimate distributed database is being tested, but without the tools of the major vendors. Simple databases built upon MySQL are being joined and unioned in a concept known as shards. One of the precepts of relational theory is to hide the physical representation of the database from the logical representation. Shards throw that precept out the window. Huge on-line databases like Flickr, Digg and Salesforce.com are taking the shard approach and managing the queries themselves. The complexity of XML structures means that concepts like normalization go out the window. Object-oriented development drives users to construct and marshal objects out of a data store as efficiently as possible. How can relational theory be relevant when all the rules are being broken?
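To make the shard idea concrete, here is a minimal sketch of what "managing the query themselves" can look like. It is not how Flickr, Digg or Salesforce.com actually do it; the photos table, the routing rule and every function name are hypothetical, and in-memory SQLite databases stand in for separate MySQL servers. The point is only that the application, not the database, knows where the rows physically live and has to union the results by hand.

```python
import sqlite3

def make_shard():
    """Create one in-memory database standing in for a separate MySQL shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE photos (id INTEGER, owner TEXT, title TEXT)")
    return conn

def shard_for(owner, shards):
    """Hypothetical routing rule: the application decides where an owner's rows live."""
    return shards[sum(owner.encode("utf-8")) % len(shards)]

def add_photo(shards, photo_id, owner, title):
    """Write to the one shard that owns this key."""
    shard_for(owner, shards).execute(
        "INSERT INTO photos VALUES (?, ?, ?)", (photo_id, owner, title)
    )

def photos_by_owner(owner, shards):
    """Single-shard lookup: only the shard that owns the key is queried."""
    cur = shard_for(owner, shards).execute(
        "SELECT id, title FROM photos WHERE owner = ? ORDER BY id", (owner,)
    )
    return cur.fetchall()

def all_titles(shards):
    """Cross-shard query: run it everywhere and union the results in application code."""
    titles = set()
    for conn in shards:
        titles.update(title for (title,) in conn.execute("SELECT title FROM photos"))
    return sorted(titles)

shards = [make_shard() for _ in range(2)]
add_photo(shards, 1, "alice", "Sunset")
add_photo(shards, 2, "bob", "Harbor")
add_photo(shards, 3, "alice", "Harbor")

print(photos_by_owner("alice", shards))
print(all_titles(shards))
```

Notice that the physical layout leaks straight into the code: change the routing rule or the number of shards and every caller is affected, which is exactly the precept being thrown out the window.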

Of course relational theory is relevant. It is just that the semantics of the data being managed and how we access it are clouding how we perceive the database. Relational theory decomposes the data management problem into the smallest possible chunks of manageability. It provides models of locking and concurrency that ensure that information is updated or retrieved with a high level of integrity. It provides the basic operations of joining, selecting, projecting and unioning that are being used in massive data stores. It provides the notion that you can describe what is being accessed declaratively rather than procedurally. I like to compare the relationship of relational theory to enterprise object-oriented systems with that of nuclear physics to the organic chemistry that is built upon it.
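As a rough illustration of those basic operations, here is a small sketch of select, project, join and union written as plain Python functions over lists of dictionaries. The function names and the sample employee and department data are my own assumptions for the example, and a real engine would add indexing, query optimization and concurrency control that this ignores. What it shows is that the operators are small, composable, and independent of where the rows came from.

```python
def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, columns):
    """Projection: keep only the named columns and drop duplicate rows."""
    result = []
    for row in rows:
        narrowed = {column: row[column] for column in columns}
        if narrowed not in result:
            result.append(narrowed)
    return result

def join(left, right, key):
    """Equi-join: pair up rows from both sides that agree on the key column."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

def union(first, second):
    """Union: all rows from both inputs, without duplicates."""
    result = []
    for row in first + second:
        if row not in result:
            result.append(row)
    return result

# Hypothetical sample data.
employees = [
    {"name": "Ann", "dept": "eng"},
    {"name": "Bo", "dept": "ops"},
    {"name": "Cy", "dept": "eng"},
]
departments = [
    {"dept": "eng", "site": "north"},
    {"dept": "ops", "site": "south"},
]

# "Names of everyone who works at the north site", composed from the operators.
north_names = project(
    select(join(employees, departments, "dept"), lambda row: row["site"] == "north"),
    ["name"],
)
all_depts = union(project(employees, ["dept"]), project(departments, ["dept"]))

print(north_names)
print(all_depts)
```

The composed expression says what is wanted rather than how to loop over storage to get it, which is the declarative idea in miniature.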

What is becoming less relevant is the view that everything will be stored in a centralized database that somehow magically replicates itself all over the world. Or that all queries of information can be expressed in SQL. Or that a database management system is at the core of managing fragments of XML or JSON. As the patterns of local and global storage become well understood, database management systems will be replaced by distributed libraries that provide the relational operators and transaction control needed to operate on data. It remains to be seen if new models like XQuery, developed by Don Chamberlin, the same guy who helped develop SQL, will fill part of the void.

The losers in this transformation are likely to be the big database vendors. The winners are likely to be the small, lightweight databases like MySQL acting as transactional stores. The big vendors still hold on to some of the talent that would otherwise be working on modernizing relational theory for today’s requirements, but hopefully new research and talent will emerge to help rationalize these changes into a coherent, new relational model.

John Newton has spent 25 years building information management software, including co-founding Documentum with Howard Shao in 1990. He is currently chairman and CTO of Alfresco. See his personal disclosure page for John's industry affiliations.
  • Talkback
  • Most Recent of 5 Talkback(s)
SAP is pre-relational really
SAP was first developed in the 1980s and SQL-DBMS were even more backward with respect to constraint definition than today's products.

Thus SAP were compelled to put a lot of logic that should ... (Read the rest)
Posted by: jorwell Posted on: 07/09/07
