Archive for the ‘Database’ Category

DBMS 2.0

Monday, February 18th, 2008

The Reg Developer posting “Time to rewrite DBMS, says Ingres founder” begins:

Abandon SQL

Database management systems (DBMS) are 20 years out of date and should be completely rewritten to reflect modern use of computers.

That’s according to a group of academics including DBMS pioneer Mike Stonebraker, Ingres founder and a Postgres architect taking his second controversial outing so far this year…

In a paper entitled The end of an architectural era (It’s time for a complete rewrite), the group - drawn from DBMS specialists at MIT and in industry - have said that modern use of computers renders many features of mainstream DBMS obsolete.

If you are interested in database evolution, read the Abstract. If it piques your interest, get yourself a copy of the paper.

ABSTRACT

In previous papers, some of us predicted the end of “one size fits all” as a commercial relational DBMS paradigm. These papers presented reasons and experimental evidence that showed that the major RDBMS vendors can be outperformed by 1-2 orders of magnitude by specialized engines in the data warehouse, stream processing, text, and scientific database markets.

Assuming that specialized engines dominate these markets over time, the current relational DBMS code lines will be left with the business data processing (OLTP) market and hybrid markets where more than one kind of capability is required. In this paper we show that current RDBMSs can be beaten by nearly two orders of magnitude in the OLTP market as well. The experimental evidence comes from comparing a new OLTP prototype, H-Store, which we have built at M.I.T. to a popular RDBMS on the standard transactional benchmark, TPC-C.

We conclude that the current RDBMS code lines, while attempting to be a “one size fits all” solution, in fact, excel at nothing. Hence, they are 25 year old legacy code lines that should be retired in favor of a collection of “from scratch” specialized engines. The DBMS vendors (and the research community) should start with a clean sheet of paper and design systems for tomorrow’s requirements, not continue to push code lines and architectures designed for yesterday’s needs.

The first two paragraphs of the Introduction eloquently sum up the history of relational DBMSs.

INTRODUCTION

The popular relational DBMSs all trace their roots to System R from the 1970s. For example, DB2 is a direct descendent of System R, having used the RDS portion of System R intact in their first release. Similarly, SQL Server is a direct descendent of Sybase System 5, which, borrowed heavily from System R. Lastly, the first release of Oracle implemented the user interface from System R.

All three systems were architected more than 25 years ago, when hardware characteristics were much different than today. Processors are thousands of times faster and memories are thousands of times larger. Disk volumes have increased enormously, making it possible to keep essentially everything, if one chooses to. However, the bandwidth between disk and main memory has increased much more slowly. One would expect this relentless pace of technology to have changed the architecture of database systems dramatically over the last quarter of a century, but surprisingly the architecture of most DBMSs is essentially identical to that of System R.

I recall the “database wars” of the mid-1970s. The established Codasyl camp and the new kids on the block, the relational model camp, slugged it out. The relational model sort of won. Codasyl didn’t really lose, though; it is still very much alive today on mainframes (Data Stays Mainly on the Mainframe).

I’m looking forward to the next “database wars.” Who knows, maybe something new and improved will emerge.

…John

Hey DBA, Use Protection

Thursday, November 15th, 2007

The Reg Developer posting “Databases still open to basic attack” points to a survey that says about 500,000 databases are riding bareback on the Internet.

I don’t have a clue why DBAs of these systems seem to be clueless.

…John

There is Nothing Easy about Programming

Thursday, November 15th, 2007

The eWeek posting “Microsoft Pushes Cloud Development” begins by saying:

“Microsoft is at work on a project to enable everyday developers to build Web applications, just as easily as folks have built Visual Basic applications over the years. The project to enable this is called Volta and it is headed up by Erik Meijer…”

I have a problem with this statement. Easily is defined as “without difficulty or effort.” Nothing is easy about any programming, not even using Visual Basic. Or maybe it is more accurate to say especially using Visual Basic, since so many incompatible technologies have been folded into it over the years, making it anything but easy.

Check out the Erik Meijer videos “Volta - Wrapping the Cloud with .NET - Part 1” and “Volta - Wrapping the Cloud with .NET - Part 2.” After viewing them, I walked away with the view that Volta requires understanding the realities and issues of object-oriented classes and methods, graphical user interfaces, Web servers, databases, etc… Anything but easy!

One thing Volta does do is make developing multi-tiered applications easier (mind you, not easy) by not forcing you to commit early in development to the tier on which a class or method resides.

…John

Dirty Data

Friday, March 2nd, 2007

Survey says 25% of all critical information used by Fortune 1000 companies has flaws.

Flawed data includes information that is inaccurate, incomplete or duplicated.

Read more about it in the silicon.com posting Dirty data holding businesses back.

I use the term dangling data to describe another data flaw. Many database folks seem to be oblivious to referential integrity. A common mistake is deleting a record that contains a value referred to by another table, thus breaking referential integrity and leaving the data in the other table dangling. It is a sad state of affairs, because when referential integrity is declared, the database kernel enforces it.
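
To make the dangling data problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customer and invoice tables and the helper function name are made up for illustration. SQLite is used only because it ships with Python, and it enforces foreign keys only when the foreign_keys pragma is switched on; other database systems declare and enforce constraints in their own ways.

```python
import sqlite3


def delete_parent_with_child(enforce: bool) -> str:
    """Delete a customer that an invoice still refers to (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces declared foreign keys when this pragma is ON.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE invoice ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(id),"
        " amount REAL)"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    conn.execute("INSERT INTO invoice VALUES (100, 1, 250.0)")
    try:
        # The invoice row still points at customer 1.
        conn.execute("DELETE FROM customer WHERE id = 1")
    except sqlite3.IntegrityError:
        # The database refused the delete, so nothing is left dangling.
        return "rejected"
    # Count invoice rows whose customer no longer exists.
    dangling = conn.execute(
        "SELECT COUNT(*) FROM invoice"
        " WHERE customer_id NOT IN (SELECT id FROM customer)"
    ).fetchone()[0]
    conn.close()
    return f"deleted, {dangling} dangling"


if __name__ == "__main__":
    print(delete_parent_with_child(enforce=True))
    print(delete_parent_with_child(enforce=False))
```

With enforcement on, the delete is refused; with it off, the delete goes through and the invoice is left pointing at a customer that no longer exists.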

If I were a betting man, I’d wager that financials reported by many companies aren’t anywhere close to reality because of dangling data.

…John

