Java Forum / Databases / June 2004

POD speed

Roedy Green - 01 Jun 2004 18:23 GMT
I saw this note on the Prevayler website www.prevayler.com

Queries with Prevayler are more than 9000 times faster than querying
Oracle through JDBC.

Queries with Prevayler are more than 3000 times faster than querying
MySQL through JDBC.

Prevayler is a persistent object database (POD).  See
http://mindprod.com/jgloss/pod.html

There are two surprises.

1. By these numbers, MySQL is 3 times faster than Oracle, yet Oracle
is far more expensive.

2. PODs are that much faster than SQL.
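
To make the second point concrete, here is a minimal sketch of the general idea as I understand it: the objects stay in memory, every change is recorded in a command log, and a query is an ordinary method call. This is not Prevayler's actual API. Every name below (InMemoryStoreSketch, Command, Deposit) is hypothetical, and the list standing in for the log would be a durable file in a real system.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an in-memory store with a command log.
// None of these names come from Prevayler; they are illustrative only.
class InMemoryStoreSketch {

    /** A change to the store, recorded so it can be replayed later. */
    interface Command {
        void applyTo(Map<String, Integer> balances);
    }

    /** Example command: add an amount to one account's balance. */
    static final class Deposit implements Command {
        private final String account;
        private final int amount;

        Deposit(String account, int amount) {
            this.account = account;
            this.amount = amount;
        }

        @Override
        public void applyTo(Map<String, Integer> balances) {
            balances.merge(account, amount, Integer::sum);
        }
    }

    private final Map<String, Integer> balances = new HashMap<>();
    // Stand-in for a durable journal; a real system would write to disk.
    private final List<Command> log = new ArrayList<>();

    /** Writes are logged first, then applied to the in-memory state. */
    void execute(Command command) {
        log.add(command);
        command.applyTo(balances);
    }

    /** Reads are plain lookups on objects that are already in memory. */
    int balanceOf(String account) {
        return balances.getOrDefault(account, 0);
    }

    /** Returns a copy of the log, as it might be saved for recovery. */
    List<Command> logCopy() {
        return new ArrayList<>(log);
    }

    /** Rebuilds a store by replaying a saved log, as recovery would after a restart. */
    static InMemoryStoreSketch replay(List<Command> savedLog) {
        InMemoryStoreSketch store = new InMemoryStoreSketch();
        for (Command command : savedLog) {
            store.execute(command);
        }
        return store;
    }

    public static void main(String[] args) {
        InMemoryStoreSketch store = new InMemoryStoreSketch();
        store.execute(new Deposit("alice", 100));
        store.execute(new Deposit("alice", 50));

        InMemoryStoreSketch recovered = replay(store.logCopy());
        System.out.println(recovered.balanceOf("alice"));
    }
}
```

A read here never leaves the process, while a JDBC query typically involves a driver call, SQL parsing and often a network round trip, which is one plausible source of the gap. The sketch says nothing about concurrency, ad-hoc queries or data sets larger than RAM.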

Signature

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.

Chris Smith - 01 Jun 2004 23:31 GMT
> I saw this note on the Prevayler website www.prevayler.com
>
[quoted text clipped - 13 lines]
>
> 2. PODs are that much faster than SQL.

Roedy,

What you're seeing is Prevayler agreeing with MySQL to look at the world
through red-tinted glasses, while Oracle sees with blue-tinted glasses.  
The resulting pictures are much different.  Notice that you don't get to
see which queries are faster, or what the data looks like for the
queries, or how the database is being used concurrently for other tasks
at the same time.  That's because these details are being tweaked to be
as friendly as possible to the simple object-access case that this
object database (and to a lesser extent MySQL, as well) is tuned for.

This is a common division of database vendors.  Though there are a lot
of exceptions, most object database systems don't really target the
high-scalability, high-reliability audience.  If you aren't doing the
work that makes for that kind of scalability and reliability, it's
really easy to post performance numbers that look out of this world.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Silvio Bierman - 01 Jun 2004 23:44 GMT
> I saw this note on the Prevayler website www.prevayler.com
>
[quoted text clipped - 13 lines]
>
> 2. PODs are that much faster than SQL.

Roedy,

A storage system for persisted objects has nothing to do with a relational
database system apart from the fact that the latter could be used to emulate
the former. Comparing them is plain stupid and I am afraid this tells us a
lot about the guys behind the product.

Silvio Bierman
Roedy Green - 02 Jun 2004 01:52 GMT
>A storage system for persisted objects has nothing to do with a relational
>database system apart from the fact that the latter could be used to emulate
>the former. Comparing them is plain stupid and I am afraid this tells us a
>lot about the guys behind the product.

There are projects that could go POD or SQL.  I think people tend to
overlook the POD approach simply because the SQL approach is more
familiar.

I used to work for Univac, so I am well familiar with tweaking
benchmarks.

However, they are talking about a difference of many orders of magnitude.
Even if this only happens under special circumstances, it means PODs
deserve a second look.

Signature

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.

Silvio Bierman - 02 Jun 2004 22:16 GMT
> >A storage system for persisted objects has nothing to do with a relational
> >database system apart from the fact that the latter could be used to emulate
[quoted text clipped - 11 lines]
> if this only happens under special circumstances, it means PODs
> deserve a second look.

Roedy,

As I already stated I think the POD approach is a draconic simplification
that serves no practical use other than the most trivial applications. I
also said that a RDBMS can be used as an awkward POD storage system so
whenever you consider a POD solution the RDBMS is always an option.

It has nothing to do with familiarity just like serializing objects has
nothing to do with a database. A relational database is a stylized and
standardized way to store data for efficient retrieval through multiple
access paths and through multiple applications. Serializing objects is a
program-(language)-local way of storing a memory-object for recreation at a
later moment.
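
A minimal sketch of what I mean by serialization, using the standard java.io classes. The Customer class and the roundTrip helper are hypothetical and exist only for this illustration. Note that the resulting bytes are in Java's own format and are only readable by a Java program that has a compatible Customer class on its classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical example: store one in-memory object as bytes and recreate it later.
class SerializationSketch {

    /** Illustrative class; any Serializable class would do. */
    static final class Customer implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String city;

        Customer(String name, String city) {
            this.name = name;
            this.city = city;
        }
    }

    /** Writes the object in Java's own serialization format. */
    static byte[] toBytes(Serializable object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    /** Recreates the object; this needs the same class to be available. */
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    /** Convenience wrapper that serializes and immediately deserializes. */
    static Customer roundTrip(Customer original) {
        try {
            return (Customer) fromBytes(toBytes(original));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Customer copy = roundTrip(new Customer("Alice", "Victoria"));
        System.out.println(copy.name + " / " + copy.city);
    }
}
```

There is no schema here that another tool could inspect, and no way to ask "all customers in Victoria" without loading and walking the objects yourself. That is the difference I am pointing at.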

People who mix up the two usually have no experience whatsoever with
developing mission-critical enterprise applications...

Regards,

Silvio Bierman
Roedy Green - 02 Jun 2004 22:42 GMT
>As I already stated I think the POD approach is a draconic simplification
>that serves no practical use other than the most trivial applications

I don't see that.  PODs give you transaction processing, persistence,
infinite RAM.  The one thing you don't get is to hide information.

Signature

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.

Lee Fesperman - 03 Jun 2004 00:22 GMT
> >As I already stated I think the POD approach is a draconic simplification
> >that serves no practical use other than the most trivial applications
>
> I don't see that.  PODs give you transaction processing, persistence,
> infinite RAM.  The one thing you don't get is to hide information.

Too bad you snipped his putdown of you, which you deserved. You are out of your element
here.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Roedy Green - 03 Jun 2004 00:38 GMT
>Too bad you snipped his putdown of you, which you deserved. You are out of your element
>here.

You explained nothing.  Claiming superior knowledge, using a putdown,
without sharing that knowledge is a cheap shot.

Why should I quote his rude remarks? My post had nothing to do with
them.

Signature

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.

Lee Fesperman - 03 Jun 2004 09:17 GMT
> > Too bad you snipped his putdown of you, which you deserved. You are
> > out of your element here.
>
> You explained nothing.  Claiming superior knowledge, using a putdown,
> without sharing that knowledge is a cheap shot.

You can't explain the whole of database concepts in a newsgroup posting.

To put it simply, POD is not a Database Management System (DBMS) because it does not
manage the database. It does not provide data integrity, security and access. It is
nothing more than an object persistence layer.

You are comparing apples to oranges. Without a DBMS, the data cannot be trusted. It is
little more than garbage. Fast access to garbage is meaningless.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Roedy Green - 03 Jun 2004 12:54 GMT
> It does not provide data integrity, security and access. It is
>nothing more than an object persistence layer.

The POD I used, ObjectStore, did provide integrity, by using
transactions.  Granted, it did not let you give selective access to
different fields the way SQL does.  I am not sure what you mean by
security, but perhaps you are referring to the password type, which would
not be hard to implement in a POD.

Signature

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.

Lee Fesperman - 04 Jun 2004 10:37 GMT
> > It does not provide data integrity, security and access. It is
> >nothing more than an object persistence layer.
[quoted text clipped - 4 lines]
> security, but perhaps you are referring to the password type, which would
> not be hard to implement in a POD.

Database integrity involves much more than transactions, security much more than
passwords. We are talking on two different levels. You have a very simplistic view of
database and database management.

I specialize in databases. I participate in comp.lang.java.programmer and read your
strange rant on SQL databases, but I don't worry about such stuff on c.l.j.p. However,
if you want to post uninformed opinions about databases on comp.lang.java.databases, I
for one will call you on it.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Silvio Bierman - 04 Jun 2004 13:02 GMT
First of all, I want to state that the putdowns I posted earlier were not
directed towards Roedy but towards the guys promoting the POD product by
making a speed comparison with RDBMS systems.

In reaction to this thread I have to agree with Lee here. An RDBMS is a way
to put the information (data if you want) above all applications using it.
These applications may be written by you but could very easily be standard
packages like reporting and datawarehousing tools. Access to the data by
these applications can and should be controlled and monitored in a very
detailed way. For enterprise level applications this is the only way to go.
It is not a question of IF users will want to do analysis or reporting on
operational use or gathered data but WHEN and HOW. The same holds for
integration with other software systems.

It is a totally different thing from storing some data that is meaningful
to some application in such a way that it can be retrieved during later
runs. This could be achieved with an RDBMS but, as Roedy and many others have
stated, can also be done differently. Object serialization or any other type
of binary or text format comes to mind, possibly written to plain files or a
data management system.

The value of an RDBMS can be compared to that of widely accepted standard
network protocols. It allows the development of software components in such
a way that they can be combined with others without bothering with
programming-languages, platform specific details and application-local
(nonstandard) encoding of data. That is why architectural boundaries should
be defined in such terms instead of proprietary terms.

Regards,

Silvio Bierman
Chris Smith - 03 Jun 2004 23:43 GMT
> You are comparing apples to oranges. Without a DBMS, the data cannot
> be trusted. It is little more than garbage. Fast access to garbage
> is meaningless.

Ya know (and I'm sure I'll regret jumping in here), it's statements like
the one you just made that convince large parts of the software industry
that most people with database knowledge are idiots.  Of *course* data
that's not in a DBMS is both useful and trustworthy, and of *course*
it's not garbage or anything close to it.  The world has depended on it
for thousands of years.

Database software provides some very nice features that meet the data
storage and retrieval requirements of a fairly large class of
applications.  However, the great majority of applications don't need
database software to manage their data's integrity or security.  That's
just fact.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Lee Fesperman - 04 Jun 2004 10:57 GMT
> > You are comparing apples to oranges. Without a DBMS, the data cannot
> > be trusted. It is little more than garbage. Fast access to garbage
[quoted text clipped - 6 lines]
> it's not garbage or anything close to it.  The world has depended on it
> for thousands of years.

You really should watch your language (idiots). On comp.lang.java.programmer, I called
Dale King on his intemperate language and caused him to stop. You need the same lesson.

You should know I've posted on c.l.j.p. for a long time. Is it your opinion that I am
an idiot (just because I specialize in database)?

Keep on topic. We're not talking about thousands of years. We're talking about
application programs. What gives you the idea that they handle data properly?

> Database software provides some very nice features that meet the data
> storage and retrieval requirements of a fairly large class of
> applications.  However, the great majority of applications don't need
> database software to manage their data's integrity or security.  That's
> just fact.

"Don't need" is opinion, not fact. You might say that about the simplest, single program
application, that will never change (if such is possible), otherwise it is not fact.
Although decisions are made every day to follow that uninformed opinion, they are bad
decisions.

Like Roedy, you're out of your element. I am quite aware of the limits of your knowledge.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Roedy Green - 04 Jun 2004 13:26 GMT
>You really should watch your language (idiots). On comp.lang.java.programmer, I called
>Dale King on his intemperate language and caused him to stop. You need the same lesson.

What we are complaining about is your sudden attack of grand majesty
as if we should all bow to your opinion even though you refuse to
back up or explain your statements.

You try to persuade simply by claiming you know more than others.
That may be so, but it is still a very unconvincing argument.

I also think you are presuming that every problem is like the ones you
specialize in.  Sometimes a sledgehammer is not the appropriate size
tool.

Signature

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.

Lee Fesperman - 04 Jun 2004 21:35 GMT
> > You really should watch your language (idiots). On comp.lang.java.programmer,
> > I called Dale King on his intemperate language and caused him to stop.
[quoted text clipped - 3 lines]
> as if we should all bow to your opinion even though you refuse to
> back up or explain your statements.

Actually, there were no sudden attacks. I've been posting on c.l.j.d for a long time.
You're the newbie over here.

I assume you agree with Smith that most people with database knowledge are idiots. I
suggest you look in the mirror.

Database concepts are a big, complicated area. I'm not going to feed you little
sound-bites because you refuse to educate yourself. Besides, you'll just twist what I
say to fit your simplistic view of the subject (as you already have).

I suggest you read a book or two by Chris Date or Fabian Pascal. If you want a quick
fix, hie yourself over to Database Debunkings (http://www.dbdebunk.com) ... both Chris
and Fabian hang out over there. BTW, I'm one of the founders of Database Debunkings.

> You try to persuade simply by claiming you know more than others.
> That may be so, but it is still a very unconvincing argument.

While you prefer to revel in your lack of knowledge.

> I also think you are presuming that every problem is like the ones you
> specialize in.  Sometimes a sledgehammer is not the appropriate size
> tool.

Your cliches are a waste of time. Relational database has been one of the driving forces
in the great improvements in application development that have occurred in the last 20
years.

You're the one with the sledgehammer --- Object-Orientation (OO). OO is just an ad-hoc
collection of programming techniques with no theoretical foundation. Relational
database, OTOH, has a very solid theoretical, mathematical foundation solidified by over
30 years of research and use.

Any large corporation that doesn't base its core business on an RDBMS is foolish. The
same is true for smaller businesses; however, they've been pretty much ignored by the big 3
-- Oracle, Microsoft (SQL Server) and IBM (DB2). That's a niche my company is trying to
serve while providing better relational capability than those 3.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Chris Smith - 04 Jun 2004 23:54 GMT
> I assume you agree with Smith that most people with database knowledge
> are idiots.

Again, I regret that misunderstanding, and I hope it's been cleared up
by now.  I *don't* think that most people with database knowledge are
idiots, nor do I necessarily think that fewer applications overall
should make use of databases (though there are certainly specific
examples of misuse).  Quite the contrary, in fact.  I'd like to see
databases used to their potential.

What I do lament, then, is the kind of hyperbole that you demonstrated
in your earlier response in proclaiming that data outside of a DBMS is
useless.  This kind of claim, though I'm sure you don't mean it
literally, is obviously false at face value.  The result I've seen over
and over again is that you convince others that your interest in
databases is *your* problem.  After all, the theory goes, if you can't
write a useful application without a database and I can, then doesn't
that indicate a way that your development skills are deficient?  And
these people have, of course, demonstrated time and again that they can
write useful software without a database.

I fight this battle frequently, and I hate seeing the roots of it pop up
here.

My apologies if anything else was read into what I said.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Lee Fesperman - 06 Jun 2004 22:59 GMT
> What I do lament, then, is the kind of hyperbole that you demonstrated
> in your earlier response in proclaiming that data outside of a DBMS is
[quoted text clipped - 9 lines]
> I fight this battle frequently, and I hate seeing the roots of it pop up
> here.

Let me get this straight. You're on a mission to smite down database wherever it rears
its ugly head. You've come on to a database group to show the database idiots the error
of their ways. Your argument is that people went without database for thousands of years
with absolutely no problem, why should they need them now.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Chris Smith - 07 Jun 2004 14:16 GMT
> Let me get this straight. You're on a mission to smite down database
> wherever it rears its ugly head. You've come on to a database
> group to show the database idiots the error of their ways. Your
> argument is that people went without database for thousands of years
> with absolutely no problem, why should they need them now.

Well, no.  That's about as far from the truth as you can get.  I don't
know how I can be any clearer, so let's drop it.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Lee Fesperman - 08 Jun 2004 03:38 GMT
> > Let me get this straight. You're on a mission to smite down database
> > wherever it rears its ugly head. You've come on to a database
[quoted text clipped - 4 lines]
> Well, no.  That's about as far from the truth as you can get.  I don't
> know how I can be any clearer, so let's drop it.

You are irritating. You make intemperate remarks, assert gross generalizations and now
you're jumping around, causing confusion. You've attached this to a proper sub-thread I
have going with Green (acrimonious though it may be.) You and I have another sub-thread
going on exactly the same topic. I'd already posted a solid response on that sub-thread
that brings things back on topic and tries to respond to your assertions. Thus, my
flippant response here. I realize that you and Green are tag-teaming.

Until (and if) you respond to that on topic response, I'll satisfy myself with
examining your motivations, because this is very familiar territory for me (google my
usenet discussions with Carl Rosenberger or James/Neo, for examples.)

You are an OO (Object Orientation) zealot. You see OO as the great panacea, a solution
to all software problems. Don't get me wrong; I absolutely love it for what it is --- a
software development technique.

However, I take a much longer and broader view of things. To me, OO is great simply
because it is better than previous main-stream development techniques. It is only a
small step, though.

OO is just an ad-hoc collection of techniques used by good programmers since the 60's
... I know because I was there ;^) It has no theoretical foundation. Even OO experts
will disagree whether a given, simple object design is correct.

The relational model, OTOH, has a very solid theoretical foundation based on
mathematics. It has been the subject of extensive research for over 30 years. An
integral part of the relational model is Normalization. Normalization is a complete
technique for database structure design. Though there is some ambiguity at the extreme
edges, a database expert can easily determine if a design is correct.

Data structure design in OO has no real theory. The simple question of "which class
should a given field be placed in" has no solid answer in OO. The general advice is "put
it where it belongs." The only guidelines are related to how the field is to be used by
the application. However, this immediately falls apart when functionality is added or
changed.

Normalization basically ignores how data is used and works instead with the meaning of
data, in modelling the real world. Adding/changing application functionality has no
effect on the correctness and usability of a normalized design.
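
As a rough illustration of the kind of problem normalization removes, here is a hypothetical sketch with invented data, written with plain Java collections rather than SQL. The class and method names (NormalizationSketch, OrderRow, citiesFor, cityForOrder) are assumptions made up for this example. In the unnormalized form a fact about the customer is repeated on every order, so a partial update leaves the data contradicting itself. In the normalized form the fact is stored once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; the orders and customers are invented.
class NormalizationSketch {

    /** Unnormalized row: the customer's city is repeated on every order. */
    static final class OrderRow {
        final int orderId;
        final String customer;
        String customerCity;

        OrderRow(int orderId, String customer, String customerCity) {
            this.orderId = orderId;
            this.customer = customer;
            this.customerCity = customerCity;
        }
    }

    /** Distinct cities recorded for one customer; more than one means the rows disagree. */
    static Set<String> citiesFor(List<OrderRow> rows, String customer) {
        Set<String> cities = new TreeSet<>();
        for (OrderRow row : rows) {
            if (row.customer.equals(customer)) {
                cities.add(row.customerCity);
            }
        }
        return cities;
    }

    /** Normalized lookup: an order refers to a customer, and the city is stored once per customer. */
    static String cityForOrder(Map<Integer, String> orderCustomer,
                               Map<String, String> customerCity,
                               int orderId) {
        return customerCity.get(orderCustomer.get(orderId));
    }

    public static void main(String[] args) {
        // Unnormalized: updating only one copy leaves conflicting facts.
        List<OrderRow> rows = new ArrayList<>();
        rows.add(new OrderRow(1, "alice", "Victoria"));
        rows.add(new OrderRow(2, "alice", "Victoria"));
        rows.get(0).customerCity = "Vancouver";
        System.out.println(citiesFor(rows, "alice").size() + " cities on file for alice");

        // Normalized: one update is seen through every order.
        Map<Integer, String> orderCustomer = new HashMap<>();
        orderCustomer.put(1, "alice");
        orderCustomer.put(2, "alice");
        Map<String, String> customerCity = new HashMap<>();
        customerCity.put("alice", "Victoria");
        customerCity.put("alice", "Vancouver");
        System.out.println(cityForOrder(orderCustomer, customerCity, 1)
                .equals(cityForOrder(orderCustomer, customerCity, 2)));
    }
}
```

Nothing about the second layout depends on which screens or reports the application happens to have, which is the point of the paragraph above.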

OO is shallow; RM is deep.

OO concepts are only applicable in one small place in a DBMS (ignoring presentation
issues) --- in modelling domains. Of course, I use OO in implementing DBMSs, but it is
no more than the ink I use in my pen.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Chris Smith - 08 Jun 2004 16:10 GMT
> You and I have another sub-thread  going on exactly the same topic.
> I'd already posted a solid response on that sub-thread that brings
> things back on topic and tries to respond to your assertions. Thus,
> my flippant response here.

Your response never made it to my news server.  It did reach Google
though, so I can continue the conversation there.

Let me take a moment to suggest, though, that you give others the
benefit of the doubt.  You have three times now posted to tell *me* what
*I* am up to, and it's just not so.  You're being horrendously rude to
the point that I suspect you are intentionally misinterpreting what I
say so as to find opportunities to publicly berate me.  If so, I'll be
sure of it soon enough and stop communicating with you.  That's simply
not useful conversation at all.  All you seem to have to say is "you're
so stupid that it's beneath me to explain this to you".  No one is
forcing you to spend time explaining your position on the matter... but
if you don't want to do so, why are you posting?

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Roedy Green - 05 Jun 2004 00:39 GMT
>Database concepts are a big, complicated area. I'm not going to feed you little
>sound-bites because you refuse to educate yourself. Besides, you'll just twist what I
>say to fit your simplistic view of the subject (as you already have).

Surely there is some middle ground between soundbites and your current
Queen Victoria impersonation.  

Sorry, I don't hold you in such God-like esteem so that I believe what
you say without evidence.  

Your very touchiness on the matter suggests a strong emotional bias.

Signature

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.

Lee Fesperman - 06 Jun 2004 22:52 GMT
> >Database concepts are a big, complicated area. I'm not going to feed you little
> >sound-bites because you refuse to educate yourself. Besides, you'll just twist what I
> >say to fit your simplistic view of the subject (as you already have).
>
> Surely there is some middle ground between soundbites and your current
> Queen Victoria impersonation.

Oh! You've been working on your insults.

Actually, you're somewhat discerning; I have taken the gloves off lately. But, what the
hey, you're the one coming into the group and acting like a troll.

Some middle ground sounds like you could make some movement yourself, like learning a
little about db...

> Sorry, I don't hold you in such God-like esteem so that I believe what
> you say without evidence.

I've pointed you to resources; I assume you've ignored them.

Let me give you another kind of evidence...

The database market is dominated by SQL-DBMSs. Large, medium and many small corporations
store their core data in SQL-DBMSs. They do it because SQL-DBMS provide proper
integrity, security, access and other capabilities for data, which no other data model
(or lack of) can come close to.

Are you aware that SQL-DBMSs are a major component in J2EE?

> Your very touchiness on the matter suggests a strong emotional bias.

Now the ad hominems are going towards psychoanalysis. What was the emotional bias of
your "SQL Rant" on c.l.j.p.?

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Chris Smith - 04 Jun 2004 14:49 GMT
> You should know I've posted on c.l.j.p. for a long time. Is it your opinion that I am
> an idiot (just because I specialize in database)?

Lee, I can see how my comments may be interpreted as such, and I
apologize if they were.  No, I don't think you are an idiot.  Instead, I
think you were grossly exaggerating your point.

I meant my comment quite literally, actually; I've encountered a lot of
mocking and belittling aimed at the database crowd, and it nearly always
stems from this worldview that everything in the world must start with a
database.

> Keep on topic. We're not talking about thousands of years. We're talking about
> application programs. What gives you the idea that they handle data properly?

They manage data in a quite sufficient way to provide me and others with
functionality that we need, and that's enough to establish that their
data is quite useful.  Clearly, useful software was created prior to the
existence of databases, and useful software can continue to be created
without databases today.  The question isn't whether a DBMS can provide
additional benefits to a software package -- I agree that often it can;
the question is whether it is even remotely reasonable to characterize
data not maintained in a database as "garbage".  I think it's pretty
clear, to pretty much everyone, that it's not.  Most of the data that I
manage on a daily basis is *not* managed by a DBMS package, and yet I
trust it and use it to make life decisions.

To me, that's a pretty good gauge for whether a DBMS is required to
raise data from "garbage" to usefulness; and it's not.

> > Database software provides some very nice features that meet the data
> > storage and retrieval requirements of a fairly large class of
[quoted text clipped - 9 lines]
> Like Roedy, you're out of your element. I am quite aware of the limits of
> your knowledge.

Okay, I give up.  I trust it's obvious to the rest of the world that
most applications (say, Word or Photoshop or Maple) shouldn't be
retrofitted with a database to preserve their data.  No one needs to be
a database expert to see that.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Lee Fesperman - 06 Jun 2004 03:22 GMT
> I think you were grossly exaggerating your point.

I'll respond to that below.

> I meant my comment quite literally, actually; I've encountered a lot of
> mocking and belittling aimed at the database crowd, and it nearly always
> stems from this worldview that everything in the world must start with a
> database.

Actually, it's because of lack of knowledge.

> > Keep on topic. We're not talking about thousands of years. We're talking about
> > application programs. What gives you the idea that they handle data properly?
[quoted text clipped - 10 lines]
> manage on a daily basis is *not* managed by a DBMS package, and yet I
> trust it and use it to make life decisions.

You're overreaching again. First, you talked about the world depending on data for
thousands of years. You seem to think that using ad-hoc, error-prone techniques is
perfectly fine...

Sure, people created useful software before the existence of databases (or, more to the
point, before the relational model was generally accepted.) I created a lot myself.
However, this was done using primitive tools and ad-hoc technologies. It was hard and
error-prone ... I can attest to that ;^) Your argument is that we shouldn't be using
advanced techniques developed since, because we got by without them in the past. That
kind of Luddite view seems inappropriate in a newsgroup devoted to advanced software
techniques.

Let me try (once again) to bring you back on topic. The original discussion was about
using POD vs. a SQL DBMS, today. My statement was made in that context -- it is my
'opinion' that an object persistence utility (like POD) should only be used for "the
simplest, single application, that will never change." Otherwise, a RDBMS, realistically
a SQL-DBMS (excluding weak systems like MySQL + HSQL,) should be used. POD and such are
ad-hoc and error-prone.

If you like, we can continue the discussion on that basis.

> > > ....., the great majority of applications don't need
> > > database software to manage their data's integrity or security.  That's
[quoted text clipped - 9 lines]
> retrofitted with a database to preserve their data.  No one needs to be
> a database expert to see that.

I'm somewhat nonplussed by your way of arguing. You make great generalizations and claim
them as fact. You portray things as me against the rest of the world (though I'm used to
that.)

I'm afraid I don't know what Maple is, but the other applications you listed are
horizontal, even utilities rather than applications. I believe the 'great majority' of
applications are vertical, which tends to vitiate your point.

Unlike you, I believe most software out there is mediocre at best. I'm embarrassed by my
craft; I'm sure that they could do better. Word is a perfect example. I'm stunned by the
number of software professionals who believe that Microsoft's agenda is the production
of useful software.

Let me make clear that I don't think that relational database is some kind of panacea, a
solution to these problems. It is just one part of the solution, having enormous
advantages over ad-hoc techniques. (Note: I do feel that a general-purpose relational
programming language would be better than the main-stream choices, which I am also
embarrassed by.)

Finally, I'd like to add that I am amazed by your attitude towards expertise and
experts. Perhaps, you lack experience.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Chris Smith - 08 Jun 2004 16:56 GMT
> > I meant my comment quite literally, actually; I've encountered a lot of
> > mocking and belittling aimed at the database crowd, and it nearly always
> > stems from this worldview that everything in the world must start with a
> > database.

Lee Fesperman wrote:
> Actually, it's because of lack of knowledge.

Of course it's because of lack of knowledge.  That's pretty much a
circular statement.  We're talking about the reaction of people without
database understanding to working with people that do the database side
of their projects.  The very definition of the prior group is that they
lack knowledge in this area.

Beyond that, though, I can talk to most people who have knowledge I
don't have (for example, I recently worked with some physicists from a
local college to build a fluid dynamics simulation, and I had no clue
about most of their knowledge), and I am not at all tempted to write
them off as loons.  The difference is that they don't make statements
that are obviously false.  It's clear that there's a rather large
movement of database programmers and administrators who put forward the
idea that anyone who is not using a database is not doing serious
software development work.  Common experience is sufficient to disprove
that extreme version of the value of a database.

I *don't* want to give out the impression that I want to stop the
adoption of relational databases or to remove them from areas where they
are in common use.  I simply get put off by the "you're an idiot because
you would dare advocate anything that conflicts with the purity of my
relational model" approach.

> You seem to think that using ad-hoc, error-prone techniques is
> perfectly fine...

Nope.  But I do think that there are costs to making a design decision
that involves an RDBMS versus a more transparent form of persistence.  
It adds complexity to a project by ensuring that you are working with
two representations of data.  Even the best true O/R mappers (by which I
mean those that map data to existing database schema; those that
generate schema from the app-specific data model are better described as
an OO data persistence layer that happens to use a relational database
as part of the implementation) impose substantial restrictions on your
use of a programming language, in order to ensure that they understand
how to map the result to the database.

(And I don't know if you reject O/R mapping entirely; I know that a lot
of the more extreme "relation data is the world" crowd do.  Personally,
I find them to be rather useful tools; though I do think they are
inappropriate for a substantial class of applications, specifically in
data mining.)

> Your argument is that we shouldn't be using advanced techniques
> developed since, because we got by without them in the past. That
> kind of Luddite view seems inappropriate in a newsgroup devoted to
> advanced software
> techniques.

I'm afraid that you're not correct about my argument.  My argument is
that it's a poor idea to speak in a public forum as if the entire world
must rest on relational databases in order to be worth anything.  That
approach is counterproductive and more than a bit misleading.

> Let me try (once again) to bring you back on topic. The original
> discussion was about using POD vs. a SQL DBMS, today. My statement
[quoted text clipped - 3 lines]
> RDBMS, realistically a SQL-DBMS (excluding weak systems like MySQL +
> HSQL,) should be used. POD and such are ad-hoc and error-prone.

And I disagree with pieces of this, if you want to talk in that
direction.  Specifically, I would say that:

1. The difference needs to focus on the scope and nature of the data,
not the simplicity of the application.  There is a large amount of data
that is so application-private that it makes no sense to expose
it via a universal relational model because no one else will want it
(for example, a parameterized strategy for a particular kind of problem-
solving activity that the software performs).  There is a second class
of data that's sufficiently inappropriate for the relational model that
it's best to steer clear (for example, I'd shudder to imagine
representing an arbitrary math formula in relational tables and trying
to perform useful translations on it).

2. I disagree with the "that will never change".  This isn't just a
relational DB thing; I generally disagree with the idea of designing for
what a business software system will be like years from now.  Experience
has shown that in business software development, we rarely have better
than a vague idea of what the application will look like six months from
now.  If the application moves in a direction where a relational
database is helpful, it's easy enough to migrate the data over,
especially if it's local enough that it made sense to use a transparent
object persistence layer for storage.

(Incidentally, this is different in system-level software such as
compilers or the databases themselves; the difference is that you have a
more solid picture of what the final system should look like and the
problem that it solves is more exactly described.  These are
characteristics that will never be true of most business software simply
because it's fundamentally rooted in human factors.)

> I'm afraid I don't know what Maple is, but the other applications you
> listed are horizontal, even utilities rather than applications. I
> believe the 'great majority' of applications are vertical, which tends
> to vitiate your point.

(Incidentally, Maple is a computer algebra package.)

I don't see that.  I have no numbers, but I tend to believe that back-
end enterprise software systems comprise by no means the average
software system in use.  In a typical day, I use at least half a dozen
to a dozen end-user software applications, and I interact with about
four enterprise-level applications.  When you consider that the majority
of the world is not employed in IT or software engineering, I'd tend to
think that the numbers differ even further for others.

Regardless, it seems worthwhile to restrict discussion to enterprise
business software, since that's generally the target audience for this
newsgroup.

> Unlike you, I believe most software out there is mediocre at best.
> I'm embarrassed by my craft; I'm sure that they could do better.

That would be a difference between us.  Again, I would tend to link it
to our differing perspectives on database as well.  I tend to think that
software development is doing fairly well.  Sure there are horrendous
failures, but there are plenty of successes as well.  Software
development is fundamentally about human communication... interpreting
what people want and translating it into a precise description in
software.  Given that, we're never going to get it down to the same
level of predictable processes and results as, say, welding or house
framing; but we are doing fairly well.

Predictably, the most common reason for the failure of software projects
is poor requirements.  Databases aren't going to change that.  What will
change that is hiring software developers who know how to talk to
people, understand what they want, and help the people they are working
with to understand what they want.

> Finally, I'd like to add that I am amazed by your attitude towards
> expertise and experts. Perhaps, you lack experience.

I guess you couldn't get by without one last personal attack.  Ah well,
perhaps you'll respond to the conversation above.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Lee Fesperman - 11 Jun 2004 09:48 GMT
> Lee Fesperman wrote:
> > You seem to think that using ad-hoc, error-prone techniques is
[quoted text clipped - 10 lines]
> use of a programming language, in order to ensure that they understand
> how to map the result to the database.

Nope? Can I get something more to the point? Do you believe that such alternatives (such
as, simple object persistence systems) are not ad-hoc and error-prone? I know of no
theoretical basis for these techniques. Can you enlighten me on this issue?

I do see a number of difficulties with using O/R mappers. I expect only small
improvements in the future. A good O/R mapping solution is always going to require
significant participation by a 'developer' with problem domain knowledge.

The double representation of data is natural. The application view is concerned with
the functionality of the individual application. An application written in an OO
language would prefer data expressed as objects oriented to operations performed by the
application. The database view is concerned with persistence, use by different
applications, integrity and other database considerations. These are fundamentally
different concerns.

Would you see it as reasonable that two applications accessing the same database tables
might use a different object/relational mapping -- that is, different objects? To do
otherwise would be violating encapsulation.

> (And I don't know if you reject O/R mapping entirely; I know that a lot
> of the more extreme "relation data is the world" crowd do.  Personally,
> I find them to be rather useful tools; though I do think they are
> inappropriate for a substantial class of applications, specifically in
> data mining.)

Object/Relational mapping is OK. It elucidates the dichotomy between application
orientation and database orientation. Application considerations shouldn't influence the
database structure design. O/R mapping could help prevent that.

> > Let me try (once again) to bring you back on topic. The original
> > discussion was about using POD vs. a SQL DBMS, today. My statement
[quoted text clipped - 17 lines]
> representing an arbitrary math formula in relational tables and trying
> to perform useful translations on it).

Certainly, the scope and nature of the data should be a focus point. However, you give 2
specialized examples without sufficient description to understand them. I can't tell
from that whether an RDBMS is appropriate. It might be, but I don't see any reason to
delve into them just to judge the validity of your assertions.

> 2. I disagree with the "that will never change".  This isn't just a
> relational DB thing; I generally disagree with the idea of designing for
[quoted text clipped - 5 lines]
> especially if it's local enough that it made sense to use a transparent
> object persistence layer for storage.

What you're talking about might be called "design for the future". I am talking about
designing for the future but not in the sense of predicting changes, including hooks and
'extraneous' code, etc. I was referring to designing a normalized database structure,
with no extraneous tables/columns.

Because Normalization and RM are concerned with the meaning of data in modeling the real
world, a properly designed database is much more amenable to change than any other data
model. RM eases migration to new structures with powerful abstractions like views,
allowing existing applications to run without change. Views have a number of other vital
uses (the lack of them is one reason I call MySQL a weak system.)

> (Incidentally, this is different in system-level software such as
> compilers or the databases themselves; the difference is that you have a
> more solid picture of what the final system should look like and the
> problem that it solves is more exactly described.  These are
> characteristics that will never be true of most business software simply
> because it's fundamentally rooted in human factors.)

I have been considering system software as different from applications, though I have
written compilers/interpreters that used a relational database.

> > I'm afraid I don't know what Maple is, but the other applications you
> > listed are horizontal, even utilities rather than applications. I
[quoted text clipped - 10 lines]
> of the world is not employed in IT or software engineering, I'd tend to
> think that the numbers differ even further for others.

We seem to be counting different things. You are counting users, and I am counting
different programs (developers). That's why I considered horizontal software unimportant
-- there are only a few main-stream word processors.

> Regardless, it seems worthwhile to restrict discussion to enterprise
> business software, since that's generally the target audience for this
> newsgroup.

I was considering a different range of applications, those using or contemplating using
a database, from POD to Oracle. Limiting the discussion to enterprise business
applications would virtually eliminate POD from consideration. I think the former better
fits this group, anyway. OTOH, I believe discussions of whether a database should be
used or not are also germane to the group (there is no
comp.lang.java.databases.advocacy.)

> > Unlike you, I believe most software out there is mediocre at best.
> > I'm embarrassed by my craft; I'm sure that they could do better.
[quoted text clipped - 8 lines]
> level of predictable processes and results as, say, welding or house
> framing; but we are doing fairly well.

Not surprisingly, I'm more concerned with improving current tools, rather than improving
people skills. Certainly training will help with the latter.

With main-stream languages, programs are complex, fragile and error-prone. One of the
big reasons is that they are basically procedural and use variables.

As to turning software development into a craft or engineering discipline, I don't see
that as a reachable goal. Programming is too dynamic. I've always felt that the primary
reason for developing a program is because one doesn't already exist.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Chris Smith - 11 Jun 2004 16:13 GMT
Lee,

Thanks for a very civil and thoughtful response.

> Nope? Can I get something more to the point? Do you believe that such
> alternatives (such as, simple object persistence systems) are not
> ad-hoc and error-prone? I know of no theoretical basis for these
> techniques. Can you enlighten me on this issue?

My fault.  Again, I seem to have misinterpreted your meaning.  I don't
consider use of such a POD product to be without disadvantages.  It's in
that sense that I meant I don't think it's "fine".  But I do believe
that there are situations where these disadvantages can be justified.  
So I do think it's okay to use a POD if other concerns override their
disadvantages.

As for the two adjectives, I am not familiar with the product mentioned
there, so I don't know if it provides referential integrity guarantees.  
That would be my primary concern with calling it "error-prone".  Outside
of these referential integrity guarantees, pretty much anything that
could be checked in the database could also be checked in application
code.  Since I'd never advocate using a product such as that for data
that spans applications, that would be perfectly sufficient.  The OO
model would, in this case, provide one place (in the class's mutator
methods) to check the validity of data within an object.

I'm not sure of any definition of "ad hoc" that makes it sensible to
debate whether or not any persistence product is "ad hoc" or not.  Just
to ensure that you'll return to dismissing my opinion as unworthy, I
simply don't see the point in this insistence that relational databases
are based on mathematical theory.  Yes they are, of course, and that
makes it much more possible to write query optimizers that do a good job
of finding data; but it doesn't implicitly mean that the database is any
more appropriate for representing data from the real world.  (I'm not
disputing, by the way, that the relational model is complete, in that
any data can be expressed according to that model.  That much can be
proven mathematically.)

> The double representations of data is natural.  The application view is
> concerned with the functionality of the individual application.

That's true, but doesn't change the fact that there are two different
representations of data that the application code needs to deal with.  
This is really very simple.  A POD doesn't require you to think about
your relational model of data (since it doesn't exist), and as far as I
can see there are no functional O/R mappers that don't require it.  
Conclusion: when both are applicable, the POD is the simpler solution in
that way.

It doesn't change things to say that there are different tasks at hand.  
That may be true, but the POD can do the persistence part without
exposing the additional representation of data to the application.  The
O/R mapper takes you a good distance in that direction, but in the end
often requires very non-OO ways of writing those classes or of working
with them.

> Would you see it as reasonable that two applications accessing the
> same database tables might use a different object/relational mapping
> -- that is, different objects?

It certainly seems reasonable, and even expected.  I'm not entirely
clear on what you mean; I'd expect the majority of differences between
O/R mapping to come from (in approximately this order):

1. Using different relations.
2. Using different fields of relations.
3. The way that relationships between entities are mapped.

You've excluded #1, but the remaining two still apply.  If you're
talking about something else, though, then I'd like to hear more.  The
only other differences that come to mind would not be part of the O/R
mapper (for example, the classes would likely declare different methods
for doing things to the data once it's present).
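
As a rough sketch of difference #2 (everything here is hypothetical: the
"customer" row, its column names, and both classes are made up for
illustration, and the row is a plain Map rather than the output of any real
O/R mapper), two applications might map the same row to different objects
like this:

```java
import java.util.Map;

// Application A (billing): maps only the name and balance columns.
final class BillingCustomer {
    final String name;
    final long balanceCents;

    BillingCustomer(Map<String, Object> row) {
        this.name = (String) row.get("name");
        this.balanceCents = (Long) row.get("balance_cents");
    }
}

// Application B (shipping): maps the same row, but only name and address.
final class ShippingCustomer {
    final String name;
    final String address;

    ShippingCustomer(Map<String, Object> row) {
        this.name = (String) row.get("name");
        this.address = (String) row.get("address");
    }
}
```

Each class sees only the columns its application cares about, so neither
depends on the other's view of the data.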

[...]

> What you're talking about might be called "design for the future". I
> am talking about designing for the future but not in the sense of
> predicting changes, including hooks and 'extraneous' code, etc.. I
> was referring to designing a normalized database structure,
> with no extraneous tables/columns.

Sure, but I still think the concern is still there.  If it's more
complex to use a relational database than a transparent object
persistence layer, then there's still not a very good reason to place
application-private data into a relational database on the suspicion
that it might change to become more useful.  A data migration tool from
the POD to the relational database could be written and run in probably
less than a day as part of the future changes.

Of course, this doesn't apply to applications where it's *not* more
complex to use a relational database (such as when there's a requirement
to select certain data based on rather involved criteria; something that
PODs are conventionally not so good at or that often require learning
their own rather complex alternative query APIs).  In that case, I'd go
for the relational database from the beginning.

(Regarding that last sentence: by and large, I don't buy the argument
that PODs should be used because they are faster, which Roedy used to
start this thread; in the few I've tested in this way, the advantage is
reversed when the quantity of data or the number of concurrent requests
gets too high.  That's not to say I don't think single apps with low
quantities of private data are *important*, but just that I don't think
performance is a major design factor in data access for those
applications.)

> RM eases migration to new structures with powerful abstractions like
> views, allowing existing applications to run without change.

Perhaps so, but aren't we talking about data that is application-
private, and that only later changes to become more widely useful?  In
that case, change to the original application is nearly certain to go
hand in hand with the change to make the original data more widely
useful.

> Limiting the discussion to enterprise business applications would
> virtually eliminate POD from consideration.

Sorry; forget I said enterprise.  I should have had more forethought,
and realized that enterprise has as many meanings as people using the
word, and that in the past decade or so it has changed common meanings
to "very large" or "big enough that you should use my product".  What I
really meant was shared, not end-user installed.

> With main-stream languages, programs are complex, fragile and error-
> prone. One of the big reasons is that they are basically procedural
> and use variables.

Really, though, the most important reason that main-stream programs are
complex is that they solve complex problems.  From there, "error-prone"
and "fragile" naturally follow.  That's not to say that tools can't
help, but just that we should evaluate tools on the basis of knowing
that they will be used in complex software.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Lee Fesperman - 14 Jun 2004 10:01 GMT
> > Nope? Can I get something more to the point? Do you believe that such
> > alternatives (such as, simple object persistence systems) are not
[quoted text clipped - 7 lines]
> So I do think it's okay to use a POD if other concerns override their
> disadvantages.

It's my experience that people don't investigate enough to determine the disadvantages.
More below...

> As for the two adjectives, I am not familiar with the product mentioned
> there, so I don't know if it provides referential integrity guarantees.
[quoted text clipped - 5 lines]
> model would, in this case, provide one place (in the class's mutator
> methods) to check the validity of data within an object).

Good, you do agree that the simple object persistence model is not appropriate when data
spans applications.

Enforcement of database constraints in application code is one solution generally
recognized as 'error-prone'. Here are some reasons:

+ Constraint enforcement is hidden in application code. It is hard to check (validate)
and requires programming language knowledge. The code is procedural.

+ Data constraint checking is difficult to code correctly.

+ Even in a single application, constraints may need to be applied in several places in
the program. Incomplete coverage is common.

+ Anyone maintaining the code must be aware of all constraint requirements.

+ If the database system supports a generic tool for manipulation of data, application
constraints will be ignored.
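
A minimal Java sketch of the incomplete-coverage point, assuming a made-up
Inventory class with the rule that a quantity is never negative (the class
and the rule are hypothetical, not taken from any real product):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class. Intended rule: every stored quantity is >= 0.
final class Inventory {
    private final List<Integer> quantities = new ArrayList<>();

    // Original write path: enforces the rule.
    void add(int quantity) {
        if (quantity < 0) {
            throw new IllegalArgumentException("negative quantity: " + quantity);
        }
        quantities.add(quantity);
    }

    // Write path added later for bulk loading: the check was forgotten.
    void addAll(List<Integer> batch) {
        quantities.addAll(batch);
    }

    // Reports whether the intended rule still holds for all stored data.
    boolean isConsistent() {
        for (int q : quantities) {
            if (q < 0) {
                return false;
            }
        }
        return true;
    }
}
```

Calling add(-1) is rejected, but addAll with a batch containing -1 succeeds
and leaves the data violating the rule; nothing in the code signals that the
second path skipped the check.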

> I'm not sure of any definition of "ad hoc" that makes it sensible to
> debate whether or not any persistence product is "ad hoc" or not.  Just
[quoted text clipped - 7 lines]
> any data can be expressed according to that model.  That much can be
> proven mathematically.)

Good, you agree that RM has a solid mathematical foundation.

However, the Relational Model is not an academic exercise in math. It is explicitly
intended for modeling real world entities and relationships. Decades of use have shown
that it does a very good job of it.

OO has been sold for its real world modeling but has not met that promise. OO's emphasis
is on programming artifacts, not real world entities/relationships. See Date's writings
on this subject (3rd Manifesto or check dbdebunk.com.)

> > The double representations of data is natural.  The application view is
> > concerned with the functionality of the individual application.
[quoted text clipped - 6 lines]
> Conclusion: when both are applicable, the POD is the simpler solution in
> that way.

Only if you insist that persistent data and dynamic data have identical considerations
and concerns. I'm not sure that even applies to POD.

> > What you're talking about might be called "design for the future". I
> > am talking about designing for the future but not in the sense of
[quoted text clipped - 9 lines]
> the POD to the relational database could be written and run in probably
> less than a day as part of the future changes.

I don't see the migration as easy. Simply, a persistent data design that ignores
database design concepts is going to be harder to get back on track.

> > RM eases migration to new structures with powerful abstractions like
> > views, allowing existing applications to run without change.
[quoted text clipped - 4 lines]
> hand in hand with the change to make the original data more widely
> useful.

Lockstep (hand in hand) changes are much harder; they tend to be all or nothing. RM
abstractions allow piecemeal changes. This is useful even when only one application is
involved.

> > Limiting the discussion to enterprise business applications would
> > virtually eliminate POD from consideration.
[quoted text clipped - 4 lines]
> to "very large" or "big enough that you should use my product".  What I
> really meant was shared, not end-user installed.

OK, a broader category is good. It's a fact of life that enterprise in Java implies J2EE
;^)

> > With main-stream languages, programs are complex, fragile and error-
> > prone. One of the big reasons is that they are basically procedural
[quoted text clipped - 5 lines]
> help, but just that we should evaluate tools on the basis of knowing
> that they will be used in complex software.

I disagree. Truly better tools would dramatically improve the robustness of
applications. Procedural code is error-prone.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Chris Smith - 14 Jun 2004 15:41 GMT
> Good, you do agree that the simple object persistence model is not
> appropriate when data spans applications.
[quoted text clipped - 5 lines]
> check (validate) and requires programming language knowledge. The code
> is procedural.

Hmm.  This really strikes me as the expression of a personal preference.  
After all, I could just as easily respond that when you check data
constraints in databases, the constraint enforcement is hidden in the
database, and is hard to check without knowledge of the relational
model, and the code is SQL.  Without accepting that SQL is fundamentally
easier to deal with than Java (which I don't), my argument sounds as
convincing as yours - which is to say, not at all.  There's no
fundamental difference in readability between these two:

   if (value > 100) throw new IllegalArgumentException();
versus
   CHECK (value <= 100)

Of course, if you're comfortable with SQL then the latter will be
familiar, and if you're comfortable with the Java language then the
former will be familiar.

> + Data constraint checking is difficult to code correctly.

Really?  Outside of referential integrity (which is sometimes a problem
for these kinds of products and which I mentioned earlier), what's so
hard?  You make your instance variables private, write a mutator, and
fail the mutator if the data isn't good.  The concept has been taught in
every intro-level class for an OO programming language for the past
decade.  It's no harder to write if statements in Java than to write
check constraints in SQL.
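
For what it's worth, here is a minimal sketch of the pattern I mean.  The
Reading class and the limit of 100 are made up for illustration; they just
mirror the one-line comparison above.

```java
// Hypothetical example class; name and limit are illustrative only.
final class Reading {
    private int value; // private, so every write goes through setValue

    Reading(int value) {
        setValue(value); // the constructor uses the same checked path
    }

    int getValue() {
        return value;
    }

    // The mutator is the single place where the constraint is enforced.
    void setValue(int value) {
        if (value > 100) {
            throw new IllegalArgumentException("value must be <= 100, got " + value);
        }
        this.value = value; // only reached when the check passed
    }
}
```

A failed setValue call throws before the assignment, so the object keeps its
previous, valid state.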

> + Even in a single application, constraints may need to be applied in
> several places in the program. Incomplete coverage is common.

Maybe you should give an example.  I don't see this being true in
practice.  People do practice encapsulation in OO code at least at the
lowest levels, and duplication of something like mutator-level data
validation is quite rare.  Code duplication problems in practice arise
with application logic instead.  Perhaps there are things possible with
SQL data constraints that I'm not imagining, in which case I'd like to
find out.

> + Anyone maintaining must be aware of all constraint requirements.

Well, or they need to just not mess with argument checking in the
mutator methods.  I'd venture someone would be no more likely to remove
that argument checking than they would be to remove arbitrary data
constraints from the SQL database.

> + If the database system supports a generic tool for manipulation of
> data, application constraints will be ignored.

Sure, that's true.  If such a tool is used for data entry or
modification, though, then the data is no longer application-private.  I
could imagine using such a tool for testing and still considering the
data app-private, but then I don't care about data integrity, because
I'm just testing.  If I get something wrong, a test may fail, and then
I'll find it and fix it.

> Good, you agree that RM has a solid mathematical foundation.

It would be hard not to agree.

> However, the Relational Model is not an academic exercise in math. It
> is explicitly intended for modeling real world entities and
[quoted text clipped - 5 lines]
> entities/relationships. See Date's writings on this subject (3rd
> Manifesto or check dbdebunk.com.)

I can only say that, if we're talking about the same thing, I couldn't
disagree more.  People can be taught to translate real-world data into
relational tables, and if that's what it takes to get good access to
data, then it's certainly a small price to pay.  It is a translation,
though.  I definitely find that object-oriented programming provides a
great way to model real-world concepts.

I have read C.J. Date's 3rd Manifesto, incidentally, and found some very
interesting ideas there, but not a convincing argument that object-
oriented programming is less suited to real-world modeling of data than
the relational model.  Date seems convinced that by proving that OO
models are less rigorous than relational models he has done this job,
when that misses the point entirely.  The point is that data from the
real world *doesn't* conform naturally to a rigorous model.

> Only if you insist that persistent data and dynamic data have identical
> considerations and concerns. I'm not sure that even applies to POD.

So name a "concern" that applies to persistent data but not dynamic
data, which would result in a relational model being more appropriate.  
As I understand it, we're excluding performance here; I'd just like to
see a case where the relational model is clearly, of itself, more
appropriate just because data isn't in use at the moment.

> I don't see the migration as easy. Simply, a persistent data design
> that ignores database design concepts is going to be harder to get
> back on track.

That doesn't match my experience.  I write data migration utilities all
the time.  They aren't hard.  They are so easy that for the most part, I
just throw away the code after it's written, because it's as easy to
rewrite it the next time around.

> Lockstep (hand in hand) changes are much harder; they tend to be all
> or nothing. RM abstractions allow piecemeal changes. This is useful
> even when only one application is involved.

Again that just doesn't match my experience.  If you'd like to explain
why this is useful for application-private data, please go ahead.  I've
had little problem with managing these changes myself.

(To provide context for these last few comments, the application I spend
a lot of time maintaining in my job does use a PostgreSQL database, but
maintains large amounts of data outside the database.  We recently
attempted to move this data into the database, but then reversed the
changes because PostgreSQL had trouble with storing large amounts of
data (up to 40 MB) in a single field.  Writing the code to move data
from the filesystem into these table fields and vice versa, and
arranging for them to be run during scheduled maintenance, proved to be
fairly trivial and problem-free.)

> I disagree. Truly better tools would dramatically improve the robustness
> of applications. Procedural code is error-prone.

Is that last sentence meant in general, or is there some context that
I've missed?

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Lee Fesperman - 18 Jun 2004 10:01 GMT
> > Enforcement of database constraints in application code is one solution
> > generally recognized as 'error-prone'. Here's some reasons:

I'm skipping most of your responses to my list of reasons because you are assuming a
specific implementation. There are a large number of ways of implementing this
functionality, most of which exhibit the problems that I listed.

You could claim (without evidence) that a majority would code it your way, or even a
vast majority, but that would still leave 'some' minority which will have mistakes. You
can't guarantee that a proper implementation will be used. OTOH, SQL's declarative
syntax is guaranteed not to have the problems I listed.

I do have comments on one item ...

> > + If the database system supports a generic tool for manipulation of
> > data, application constraints will be ignored.
[quoted text clipped - 5 lines]
> I'm just testing.  If I get something wrong, a test may fail, and then
> I'll find it and fix it.

You expound on an uninteresting case. The real issue is when such a tool is used on a
production database. Note that the tool might not even exist when the application is
written.

> > However, the Relational Model is not an academic exercise in math. It
> > is explicitly intended for modeling real world entities and
[quoted text clipped - 12 lines]
> though.  I definitely find that object-oriented programming provides a
> great way to model real-world concepts.

Your claim is not supportable. OO programming is based entirely on programming artifacts
invented for programming convenience and not for modeling the real world.

> I have read C.J. Date's 3rd Manifesto, incidentally, and found some very
> interesting ideas there, but not a convincing argument that object-
[quoted text clipped - 3 lines]
> when that misses the point entirely.  The point is that data from the
> real world *doesn't* conform naturally to a rigorous model.

Yes, Date proves that the OO model is ad-hoc thus error-prone. RM deals with 'facts'
that can be represented as values in a rigorous way. Not all data is of this form, but
data that isn't is not suitable for deriving additional meaning or conclusions through
automated processing.

> > Only if you insist that persistent data and dynamic data have identical
> > considerations and concerns. I'm not sure that even applies to POD.
[quoted text clipped - 4 lines]
> see a case where the relational model is clearly, of itself, more
> appropriate just because data isn't in use at the moment.

Persistent data always needs to be consistent across the database. Dynamic data does not
always need to be consistent with persistent data or within the application.

> > I don't see the migration as easy. Simply, a persistent data design
> > that ignores database design concepts is going to be harder to get
[quoted text clipped - 4 lines]
> just throw away the code after it's written, because it's as easy to
> rewrite it the next time around.

Good, you've learned some things outside your experience on usenet.

Improperly structured data and associated program logic require more changes. Each
change increases the risk of error.

> > Lockstep (hand in hand) changes are much harder; they tend to be all
> > or nothing. RM abstractions allow piecemeal changes. This is useful
[quoted text clipped - 3 lines]
> why this is useful for application-private data, please go ahead.  I've
> had little problem with managing these changes myself.

The idea that all or nothing is more difficult than a piecemeal approach is outside your
experience?

> (To provide context for these last few comments, the application I spend
> a lot of time maintaining in my job does use a PostgreSQL database, but
[quoted text clipped - 5 lines]
> arranging for them to be run during scheduled maintenance, proved to be
> fairly trivial and problem-free.)

Sounds like your all or nothing approach failed you.

You obviously needed an industrial strength RDBMS.

> > I disagree. Truly better tools would dramatically improve the robustness
> > of applications. Procedural code is error-prone.
>
> Is that last sentence meant in general, or is there some context that
> I've missed?

It's meant in general. For example, proving procedural code is correct is so hard that
it is rarely attempted.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Chris Smith - 18 Jun 2004 17:04 GMT
> You could claim (without evidence) that a majority would code it your
> way, or even a vast majority, but that would still leave 'some'
> minority which will have mistakes. You can't guarantee that a proper
> implementation will be used. OTOH, SQL's declarative syntax is
> guaranteed not to have the problems I listed.

Well, "guaranteed" is an interesting word.  I think you mean guaranteed
if someone does it right and the constraints can be expressed
declaratively, but then you're in the same boat.  I wonder how many data
integrity checks can be done in Java, but can't be done in a database
declarative constraint.  For example, let's say that an integer value is
supposed to be guaranteed to have neutral disparity -- which means an
equal number of 1 and 0 bits in the binary representation.  Can you
write a CHECK constraint to ensure that this remains the case?  (I don't
know that you can't, but I'd be interested to find out.)
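
For reference, the Java side of that check is short. The following is only a sketch: it assumes the value is a 32-bit int, which the post does not specify, and the class and method names are made up for illustration. Whether an equivalent CHECK constraint can be written depends on the bit functions a given DBMS offers, which this sketch does not address.

```java
// Hypothetical helper; Disparity and isNeutral are illustrative names only.
final class Disparity {
    private Disparity() {}

    // True when a 32-bit int has exactly sixteen 1 bits and sixteen 0 bits.
    // Assumption: "integer value" means a Java int (32 bits).
    static boolean isNeutral(int value) {
        // Integer.bitCount counts the 1 bits in the two's-complement form.
        return Integer.bitCount(value) == Integer.SIZE / 2;
    }

    public static void main(String[] args) {
        System.out.println(isNeutral(0x0000FFFF)); // sixteen 1 bits
        System.out.println(isNeutral(7));          // three 1 bits
    }
}
```

An application would call isNeutral before storing the value and reject the update when it returns false.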

I frequently see people asking about implementing certain constraints on
data, and being told to use a trigger to write pseudo-procedural code,
because a declarative constraint can't do what they want.  Of course,
many more people in similar situations will just leave out the
constraint and forget about it altogether.

It would be interesting, if only it were possible, to find out how often
constraints on data are typically enforced in databases versus
application code.  I don't know and neither do you, of course; but I'm
pretty sure that despite your "guarantee", the answer would not be 100%.  
You're willing to reject encapsulation in an OO model because people
might not use it, despite its being universally taught in every
programming class (professional, college, or whatever) I've seen as one
of the most important aspects of doing OO programming; but you're
willing to assume that everyone will go out of their way to make sure
they are checking all the data in their database?  This, again, flies in
the face of my experience.

> > Sure, that's true.  If such a tool is used for data entry or
> > modification, though, then the data is no longer application-private.  I
[quoted text clipped - 6 lines]
> is used on a production database. Note that the tool might not even exist
> when the application is written.

Here you assume that the customer needs a full-fledged database, and
then argue that this lightweight solution is inappropriate because it
fails when used in that way.  Not everyone wants to manage data outside
of the application that uses it.  The desire to do so, I agree, is a
definite indicator that it's time to put that data into a relational
database.

All I'm doing here is clarifying what I mean by application-private
data.  I find that such data exists (and, in fact, is often much more
common than shared data), and part of its definition is that it does
*not* need to be managed outside of the application.

> Your claim is not supportable. OO programming is based entirely on
> programming artifacts invented for programming convenience and not
> for modeling the real world.

I see that as a false dichotomy.  Modeling real-world concepts (i.e.,
the concepts that applications primarily deal with) is exactly what
provides programming convenience.  I didn't say OO programming isn't
intended to be convenient, but rather that it's appropriate *because*
it's convenient.  Or maybe I'm misinterpreting what you mean by
programming convenience?

> Yes, Date proves that the OO model is ad-hoc thus error-prone.

Which is, of course, overstating the point and not even relevant to this
piece of the discussion.

> RM deals with 'facts' that can be represented as values in a rigorous
> way. Not all data is of this form, but data that isn't is not suitable
> for deriving additional meaning or conclusions through
> automated processing.

Which all misses the point.  It may be true that data which can't be
represented in the relational model isn't suitable for further reasoning
(though if I wanted to know, I guess I'd take it up with a philosophy
professor at a local university).  The question, though, is whether
it's more convenient to work with data in an OO form or a relational
form in OO application code.

As a data point on this matter, of OO applications that access a
database, at LEAST 90% (and that's a number from personal experience)
encapsulate the relational code into data access objects.  People who
write OO code encapsulate things because they think the original has
complexities that aren't appropriate to expose to the rest of the
application.  Implicit in that is that the OO public interface does
*not* have these complexities.

So, unless we believe they are all mistaken and should be passing
ResultSet objects around their entire application, the only remaining
question on this point is whether the implementation of that object is
simpler with fields, or with a database connection and SQL.  I know how
I would answer that question.
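
To make the shape of that comparison concrete, here is a minimal sketch of a data access object whose public interface exposes no SQL and no ResultSet. Every name in it (Customer, CustomerDao, InMemoryCustomerDao) is hypothetical and not taken from any application discussed in this thread. The implementation shown is the field-backed one; a JDBC-backed class could implement the same interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain object used only for this illustration.
final class Customer {
    final int id;
    final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// What the rest of the application sees: no connection, no SQL, no ResultSet.
interface CustomerDao {
    Optional<Customer> findById(int id);

    void save(Customer customer);
}

// Implementation backed by plain fields. A class holding a database
// connection and SQL statements could implement the same interface
// without any caller changing.
final class InMemoryCustomerDao implements CustomerDao {
    private final Map<Integer, Customer> byId = new HashMap<>();

    @Override
    public Optional<Customer> findById(int id) {
        return Optional.ofNullable(byId.get(id));
    }

    @Override
    public void save(Customer customer) {
        byId.put(customer.id, customer);
    }
}
```

Callers depend only on CustomerDao, so the choice between fields and SQL stays inside the implementing class.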

> Persistent data always needs to be consistent across the database.
> Dynamic data does not always need to be consistent with persistent
> data or within the application.

Unfortunately, what you mean by "consistent" is left ambiguous here.  I
don't doubt that you know exactly what you mean, but there are at least
half a dozen things you could mean by that word.

[Insulting remarks snipped.]

> > (To provide context for these last few comments, the application I spend
> > a lot of time maintaining in my job does use a PostgreSQL database, but
[quoted text clipped - 9 lines]
>
> You obviously needed an industrial strength RDBMS.

Really?  It seems to me that an industrial strength RDBMS had problems
that required removing some data from it to be managed manually instead.  
Fortunately it wasn't too much of an issue, because contrary to your
assertions, migrating data to and from different data sources is just
not that hard.  I'm interested to hear how you think a RDBMS would have
solved this problem, given that we use one already and it was the
*cause* of the problem.

(Incidentally, in case you just don't like PostgreSQL, we also tested
this with Oracle and duplicated the same performance issues with large
values in fields, which also matches the expectations shared by plenty
of others who had tried similar things.  Or maybe Oracle isn't an
industrial strength RDBMS either?  What is?)

> > > I disagree. Truly better tools would dramatically improve the
> > > robustness of applications. Procedural code is error-prone.
[quoted text clipped - 3 lines]
>
> It's meant in general.

I was afraid of that, but didn't want to make assumptions.

> For example, proving procedural code is correct is so hard that
> it is rarely attempted.

Clearly, proving something to be correct is a different matter than
avoiding errors in practice.  Few things can be proven correct,
especially when there are human factors involved.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Lee Fesperman - 23 Jun 2004 09:04 GMT
> > You could claim (without evidence) that a majority would code it your
> > way, or even a vast majority, but that would still leave 'some'
[quoted text clipped - 5 lines]
> if someone does it right and the constraints can be expressed
> declaratively, but then you're in the same boat.

It's most certainly not the same boat. I listed reasons that enforcement of database
constraints in application code is error-prone compared to enforcement by the DBMS using
declarative constraints.

You've asserted that *all* application code in this area uses proper encapsulation and
is 100% error-free. I've merely asserted that this is surely true of the RDBMS.

> ....  I wonder how many data
> integrity checks can be done in Java, but can't be done in a database
[quoted text clipped - 3 lines]
> write a CHECK constraint to ensure that this remains the case?  (I don't
> know that you can't, but I'd be interested to find out.)

The constraints you've been describing are 'domain' constraints. This is the one place
in RM where procedural (even OO) code can be appropriate -- constructing user domains.
Domain constraints should be implemented by the domain itself.

Other types of constraints -- row constraints, multi-table constraints -- are harder to
implement outside the DBMS.

> I frequently see people asking about implementing certain constraints on
> data, and being told to use a trigger to write pseudo-procedural code,
> because a declarative constraint can't do what they want.  Of course,
> many more people in similar situations will just leave out the
> constraint and forget about it altogether.

The need for triggers is symptomatic of the weakness of SQL, which is rather poor in
following relational principles. Procedural constructs like triggers, stored procedures
and subqueries are needed to shore up SQL. However, they do have the advantage of being
executed by and on the DBMS (encapsulation) and are generally short, self-contained,
functional-style snippets of procedural code (easier to get right.)

I don't doubt that many needed constraints are poorly enforced or simply ignored.
Unfortunately, the effects of mangled databases are much worse than that of buggy
applications. One of the worst offenders is the popular technique of selective
denormalization to achieve performance. When you denormalize, you need to add
constraints to compensate for the resultant weakening of the database structure. Since
the additional constraints will often wipe out the performance gains, they are simply
ignored.

> It would be interesting, if only it were possible, to find out how often
> constraints on data are typically enforced in databases versus
[quoted text clipped - 7 lines]
> they are checking all the data in their database?  This, again, flies in
> the face of my experience.

Here you've moved outside of our subtopic -- from a single application to shared data.
When multiple applications share the db, it becomes even more important that constraints
are enforced by the DBMS.

I think you are wildly optimistic about the use of OO techniques. Just because someone
took a college course or attended your training doesn't mean they fully comprehend
encapsulation and will apply it effectively. In the real world of commercial
programming, things are not so neat. Impossible deadlines, unreasonable bosses who
insist on instant results (have you ever read Dilbert?) and other demands/distractions
work against well-written code.

> > > Sure, that's true.  If such a tool is used for data entry or
> > > modification, though, then the data is no longer application-private.  I
[quoted text clipped - 13 lines]
> definite indicator that it's time to put that data into a relational
> database.

Actually, I'm assuming that the tool exists or could exist for something like POD.

> All I'm doing here is clarifying what I mean by application-private
> data.  I find that such data exists (and, in fact, is often much more
> common than shared data), and part of its definition is that it does
> *not* need to be managed outside of the application.

Applications can have bugs, incomplete coverage of certain situations or just
'perceived' bugs. These will tempt the end-user to use a general tool to 'fixup' the
database.

> > Your claim is not supportable. OO programming is based entirely on
> > programming artifacts invented for programming convenience and not
[quoted text clipped - 6 lines]
> it's convenient.  Or maybe I'm misinterpreting what you mean by
> programming convenience?

OO has no data model. There are no rules for data structure --- for which class a given data
field should be placed in.

OO is simply ad-hoc. IOW, its ability to model the real world is entirely dependent on
the skills of the individual programmer.

> > Yes, Date proves that the OO model is ad-hoc thus error-prone.
>
> Which is, of course, overstating the point and not even relevant to this
> piece of the discussion.

Not relevant? I was replying to your statement that Date attempts to prove that OO is
less rigorous than RM. Of course it is obvious that OO is hardly rigorous. However, you
also skipped Date's examination of OO's weakness in modeling the real-world.

> > Persistent data always needs to be consistent across the database.
> > Dynamic data does not always need to be consistent with persistent
[quoted text clipped - 3 lines]
> don't doubt that you know exactly what you mean, but there are at least
> half a dozen things you could mean by that word.

Not ambiguous at all. Consistency is a basic database concept. You don't know that?

> (Incidentally, in case you just don't like PostgreSQL, we also tested
> this with Oracle and duplicated the same performance issues with large
> values in fields, which also matches the expectations shared by plenty
> of others who had tried similar things.  Or maybe Oracle isn't an
> industrial strength RDBMS either?  What is?)

You were vague about what the 'trouble' was. I assumed you meant that it just didn't
work, rather than the performance issues that you now mention. There are quite a number
of solutions to performance issues. I'll assume that you found none that were feasible.

> > For example, proving procedural code is correct is so hard that
> > it is rarely attempted.
>
> Clearly, proving something to be correct is a different matter than
> avoiding errors in practice.  Few things can be proven correct,
> especially when there are human factors involved.

I wasn't talking about 'things' in general or human factors; I was referring to
programming languages. Certain non-procedural languages are amenable to correctness
proofs.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)

Chris Smith - 23 Jun 2004 16:07 GMT
Lee,

I'm getting out of this conversation.  It's getting very long and not
doing either of us any good.  We seem to disagree on a few things
irreconcilably, such as whether application-private data exists, whether
Satan invented all procedural programming languages, and whether someone
using a relational database will be inclined to use its data integrity
checking features even when the same person would apparently never write
a line of Java code to accomplish the same task.

I'm not willing to take C. J. Date's word that procedural languages are
the epitome of evil.  Quite simply put, he is a very biased source, and
Date can apparently convince himself with his own arguments well before
he convinces me.  If you intend to actually make arguments (or at least
make reference to specific arguments from Date's book, though I don't
have a copy at the moment and would find that difficult to follow
personally), then some progress could, perhaps, be made in this
discussion; instead, I'm seeing a lot of arguing from the *assumption*
that object-oriented approaches to problems are inherently inferior when
a relational approach exists.  That's not getting anywhere, since I
don't accept your premise.

This started with a statement that was far over the top about the
usefulness of data in the absence of a relational database.  I think
I'll just rely on readers of the newsgroup to make their own judgements
on whether data managed by a non-relational database -- even one that
doesn't check a lot of constraints -- is really very near to "garbage"
or not.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Roedy Green - 23 Jun 2004 19:27 GMT
>I'm getting out of this conversation.  I

I got out a while ago for the same reason. Think back to how the
debate started, a claim that PODs were orders of magnitude faster
than SQL, and a counter claim that PODs were useless in nearly all
circumstances.

Signature

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.

Lee Fesperman - 24 Jun 2004 10:25 GMT
> I'm getting out of this conversation.  It's getting very long and not
> doing either of us any good.

Whatever. I notice you couldn't resist some parting shots ...

> ...  We seem to disagree on a few things
> irreconcilably, such as whether application-private data exists, whether
> Satan invented all procedural programming languages, and whether someone
> using a relational database will be inclined to use its data integrity
> checking features even when the same person would apparently never write
> a line of Java code to accomplish the same task.

Your predilection for pejorative statements (Satan, idiots) is simply childish and
obviously counterproductive.

You act like you are completely unaware of the difficulties associated with procedural
code, especially the use of variables. Yet, recently in c.l.j.p you were discussing
Single Assignment forms. The purpose of those forms is to eliminate variables. There are
Single Assignment Languages (SALs) which effectively transform variables into
placeholders to avoid the inherent problems with variables.
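
As a rough illustration of the single-assignment idea, the sketch below contrasts a reassigned accumulator with a form in which every name is bound exactly once. Java is not a single-assignment language, so this only imitates the style using final; the class and method names are hypothetical.

```java
import java.util.stream.IntStream;

// Hypothetical illustration; not taken from any single-assignment language.
final class SingleAssignment {
    private SingleAssignment() {}

    // Conventional procedural form: 'total' and 'i' are reassigned
    // on every pass through the loop.
    static int sumMutable(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Single-assignment style: each name is bound once and never changes.
    static int sumSingleAssignment(final int n) {
        final int total = IntStream.rangeClosed(1, n).sum();
        return total;
    }
}
```

Both methods return the same sum of 1 through n; the second simply has no name whose value changes over time.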

> I'm not willing to take C. J. Date's word that procedural languages are
> the epitome of evil.  Quite simply put, he is a very biased source, and
[quoted text clipped - 7 lines]
> a relational approach exists.  That's not getting anywhere, since I
> don't accept your premise.

Sure, forget about Date. I disagree with him a lot. I mentioned his writings to provide
another source for comparing OO & RM.

Your use of 'assumptions' and 'premise' implies I've made no arguments in support of my
viewpoint. That is incorrect. My previous posting made specific arguments (for instance,
about the OO 'data model'), which you have ignored here.

The usual basis for your arguments is that it doesn't match your experience. You are not
a fulltime database programmer (or even a fulltime programmer). Also, ISTR from previous
exchanges that you've been doing professional programming for only a short time.

Signature

Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)
