« Borat The Unknowable| Main | Transactional Programming »
Databases and the UML
Lately I have been spending a little more time with UML and databases than I usually do, and I came across an interesting WTF.
It seems that UML still isn't that great with databases. I kind of thought with version 2.0 out, most of these issues would be fixed, but when I run Enterprise Architect and decompose a database, I'm not getting useful stuff.
Sure, the tables come over just fine, but a database is a lot of things -- tables, stored procedures, functions, views, packages -- it's a lot more than just a relational diagram. Normally, when you're doing green field development, you can frame up the ERD in something like EA and leave the "guts" to the DBA guys with Erwin or whatever. But my challenge was a little different: somebody gave me a complex system and said, "all the people are gone, what's this thing do? Can we get rid of it?"
Now for those of you in the software business, you understand -- that kind of question is loaded. First thing I asked was, "Do you have any enitry relationship diagrams?" to which the answer was "what's that?"
So I found the most technical person, the one writing all the queries and holding the machine together. He told me there were no pictures of the database, no design documentation, and - best of all! -- no enforced RI (Referential Integrity)
So, hoping against hope, I open up Enterprise Architect and start poking around to see what's under the hood. You guessed it -- hundreds of tables, views, and stored procedures. Suddenly the question "What's this thing do?" was starting to sound like, "What's the meaning of existance?"
So I break out my trusty Enterprise Architect, which promises to help decompose the database for me. After running the wizard, of course all I get is a bunch of tables not joined up by anything -- EA uses the FKs and PKs to determine how things fit together. A very reasonable assumption, I might add. But not helpful.
Mumbling something about career choices, I started digging through paperwork. Old meeting notes, emails, physical files -- anything that might help. After a few hours of this, I found a document called "Implied relationships in the database" It was 5 pages long, and it sounded like somebody's guess (perhaps the last consutlant?) at what was going on.
I took EA and started hooking up wires. I also used a few queries the users liked, and I read a few stored procedures who had names that looked important. This is what I like to call the "Make it up as you go along" strategy, which is a lot like total BS, except it looks smarter. Here's what I got:

Well that certainly clears things up
The problem is, of course, that I'm just guessing what the thing actually looks like. It would take an exhaustive insepction of every view, function, and stored procedure (including external ones) to get close to actually "knowing" what the structure is.
Certainly there is more information that SQL Server knows that it is not sharing. After all, every bit of programming, string, and bailing wire in that database is linked to each other through the sys tables. Sure enough, I found a nifty tool called SQLDiagram that will take a stored proc, view, function, or table and show me the dependencies. Picking one at random, it kind of looks like this:

That's very nice. Love the colors. Wonder if they have anything in pastels?
Needless to say, the tool does not export the information, does not print correctly, and looks buggy. But hey -- it was forty bucks. (For those of you saying, but just buy XYZ! My only question is: is it cheap?)
The problem is, I need linkage and dependency information on the entire database, and I need it in UML. I need to click on a function and tell which tables it uses. It would be great if I could sequence across Stored Procedures and functions. EA brings in all of the views and stored procedures, but it makes one object called "Stored Procedures" and then puts each SP as a method under that. That's just facacta (look it up). It doesn't even bring in functions. So I email EA and make a feature request -- pull in functions and Stored Procs in a way that is useful to we DBA-type people.
The reply was something like, "Hey, OMG hasn't figured this out yet either, bub, so we used an agile group's method" with a link to that goup.
I'm sorry, but that's just not satisfactory. If we're going to use UML to model diverse and complex systems involving databases, we need something better. Stored Procedures are not just methods on some generic SP object.
Here's my WTF. All of these things, SPs, Views, Tables, etc, are objects that are created by the database. They have linkages; some have methods. The database itself is a group of such things. In fact, the database is a component. Everything inside of it is a class of some type managed by the component. Stored procedures are parameterized classes: when they run they take parameters, and the constructors can return various things back to the caller. They have no other methods or properties. Likewise for functions. Views and tables can be thought of as classes that only have one property and no methods (although triggers come immediately to mind as being standard methods for tables)
So when you want to model a couple of databases with views, tables, and stored proces, it should look like this:

I like this
So what's left? Somebody needs to write an EA add-in to suck down ALL of the meaningful information from a database and present it in UML. I'll be sure to get to that just as soon as my clone is completed.
Leave a comment