mewsings, a blog
Monday, November 06, 2006
Cowboys with Promiscuous Databases
In Northwest Iowa we have lots of cows but no cowboys. We have cattle farms that, as best I can tell, are much like hog farms.
Somewhere between Iowa and Wyoming the backdrop changes from cattle farmers to cattle ranchers. We move from confinement lots to more open ranges. Where the land is fertile, it is farmed. The cows remain rounded up and typically crowded together. Farm land is planned out, designed, and structured. Farmers designate areas for corn, soy, or cows. They work the land, toiling over its makeup throughout the seasons.
I'm no expert, but it seems that ranches are left with a more natural order. As you head towards the mountain regions, the rockier unfarmed land is available for cattle. With less control on their movements, these cattle spread out a bit more. Cowboys round them up and prompt them to move as a group to another location when needed.
Iowa cattle farm above, Wyoming cattle range below
Promiscuous means consisting of elements brought together without order
We might say that the confinement lot is organized, constrained, and controlled in accordance with economic and other mathematical principles, while the ranch provides a more promiscuous landscape. The ranch is also organized, mind you, but a natural order arises from the land. Cowboys interact with the landscape to herd the cattle, performing tasks that might have been designed into a confinement lot.
Using the second definition from dictionary.com, promiscuous means consisting of parts, elements, or individuals of different kinds brought together without order.
Lest there be confusion, let me state that as we turn our heads from cows to data, I will interpret the word order in this definition to refer broadly to the organization and structuredness of a database, not to the ordering of attributes or rows.
Real cowboys, Wyoming Nov 2006
Choose from among the most common dictionary definitions of the term database, such as a collection of related facts or perhaps one that requires the use of a computer, if you prefer. Most readers will likely be familiar with the process of designing databases by employing relational modeling, given that this is taught in college courses as well as on the job. The design is organized, constrained, and controlled according to mathematical principles from set theory and first-order predicate logic. Like the cattle farmer's land, a model could be drafted showing how the database designer will structure the database. Structure of this sort attracts certain personalities (farmers?) and not others (ranchers?). You might guess that I, in particular, feel more at home on the range.
Any model other than a relational data model might seem promiscuous to the software development profession. I chose this derogatory, yet enticing, term in part because of the seeming unorderedness, comparatively, of legacy data models. This is not unlike the seeming unorderedness of a cattle ranch compared to the more obvious structure of the cattle farm. I also like using the term promiscuous here because our profession currently sees these alternative database tools as improper, even if increasingly enticing. I predict that our industry will be seduced by something resembling these legacy databases enough to switch to considering not-really-relational databases as mainstream again in the future, especially as SQL becomes less attractive as our interface language to data.
Note that although I do need a term for the databases about which I have been writing, often referred to as embedded in marketing literature, I will likely not latch onto this term as that would put me in the uncomfortable position of endorsing promiscuity. I can live with that discomfort for this one blog entry.
Ask a rancher or cowboy how they divide up the land, and they might suggest that the land divides itself. The water is here, the grass there, and the rocks up this way. They might draw or paint you a picture. Ask a database cowboy, one working with a more promiscuous database than those based on the relational model, how they model the data and you might hear an analogous response. The data orders itself.
Ask what steps a database cowboy takes to design a database, and you will likely hear that the first step is to have a good understanding of the landscape, the business. Then you define the scope of your project, putting a fence around it, and then you record what you see inside that fence. By looking at the landscape, you can make a computerized model of this reality for your database implementation. The implementation is a model of the business, not unlike a painting of the range.
I am well aware this scenario generates laughter from some, ridicule from others, as it sounds so unscientific. But as my colleague Anthony Youngman (aka Wol) would suggest, relational modeling is mathematical but not very scientific. The RM imposes an order using mathematical terms such as predicate and relation, typically avoiding terms matching the problem domain such as thing, entity, property, empty, and list, terms used by cowboys working with promiscuous databases. Relational modeling includes putting data into Nth normal form, while the database cowboy knows the land and paints what he sees.
For anyone confused by the imprecision of this description, perhaps the Jayne VanDoe example in the Is Codd Dead? mewsing provides more hints. By the way, regarding science and databases, have the terms relational model and experiment ever made it into the same sentence? We need to return to the science and art, the craft, of databases, modeling by painting what we see and testing our models over time. I'll grant that there is a need for more empirical data related to the effectiveness and resources required over time for all varieties of databases.
At the risk of repeating what I have said in earlier blog entries, but for the sake of any new readers, I will briefly suggest three features that might distinguish a seemingly promiscuous database from one that more closely implements the relational model.
- Two-valued logic
Most, if not all, languages that work with the data employ two-valued logic.
- Non-first normal form
The data need not be in what has traditionally been called first normal form. Attribute values may be arrays or multivalues.
- Constraints as data
This one needs a sweet acronym and a better description, but the idea is that constraints related to attribute types are typically specified with data, rather than with metadata, and are enforced outside of a DBMS, rather than by one. Rather promiscuous, wouldn't you say?
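For readers who prefer to see rather than imagine, here is a minimal Python sketch of the last two features. Everything in it is hypothetical and made up for illustration: the record layout, the attribute names, the valid_codes table, and the violations helper do not come from any particular product. It shows an attribute holding several values, and a constraint kept as ordinary data and checked by application code rather than by a DBMS.

```python
# Hypothetical record for illustration only; not taken from any real system.
# Non-1NF: the "phones" attribute holds a list of values (a multivalue).
person = {
    "id": "1001",
    "name": "Pat Example",
    "phones": ["555-0100", "555-0101"],
    "status": "active",
}

# Constraints as data: the allowed values for "status" live in an ordinary
# record that could be edited like any other data, not in schema metadata.
valid_codes = {
    "status": ["active", "inactive"],
}


def violations(record, codes):
    """Return the names of attributes holding a value outside the code list.

    The check runs in application code, outside of any DBMS.
    """
    bad = []
    for attr, allowed in codes.items():
        value = record.get(attr)
        # Treat single values and multivalues uniformly.
        values = value if isinstance(value, list) else [value]
        if any(v not in allowed for v in values):
            bad.append(attr)
    return bad
```

Calling violations(person, valid_codes) on the record above finds nothing to complain about, while a record whose status is not in the list would be reported by attribute name. Adding a new allowed status is a data change, not a schema change.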
Let's take a look at legacy databases. As it turns out, the data handled by databases termed legacy is current data, not primarily legacy data. While the conventional wisdom, which some would say has been proven, has been to migrate from legacy databases to SQL-DBMS tools, signs point to a return of such proven approaches as the use of two-valued, rather than three-valued, logic. Additionally, more and more work with databases is done by developers without the tired, old 1NF requirement, often by way of object-relational or XML-RM mappings. There is reason to suggest that the future of data modeling resembles the past, the data modeling done by our current database cowboys.
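To make the two-valued versus three-valued distinction concrete, here is a rough Python sketch. It rests on two assumptions of mine: None stands in for an SQL NULL, and the two-valued side treats a missing value as an empty string. The function names are made up for this illustration, and this is not how any particular DBMS implements comparison.

```python
def eq_3vl(a, b):
    """Three-valued equality: any comparison involving NULL (None) is unknown (None)."""
    if a is None or b is None:
        return None
    return a == b


def not_3vl(x):
    """Three-valued NOT: the negation of unknown is still unknown."""
    return None if x is None else (not x)


def eq_2vl(a, b):
    """Two-valued equality, assuming a missing value is just an empty string."""
    a = "" if a is None else a
    b = "" if b is None else b
    return a == b
```

With three values, neither eq_3vl(None, "x") nor its negation is true, so a row can slip past both a condition and its opposite. With two values, eq_2vl always answers True or False.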
In case you are asking Where's the beef? (perhaps you cannot see the pictures herein), the next blog entry will start looking at specific design patterns used when designing for one model of promiscuous databases, the Pick/MultiValue databases. While I am not as familiar with other not-really-relational models, it is very likely that these best practices will translate to best practices in many other environments as well. And, as always, I fear they are apt to irritate or even infuriate some RM enthusiasts. Heigh ho.
Although cowboys typically prefer using an apprentice approach with new recruits, a cowboy handbook might be in order so that the next generation of cowboys can learn from the best practices of those who have gone before. While it once looked like these cowboys were a dying breed, with the new wild, wild west of the internet, database cowboys look like they will be around for as long as the farmers. In the next blog entry, I will have something for you to sink your teeth into. The least we can do is pass along some tips from seasoned cowboys on how they have been saddling promiscuous databases for the past half-century.