Tincat Group, Inc. -- tincat musings blog


mewsings, a blog

--dawn

Sunday, February 12, 2006

Don't Suffer Impedance

A considerable amount of software development consists of mapping and converting data from one format to another, one schema to another. When software is used to bridge the gap between the person and the machine, there is an obvious need for translation. Impedance mismatch refers to the difference between the output of one process and input of another, requiring a transformation to connect the processes together. There is an impedance mismatch between the thoughts of a person and the 0's and 1's of a computer, for example. (There is perhaps less of a mismatch in the case of Nick here than for most others.)

Let's narrow the scope. As fascinating as it might be, I don't want to start with a person's thoughts, so let's move to the point where software components take input from a person, the user interface. That is our starting point. At the other end, the operating system works with the hardware to handle the translation of text to 0's and 1's, so we can focus for now on text-based data heading into some process that makes this text persistent in memory or on a secondary storage device. Additionally, let's assume a database product that has at least CRUD services and communicates with the operating system. In summary, we will look at data coming from the point of a web page user interface to the interface with a database product.

While impedance can be measured in electrical engineering (or so I have heard), in software development it is a much looser term, often used to sell products or claim superiority. I could not find any clear definition of the term, as in "OO-RM impedance mismatch," that was not specific to OO and RM, so I thought I would offer my own. Impedance mismatch occurs when there is a difference in the data model used for the output of one process and the input of another, requiring a transformation. Not all transformations are due to an impedance mismatch, but for an arbitrary output of one data model a transformation will be required to prepare it for input to an interface using a different data model. You might want to refer back to The Naked Model for a description of a data model.

The number of transformations of any kind relates at least to the size and scope of any given project, but the number of impedance mismatches relates to the architecture and product choices for the solution. One problem with my definition of impedance mismatch is that the definition of a data model is not really tight. One could show how there is a different data model behind just about any two programming languages. Do Oracle and SQL Server employ the same data model in the interface to the database? They both make an effort to employ the RM, right? Sure, even though they do not implement it identically and neither would be considered a pure implementation thereof.

What data models are there other than the RM? Many database books talk about network and hierarchical models, although I know of no products that advertise themselves as such. These terms have been used primarily to attack non-relational models. We could view OO as a data model and the web or XML as another. You might want to split out data models differently, and that is OK. If we can split them out and name them in a variety of ways, how do we know that OO and XML are not the same data model as each other or as the RM? Is this grouping entirely in the eye of the beholder?

Informally, if the representation of data structures and related operations look "different enough" between languages, then you are working with a different data model. Java, C#, and C++ could be seen as implementing an OO data model. I think of XML as "dynamic string arrays" or "di-graphs of trees" or other broader categories. But since most people have seen XML, I'll stick with that. It is not a programming language, but notice that an impedance mismatch is related to output and input, so it should do for our purposes.
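To make "different enough" a little more concrete, here is a minimal Python sketch of one transformation from an XML tree to flat rows bound for a SQL-DBMS. Everything specific in it is an assumption for illustration: the `person`, `name`, and `email` elements, the two table layouts, and the helper names `xml_to_rows` and `store` are all hypothetical, and SQLite merely stands in for a generic SQL product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical fragment of UI output; element and attribute names are invented.
FORM_XML = """
<person id="1">
  <name>Pat Example</name>
  <email>pat@example.org</email>
  <email>pat@example.com</email>
</person>
"""


def xml_to_rows(xml_text):
    """Flatten one nested <person> tree into rows for two flat tables."""
    root = ET.fromstring(xml_text)
    person_id = int(root.get("id"))
    person_row = (person_id, root.findtext("name"))
    # The repeating child elements become rows in a second table,
    # tied back to the parent by the person's id.
    email_rows = [(person_id, e.text) for e in root.findall("email")]
    return person_row, email_rows


def store(conn, person_row, email_rows):
    """Create the (hypothetical) tables and insert the flattened rows."""
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE email ("
        " person_id INTEGER REFERENCES person(id),"
        " address TEXT,"
        " PRIMARY KEY (person_id, address))"
    )
    conn.execute("INSERT INTO person VALUES (?, ?)", person_row)
    conn.executemany("INSERT INTO email VALUES (?, ?)", email_rows)


conn = sqlite3.connect(":memory:")
person_row, email_rows = xml_to_rows(FORM_XML)
store(conn, person_row, email_rows)
```

The function `xml_to_rows` is the transformation that the mismatch forces on us: the tree nests its repeating values, while the tables need them split out and keyed. Note that the order of the `email` elements is not stored anywhere in these rows.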

What would it take to minimize the number of impedance mismatches?

There might be good reasons to put up with them, but what would it take to minimize the number of impedance mismatches in a particular application? Let's take an example of one stream of Input-Processing-Output, starting with input to an XHTML (so it is XML) web page that includes the data in the UML diagram shown here and ending with these data values stored in a generic SQL-DBMS. Our UI, therefore, is coded in a language implementing the XML data model and our database interface is based on the RM. What is the minimum number of impedance mismatches we can have in this example? One. We must somehow move from XML to RM. How could we make it zero instead?

Replace the RM database with an XML data server.
Replace the XML UI with an RM UI.
Replace both with an OO UI and an OO DBMS or some other data model.

I'm going to skip that last possibility for now. The first option is somewhat how I like to work, although I use an MV database given that I don't know of any open source XML data servers. Nor have I looked in the last six months to see whether there are any that are not open source but are production quality and easy to work with. But this solution, which I referred to earlier as end-to-end AJAX, is feasible.

With that middle option, what would an RM UI be? I don't mean that the data are persisted with the RM, but that the actual input would conform to the RM. The Information Principle previously described would require that there be no ordered lists, for example. That says to me that our example (or an arbitrary) UI could not use the RM for the UI data model. Ordered lists are not an acceptable construct in the RM. Given that the RM was developed for the purpose of working with large shared data banks, it is understandable that it might not also work for a UI. But if we were to decide that life is too short for impedance, we would have to eliminate the RM from the solution.
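The point about ordered lists can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a relation is modeled as a Python set of tuples, the list contents are invented, and the helpers `to_relation` and `from_relation` are hypothetical names. Carrying the order as an explicit position attribute is one common workaround, not something prescribed here.

```python
# Hypothetical ordered list as a user might arrange it in a UI.
ui_list = ["call supplier", "ship order", "send invoice"]

# A relation is a set of tuples: membership is kept, order is not.
relation = {(item,) for item in ui_list}

# The same items in a different order yield the very same relation.
reordered = ["send invoice", "call supplier", "ship order"]
same_relation = {(item,) for item in reordered} == relation


def to_relation(items):
    """Encode the order as an explicit position attribute in each tuple."""
    return {(pos, item) for pos, item in enumerate(items, start=1)}


def from_relation(rel):
    """Rebuild the ordered list by sorting on the position attribute."""
    return [item for _, item in sorted(rel)]


round_trip = from_relation(to_relation(ui_list))
```

Here `same_relation` is true, so the order the user chose is lost unless it becomes data. The two helpers are exactly the kind of transformation that sits between a UI with ordered lists and the RM.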
