Tincat Group, Inc. - Mewsings, a Software Development and Data Modeling Blog

mewsings, a blog

--dawn

Tuesday, February 07, 2006

The Model Behind the interFace

An interface is the face that computer software shows to a person, other software, or possibly hardware devices. While data models are often discussed in relation to databases and storing data, this mewsing is about data models behind software interfaces in general and user interfaces in particular.

Example UML class diagram for web page data

Let's take an example of a browser-based UI page with three text fields, one of which requires an integer value; one single selection drop-down; two multi-selection drop-downs; one text area; one radio button; and one date entry via a free-form text field. Using all the creativity I can muster right now, I'll name them as indicated in the UML class shown here.

Developing software is a process of modeling data and behavior. One set of data we can model is that which will be entered by the user. This single page of data could be backed by a view/schema modeled with this single UML box. We could use XML or JSON, for example, within the software to define and work with this view of data.

Similarly, if we are working not with a UI but with a data exchange interface, such as one using web services, we could use this same data model. This could be the model for a single record of data. For this example I'll include some sample values. I'll use an xml-ish format (because I wish XML had arrays like this) to model this view. [Note: I'll start the array index at 1, but I'm noting that I'm doing that just to retain my credentials in the real-programmers-start-counting-at-zero world.]

<MyExchange>
  <text1>elephant</text1>
  <text2>ears</text2>
  <text3>2</text3>
  <singleSelect>mouse</singleSelect>
  <multiSelect1>
    <multiSelect1[1]>grey</multiSelect1[1]>
    <multiSelect1[2]>pink</multiSelect1[2]>
    <multiSelect1[3]>ivory</multiSelect1[3]>
  </multiSelect1>
  <multiSelect2 />
  <textArea>These are the times that try men's souls</textArea>
  <radioButton>Africa</radioButton>
  <dateText>01JAN06</dateText>
</MyExchange>
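The same record could also be held inside the software as a typed structure and written out as JSON. What follows is only a minimal sketch in Python, not how any particular tool does it. The field names come from the xml-ish sample; the class layout, the defaults, and the choice of `dataclasses` and `json` are assumptions made for illustration. Note that Python lists count from 0, not 1.

```python
from dataclasses import dataclass, field, asdict
import json
from typing import List


@dataclass
class MyExchange:
    """One record of page data; field names follow the xml-ish sample."""
    text1: str
    text2: str
    text3: int                     # the text field that requires an integer
    singleSelect: str
    # Multi-selects are ordered lists, which is the part a 1NF view cannot hold.
    multiSelect1: List[str] = field(default_factory=list)
    multiSelect2: List[str] = field(default_factory=list)  # may be empty
    textArea: str = ""
    radioButton: str = ""
    dateText: str = ""             # free-form text, deliberately not parsed


record = MyExchange(
    text1="elephant",
    text2="ears",
    text3=2,
    singleSelect="mouse",
    multiSelect1=["grey", "pink", "ivory"],
    textArea="These are the times that try men's souls",
    radioButton="Africa",
    dateText="01JAN06",
)

# Serialize the whole record, lists included, as a single JSON document.
as_json = json.dumps(asdict(record), indent=2)
print(as_json)
```

The point of the sketch is only that the list-valued fields travel with the record as one unit, in order.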


An arbitrary web page cannot have an SQL view as a data model. While views need not be in 3rd or 5th normal form or BCNF, you cannot define an SQL view that is not in 1NF. Using my favorite definition of an SQL view being a stored query, we see that while we can get a lot of different result sets in an SQL query, we cannot get a single web page of data if said data includes lists. Lists or arrays are very common in user interfaces as well as throughout the rest of software development. SQL-DBMS advocates have been known to say things like "You can use reporting tools to represent the view in whatever form you like--that is a representation issue". You might recall from a previous blog that the RM is all about representation, however.
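To make the shape problem concrete, here is a small sketch using Python's built-in `sqlite3` module. The table names (`exchange`, `exchange_multi1`), the `position` column, and the normalized layout are all assumptions invented for this illustration. A plain join returns one flat row per list element and repeats the scalar values, so the page-shaped record has to be folded together outside the query.

```python
import sqlite3

# Hypothetical normalized storage for part of the page model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exchange (id INTEGER PRIMARY KEY, text1 TEXT, singleSelect TEXT);
    CREATE TABLE exchange_multi1 (exchange_id INTEGER, position INTEGER, value TEXT);
    INSERT INTO exchange VALUES (1, 'elephant', 'mouse');
    INSERT INTO exchange_multi1 VALUES (1, 1, 'grey'), (1, 2, 'pink'), (1, 3, 'ivory');
""")

# The query result is flat: scalar columns repeat once per list element.
flat_rows = conn.execute("""
    SELECT e.text1, e.singleSelect, m.value
    FROM exchange e
    JOIN exchange_multi1 m ON m.exchange_id = e.id
    ORDER BY m.position
""").fetchall()

for row in flat_rows:
    print(row)

# Folding the rows back into one page-shaped record happens in application code.
page = {
    "text1": flat_rows[0][0],
    "singleSelect": flat_rows[0][1],
    "multiSelect1": [r[2] for r in flat_rows],
}
print(page)
```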

The inability to get a view that is not normalized is a failure of SQL-DBMS tools, while the current state of the RM has made accommodations by redefining 1NF. I suspect I'll bring that up in a future blog, but for now I'll just make the point that even with some new variations on the RM that permit relation-valued attributes, ordered lists are still not included in the model.

Now that we have our UI or web services interface modeled, what might we want to do with data that are hosted by this model? We might want to select, project, join… basically we might want to do anything we otherwise do with data. These data need not come from a disk, they could come from a web page or pages, a web service or other interface, or a process that generates data and stores it in memory, for example.
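As a rough illustration that select, project, and join do not depend on where the data lives, here is a sketch over in-memory records shaped like the page model. The records, the `regions` lookup, and all values other than the elephant row are made up for the example.

```python
# Hypothetical in-memory records shaped like the page model.
records = [
    {"text1": "elephant", "text3": 2, "multiSelect1": ["grey", "pink", "ivory"]},
    {"text1": "giraffe", "text3": 5, "multiSelect1": ["tan"]},
    {"text1": "zebra", "text3": 7, "multiSelect1": []},
]

# A second hypothetical data set to join against.
regions = [
    {"text1": "elephant", "radioButton": "Africa"},
    {"text1": "zebra", "radioButton": "Africa"},
]

# Select: keep the records whose list-valued field contains "pink".
selected = [r for r in records if "pink" in r["multiSelect1"]]

# Project: keep only two of the fields.
projected = [{"text1": r["text1"], "text3": r["text3"]} for r in records]

# Join: pair records with regions on text1; list-valued fields come along intact.
joined = [{**r, **g} for r in records for g in regions if r["text1"] == g["text1"]]

print(selected)
print(projected)
print(joined)
```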

Are there any of these statements with which you disagree?

Data modeling is required for all interfaces and, therefore, throughout the process of software development.
When data values are provided in data models related to a UI or any other interface, there might be a requirement to do any type of manipulation of or queries against this data.
When working with a UI data model, it is not possible to work exclusively with normalized data.


Therefore, it is not just important, but necessary, to have models of data other than the RM. Whatever the other data model, it has the same requirements for manipulating and querying the data as data models that are specific to DBMS tools. Data in these models must be projected, inspected, dejected, neglected, and selected (apologies to Arlo Guthrie and Alice's Restaurant).

Even if we decide to make changes to whatever data model we use for the UI when we work with large shared data banks, we cannot make the RM the data model across the board in software development. We must have ordered lists, for example. Before we turn our attention to the face of the database, I want to be sure you are with me on this point. The User simply requires a more full-featured model behind their Interface.


28 Comments:

At 3:42 AM, February 08, 2006 , x@c.d.t. said...: An interface is the face that computer software shows to a person, other software, or possibly hardware devices.

For this comment let's assume we are talking about the face seen by a person.

Let's take an example of a browser-based UI page with three text fields, one of which requires an integer value; one single selection drop-down; two multi-selection drop-downs; one text area; one radio button; and one date entry via a free-form text field. Using all the creativity I can muster right now, I'll name them as indicated in the UML class shown here.

Here you already assume a UML description for an HTML page. You now have an HTML description and a UML description of the face.

Hypertext/hypermedia interfaces existed before HTML (see this ). Have you noticed how hard it is to maintain HTML sites? Why aren't we building a relational model for this kind of face from the start not passing through HTML?

SQL-DBMS advocates have been known to say things like "You can use reporting tools to represent the view in whatever form you like--that is a representation issue".

Here I think the relational model advocates have said you can use reporting tools to present the data in whatever form you like.

And I disagree with When working with a UI data model, it is not possible to work exclusively with normalized data.

Therefore, it is not just important, but necessary, to have models of data other than the RM.

Therefore, it is not just important, but necessary, to use the RM for all kinds of data so we don't have to write translators every time. :-)
At 8:58 AM, February 08, 2006 , --dawn said...: Hi x --
First, I'm not assuming anything, I'm creating something. I'm developing a model of the data for the interface. I used UML only as one way to model the data for communication. As mentioned, I would model this data within the software using other tools.

You asked "Why aren't we building a relational model for this kind of face from the start not passing through HTML?" That sounds like a question I would ask. I suspect that if we understand the reasons, we will have more clues on why the RM doesn't really cut it. (Yes, I know you asked it as more of a directive.)

With your last statement, it seems you are not yet seeing my point in this blog. We cannot use the RM to model an arbitrary interface as indicated here. It is not possible. We must introduce a model that bursts outside of RM constraints and permits non-1NF in a view, for example. Introducing the RM into the mix is what brings in the impedance mismatch. Free yourself from thinking about the database for this blog. Think about modeling the data of an arbitrary interface, such as an interface with a user. Can you see that you cannot model the precise face presented in the UML in this blog using the RM? My UML class is necessarily NOT an RM view of data and yet it IS the model for the UI. Let me know if you are able to tap into my thinking on this. I consider it a critical point. Thanks. --dawn
At 6:18 AM, February 09, 2006 , Wol said...: Hi Dawn and x,

I had a bit of a conversation with someone on LWN on what could be considered similar to this.

He was talking about programming and models, and said that, a program being an implementation of a model, it was an arbitrary choice as to whether the model or the program was "correct" when they disagreed.

I have no problem with that sort of argument, but to me that logic places the problem very firmly in the field of pure maths, and not in Computer Science which is where he put it (to me, Science means that the real world intrudes, and an error can only exist in the experiment or the model).

I feel there is a similar disconnect with database theory. So much effort is being spent comparing the data model and the relational model (and assuming that any error is in the data :-) To me, that's pure maths.

What's being ignored is the data itself - the real world. Like Dawn's web page, with lists and things...

RM is a good mathematical framework. But the problem is that it is maths and its proponents assume that any problems lie elsewhere.

In Science, you have to alter the model to fit the real world. Dawn's UML class *is* the real world, the web page, and if you change anything you should be changing the model.

Dawn says that the RM people have progressively redefined 1NF, but as far as I can tell it is still based on C&D's dictat that data is two-dimensional, that it comes in rows and columns. Well, Dawn's data is three-dimensional and that "2" to me is Science's "cosmological constant" - an inexplicable fudge that shouldn't be there but is needed to explain things. Science is now of the opinion that that constant is 0 (ie doesn't exist).

Unless and until RM throws that "2" out the window, it will not be able to model things like Dawn's UML, and therefore by definition *must* be a *partial* subset of database theory. When the real world and the model collide, Science says it's always the model that's wrong ...

Cheers,
Wol
At 7:07 AM, February 09, 2006 , JOG said...: >> Dawn says that the RM people have progressively redefined 1NF,
>> but as far as I can tell it is still based on C&D's dictat that
>> data is two-dimensional, that it comes in rows and columns.

Relational databases are not 2-dimensional. If I have a table with attributes firstname, lastname, age and email that's 4 dimensions straight off - RM is n-dimensional. Codd's dictat rather, was that each element (row) of the set (table) represents a function mapping, and as such could not have repeating attributes (columns) - entering an input value (column_name) into a function may only yield one consistent return value. It is this strict use of finite partial functions that MV appears to be contesting.
At 8:33 AM, February 09, 2006 , x@c.d.t. said...: In Science, you have to alter the model to fit the real world. Dawn's UML class *is* the real world, the web page, and if you change anything you should be changing the model.

The HTML and the UML are not the real world.
At 8:44 AM, February 09, 2006 , x@c.d.t. said...: And humans change the real world all the time. Including scientists :-)
At 8:48 AM, February 09, 2006 , --dawn said...: The word dimension is overloaded, so JOG is right that using the mathematical definition, an n-tuple is n-dimensional. Using a more typical physical definition or a programming language Array definition of dimension as Wol has done, it would make sense to call such tables 2D, given that each value can be accessed with its [i][j]. This misunderstanding crops up all the time, and I fault those with the mathematical definition (that's you in this case, JOG ;-) for this one because I think they realy do understand the alternative use of the term, while the programmers don't always know the mathematical definition. There, have I settled that one for all time?

As for the function mapping that JOG mentions, MV could be modeled with the same definition once you accept that this one consistent return value in your partial function could be a list.

And Wol, I will definitely talk about real world mapping to a conceptual data model in the future. I do love that topic. However, I do not think that the web page is the real world. It is not the starting point for the analysis of a problem space, but the actual implementation of a solution. A user interface, such as a web page, is another model of the real world, but still a model and not the real thing.

I chose to start my argument in these mewsings by looking at what a data model is and indicating that we must use a data model throughout software development. We model data for every interface, not just the interface with a database. I started here rather than with the real world and conceptual data modeling because there is a tighter argument here even if it is still very difficult to put in words. It hinges on what the data model in relational data model is. There is a tendency to make it a roving definition, reshaping within any given context so that it is not seen as the problem.

I tried to capture the meaning of data model, show that we must model data using a data model throughout software development, and present one such place where a model is required and the RM cannot be that model. Cheers! --dawn
At 10:17 AM, February 09, 2006 , Wol said...: The word "dimension" is overloaded ... ?

Mebbe. I need to rush, but jog's example to me very definitely is two-dimensional.

A three-dimensional space needs three indices, [x][y][z], and four dimensions is [x][y][z][t].

Does that mean the scientific and mathematical definitions of "dimension" are different? More proof that RM is an exercise in pure maths ... :-)

Cheers,
Wol
At 10:52 AM, February 09, 2006 , --dawn said...: That plural of math from you Brits continues to amuse me after all these years.

I love you sticking to your guns regarding dimension. If you are working with mathematical tuples such as in relational theory, you work with dimension as with vectors. A 2D tuple would be (x, y), for example, while a 3D would be (x, y, z) and you get the picture. So, a tuple of ("Wol", "OWL", "UK", "male") would have a dimension of 4.

HTH. --dawn
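The two senses of "dimension" in this thread can be put side by side in a short sketch. The helper name `index_depth` and the placeholder second row are assumptions added for illustration; only the ("Wol", "OWL", "UK", "male") tuple comes from the comment above.

```python
# The mathematical sense: the dimension of a tuple is its number of components.
row = ("Wol", "OWL", "UK", "male")
tuple_dimension = len(row)

# The array sense: how many indices are needed to reach a single value.
# The second row is a placeholder, invented so the table has two rows.
table = [row, ("a", "b", "c", "d")]
i, j = 0, 2
value = table[i][j]


def index_depth(obj):
    """Count how many nested index steps lead from obj down to a plain value."""
    depth = 0
    while isinstance(obj, (list, tuple)) and obj:
        depth += 1
        obj = obj[0]
    return depth


print(tuple_dimension)     # components in one tuple
print(index_depth(row))    # indices needed within one tuple
print(index_depth(table))  # indices needed within a table of tuples
print(value)
```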
At 2:59 AM, February 10, 2006 , x@c.d.t. said...: A 3D tuple (x,y,z) could have a constraint on it and really be (u,v) and part of a 2D space after a change of coordinate system.

One could put all those points in a linear array and access them by only one index.
At 6:28 AM, February 10, 2006 , Wol said...: I look at "dimension" as being "how many bits of information do I need to define the point of interest". So, as x says, one 3D tuple is linear or one-dimensional. All I need is the column name or field number - one dimension.

When we add further equivalent tuples, ( (a,b,c) for example), I now need a second dimension to identify which tuple ... etc etc.

(Like the surface of a sphere is two-dimensional - any point can be identified by its latitude and longitude. The sphere itself is three-dimensional because you need to add distance from the origin.)

Oh - by the way Dawn, do Americans say "mathematic"? To me, the word is plural so the abbreviation should be too :-)

Cheers,
Wol
At 6:45 AM, February 10, 2006 , x@c.d.t. said...: When we add further equivalent tuples, ( (a,b,c) for example), I now need a second dimension to identify which tuple ... etc etc.

You need only one index for all the tuples because you know they are triplets so you can use k div 3 and k mod 3 to identify which tuple and which field.
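The k div 3 and k mod 3 idea can be sketched as below. The sample triplets and the function name `locate` are assumptions for illustration. The sketch uses 0-based indices, and it relies on the triplets being stored in a fixed order, which a list provides and an unordered set does not.

```python
# Hypothetical triplets, kept in a list so that their order is fixed.
triplets = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]

# Flatten them into one linear array.
flat = [v for t in triplets for v in t]


def locate(k, width=3):
    """Map a single 0-based index k to (which tuple, which field)."""
    return divmod(k, width)   # (k div width, k mod width)


k = 4
tuple_index, field_index = locate(k)
print(tuple_index, field_index)
print(flat[k], triplets[tuple_index][field_index])
```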
At 7:50 AM, February 11, 2006 , rudy said...: i'm going to bypass all the theoretical musings (and leave them to people more comfortable discussing how many angels can dance on the head of a pin, like fabian what's-his-name)

but you certainly can return a non-normalized SQL result, assuming your database has something like MySQL's GROUP_CONCAT function, a gloriously useful function which would be really handy for your MULTIPLE dropdown selects
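A rough sketch of this idea follows, using SQLite's built-in `group_concat` aggregate through Python's `sqlite3` module as a stand-in for MySQL's GROUP_CONCAT (the two are similar but not identical). The table names and data are assumptions invented for the example. The aggregate returns one delimited string, not a list, so the client still has to split it, and SQLite does not promise the concatenation order here.

```python
import sqlite3

# Hypothetical tables; SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exchange (id INTEGER PRIMARY KEY, text1 TEXT);
    CREATE TABLE exchange_multi1 (exchange_id INTEGER, value TEXT);
    INSERT INTO exchange VALUES (1, 'elephant');
    INSERT INTO exchange_multi1 VALUES (1, 'grey'), (1, 'pink'), (1, 'ivory');
""")

# One row per record: the selected values are collapsed into a single string.
text1, joined = conn.execute("""
    SELECT e.text1, group_concat(m.value, ',')
    FROM exchange e
    JOIN exchange_multi1 m ON m.exchange_id = e.id
    GROUP BY e.id
""").fetchone()

# The client splits the string to recover list elements (order not guaranteed).
colours = joined.split(",")
print(text1, joined, colours)
```

Splitting on a delimiter also breaks down if a value can itself contain the delimiter, so this is a workaround more than a list type.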
At 4:42 AM, February 14, 2006 , Wol said...: Hi x,

One problem with 'k div 3' ... if I have a set of tuples (as the relational model requires :-) then there is no order and k div 3 is meaningless :-)

So no, one index is NOT sufficient to uniquely identify which bit of data I am looking for.

Cheers,
Wol
At 6:30 AM, February 14, 2006 , --dawn said...: Rudy--That is good to know. Because every implementation of SQL is different, I tend to think in terms of ODBC (SQL-92ish). I'm not familiar enough with MySQL to know if a single SQL statement could produce exactly the data to populate a view modeled as in the example UML class. You could not do that using ODBC with MySQL, I would think, but can you using other connectivity with MySQL? Thanks for the tip re GROUP_CONCAT. --dawn
At 6:50 AM, February 14, 2006 , x@c.d.t. said...: One problem with 'k div 3' ... if I have a set of tuples (as the relational model requires :-) then there is no order and k div 3 is meaningless :-)

I knew that. I was joking of course. I wondered if you will get it :-)
At 8:49 AM, February 14, 2006 , JOG said...: >> The word dimension is overloaded, so JOG is right that using the
>> mathematical definition, an n-tuple is n-dimensional. Using a more
>> typical physical definition or a programming language Array
>> definition of dimension as Wol has done, it would make sense to
>> call such tables 2D, given that each value can be accessed with its [i][j].

Aye, it is a confusion of spatial dimensions with informational dimensions.

>> This misunderstanding crops up all the time, and I fault those with
>> the mathematical definition (that's you in this case, JOG ;-)

Well I obviously vociferously, but warm-heartedly, disagree ;) As you've probably experienced recently it can turn into a nightmare when we don't use the same terminology, especially when communicating using the written word. RM only has spatial dimensions if it is visualized down as a table - write it down in mathematical symbols and it will seem anything but 2-dimensional. As such I think we should encourage everyone to use the real sense of the word so we can have valid discussions without getting stuck in the desperate undergrowth of "tables" (and have a responsibility to do so if we actually want productive discussions).

And hey this isn't a complex mathematical thing here: everyone involved in this field knows a row is a tuple - and tuples have a dimension.

>> As for the function mapping that JOG mentions, MV could be modeled
>> with the same definition once you accept that this one consistent
>> return value in your partial function could be a list.

This is an interesting concept, but is subtly different (but wholly mathematically different) from standard MV. Worth exploring imo. Would you envision every non-multiple entry to be an atom or itself a list with only one value?
At 10:16 AM, February 14, 2006 , Wol said...: Would you envision every non-multiple entry to be an atom or itself a list with only one value?

Something I don't get about the relational model ...

In the real world we have nouns, adjectives, and gerunds (amongst others).

In the RM, all data seems to be seen as equal atoms.

Given that in the real world we have at least three different types of data, how do we expect to get sensible results if we model them as all the same in the database?

(Which goes back to my "emergent complexity" argument - we have the same thing in science with forces :-)

Cheers,
Wol
At 7:07 AM, February 15, 2006 , x@c.d.t. said...: In the real world we have nouns, adjectives, and gerunds (amongst others)

There are no such things in the "real world". :-)
Maybe in grammar. They are classes of words.

In the RM, all data seems to be seen as equal atoms.

Yes. Values. Not words.
At 5:58 AM, February 17, 2006 , Wol said...: Hi x,

you miss my point.

In grammar, we have nouns. In the real world, a noun is an object. Like you, for instance :-)

In grammar, we have adjectives. In the real world, an adjective ... well, there's no such thing :-) ... but you understand what I'm saying? There is a "something", but it's very different from an object.

In grammar, we have gerunds. In the real world we have ownership.

I think writing this has clarified my ideas somewhat but it's still left me with the same fundamental problem I started with. How does one map a relational model to the real world, if all you have in the model is values?

It goes back to my previous plaint, that the RM is an exercise in pure maths ... :-)
At 7:02 AM, February 17, 2006 , x@c.d.t. said...: How does one map a relational model to the real world, if all you have in the model is values?

Not all values are created equal :-)

The same way humans map any artificial language to the real world, perhaps.

It goes back to my previous plaint, that the RM is an exercise in pure maths ... :-)

RM is artificial because is man :-) made. Have you discovered a method to work with a computer in natural language.
At 7:13 AM, February 17, 2006 , --dawn said...: Good idea to smile after man. Did it ever strike you that we have "man made" and "homemade." I have an office in my home so these blog entries are homemade, none are man made. smiles. --dawn
At 7:16 AM, February 17, 2006 , x@c.d.t. said...: Correction

Have you discovered a method to
work with a computer in natural language.

should be

Have you discovered a method to work with a computer in the mother tongue ?
At 7:20 AM, February 17, 2006 , x@c.d.t. said...: I wonder if grammar is man made. :-)
At 7:31 AM, February 17, 2006 , x@c.d.t. said...: Did it ever strike you that we have "man made" and "homemade."

No. Because I almost never encountered the expression "homemade" and English is not my mother (or father) tongue.

We have: om (human), barbat(man), femeie(woman), mascul, femela (for animals), masculin, feminin (for gender).
At 11:13 PM, February 17, 2006 , Anonymous said...: there is no such thing as this trite "real world". Just the world that we make up in our heads, and collectively agree on. Constantly appealing to some mythical, pure true real world is silliness.
At 5:55 AM, February 20, 2006 , x@c.d.t. said...: These blogs sound like an anthropological study of data modeling.
At 4:48 PM, March 05, 2006 , LG said...: Much of what Dawn is describing is approaching the idea of "Model View Controller" approach to Object Oriented Programming.

Please have a look at Ruby on Rails for a fantastic implementation of MVC, where the "view" is a web interface.

With UniObjects, it should not be too hard to provide a Ruby on Rails model for U2 databases.

Ruby is a great language for this.

Everyone seems to be talking about database models, as in 1NF etc. and how to extract the data therefrom. What we should really be caring about is the data itself.

Data should present itself in the model as data, not as a "query" or some method of accessing the data. Those methods should be (and have been) developed. But they will, in every case, be unique. Even between SQL-based databases like MySQL, Postgres and Oracle the language varies and capabilities vary.

If the database were abstracted before we got to the model, and we could deal with objects of data that had their relationships pre-defined by the model, then we could access those objects for representation/manipulation in the "View" (interface).

# Data modeling is required for all interfaces and, therefore, throughout the process of software development.

True, but it should be done ahead of time in a "toolkit", so that the database itself is abstracted, and programmers of the interface can be interface programmers and not necessarily aware of the database itself (or at least the work can be clearly divided between interface developers and database developers).

# When data values are provided in data models related to a UI or any other interface, there might be a requirement to do any type of manipulation of or queries against this data.

As in querying a subset of the data returned by the database? If we use programmatic objects to describe our data in the model, the relationships are already there! Stored as attributes of the object instance's state. i.e. the relationships are "built-in".

# When working with a UI data model, it is not possible to work exclusively with normalized data.

And it's preferable to not use normalized data at all!
