MultiValue Trilogy: Data
This set of SOFACards gives an introduction to data within a MultiValue or Pick® database. Use these cards to introduce technical professionals serving as managers, marketing professionals, students, or developers to the MultiValue server environment.
There are multiple flavors of MultiValue, at times using different vocabularies. Two variations are those designated in these cards as R83 ("pure Pick") and PI (Prime Information and "Pick-like" systems). The author's bias toward the PI, including UniData�, environment will be evident as that is the origin of the experience represented here. It is likely that not all statements are 100% accurate for all MultiValue flavors, so please refer to your vendor documentation for development details.
Record @ID or "AT ID"
PI @ID = R83 item-id
@ID is the name of the unique primary key for each record.
This record identifier, with its value stored at location zero in the record, resembles a key in a row of a
relational table. However, while the value of the @ID uniquely identifies a record within a given file, the
record could include multivalued elements so these embedded tables are also keyed by @ID. For a multipart key,
all parts of the key are concatenated using a delimiter, such as an asterisk. LIST statements assume @ID as the
1st column of output unless otherwise specified.
A 3-part key for a file could be a name, year, and quarter, for example:
item-id: Record Item Identifier
R83 item-id = PI @ID
The item-id is the unique primary key for each item in a file.
This item identifier, with its value stored at location zero in the item, resembles a key in a row of a
relational table. However, while the value of the item-id uniquely identifies an item within a given file, the
item could include multivalued elements so these embedded tables are also keyed by item-id. For a multipart key,
all parts of the key are concatenated using a delimiter, such as an asterisk. LIST statements assume item-id as
the 1st column of output unless otherwise specified.
A 3-part key for a file could be a name, year, and quarter, for example:
PI field = R83 attribute
Named elements making up a record in a MultiValue database.
Each MultiValue file contains records made up of fields. These fields are defined and described in the
dictionary for the file, including such information as a name, location, display information, multivalue and
association information. Developers are known to edit test data in a file by viewing one field per line. This
differs from many environments where fields are viewed in a row as in a spreadsheet (or for baby boomers as
columns on a single 80-column punch card).
001: Tincat Group, Inc.
R83 attribute = PI field
Named elements making up an item in a MultiValue database.
Each MultiValue file contains items made up of attributes. These attributes are defined and described in the
dictionary for the file, including such information as a name, location, display information, and associative
attribute codes. Developers are known to edit test data in a file by viewing one attribute per line. This differs
from many environments where attributes are viewed in a row as in a spreadsheet or on a good old 80-column
computer punch card.
001: Tincat Group, Inc.
DICT: File Dictionary
Metadata for a file including attribute or field names.
MultiValue files are named as records stored in the VOC or MD file. Each of these records, serving as file
pointers, typically specifies a data section and a dictionary section. Like the master dictionary (MD/VOC) for
each account, these DICTs may also be extended with user-defined vocabulary items. This is often done because
each MV Query pulls its vocabulary from just one DICT. So, in addition to stored data elements (type D) like
those defined in a COBOL FD section, a DICT can also include R83 correlatives or
PI virtual fields, also known as I-descs.
LIST DICT MYFILE WITH @ID="NAME"
@ID. . . . TYP LOC CONV. FORMAT . . SM . . ASSOC . . . .
NAME D 2 60L S
An element that defines an array rather than just a single value.
A field or attribute can be defined as single or multi-valued. A multivalue is stored as text, just as any other data, but includes multivalue marks, delimiters separating the individual values. Because multivalues are acceptable, the data model is not in first normal form. The data may be viewed as logically flat tables for ODBC or SQL-92 queries, for example, by viewing each multivalued element as if in a logical child table. While out of fashion with relational databases in the past, Oracle� and others have now changed their tune and their databases. Modifying a MultiValue database and application to change cardinality on attributes is so easy it should prompt jealousy from a SQL database developer.
PI record = R83 item
An ordered set of data elements; row in a table; tuple in a relation.
The record is core to Pick� -- everything is a record! For example, don't get confused when you see your
programs stored as records in a file, rather than as files in directories. With all records of variable length,
source code and other text-based documents fit right in. Identified by a unique key, a record in a file has
information about a single entity, such as a person, purchase order, or client. A new field can be added to a
record at any unused location without reworking existing applications.
: COUNT CLIENTS [Example of counting records in a file]
130 record(s) counted.
R83 item = PI record
An ordered set of data elements; row in a table; tuple in a relation.
The item is core to Pick� -- everything is an item! For example, don't get confused when you see your
programs stored as items in a file, rather than as files in directories. With all items of variable length,
source code and other text-based documents fit right in. Identified by a unique key, an item in a file has
information about a single entity, such as a person, purchase order, or client. A new attribute can be added to
an item at any unused location without reworking existing applications.
Example of qualifying which items to list from a file:
> LIST RECIPES NAME WITH FRUIT = "BANANA"
Like a SQL database table, a file contains data, typically stored on disk, organized in records, composed of
fields. It is identified by a file pointer stored in the VOC/MD, with a unique name for each logical DATA-DICT
coupling in an account. The file is the largest collection of data for querying with LIST statements. In R83 type LISTFILES for filenames.
A file pointer record -- UniData� example: PAGE VOC MONTE
F MONTE stores names of dogs, msw 5/02 [Comment after the "F"]
/pathalias/MONTE [path to data]
/pathalias/D_MONTE [path to dictionary]
In a MultiValue Virtual Machine, it represents a database instance.
Each account has its own VOC in PI or MD in R83 at a specified path within the OS which serves as the vocabulary file/master dictionary for the account. The account defines a namespace within the MultiValue system e.g. file names may be reused only if in separate accounts. A typical MultiValue site might have accounts for separate applications, such as sales and human resources, perhaps with TEST and LIVE accounts on the same server. The VOC/MD holds the pointer information, for example, to all stored data for that database instance. An ODBC data source will point to a single MultiValue Account, retrieving VIEWs and TABLEs based on security related to that account.
PI LOC = R83 AMC
DICT attribute indicating the position of a field within a record.
With all data being variable length, they are stored as delimited text strings. So, the location is not a fixed location, such as card column 72 (for the "mature" readers), but a number indicating whether this is the 1st, 2nd, ... value in the record. A stored data element is uniquely identified by a file name, the value of the key (@ID, stored in LOC 0) to a particular record, and the LOC of the element, which is included in the DICT specification. In spite of the use of locations within a record, the field names in a dictionary must still be unique. The LOC attribute is also used for virtual field code since I-descs have no stored location.
AMC: Attribute Mark Count
R83 AMC = PI LOC
DICT attribute indicating the position of an attribute in an item.
With all data being variable length, they are stored as delimited text strings. A mark is used to identify the end of one attribute value and start of the next. The attribute mark count indicates whether a named attribute in a dictionary will follow the 1st, 2nd, ... attribute mark, thus providing a logical location of the attribute within the item. A stored data element is uniquely identified by a file name, the value of the key (item-id stored in AMC 0), and the AMC of the attribute, which is included in the DICT specification. In spite of the identification of locations within an item, the attribute names in a dictionary must still be unique.
Field, value, and subvalue marks delimit the variable-length data.
Contributing to the flexibility of the data structures and corresponding productivity for software developers,
data are stored as delimited text strings. Items or records are separated by delimiters, as are the strings
within a record. For example, a field or attribute mark separates one element from another while a value mark,
@VM, delimits the values in a multivalued element. @VM has an ascii value of "]" with a hexadecimal value of "fd"
and a decimal value of 253 (I tossed that in since I knew it -- sad, eh?).
PI A named association forming a logical child table.
R83 Uses associative attributes, putting codes in attribute 4 of the DICT
Some multivalued data fields are simple one-dimensional arrays, with one value per row, while others are in multivalue associations, with multivalued elements associated with siblings in multi-dimensional arrays. These parallel multivalues form a table within a record or item.
|First||Last||Phone Type||Number||1st pick|
| || ||Home||(901)555-4321||Yes|
| || ||Mobile||(703)555-3333||No|
A value within a sub-table of a multivalued field or attribute.
Just as files are made up of records, which are made up of fields, which may be single or multivalued, the
multivalues could have subvalues. That is, an element may be of type "array" and that array may be more than
one-dimensional. Alternate terminology might indicate that a child table (multivalue) that inherits from a parent
table may itself have embedded child tables (only after marriage). An element with a 5 x 3 array for its value
would include 15 simple values, with delimiters. @SVM=^252 is used as a delimiter for subvalues.
Unprintable characters shown.
R83 correlative may be found in PI, but less likely
A function to be applied to a data element for specified events.
Correlatives are used as stored procedures tied to data elements in R83 and as a means to specify derived data,
which in PI might be coded as I-descs. If a data element is stored as a code that should be shown to the user as
the related description, e.g., then correlatives can be coded to that effect. The PI branch typically leaves
dictionary attribute 8 open for correlatives, rarely using them. Many in the R83 branch have also advocated
retiring correlatives. Learn only if required or it edifies you.
ED DICT FRIENDS FULLNAME_LFM &
8 A; N(LAST) : "," : N(FIRST) : N(MIDDLE)
I-desc: Interpretive Descriptor
PI Virtual field, computed or calculated data, user-defined function
When querying a MultiValue database, these I-descriptors or I-descs look to the user like stored data, but are
not. They are calculated on the fly when requested. Identified as a type I or V, these fields are specified using
DataBASIC syntax, unlike R83 correlatives. They often execute subroutines, or SUBRs, to perform complex
computations. Specify I-descs in dictionaries as single or multivalued as with stored data, but unlike stored
procedures in Oracle�. I-descs are one of the best enhancements to the MultiValue platform arising from the PI
@ID.......................... TYP.. LOC....................CONV & FORMAT SM
PRODUCT_TOTAL I PRICE*QTY MD2,$ 9R S
TRANS: Translate from another File
PI TRANS is similar to the R83 TFILE correlative
Function used to specify a field from another file in a dictionary.
The native MultiValue query language permits queries against exactly one file in each LIST statement. The
vocabulary for the query statement includes reserved words (like LIST), the file name, names of fields defined in
the file's dictionary, and inline functions on those fields. The fields for the query may be stored or computed.
Computed fields include those using the TRANS function to retrieve data from a "foreign" file, requiring the
proper parameters to retrieve the correct value.
DECK_ID D 3 [two DICT entries in the same file, one a TRANS]
CARD_NAME I TRANS('CARD_DECKS',DECK_ID,'DECK_NAME','X')
TFILE: Translate from another File
R83 t-file or TFILE correlative is similar to a PI TRANS I-descriptor
A correlative function for retrieving data from another file.
A correlative may be added to a dictionary giving a foreign key pointer to another file and retrieving the
value of an attribute in that other file. Because the MultiValue query language does not permit specifying JOINS
within the syntax, the MV file definition serves as a VIEW or SCHEMA for a query. For a particular query, the
values to be retrieved must be specified as attributes in a single DICT. An F correlative with a TFILE function
can be added to the dictionary as an S or synonym type attribute. The cryptic syntax was developed during an era
Instead of talking about blocks and block sizes, Pick� uses frames.
A frame is the basic unit of disk storage for MultiValue data. Some implementations call these "blocks" to align with the language of other databases. In memory, it is called a "Group" while on disk the frame is contiguous storage with the size specified when creating a file. In UniData�, for example, you specify n where the frame size is 512 bytes x 2n in the CREATE.FILE statement, so a 4k frame would be n=2. To optimize database performance, learn more about creating and resizing files for MultiValue, or if you prefer, delegate it to someone with more interest in this topic. For more specifics read your vendor documentation and be sure to check out fitzlong.com
Specify the number of Groups allocated to a file when you create it.
The modulo specification is given when creating a file. It is the number of Groups used for the file I-O. You
specify both the group or frame size and the modulo (number of groups), beyond which the data goes into
"overflow." While less database administration is typically required in MultiValue environments than with other
databases, tuning files and understanding various file types is one job it is best not to ignore. I have seen
developers just pick a prime number (I typically choose Mersenne primes) for the modulo when creating a new file
rather than determining a wise choice, leaving the file rework and optimization to the DBA.
Unidata example with modulo 7: CREATE.FILE MYFILE 7,2 DYNAMIC
Not SAM files nor Indexed sequential, MultiValue files are hashed.
MultiValue systems scale amazingly well. Hashing is the process of putting data into groups for future retrieval. Using a hashing algorithm to locate data separates access time from file size. MultiValue databases are often employed in applications handling volume transactions, out-performing other databases where the rubber meets the road. Tuning the MultiValue database includes altering the number and size of groups for the individual files. In some cases a file can be defined to resize dynamically. Various flavors of MultiValue also permit specification of alternate hashing algorithms to tune performance. So, although I have heard developers refer to them as DAM files, they are, in fact, hashed.
Notable for being Y2K compliant since the start, uses date offsets.
January 1, 1968 is significant in the Pick� world as Day 1. Unlike many applications using other databases, MultiValue applications with MV standard dates have been Y2K compliant from the outset. Each date is stored as a string representing the number of days since December 31, 1967, whether positive, 0, or negative. With this approach, dates are easy to compare and subtract. This is a boon for software developers since the system handily manages date conversions in any format. Offset dates are also used by such programs as Microsoft� Excel. So that you know when to send a card to firstname.lastname@example.org, or in case you just want to know my age, I was born on -4286.
R83, µData, and PI
These cards use the terms R83, µData, and PI to identify three not-entirely-distinct streams of development of the MultiValue platform. The MultiValue Family Tree diagram, available as a free pdf from store.tincat-group.com, provides a corresponding visual. These terms are used at times in the cards as shorthand identifiers for the three interwoven but separate branches of MultiValue.
In the late 70? s, the Pr1me Information (PI) stream started, emulating functionality roughly equivalent to Microdata? s Reality 3.x product (µData), which was roughly equivalent to Pick® R77. After that (e.g. R83), the streams diverged. Pick historians indicate that R80 is the software version that was marketed as R83, but the marketing term serves our purposes in these cards none-the-less.
Terms related to MV data and commands are identified with the two primary flavors -- R83 and PI -- with µData typically aligned with the R83 camp.
SOFACards MultiValue Overview Pack
Many people and web sites were consulted in the development of these cards. Special thanks to the following people whose help was invaluable and whose years of experience in the MultiValue world total over 80:
- Dave Weaver for prompt responses of encouragement when I needed it and excellent proof-reading skills in both form and content.
- Mark Pick for sharing stories and sending the thumbs up signals.
- Jon Sisk for knowing so much and coming to my aid in the 11th hour, plus his detailed publications that aided my research.
- Virginia VanAndel for "doing lunch" and passing along helpful intuitive reactions as well as knowledge from her vast experience.