mmx metadata framework
...the DNA of your data
MMX metadata framework is a lightweight implementation of OMG Metadata Object Facility built on relational database technology. MMX framework
is based on three general concepts:
Metamodel | MMX Metamodel provides a storage mechanism for various knowledge models. The data model underlying the metadata framework is more abstract in nature than metadata models in general. The model consists of only a few abstract entities... see more.
Access layer | Object oriented methods can be exploited using inheritance to derive the whole data access layer from a small set of primitives created in SQL. MMX Metadata Framework provides several diverse methods of data access to fulfill different requirements... see more.
Generic transformation | A large part of relationships between different objects in metadata model are too complex to be described through simple static relations. Instead, universal data transformation concept is put to use enabling definition of transformations, mappings and transitions of any complexity... see more.

MMX Framework: Extending with Extended Properties

December 10, 2010 13:36 by mmx

Pretty often there is a need to add some extra attributes to a metamodel that has already been deployed. The need for those changes usually arises from changed requirements for an application. So how to enable those changes without extending the metamodel (M2 layer) itself?

There is a special property type, AnyProperty, to handle this scenario. Whan an object type (class) has a property of type AnyProperty, an object (instance of the class) can have an arbitrary number of different properties of type AnyProperty that all share the same property_type_cd. Then how to distinguish between those properties? 

All the instances of an AnyProperty descend from the same property type of a class in M2 layer, but an application who 'owns' them can give them new names. Alternatively, it is possible to use the same name, but a distinguishing prefix directly in the value field of the property. So an AnyProperty instance 'belongs' to an application and its completely up to the application how it chooses to use, denote and present this property.

Downsides (there obviously are some, right?). First, the inheritance mechanisms built into MMX Framework (MMXMD API) obviously cannot provide full support to those properties, as the names of them are not known to M2 layer. Second, AnyProperty is defined as String in M2 so an application cannot rely on the type information if it chooses to have an instance of AnyProperty that is implemented with some other datatype (eg. an XML document or a date formatted as a string).

Anyway, this mechanism (which bears some resemblance with anyAttribute element in XML Schema) enables making run-time changes in metamodel when design-time is not an option (run-time and design-time here refer to modeling, not application design), just remember to use it with care.

 



XML Schema to MMX Mapping: Essentials

August 16, 2010 17:54 by marx

XML Schema is an important and one of the most widely used modelling tools. Therefore it would make sense to have the the ability to use metamodels created as XML Schemas in MMX Metadata Framework in addition to metamodels originating from UML. As MMX architecture closely follows the ideas and architecture of MOF it would be natural to take advantage of the multitude of UML profiles for XML Schema that are already available. Most of these, however, focus on UML to XSD conversion, so there are only a few targeting XML Schema mapping to UML. As always, our approach is a pragmatic one, concentrating on elements we find more important, more widely used and easier to implement. Hence the word 'Essentials' in the title.  

Data types. Most important XML Schema built-in data types have direct equivalents in MMX Core Metamodel (string, integer, decimal, boolean, date etc.). A property (md_property_type) realizing an XSD attribute with built-in type therefore references one of these Core types in its datatype_cd. In case an XSD simple type is inherited from a built-in type with <restriction> construct the derived datatype is created in MMX M2 layer as a new class with datatype referred in <base=...> as its parent. The root data type classes have all required properties to support XML Schema restrictions and facets (length, minLength, maxLength, minInclusive, maxInclusive, minExclusive, maxExclusive, fractionDigits, totalDigits, pattern) that are appropriately inherited by their respective descendants as optional properties. In case a restriction specifes an enumeration a new Enumeration class is created.

Global vs. local. Global simple types would be realized as descendants of a built-in data type (parent_object_type_cd). Global complex types would be preferably realized as independent classes (md_object_type) with elements referring to them with <type=...> or <ref=...> implemented as their descendants. In case a complex type is defined directly inside an element (local complex type) it could be realized either globally and referenced with parent_object_type_cd, or directly inside the element, with parent_object_type_cd referencing the schema class. We assume that whenever an element references another element we can subsitute the reference with the referenced element. So, when there is a <ref=...> or a <type=...> attribute pointing to another (global) element we can replace the reference with the target element itself. The same applies to <attributeGroup>.

Naming. If possible, name attribute of an element or an attribute is used as name of a class (object_nm) or a property type (property_nm). It is very important that every element and attribute had either a name (preferred) or an ID (second best) to uniquely identify the corresponding class in MMX metamodel. In case both name and ID are missing a technical name gets generated that is less intuitive and makes the metamodel more difficult to understand. Note. origin_ds column of md_object_type is constructed based on either name or ID attribute.

Model Groups. XML Schema group (ordering) indicators <all>, <sequence> and <choice> do not have direct counterparts in UML. One way to implement this in MMX is via <group> (named model group) element. A <group> element would be realized as a class (md_object_type) having a 1:M relationship (aggregation) with the group members (classes). Ordering of the group members is indicated as a dedicated property of the group class.

XSD construct Mapping to MMX model
<schema>

<schema> element corresponds to an MMX metamodel. <schema> attributes attributeFormDefault, elementFormDefault, blockDefault and finalDefault are irrelevant in MMX metamodel context and are assumed to have their respective default values. Content of targetNamespace attribute is stored with md_object_type (class) while version and xmlns attributes are stored as class properties in md_property_type. Each xmlns attribute gets its own property type as this is a multiple property type. 

<element>

In case an <element> contains a complexType, specifies a complexType as its type or references another element that happens to be a complexType it gets realized as a class (md_object_type). In case an element is a simpleType it is realized as a property (md_property_type) of the schema class. 

<attribute>

An <attibute> generally corresponds to an MMX M2 level property (md_property_type). default and fixed attributes values are stored in default_value_ds and changeable_ind columns of md_property_type. Required attribute (use="required") is stored in mandatory_ind column.

<complexType> <complexType> element is naturally realized as a class (md_object_type) in MMX. In case a complex type contains another complex type element, the nested element gets its own class and a relationship between the two classes. Complex type attributes are realized as properties (md_property_type) owned by the complex type class. 
<simpleType>

Stand-alone (global) <simpleType> elements are realized as properties (md_property_type) of the root (schema) class.

<enumeration>

MMX provides the facilities to realize an <enumeration> of unlimited depth in form of a special built-in data type. An enumeration class is realized in M2 (class) layer of MMX metametamodel. Enumeration instances (corresponding to enumeration literals of UML) are created on M1 (instance) layer, in md_object table. An attribute taking enumeration as its type is modelled as a property type with enumeration as its data type referencing the specific enumeration class as its domain_cd. 

<group>

A <group> is realized as a named class (md_object_type) having relationships with the group members. Order indicator (<all>, <sequence> or <choice>) is stored in a dedicated property of the <group> class.

<all>, <sequence>, <choice> 

XML Schema order indicators (<all>, <sequence>, <choice>) denote 1:M relationships (aggregations) that might require a specific order and are realized via a named <group> element.

<minOccurs>, <maxOccurs>

XML Schema occurrence indicators (<minOccurs>, <maxOccurs>) are expressed as a combination of mandatory_ind and multiplicity_ind columns. Note that mandatory_ind and multiplicity_ind do not support numeric occurrence indicator values therefore only values "0", "1" and "unbounded" are allowed here. 

<attributeGroup>

An <attributeGroup> is essentially implemented as a complex type (MMX class), with attribute group name as the class name. A reference to an attribute group is realized either as inheritance between complex types or by substituting it with referenced attributes. In former case, the attribute group class would be an abstract class, with descendant class inheriting all of its attributes. 

<annotation>, <documentation>, <appInfo>

Class <annotation> is stored in object_ds column of a class (md_object_type). Attribute <annotation> is stored in property_ds column of a property (md_property_type). 

Notes. The following elements of XML Schema are not covered (yet) for various reasons:

- Identity constraints (<field>, <selector>, <key>, <keyref>, <unique>). 
- Undefined elements, attributes and content (<any>, <anyAttribute>, <notation>). 
- Relationships with external schemas (<redefine>, <import>, <include>). 
- Constructs we don't see too often or don't like (<union>, <list>). 

At least a part of this list will be covered in a subsequent release of this mapping document.

References:

[1] Grady Booch, Magnus Christerson, Matthew Fuchs, Jari Koistinen: UML for XML Schema Mapping Specification 

[2] David Carlson: UML Profile for XML Schema, http://www.xmlmodeling.com/documentation/specs/

[3] Martin Bernauer, Gerti Kappel, Gerhard Kramler: Representing XML Schema in UML - An UML Profile for XML Schema

[4] Nicholas Routledge, Linda Bird and Andrew Goodchild: UML and XML Schema

[5] M. Laura Caliusco, César Maidana, Martín Patiño, M. Rosa Galli and Omar Chiotti: A UML profile for XML Schema

[6] Martin Bernauer, Gerti Kappel, Gerhard Kramler: Representing XML Schema in UML – A Comparison of Approaches

 



Trees and Hierarchies the MMX Way

March 30, 2009 22:34 by marx

Implementing trees and hierarchies in a relational database is an issue that has been puzzling many and has triggered numerous posts, articles and even some books on the topic. 

As stated by Joe Celko in Chapter 26, Trees [1]: "Unfortunately, SQL provides poor support for such data. It does not directly map hierarchical data into tables, because tables are based on sets rather than on graphs. SQL directly supports neither the retrieval of the raw data in a meaningful recursive or hierarchical fashion nor computation of recursively defined functions that commonly occur in these types of applications. <...> Since the nodes contain the data, we can add columns to represent the edges of a tree. This is usually done in one of two ways in SQL: a single table or two tables." The single table representation enables one-to-many relationships via self-references (parent-child) while more general, two table representation handles many-to-many relationships of arbitrary cardinality. Based on the principles of Meta-Object Facility (MOF), MMX implements both M1 (model) and M2 (metamodel) layers of abstraction. Two most important relationship types defined by UML, Generalization and Association, are realized.

Generalization is defined on M2 level and is implemented via SQL self-relationship mechanism. Each class defined in M2 must belong to one class hierarchy, and only single inheritance is allowed. In terms of semantic relationship types in Controlled Vocabularies [2], this is an 'isA' relationship. Associations (as well as aggregations and compositions) are realized as a relationship table (an associative or a 'join table') allowing any class to be related to any other class with an arbitrary number of associations of different type (with support for mandatory and multiplicity constraints). This implementation enables straightforward translation of metamodels expressed as UML class diagrams into equivalent representation as MMX M2 level class objects.

M1 level deals with instances of M2 classes and parent-child hierarchies here denote 'inclusion', 'broader-narrower' or structural relationships between objects ('partOf' relationship in Controlled Vocabularies world). UML Links are implemented as a many-to-many relationship table, with both parent-child and link relationships being inherited from associations defined on M2 level. This inheritance enables automatic validation of M1 models against M2 metamodels by defining general rules to reinforce the integrity of models based on the characteristics of respective metamodel elements.

  Hierarchy
('single table', parent-child, one-to-many)
Relationships
('two tables', relationship table, many-to-many)
MOF M2
(metamodel)
Class hierarchy ('isA'),
UML Generalization
UML Associations
MOF M1
(model)
Object hierarchies ('whole-part'),
UML Links
UML Links

There seems to be a huge controversy in data management community whether implementing hierarchies in SQL should employ recursion support built into modern database systems or not. While a technique employing manual traversal and management of tree structures is proposed by Joe Celko in [1], the book is 15 years old and meanwhile the world (and databases) have changed a bit. Recursion is now part of ANSI SQL-99 with most big players providing at least basic support for it, and in many cases arguable gain in performance without taking advantage of recursive processing makes way to the gain in ease and speed of application development with it.

MMX Framework encapsulates all the details of handling inheritance, traversing hierarchies, navigating linked object paths etc. in MMX Metadata API realized as a set of table functions (database functions that return table as the result) that can be easily mapped by Object-Relational Mappers [3]. The performance penalty paid for recursion that might be an issue in an enterprise scale DWH is not an issue here - after all, MMX Framework is designed for (and mostly used in) metadata management, where data amounts are not beyond comprehension. 

[1] Joe Celko's SQL For Smarties: Advanced SQL Programming, 1995.

[2] Zeng, Marcia Lei. Construction of Controlled Vocabularies, A Primer (based on Z39.19), 2005.

[3] Scott W. Ambler. Mapping Objects to Relational Databases, 2000.