onsdag den 31. oktober 2007

No Artifacts DataBase (NADB)


Fueled by the Tech Talk 'Everything is miscellanous' and the book about abstract datamodelling I read lately I think it's time I give it some action...

Databases today inherently have trouble adapting to changes in the business, nowadays business logic changes rapidly because of the increasing demand of adaptive business and use of IT in general.

Our databases are not up to speed though. Designing a database often takes too much time and is too great a risk to mess up, so you start of by designing this, and then basing your software on it.

With rapid development, it ought to be the other way around. The business dictates what needs to be in the database and often business changes. Databases should be as adaptable.

Avoid artifacts in the database

Data are not static! The database of a typical company contains a product, order and orderline.

Looking at the product table, what constitutes a Product? That you say it's a product? - no, it has a price and can be bought and therefore is a product.

If you are a paint store your product table could contain the columns: color, viscousity, manufacturer, price. If the paint store at some point wishes to extend it's catalogue to, perhaps, lamps, the table would be extended and we would have a table with the columns: color, viscousity, manufacturer, bulbtype and price.

This would leave some of the rows as null values, since a lamp doesn't have viscousity(Okay bad example.. you get the idea). Maybe you would fill out viscousity anyway or do something awful like using the column for something else.

In a big company I worked for I actually saw a table called 'Ship' which also contained 'Automobiles', because when automobiles became accepted as a business object some time in the past, the table happened to involve some of the same systems and had a property like 'Motor registration number'. Ofcourse, some columns in the table were along the way misused for something completely different then intended.
Actually it wasn't really clear anymore what entites were saved there, but I think also 'Caravans' were in there.

Taking the example of mixing 'Paint' and 'Lamps', lets look at their individual properties:

    • Paint



    • Lamp


    • Product


  • Notice that I dont go all out here and create a 'Fluid' table for viscousity and a 'Coloured' table for the color. Also I have left out the property manufacturer, because it becomes a table 'Manufactured'. A product could be a service, which is not manufactured.

    So you're probably asking, what handles the relations between all these properties. Well, the product should not relate to either 'Paint' or 'Lamp' since it should not define what a product is. Instead the business handles this by instantiating entities in the database with an abstract table 'Instance'. The abstract modelling I have come up with is shown below. The relations are read as "Instance implements Type" and "Type describes Instance". It is like the modelling we know from Object-oriented languages, with the key difference that it only describes states, because we are dealing with a persistence mechanism, that should only handle state. Behaviour is described by the business(READ application on top of DB).

    You might be thinking, what about metadata like logging of change of state and other valuble business info. These are very relevant issues and such a log of something should be an instance in its own right. It would have properties like date and user. Where user is a property that is a reference to an instance. So the property user would have a domain allowing it to be an instance, and the value would be a reference to an actual instance of a user. This is the way to do metadata in the model. If you would to refer to a user by a UserId you lock the table in to being a User logged event, by allowing it to be a reference to an arbitrary instance you can define it to be a User logged event in your business by checking the type of the referred instance.

    Note that this is not a classical Inheritance hierarchy where types can 'be of' other types, the 'Type' here is equal to an interface(as implemented in Java atleast). The creators of Java have since regretted the class inheritance as we know it today, because of the classical problem of the fragile superclass. The superclass is fragile because it's children are inherently(pun intended) dependant on the implementation of the superclass. When several children exists, the purpose of the superclass becomes more obscured as the business continues to grow and different requirements are put on the children, as the case with a table expanding horizontally. Instead, the children should implement different interfaces, dividing up the different requirements into several descriptors. The implementation of the interfaces can then be delegated. The big difference here is that state is not inherited.

    In the case of paint being sold, it would be an instance of paint AND price, collectively becoming what the business would describe as paint we are selling.

    The 'Instance' does not contain any properties itself because the 'Instance' is the sum of its relations.


    The relational databases today are too statically implemented, by abstracting the data, we can much more easliy build a clean database, without ripping out tables. To change the properties of an instance you only need to relate it to a different property and not do heavy, frightening, undoable 'Alter tables' etc.

    The model ofcourse requires some robust business logic on top, but from the recent working with Hibernate and Linq it's apparent that the traditional business logic never really worked. Contraints and validation of data has moved more towards UI and the business layer has become some dumb DAO, hopefully this approach will revitalize the layer and bring more flexibility into our backend.

    From here I have to prove my theory by actually implementing a system for it and describe how intuitive and flexible my backend became :)

    Stating that something is within a particular domain by putting the instance in a table with a certain name, does not make it so. It is as absurd as arguing whether Pluto is a moon or a planet, because it doesn't fit into either. Pluto is just an instance with a different set of properties.. "Everything is miscellanous".