Wednesday, 31 October 2007

No Artifacts DataBase (NADB)

prologue

Fueled by the Tech Talk 'Everything is Miscellaneous' and the book about abstract data modelling I read lately, I think it's time I put some of it into action...

Databases today inherently have trouble adapting to changes in the business. Nowadays business logic changes rapidly because of the increasing demand for adaptive business and the use of IT in general.

Our databases are not up to speed though. Designing a database often takes too much time and is too risky to get wrong, so you start off by designing the database and then base your software on it.

With rapid development, it ought to be the other way around. The business dictates what needs to be in the database, and the business changes often. Databases should be just as adaptable.

Avoid artifacts in the database

Data are not static! The database of a typical company contains a product, an order and an orderline table.

Looking at the product table, what constitutes a Product? That you say it's a product? No: it has a price and can be bought, and therefore it is a product.

If you are a paint store, your product table could contain the columns: color, viscosity, manufacturer, price. If the paint store at some point wishes to extend its catalogue to, perhaps, lamps, the table would be extended and we would have a table with the columns: color, viscosity, manufacturer, bulbtype and price.

This would leave some of the columns as null values in many rows, since a lamp doesn't have a viscosity (okay, bad example.. you get the idea). Maybe you would fill out viscosity anyway, or do something awful like using the column for something else.

In a big company I worked for I actually saw a table called 'Ship' which also contained 'Automobiles', because when automobiles became accepted as a business object some time in the past, the table happened to involve some of the same systems and had a property like 'Motor registration number'. Of course, some columns in the table were along the way misused for something completely different than intended.
Actually, it wasn't really clear anymore what entities were saved there, but I think 'Caravans' were in there as well.

Taking the example of mixing 'Paint' and 'Lamps', let's look at their individual properties:

Paint: viscosity, color

Lamp: bulbtype

Product: price

Notice that I don't go all out here and create a 'Fluid' table for viscosity and a 'Coloured' table for the color. I have also left out the property manufacturer, because it becomes its own table, 'Manufactured'. A product could be a service, which is not manufactured.

So you're probably asking what handles the relations between all these properties. Well, the product should not relate to either 'Paint' or 'Lamp', since it should not define what a product is. Instead the business handles this by instantiating entities in the database with an abstract table 'Instance'. The abstract modelling I have come up with is shown below. The relations are read as "Instance implements Type" and "Type describes Instance". It is like the modelling we know from object-oriented languages, with the key difference that it only describes state, because we are dealing with a persistence mechanism, which should only handle state. Behaviour is described by the business (read: the application on top of the DB).

You might be thinking: what about metadata, like logging of state changes and other valuable business info? These are very relevant issues, and such a log entry should be an instance in its own right. It would have properties like date and user, where user is a property that references an instance. So the property user would have a domain that allows it to be an instance, and the value would be a reference to an actual user instance. This is the way to do metadata in the model. If you were to refer to a user by a UserId, you would lock the table into being a 'User logged' event. By allowing the property to reference an arbitrary instance, you can define it to be a 'User logged' event in your business layer by checking the type of the referred instance.
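A minimal Java sketch of that idea, under stated assumptions: `Entity` is a hypothetical stand-in for the 'Instance' table, types are reduced to plain names, and `isUserLoggedEvent` is an illustrative business-layer check, not part of any real library.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical minimal entity: a set of type names plus property values.
final class Entity {
    final Set<String> types = new HashSet<>();
    final Map<String, Object> values = new HashMap<>();
}

class LogEntrySketch {
    // The business layer, not the schema, decides that this is a "user logged"
    // event: it checks the type of whatever instance the 'user' property refers to.
    static boolean isUserLoggedEvent(Entity logEntry) {
        Object ref = logEntry.values.get("user");
        return ref instanceof Entity && ((Entity) ref).types.contains("User");
    }

    public static void main(String[] args) {
        Entity alice = new Entity();
        alice.types.add("User");

        Entity change = new Entity();
        change.types.add("LogEntry");
        change.values.put("date", LocalDate.of(2007, 10, 31));
        change.values.put("user", alice); // a reference to an instance, not a UserId

        System.out.println(isUserLoggedEvent(change));
    }
}
```

If the 'user' property later points at something that is not a user (a batch job, say), nothing in the storage has to change; only the business check gives a different answer.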

Note that this is not a classical inheritance hierarchy where types can 'be of' other types; the 'Type' here is equivalent to an interface (as implemented in Java, at least). The creators of Java have since regretted the class inheritance as we know it today, because of the classical problem of the fragile superclass. The superclass is fragile because its children are inherently (pun intended) dependent on the implementation of the superclass. When several children exist, the purpose of the superclass becomes more obscured as the business continues to grow and different requirements are put on the children, as is the case with a table expanding horizontally. Instead, the children should implement different interfaces, dividing up the different requirements into several descriptors. The implementation of the interfaces can then be delegated. The big difference here is that state is not inherited.
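To make the interface-plus-delegation point concrete, here is a small Java sketch. All names (`Priced`, `Colored`, `FixedPrice`, `PaintForSale`) are made up for illustration; the only claim is that the class implements two descriptors and inherits state from neither.

```java
// Each interface is one descriptor; the names are illustrative.
interface Priced { double price(); }
interface Colored { String color(); }

// Reusable implementation of one descriptor, used through delegation.
final class FixedPrice implements Priced {
    private final double amount;
    FixedPrice(double amount) { this.amount = amount; }
    public double price() { return amount; }
}

// Paint for sale implements two descriptors and has no superclass state.
final class PaintForSale implements Priced, Colored {
    private final Priced pricing; // the delegate that answers price()
    private final String color;

    PaintForSale(Priced pricing, String color) {
        this.pricing = pricing;
        this.color = color;
    }

    public double price() { return pricing.price(); }
    public String color() { return color; }
}

class DelegationSketch {
    public static void main(String[] args) {
        PaintForSale paint = new PaintForSale(new FixedPrice(12.0), "blue");
        System.out.println(paint.color() + " paint costs " + paint.price());
    }
}
```

Swapping in a different `Priced` implementation changes how the price is found without touching `PaintForSale` or anything else that happens to be colored.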

In the case of paint being sold, it would be an instance of paint AND price, collectively becoming what the business would describe as paint we are selling.

The 'Instance' does not contain any properties itself because the 'Instance' is the sum of its relations.
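As a rough in-memory sketch of "Instance implements Type" and "Type describes Instance", assuming hypothetical class names `Type`, `Instance` and `NadbSketch` and ignoring persistence entirely:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// A Type is only a named set of property names; it carries no state itself.
final class Type {
    final String name;
    final Set<String> properties;

    Type(String name, String... properties) {
        this.name = name;
        this.properties = Set.of(properties);
    }
}

// An Instance has no columns of its own: it is the sum of the Types it
// implements and the property values those Types describe.
final class Instance {
    private final Set<Type> types = new HashSet<>();
    private final Map<String, Object> values = new HashMap<>();

    void implement(Type type) { types.add(type); }

    boolean implementsType(Type type) { return types.contains(type); }

    void set(String property, Object value) {
        boolean described = types.stream().anyMatch(t -> t.properties.contains(property));
        if (!described) {
            throw new IllegalArgumentException("No implemented type describes '" + property + "'");
        }
        values.put(property, value);
    }

    Object get(String property) { return values.get(property); }
}

class NadbSketch {
    public static void main(String[] args) {
        Type paint = new Type("Paint", "viscosity", "color");
        Type lamp = new Type("Lamp", "bulbtype");
        Type product = new Type("Product", "price");

        Instance redPaint = new Instance();
        redPaint.implement(paint);
        redPaint.implement(product); // now it can be sold
        redPaint.set("color", "red");
        redPaint.set("price", 99.5);

        System.out.println(redPaint.implementsType(product));
        System.out.println(redPaint.implementsType(lamp));
    }
}
```

Making the store sell lamps means adding a 'Lamp' type and relating instances to it; no existing instance gains a null bulbtype.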

Conclusion

The relational databases of today are too statically implemented. By abstracting the data, we can much more easily build a clean database without ripping out tables. To change the properties of an instance you only need to relate it to a different property, instead of doing heavy, frightening, irreversible 'ALTER TABLE' operations etc.

The model of course requires some robust business logic on top, but from my recent work with Hibernate and LINQ it's apparent that the traditional business layer never really worked. Constraints and validation of data have moved more towards the UI, and the business layer has become a dumb DAO. Hopefully this approach will revitalize the layer and bring more flexibility into our backend.

From here I have to prove my theory by actually implementing a system for it and describing how intuitive and flexible my backend became :)

Stating that something is within a particular domain by putting the instance in a table with a certain name does not make it so. It is as absurd as arguing whether Pluto is a moon or a planet, because it doesn't fit into either. Pluto is just an instance with a different set of properties. "Everything is miscellaneous".

Wednesday, 17 October 2007

BEA Weblogic tips

J2EE shared libraries are very useful: because they are merged with your web application, you can reuse web resources like JSPs, special framework stuff like BEA controllers, and so on.

Here's how you might do it:

creating a shared J2EE library

A library can be any J2EE module. Mind you, I've mostly used Portal Web projects as the base for my libraries, as I found that I often have web resources in my libraries.

defining a shared J2EE library

1. go to the META-INF folder
2. open the MANIFEST.MF file
3. add the following three lines, filling in the values (example given further down):

Extension-Name:
Specification-Version:
Implementation-Version:

Manifest-Version: 1.0
Extension-Name: common-portlet-template-web
Specification-Version: 0.1
Implementation-Version: 0.1

NOTE: the version MUST be given in order to import the library into BEA Workshop 10.0
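Since a missing version only shows up when the import fails, a quick pre-export check can help. The sketch below uses the JDK's `java.util.jar.Manifest` to verify that the three headers are present; the `ManifestCheck` class and its methods are hypothetical helpers, and the check says nothing about what Workshop itself validates.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class ManifestCheck {
    static final String[] REQUIRED = {
        "Extension-Name", "Specification-Version", "Implementation-Version"
    };

    // Parses manifest text; in practice you would read META-INF/MANIFEST.MF instead.
    static Manifest parse(String text) {
        try {
            return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the names of required headers that are absent or blank.
    static List<String> missingHeaders(Manifest manifest) {
        Attributes main = manifest.getMainAttributes();
        List<String> missing = new ArrayList<>();
        for (String header : REQUIRED) {
            String value = main.getValue(header);
            if (value == null || value.trim().isEmpty()) {
                missing.add(header);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Manifest manifest = parse("Manifest-Version: 1.0\n"
                + "Extension-Name: common-portlet-template-web\n"
                + "Specification-Version: 0.1\n"
                + "Implementation-Version: 0.1\n");
        // An empty list means all three headers are filled in.
        System.out.println(missingHeaders(manifest));
    }
}
```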

exporting a shared J2EE library from a Web project

1. update the versions as described in 'defining a shared J2EE library'
2. right-click the project, choose 'export->WAR file'

exporting a shared J2EE library from a J2EE Utility project

1. update the versions as described in 'defining a shared J2EE library'
2. right-click the project, choose 'export->Export', select JAR file, click 'Next'
3. choose your source folders and the META-INF folder for export, click finish

importing a shared J2EE library into the BEA Workshop 10.0

1. in window->preferences->weblogic->J2EE libraries click 'add'
2. click 'browse', locate your exported WAR or JAR or EAR
3. click 'open', click 'ok'
4. adding it to your project is done by expanding the 'WebLogic Deployment Descriptor' in your project
5. right-click -> add, 'browse', locate it and click 'ok'

redeploying a shared J2EE library from the Workshop

1. in window->preferences->weblogic->J2EE libraries select the library and click 'remove'
2. open the server overview by double-clicking the server in the 'Servers' view
3. in the published module list, select the ear project and click 'Undeploy'
4. after the ear has finished undeploying, click the shared Library module, and click 'Undeploy'
5. now you are ready to deploy the library again; do this by going to window->preferences->weblogic->J2EE libraries
6. click 'add', then 'browse' and navigate to your shared library, select the library, click 'ok'
7. redeploy the library by running your project on the server.

Thursday, 11 October 2007

Slow development cycles? Try something old-school.

Does this scenario sound familiar?

"Quality Assurance just discovered a logical error in your presentation of the customer's product.. instead of the 'Two-month no credit limit' product, the 'Three-month no credit limit' product emerges.
Thinking you have the answer, you remember the pesky ProductNameProvider you wrote, and you guess it's an off-by-one error. You seem to remember that the first two OR three records in its output are just static titles. You can't remember if it was the first two OR three, so you try incrementing your reference index by 1. You then restart the server, deploy the application and wait patiently for a result...

After replicating the error in 42 easy steps, you conclude this wasn't the right fix: now it's showing a completely wrong 'One-month no credit limit' product.

You suddenly realize that this could be an error in the actual product request or maybe the text formatting or maybe it's the caching?
- so you try every possible path for the error and try out 3-4 different scenarios for 3-4 different paths only repeating the 42 easy steps... again... and again..."

Such slow bug fixing is caused by ONE thing.. a lack of unit tests and proper structure. Fixing this bug should be straightforward. A unit test somewhere should verify that the output of the ProductNameProvider is what you expect. An automated test framework with support for stubbing (maybe mocking) would allow you to test this functionality in isolation instead of the entire application at once, which is too complex for any human to handle... well, some applications are, especially if you, like me, only work on an application for shorter periods (read: 3-6 months).
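Here is a rough idea of what such a test could look like, written as plain Java so it runs without any framework; in a real project it would be a JUnit test. The shape of `ProductNameProvider`, the `RecordSource` interface and the choice of two title records are all assumptions for the sake of the sketch, since the post does not show the real class.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical seam: where the raw records come from (a service, a cache, ...).
interface RecordSource {
    List<String> records();
}

// Hypothetical provider: skips the static title records, then indexes the products.
final class ProductNameProvider {
    static final int TITLE_RECORDS = 2; // assumption for this sketch

    private final RecordSource source;

    ProductNameProvider(RecordSource source) { this.source = source; }

    String productName(int productIndex) {
        return source.records().get(TITLE_RECORDS + productIndex);
    }
}

class ProductNameProviderCheck {
    public static void main(String[] args) {
        // Stub: fixed data instead of the real backend, so no server is needed.
        RecordSource stub = () -> Arrays.asList(
                "Title", "Subtitle",
                "One-month no credit limit",
                "Two-month no credit limit",
                "Three-month no credit limit");

        ProductNameProvider provider = new ProductNameProvider(stub);
        String actual = provider.productName(1);
        if (!"Two-month no credit limit".equals(actual)) {
            throw new AssertionError("expected the two-month product but got: " + actual);
        }
        System.out.println("ok");
    }
}
```

With the stub in place, the "was it two or three title records?" question is answered in about a second by changing one constant and rerunning.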

I have just begun working on an assignment at a large Danish company. Their applications are mainly J2EE with some heavy junk in the trunk.
My task there involves developing portlets in WebLogic 10. This is a nice platform for portals because of its intuitive CMS and flexibility in layout. BUT since it's a J2EE platform, it's inherently slow, because of lots of configuration and stuff under the hood that does lots of... well.. stuff.

To test an application, developers tend to restart the server, because we are neurotically inclined to think that an incorrect state of the configuration involved is the main cause of our bugs, and that's understandable. But although 99% of the time we just fucked up, we still restart the server about 5 times before admitting it..

NONE of my tasks will ever involve extending the CMS or directly interfacing with the CMS in ANY manner, so why would I EVER need to start it up when doing a simple two-page portlet.. or a complex seventy-page one for that matter.. the thing is.. I don't, and I certainly won't. So my advice is to make sure that running your code does not require a running server. A common mistake is to put the code right there in your entry-point methods.. for example in a Struts action, a Beehive controller or whatever. Delegate, delegate, delegate all responsibility to classes with limited responsibility.. at least 1 class that doesn't require you to load a shitload of code or annotated compiles.

Even a simple ProductNameProvider will eventually be executed at least 10 times in its development stage and at least 10 times (maybe hundreds) in its lifetime for debugging purposes.. so let's do the math.
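A sketch of what "delegate, delegate, delegate" could look like in Java. `ProductAction` is only a stand-in for a Struts action or Beehive controller method, with a plain `Map` as a hypothetical replacement for the request object; `CreditLimitLabel` is an invented example of a class with one small responsibility.

```java
import java.util.HashMap;
import java.util.Map;

// Plain class with one responsibility: no servlet, Struts or Beehive types involved,
// so it can be run and tested without a server.
final class CreditLimitLabel {
    String forMonths(int months) {
        if (months < 1) {
            throw new IllegalArgumentException("months must be positive");
        }
        String[] words = {"One", "Two", "Three"};
        String prefix = months <= words.length ? words[months - 1] : String.valueOf(months);
        return prefix + "-month no credit limit";
    }
}

// Stand-in for the framework entry point: it only unpacks the request and delegates.
final class ProductAction {
    private final CreditLimitLabel label = new CreditLimitLabel();

    String execute(Map<String, String> requestParams) {
        int months = Integer.parseInt(requestParams.get("months"));
        return label.forMonths(months);
    }
}

class DelegateSketch {
    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("months", "2");
        System.out.println(new ProductAction().execute(params));
    }
}
```

The entry point stays so thin that there is little in it to get wrong, and everything worth debugging lives in a class you can run from your IDE.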

total runs = 20
Writing the ProductNameProvider in a unit test friendly environment = 15 min
Restarting a shitload of applications on a J2EE server + replicating bug = 2 min (at least)
running a single unit test = 1 sec (at most)

60 x 15 (unittest creation) + 20 x 1 (unittest running) = 920 seconds
60 x 2 x 20 = 2400 seconds
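The same arithmetic as a tiny Java program, in case you want to plug in your own numbers. The `CycleCost` class and its method names are made up for this sketch; the break-even calculation is an extra step that follows directly from the figures above.

```java
class CycleCost {
    // Total seconds when you invest in a unit test up front.
    static int withUnitTests(int setupMinutes, int runs, int secondsPerRun) {
        return 60 * setupMinutes + runs * secondsPerRun;
    }

    // Total seconds when every run means restarting the server and replicating the bug.
    static int withServerRestarts(int runs, int minutesPerRun) {
        return 60 * minutesPerRun * runs;
    }

    // Smallest number of runs at which the unit-test approach is strictly cheaper.
    static int breakEvenRuns(int setupMinutes, int secondsPerRun, int minutesPerRestart) {
        int runs = 1;
        while (withServerRestarts(runs, minutesPerRestart)
                <= withUnitTests(setupMinutes, runs, secondsPerRun)) {
            runs++;
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(withUnitTests(15, 20, 1));  // 920
        System.out.println(withServerRestarts(20, 2)); // 2400
        System.out.println(breakEvenRuns(15, 1, 2));   // 8
    }
}
```

So with these numbers the unit test has paid for itself by the eighth run, well before the 20 runs assumed above.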

I am not counting the maintenance of unit tests; it happens in parallel with development and should not require extra time.. unless you implement a completely new feature..

I'm not even getting into how unit testing reduces bugs, facilitates refactoring, reduces the footprint of your code etc etc.. there are tons of documented experiences of that out there.. read it.. read it all... now!

Running/testing your code, should be as easy as ALT + SHIFT + x, t

Kenneth H. Nielsen