Monday, January 21, 2008

Interactive JavaScript environment

Before I really got into JavaScript I experienced it as a necessary evil when constructing web applications.

Recently though, with the added pressure of web apps being interactive, I finally bit the bullet and gave object-oriented JavaScript a go.

I was amazed after going through several tutorials on script.aculo.us (based on Prototype) and jQuery.

But to really get my groove on I needed the ultimate tools, which I found here: Web Development Bookmarklets.

The JavaScript interactive compiler (drag to bookmarks bar) is from the web development page. You can click this bookmarklet to open a JavaScript shell for any page, and it keeps its scope across new requests. Excellent!


Here's a bookmarklet for loading Prototype on the fly (I made this myself :)); drag it to your bookmarks bar:

load prototype

Then you can write something like this in the shell to print the names of all divs:

$$('div').each(function(element){print(element.name);});

Or this to make all divs blink:

$$('div').each(function(element){blink(element);});


You can load any script you want, which is particularly easy in the shell with load(scriptURL), e.g.:

load('http://wiki.script.aculo.us/javascripts/prototype.js');
//to add more fun load this:
load('http://wiki.script.aculo.us/javascripts/effects.js');
//shake all divs
$$('div').each(function(divElement){Effect.Shake(divElement);});

It boils down to being able to manipulate the DOM at will and test Ajax very easily. Enjoy it, I do.

UPDATE:
A more powerful load:
load scripts

It adds the function shakeDivs(); try it.

Sunday, December 23, 2007

High automatic test coverage of web applications

Tests in web applications


Features in a web application reside in one of two locations: On the client or on the server.

A unit test of the feature in the server-side codebase can cover numerous errors. If the application is written in PHP, Python, or any other language interpreted at runtime, the unit test can catch what would otherwise be compiler errors. It can also catch logical errors and failures to meet the requirements.

A functional test on the client side can also cover integration with browsers if the test framework can execute the tests in multiple browsers; otherwise the lowest common denominator might be adequate (think IE 6).

frameworks to use


Almost every decent language has a unit testing framework which provides a standard way of defining, executing, and reporting tests. Some, like Django's, also handle fixtures, such as creating and deleting a database for each test run. This is very helpful.
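As an aside, the fixture idea can be sketched with nothing but the standard library. This is a minimal illustration (plain unittest and sqlite3, not Django's actual machinery) of what a runner-managed fixture does: every test gets a fresh database and leaves nothing behind.

```python
import os
import sqlite3
import tempfile
import unittest

class FixtureManagedTest(unittest.TestCase):
    def setUp(self):
        # a fresh database file is created before every single test
        self.db_path = os.path.join(tempfile.mkdtemp(), "test.db")
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT)")

    def tearDown(self):
        # and destroyed afterwards, so no state leaks between tests
        self.conn.close()
        os.remove(self.db_path)

    def test_insert(self):
        self.conn.execute("INSERT INTO entity (name) VALUES ('test video')")
        rows = self.conn.execute("SELECT name FROM entity").fetchall()
        self.assertEqual(rows, [("test video",)])
```

Django does the equivalent for you at the database level, which is exactly why its test runner feels so frictionless.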

For functional tests in the browser engine, there are numerous frameworks. Some record a macro and repeat it; these tools are to be avoided, since they make maintenance of the tests very cumbersome, which may lead to neglect in the project. Instead of discussing the pros and cons of various frameworks like LoadRunner, Selenium, etc., I'm simply going for Canoo WebTest, which aggregates HtmlUnit and several other third-party frameworks into one very good framework.

It has the advantage that tests are defined in XML, which ought to be readable by most people with at least some technical interest. Think testers, new developers, and maybe even project managers. It also has a very good reporting tool that generates static HTML files. I'll give an example of a webtest further down.

implementing and maintaining the tests


When developing test-driven, it is a given that at least unit tests are being made. In most IDEs for Java developers, JUnit tests are easily created and run (though sometimes it can be tricky to set up a full test suite running all tests). In Django you simply write a tests.py module at the root of your application*1. The management tool in Django can then invoke all tests by simply running 'manage.py test' at the root of the project. In Java you have to marshal all the tests in an ANT script or reference all test classes from one test suite, leading to tedious work that puts the tests in jeopardy of being neglected.

A Canoo WebTest is run by an ANT script. I always use one script, simply called 'allTests.xml'. Separating tests should not be the norm, as it is with JUnit; instead you should have to put some effort into excluding certain tests. The exclusions might be performance tests or some other test that is not a candidate for regression testing.

In WebTest you may define several macros and reuse them as you like, and it has all the functionality of ANT. In the next section I give an example of a test implementation.

testing upload and file conversion


This feature of my project involves multiple units on the server side and only one on the client side. Because the server-side features rely almost exclusively on the Django framework, there is no need to cover the framework itself, so I can rely on one simple unit test. It takes a file and converts it, which is the only part I have implemented myself:

def test_convert_video(self):
    entity = Entity(name="test video")
    entity.save()
    physical = Physical(content_object=entity)

    convert_to_flv(self.avifilename, self.flvfilename, physical)
    result = os.path.exists(self.flvfilename)
    self.assertTrue(result)
    self.assertEqual(self.flvfilename, physical.converted_filename)
    os.remove(self.flvfilename)

There are several more features surrounding this one, like test_grab_thumbnail_from_flv, test_invalid_cleanup_convert_video, test_get_default_thumbnail_filename and many more. This part has to have high coverage because it is a core function of the web application, but I will not go into priority vs. coverage in this entry.

The unit under test in the example is convert_to_flv. When the test executes, a fake database is set up and some test doubles are made. The files used in the test are cleaned up afterwards to ensure that the state of the system is not changed as a result of the test.
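The double-based setup can be sketched like this. Note that convert_to_flv here is a hypothetical reconstruction for illustration, not the project's actual implementation, and the injected encoder callback is my own invention to make the stubbing visible:

```python
import os
import tempfile
import unittest
from unittest import mock

# Hypothetical stand-in for a converter that would normally shell out to an
# external encoder (ffmpeg or similar); the external call is what we stub.
def convert_to_flv(src, dst, run_encoder):
    run_encoder(src, dst)  # the slow external step
    if not os.path.exists(dst):
        raise RuntimeError("conversion failed")
    return dst

class ConvertTest(unittest.TestCase):
    def test_convert_uses_double_and_cleans_up(self):
        dst = os.path.join(tempfile.mkdtemp(), "out.flv")
        # test double: fakes the encoder by simply creating the target file
        fake_encoder = mock.Mock(side_effect=lambda s, d: open(d, "w").close())
        try:
            self.assertEqual(convert_to_flv("in.avi", dst, fake_encoder), dst)
            fake_encoder.assert_called_once_with("in.avi", dst)
        finally:
            os.remove(dst)  # restore system state, as in the original test
```

The point is the same as in the real test: the expensive dependency is replaced, and the filesystem is left untouched afterwards.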

On the client side I have this webtest:

<target name="test">
    <webtest name="add different kinds of content">
        <login username="user" password="pass"/>
        <addContent name="picture 1" file="../../JOOLLU_webtest/tests/picture1.png"/>
        <addContent name="picture 2" file="../../JOOLLU_webtest/tests/babe.jpg"/>
        <addContent name="movie 1" file="../../JOOLLU_webtest/tests/plane_lands.avi"/>
        <addContent name="movie 2" file="../../JOOLLU_webtest/tests/preview.mov"/>
        <addContent name="" file="../../JOOLLU_webtest/tests/preview.mov"/>
        <addContent name="plain flv" file="../../JOOLLU_webtest/tests/3.flv"/>
    </webtest>
</target>

It covers the feature of uploading files. Other tests rely on this one being executed first, which leads to test dependencies; that is totally fine. If it were a requirement not to have test dependencies, the tests would be hard to maintain, since every test relying on this one would have to upload content itself, putting those tests at risk of being neglected.

Inspect the test. The tag 'addContent' is a macro I have defined as part of my test project, and it looks like this:

<macrodef name="addContent" description="add some content to the current user profile">
    <attribute name="name"/>
    <attribute name="file"/>

    <sequential>
        <clickLink label="add content"/>
        <setInputField name="name" value="@{name}"/>
        <setFileField name="file" fileName="@{file}"/>
        <clickButton htmlId="add_button"/>
    </sequential>
</macrodef>

The power of this is that I can encapsulate the code that implements the tests separately, and thus make the actual test easier to read and maintain. A tester with no programming background would be able to read the test and change the name of the file and the file itself.

Among other tests that rely on this I have contentValidation.xml, invalidContentValidation.xml, verifyCorrectContent.xml, and many others. The nice thing is that each subsequent test can be a direct verification of the feature or requirement. This way I can ensure that no test effort is lost because the implementation on the server has changed.

closing remarks


Testing this way allows for full coverage of a web application. The WebTest framework can handle almost all browser functionality, even Ajax, which is due to the HtmlUnit project. It has a couple of bugs; it can, for example, only drag and drop one item per test. But these are minor issues. If my application did not contain Flash as part of the implementation, I reckon I could achieve 100% coverage in automatic tests. As it is now I don't have many tests of performance requirements, but I think I'm at 80% coverage in total with unit tests and webtests.

To get high automatic coverage you need to figure out on which side you have to test your feature: client or server. Then implement the unit tests for the server side and the client-side tests in a framework like WebTest, and then you can implement the feature itself, knowing that when the bar in the reporting tool reaches green, you are finished.

links

canoo webtest

Django testing

footnotes
*1: In Django there are multiple applications per project. An application in Django is the equivalent of a plugin in Eclipse RCP or a Web Application Archive (WAR) in J2EE.

Tuesday, December 4, 2007

Development with Django and NADB

Python has always intrigued me. I have occasionally revisited the language to marvel at its simplicity and ease of use. Until recently I did not have a reason to go further with it.

I started implementing the project Joollu (coming soon: joollu).

I decided to use the Django framework, because a) it's done in Python and b) I heard it was good for content-driven websites.

And boy, is it intuitive. Being a novice Python programmer, I found it astonishing that I could implement my own video blog in roughly 10 hours. The documentation is equally astounding.

The Django framework has an object-relational mapper that handles all your database management for you; it's a very simple way of making a model layer. You only need to configure the things that are specific to you, making it very rapid for simple features. I thought it might be a bigger task to implement the NADB, which is based only on interfaces, but then I discovered the notion of generics implemented in Django.

Generics are used to associate a model object with any other object, without knowing anything about its implementation. This is exactly what the NADB is all about. Instead of having a Video model class, I implemented a Physical class (for storing files), a Viewable class (for attributes like thumbnail, width/height, etc.), and, first of all, an Entity class to represent the actual video; its only attributes are name and id. None of these models know anything about the others on the DB layer. Instead I have a MediaManager that can marshal a video from all its referred interfaces. Retrieving a video can be done like so:


>>> from joollu.core.models import *
>>> entity = Entity.objects.get(name="myvideo")
>>> physical = Physical.objects.get(content_object=entity)
>>> viewable = Viewable.objects.get(content_object=entity)



Or, more simply, with a little wrapping:



>>> from joollu.core.managers import MediaManager
>>> entity = Entity.objects.get(name="myvideo")
>>> video = MediaManager().filter_video(name="myvideo")


Or:


>>> from joollu.core.managers import MediaManager
>>> entity = Entity.objects.get(name="myvideo")
>>> video = MediaManager().get_media(entity.id)


Sadly, it somewhat impairs the possibilities of filtering and retrieving, which is otherwise done very easily with Django. Here is an example for Physical, if I want particular files:


>>> from joollu.core.models import *
>>> physical_list = Physical.objects.filter(file__endswith=".png")
>>> physical_list
[Physical: /Users/kennethnielsen/pythonspace/joollu/media/content/picture1.png]
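Stepping back, the generic association itself can be sketched in plain Python (no Django; the registry below stands in for Django's ContentType lookup, and all names are illustrative): Physical stores only a (type name, id) pair, never a reference to a concrete Video class.

```python
import itertools

_ids = itertools.count(1)
_registry = {}  # (type name, id) -> object, a toy ContentType lookup

class Entity:
    def __init__(self, name):
        self.id, self.name = next(_ids), name
        _registry[("Entity", self.id)] = self

class Physical:
    def __init__(self, content_object, filename=None):
        # only a (type, id) pair is stored; Physical knows nothing
        # about the target's class or implementation
        self.content_type = type(content_object).__name__
        self.object_id = content_object.id
        self.filename = filename

entity = Entity(name="myvideo")
physical = Physical(content_object=entity, filename="myvideo.avi")
# resolve the generic reference back to the entity
target = _registry[(physical.content_type, physical.object_id)]
```

Django's generic relations do essentially this, with the (content type, object id) pair persisted as two database columns.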


Keeping the notions of things like video and images in the business layer increases flexibility. If I want to change attributes, I can slap them onto one of the existing interfaces or make a new one. For example, if I want to save who published something and when, I can make a new model class Publishable and attach it to all content, regardless of whether the content is a video, an image, or something else. It is up to the business layer to decide whether published attributes are needed in specific cases. Comments are another example, and so it goes.
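A hypothetical sketch of what such an add-on could look like (plain Python with illustrative names; the real thing would be a Django model with a generic foreign key):

```python
import datetime

# A new "interface" bolted onto existing content without touching
# Entity, Physical, or Viewable. All names here are my own invention.
class Publishable:
    def __init__(self, content_type, object_id, published_by):
        self.content_type = content_type  # generic reference: any model
        self.object_id = object_id
        self.published_by = published_by
        self.published_at = datetime.datetime.now()

# works for a video, an image, a comment -- the business layer decides
record = Publishable("Entity", 42, published_by="kenneth")
```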

There is a ton of other stuff worth mentioning about Django; it all just fits together in this very intuitive framework. You should read up on it, or find the tech talk about it on Google Tech Talks.

Outsourcing

With all the J2EE projects being outsourced to India, I guess this zeitgeist graph makes sense:

Google searches for 'What is j2ee'

Look at the regions...

Wednesday, October 31, 2007

No Artifacts DataBase (NADB)

prologue


Fueled by the Tech Talk 'Everything is Miscellaneous' and the book about abstract data modelling I read lately, I think it's time I give it some action...

Databases today inherently have trouble adapting to changes in the business. Nowadays business logic changes rapidly because of the increasing demand for adaptive business and the use of IT in general.

Our databases are not up to speed, though. Designing a database often takes too much time and is too great a risk to mess up, so you start off by designing it, and then base your software on it.

With rapid development, it ought to be the other way around. The business dictates what needs to be in the database, and the business changes often. Databases should be just as adaptable.

Avoid artifacts in the database



Data are not static! The database of a typical company contains a product, order and orderline.

Looking at the product table, what constitutes a Product? That you say it's a product? No: it has a price and can be bought, and therefore it is a product.

If you are a paint store, your product table could contain the columns: color, viscosity, manufacturer, price. If the paint store at some point wishes to extend its catalogue to, perhaps, lamps, the table would be extended and we would have a table with the columns: color, viscosity, manufacturer, bulbtype, and price.

This would leave some of the rows with null values, since a lamp doesn't have viscosity (okay, bad example... you get the idea). Maybe you would fill out viscosity anyway, or do something awful like using the column for something else.

In a big company I worked for, I actually saw a table called 'Ship' which also contained automobiles, because when automobiles became accepted as a business object some time in the past, the table happened to be involved in some of the same systems and had a property like 'Motor registration number'. Of course, some columns in the table were misused along the way for something completely different than intended. Actually, it wasn't really clear anymore what entities were saved there, but I think caravans were in there too.

Taking the example of mixing 'Paint' and 'Lamps', let's look at their individual properties:

    • Paint

      viscosity

      color

    • Lamp

      bulbtype

    • Product

      price

Notice that I don't go all out here and create a 'Fluid' table for viscosity and a 'Coloured' table for the color. I have also left out the property manufacturer, because it becomes a table 'Manufactured'. A product could be a service, which is not manufactured.

So you're probably asking: what handles the relations between all these properties? Well, the product should not relate to either 'Paint' or 'Lamp', since it should not define what a product is. Instead the business handles this by instantiating entities in the database with an abstract table 'Instance'. The abstract modelling I have come up with is shown below. The relations are read as "Instance implements Type" and "Type describes Instance". It is like the modelling we know from object-oriented languages, with the key difference that it only describes state, because we are dealing with a persistence mechanism, which should only handle state. Behaviour is described by the business (read: the application on top of the DB).


You might be thinking: what about metadata, like logging of change of state and other valuable business info? These are very relevant issues, and such a log should be an instance in its own right. It would have properties like date and user, where user is a property that references an instance. So the property user would have a domain allowing it to be an instance, and the value would be a reference to an actual instance of a user. This is the way to do metadata in the model. If you were to refer to a user by a UserId, you would lock the table into being a user-logged event; by allowing it to be a reference to an arbitrary instance, you can define it as a user-logged event in your business by checking the type of the referred instance.

Note that this is not a classical inheritance hierarchy where types can 'be of' other types; the 'Type' here is equal to an interface (as implemented in Java, at least). Some of Java's creators have since expressed regret about class inheritance as we know it today, because of the classical problem of the fragile superclass. The superclass is fragile because its children are inherently (pun intended) dependent on the implementation of the superclass. When several children exist, the purpose of the superclass becomes more obscured as the business continues to grow and different requirements are put on the children, as in the case of a table expanding horizontally. Instead, the children should implement different interfaces, dividing the different requirements into several descriptors. The implementation of the interfaces can then be delegated. The big difference here is that state is not inherited.

In the case of paint being sold, it would be an instance of Paint AND Price, collectively becoming what the business would describe as paint we are selling.

The 'Instance' does not contain any properties itself, because the 'Instance' is the sum of its relations.
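One minimal way to sketch this idea in SQL (all table and column names below are my own guesses, not a prescribed schema): instances get no columns of their own, and what an instance "is" lives entirely in its relations to types and property values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instance   (id INTEGER PRIMARY KEY);
CREATE TABLE type       (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE implements (instance_id INTEGER, type_id INTEGER);
CREATE TABLE property   (instance_id INTEGER, name TEXT, value TEXT);
""")

# 'paint we are selling' = an instance implementing both Paint and Product
conn.execute("INSERT INTO instance (id) VALUES (1)")
conn.executemany("INSERT INTO type (id, name) VALUES (?, ?)",
                 [(1, "Paint"), (2, "Product")])
conn.executemany("INSERT INTO implements VALUES (1, ?)", [(1,), (2,)])
conn.executemany("INSERT INTO property VALUES (1, ?, ?)",
                 [("viscosity", "12"), ("color", "red"), ("price", "9.95")])

# adding Lamp later needs no ALTER TABLE -- just new rows
types = [r[0] for r in conn.execute(
    "SELECT t.name FROM type t JOIN implements i ON t.id = i.type_id "
    "WHERE i.instance_id = 1 ORDER BY t.name")]
```

The trade-off, as noted above, is that the business layer on top must enforce which property combinations are meaningful.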

Conclusion


The relational databases of today are implemented too statically. By abstracting the data, we can much more easily build a clean database without ripping out tables. To change the properties of an instance, you only need to relate it to a different property, not perform heavy, frightening, irreversible ALTER TABLE statements, etc.

The model of course requires some robust business logic on top, but from my recent work with Hibernate and Linq it's apparent that the traditional business logic never really worked. Constraints and validation of data have moved more towards the UI, and the business layer has become a dumb DAO; hopefully this approach will revitalize the layer and bring more flexibility into our backend.

From here I have to prove my theory by actually implementing a system for it and describe how intuitive and flexible my backend became :)

Stating that something is within a particular domain by putting the instance in a table with a certain name does not make it so. It is as absurd as arguing whether Pluto is a planet or not, because it doesn't fit neatly into either category. Pluto is just an instance with a different set of properties... "Everything is miscellaneous".

Wednesday, October 17, 2007

BEA WebLogic tips

J2EE shared libraries are very useful. Because they are merged into your web application, you can reuse web resources like JSPs, special framework pieces like BEA controllers, and so on.

Here's how you might do it:

creating a shared J2EE library


A library can be any J2EE module. Mind you, I've mostly used Portal Web projects as the base for my libraries, since I found that I often have web resources in my libraries.

defining a shared J2EE library


1. go to the META-INF folder
2. open the MANIFEST.MF file
3. add the following three lines, filling in the values (example given further down): Extension-Name, Specification-Version, Implementation-Version

Manifest-Version: 1.0
Extension-Name: common-portlet-template-web
Specification-Version: 0.1
Implementation-Version: 0.1

NOTE: the versions MUST be given in order to import the library into BEA Workshop 10.0

exporting a shared J2EE library from a Web project


1. update the versions as described in 'defining a shared J2EE library'
2. right-click the project, choose 'export->WAR file'

exporting a shared J2EE library from a J2EE Utility project


1. update the versions as described in 'defining a shared J2EE library'
2. right-click the project, choose 'export->Export', select JAR file, click 'Next'
3. choose your source folders and the META-INF folder for export, click 'Finish'

importing a shared J2EE library into BEA Workshop 10.0


1. in window->preferences->weblogic->J2EE libraries, click 'add'
2. click 'browse', locate your exported WAR, JAR, or EAR
3. click 'open', click 'ok'
4. to add it to your project, expand the 'WebLogic Deployment Descriptor' in your project
5. right-click -> add, 'browse', locate it and click 'ok'

redeploying a shared J2EE library from the Workshop


1. in window->preferences->weblogic->J2EE libraries, select the library and click 'remove'
2. open the server overview by double-clicking the server in the 'Servers' view
3. in the published module list, select the EAR project and click 'Undeploy'
4. after the EAR has finished undeploying, select the shared library module and click 'Undeploy'
5. now you are ready to deploy the library again; do this by going to window->preferences->weblogic->J2EE libraries
6. click 'add', then 'browse', navigate to your shared library, select it, click 'ok'
7. redeploy the library by running your project on the server.

Thursday, October 11, 2007

Slow development cycles? Try something old-school.

Does this scenario sound familiar?

"Quality Assurance just discovered a logical error in your presentation of the customer's product: instead of the 'Two-month no credit limit' product, the 'Three-month no credit limit' product emerges.
Thinking you have the answer, you remember the pesky ProductNameProvider you wrote, and you guess it's an off-by-one error. You seem to remember that the first two OR three records in its output are just static titles. You can't remember whether it was two or three, so you try incrementing your reference index by 1. You then restart the server, deploy the application, and wait patiently for a result...

After replicating the error in 42 easy steps, you conclude this wasn't the right fix; now it's showing a completely wrong 'One-month no credit limit' product.

You suddenly realize that this could be an error in the actual product request, or maybe the text formatting, or maybe it's the caching?
So you try every possible path for the error and try out 3-4 different scenarios for 3-4 different paths, only repeating the 42 easy steps... again... and again..."

Such slow bug-fixing processes are caused by ONE thing: lack of unit tests and proper structure. This cycle of fixing said bug should have an obvious solution. A unit test somewhere should verify the output of ProductNameProvider. Having an automatic test framework that supports stubbing (maybe mocking) would allow you to test this functionality exclusively, and not the entire application at once, which is too complex for any human to handle... well, some applications are, especially if, like me, you only handle applications for shorter periods (read: 3-6 months).
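To make that concrete, here is a sketch of such a test, in Python for brevity (the post's context is Java/JUnit); ProductNameProvider and its record source are hypothetical reconstructions of the scenario, not code from any actual project:

```python
import unittest

class ProductNameProvider:
    STATIC_TITLE_COUNT = 2  # the detail the whole bug hinged on

    def __init__(self, fetch_records):
        self._fetch = fetch_records  # injected, so a stub can replace it

    def product_name(self, index):
        # skip the leading static titles before indexing real products
        return self._fetch()[self.STATIC_TITLE_COUNT + index]

class ProductNameProviderTest(unittest.TestCase):
    def test_skips_static_titles(self):
        # stub: no server, no deployment, runs in milliseconds
        stub = lambda: ["Header", "Subheader",
                        "Two-month no credit limit",
                        "Three-month no credit limit"]
        provider = ProductNameProvider(stub)
        self.assertEqual(provider.product_name(0),
                         "Two-month no credit limit")
```

A failing run of this test pinpoints the off-by-one immediately, with no restart-deploy-replicate cycle.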

I have just begun working on an assignment at a large Danish company. Their applications are mainly J2EE, with some heavy junk in the trunk.
My task there involves developing portlets in WebLogic 10. This is a nice platform for portals because of its intuitive CMS and flexibility in layout. But since it's a J2EE platform, it's inherently slow, because of lots of configuration and stuff under the hood that does lots of... well... stuff.

To test an application, developers tend to restart the server, because we are neurotically inclined to think that an incorrect state of the configuration involved is the main cause of our bugs, and it's understandable. Although 99% of the time we just fucked up, we do restart the server about 5 times before admitting it.

NONE of my tasks will ever involve extending the CMS or directly interfacing with the CMS in ANY manner, so why would I EVER need to start this up when doing a simple two-page portlet, or a complex seventy-page one for that matter? The case is: I don't, and I certainly won't. So my advice is to avoid letting the running of your code require a running server. A common mistake is to run code right there in your main methods, for example in a Struts action, a Beehive controller, or whatever. Delegate, delegate, delegate all responsibility to classes with limited responsibility: at least one class that doesn't require you to load a shitload of code or annotated compiles.

Even a simple ProductNameProvider will eventually be executed at least 10 times in its development stage and at least 10 times (maybe hundreds) in its lifetime for debugging purposes, so let's do the math.

total runs = 20
writing the ProductNameProvider in a unit-test-friendly environment = 15 min (one-time cost)
restarting a shitload of applications on a J2EE server + replicating the bug = 2 min per run (at least)
running a single unit test = 1 sec per run (at most)

15 x 60 (unit test creation) + 20 x 1 (unit test runs) = 920 seconds
20 x 2 x 60 (server restarts) = 2400 seconds
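Spelled out as a trivial sanity check of the arithmetic above:

```python
# The back-of-the-envelope comparison, in seconds.
RUNS = 20
UNITTEST_CREATION = 15 * 60  # one-time: writing the test-friendly code (15 min)
UNITTEST_RUN = 1             # per run (1 sec)
SERVER_CYCLE = 2 * 60        # per run: restart server + replicate bug (2 min)

with_unit_tests = UNITTEST_CREATION + RUNS * UNITTEST_RUN  # 900 + 20
without_unit_tests = RUNS * SERVER_CYCLE                   # 20 * 120
print(with_unit_tests, without_unit_tests)  # prints: 920 2400
```

And the gap only widens with every extra debugging run, since the creation cost is paid once.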

I am not counting maintenance of unit tests; that runs parallel to development and should not require extra time, unless you implement a completely new feature.

I'm not even getting into how unit testing reduces bugs, facilitates refactoring, reduces the footprint of your code, etc. There are tons of documented experiences of that out there. Read it. Read it all... now!

Running/testing your code should be as easy as ALT + SHIFT + X, T