Sunday, 23 December 2007

High automatic test coverage of web applications

Tests in web applications

Features in a web application reside in one of two locations: On the client or on the server.

A unit test of a feature in the server-side codebase can catch several kinds of errors. If the application is written in PHP, Python or any other language that is compiled at runtime, the unit test can catch compile errors that would otherwise only show up when the code is executed. It can also catch logical errors and failures to meet the requirements.

A functional test on the client side can also serve as an integration test against browsers, provided the test framework can execute the tests in multiple browsers. Otherwise, testing against the lowest common denominator might be adequate (think IE 6).

frameworks to use

Almost every decent language has a unit testing framework that provides a standard way of defining, executing and reporting tests. Some, like Django's, also handle fixtures, such as creating and deleting a database for each test run. This is very helpful.
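To make the fixture idea concrete, here is a minimal sketch using Python's built-in unittest and sqlite3 modules rather than Django's test runner. The table and the class name are made up for illustration; the point is only that every test gets a fresh database and that it is thrown away afterwards.

```python
import sqlite3
import unittest


class VideoFixtureTest(unittest.TestCase):
    """Each test gets its own in-memory database (the fixture)."""

    def setUp(self):
        # Hypothetical schema, created from scratch before every test.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE video (id INTEGER PRIMARY KEY, name TEXT)")
        self.db.execute("INSERT INTO video (name) VALUES ('test video')")

    def tearDown(self):
        # The database is discarded after every test.
        self.db.close()

    def test_delete_video(self):
        self.db.execute("DELETE FROM video")
        count = self.db.execute("SELECT COUNT(*) FROM video").fetchone()[0]
        self.assertEqual(count, 0)

    def test_video_is_stored(self):
        # Still sees the row, even though the other test deleted it.
        row = self.db.execute("SELECT name FROM video").fetchone()
        self.assertEqual(row[0], "test video")
```

Saved in a module, this can be run with `python -m unittest`. Django's own test runner does the equivalent work for the project's real database settings.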

For functional tests in the browser engine, there are numerous frameworks. Some record a macro and replay it. These tools are to be avoided, since they make maintenance of the tests very cumbersome, which may lead to the tests being neglected in the project. Instead of discussing the pros and cons of various frameworks, like LoadRunner, Selenium etc., I'm simply going for Canoo's WebTest, which aggregates HtmlUnit and several other third-party frameworks into one very good framework.

It has the advantage that tests are defined in XML, which ought to be readable by most people with at least some technical interest. Think testers, new developers and maybe even project managers. It also has a very good reporting tool that generates static HTML files. I'll give an example of a webtest further down.

implementing and maintaining the tests

When developing test-driven, it is a given that at least unit tests are being written. In most IDEs for Java developers, JUnit tests are easily created and run (though it can sometimes be tricky to set up a full test suite that runs all tests). In Django you simply write a tests.py module at the root of your application*1. Django's management tool can then invoke all tests by simply running 'manage.py test' at the root of the project. In Java you have to marshal all the tests in an ANT script or reference all test classes from one test suite, which is tedious work that puts the tests in jeopardy of being neglected.

A Canoo WebTest is run by an ANT script. The one script I always use is simply called 'allTests.xml'. Separating tests should not be the norm, as it is with JUnit. Instead, every test should run by default, and you should have to put some effort into excluding certain tests. The exclusions might be performance tests or other tests that are not candidates for regression testing.

In WebTest you can define macros and reuse them wherever you like, and it has all the functionality of ANT. In the next section I give an example of a test implementation.
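The same "everything runs unless explicitly excluded" idea can be sketched outside of ANT. The following uses Python's unittest purely as an illustration; it is not how WebTest or allTests.xml works, and the test classes and the EXCLUDED set are hypothetical placeholders.

```python
import unittest


class UploadTest(unittest.TestCase):
    def test_upload(self):
        self.assertTrue(True)  # placeholder body


class PerformanceTest(unittest.TestCase):
    def test_many_uploads(self):
        self.assertTrue(True)  # placeholder body


# Hypothetical exclusion list: anything not named here runs by default.
EXCLUDED = frozenset({"PerformanceTest"})


def all_tests(test_cases, excluded=EXCLUDED):
    """Build one suite from every test case except the excluded ones."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    for case in test_cases:
        if case.__name__ not in excluded:
            suite.addTests(loader.loadTestsFromTestCase(case))
    return suite


suite = all_tests([UploadTest, PerformanceTest])
```

Adding a new test case costs nothing, while leaving one out of the regression run requires a deliberate entry in the exclusion list.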

testing upload and file conversion

This feature of my project involves multiple units on the server side and only one on the client side. Because most of the server-side features rely almost exclusively on the Django framework, there is no need to cover them. I can rely on one simple unit test for the only part I have implemented myself, which takes a file and converts it:

def test_convert_video(self):
    entity = Entity(name="test video")
    physical = Physical(content_object=entity)

    convert_to_flv(self.avifilename, self.flvfilename, physical)
    # The converted file must exist on disk and be recorded on the model.
    self.assertTrue(os.path.exists(self.flvfilename))
    self.assertEqual(self.flvfilename, physical.converted_filename)

There are several more tests surrounding this one, like test_grab_thumbnail_from_flv, test_invalid_cleanup_convert_video, test_get_default_thumbnail_filename and many more. This part has to have high coverage because it is a core function of the web application, but I will not go into priority vs. coverage in this entry.

The unit under test in the example is convert_to_flv. When the test executes, a fake database is set up and some test doubles are created. The files used in the test are cleaned up afterwards to ensure that the state of the system is not changed as a result of the test.
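The setup and cleanup described above could look roughly like the following sketch. It is not the project's actual test code: FakePhysical and fake_convert_to_flv are hypothetical stand-ins (the fake converter just copies the file instead of transcoding it), and the scratch directory is an assumption about how the files are isolated.

```python
import os
import shutil
import tempfile
import unittest


class FakePhysical:
    """Test double standing in for the Physical model (hypothetical)."""
    converted_filename = None


def fake_convert_to_flv(source, target, physical):
    """Stand-in converter: copies the file instead of transcoding it."""
    shutil.copyfile(source, target)
    physical.converted_filename = target


class ConvertVideoTest(unittest.TestCase):
    def setUp(self):
        # Work in a scratch directory so the real media folder is untouched.
        self.workdir = tempfile.mkdtemp()
        self.avifilename = os.path.join(self.workdir, "in.avi")
        self.flvfilename = os.path.join(self.workdir, "out.flv")
        with open(self.avifilename, "wb") as handle:
            handle.write(b"not a real video")

    def tearDown(self):
        # Remove everything the test created, whether it passed or failed.
        shutil.rmtree(self.workdir)

    def test_convert_video(self):
        physical = FakePhysical()
        fake_convert_to_flv(self.avifilename, self.flvfilename, physical)
        self.assertTrue(os.path.exists(self.flvfilename))
        self.assertEqual(self.flvfilename, physical.converted_filename)
```

Because tearDown runs after every test, a failing assertion does not leave files behind for the next run.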

On the client side I have this webtest:

<target name="test">
  <webtest name="add different kinds of content">
    <login username="user" password="pass"/>
    <addContent name="picture 1" file="../../JOOLLU_webtest/tests/picture1.png"/>
    <addContent name="picture 2" file="../../JOOLLU_webtest/tests/babe.jpg"/>
    <addContent name="movie 1" file="../../JOOLLU_webtest/tests/plane_lands.avi"/>
    <addContent name="movie 2" file="../../JOOLLU_webtest/tests/preview.mov"/>
    <addContent name="" file="../../JOOLLU_webtest/tests/preview.mov"/>
    <addContent name="plain flv" file="../../JOOLLU_webtest/tests/3.flv"/>
  </webtest>
</target>

It covers the feature of uploading files. Other tests rely on this one having been executed, which leads to test dependencies, and that is totally fine. If it were a requirement not to have test dependencies, the tests would be hard to maintain, since every test relying on this one would have to upload content itself, putting the tests at risk of being neglected.

Inspect the test. The tag 'addContent' is one macro I have defined as a part of my test project, and it looks like this:

<macrodef name="addContent" description="add some content to the current user profile">
  <attribute name="name"/>
  <attribute name="file"/>

  <sequential>
    <clickLink label="add content"/>
    <setInputField name="name" value="@{name}"/>
    <setFileField name="file" fileName="@{file}"/>
    <clickButton htmlId="add_button"/>
  </sequential>
</macrodef>

The power of this is that I can encapsulate code that is part of implementing the tests separately and thus make the actual test easier to read and maintain. A tester with no programming background would be able to read the test and change the name of the file and the file itself.
Among other tests that rely on this I have contentValidation.xml, invalidContentValidation.xml, verifyCorrectContent.xml and many others. The nice thing is that each subsequent test can be a direct verification of the feature or requirement. This way I can ensure that no test effort is lost because the implementation on the server has changed.

closing remarks

Testing this way allows for full coverage of a web application. The WebTest framework can handle almost all browser functionality, even Ajax, which is thanks to the HtmlUnit project. It has a couple of bugs; for example, it can only drag and drop one item per test. But these are minor issues. If my application did not contain Flash as part of the implementation, I reckon I could achieve 100% coverage with automatic tests. As it is now, I don't have many tests of the performance requirements, but I think I'm at 80% coverage in total with unit tests and WebTest.

To get high automatic coverage, you need to figure out on which side you have to test your feature: client or server. Then implement unit tests for the server side and client-side tests in a framework like WebTest. After that you can implement the feature itself, knowing that when the bar in the reporting tool turns green, you are finished.


canoo webtest

Django testing

*1 In Django there are multiple applications per project. An application in Django is the equivalent of a plugin in Eclipse RCP or a web application archive (WAR) in J2EE.

Tuesday, 4 December 2007

Development with Django and NADB

Python has always intrigued me. I have kept revisiting the language from time to time, only to marvel at its simplicity and ease of use. Until recently I did not have a reason to go further with it.

I started implementing the project Joollu (coming soon).

I decided to use the Django framework because a) it's written in Python and b) I heard it was good for content-driven websites.

And boy, it's intuitive. As a novice Python programmer, I found it astonishing that I could implement my own video blog in less than about 10 hours. The documentation is equally astounding.

The Django framework has an object-relational mapper that handles all your database management for you. It's a very simple way of making a model layer. You only need to configure the things that are specific to you, which makes it very quick to build simple features. I thought it might be a bigger task to implement the NADB, which is based only on interfaces, but then I discovered the notion of generics implemented in Django.

Generics are used to associate a model object with any other object without knowing anything about its implementation. This is exactly what the NADB is all about. Instead of having a Video model class, I implemented a Physical class (for storing files), a Viewable class (for attributes like thumbnail, width/height etc.) and, first of all, an Entity class to represent the actual video; its only attributes are name and id. None of these models know anything about the others at the DB layer. Instead I have a MediaManager that can marshal a video from all the interfaces that refer to it. Retrieving a video can be done like so:

>>> from joollu.core.models import *
>>> entity = Entity.objects.get(name="myvideo")
>>> physical = Physical.objects.get(content_object=entity)
>>> viewable = Viewable.objects.get(content_object=entity)

Or, more simply, with a little wrapping:

>>> from joollu.core.managers import MediaManager
>>> entity = Entity.objects.get(name="myvideo")
>>> video = MediaManager().filter_video(name="myvideo")


>>> from joollu.core.managers import MediaManager
>>> entity = Entity.objects.get(name="myvideo")
>>> video = MediaManager().get_media(entity.id)

Sadly, it kind of impairs the possibilities for filtering and retrieving, which Django otherwise makes very easy. Here is an example for Physical, if I want particular files:

>>> from joollu.core.models import *
>>> physical_list = Physical.objects.filter(file__endswith=".png")
>>> physical_list
[Physical: /Users/kennethnielsen/pythonspace/joollu/media/content/picture1.png]

Keeping notions like video and image in the business layer increases flexibility. If I need to change attributes, I can slap them onto one of the existing interfaces or make a new one. For example, if I want to save who published something and when, I can make a new model class Publishable and attach it to all content, regardless of whether it is a video, an image or something else. It is up to the business layer to decide whether the published attributes are needed in specific cases. Comments are another example, and so it goes.

There is a ton of other stuff worth mentioning about Django. It all just kind of fits together in this very intuitive framework. You should read up on it, or find the talk about it on Google Tech Talks.
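As a rough illustration of that idea, here is a plain-Python sketch with no Django involved. The class names come from the text above, but every field, the constructor of MediaManager and the dictionary it returns are assumptions made for the example, not the real Joollu code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entity:
    id: int
    name: str


@dataclass
class Physical:
    content_object: Entity
    file: str  # hypothetical field


@dataclass
class Viewable:
    content_object: Entity
    thumbnail: str  # hypothetical fields
    width: int
    height: int


@dataclass
class Publishable:
    content_object: Entity
    published_by: str  # hypothetical field


class MediaManager:
    """Business layer: assembles a media item from whatever facets exist."""

    def __init__(self, facets: List[object]):
        self.facets = facets

    def get_media(self, entity_id: int) -> Dict[str, object]:
        # Collect every facet that points at the entity, keyed by class name.
        media: Dict[str, object] = {}
        for facet in self.facets:
            if facet.content_object.id == entity_id:
                media[type(facet).__name__.lower()] = facet
        return media


entity = Entity(id=1, name="myvideo")
manager = MediaManager([
    Physical(entity, "content/myvideo.flv"),
    Viewable(entity, "content/myvideo.png", 320, 240),
    Publishable(entity, "some user"),
])
video = manager.get_media(entity.id)
```

Attaching Publishable required no change to Entity, Physical or Viewable; only the business layer decides which facets it cares about.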


With all the J2EE projects being outsourced to India, I guess this zeitgeist graph makes sense:

Google searches for 'What is j2ee'

Look at the regions...