Gunther Popp´s Blog: September 2010

Tuesday, September 21, 2010

OpenProj - Not bad, but ...

I am not an expert in project planning tools at all. But if a customer asks me to take the role of a technical lead or to coach another lead in a project, I need a tool like MS Project. A lot of people use Excel instead, but over the years I got used to MS Project and now I really like it.

A technical lead typically has at least some planning tasks to fulfill. He must estimate the effort of the upcoming tasks, put them on a timeline and assign the members of his team to the tasks. I am using two different kinds of tools to get this part of the job done as easily and efficiently as possible: a task management system like Redmine or Jira, and a project planning tool, which up to now typically was MS Project.

In MS Project I do the time and resource planning. For the actual assignment of the tasks and the tracking of the progress I use the task management system. One problem with this approach is that some customers won't pay for an MS Project license (yes, I could buy one myself, of course, but often I am not even allowed to bring my own laptop). When I recently talked with a colleague about this problem, he pointed me to OpenProj. OpenProj is an open source project planning tool written in Java. Its GUI and basic feature set are very similar to MS Project.

I did some quick tests to check if the features I use most in MS Project work in OpenProj, too. The following features of OpenProj work OK for me:

Create resources with individual calendars
Choose individual columns for the task list
Easily enter tasks in the Gantt view
Enter estimated effort of a task in hours or days
Assign one or multiple resources to a task by simply entering the name in the task list
Duration of a task is calculated automatically based on the entered effort
Dependencies to other tasks can be entered manually in the task list
Tasks can be easily grouped (this worked only partly for me, because I didn't manage to group tasks by keyboard)
Aggregated properties (effort, duration, ...) of a task group are calculated automatically
Easily create milestones (by entering a duration of zero days, like in MS Project)
Choose different planning restrictions for a task (starts at, ends at, enter a delay, etc.)
Easily monitor the resource usage. In fact, this works even better than in MS Project, because OpenProj offers some comfortable, combined views of Gantt charts and resource usage.
Highlight the critical path in the Gantt chart
Export the task list to Excel. This basically works via copy and paste, but is not very comfortable. For example, you lose all task groupings when pasting the tasks into Excel.

Now for the bad news: OpenProj is not able to automatically level the resource usage (at least I have found nothing about this feature in the documentation or in the GUI). That is, if you assign multiple tasks to the same resource in the same period of time, this resource will remain over-allocated. You can easily monitor this in the resource usage view, but you have to resolve it manually.
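To make the over-allocation problem concrete, here is a minimal Java sketch of the check I have to do by eye in the resource usage view. It is not how OpenProj or MS Project work internally. The class and method names are made up for illustration, and the sketch assumes that every assignment takes the resource full-time and that start and end dates are inclusive.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: flags pairs of tasks that over-allocate one resource. */
final class OverAllocationCheck {

    /** A task assigned full-time to a single resource; both dates are inclusive. */
    static final class Task {
        final String name;
        final String resource;
        final LocalDate start;
        final LocalDate end;

        Task(String name, String resource, LocalDate start, LocalDate end) {
            this.name = name;
            this.resource = resource;
            this.start = start;
            this.end = end;
        }

        /** True if both tasks use the same resource on at least one common day. */
        boolean conflictsWith(Task other) {
            return resource.equals(other.resource)
                    && !start.isAfter(other.end)
                    && !other.start.isAfter(end);
        }
    }

    /** Returns "taskA / taskB" for every conflicting pair, in input order. */
    static List<String> findConflicts(List<Task> tasks) {
        List<String> conflicts = new ArrayList<>();
        for (int i = 0; i < tasks.size(); i++) {
            for (int j = i + 1; j < tasks.size(); j++) {
                if (tasks.get(i).conflictsWith(tasks.get(j))) {
                    conflicts.add(tasks.get(i).name + " / " + tasks.get(j).name);
                }
            }
        }
        return conflicts;
    }
}
```

Automatic leveling would go one step further and move one task of each conflicting pair to a later date. The sketch only reports the conflicts, which is the part that is left to the user in OpenProj.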

Summary: OpenProj supports most of the basic functionality I'm personally using in a desktop planning tool. However, its lack of automatic resource leveling is a show stopper for me. I could work around this in small projects. But, honestly, in small projects I would rather use some sort of simple ToDo-List Utility like the one from AbstractSpoon Software.

Wednesday, September 15, 2010

The pitfalls of software metrics

In this post I'm going to talk about software metrics in general and specifically about the pitfalls involved in using them. What people are trying to do when using software metrics is to capture certain aspects of a software system in a number that is, at first glance, easy to understand. This is why metrics are fascinating, especially for project leads and managers. Software is kind of an "invisible product" for people who are not able to read source code. So all means to shed some light on what this expensive bunch of developers is actually doing all day are very welcome.

If you're an architect or technical lead in your project, it is part of your job to improve transparency for the management. People who provide the money certainly should know what progress you are making and whether the quality of the produced code is OK. Metrics can help you satisfy the demand for information and, yes, control. But in my experience metrics can do more harm than good to a project if they are applied carelessly.

To prevent this, we first need a basic understanding of what software metrics actually are. Typically, software metrics are divided into three categories:

Product metrics: Metrics of this kind are used to measure properties of the actual software system. Examples are system size, complexity, response time, error rates, etc.
Process metrics: Process metrics tell you something about the quality of your development process. For example, you could measure how long it takes on average to fix a bug.
Project metrics: Project metrics describe the project that builds the system. Typical examples are the budget and the team size.

All metrics can be further classified based on the measurement process. Some metrics can only be measured manually (like team size), while others are calculated automatically from the project artifacts. The most popular example of the latter type is the LOC (lines-of-code) metric.

The LOC metric is a good candidate to explain why careless usage of software metrics can be harmful. Let's assume management requests some form of weekly or monthly reporting about the progress of the running projects. At first glance, an easy way to fulfill this request is to run some tool on the source code, calculate the LOC (and, possibly, some other metrics the tool happens to support), put everything into an Excel sheet, generate a chart and put this into the project status report. I am pretty sure people will like it. First, charts have been very welcome in nearly all status meetings I've ever been in. And secondly, the LOC metric is very easy to understand. More lines of code compared to the last meeting sounds good to anyone.

However, things will get complicated as soon as the LOC number remains constant or, even worse, shrinks from meeting to meeting. Now, don't blame the people attending the meeting for getting nervous. They asked for a progress indicator and you gave them LOC. The good thing about the LOC metric is that you can calculate it easily. The bad thing is that it does not measure the system size. It measures the number of source code lines (whatever that is, but we'll ignore this detail for now). The count of source code lines is only an indicator of how "big" a software system is. For example, a skillful developer will reuse code as often as possible. This involves reworking existing code in a way that new code can be implemented in an efficient way. At the end of the day, maybe just a few lines of code have been "produced", or none at all, or the code base has even SHRUNK. Nonetheless, the system size has increased. Unfortunately, only the coolest of all project leads and managers in a status meeting will believe this ...
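The "whatever that is" remark deserves a small illustration. The following Java sketch applies two plausible counting rules to the same source and gets two different LOC values. Both rules and all names are my own assumptions for this example and are not taken from any particular metrics tool. The second rule is deliberately simplistic: it only recognizes blank lines and pure "//" comment lines.

```java
import java.util.List;

/** Hypothetical sketch: two LOC counting rules applied to the same source. */
final class LocCounter {

    /** Counts every physical line, including blank lines and comments. */
    static int physicalLines(List<String> source) {
        return source.size();
    }

    /**
     * Counts lines that are neither blank nor a pure "//" comment.
     * Simplification: block comments are not recognized.
     */
    static int codeLines(List<String> source) {
        int count = 0;
        for (String line : source) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("//")) {
                count++;
            }
        }
        return count;
    }
}
```

For a five-line snippet that consists of one comment line, one blank line and three lines of code, the first rule reports 5 and the second one reports 3. Which of the two numbers ends up in the status report depends only on the tool that happens to be used.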

To avoid the described scenario, we must select software metrics carefully. The selection criterion must not be that a metric is supported by a certain tool out of the box. Instead, the goals of the measurement process are relevant. In the example, the overall goal was to provide a report on the project progress. Now, we formulate questions we have to answer in order to reach this goal. Since we're talking about a software project here, a good question would be "How many use cases have already been implemented?". This is still rather generic, so we break it down into more detailed questions that define when a use case counts as implemented:

Is the source code checked into the repository?
Do module tests exist and are they executing without errors?
Have the testers confirmed that the implementation is feature-complete according to the use case description?

In a third step, we derive metrics that are able to answer the questions. The metric for the above example would be called Completed Use Cases. If the project uses a task management system like Jira, Redmine, Trac or whatever, the metric can even be calculated automatically using the reporting functions.

Because of the three steps Goal, Question and Metric, this approach is also known as the GQM approach. In my experience it works OK. One of its main advantages is that the used metrics are well-documented - just take a look at the questions. Additionally, the GQM approach is easy to apply in practice, everybody understands it and it puts the tools at the end of the process and not at the beginning.
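As a rough illustration of the last step, the Completed Use Cases metric can be sketched in a few lines of Java. The class and field names are hypothetical, and in a real project the three answers would come from the repository, the build and the task management system instead of being passed in by hand.

```java
import java.util.List;

/** Hypothetical sketch: derives the "Completed Use Cases" metric. */
final class CompletedUseCases {

    /** One use case together with the answers to the three detail questions. */
    static final class UseCase {
        final String name;
        final boolean codeCheckedIn;
        final boolean moduleTestsPass;
        final boolean confirmedByTesters;

        UseCase(String name, boolean codeCheckedIn,
                boolean moduleTestsPass, boolean confirmedByTesters) {
            this.name = name;
            this.codeCheckedIn = codeCheckedIn;
            this.moduleTestsPass = moduleTestsPass;
            this.confirmedByTesters = confirmedByTesters;
        }

        /** A use case counts as completed only if all three answers are "yes". */
        boolean isCompleted() {
            return codeCheckedIn && moduleTestsPass && confirmedByTesters;
        }
    }

    /** The metric itself: the number of completed use cases. */
    static long count(List<UseCase> useCases) {
        return useCases.stream().filter(UseCase::isCompleted).count();
    }
}
```

The point of the sketch is that the definition of the metric is nothing more than the three questions combined. Anyone who wonders what the number in the status report means can read it off the isCompleted method.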

Tuesday, September 7, 2010

To DTO or not to DTO ...

In every single Enterprise Java project I am faced with the question: To DTO or not to DTO? A DTO (Data Transfer Object) is a potential design element to be used in the service layer of an enterprise system. DTOs have a single purpose: To transport data between the service layer and the presentation layer. If you use DTOs you have to pay for it. That is, you need mapping code that copies the data from your domain model to the DTOs and the other way round. On the other hand, you get additional functionality from the DTOs. For example, if you have to support multiple languages, measurement systems and currencies, all the transformation stuff can be encapsulated in mapping code that copies data between domain model and DTOs.
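To show what this looks like in code, here is a minimal sketch of a domain object, a DTO and the mapping code between them. All names are invented for this example. It assumes that the backend stores weights in kilograms and that the presentation layer only wants a display-ready string, so the conversion between measurement systems lives in the mapper.

```java
import java.util.Locale;

/** Hypothetical sketch: a domain object, its DTO and the mapping code. */
final class ProductMapping {

    /** Assumed conversion factor, rounded to five decimal places. */
    static final double POUNDS_PER_KILOGRAM = 2.20462;

    /** Domain object: the backend always stores the weight in kilograms. */
    static final class Product {
        final String name;
        final double weightKg;

        Product(String name, double weightKg) {
            this.name = name;
            this.weightKg = weightKg;
        }
    }

    /** DTO: carries display-ready data for the presentation layer. */
    static final class ProductDto {
        final String name;
        final String displayWeight;

        ProductDto(String name, String displayWeight) {
            this.name = name;
            this.displayWeight = displayWeight;
        }
    }

    /** Copies domain data into a DTO and converts the unit on the way. */
    static ProductDto toDto(Product product, boolean imperialUnits) {
        String weight = imperialUnits
                ? String.format(Locale.US, "%.1f lb", product.weightKg * POUNDS_PER_KILOGRAM)
                : String.format(Locale.US, "%.1f kg", product.weightKg);
        return new ProductDto(product.name, weight);
    }
}
```

A product of 10 kg is mapped to "10.0 kg" or "22.0 lb", depending on the flag. The presentation layer never sees the domain object and does not need to know anything about unit conversion.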

Over the years I've collected some personal guidelines that help me decide the DTO question in new projects. Here we go:

When NOT to use DTOs:

Small to mid-sized project team, that is, at most 5 people.
Team sits in one location.
No separate teams for GUI and backend.
If you don't use DTOs, the presentation and the backend will be tightly coupled (the domain model will be used directly in the presentation layer). This is OK if the project scope is limited and you know more or less what the product will look like in the end. It is definitely NOT OK if you work on some kind of strategic, mission critical application that is likely to live for 10-15 years and will be extended by multiple teams in the future.
Highly skilled team with developers that understand the technical implications of tight coupling between backend domain model and presentation. For example, if you use some kind of OR mapping framework to persist the domain model, the team has to understand under what conditions what data is available in the presentation layer.
Only limited I18N requirements. That is: Multiple languages are OK, multiple measurement systems, complex currency conversions and so on are not.

When to use DTOs:

Teams with more than 5 people. Starting with this size, teams will split up and the architecture needs to take this into account (just a side note: the phenomenon that the architecture of a system corresponds to the structure of the development team is called Conway's law). If teams split up, the tight coupling between backend and presentation is unacceptable, so we need DTOs to prevent this.
Teams that are distributed across several locations. The worst-case examples are projects that use some kind of nearshoring or offshoring model.
The need for complex mapping functionality between domain model in the backend and the presentation layer.
Team with average skills, for example with junior developers or web-only developers. Using DTOs allows them to treat the backend services as a "black box".
Reduce overhead between backend and presentation. DTOs can be optimized for certain service calls. An optimized DTO contains only those attributes that are absolutely required. Popular examples are search services that return lists of slim DTOs. In fact, performance optimization is often the one and only argument to introduce DTOs. IMO, this factor is often overemphasized. It may be very important for "real" internet applications that produce a very high load on the backend. For enterprise systems, one should carefully consider all pros and cons before introducing DTOs on a wide scale in an architecture "just" for performance reasons. Maybe it is good enough to optimize just the top 3 search services using specialized DTOs and use domain objects in all other services (especially the CRUD services).
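The slim DTOs from the last point could look like the following sketch. The names are hypothetical, and the search runs over an in-memory list only to keep the example self-contained. In a real backend the filtering would happen in the database and only the two needed columns would be loaded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: a search service that returns slim DTOs. */
final class CustomerSearch {

    /** Domain object with many attributes (only a few are shown here). */
    static final class Customer {
        final long id;
        final String name;
        final String street;
        final String city;
        final String notes;

        Customer(long id, String name, String street, String city, String notes) {
            this.id = id;
            this.name = name;
            this.street = street;
            this.city = city;
            this.notes = notes;
        }
    }

    /** Slim DTO: only what a result list needs to display and to link to. */
    static final class CustomerHit {
        final long id;
        final String name;

        CustomerHit(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Returns slim DTOs for all customers whose name contains the query. */
    static List<CustomerHit> search(List<Customer> customers, String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        List<CustomerHit> hits = new ArrayList<>();
        for (Customer customer : customers) {
            if (customer.name.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(new CustomerHit(customer.id, customer.name));
            }
        }
        return hits;
    }
}
```

The detail view that opens after a click on a hit could still work on the full domain object. This is the mixed approach described above: specialized DTOs for the few hot search services, domain objects everywhere else.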

Consequences of the usage of DTOs

As I mentioned already, DTOs don't come for free. It is crucial to document the consequences of the introduction of DTOs in the software architecture documentation. The stakeholders of the project must understand that DTOs are a very expensive feature of a system. The most important consequences of DTOs are:

You must maintain two separate models: a domain model in the backend that implements the business functionality and a DTO model that transports more or less the same data to the presentation layer.
Easy to understand, but nonetheless something you should mention several times to your manager: Two models cost more than one model. Most of the additional costs arise after initial development: You must synchronize both models all the time in the maintenance phase of the system.
Additional mapping code is necessary. You must make a well-considered design decision about how the mapping logic is implemented. One extreme is the manual implementation with tons of is-not-null checks. The other extreme is a mapping framework that promises to do all the work for you automagically. Personally, I've tried out both extremes and several variations in between and have still not found the optimal solution.
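For readers who have never seen the manual extreme, this is roughly what it looks like. The sketch uses made-up classes and flattens a small object graph into one DTO. It assumes that any reference in the domain model may be null, for example because an OR mapper did not load it.

```java
/** Hypothetical sketch: manual mapping with is-not-null checks. */
final class OrderMapping {

    static final class Address {
        final String city;

        Address(String city) {
            this.city = city;
        }
    }

    static final class Customer {
        final String name;
        final Address address;

        Customer(String name, Address address) {
            this.name = name;
            this.address = address;
        }
    }

    static final class Order {
        final Customer customer;

        Order(Customer customer) {
            this.customer = customer;
        }
    }

    /** Flat DTO: the presentation layer gets no nested objects at all. */
    static final class OrderDto {
        final String customerName;
        final String city;

        OrderDto(String customerName, String city) {
            this.customerName = customerName;
            this.city = city;
        }
    }

    /** Every level of the object graph needs its own null check. */
    static OrderDto toDto(Order order) {
        if (order == null) {
            return null;
        }
        String customerName = null;
        String city = null;
        if (order.customer != null) {
            customerName = order.customer.name;
            if (order.customer.address != null) {
                city = order.customer.address.city;
            }
        }
        return new OrderDto(customerName, city);
    }
}
```

Two attributes already need three checks. With a realistic domain model this kind of code grows quickly, and the mapping in the opposite direction has to be written and maintained as well.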