Thursday, December 24, 2009

Reflections towards an ethic for software development

I find it very valuable to think about the philosophical implications of what we do. Even though I would not define myself primarily as a software developer, I have been working as one for quite some time now. Recently, I started thinking about the implications that the moral teachings of Kant would have on software development.
The next posts will summarize the main aspects that interest me about the subject of ethics in software development. My main questions are:
1. What is the moral (ethical) implication of design? Does it make a difference whether one designs well or not (provided the end-result is, in both cases, a working system)? What, if any, is the moral or ethical worth of design?
2. Are we free when we are working for a corporation? Is there a possibility of freedom within the constraints of the enterprise?
3. What does it mean to design well? Ethically, what can be considered 'good design'?
Since these are philosophical questions, I can only promise to offer my position, not definitive answers. I will sacrifice some academic rigour in favor of publishing my ideas somewhat faster. I don't claim to own the truth; I'll just make my attempt at writing my version of it :-)

Monday, December 21, 2009

Passacaglia

Deep down, knowing that some of this is mine, that it truly belongs to me, is perhaps the most important thing.

The guardians defend something like a castle, but let us not take the word in its somewhat stylized sense; think rather of an absolute defense, completely impregnable, yet at the same time classical

to reach the center, so subtle, of a beauty that could not even have been suspected, of a peace, an absolute peace, it hints at that theme, perhaps the theme wearies him in the end, perhaps it makes him unhappy, but in that moment he manages to play it as if it were the song of angels, forgive the sentimentality, and he looks, and watches it snow through the window

it unleashes its force against all those who are good for nothing. It is also true that its means of attack may seem laughable, but if you have ears, if you have a soul, pay attention to how it is finishing you off, for after this nothing is left of you, and when I say nothing I mean nothing.

Finish me off, wipe me out, but let it be with these chords

Forgive me. Forgive me. Forgive me.

Wednesday, December 16, 2009

Is performance the main concern when choosing an ORM tool?

After yet another discussion in the office about ORM, and why it is a waste of time doing DALs manually, I was assigned the task of benchmarking Hibernate’s performance against using stored procedures (SP) and prepared statements. While I was looking for data out there, as well as trying to imagine what test case I could implement that would be representative, I did some reading and thinking about the assignment itself. If you think about what Hibernate has to offer, it is clear that measuring its performance advantages requires real-world problems and situations that are hard to invent just for benchmarking. Features such as lazy loading, caching, etc., are really hard to simulate in a lab experiment, as many variables must be taken into account (of course not the least of them is how the system will actually be used).

The question then arose: how important is performance when choosing an ORM tool?

Let me first state that I have experience with Hibernate, and that in this case the decision would be whether to use an ORM tool (NHibernate) or just do everything manually. It is not about choosing among ORM tools.

I like to make decisions based on data. Having worked (and still working frequently) as a Business Analyst, I think data is fundamental to deciding whether some feature is worth it or not. But when it comes to ORM tools, are numbers really that decisive a factor?

What has happened to design? If I were to choose just one thing that I like about Hibernate, it would certainly be how naturally it is incorporated into the programming language. I think they call it idiomatic persistence. Being able to design your business domain using only one idiomatic paradigm, namely object-oriented design, is a great thing. It greatly reduces the effort needed to keep everything up to date, as well as the time needed to get your application working.

I have always thought that, if you design something right, then you will have at least decent performance to start with. Another important aspect of performance: let the bottlenecks show themselves before fixing them! I’d rather have a stable and functioning system with passable performance at the end of a development sprint than an incomplete system with very efficient algorithms that doesn’t work from the end users’ perspective.

But coming back to Hibernate and performance: there are many optimizations that Hibernate performs “out of the box” which are not trivial to implement. I recommend that you read Hibernate's Performance Q&A, but let me quote the highlights:

  • Caching objects.
  • Executing SQL statements later, when needed.
  • Never updating unmodified objects.
  • Efficient Collection Handling.
  • Rolling two updates into one.
  • Updating only the modified columns.
  • Outer join fetching.
  • Lazy collection initialization.
  • Lazy object initialization.
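To make a few of these items less abstract, here is a minimal sketch in plain Python (not Hibernate's actual implementation or API) of a toy unit of work. It caches loaded objects, defers writes until `flush()`, skips objects that were never modified, rolls repeated changes into a single update, and writes only the modified columns. `FakeDatabase`, `Session`, and the `person` table are names I made up for illustration; the "database" is just a dictionary that logs the statements it would have run.

```python
class FakeDatabase:
    """Hypothetical stand-in for a database: a dict of rows plus a statement log."""

    def __init__(self, rows):
        self.rows = rows  # {id: {"column": value}}
        self.log = []     # every simulated SQL statement, in order

    def select(self, row_id):
        self.log.append(f"SELECT * FROM person WHERE id = {row_id}")
        return dict(self.rows[row_id])

    def update(self, row_id, changes):
        assignments = ", ".join(
            f"{col} = {val!r}" for col, val in sorted(changes.items())
        )
        self.log.append(f"UPDATE person SET {assignments} WHERE id = {row_id}")
        self.rows[row_id].update(changes)


class Session:
    """Toy unit of work: caches loaded rows and writes only what changed."""

    def __init__(self, db):
        self.db = db
        self.cache = {}      # id -> live object handed to the caller
        self.snapshots = {}  # id -> copy of the row as it was loaded

    def get(self, row_id):
        # Caching objects: a second request for the same id never hits the database.
        if row_id not in self.cache:
            row = self.db.select(row_id)
            self.cache[row_id] = row
            self.snapshots[row_id] = dict(row)
        return self.cache[row_id]

    def flush(self):
        # Deferred writes: nothing is sent until flush() compares each object
        # with its snapshot. Unmodified objects produce no statement at all,
        # and modified ones produce a single UPDATE with only the changed columns.
        for row_id, row in self.cache.items():
            snapshot = self.snapshots[row_id]
            changes = {c: v for c, v in row.items() if snapshot.get(c) != v}
            if changes:
                self.db.update(row_id, changes)
                self.snapshots[row_id] = dict(row)


db = FakeDatabase({
    1: {"name": "Ana", "city": "Lima"},
    2: {"name": "Luis", "city": "Quito"},
})
session = Session(db)

ana = session.get(1)
session.get(1)            # served from the cache, no second SELECT
luis = session.get(2)     # loaded but never modified

ana["city"] = "Cusco"     # two changes to the same object...
ana["city"] = "Bogota"    # ...end up as one UPDATE
session.flush()

for statement in db.log:
    print(statement)
```

Under these assumptions the log contains two SELECTs and one UPDATE that touches only the `city` column. Writing even this much by hand for every entity in a real domain model is exactly the kind of non-trivial work the list above refers to.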

Of course, one could argue, we need numbers to prove that these optimizations are worth something. But outside of a production setting it is difficult, if not impossible, to prove how much of a performance gain you would get from, for example, caching objects. It is much easier to understand when you have the application running and then switch on caching for your actual workload. Optimizations are generally very specific to the environment that you are writing your software for, meaning that benchmark data may not always apply to your situation. Quoting from the same Q&A mentioned above:

The first category of benchmarks are trivial micro benchmarks. Hibernate of course will have an overhead in simple scenarios (loading 50.000 objects and doing nothing else is considered trivial) compared to JDBC. (…) these numbers are meaningless for real application performance and scalability.

I have to agree with that statement. What good is it to prove that Hibernate is slower than SPs retrieving objects from the database in a trivial scenario? This kind of performance is not the one I am looking for when designing a DAL, as the development cost of manually assembling all objects, relationships, etc. generally speaking exceeds the small performance penalty that you incur by using an ORM tool.
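A toy experiment shows why such numbers travel badly. The sketch below is plain Python with a made-up function, `count_database_loads`, and two invented workloads; it has nothing to do with Hibernate's real cache. It simply counts how many simulated database loads the same cache saves under two different access patterns.

```python
def count_database_loads(requested_ids, use_cache):
    """Count simulated database loads for a sequence of reads by id."""
    cache = {}
    loads = 0
    for row_id in requested_ids:
        if use_cache and row_id in cache:
            continue  # cache hit: no trip to the database
        loads += 1    # cache miss (or caching disabled): simulated load
        cache[row_id] = {"id": row_id}
    return loads


hot_set = [1, 2, 3] * 100        # 300 reads over 3 distinct rows
single_pass = list(range(300))   # 300 reads, every row touched once

for name, workload in [("hot set", hot_set), ("single pass", single_pass)]:
    without = count_database_loads(workload, use_cache=False)
    with_cache = count_database_loads(workload, use_cache=True)
    print(f"{name}: {without} loads without cache, {with_cache} with cache")
```

With the hot set, the cache cuts 300 loads down to 3; with the single pass, it saves nothing. A benchmark built on either workload alone would give a misleading picture of the other, which is why I would rather measure on the actual application.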

The services that Hibernate provides go way beyond loading and storing objects in the database. What matters is not just what you do (in the end, of course, a lot of CRUD operations will occur) but how it is accomplished. I’d like to see a benchmark of development effort, comparing Hibernate with writing the DAL manually. That would be interesting. Or a benchmark of design, code readability, and maintainability: those seemingly intangible things that, in the end, actually make the software work well.
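To give a sense of what "doing the DAL manually" involves, here is a small sketch using Python's standard `sqlite3` module. The `customer` and `orders` tables, the `Customer` and `Order` classes, and `load_customers` are all hypothetical, invented for this example. It covers only one read path for one relationship; a real DAL repeats this pattern for every entity, association, and write operation.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Order:
    id: int
    total: float


@dataclass
class Customer:
    id: int
    name: str
    orders: list = field(default_factory=list)


def load_customers(conn):
    """Hand-written mapping: one joined query, then manual object assembly."""
    rows = conn.execute(
        "SELECT c.id, c.name, o.id, o.total "
        "FROM customer c LEFT JOIN orders o ON o.customer_id = c.id "
        "ORDER BY c.id, o.id"
    )
    customers = {}
    for customer_id, name, order_id, total in rows:
        # Reuse the Customer already built for this id, or create it.
        customer = customers.setdefault(customer_id, Customer(customer_id, name))
        # The LEFT JOIN yields NULLs for customers without orders.
        if order_id is not None:
            customer.orders.append(Order(order_id, total))
    return list(customers.values())


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ana'), (2, 'Luis');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0);
""")

customers = load_customers(conn)
for customer in customers:
    print(customer.name, [order.total for order in customer.orders])
```

None of this is difficult, but every line of it is code that has to be written, tested, and kept in sync with the schema by hand.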

The reason to choose an ORM tool over manually implementing the DAL is not primarily performance, but design. The ability to write clearer, more understandable code that more closely resembles the business domain can outweigh small performance penalties. Hibernate does offer important optimizations out of the box and generally performs very well, if used correctly.