Four games programmers play, but shouldn’t
As a programmer, I like order, I like perfection, and I have spent a large part of my life chasing it. Perfection in programming comes in many forms: this code is as small as possible, as fast as possible, etc. Unfortunately, most perfection-metrics are counterproductive to achieving organizational goals.
Trying to achieve perfection-metrics is fun, and it feels essential, but it is more a game than it is work. In this post, I discuss some of the most common perfection-metrics and how gaming them is detrimental to desired outcomes.
Let’s jump straight into probably the most controversial one: performance. Optimizing code for performance is almost always orthogonal to readability and maintainability.
Breaking long methods into smaller ones incurs some performance cost. Instantiating classes requires notoriously slow memory allocation, not to mention alignment and cache issues. And anyone who is all about them bits knows how slow conditionals and branching are.
So if methods, objects, and control flow operators are off-limits, we are left with only linear bit operations. Assembly gurus and some library architects can read these with relative ease, but they are far outnumbered by us regular folks who can only with great effort stumble through them.
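As a sketch of what this bit-only style looks like in practice (the function names are my own, invented for illustration), here is a classic branch-free trick next to its readable equivalent:

```python
def readable_min(x: int, y: int) -> int:
    """The obvious version: one conditional, instantly clear."""
    return x if x < y else y

def branchless_min(x: int, y: int) -> int:
    """A classic branch-free trick: -(x < y) is -1 (all bits set)
    when x < y, so the mask keeps x ^ y and the XOR yields x;
    otherwise the mask is 0 and the XOR yields y."""
    return y ^ ((x ^ y) & -(x < y))
```

Both return the same result, but only one of them can be read without pencil and paper.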
These highly optimized methods can destroy teamwork and fragment the feeling of shared responsibility. Even worse, often, these optimizations are performed on gut feelings or based on assumptions that are no longer valid. Doing optimization without profiling is like performing surgery with your eyes closed. Even if you were a surgeon, that would be irresponsible.
By all means, choose proper data structures and design efficient user paths through your application. But intra-method optimization should only be done in isolated parts of the codebase, by skilled experts with the assistance of a profiler and reliable metrics.
The first doctrine I learned was Don’t Repeat Yourself (DRY). If I am not mistaken, it originates in the excellent book “The Pragmatic Programmer.” It means that any piece of knowledge should be represented only once in a codebase. The most important reason for this is that if there is a change in the knowledge it represents (like a bug fix), we need only change it in one location.
While there is some merit to this, chasing it fanatically, as many people do, is in my opinion worse than not following it at all. Here’s why.
The sentence “Don’t repeat yourself” is so easy to remember, and you hear it so often in this short form, that you might forget or completely miss the fundamental reasoning behind it. DRY encourages convergence in the code, i.e., it discourages divergence. Sometimes we want convergence, but this is not always the case, and we need to remember that. Otherwise, we end up spending time unifying, only to then spend time separating.
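As a hypothetical sketch of this convergence trap (the function names and rules are invented for illustration): two validators that look identical today get merged in the name of DRY, and tomorrow’s requirements force them apart again:

```python
# Yesterday: two teams wrote near-identical validators, so they were
# unified into one shared function in the name of DRY.
def validate_address(address: dict) -> bool:
    return bool(address.get("street")) and bool(address.get("zip"))

# Today: requirements diverge. Billing now also needs a country for tax
# purposes, while shipping must keep accepting legacy records without one.
# The "shared" knowledge turns out to be two separate pieces of knowledge,
# and the unified function must be split again:
def validate_shipping_address(address: dict) -> bool:
    return bool(address.get("street")) and bool(address.get("zip"))

def validate_billing_address(address: dict) -> bool:
    return (bool(address.get("street"))
            and bool(address.get("zip"))
            and bool(address.get("country")))
```

The time spent unifying bought nothing; it only added a round of separating.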
When “The Pragmatic Programmer” was published, relational databases were prevalent. Relational databases tended to follow the rules for normalization, which in practice means not duplicating data. The same arguments apply to this as to DRY; we risk losing data consistency.
However, normalization is not how I see the industry using databases. Performance and consistency are at odds with each other in a database, and many err on the side of performance. In NoSQL, duplicating data is more common than sharing it.
The most damaging effect of DRY is when the unified code lands between two separate teams. Shared ownership is equal to no ownership. Without clear ownership, people stop maintaining it, and they lose their feeling of responsibility. In these cases, I always recommend giving each team a copy of the code for which only they are responsible.
Very closely related is another perfection-metric: generality. In mathematics, we often search for the “master lemma,” a general truth that we can use to prove a myriad of other things. So it is not difficult to see how this practice has carried over, resulting in a search for functions that can solve many problems.
Having everything in one generalized function encourages fast global change because we can change something everywhere by changing it in one place. But it also makes changes riskier and more time-consuming because we need to do more validation to get the confidence we need.
General functions also add to the cognitive load of everyone with access to them. They need to keep track of all the common functions, how they work, and any gotchas. With more to keep track of, we have less capacity for solving the tasks we are working on. In these situations, good documentation is crucial because it allows us to forget details, since we can quickly regain them.
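To illustrate the cognitive-load point (all names here are hypothetical, not from the post), compare one “master” function that tries to cover every case with two specific ones:

```python
# One general function: every caller must know what each flag does,
# and what happens when the flags are combined.
def format_name(first: str, last: str, *, reverse: bool = False,
                upper: bool = False, initials: bool = False) -> str:
    if initials:
        name = "".join(part[0] for part in (first, last))
    elif reverse:
        name = f"{last}, {first}"
    else:
        name = f"{first} {last}"
    return name.upper() if upper else name

# Two specific functions: each is trivial to read at the call site,
# and neither forces the reader to remember the other's gotchas.
def display_name(first: str, last: str) -> str:
    return f"{first} {last}"

def sort_key_name(first: str, last: str) -> str:
    return f"{last}, {first}".upper()
```

The general version saves a few lines; the specific versions save every future reader a trip to the documentation.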
Imagine having spent time refining some code into a cool general function that can be used in so many situations, only to find that the generality is never needed because the product went in a different direction than we expected.
I recommend that you architect for your current system and immediate tickets. Postpone generalizing and unifying for as long as possible.
I learned Scheme, a functional language in the Lisp family, at university. Like many, I was stunned at how short the code was compared to the other languages I knew. This quickly became a mini-obsession with trying to make everything as few lines or characters as possible, an activity fondly called line golf. This obsession culminated in a rather fun challenge to my peers to implement a cross-product function in Scheme in 75 characters or fewer.
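That challenge was in Scheme; to give a feel for it, here is the same idea sketched in Python (my own golfed line, not the original solution), next to a readable version:

```python
# Golfed: short, and nearly write-only.
cross=lambda a,b:[a[1]*b[2]-a[2]*b[1],a[2]*b[0]-a[0]*b[2],a[0]*b[1]-a[1]*b[0]]

# Readable: the same formula, with names that match the math.
def cross_product(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return [ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx]
```

Both compute the same vector; only one of them will make sense six months from now.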
Unlike the other perfection-metrics we have discussed, for most people this is just a phase, and we grow out of it once we learn about the others. However, it turns out that a desire for small methods is beneficial. You can find much more about all the things in this post in my book, named after my current ‘par’ for line golf: