Don’t Repeat Yourself (DRY)

Here is the Wikipedia definition of DRY:

In software engineering, Don't Repeat Yourself (DRY) is a principle of software development aimed at reducing repetition of information of all kinds, especially useful in multi-tier architectures. The DRY principle is stated as "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system." The principle has been formulated by Andy Hunt and Dave Thomas in their book The Pragmatic Programmer. They apply it quite broadly to include "database schemas, test plans, the build system, even documentation."[1] When the DRY principle is applied successfully, a modification of any single element of a system does not require a change in other logically unrelated elements. Additionally, elements that are logically related all change predictably and uniformly, and are thus kept in sync. Besides using methods and subroutines in their code, Thomas and Hunt rely on code generators, automatic build systems, and scripting languages to observe the DRY principle across layers.

DRY is often used as a check for code smells, and that is a good thing. What is often forgotten during these checks is that DRY is more than refactoring out duplicated code. It is about factoring out duplicated code that shares logically related code paths. Identifying this kind of code requires a deep understanding of the system and its goals. Misapplying the principle results in massive classes with static properties representing every possible shared value. It results in switch statements that are several printed pages long. It results in code with several checks for special conditions that no one can remember the reason for.

On a recent project a programmer noticed a class full of methods that repeated the same pattern of operations. “This violates DRY,” he exclaimed. “This should be rewritten so that these methods do not duplicate functionality.” At first glance there did appear to be a lot of duplication. After a few short discussions there was a proposed new implementation: a constructor that registered methods based on the type of object, and a single entry method that accepted a generic type. The entry method would look at the type and call the correct registered method. This approach would significantly reduce the amount of code and remove what appeared to be a significant amount of duplication. What was not considered was the relationship between the refactored methods. The methods did the same kind of work, but with distinctly different objects in distinctly different code paths. The proposed solution made it impossible to change one code path without impacting all the others. The new solution was initially widely praised, but eventually, with some hurt feelings, it was shot down. The difficulty was that it was easy to see the duplication and easy to see the impact of removing all those methods. What was hard to see was the long-term effects, even in the face of real-world problems that would soon need to be solved.
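To make the shape of that proposal concrete, here is a minimal Python sketch. It is not the project's actual code: the `Invoice` and `Shipment` types, the `Processor` class, and the "validate / save" steps are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    """Hypothetical object handled by one code path."""
    total: float


@dataclass
class Shipment:
    """Hypothetical object handled by an unrelated code path."""
    weight_kg: float


class Processor:
    """Hypothetical version of the proposed refactoring."""

    def __init__(self):
        # The constructor registers one method per object type.
        self._handlers = {
            Invoice: self._process_invoice,
            Shipment: self._process_shipment,
        }

    def process(self, obj):
        # Single entry method: look at the type and call the registered method.
        handler = self._handlers.get(type(obj))
        if handler is None:
            raise TypeError(f"no handler registered for {type(obj).__name__}")
        return self._run(obj, handler)

    def _run(self, obj, handler):
        # The repeated pattern of operations now lives in one place,
        # so every code path passes through it.
        return ["validate", handler(obj), "save"]

    def _process_invoice(self, invoice):
        return f"bill {invoice.total:.2f}"

    def _process_shipment(self, shipment):
        return f"ship {shipment.weight_kg} kg"
```

In this sketch the duplication is gone, but so is the independence: if invoices one day need an extra step between validating and saving, the only place to add it is `_run`, which shipments also depend on.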

When there is a choice between repeating code or not, do not simply assume that duplicated code is always bad. Evaluate the code paths that call through the code in question. What is it used for? If it needs to change for one path, does that change apply to all the others? Will you need to add branching logic for specific paths? After a thoughtful analysis of the code, better decisions can be made, and there can be more confidence that the code is written the right way.
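The questions above can be illustrated with a small Python sketch. Everything in it is assumed for the example: a "report" path and an "audit" path that once cleaned rows identically, until the audit path needed to preserve case.

```python
def normalize(rows, caller):
    """Merged helper shared by two unrelated code paths (hypothetical)."""
    cleaned = [row.strip() for row in rows]
    # Special condition added later: only the audit path must preserve case.
    if caller == "audit":
        return cleaned
    return [row.upper() for row in cleaned]


def normalize_report(rows):
    """Report path kept separate: free to change on its own."""
    return [row.strip().upper() for row in rows]


def normalize_audit(rows):
    """Audit path kept separate: preserves case, no branching needed."""
    return [row.strip() for row in rows]
```

The merged `normalize` answers "does that change apply to all the others?" with a branch on the caller, which is how unexplained special-case checks tend to accumulate. The two separate functions repeat a `strip()` call, but each can change without touching the other.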
