Canary in the code mine

Abstraction in programming

There is a concept in programming called "premature abstraction". A closely related term is "over-engineering", which has its own slogan: YAGNI ("You aren't gonna need it"). But there are also things you are going to need that are much harder to add later than now. So how do you decide whether you are over-engineering? Most advice on how not to over-engineer is basically this:

Let's start with what an abstraction is

Here is a video of Alfred Korzybski¹ explaining what an abstraction is.

An abstraction is a pattern in the information we get from reality. We abstract because it makes acting in the world possible. Abstractions are bosonic, or wave-like, in that multiple things can occupy the same place, just as multiple signals can be extracted from the same waveform. The same stream of ones and zeroes can be interpreted in multiple ways. We can get multiple abstractions from the same pattern in reality, and the Rubin vase illusion illustrates this nicely. It illustrates it better than Korzybski's example because the picture does not change at all; the change is only in the interpretation.

If you make one interpretation explicit by naming it, it hides the others. And what are the chances that you have the best abstraction? I am not sure there is a best one, but I have heuristics for finding good ones and avoiding bad ones. I will explain those in other blog posts down the line.

Almost all abstractions have to stay abstract, implicit, because a piece of data can have any number of interpretations. Different abstractions can be applied to the same data. For example, the same bytes can be seen in several ways.
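To make the point about bytes concrete, here is a minimal sketch in Python. The four bytes and the function name interpretations() are made up for illustration; the only thing that matters is that the data stays fixed while the reading of it changes.

```python
import struct


def interpretations(raw: bytes) -> dict:
    """Read the same four bytes under several different abstractions.

    Hypothetical helper for illustration; nothing about `raw` changes
    between readings, only the interpretation applied to it.
    """
    return {
        "text": raw.decode("ascii"),                 # four ASCII characters
        "uint32": struct.unpack(">I", raw)[0],       # one big-endian unsigned integer
        "float32": struct.unpack(">f", raw)[0],      # one big-endian IEEE 754 float
        "two_uint16": struct.unpack(">HH", raw),     # a pair of 16-bit integers
    }


raw = bytes([0x42, 0x48, 0x49, 0x21])  # arbitrary example data

for name, value in interpretations(raw).items():
    print(f"{name:>10}: {value!r}")
```

None of these readings is "the" meaning of the bytes. Picking one and giving it a name (say, a field called message) is what hides the other three.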

From a mathematical viewpoint, an abstraction is an invariant. An expansive abstraction is a global invariant of the system we are currently working on. Conservation of energy, for example, comes from continuous time-translation symmetry: if a system of equations is invariant under translation in time, it conserves energy.

As a program matures, its context typically expands. If you rely on encapsulating abstractions, they will likely need to change. In contrast, building on expansive abstractions is more stable, as they can accommodate growth without modification.

If you write a function f_drag(), you can use it within a certain velocity interval, pressure interval, and so on. But if the context widens, say you want to get to the moon, f_drag() either changes to handle the wider context or you rename it to f_drag_with_limits().

When choosing abstractions that you want to reuse:

Expansive abstractions can serve as ground truth data, while encapsulating abstractions should be more flexible, pushed to the boundaries of the system and not built on. I talk more about this in Modeling - territory first.
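The f_drag() example can be sketched roughly like this. It uses the standard drag equation with a constant drag coefficient; the parameter names, the default air density, and the 100 m/s velocity limit are assumptions chosen for illustration, not values from any particular project.

```python
SEA_LEVEL_AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value


def f_drag_with_limits(
    velocity: float,            # m/s
    drag_coefficient: float,    # dimensionless, assumed constant
    area: float,                # m^2, reference area
    density: float = SEA_LEVEL_AIR_DENSITY,
    max_velocity: float = 100.0,  # assumed validity limit, illustrative only
) -> float:
    """Drag force in newtons, valid only inside a stated velocity interval.

    The limits are part of the name and of the behaviour: outside the
    interval the function refuses to answer instead of silently
    extrapolating a model that no longer holds.
    """
    if not 0.0 <= velocity <= max_velocity:
        raise ValueError(
            f"velocity {velocity} m/s is outside the supported range "
            f"0..{max_velocity} m/s"
        )
    return 0.5 * density * velocity**2 * drag_coefficient * area


print(f_drag_with_limits(10.0, drag_coefficient=0.5, area=2.0))
```

A function like this is an encapsulating abstraction: it is honest about its context, so it belongs at the boundary of the system. A trip to the moon would call it with a velocity far outside the interval and get an error rather than a wrong number.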

Here are some properties that expansive abstractions have:

In the following blog posts I will illustrate some examples and explain how these properties apply. Since abstraction is an overloaded term in programming, I will use the term modeling instead. These heuristics are in different blog posts, but they are not necessarily separable, and they reinforce each other.

  1. He is the originator of the phrase "The map is not the territory"