Review heuristic: Call out bad code

Posted on 2018-09-12

Code health tradeoffs in larger codebases

When people program, they often need to make tradeoffs between what will fix the current problem quickly, and what will make for a healthy codebase. Examples of such a tradeoff is:

I found a function which does almost what I want, but not quite. Do I wait for its maintainer to let me refactor it so I can reuse it, or do I just copy-paste? Or - if it’s just a 5-line chunk, do I extract it into a function if it will require a bunch of boilerplate (maybe it would necessiate adding a new module)?

It’s often a question of “do I pay the bigger upfront cost now, or do I make future-me front it”. These days I often work in a really big codebase, where there ends up being a bunch of those.

My review tip for copy-pasted code

When you review someone’s code that adds similar technical debt, you might not want to force the author to go and dig in 4 files just to extract out a function. It might frustrate them. And also, often it’s really not the best thing to do. If you really only duplicate the same 5-line chunk in two places, and it’s not often changed and not tricky, it might really be less costly to copy it than to agressively share such code.

So the request I often give in code reviews is: “if you duplicate code, add a TODO that the code is duplicated” (preferably to both copies).

This way, you allow the reviewer to quickly go along their day. But also you leave a “hey, this is a known bit of technical debt” affordance in the code.

When someone ends up copy-pasting the same code a third time, you will be much more likely to look and say “hey time to extract a function, there is this TODO which says it’s not the first time we’re doing this”.

General heuristic: “feel the pain”

There is a more general heuristic, which I use, that I call “feel the pain”. The heuristic is that bad things should be obviously painfully bad. Other instances of “feel the pain” are:

Let’s say someone breaks the contract of your Frobnicate RPC and sometimes doesn’t pass a required parameter foo, and you need to fix your service to accept that. Fine. But don’t do it like this:

void HandleFrobnicateRPC(const string& foo, const string& bar) {
  if (foo.empty()) {
    foo = ComputeDefaultFoo(bar);
  }
  ...
}

Do it like this:

void HandleFrobnicateRPC(const string& foo, const string& bar) {
  if (foo.empty()) {
    // NOTE: As of 2018-09-12, the Bazinator service calls Frobnicate with
    // an empty 'foo'. The Frobnicate service cannot be easily fixed to pass
    // the 'foo' itself, because it does not currently have access to the
    // backend which can find the right foo for the bar. If the bar has
    // multiple associated foos, this will only return the last written foo.
    // Other clients SHOULD NOT rely on this.
    foo = ComputeDefaultFoo(bar);
  }
  ...
}

Why? Because that way, people are less likely to build more hacks on top of this hack, because there is a long paragraph full of scary words.

Let’s say that you inherited 1 million LOC from someone in a hurry, and at some point, they made a misguided architectural decision that makes your code clunky and hard to understand. Create a central bug for this in your bug tracking system, and whenever you write new code that would be made better by fixing the bug, add a note like:
```
// TODO(agentydragon): Once we resolve b/12345, we'll be able to replace the
// Frobnicator with a mock for tests, so our tests won't need such
// complicated fixtures.
```
Why? Because that way once you get around to fixing b/12345, you will be able to grep all places where this hurt some code, and fix them one by one. Also, let’s say some kind soul needs to change this code 2 years after you’re done with it, and finds that TODO. (That person might be you.) When they see this TODO, they might go “hmmm, that bug number looks quite low. aha, it’s been filed 2 years back, and it’s fixed now. yay! that means that I can fix this now and the diff for my new feature will be about 20% less horrible!”
Let’s say your code processes two different kinds of things, which both happen to be sort-of-strings (indulge me for a second and assume your code does not use strong types for such things). Think “URLs and street addresses”. For some hacky reason, which you hope to get rid of at some point, you are computing one from the other. What I would do in that case is make that code obviously painful and horrible using devices like variable and function names and type aliases. Not like this:
```
string GetResult(const string& input) {
  return UrlEncode(input + "-autogenerated");
}
```
More like this:
```
using StreetAddress = string;
using Url = string;

// Long and verbose comment about why this is necessary and what it should be
// replaced with and when.
Url BuildFallbackUrlFromStreetAddress(const string& address) {
  return UrlEncode(address + "-autogenerated");
}
```

Even generaler heuristic: “Call out problems”

There is actually an even more general heuristic than this, which is “call out problems”. I also use it in other contexts.

Let’s say I am writing a design document, and I notice that I did not actually verify some assumption that I’m making, let’s say, “when a customer frobnicates a Foo without also having a Bar, Frobnicator service will not give them cake”.

When I notice that and don’t have time to immediately verify it, I’ll openly say, “I believe that when …, then … happens, but did not verify that.” This is good for when your future-self might be going over your notes later, and, reading that doc, starts assuming that the doc is an authoritative source on the particularities of customers getting cake from the Frobnicator service.

Same about your colleagues. If you say “I did not verify this”, someone who has nagging doubts might be much more likely to say “hey I’m not sure because I’m a human and I forget but I think I got some cake last month when I tried a different thing”, instead of thinking “huh. so he says Frobnicator service does not give cake in this case. I guess they changed it or I don’t remember it correctly.”.
Let’s say that I’m talking with someone and they raise a counterpoint to my preferred opinion that I never considered yet. Instead of immediately rushing to defend myself, I sometimes try to instead take in the new thing, and sit there for a bit with the dissonance (“but but but I want to be right so this thing must be wroooong aaaaaaaah”, or maybe “but but but I don’t want to have to redo this 3k line change that I love aaaaaarrrhh”). And instead of “Hmm, but your proposal would not address the …insert ad-lib…”, say, “Huh. I didn’t think of that yet. I’ll need to think about that.”

“Call out problems” feels sort of close to non-violent communication. I feel much less like openly communicating problems if it feels like the environment will hurt me if I do that. That happens to me with some strong personalities, or with people who (probably mostly unknowingly) trigger my sense of “argh I’m being bullied I want to curl up in a corner”. On the other hand, I usually try to call out problems more often than people around me, because I want to nudge culture towards cooperation. (Google culture is pretty blameless, so I don’t feel I’m acting against my interests.)