String Calculator kata in object-oriented ways

The real value of practicing katas lies not just in finishing them, but in the chance to rethink them with each repetition. This works best when you rewrite a kata in different paradigms and compare the solutions. Today, I want to talk about the String Calculator kata and its object-oriented implementation. Beyond that, we are going to take a look at decomposition and some object-oriented design principles. Let’s dive into it.

This is an introductory, theory-focused article in a series. Four more articles follow, each presenting one solution to the kata.

Each of the solutions focuses on a different way of implementing the kata. If you want to skip this article, go straight to the list of kata solutions or the GitHub repo. However, I highly recommend reading this article or, at least, skimming through it.

TL;DR: Even within one programming paradigm there are different ways to decompose a problem. Object-oriented decomposition involves the collaboration of different entities. Just using the class keyword in a single file does not automatically make your program object-oriented.

A little bit of a disclaimer. As always, there is no one “right” or “correct” answer in programming. One may say that some of the implementations are not object-oriented because the objects don’t have any state. To me, it is more about responsibilities than state. As Rebecca Wirfs-Brock said: “Objects are not just simple bundles of logic and data. They are responsible members of an object community.”

Introduction

The String Calculator kata is a well-known kata that was introduced by Roy Osherove a long time ago. The kata is iterative, with strict rules and average complexity. If you have never done it, I highly recommend completing it before reading further. That way, you will have a better understanding of the problem and its decomposition.

This kata is so well known that there are lots of videos with different solutions in different paradigms. Interestingly enough, the majority of the solutions are procedural or functional. There are very few object-oriented solutions. This fact spurred me to share some possible ones. Besides, I want to make this material as useful as possible, so we are going to touch on some object-oriented design principles and even draw some diagrams.

But first, we need to talk a little bit about decomposition. In computer science, decomposition refers to the process of breaking a problem down into smaller, more comprehensible, and more manageable parts. There are different types of decomposition. We will be using functional decomposition and object-oriented decomposition. We will get to decomposition later. Before that, we need to discuss how katas can be solved.

In general, katas should be solved with the TDD technique. This means that we start by writing a failing test (red) before any production code, then we provide the simplest possible implementation to make the test pass (green), and then we refactor the code. This TDD cycle may look pretty straightforward, but it is not, especially the refactoring stage. Why so? To me, it is because refactoring can be done on very different levels. There are different ways to think about refactoring levels. One of them is to distinguish between class-level refactoring, story-level refactoring, and big-bang refactoring. So, it is very hard to decide which level of refactoring is safe, and when, where, and to what extent you are allowed to refactor during the TDD practice. For example, Sandi Metz suggests doing high-level restructuring only when all the requirements are met.
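As an illustration of the cycle, here is a minimal sketch of the first few red-green iterations for the kata's first requirement (listed in the next section). The function name `add` and the test names are assumptions made for this example; they are not taken from the repo.

```python
import unittest


def add(numbers: str) -> int:
    """Simplest implementation that makes the three tests below pass."""
    if numbers == "":
        return 0
    return sum(int(part) for part in numbers.split(","))


class AddTest(unittest.TestCase):
    # In TDD each test is written first and seen failing (red) before
    # add() is extended just enough to make it pass (green).
    def test_empty_string_returns_zero(self):
        self.assertEqual(add(""), 0)

    def test_single_number_returns_its_value(self):
        self.assertEqual(add("1"), 1)

    def test_two_numbers_return_their_sum(self):
        self.assertEqual(add("1,2"), 3)
```

The tests can be run with `python -m unittest`. The refactoring step is deliberately absent here: with so little code there is not much to restructure yet, which is exactly why the question of refactoring level only becomes interesting later in the kata.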

Beginning of the kata

The String Calculator kata consists of the following steps (requirements):

1. Create a simple String calculator with a method signature:
———————————————
int Add(string numbers)
———————————————
The method can take up to two numbers, separated by commas, and will return their sum. 
for example “” or “1” or “1,2” as inputs.
(for an empty string it will return 0) 
Hints:
——————
 - Start with the simplest test case of an empty string and move to one and two numbers
 - Remember to solve things as simply as possible so that you force yourself to write tests you did not think about
 - Remember to refactor after each passing test
———————————————————————————————
2. Allow the Add method to handle an unknown amount of numbers
————————————————————————————————
3. Allow the Add method to handle new lines between numbers (instead of commas).
   1. the following input is ok: “1\n2,3” (will equal 6)
   2. the following input is NOT ok: “1,\n” (not need to prove it - just clarifying)
——————————————————————————————-
4. Support different delimiters
    1. to change a delimiter, the beginning of the string will contain a separate line that looks like this: “//[delimiter]\n[numbers…]” for example “//;\n1;2” should return three where the default delimiter is ‘;’ .
    2. the first line is optional. all existing scenarios should still be supported
————————————————————————————————
5. Calling Add with a negative number will throw an exception “negatives not allowed” - and the negative that was passed. 
if there are multiple negatives, show all of them in the exception message.
————————————————————————————————
STOP HERE if you are a beginner. Continue if you can finish the steps so far in less than 30 minutes.
————————————————————————————————
6. Numbers bigger than 1000 should be ignored, so adding 2 + 1001 = 2
————————————————————————————————
7. Delimiters can be of any length with the following format: “//[delimiter]\n” for example: “//[***]\n1***2***3” should return 6
————————————————————————————————
8. Allow multiple delimiters like this: “//[delim1][delim2]\n” for example “//[*][%]\n1*2%3” should return 6.
————————————————————————————————
9. Make sure you can also handle multiple delimiters with length longer than one char

Now, with all this knowledge in hand, let’s start with the kata. I will just provide a general overview up to the crucial point: the 4th step. So, we start with an empty string, a string with only one number, and a string with two numbers (all part of the first step). Then, we make sure that it works for an unknown number of numbers (second step). Then, we add a new line as a delimiter (third step). Then, we add support for different delimiters (fourth step).
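A minimal procedural sketch of where the code might stand after the fourth step is shown below. The function name `add` and the exact structure are assumptions for illustration; the actual code in the repo may differ.

```python
def add(numbers: str) -> int:
    """Sum the numbers in the input string (kata steps 1 to 4)."""
    if numbers == "":
        return 0

    delimiter = ","
    if numbers.startswith("//"):
        # Optional first line (step 4): "//<delimiter>\n<numbers>".
        header, numbers = numbers.split("\n", 1)
        delimiter = header[2:]

    # Step 3: a newline between numbers behaves like the delimiter.
    parts = numbers.replace("\n", delimiter).split(delimiter)
    return sum(int(part) for part in parts)
```

For example, `add("1\n2,3")` gives 6 and `add("//;\n1;2")` gives 3. Note that the function does only two things in sequence: it parses the input and sums the result.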

Whether you skip upfront design entirely or follow the preparatory refactoring described by Martin Fowler, you should notice something interesting before implementing the 5th step. Through all the previous steps, we were working on a very straightforward algorithm with two sequential responsibilities (parse an input string and sum the result). One may say that it is already complex and has more responsibilities. To me, it does not (particularly because we weren’t supposed to check for valid input).

Preparatory refactoring is a technique that embodies Kent Beck's well-known quote: “make the change easy, then make the easy change.” In simple words, it means starting each new feature with a refactoring session that simplifies the implementation of that feature.

Starting from the 5th step, our kata becomes more sophisticated because we have to introduce a new responsibility (we have to start validating the result of the parsing process). This new responsibility adds to the complexity, and it should be put somewhere in the code (however, we can continue without restructuring the code until we implement all the requirements and only then restructure it). The decisions on how to structure the code are guided by data, behavior, responsibilities, etc. In the majority of cases, the programming paradigm dictates how the code should be structured.
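The new responsibility itself is small. A minimal sketch of it as a standalone function follows; the name `validate`, the use of `ValueError`, and the exact message format after "negatives not allowed" are assumptions for illustration.

```python
def validate(numbers: list[int]) -> None:
    """Raise an error that lists every negative number (kata step 5)."""
    negatives = [n for n in numbers if n < 0]
    if negatives:
        listed = ", ".join(str(n) for n in negatives)
        raise ValueError(f"negatives not allowed: {listed}")
```

The function works on already parsed numbers, so it depends on the output of the parsing step and not on the input string format. Where to put it in the code is the design decision discussed here.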

One important note here. Whatever the programming paradigm, programs are written for humans. The idea of writing code for humans, not machines, is truly important, and I don't think it's necessary to elaborate on it. One of the ways to make code easier to understand is to decompose it into high-level policies (what to do) and low-level implementations (how to do it). This is a very old programming principle that was widely popularized by Uncle Bob.

Responsibilities

Speaking of responsibilities, let’s identify the main ones and estimate how likely each of them is to change. After examining the kata’s requirements, I would distinguish four main responsibilities:

  • extraction of numbers (we parse strings of different format and extract numbers from them) – a high possibility of change
  • validation of numbers (when something throws an exception on unexpected data it looks like validation to me) – an average possibility of change
  • filtering of numbers (some valid numbers should be removed or filtered before the addition operation) – an average possibility of change
  • addition of numbers (at the end we are expected to get the sum of numbers) – a low possibility of change

I know that some people may object and call the first responsibility just parsing of inputs. Whatever it is named, its main responsibility is to identify the numbers in an input string, collect them, and return them. Others may disagree about the number of responsibilities because there is a fifth one: delimiter extraction. That's true, but let’s think about the responsibilities as steps. What exactly does the StringCalculator need in order to proceed with the validation of numbers?

Let’s think about the kata’s solving algorithm in high-level steps:

  • extract numbers,
  • validate numbers,
  • filter numbers,
  • perform addition.

So, does the algorithm need a delimiter to proceed further? No, it doesn’t. It just needs a collection of numbers. To me, extracting a delimiter is a subtask of the numbers extraction task. And that’s the thing. The more functionality and knowledge you expose to the higher level, the more complex the program becomes at the end. So, be careful and mindful during the decomposition process.

Do you remember that I mentioned decomposition earlier? Now is the time to decompose our code. Even though we use the object-oriented paradigm, it is possible to combine different types of decomposition. However, before decomposing the code we need to agree on the String Calculator responsibilities.

Spoiler Alert: this is the very last warning! If you have never done this kata before, do it before moving further. Otherwise, the kata won’t be so interesting and beneficial.

Solutions

I’m going to provide four different solutions (or implementations if you like). The solutions are in this GitHub repo. The repo consists of a master branch and four more branches, one for each solution. All the branches stem from master (from the 4th kata step where the important design decision was made). So, we can treat the StringCalculator as:

Important note: to make the differences between the solutions as clear as possible, the code that is responsible for each piece of functionality is going to be the same. For example, the code that separates the delimiter from the string of numbers is going to be the same in all four solutions (e.g. input.split('\n')). This approach will help us concentrate on the code structure and the separation of concerns rather than on idioms and different code implementations.
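To make the idea of collaborating entities concrete, here is a minimal sketch of one possible object-oriented decomposition along the four responsibilities identified earlier. It is not one of the four solutions from the repo: all class and method names are hypothetical, and the error message format is an assumption.

```python
import re


class NumberExtractor:
    """Parses the raw input; delimiter handling is a detail hidden in here."""

    def extract(self, text: str) -> list[int]:
        if text == "":
            return []
        delimiters = [",", "\n"]
        if text.startswith("//"):
            # Separate the optional header line from the numbers.
            header, text = text.split("\n", 1)
            # "//;" yields [";"], "//[*][%]" yields ["*", "%"].
            delimiters += re.findall(r"\[(.+?)\]", header) or [header[2:]]
        # Longest delimiters first, so "**" is tried before "*".
        delimiters.sort(key=len, reverse=True)
        pattern = "|".join(re.escape(d) for d in delimiters)
        return [int(part) for part in re.split(pattern, text)]


class NegativesValidator:
    def validate(self, numbers: list[int]) -> None:
        negatives = [n for n in numbers if n < 0]
        if negatives:
            listed = ", ".join(str(n) for n in negatives)
            raise ValueError(f"negatives not allowed: {listed}")


class BigNumbersFilter:
    def filter(self, numbers: list[int]) -> list[int]:
        # Numbers bigger than 1000 are ignored (kata step 6).
        return [n for n in numbers if n <= 1000]


class Adder:
    def add(self, numbers: list[int]) -> int:
        return sum(numbers)


class StringCalculator:
    """High-level policy: extract, validate, filter, add."""

    def __init__(self, extractor=None, validator=None,
                 number_filter=None, adder=None):
        self._extractor = extractor or NumberExtractor()
        self._validator = validator or NegativesValidator()
        self._filter = number_filter or BigNumbersFilter()
        self._adder = adder or Adder()

    def add(self, text: str) -> int:
        numbers = self._extractor.extract(text)
        self._validator.validate(numbers)
        return self._adder.add(self._filter.filter(numbers))
```

With this sketch, `StringCalculator().add("//[*][%]\n1*2%3")` gives 6. The `add` method reads like the list of high-level steps, and it never sees a delimiter: that knowledge stays inside the extractor.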

The 10th step (reflection)

I would like to introduce a 10th step to the kata: a reflection step. I don’t mean reflection in the sense of reflective programming, but a detailed and thoughtful analysis of the resulting code and its design. While writing this series of posts, I came up with a list of questions that may help you reflect on the end result. Try to answer these questions:

  • Why is the code structured in this way? What are the pros and cons of this implementation?
  • Are you happy with the resulting code? Which parts of the implementation don't you like?
  • Does the code comply with the general programming principles (Separation of Concerns, DRY, KISS)?
  • Does the code comply with the object-oriented principles (GRASP, SOLID, especially the SRP and OCP)?
  • What happens if a new requirement comes into play? For example, a new input string format?
  • What happens if the requirements change in the parsing, validation, filtering, addition operation?
  • Which parts of the implementation can be improved? How? Why is this an improvement?
  • Use AI if you are up for it.

Conclusions

As we've found out, even within one programming paradigm there are different ways to do the decomposition. The way the code is decomposed affects its maintainability, readability, understandability, testability, etc. But a decomposition does not guarantee that the code will be easy to change and maintain. Sometimes a decomposition suits a task well, sometimes it doesn't. So, to get a good decomposition you need to experiment and reflect on different implementations.
