return first example

It seems my “return first” post was not as enlightening as I had hoped. It was posted on reddit, and while the majority of commenters completely missed the point, it wasn’t really clear for those that did not just read the title. Either way, I am to blame for that – the examples and my reasoning were not very conclusive. So let me try clearing up the confusion with a better example.

First things first, here’s the mantra again: Whenever you want to call a function, ask yourself:

Can I return first?

But now to the example:

Parsing array braces

The task was to parse a string with a data-type in it. This was already working for single-value types, so we could parse "int", "double", "string" etc, via the function from_input_type. Now I was to extend it to also parse array definitions with one or two fixed dimensions, like "int[5]" or "double[4,7]".

My first attempt, implementing it as a constructor taking the definition string, looked like this:

auto suffix_begin = type_code.find('[');
if (suffix_begin == std::string::npos)
{
  this->type = from_input_type(type_code);
  return;
}

auto suffix_end = type_code.find(']', suffix_begin);
if (suffix_end == std::string::npos)
{
  throw std::invalid_argument("Malformed attribute type suffix: no end brace.");
}

auto type_tag = type_code.substr(0, suffix_begin);
this->type = from_input_type(type_tag);
auto in_brackets = type_code.substr(suffix_begin+1, suffix_end-suffix_begin-1);

auto separator = in_brackets.find(',');
if (separator == std::string::npos)
{
  this->rank = attribute_rank_t::1d;
  this->dim[0] = parse_size(in_brackets);
  return;
}
  
auto first = in_brackets.substr(0, separator);
auto second = in_brackets.substr(separator+1);

this->rank = attribute_rank_t::2d;
this->dim[0] = parse_size(first);
this->dim[1] = parse_size(second);

It’s not pretty, but it passed all the tests I set up for it. And this was pre-refactoring. I knew there was something else coming up: In a different constructor, we wanted to parse type definitions that look similar, but are not quite the same. Instead of 1 or 2 fixed dimensions, the brackets have to be empty there, e.g. "float[]" or "string[]". Note that they are still optional, it can still have single-values as well.
Now I wanted to reuse the code to locate the brackets, but the current structure wasn’t really well suited for that, with the member initialization spread all over the function. Obviously, the code parsing the contents of the brackets (from the auto separator = ... line down) was of no use for the second case, the first half is the interesting bit here. So I was looking at the calls to from_input_type in the upper half and asked myself: Can I return first, before calling this? The answer is, of course, yes.

struct type_with_brackets_t
{
  std::string_view type;
  std::string_view in_brackets;
  // There's a difference between empty brackets (e.g. string[])
  // and no brackets (e.g. string)
  bool has_brackets = false;
};

type_with_brackets_t split_type(std::string_view const& type_code)
{
  auto suffix_begin = type_code.find('[');
  if (suffix_begin == std::string::npos)
  {
    return {type_code, {}, false};
  }

  auto suffix_end = type_code.find(']', suffix_begin);
  if (suffix_end == std::string::npos)
  {
    throw std::invalid_argument("Malformed attribute type suffix: no end brace.");
  }

  auto type_tag = type_code.substr(0, suffix_begin);
  auto suffix = type_code.substr(suffix_begin+1, suffix_end-suffix_begin-1);
  return {type_tag, suffix, true};
}

With this, we can replace the upper half of the first function with:

auto [tag, in_brackets, has_brackets] = split_type(type_code);
this->type = from_input_type(tag);
if (!has_brackets)
  return;

/* continue parsing in_brackets */

The other “int[]” case can obviously be implemented very easiely now:

auto [tag, _, has_brackets] = split_type(type_code);
this->type = from_input_type(tag);
this->is_array = has_brackets;

Of course, when just extracting the code as a function, you could be tempted to also call from_input_type in that function, but return first guided us away from that. I think this is a very good outcome, as it clearly separates splitting the string and interpreting the parts, naturally eliminating the duplicated from_input_type call. You can still have a function that does both, if you want, by adding a small facade araound split_type that also does the conversion.

I hope this example cleared up the method a bit more. One reason why deeply nested function calls are so common is that most languages make it easier to pass parameters than return multiple values. You will often find that this style will require more custom data-types that are just used as function return values. But functions will naturally compose easier because you will bundle smaller pieces, e.g. in this case, you can use the function without from_input_type, and I believe that will pay off in the end.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.