Wednesday, August 28, 2019

A C++ Pitfall I was just caught by

It has been a long time since C++ last surprised me. I'm not saying my C++ code was totally bug-free, but the mistakes were usually due to some kind of carelessness and could be easily identified, understood, and fixed. Until today... So I think I should share it.

Let's say we have a vector of strings:

std::vector<std::string> tokens;

Now I iterate over each element of the vector.

for (int i = 0; i < tokens.size(); i++) {
  ...
}

The index i is useful within the body of the loop. Everything goes as expected.

Then, for some reason, I want the loop to run one more time than the number of tokens. So I change the code to:

for (int i = -1; i < tokens.size(); i++) {
  ...
}

Is this correct? No. Even worse, the failure is silent: the loop was only supposed to generate some signals alongside others, so nothing crashed. The code was experimental, with no test cases covering it yet. The problem went unnoticed until I happened to check the generated data carefully. I went back and read the code again and again. In the real code, the -1 was not a constant but the result of an expression. Then, all of a sudden, I realized the problem.

What is the problem then? Unsigned integers. std::vector::size (like the other size functions in the standard library) returns a size_t, which is an unsigned integer type, while i is a signed int. In the condition i < tokens.size(), C++ first converts i to the unsigned type before comparing the two values. So -1 becomes 0xFFFFFFFFFFFFFFFF (with a 64-bit size_t), the largest value the type can hold. The condition fails immediately and the loop body never runs.
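To make the conversion concrete, here is a minimal sketch. The function names count_iterations_buggy and count_iterations_fixed are hypothetical and not from my real code. The first function spells out the conversion the compiler applies implicitly. The second shows one possible fix, assuming the vector is small enough for its size to fit in a signed std::ptrdiff_t: convert the size to a signed type once and compare signed with signed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts iterations using the same comparison as the
// buggy loop. The static_cast makes explicit what the compiler does
// implicitly in `i < tokens.size()`: the int is converted to the unsigned
// size type, so a start of -1 turns into the largest std::size_t value.
int count_iterations_buggy(const std::vector<std::string>& tokens, int start) {
  int iterations = 0;
  for (int i = start; static_cast<std::size_t>(i) < tokens.size(); i++) {
    iterations++;
  }
  return iterations;
}

// Hypothetical helper: one possible fix. The size is converted to a signed
// type once, so the loop condition compares two signed values and -1 stays -1.
// Assumption: tokens.size() fits in std::ptrdiff_t.
int count_iterations_fixed(const std::vector<std::string>& tokens, int start) {
  const auto size = static_cast<std::ptrdiff_t>(tokens.size());
  int iterations = 0;
  for (std::ptrdiff_t i = start; i < size; i++) {
    iterations++;
  }
  return iterations;
}
```

With three tokens and a start of -1, the buggy version runs zero iterations, while the fixed version runs four. GCC and Clang can also flag the original mixed comparison through the -Wsign-compare warning, which might have caught this earlier.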

This nearly wasted several days of my time. I am beginning to miss Go's requirement of an explicit conversion in this situation.