Tip of the Week #120: Return Values are Untouchable

Originally posted as TotW #120 on August 16, 2012

by Samuel Benzaquen, ([email protected])

Let’s suppose you have a code snippet like below, which relies on an RAII cleanup function, which seems to be working as expected:

MyStatus DoSomething() {
  MyStatus status;
  auto log_on_error = RunWhenOutOfScope([&status] {
    if (!status.ok()) LOG(ERROR) << status;
  });
  status = DoA();
  if (!status.ok()) return status;
  status = DoB();
  if (!status.ok()) return status;
  status = DoC();
  if (!status.ok()) return status;
  return status;
}

A refactor changes the last line from return status; to return MyStatus(); and suddenly the code stopped logging errors.

What is going on?

Summary

Never access (read or write) the return variable of a function after the return statement has run. Unless you take great care to do it correctly, the behavior is unspecified.

Return variables are implicitly accessed by destructors after they have been copied or move (See Section 6.6.3 of the C++11 standard [stmt.return]), which is how this unexpected access occurs, but the copy/move may be elided, which is why behavior is undefined.

This tip only applies when you are returning a non-reference local variable. Returning any other expression will not trigger this problem.

The Problem

There are 2 different optimizations on return statements that can modify the behavior of our original code snippet: NRVO (See TotW #11) and implicit moves.

The before code worked because copy elision is taking place and the return statement is not actually doing any work. The variable status is already constructed in the return address and the cleanup object is seeing this unique instance of the MyStatus object after the return statement.

In the after code, copy elision is not being applied and the returned variable is being moved into the return value. RunWhenOutOfScope() runs after the move operation is done and it is seeing a moved-from MyStatus object.

Note that the before code was not correct either, as it relies on the copy elision optimization for correctness. We encourage you to rely on copy elision for performance (See TotW #24), but not correctness. After all, copy elision is an optional optimization and compiler options or quality of implementation of the compiler can affect whether or not it happens.

Solution

Do not touch the return variable after the return statement. Be careful of destructors of local variables doing so implicitly.

The simplest solution is to separate the function in two. One that does all the processing, and one that calls the first one and does the post-processing (ie logging on error). Eg:

MyStatus DoSomething() {
  MyStatus status;
  status = DoA();
  if (!status.ok()) return status;
  status = DoB();
  if (!status.ok()) return status;
  status = DoC();
  if (!status.ok()) return status;
  return status;
}
MyStatus DoSomethingAndLog() {
  MyStatus status = DoSomething();
  if (!status.ok()) LOG(ERROR) << status;
  return status;
}

If you are only reading the value, you could also make sure to disable the optimizations instead. That would force a copy to be made all the time and the post-processing will not see a moved-from value. E.g.:

MyStatus DoSomething() {
  MyStatus status_no_nrvo;
  // 'status' is a reference so NRVO and all associated optimizations
  // will be disabled.
  // The 'return status;' statements will always copy the object and Logger
  // will always see the correct value.
  MyStatus& status = status_no_nrvo;
  auto log_on_error = RunWhenOutOfScope([&status] {
    if (!status.ok()) LOG(ERROR) << status;
  });
  status = DoA();
  if (!status.ok()) return status;
  status = DoB();
  if (!status.ok()) return status;
  status = DoC();
  if (!status.ok()) return status;
  return status;
}

Another example:

std::string EncodeVarInt(int i) {
  std::string out;
  StringOutputStream string_output(&out);
  CodedOutputStream coded_output(&string_output);
  coded_output.WriteVarint32(i);
  return out;
}

CodedOutputStream does some work in the destructor to trim unused trailing bytes. This function can leave garbage bytes on the string if NRVO does not happen.

Note that in this case you can’t force NRVO to happen and the trick to disable it won’t help. We must modify the return value before the return statement runs.

A good solution is to add a block and restrict the function to only return after the block is finished. Like this:

std::string EncodeVarInt(int i) {
  std::string out;
  {
    StringOutputStream string_output(&out);
    CodedOutputStream coded_output(&string_output);
    coded_output.WriteVarint32(i);
  }
  // At this point the streams are destroyed and they already flushed.
  // We can safely return 'out'.
  return out;
}

Conclusion

Don’t hold on to references to variables that are being returned.

You can’t control whether NRVO happens or not. Compiler versions and options can change this from underneath you. Do not depend on it for correctness.

You can’t control whether returning a local variable triggers an implicit move or not. The type you use might be updated in the future to support move operations. Moreover, future language standards will apply implicit moves on even more situations so you can’t assume that just because it is not happening now it won’t happen in the future.


Subscribe to the Abseil Blog