In part one, we spent some time on the broader aspects of code review. This part will be focused on C++, offering up a code review checklist and providing a few best practices. You can read them in any order, but we’d recommend going back and reading our previous post.
The ultimate C++ review checklist
Code review checklists are never comprehensive – the number of issues, items, and potential things to check are nearly endless. Thus, making a list that covers every single potential eventuality, would be impossible to follow and harder to enforce. Instead, let’s focus on the broader aspects that we should cover during a C++ code review.
Category 1 – Requirements and understanding of the underlying domain
The then-CEO spent quite some time asking seemingly random questions, as well as some harder ones. The questions got harder and sharper, until he asked a “killer” question. Once Spolsky was able to answer it, the review got a pass from Gates. The reason? Bill Gates wanted to make sure the guy being reviewed controls the materials, and he did that by asking harder and harder questions in the domain of the review.
That’s an important part in the review. As a reviewer, you should understand the feature or bug fix that you are reviewing and be able to ask good questions on the implementation details. This is generally not the main focus of the code review (i.e., should not take significant time portion of the review, usually) – but it should be part of it.
Category 2 – Non-functional requirements
Non-functional requirements include the parts of the system that do not directly serve the product needs but allow the product to work properly (and according to law and regulations), and let us analyze issues when things aren’t working properly. Here are some key things to look for:
- Does the code have proper logging?
- Do the logs include the relevant available information? (One of the most frustrating experiences is debugging a production issue, having a log line that may assist, but lacking required information in that log line).
- Are there any other traceability requirements? (e.g., reporting to NMS via SNMP).
Privacy and security
- Make sure that the code does not log or print any sensitive information.
- Code should not keep any hard-coded password or other sensitive data.
- Do not assume any external input is valid, avoid injection threats.
(Read more on C++ security in our blog post on C++ vulnerabilities).
Category 3 – Maintainability
C++ is a complex language, so the code reviewer should think about ways to make it as readable as possible. These are just a few things you can do to make code easier to comprehend:
- Keep your functions short and single-purpose. Then make sure their name corresponds to what they actually do (we’ll discuss names soon).
- Keep your lines of code short. If necessary, break them into several lines or use helper functions.
- Keep your classes concise and follow the single responsibility principle.
- Avoid deep nesting – if you see a four levels nesting, you should probably refactor your code. Again, use helper functions for this if needed.
- Don’t get too clever. If your code may confuse others or even your future self, simplify it. Avoid code like this:
a = b += c;
When you start naming functions, follow these guidelines:
- Name your functions in a way that clearly indicates what they do.
- Functions that start with ‘get’ should be const, return a value, and as much as possible not have side effects.
- Functions starting with ‘set’ should get a value and set it into an internal variable. They may or may not return a success indicator.
- Functions starting with ‘is’ should be const, return a bool, and should not usually have side effects.
- Avoid unnecessarily long names.
- Avoid short names that do not convey enough information.
- Avoid overloading operators for operations that are unclear (e.g. what should operator* do for two Point objects? Don’t use it – write a function with a clear name).
- Avoid the name `i` for a variable that is not a loop counter.
- Avoid the name `temp` for any variable.
Coding guidelines and conventions
- Keep your classes in their corresponding cpp and h files; do not put several classes in the same file (except for nested classes).
- Go with either east-const or west-const – don’t mix-and-match.
- Use `auto` to declare variables. but avoid using `auto` for no reason on function return type and on function parameters (C++20 function parameters can be declared as `auto` but the function then becomes a template function).
- Do not use #define to declare constants.
- Do not use #define to declare macros unless there is no other way to achieve the required outcome, in which case the code should be part of your infrastructure code.
- Use `constexpr` or `constinit` for your constants and make them `static`.
- Do not declare `enum` in the global scope. Either use `enum class` or put the `enum` inside a proper class according to its context.
- Avoid `using namespace` for an entire namespace to avoid name pollution.
- Make sure to have an internal widely accepted list of your own coding guidelines and conventions.
- Code should be as simple as possible but not simpler than that.
- Go with the rule-of-zero whenever you can. Usually there is no need for user-manual memory allocations, keep those for special infrastructure classes that should sit in the infrastructure part of your project. (If you have to divert from the rule-of-zero: we will discuss the rule-of-three and the rule-of-five below).
- Use helper functions. Yes, we know we said that already above.
- Avoid cyclic dependencies between types, use interfaces instead of the real type.
- Avoid unnecessary abstractions. If you have an abstraction it should have a clear reason (future readiness for some unknown is not a good reason).
- Postpone your inheritance. Use it for small things such as state and strategy and not for big things like the actual object that needs to manage the state or strategy. A developer and a technical writer should both be just an employee, maybe with a role field.
Keeping data in proper context, private, and const
- Do not use global variables. Just don’t.
- All your data members should be private by default. You should have a good reason for having a protected data member and no reason for having a public data member.
- Your static members should also be private.
- Aim to keep things const. Make sure parameters travel as const references and functions are declared as const, unless there is a need for modifications.
- Member functions which return a data member by reference or by pointer, without const, are usually evil. Try to avoid them. If you really need to return a member by reference expose it based on an interface and not on its real type.
- Ensure logical constness of your members. The compiler ensures the physical constness. It’s your responsibility to protect your logical constness.
- If a user cannot tell from a function’s name and signature if the function may change the calling object members, something is wrong with the function name and signature.
Avoiding code duplication
- Code duplication is bad, it inflates your code creating more work for future maintenance. And it is prone for bugs like fixing or changing logic in one location while accidentally keeping old logic in other places. Moreover, code duplication is prone for copy-paste bugs where the pasted code had to be updated to fit its new usage, but the programmer forgot to update it or updated it only partially.
- Do not duplicate code even if it doesn’t look exactly the same – try to move things to a base class or into a function. It is worth the effort.
- Template is a great tool, not only for experts or for the infrastructure team, use templates to avoid code duplication.
Documenting the “why” would include:
- Why this loop must come before the next one, even though it seems odd.
- Why we didn’t do the obvious thing here (that doesn’t really work, we tried…).
Make sure you have these kinds of comments – they will save you time and save you from nasty bugs.
Category 4 – Usability and correctness
This refers to a working system with correct code, without bugs and having good performance. This should include the following:
Avoid undefined behavior at any cost and aim to avoid unspecified behavior
You should never use the argument “but it works” in C++. Any code that relies on undefined behavior is broken; any code that relies on unspecified behavior can change its behavior when recompiled (with different compilation flags, or if recompiled with another compiler).
Make sure you avoid:
- Signed integers overflow
- Incorrect type-conversions
- Accessing released memory which incidentally is still accessible
- Forgetting to implement a virtual destructor in a polymorphic hierarchy and with dynamic allocations of derived class objects.
- Other undefined behaviors (a hot topic at cppcon).
Just avoid it, even if you have to rewrite a piece of code that you are strongly attached to.
Resource management should be considered a solved issue in C++. RAII is the magic word and smart pointers are one of the tools. Just use them. And of course, use the standard library containers (or other managed containers).
- Keep yourself safe with the rule-of-zero whenever you can. Use smart pointers and proper containers.
- If for some reason you need to implement a destructor, remember the rule-of-three: implement or delete the copy constructor and the copy assignment operator. (If you just declare a default virtual destructor, do not block the copy and assignment. Instead, declare the default for all four: copy and move constructors and assignment).
- Make sure all your pointers are initialized to nullptr or to a valid address.
- Use std::unique_ptr for unique ownership, and do not pass unique_ptr by ref.
- Use std::shared_ptr for shared ownership, and prefer not to pass it by ref.
- Avoid using new and delete in user code, allocate memory with std::make_unique and std::make_shared.
- Check that calls to unique_ptr::get and shared_ptr::get do not misuse the pointer (e.g. do not pass it accidentally to another smart pointer).
- Check that calls to unique_ptr::release takes the returned pointer and pass it to another smart pointer or free it.
- Use lock guards and the RAII technique for any resource that should be released.
Dangling pointers and references, invalidated references and iterators
One of the hazards of C++ is using a dangling or invalidated reference, pointer or iterator. Make sure that:
- You do not use iterator to a vector that was taken before calling push_back (make sure to follow the invalidation rules see also here).
- You do not use any reference to a temporary (like in this Facebook Curiously Recurring C++ Bug).
Conversions, types and type safety
Conversions are tricky, especially if bypassing the compiler and when not using strong types. Bad types and wrong conversions are known to have crashed satellites, so you better be careful:
- Use the C++ casting operators (static_cast, reinterpret_cast, const_cast and dynamic_cast) instead of C-style casting. They say what you want to perform.
- Do not use const_cast to remove constness and change the variable, it is undefined behavior.
- Use strong types for type safety (see Joe Boccara’s NamedType library).
Make sure the code doesn’t ignore errors.
- If there is an `if` without an `else`, ask – what should happen in the `else`? It is not that you should always need an `else`, but asking might reveal a need you were unaware of.
- Same as above, cover all cases in switch including the default (i.e. nothing above fits).
- If something may fail, assume it will. It is okay if you do not immediately handle the error case. Just leave there a TODO comment and an error log or throw an exception if you are working with exceptions.
- Letting an exception propagate out, aborting the program, is better than swallowing exceptions. Do not swallow exceptions!
- If we have a recursive algorithm, are we protected from a stack overflow scenario? What is the maximum depth the algorithm may reach? Consider handling this case.
- In general, prefer fail-safe over crash, but prefer a crash over running with a bad state.
If your program runs with multiple threads, the question “is it thread safe” must be raised in your code review. Specifically, you should ask:
- Is data accessed from multiple threads being locked with the same lock in all threads, or handled with proper atomic operations?
- Blocking operations should preferably be called with a timeout, at least logging when the timeout occurs, for better tracing and to avoid deadlocks.
- Consider using RCU (Read, Copy, Update) to avoid the need for long locking, or for locking at all.
- Do not invent your lock-free algorithms unless this is what you do for a living. Falling at ABA problem or similar issues is too easy. Use existing algorithms and data structures.
- When using standard library containers make sure to follow their thread safety rules, see also here.
- Try to have unit tests that imitate race conditions for your critical multithreaded scenarios.
One of the reasons we write in C++ is performance – achieving lower latency and higher throughput.
- Remember that vector is the best. Document why if you choose from the rest!
- Using standard library algorithms would be better in most cases than implementing your own version (and not only for performance reasons).
- Do not invest days in pre-optimization of something that is not known to be a bottleneck. But if it is an additional hour, prefer investing a bit more time and thought to write more efficient code.
- Think of algorithmic complexity, preferring O(n) algorithm over O(n^2) is not pre-optimization.
- Make sure you do not create redundant copies by getting objects by value or by implicitly casting to temporary when getting a const ref, if the type that you send does not exactly match.
- Make sure to break out of loops when relevant. Yes, everybody knows that and yet sometimes we forget.
- If you see a three-levels nested loop ask a few questions about what’s going on there, try to investigate the algorithm a bit. Is it efficient?
- Remember that a simple call inside a loop might contain a loop.
- Use move semantics smartly:
- The rule-of-zero is again your friend, but if you are not there remember the rule-of-five and make sure to implement or declare the default move operations.
- If you implement move operations, remember to mark them as `noexcept`.
- Remember to use std::move and std::forward when needed (but do not use them on variables that are still in use or when returning a local variable).
Best practices for code review
You can have the best checklist in the world, but it won’t be of much use if you’re not following good code review practices. Some of the more technical practices of code review were discussed in part 1 of this blog, but let’s add a few important items for the code review itself:
- Review in short chunks — and give yourself plenty of time to do it. Most people would not be able to grasp more than 12 classes or more than 600 lines of code in a single meeting. Some developers find that their attention wanders after 150 lines. Figure out what works for you and your team and set your own standard.
- Don’t review for more than 90 minutes at a time. After that, take a break to clear your head before you come back and review the code again. Stare at the same code for too long, and you’ll start to miss things.
- Build and test before review. We already discussed this in part one, but it is important enough to repeat it. If you’re going to ask developers to check your code, make sure you’re making it worth their time by building and testing your code first. Code reviews are not the tool to detect bugs that could be spotted by an automated test. Ideally, you should come to your code review also with the results of your static code analysis, along with comments on any recommendations you might have ignored.
- Make sure you always conduct your code reviews before the code is merged. Code review is a gate! Do not open the gate if the code didn’t pass the review.
- Use the code review to educate and standardize your team’s understanding of what makes ‘good’ code. Everyone has different standards for code quality, so make sure you’re all going into a code review on the same page. Make sure everyone understands the reasons for any suggested change, make the suggestions justified, specific, and productive. This way you would avoid the need to comment on similar mistakes in future code reviews.
- Choose your comments carefully and give them reasonable priorities. Again, already discussed in part one and still important enough to repeat: not all comments are of the same priority, make sure to prioritize your comments and make it clear which comments require a fix before the code could be merged and which can be postponed for a later release.
- Make comments polite, thoughtful, reasoned, and direct. You’re a team, and everyone’s trying their best to write good code. Try to find a balance between being respectful and being direct and clear about any problems. Google’s guidelines for comments are always a good place to start.
Want to build code that really works? Make C++ code review part of your routine
Code review isn’t really an optional add-on for any language, but for C++ is really a must. Good C++ code review is essential if you want to make sure your code is clean, logical, efficient, and, most importantly, correct. Getting additional eye on your code is, really, the only way to find those little tweaks that take your code from ‘works on my machine’ to ‘ready for production.’
- The C++ Core Guidelines
- Rob Pike’s (one of the creators of the Go language) set of advices on programming in C (yes, C, but it is mostly relevant also for C++). Take a special look at the parts discussing comments and procedure names, but the entire thing is a real pearl.
- 13 principles for writing readable code
- An Abbreviated C++ Code Inspection Checklist
- And another Code review check list