Google C++ style guide

Sat Oct 3 17:52:34 PDT 2009

I have found this page linked from Reddit (click "Toggle all summaries" at the top to read the full page):
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml

At Google, C++ isn't the most used language, so it may be better to use a C++ style guide from a firm that uses C++ more than Google does. On the other hand, Google has hired many good programmers, and probably some of them have strong C++ experience, so if you are interested in C++/D this style guide is worth reading.

This guide is mostly (as often happens with C++) a list of forbidden features, I think usually to reduce the total bug count of the programs. Some of these limits make me a little nervous, and I'd like to remove or relax some of them, but I am ignorant regarding C++, while the people who wrote this document are experts, so their judgement has weight.

They forbid several features that are present in D too. Does it mean D has to drop such features (or make them less "natural", so the syntax discourages their use)?

Here are a few things from that document that I find somewhat interesting. Some of them may be added to a D style guide, or they may even suggest changes in the language itself.

-------------------

>Function Parameter Ordering: When defining a function, parameter order is: inputs, then outputs.<

D may even enforce this, allowing "out" only after "in" arguments.

-------------------

>Nested Classes: Do not make nested classes public unless they are actually part of the interface, e.g., a class that holds a set of options for some method.<

-------------------

>Static and Global Variables: Static or global variables of class type are forbidden: they cause hard-to-find bugs due to indeterminate order of construction and destruction. [...] The order in which class constructors, destructors, and initializers for static variables are called is only partially specified in C++ and can even change from build to build, which can cause bugs that are difficult to find. [...] As a result we only allow static variables to contain POD data.<

I think D avoids this problem.

-------------------

>Doing Work in Constructors: Do only trivial initialization in a constructor. If at all possible, use an Init() method for non-trivial initialization. [...] If the work calls virtual functions, these calls will not get dispatched to the subclass implementations. Future modification to your class can quietly introduce this problem even if your class is not currently subclassed, causing much confusion.<
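
A minimal sketch of the dispatch problem described in the quote, assuming C++11 or later. The Base and Derived classes and their members are hypothetical, invented only for this illustration:

```cpp
#include <string>

// Hypothetical classes, used only to illustrate the quoted warning.
struct Base {
    std::string name_seen_in_ctor;
    std::string name_seen_in_init;

    Base() {
        // While Base() runs, the dynamic type of the object is still Base,
        // so this call resolves to Base::Name(), never to Derived::Name().
        name_seen_in_ctor = Name();
    }
    virtual ~Base() = default;

    virtual std::string Name() const { return "Base"; }

    // Two-phase alternative in the spirit of the guide's Init() advice:
    // it runs after construction, so virtual dispatch works normally.
    void Init() { name_seen_in_init = Name(); }
};

struct Derived : Base {
    std::string Name() const override { return "Derived"; }
};
```

After "Derived d; d.Init();" the first member holds "Base" and the second holds "Derived", which is the quiet surprise the guide warns about.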

-------------------

>Declaration Order: Use the specified order of declarations within a class: public: before private:, methods before data members (variables), etc.<

D may even enforce such order (Pascal does something similar).

-------------------

>Reference Arguments: All parameters passed by reference must be labeled const.<

>In fact it is a very strong convention in Google code that input arguments are values or const references while output arguments are pointers. Input parameters may be const pointers, but we never allow non-const reference parameters.<

I think C# solves part of this problem by forcing the programmer to write "ref" before the variable name at the call site too. D may do the same.
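
A small sketch of what the Google convention buys at the call site. Both functions are hypothetical, not taken from the guide:

```cpp
// Hypothetical functions, for illustration only.

// Non-const reference: the call "IncrementRef(n)" looks like pass-by-value,
// so a reader cannot tell from the call site that n is modified.
inline void IncrementRef(int& value) { ++value; }

// Pointer output parameter (the convention quoted above): the caller has to
// write "IncrementPtr(&n)", and the & marks n as a possible output.
inline void IncrementPtr(int* value) { ++*value; }
```

The & at the call site plays roughly the role that a mandatory call-site "ref" would play, with the drawback that the pointer may be null.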

-------------------

>Function Overloading: Use overloaded functions (including constructors) only in cases where input can be specified in different types that contain the same information.<

>Cons: One reason to minimize function overloading is that overloading can make it hard to tell which function is being called at a particular call site. Another one is that most people are confused by the semantics of inheritance if a deriving class overrides only some of the variants of a function.<

>Decision: If you want to overload a function, consider qualifying the name with some information about the arguments, e.g., AppendString(), AppendInt() rather than just Append().<

This is a strong limitation: overloading is one of the things that make C++ handier than C. I accept it for normal code, but I refuse it for "library code", which is designed to be more flexible and reusable, to make syntax simpler, etc.
So I want D to keep overloaded functions.
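
A sketch of the inheritance confusion mentioned in the "Cons" quote, assuming C++11 or later. Buffer and LoggingBuffer are hypothetical names:

```cpp
#include <string>

// Hypothetical base class with two overloads of Append.
struct Buffer {
    virtual ~Buffer() = default;
    virtual std::string Append(int) { return "Buffer::Append(int)"; }
    virtual std::string Append(const std::string&) {
        return "Buffer::Append(string)";
    }
};

// Overrides only one of the two variants.
struct LoggingBuffer : Buffer {
    // This declaration hides Buffer::Append(int) for calls made directly on
    // a LoggingBuffer: "lb.Append(1)" does not compile unless the class also
    // says "using Buffer::Append;". Through a Buffer& both overloads work.
    std::string Append(const std::string&) override {
        return "LoggingBuffer::Append(string)";
    }
};
```

Names like AppendInt() and AppendString() avoid this hiding rule entirely, at the cost of a less uniform interface.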

-------------------

>Default Arguments: We do not allow default function parameters.<

>Cons: People often figure out how to use an API by looking at existing code that uses it. Default parameters are more difficult to maintain because copy-and-paste from previous code may not reveal all the parameters. Copy-and-pasting of code segments can cause major problems when the default arguments are not appropriate for the new code.<

>Decision: We require all arguments to be explicitly specified, to force programmers to consider the API and the values they are passing for each argument rather than silently accepting defaults they may not be aware of.<

This too is a strong limitation. I understand that default arguments may make life a little more complex, but they are handy. So I think their usage has to be limited, but I wouldn't like to forbid them totally.
"Forcing the programmers to consider the API" also has some negative side effects that they seem to ignore. So I want D to keep its default function parameters feature.

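A sketch of the trade-off, using a hypothetical Join() helper that is not from the guide. The default makes the common call short, but a call copied from elsewhere does not show that a separator parameter exists:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins parts with a separator that defaults to ", ".
std::string Join(const std::vector<std::string>& parts,
                 const std::string& separator = ", ") {
    std::string result;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) result += separator;  // separator only between elements
        result += parts[i];
    }
    return result;
}
```

"Join(parts)" silently uses ", " while "Join(parts, "/")" states the choice explicitly; the guide would require the second form everywhere.
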
-------------------

>Variable-Length Arrays and alloca(): We do not allow variable-length arrays or alloca().<

>Cons: Variable-length arrays and alloca [...] allocate a data-dependent amount of stack space that can trigger difficult-to-find memory overwriting bugs: "It ran fine on my machine, but dies mysteriously in production".<

>Decision:  Use a safe allocator instead, such as scoped_ptr/scoped_array.<

After reading this page:
http://www.boost.org/doc/libs/1_40_0/libs/smart_ptr/scoped_array.htm
I think scoped_ptr/scoped_array are just pointers to heap-allocated memory that gets deallocated when the scope ends.

In 99.5% of the cases a heap allocation is good enough in D (especially if the GC gets better). But once in a while speed is more important, so for very small arrays I'd like to have variable-length arrays in D (allocating large arrays on the stack is always bad in production code).
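
A sketch of the heap-backed alternative the guide asks for. It assumes C++11 and uses std::unique_ptr<T[]> as a stand-in for boost::scoped_array, since both free the array when the scope ends. SumOfSquares is a hypothetical function:

```cpp
#include <cstddef>
#include <memory>

// Hypothetical function that needs a scratch buffer of run-time size n.
// A variable-length array ("long long buf[n];") would take the space from
// the stack; here the buffer lives on the heap instead.
long long SumOfSquares(std::size_t n) {
    std::unique_ptr<long long[]> buf(new long long[n]);
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<long long>(i * i);
    }
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += buf[i];
    }
    return total;  // buf is released here, as with scoped_array
}
```

The price is one heap allocation per call, which is the cost the paragraph above would like to avoid for very small arrays.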

-------------------

>Run-Time Type Information (RTTI): We do not use Run Time Type Information (RTTI).<

>If you find yourself in need of writing code that behaves differently based on the class of an object, consider one of the alternatives to querying the type. Virtual methods are the preferred way of executing different code paths depending on a specific subclass type. This puts the work within the object itself. If the work belongs outside the object and instead in some processing code, consider a double-dispatch solution, such as the Visitor design pattern. This allows a facility outside the object itself to determine the type of class using the built-in type system. If you think you truly cannot use those ideas, you may use RTTI. But think twice about it. :-) Then think twice again. Do not hand-implement an RTTI-like workaround. The arguments against RTTI apply just as much to workarounds like class hierarchies with type tags. <

I think this is acceptable in most situations. On the other hand I'd like D to have better-implemented reflection (within the bounds of what a static compiler can do, even if future D implementations may run on a VM, like a future alternative LDC), which can be useful in unit testing.

I am not sure about this, I don't use RTTI a lot in D code.
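
A sketch contrasting the two approaches from the quote, assuming C++11 or later. The Shape hierarchy is hypothetical:

```cpp
// Hypothetical class hierarchy, for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual double Area() const = 0;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double Area() const override { return side * side; }
};

struct Rect : Shape {
    double width, height;
    Rect(double w, double h) : width(w), height(h) {}
    double Area() const override { return width * height; }
};

// RTTI style: the caller inspects the dynamic type. Every new subclass
// needs another branch here, and a forgotten one falls through silently.
double AreaViaRtti(const Shape& s) {
    if (const Square* sq = dynamic_cast<const Square*>(&s)) {
        return sq->side * sq->side;
    }
    if (const Rect* r = dynamic_cast<const Rect*>(&s)) {
        return r->width * r->height;
    }
    return 0.0;  // unknown subclass
}

// Preferred style: the work stays inside the object.
double AreaViaVirtual(const Shape& s) { return s.Area(); }
```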

-------------------

>Casting: Use C++ casts like static_cast<>(). Do not use other cast formats like int y = (int)x; or int y = int(x);.<

>Pros: The problem with C casts is the ambiguity of the operation; sometimes you are doing a conversion (e.g., (int)3.5) and sometimes you are doing a cast (e.g., (int)"hello"); C++ casts avoid this. Additionally C++ casts are more visible when searching for them.<

>Do not use C-style casts. Instead, use these C++-style casts.
* Use static_cast as the equivalent of a C-style cast that does value conversion, or when you need to explicitly up-cast a pointer from a class to its superclass.
* Use const_cast to remove the const qualifier (see const).
* Use reinterpret_cast to do unsafe conversions of pointer types to and from integer and other pointer types. Use this only if you know what you are doing and you understand the aliasing issues.
* Do not use dynamic_cast except in test code. If you need to know type information at runtime in this way outside of a unittest, you probably have a design flaw.<

I agree with them that mixing all the different kinds of cast, as D does, is bad. In D I'd like to state what I'm doing in a more precise way. This is something that can be improved in D.
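
A sketch of the distinction the guide draws between a conversion and a reinterpretation. The two function names are hypothetical, and std::uintptr_t is assumed to be available (it is an optional type, though common):

```cpp
#include <cstdint>

// Value conversion: the number is converted and the fraction is dropped.
// The C-style spelling would be (int)x.
int Truncate(double x) { return static_cast<int>(x); }

// Reinterpretation: the pointer value itself is viewed as an integer.
// A C-style cast would use the same (type)expr syntax for this very
// different operation.
std::uintptr_t AddressBits(const char* s) {
    return reinterpret_cast<std::uintptr_t>(s);
}
```

Searching the source for "reinterpret_cast" finds every dangerous cast, which is not possible with the C syntax.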

-------------------

Integer Types:

>You should not use the unsigned integer types such as uint32_t, unless the quantity you are representing is really a bit pattern rather than a number, or unless you need defined twos-complement overflow. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this.<

I'm for the removal of size_t from everywhere it's not strictly necessary (so for example from array lengths) to avoid bugs.

See also the recent thread about signed-unsigned issues:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=17800

Integer overflow tests will help too.
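
A sketch of the kind of bug an unsigned length invites, shown in C++ because the same arithmetic applies to size_t there. Both function names are hypothetical:

```cpp
#include <cstddef>
#include <vector>

// Unsigned arithmetic: for an empty vector, size() - 1 wraps around to the
// largest size_t value instead of giving -1.
std::size_t LastIndexUnsigned(const std::vector<int>& v) {
    return v.size() - 1;
}

// Signed arithmetic: an empty vector gives -1, which a later check or
// assertion can catch.
std::ptrdiff_t LastIndexSigned(const std::vector<int>& v) {
    return static_cast<std::ptrdiff_t>(v.size()) - 1;
}
```

A loop such as "for (i = 0; i <= LastIndexUnsigned(v); ++i)" would run far past the end of an empty vector, while the signed version would not execute at all.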

-------------------

Boost:

>Cons: Some Boost libraries encourage coding practices which can hamper readability, such as metaprogramming and other advanced template techniques, and an excessively "functional" style of programming.<

Advanced use of templates makes the code less easy to understand. But sometimes functional style makes code shorter, more readable, safer multiprocessing-wise, sometimes even parallelizable, etc.

-------------------

Type Names: often I don't like the C++ practice of using a single uppercase letter for a template type, like T. Better to give a meaningful name to types, when possible.

-------------------

>Class Data Members: Data members (also called instance variables or member variables) are lowercase with optional underscores like regular variable names, but always end with a trailing underscore.<

D may even enforce some simple syntax for class members, like that underscore or something else. No other variable is allowed to share the same syntax (so this syntax is used iff it's a class member). It makes conversions from other languages a little more work, but I think it will pay off.

-------------------

>Regular Functions: Functions should start with a capital letter and have a capital letter for each new word. No underscores:<

That's ugly.

-------------------

>Spaces vs. Tabs: Use only spaces, and indent 2 spaces at a time.<

4 spaces are more readable :-)

-------------------

Pointer and Reference Expressions:

// These are fine, space following.
char* c;    // but remember to do "char* c, *d, *e, ...;"!

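That's good in D, where the * belongs to the type, but bad in C/C++, where it binds to each declared name (so "char* c, d;" makes d a plain char). They are wrong here.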
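
A compile-time check of the C/C++ rule, assuming C++11 for decltype and static_assert. The variable names are hypothetical:

```cpp
#include <type_traits>

// The "char*" spelling suggests that both names are pointers, but in C and
// C++ the * applies only to the first declarator.
char* first, second;

static_assert(std::is_same<decltype(first), char*>::value,
              "first is a pointer to char");
static_assert(std::is_same<decltype(second), char>::value,
              "second is a plain char, not a pointer");
```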

-------------------

>Class Format: Sections in public, protected and private order, each indented one space.<

There are no good solutions to this. I use 4 spaces for them too.

-------------------

Loops and Conditionals:

for ( ; i < 5 ; ++i) {  // For loops always have a space after the
  ...                   // semicolon, and may have a space before the
                        // semicolon.

That space before the ; is quite important. But I don't think there's a need for a warning if it's absent.

-------------------

Bye,
bearophile