\\ This page is based on the excellent books of Scott Meyers : __Effective C++ (50 specific ways to improve your programs and designs)__ and __More effective C++ (35 new ways to improve your programs and designs)__.
====== Effective C++ ======
===== Shifting from C to C++ =====
=== Item 1 : Use const and inline instead of #define ===
Because ''#define'' constants don't appear in the symbol table, which is annoying for debugging, use constant variables instead (they are just as fast) :
const float ASPECT_RATIO = 1.653;
And because macros have a lot of traps, use inline functions (templated if necessary) :
template<class T> inline const T& MAX(const T& a, const T& b) { return a > b ? a : b; }
Main traps of macros are :
* if you forget to put parentheses around every use of the arguments, and around the whole expression :
#define sqr(x) x*x // wrong way
sqr(1+2) == 1+2*1+2 == 1+2+2 == 5 != 9
#define sqr(x) ((x)*(x)) // right way
* if you call a function as a macro parameter (''sqr(s.toInt())''), the function is called twice
Inline functions have none of these traps, and are just as fast.
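The double-evaluation trap can be checked with a small sketch. The names ''SQR_MACRO'', ''square'', ''nextValue'' and ''callCount'' are made up for this illustration and are not from the book :

```cpp
// Hypothetical macro, correctly parenthesized but still evaluating x twice.
#define SQR_MACRO(x) ((x) * (x))

// Inline template alternative: the argument is evaluated exactly once.
template<class T>
inline T square(const T& x) { return x * x; }

// Counter used to observe how many times nextValue() is called.
static int callCount = 0;

int nextValue() { ++callCount; return 3; }

// Number of calls to nextValue() made while evaluating the macro version.
int callsWithMacro() {
    callCount = 0;
    int r = SQR_MACRO(nextValue());  // expands to ((nextValue()) * (nextValue()))
    (void) r;
    return callCount;
}

// Number of calls to nextValue() made while evaluating the inline version.
int callsWithInline() {
    callCount = 0;
    int r = square(nextValue());
    (void) r;
    return callCount;
}
```

With the macro the function runs twice, with the inline template only once.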
=== Item 2 : Prefer iostream to stdio.h ===
Because of type safety and extensibility.
To define stream operators in your class :
friend ostream& operator<<(ostream& s, const ComplexInt& c);
ostream& operator<<(ostream& s, const ComplexInt& c) { s << c.r << " " << c.i; return s; }
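A slightly fuller sketch of the same idea, assuming ''ComplexInt'' holds two ''int'' members ''r'' and ''i'' (the constructor and the ''toString'' helper are hypothetical additions for the example) :

```cpp
#include <ostream>
#include <sstream>
#include <string>

class ComplexInt {
public:
    ComplexInt(int real, int imag) : r(real), i(imag) {}
    // The operator is a friend so that it can read the private members.
    friend std::ostream& operator<<(std::ostream& s, const ComplexInt& c);
private:
    int r;
    int i;
};

// Non-member operator: the stream is the left operand, as convention requires.
std::ostream& operator<<(std::ostream& s, const ComplexInt& c) {
    s << c.r << " " << c.i;
    return s;  // returning the stream allows chaining: out << a << b
}

// Hypothetical helper that formats a ComplexInt into a string.
std::string toString(const ComplexInt& c) {
    std::ostringstream out;
    out << c;
    return out.str();
}
```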
=== Item 3 : Use new and delete instead of malloc and free ===
Because ''malloc'' and ''free'' don't call constructors and destructors !
=== Item 4 : Prefer C++-style comments ===
Because you can't embed C-style comments in other comments !
===== Memory management =====
=== Item 5 : Use the same form in corresponding calls to new and delete ===
''new'' with ''delete'', and ''new[]'' with ''delete[]'' ! Mixing the two forms is undefined behaviour (typically ''delete'' on memory from ''new[]'' calls only one destructor, so the other objects leak their resources).
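A minimal sketch of the matching forms, using a hypothetical ''Tracked'' type that counts its destructor calls :

```cpp
#include <cstddef>

// Hypothetical type that counts how many destructors have run.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// new[] paired with delete[]: every element is destroyed.
int destroyArray(std::size_t n) {
    Tracked::destroyed = 0;
    Tracked* items = new Tracked[n];
    delete[] items;
    return Tracked::destroyed;
}

// new paired with delete: the single object is destroyed.
int destroySingle() {
    Tracked::destroyed = 0;
    Tracked* item = new Tracked;
    delete item;
    return Tracked::destroyed;
}
```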
=== Item 6 : Call delete on pointer members in destructors ===
* Initialize all pointer members in each of the constructors (to ''NULL'' if no memory is allocated)
* In the assignment operator, delete the existing memory of each pointer member and assign new memory
* Delete the memory in the destructor
=== Item 7 : Check the return value of new ===
Set an error-handling function, globally :
void noMoreMemory() { cerr << "Unable to satisfy request for memory\n"; abort(); }
int main() { set_new_handler(noMoreMemory); ... }
Or for a specific class :
typedef void (*PEHF)(); // PEHF = pointer to error handling function
class X
{
private:
static PEHF currentPEHF;
public:
static PEHF set_new_handler(PEHF p);
void * operator new(size_t size);
};
PEHF X::currentPEHF; //sets currentPEHF to 0 by default
PEHF X::set_new_handler(PEHF p)
{
PEHF oldPEHF = currentPEHF;
currentPEHF = p;
return oldPEHF;
}
void * X::operator new(size_t size)
{
PEHF currentHandler = ::set_new_handler(currentPEHF);
void *memory = ::new char[size];
::set_new_handler(currentHandler);
return memory;
}
=== Item 8 : Adhere to convention when writing new ===
This means returning the right value (a pointer, or 0), and calling an error-handling function when insufficient memory is available :
void * operator new(size_t size) // your operator new could take additional params
{
while(true)
{
// HERE attempt to allocate size bytes
if (the allocation was successful)
return (a pointer to the memory);
PEHF currentHandler = set_new_handler(0); // get the current ...
set_new_handler(currentHandler); // ... error handling function
if (currentHandler) (*currentHandler)(); else return 0;
}
}
If the operator new belongs to a class X, add this before attempting to allocate memory :
if (size != sizeof(X)) return ::new char[size];
This case occurs when a class inherits from X without redefining operator new.
=== Item 9 : Avoid hiding the global new ===
If you add parameters to your ''new'' redefinition, it blocks access to the usual form of ''new'', so also rewrite the classical form :
class X
{
void * operator new(size_t size, PEHF pehf);
void * operator new(size_t size) { return ::new char[size]; }
};
void specialErrorHandler();
X *px1 = new(specialErrorHandler) X;
X *px2 = new X; // doesn't work if you don't define new(size_t)
=== Item 10 : Write delete if you write new ===
You can rewrite ''new'' to allocate small objects in a large memory zone (in order to speed up allocations and save memory), but you also have to write ''delete'' !
class Airplane
{
private:
Airplane *rep;
static Airplane *headOfFreeList;
public:
void * operator new(size_t size);
void operator delete(void *deadObject, size_t size);
};
Airplane *Airplane::headOfFreeList; // initialized to 0 by default
void * Airplane::operator new(size_t size)
{
if (size != sizeof(Airplane)) return ::new char[size];
Airplane *p = headOfFreeList;
if (p) headOfFreeList = (Airplane*) p->rep;
else
{
Airplane *newBlock = (Airplane*) ::new char[256 * sizeof(Airplane)]; // don't call constructor !
if (newBlock == 0) return 0;
for(int i = 0; i < 255; i++) // link the memory chunks together
newBlock[i].rep = (Airplane*) &newBlock[i+1];
newBlock[255].rep = 0;
p = newBlock;
headOfFreeList = &newBlock[1];
}
return p;
}
void Airplane::operator delete(void *deadObject, size_t size)
{
if (size != sizeof(Airplane)) { ::delete[] ((char*) deadObject); return; }
Airplane *carcass = (Airplane*) deadObject;
carcass->rep = headOfFreeList;
headOfFreeList = carcass;
}
===== Constructors, Destructors, and Assignment operators =====
=== Item 11 : Define a copy constructor and an assignment operator for classes with dynamically allocated memory ===
Because if you don't, the compiler will use the default ones, which copy member by member. Thus pointers are copied, not the memory they point to, and if one object frees that memory, the other one is left with a dangling pointer ...
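A minimal sketch of a class that owns dynamically allocated memory and copies it deeply. ''IntArray'' and its members are hypothetical names chosen for this example ; the assignment operator also checks for self-assignment and returns a reference to ''*this'', in the spirit of the items on ''operator='' in this chapter :

```cpp
#include <algorithm>
#include <cstddef>

class IntArray {
public:
    explicit IntArray(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Copy constructor: allocate fresh memory and copy the elements.
    IntArray(const IntArray& rhs) : size_(rhs.size_), data_(new int[rhs.size_]) {
        std::copy(rhs.data_, rhs.data_ + rhs.size_, data_);
    }

    // Assignment: guard against self-assignment, then replace the storage.
    IntArray& operator=(const IntArray& rhs) {
        if (this == &rhs) return *this;
        int* fresh = new int[rhs.size_];  // allocate before releasing the old block
        std::copy(rhs.data_, rhs.data_ + rhs.size_, fresh);
        delete[] data_;
        data_ = fresh;
        size_ = rhs.size_;
        return *this;                     // allows a = b = c
    }

    ~IntArray() { delete[] data_; }

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;  // declared first, so initialized first
    int* data_;
};
```

After a copy or an assignment, the two objects own separate blocks, so modifying or destroying one leaves the other intact.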
=== Item 12 : Prefer initialization to assignment in constructors ===
You have no choice for const or reference members, but in general initialization is also more efficient (constructors are always called once for each member, so with initialization a member is built directly with its final value, instead of being default-constructed and then assigned).
=== Item 13 : List members in an initialization list in the order in which they are declared ===
Because it is always the order of the declarations which is followed by the compiler,
and never the order of the initialization list,
so you can fall into traps if you don't have the same order (eg you initialize ''size'', then initialize data with ''new int[size]'', but if ''data'' is declared before ''size'' the ''new'' will be done before the initialization of ''size''...)
=== Item 14 : Make destructors virtual in base classes ===
Because if you don't, when you create a derived object, store it in a pointer to the base class, and then delete it, the destructor of the derived class won't be called !
But do it only if it is a base class (you plan to inherit from it), which generally corresponds to the class having at least one virtual function, because you lose a little performance.
Tip : if you want to have an abstract class, but you don't have any function to make pure virtual, declare a pure virtual destructor ! (you nevertheless have to provide a definition for this pure virtual destructor, and that definition must be written outside the class body).
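A short sketch of deletion through a base class pointer. The classes ''Shape'' and ''Square'' and the flag ''derivedDestroyed'' are hypothetical names for this illustration :

```cpp
// Flag set by the derived destructor so that the call can be observed.
static bool derivedDestroyed = false;

class Shape {
public:
    virtual ~Shape() {}               // virtual: delete through Shape* is safe
    virtual double area() const = 0;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    ~Square() { derivedDestroyed = true; }
    double area() const { return side_ * side_; }
private:
    double side_;
};

// Creates a Square, deletes it through a Shape*, and reports whether
// the Square destructor ran.
bool deleteThroughBase() {
    derivedDestroyed = false;
    Shape* s = new Square(2.0);
    delete s;
    return derivedDestroyed;
}
```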
=== Item 15 : Have operator= return a reference to *this ===
To allow chained assignments such as ''a=b=c=0'' :
C& C::operator=(const C&);
=== Item 16 : Assign to all data members in operator= ===
Because if you write ''operator='' yourself, the compiler no longer assigns the members you forget. Moreover, in a derived class you also have to assign the data members of the base class :
((A&) *this) = rhs;
(if the base class doesn't declare its own = operator)
A::operator=(rhs);
(if it does)
=== Item 17 : Check for assignment to self in operator= ===
That is to say, always begin with :
if (this == &rhs) return *this;
Not only because it saves time, but above all because assigning an object to itself would otherwise free resources that are still needed before reallocating them !
But be careful : this test may fail with multiple inheritance, because the same object can have different addresses depending on the type it is cast to. If so, you could add a unique identifier to each object.
===== Classes and Functions : Design and Declaration =====
=== Item 18 : Strive for class interfaces that are complete and minimal ===
Because clients can then do whatever they need to do, while the class remains easy to learn (no confusion), easier to maintain, and faster to compile. In brief, think carefully about whether the convenience of a new function justifies the additional costs.
=== Item 19 : Differentiate among member functions, global functions, and friend functions ===
When in doubt, try to be object-oriented (overloading operators etc).
But consider, for example, a Fraction class for which you want to implement multiplication. If you overload ''operator*'' as a member, ''x*2'' will work, because the compiler knows how to convert ''2'' into a Fraction thanks to your constructor, but ''2*x'' will fail because implicit conversions are applied only to parameters, never to the object on which the member function is invoked. So if you want a commutative multiplication, you have to declare ''operator*'' as a global function taking two fractions. And if it needs to access non-public members, you have to declare it as a friend of Fraction.
In the same way, stream operators are always global functions, because if they were members the object would have to be on the left, like ''s >> cin'', which is contrary to the convention and would confuse everyone.
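The Fraction case can be sketched as follows. The member names and accessors are assumptions made for the example ; here the operator only needs the public accessors, so it does not have to be a friend :

```cpp
class Fraction {
public:
    // Non-explicit constructor: lets the compiler convert an int to a Fraction.
    Fraction(int num = 0, int den = 1) : num_(num), den_(den) {}
    int numerator() const { return num_; }
    int denominator() const { return den_; }
private:
    int num_;
    int den_;
};

// Global operator: both operands are parameters, so both can be converted.
inline Fraction operator*(const Fraction& lhs, const Fraction& rhs) {
    return Fraction(lhs.numerator() * rhs.numerator(),
                    lhs.denominator() * rhs.denominator());
}
```

Both ''x * 2'' and ''2 * x'' compile with this version, whereas a member ''operator*'' would accept only the first.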
=== Item 20 : Avoid data members in the public interface ===
Because :
* it is simpler for the user (all members are functions, so there is no need to remember whether to use parentheses),
* it gives you control over the accessibility of the members,
* it gives you functional abstraction (you can later replace a data member with a computation, for example).
=== Item 21 : Use const whenever possible ===
You can use it outside of classes for :
* global constants
* static objects (local to either a file or a block)
Inside classes for :
* static data members
* non static data members
With pointers :
char *p = "Hello"; // non-const pointer, non-const data
const char *p = "Hello"; // non-const pointer, const data
char * const p = "Hello"; // const pointer, non-const data
const char * const p = "Hello"; // const pointer, const data
And in function declarations :
* individual parameters
* return value
* function as a whole for member functions (doesn't modify the object, excluding static members : can be invoked on a ''const'' object). Functions differing only in their constness can be overloaded.
C++ makes a bitwise test for const member functions : it checks that the function doesn't modify any member. But this is not perfect, because a const object can still be modified, eg if a const function returns a pointer to a member (the member can then be modified from outside).
Moreover, you may sometimes want to modify a member in a function that is const in the sense that the modifications are undetectable by a client (eg if you want to cache the length of a string, you don't modify the string so the function is conceptually const, but you modify the cache variable so the compiler won't accept it). Fortunately, you can cast away constness (''this'' is viewed as ''C * const this;'' in non-const member functions, and as ''const C * const this;'' in const member functions) :
unsigned String::length() const
{
String * const localThis = (String * const) this;
localThis->dataLength = ...;
...
}
=== Item 22 : Pass and return objects by reference instead of by value ===
Problems of passing and returning by value :
* it can cost a lot of time because copy constructors of the object and all sub-objects are called for each copy
* the //slicing problem// can be dangerous : if you have a base class with a virtual function and a derived class, and you pass a derived object to a function that takes a base object by value, the copy loses all the overridden virtual functions of the derived object, because it is the copy constructor of the base class which is called !
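The slicing problem can be observed with a small sketch (the class and function names are hypothetical) :

```cpp
#include <string>

class Base {
public:
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

class Derived : public Base {
public:
    std::string name() const { return "Derived"; }
};

// By value: the argument is copied with Base's copy constructor (sliced).
std::string nameByValue(Base b) { return b.name(); }

// By reference to const: no copy, the dynamic type is preserved.
std::string nameByReference(const Base& b) { return b.name(); }
```

Passing a ''Derived'' object to ''nameByValue'' yields "Base", while ''nameByReference'' yields "Derived".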
=== Item 23 : Don't try to return a reference when you must return an object ===
If you have to return a new object (eg for operator+), you must return it by value. Indeed this new object would be created either on the stack, in which case it disappears when the function returns (the reference dangles, which is undefined behaviour and often a segfault), or on the heap (with ''new''), but then nobody knows who should delete it and there will be memory leaks.
=== Item 24 : Choose carefully between function overloading and parameter defaulting ===
Both allow calling a function with a different number of arguments. Parameter defaulting (default values) makes it easier to avoid code duplication. But you can only use it if :
* a reasonable default value exists (and if a generic algorithm exists and is no less efficient than what you could have done knowing the number of arguments), or
* you can use a //magic number// which tells that the parameter is not set.
To cope with code duplication when overloading, you can either call another overloaded function, or create a common underlying private member function that does the common work.
=== Item 25 : Avoid overloading on a pointer and a numerical type ===
ie avoid :
void f(int x);
void f(char *p);
Because there is no ambiguity for the compiler when calling ''f(0)'' (it will call ''f(int)'' because 0 is an int), but there is an ambiguity for the programmer. Indeed ''f(NULL)'' will call ''f(int)'' also because ''NULL'' is most often defined by ''#define NULL 0'', and there is no way to define it as a pointer which works with all pointer types without need for cast.
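A minimal sketch of the surprise, with the two overloads returning a label so that the chosen one can be seen (the return values are an addition for this example) :

```cpp
#include <string>

// The two overloads the item advises against having side by side.
std::string f(int) { return "f(int)"; }
std::string f(char*) { return "f(char*)"; }
```

''f(0)'' selects ''f(int)'', and only an explicit cast such as ''f(static_cast<char*>(0))'' reaches ''f(char*)''. Since C++11, the keyword ''nullptr'' has a pointer type of its own, so ''f(nullptr)'' would select the pointer overload.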
=== Item 26 : Guard against potential ambiguity ===
This code is correct and compiles :
class A {
public:
A(const class B&); // constructor from B
};
class B {
public:
operator A() const; // cast operator to A
};
But it can create ambiguities :
void g(const A&);
B b;
g(b); // error ! - ambiguous : calls constructor from B or cast to A ?
The problem with ambiguities is that you can miss them at first, and if it is the client who runs into one without having the source code, there may be no way around the problem (note : maybe ''g(A(b))'', but it will cost a constructor call).
This can also happen with ''f(int)'' and ''f(char)'' called with a ''double'', or with multiple inheritance :
class Base1 {
public:
int doIt();
};
class Base2 {
public:
void doIt();
};
class Derived : public Base1, public Base2 {} d;
d.doIt(); // error - ambiguous
You have to explicitly specify which one of the functions you want to call : ''d.Base1::doIt()''. Even modifying accessibility (one private and the other one public) won't change anything, and that's normal because otherwise changing accessibility would change the function which is called by the same code, and that's very bad !
=== Item 27 : Explicitly disallow use of implicitly generated member functions you don't want ===
The compiler automatically generates some functions, such as ''operator='' and the copy constructor, if you don't provide them. So if you want to forbid them, you have to declare them private and not define them (to cover member and friend functions) : any use by a client fails to compile, and any use by a member or friend, even implicit, creates a link error.
=== Item 28 : Use structs to partition the global namespace ===
Define your global constants in a struct, to avoid clashes between libraries (note : why not using namespaces ??)
===== Classes and Functions : Implementation =====
=== Item 29 : Avoid returning "handles" to internal data from const member functions ===
A handle is a pointer or a reference. If you return one, it becomes possible to modify a const object using the handle returned by the const member function ! One way around it is to return the result by value, or to return a handle to a copy of the data, but this costs time, and creating copies of objects leads to risks of memory leaks. Another way is to return a const pointer or reference : it is both fast and safe, but it is not the same thing and may restrict callers unnecessarily (note: I don't agree with these drawbacks, users can make a copy themselves if they need to modify it).
=== Item 30 : Avoid member functions that return pointers or references to members less accessible than themselves ===
Because if you do so, you change the access level of the returned member (so what was the point of giving it this access level ?). If you have no choice, try at least to return a const handle.
=== Item 31 : Never return a reference to a local object or a dereferenced pointer initialized by new within the function ===
If you return a reference to a local object, the object disappears when the function returns, so using the reference is undefined behaviour (often a segmentation fault). If you return a dereferenced pointer initialized by ''new'' within the function, you have to ask the user to delete it, which is not reasonable, and sometimes impossible because of temporary objects (for example the result of operator + in ''s=a+b+c''). (note: in some cases, you can try to put the result in a class member and return a reference to it, if you want to avoid the creation of a new object).
=== Item 32 : Use enums for integral class constants ===
If you want a constant that differs from class to class, you can't use a global const variable or a ''#define'' because it can't be different for different classes, nor a const member because it must be initialized outside the declaration of the class and thus is not known when the class is compiled. The solution is to use an enum :
class X {
enum { BUFSIZE=100 };
char buffer[BUFSIZE];
};
=== Item 33 : Use inlining judiciously ===
Inlining avoids the cost of a function call, but it can dramatically increase the size of the code, which can be a problem on systems with limited memory, and can slow the program down considerably by leading to pathological paging behavior on systems with virtual memory.
But there is more. The inline directive is just a request, and the compiler can decide not to inline the function (for example if it is recursive, or too long). When it is not inlined, as it must be defined in the header, the compiler will declare it static in order to avoid linking problems when several source files include the same header. Then there will be several copies of the code (and you still pay the cost of function calls). Moreover, if you take the address of the function somewhere, the compiler will also generate a body for it. And debuggers cannot step through inline functions (note: with VisualStudio and g++, inlining is simply disabled in debug mode ...).
A good methodology with inline functions is to inline only obvious functions at the beginning (getters & setters), then try to find which functions are called often, and try to inline them, checking in the compiler's warning messages that they are really inlined.
=== Item 34 : Minimize compilation dependencies between files ===
C++ doesn't really separate interface from implementation, because private stuff is declared with public stuff. Thus if you modify the implementation of a class, all files which use this class will be recompiled, even if the interface which is the only important thing didn't change. And for big projects it can be painful.
There are two solutions for separating interface from implementation :
* Put all private stuff in another class (implementation class), and put a pointer to this class in the main class (interface class). The cost is one level of indirection for access to private (implementation) stuff.
* Make the interface class an abstract base class which contains public stuff (only pure virtual functions), and inherit an implementation class from it. The cost is virtuality : one indirection for each function call, and more importantly you lose inlining (but this is also true for the previous solution).
But don't dismiss these methods because they have a cost : use them during development to minimize the impact on clients when the implementation changes, and replace the interface and implementation classes with one concrete class for production use if there is a real difference in speed or size.
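The first solution (a pointer to an implementation class) can be sketched in a single listing. ''Person'', ''PersonImpl'' and the file names in the comments are hypothetical ; in a real project the two parts would live in separate files :

```cpp
#include <string>

// --- what would go in the header (say person.h) ---
class PersonImpl;  // forward declaration only: clients never see the details

class Person {
public:
    explicit Person(const std::string& name);
    ~Person();
    std::string name() const;
private:
    Person(const Person&);             // copying disabled to keep the sketch short
    Person& operator=(const Person&);
    PersonImpl* impl_;                 // the only private data clients depend on
};

// --- what would go in the source file (say person.cpp) ---
class PersonImpl {
public:
    explicit PersonImpl(const std::string& n) : name(n) {}
    std::string name;
};

Person::Person(const std::string& name) : impl_(new PersonImpl(name)) {}
Person::~Person() { delete impl_; }
std::string Person::name() const { return impl_->name; }  // one extra indirection
```

Changing ''PersonImpl'' then only requires recompiling the source file that defines it, not the clients that include the header.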
===== Inheritance and Object-Oriented design =====
=== Item 35 : Make sure public inheritance models "isa" ===
=== Item 36 : Differentiate between inheritance of interface and inheritance of implementation ===
=== Item 37 : Never redefine an inherited nonvirtual function ===
=== Item 38 : Never redefine an inherited default parameter value ===
=== Item 39 : Avoid casts down the inheritance hierarchy ===
=== Item 40 : Model "has-a" or "is-implemented-in-terms-of" through layering ===
=== Item 41 : Use private inheritance judiciously ===
=== Item 42 : Differentiate between inheritance and templates ===
=== Item 43 : Use multiple inheritance judiciously ===
=== Item 44 : Say what you mean ; understand what you're saying ===
===== Miscellany =====
=== Item 45 : Know what functions C++ silently writes and calls ===
=== Item 46 : Prefer compile-time and link-time errors to runtime errors ===
=== Item 47 : Ensure that global objects are initialized before they're used ===
=== Item 48 : Pay attention to compile warnings ===
=== Item 49 : Plan for coming language features ===
=== Item 50 : Read the ARM ===
====== More effective C++ ======
===== Basics =====
=== Item 1 : Distinguish between pointers and references ===
References must refer to an object (there is no null reference), and that's why they must be initialized. As a result, you don't have to check whether a reference parameter is null, contrary to pointers. But references can't be reassigned to refer to different objects.
=== Item 2 : Prefer C++-style casts ===
Because they are easier to parse (both for humans and tools), and because it can avoid errors.\\
''static_cast<type>(expression)'' : normal cast, for example ''double'' to ''int''.\\
''const_cast'' to cast away the constness or volatileness of an expression :
void update(X *px);
const X x;
update(const_cast<X*>(&x));
''dynamic_cast'' performs safe casts down or across an inheritance hierarchy (base to derived objects) ; it returns NULL if it fails (or throws an exception when casting references).\\
''reinterpret_cast'' for implementation-defined casts (rarely portable), for example casting between function pointer types :
typedef void (*FuncPtr)(); // a FuncPtr is a pointer to "void foo()"
FuncPtr funcPtrArray[10];
int doSomething();
funcPtrArray[0] = reinterpret_cast<FuncPtr>(&doSomething); // can work, but can yield incorrect results too !
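A short sketch of the three most common casts in use. All class and function names here are hypothetical and chosen only for the illustration :

```cpp
class Widget {
public:
    virtual ~Widget() {}  // dynamic_cast needs a polymorphic type
};
class Button : public Widget {};
class Slider : public Widget {};

// static_cast: ordinary conversion, here double to int (the fraction is dropped).
inline int toInt(double d) { return static_cast<int>(d); }

// dynamic_cast: checked downcast, yields a null pointer if w is not a Button.
inline bool isButton(Widget* w) { return dynamic_cast<Button*>(w) != 0; }

// A legacy-style function that takes a non-const pointer but only reads from it.
inline int legacyLength(char* s) { int n = 0; while (s[n]) ++n; return n; }

// const_cast: removes constness so that the legacy function can be called.
// This is only safe because legacyLength never writes through the pointer.
inline int length(const char* s) { return legacyLength(const_cast<char*>(s)); }
```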
=== Item 3 : Never treat arrays polymorphically ===
That is to say, don't manipulate an array of derived class objects through a pointer to the base class if the two classes don't have the same size, because indexing will use the size of the base class. This causes problems when you write your loop, or when you delete the array (the compiler generates a loop over the elements, and the language specification says that the result is undefined).
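One common way to stay out of trouble, shown here as a sketch rather than as the book's own recommendation, is to store pointers to the objects instead of the objects themselves. ''Animal'', ''Bird'' and ''totalLegs'' are hypothetical names :

```cpp
#include <cstddef>
#include <vector>

class Animal {
public:
    virtual ~Animal() {}
    virtual int legs() const { return 0; }
};

class Bird : public Animal {
public:
    Bird() : wingspan(0.0) {}
    int legs() const { return 2; }
    double wingspan;  // extra member: a Bird is typically larger than an Animal
};

// The container holds pointers, which all have the same size, so indexing
// never depends on the size of the objects they point to.
int totalLegs(const std::vector<Animal*>& zoo) {
    int total = 0;
    for (std::size_t i = 0; i < zoo.size(); ++i) total += zoo[i]->legs();
    return total;
}
```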
=== Item 4 : Avoid gratuitous default constructors ===
===== Operators =====
=== Item 5 : Be wary of user-defined conversion functions ===
=== Item 6 : Distinguish between prefix and postfix forms of increment and decrement operators ===
=== Item 7 : Never overload &&, || or , ===
=== Item 8 : Understand the different meanings of new and delete ===
===== Exceptions =====
=== Item 9 : Use destructors to prevent resource leaks ===
=== Item 10 : Prevent resource leaks in constructors ===
=== Item 11 : Prevent exceptions from leaving destructors ===
=== Item 12 : Understand how throwing an exception differs from passing a parameter or calling a virtual function ===
=== Item 13 : Catch exceptions by reference ===
=== Item 14 : Use exception specifications judiciously ===
=== Item 15 : Understand the costs of exception handling ===
===== Efficiency =====
=== Item 16 : Remember the 80-20 rule ===
=== Item 17 : Consider using lazy evaluation ===
=== Item 18 : Amortize the cost of expected computations ===
=== Item 19 : Understand the origin of temporary objects ===
=== Item 20 : Facilitate the return value optimization ===
=== Item 21 : Overload to avoid implicit type conversions ===
=== Item 22 : Consider using op= instead of stand-alone op ===
=== Item 23 : Consider alternative libraries ===
=== Item 24 : Understand the costs of virtual functions, multiple inheritance, virtual base classes, and RTTI ===
Virtual functions work with virtual tables (one per class) containing pointers to functions, and virtual table pointers (one per object) pointing to the right virtual table. Hence they increase the size of objects and the per-class data, and reduce performance : because of indirections (but that's almost nothing), and mainly because they prevent inlining (except if the function is called on an object and not through a pointer or a reference, but that's almost never the case).
Multiple inheritance leads to more per-class data (special virtual tables must be generated for base classes), increases the size of objects (multiple virtual table pointers within a single object), and the runtime invocation cost of virtual functions grows slightly (offsets of virtual table pointers are more complicated to calculate).
Moreover if you declare base classes virtual (which you must do to avoid data replication if there is more than one inheritance path to a base class), it increases the size of objects by adding several pointers to the virtual base class.
RTTI (RunTime Type Identification) stores a ''type_info'' object in the virtual table (so it only works if there are virtual functions in the class, and slightly increases the per-class data), and the ''typeid'' operator lets us discover information about objects at runtime.
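The dependence of ''typeid'' on virtual functions can be seen in a small sketch (the four classes are hypothetical) :

```cpp
#include <typeinfo>

// Polymorphic hierarchy: typeid looks up the dynamic type at runtime.
class Node {
public:
    virtual ~Node() {}
};
class Leaf : public Node {};

// Non-polymorphic hierarchy: typeid can only report the static type.
class Plain {};
class PlainChild : public Plain {};

inline bool isLeaf(const Node& n) { return typeid(n) == typeid(Leaf); }

inline bool looksLikeChild(const Plain& p) { return typeid(p) == typeid(PlainChild); }
```

''isLeaf'' recognizes a ''Leaf'' passed as a ''Node&'', whereas ''looksLikeChild'' returns false even for a ''PlainChild'', because without virtual functions there is no runtime type information to consult.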
To conclude, it is important to understand the costs of these features, but also to understand that if we need them we will pay for them one way or another, so there is no point in trying to emulate them by other means. You can still have legitimate reasons to bypass the compiler-generated services, for example because pointers to virtual tables can make it difficult to store C++ objects in databases or to move them across process boundaries, but your own version will be less efficient !
===== Techniques =====
=== Item 25 : Virtualizing constructors and non-member functions ===
=== Item 26 : Limiting the number of objects of a class ===
=== Item 27 : Requiring or prohibiting heap-based objects ===
=== Item 28 : Smart pointers ===
=== Item 29 : Reference counting ===
=== Item 30 : Proxy classes ===
=== Item 31 : Making functions virtual with respect to more than one object ===
===== Miscellany =====
=== Item 32 : Program in the future tense ===
=== Item 33 : Make non-leaf classes abstract ===
=== Item 34 : Understand how to combine C++ and C in the same program ===
=== Item 35 : Familiarize yourself with the language standard ===