This page is based on the excellent books of Scott Meyers : Effective C++ (50 specific ways to improve your programs and designs) and More effective C++ (35 new ways to improve your programs and designs).
Because #define constants don't appear in the symbol table, which is annoying for debugging, use constant variables (they are just as fast) :
const float ASPECT_RATIO = 1.653;
And because there are a lot of traps with macros, use inline functions (templated if necessary) :
template<class T> inline const T& MAX(const T& a, const T& b) { return a > b ? a : b; }
The main traps of macros are :
#define sqr(x) x*x         // wrong way : sqr(1+2) == 1+2*1+2 == 1+2+2 == 5 != 9
#define sqr(x) ((x)*(x))   // right way : parenthesize each argument and the whole expression
If the argument is a function call (eg sqr(s.toInt())), the function call will be done twice.
Inline functions have none of these traps, and are as fast.
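The double evaluation described above can be observed with a small sketch. The names SQR_MACRO, sqr, nextValue and g_calls are hypothetical helpers introduced only for this illustration, they do not come from the books.

```cpp
// Macro version: the argument text is pasted twice into the expansion.
#define SQR_MACRO(x) ((x) * (x))

// Inline template version: the argument is evaluated exactly once.
template <class T>
inline T sqr(const T& x) { return x * x; }

// Hypothetical helper that counts how many times it is called.
int g_calls = 0;
int nextValue() { ++g_calls; return 3; }

// Number of calls to nextValue() made when squaring through the macro.
int callsWithMacro() {
    g_calls = 0;
    int result = SQR_MACRO(nextValue());  // expands to ((nextValue()) * (nextValue()))
    (void)result;
    return g_calls;
}

// Number of calls to nextValue() made when squaring through the inline function.
int callsWithInline() {
    g_calls = 0;
    int result = sqr(nextValue());
    (void)result;
    return g_calls;
}
```

With the macro the helper is called twice, with the inline function only once, and sqr(1 + 2) gives 9 without any parenthesizing trick.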
Because of type safety and extensibility.
To define stream operators in your class :
// inside the class declaration :
friend ostream& operator<<(ostream& s, const ComplexInt& c);

// outside the class :
ostream& operator<<(ostream& s, const ComplexInt& c)
{
    s << c.r << " " << c.i;
    return s;
}
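A complete version of the stream operator could look like the sketch below. The constructor and the toText helper are assumptions added for the example, only the operator<< itself comes from the notes above.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class ComplexInt {
public:
    ComplexInt(int real, int imag) : r(real), i(imag) {}
    // Friend, because the operator needs access to the private members.
    friend std::ostream& operator<<(std::ostream& s, const ComplexInt& c);
private:
    int r;
    int i;
};

// Global function: the stream stays on the left-hand side, as usual.
std::ostream& operator<<(std::ostream& s, const ComplexInt& c) {
    s << c.r << " " << c.i;
    return s;  // returning the stream allows chaining: s << a << b
}

// Hypothetical helper: formats a ComplexInt into a string through operator<<.
std::string toText(const ComplexInt& c) {
    std::ostringstream out;
    out << c;
    return out.str();
}
```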
Because they don't call constructors and destructors !
Because you can't embed C-style comments in other comments !
Use new with delete, and new[] with delete[], or you get memory leaks ! (delete applied to memory from new[] doesn't call all the destructors)
NULL
if not allocated) Set an error-handling function, globally :
void noMoreMemory()
{
    cerr << "Unable to satisfy request for memory\n";
    abort();
}

int main()
{
    set_new_handler(noMoreMemory);
    ...
}
Or for a specific class :
typedef void (*PEHF)();  // PEHF = pointer to error handling function

class X {
private:
    static PEHF currentPEHF;
public:
    static PEHF set_new_handler(PEHF p);
    void * operator new(size_t size);
};

PEHF X::currentPEHF;  // sets currentPEHF to 0 by default

PEHF X::set_new_handler(PEHF p)
{
    PEHF oldPEHF = currentPEHF;
    currentPEHF = p;
    return oldPEHF;
}

void * X::operator new(size_t size)
{
    PEHF currentHandler = ::set_new_handler(currentPEHF);
    void *memory = ::new char[size];
    ::set_new_handler(currentHandler);
    return memory;
}
This means returning the right value (a pointer, or 0), and calling an error-handling function when insufficient memory is available :
void * operator new(size_t size)  // your operator new could take additional params
{
    while (true) {
        // HERE attempt to allocate size bytes
        if (the allocation was successful)
            return (a pointer to the memory);

        PEHF currentHandler = set_new_handler(0);  // get the current ...
        set_new_handler(currentHandler);           // ... error handling function

        if (currentHandler) (*currentHandler)();
        else return 0;
    }
}
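The pseudocode above can be turned into a compilable sketch under a few assumptions: a hypothetical class Widget, std::malloc as the underlying allocator, and the C++11 function std::get_new_handler to read the current handler. It also departs from the notes on one point: when no handler is installed it throws std::bad_alloc, which is what standard C++ expects, instead of returning 0.

```cpp
#include <cstdlib>
#include <new>

class Widget {
public:
    // Class-specific operator new following the "retry through the handler" loop.
    static void* operator new(std::size_t size) {
        if (size == 0) size = 1;              // never ask malloc for 0 bytes
        while (true) {
            void* memory = std::malloc(size); // attempt to allocate size bytes
            if (memory) return memory;        // success: return the pointer

            // Failure: fetch the current error-handling function (C++11).
            std::new_handler handler = std::get_new_handler();
            if (!handler) throw std::bad_alloc();  // assumption: throw instead of returning 0
            (*handler)();                     // the handler may free memory, then we retry
        }
    }

    // The matching operator delete must release memory the same way.
    static void operator delete(void* memory) noexcept { std::free(memory); }

    int value = 0;
};
```

A plain `new Widget` then goes through this function, and `delete` on the result goes through the matching operator delete.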
If the operator new is in a class X, you should add this check before attempting to allocate memory :
if (size != sizeof(X)) return ::new char[size];
This case occurs when you inherit from the class without rewriting the new operator : the requested size is then the size of the derived class.
If you add parameters to your new redefinition, it hides the usual form of new, so also rewrite the classical form :
class X {
public:
    void * operator new(size_t size, PEHF pehf);
    void * operator new(size_t size) { return ::new char[size]; }
};

void specialErrorHandler();

X *px1 = new(specialErrorHandler) X;
X *px2 = new X;  // doesn't work if you don't define new(size_t)
You can rewrite new to allocate small objects in a large memory zone (in order to speed up allocations and save memory), but you also have to write delete !
class Airplane {
private:
    Airplane *rep;                    // next free chunk while the object is on the free list
    static Airplane *headOfFreeList;
public:
    void * operator new(size_t size);
    void operator delete(void *deadObject, size_t size);
};

Airplane *Airplane::headOfFreeList;  // initialized to 0 by default

void * Airplane::operator new(size_t size)
{
    if (size != sizeof(Airplane))     // derived class : use the global new
        return ::new char[size];

    Airplane *p = headOfFreeList;
    if (p)
        headOfFreeList = p->rep;
    else {
        Airplane *newBlock = (Airplane*) ::new char[256 * sizeof(Airplane)];  // don't call constructor !
        if (newBlock == 0) return 0;
        for (int i = 0; i < 255; i++)  // link the memory chunks together
            newBlock[i].rep = &newBlock[i+1];
        newBlock[255].rep = 0;
        p = newBlock;
        headOfFreeList = &newBlock[1];
    }
    return p;
}

void Airplane::operator delete(void *deadObject, size_t size)
{
    if (deadObject == 0) return;      // deleting a null pointer must do nothing
    if (size != sizeof(Airplane)) {
        ::delete[] ((char*) deadObject);
        return;
    }
    Airplane *carcass = (Airplane*) deadObject;
    carcass->rep = headOfFreeList;    // put the chunk back at the head of the free list
    headOfFreeList = carcass;
}
Because if you don't, the compiler will use the default ones, which are member-by-member (bitwise) copies. Thus it will copy pointers, and if you free the memory in one object, the other one will be left with a dangling pointer …
You have no choice for const or reference members, but in general it is more efficient : member constructors are always called once before the constructor body runs, so initializing a member directly is cheaper than default-constructing it and then assigning to it.
Because the compiler always follows the order of the declarations, and never the order of the initialization list, so you can fall into traps if the two orders differ (eg you initialize size, then initialize data with new int[size], but if data is declared before size, the new will be done before the initialization of size …)
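A minimal sketch of the safe layout for the size/data example above, assuming a hypothetical class Buffer: size is declared before data, so it is already initialized when new int[size] runs.

```cpp
#include <cstddef>

class Buffer {
public:
    // The initialization list is written in declaration order on purpose.
    explicit Buffer(std::size_t n) : size(n), data(new int[size]()) {}
    ~Buffer() { delete[] data; }

    std::size_t length() const { return size; }
    int at(std::size_t i) const { return data[i]; }

private:
    // Copying is disabled to keep the sketch short.
    Buffer(const Buffer&);
    Buffer& operator=(const Buffer&);

    std::size_t size;  // declared first, hence initialized first
    int* data;         // declared second, can safely use size
};
```

Swapping the two member declarations would make new int[size] read an uninitialized size, whatever the order written in the initialization list.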
Because if you don't, when you create a derived object, store it in a pointer to the base class, and then delete it through that pointer, it won't call the destructor of the derived class !
But do it only if it is a base class (you plan to inherit from it), which generally corresponds to the fact that there is at least one virtual function, because you lose a little performance.
Tip : if you want to have an abstract class, but you don't have any function to make pure virtual, declare a pure virtual destructor ! (but you nevertheless have to provide a definition for this pure virtual destructor, and it has to be written outside the class body).
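The effect of the virtual destructor can be checked with a small sketch. Shape, Circle and the counter g_derivedDestroyed are hypothetical names used only for this illustration.

```cpp
// Counts how many times the derived destructor has run.
int g_derivedDestroyed = 0;

class Shape {
public:
    virtual ~Shape() {}  // virtual: delete through a Shape* reaches the derived destructor
};

class Circle : public Shape {
public:
    ~Circle() { ++g_derivedDestroyed; }
};

// Creates a Circle, deletes it through a base class pointer,
// and returns how many times ~Circle() was called.
int destroyThroughBase() {
    g_derivedDestroyed = 0;
    Shape* s = new Circle;
    delete s;
    return g_derivedDestroyed;
}
```

Without the virtual keyword on ~Shape(), the delete would be undefined behavior and ~Circle() would typically not run.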
To allow chaining assignments such as a=b=c=0 :
C& C::operator=(const C&);
Because once you write operator= yourself, the compiler no longer assigns the remaining members for you. Moreover, in a derived class you also have to assign the data members of the base class :
((A&) *this) = rhs;
(if base class doesn't provide = operator)
A::operator=(rhs);
(if it does)
That is to say, always begin with :
if (this == &rhs) return *this;
Not only because it saves time, but above all because it will create serious problems with freeing and reallocating resources !
But be careful : this test won't work with multiple inheritance, because the same object can have different addresses depending on the type it is cast to. If so, you could add a unique identifier to each object.
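The points above about assignment (return a reference to *this, assign every member, test for self-assignment) can be put together in one sketch. The class Text and its copy helper are hypothetical and only serve the example.

```cpp
#include <cstring>

class Text {
public:
    explicit Text(const char* s) : data(copy(s)) {}
    Text(const Text& rhs) : data(copy(rhs.data)) {}  // deep copy, not a pointer copy
    ~Text() { delete[] data; }

    Text& operator=(const Text& rhs) {
        if (this == &rhs) return *this;  // self-assignment: nothing to do
        delete[] data;                   // safe only because rhs is another object
        data = copy(rhs.data);
        return *this;                    // reference to *this allows a = b = c
    }

    const char* c_str() const { return data; }

private:
    // Allocates a fresh buffer holding a copy of s.
    static char* copy(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }

    char* data;
};
```

Without the identity test, assigning an object to itself would delete the buffer and then try to copy from the freed memory.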
Because clients can do whatever they want to do, but it remains easy to learn (no confusion), easier to maintain, and shorter to compile. In brief, think carefully about whether the convenience of a new function justifies the additional costs.
When in doubt, try to be object-oriented (overloading operators etc).
But take for example a Fraction class for which you want to implement multiplication. If you overload operator* as a member, x*2 will work, because the compiler knows how to convert 2 into a Fraction thanks to your constructor, but 2*x will fail, because implicit conversions are only applied to parameters, never to the object on which the member function is called. So if you want a commutative multiplication, you have to declare operator* as a global function taking two fractions. And if it needs to access protected members, you have to declare it as a friend of Fraction.
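A minimal sketch of the Fraction case described above. The member names num and den and the accessors are assumptions, and the result is deliberately not reduced to keep the example short.

```cpp
class Fraction {
public:
    // Non-explicit constructor: lets the compiler convert an int into a Fraction.
    Fraction(int n = 0, int d = 1) : num(n), den(d) {}

    int numerator() const { return num; }
    int denominator() const { return den; }

    // Friend, because the global operator reads the private members.
    friend Fraction operator*(const Fraction& a, const Fraction& b);

private:
    int num;
    int den;
};

// Global function: both operands are parameters, so both can be converted.
Fraction operator*(const Fraction& a, const Fraction& b) {
    return Fraction(a.num * b.num, a.den * b.den);
}
```

With this declaration both x*2 and 2*x compile, since the int 2 is converted to Fraction(2, 1) on either side.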
In the same way, stream operators are always global functions, because if they were members the variable would have to be on the left, like s >> cin, which is contrary to the convention and would confuse everyone.
Because :
You can use it outside of classes for :
Inside classes for :
With pointers :
char *p              = "Hello";  // non-const pointer, non-const data
const char *p        = "Hello";  // non-const pointer, const data
char * const p       = "Hello";  // const pointer, non-const data
const char * const p = "Hello";  // const pointer, const data
And in function declarations :
const object). Functions differing only in their constness can be overloaded. C++ makes a bitwise test for const member functions : it checks that the function doesn't modify any member. But this is not perfect, because you can still modify a const object, eg if a const function returns a pointer to a member (the member can then be modified from outside).
Moreover you sometimes want to modify a member in a function that is const in the sense that the modifications are undetectable by a client (eg if you want to cache the length of a string, you don't modify the string so the function is conceptually const, but you modify the cache variable so the compiler won't accept it). Fortunately, you can cast away constness (this is viewed as C * const this; for non-const member functions, and as const C * const this; for const member functions) :
unsigned String::length() const
{
    String * const localThis = (String * const) this;  // cast away constness
    localThis->dataLength = ...;
    ...
}
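As an alternative to the cast, standard C++ also offers the mutable keyword, which is not mentioned in the notes above. The sketch below assumes a hypothetical class CachedString that does not own its character data.

```cpp
#include <cstddef>
#include <cstring>

class CachedString {
public:
    explicit CachedString(const char* s)
        : data(s), dataLength(0), lengthIsValid(false) {}

    // Conceptually const: the visible state of the string never changes.
    std::size_t length() const {
        if (!lengthIsValid) {
            dataLength = std::strlen(data);  // allowed: dataLength is mutable
            lengthIsValid = true;            // allowed: lengthIsValid is mutable
        }
        return dataLength;
    }

private:
    const char* data;                // not owned by the object in this sketch
    mutable std::size_t dataLength;  // cache, may change inside const functions
    mutable bool lengthIsValid;      // tells whether the cache has been filled
};
```

The client sees the same behavior as with the cast, but the compiler knows which members are allowed to change.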
Problems of returning by value :
If you have to return a new object (eg for operator+), you must return it by value. Otherwise this new object would be created either on the stack, and disappear when you leave the function (⇒ segfault), or on the heap (with new), but then nobody knows who should delete it and there will be memory leaks.
Both allow calling a function with a different number of arguments. Default parameter values make it easier to avoid duplication of code. But you can only use them if :
To cope with the code duplication problem with overloading, you can either call another overloaded function, or create a common underlying private member function that does the common work.
ie avoid :
void f(int x);
void f(char *p);
Because there is no ambiguity for the compiler when calling f(0) (it will call f(int) because 0 is an int), but there is an ambiguity for the programmer. Indeed f(NULL) will also call f(int), because NULL is most often defined by #define NULL 0, and there is no way to define it as a pointer which works with all pointer types without need for a cast.
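The behavior described above can be observed with two overloads that report which one was chosen. Returning a string is an assumption made so that the result can be checked, the functions in the notes return void.

```cpp
#include <string>

// Each overload returns its own name so the caller can see which one was picked.
std::string f(int)   { return "f(int)"; }
std::string f(char*) { return "f(char*)"; }
```

Calling f(0) selects f(int), because the literal 0 is an int. To reach the pointer version with a null pointer, the caller has to spell out the type, for example f(static_cast<char*>(0)).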
This code is correct and compiles :
class A {
public:
    A(const class B&);   // constructor from B
};

class B {
public:
    operator A() const;  // cast operator to A
};
But it can create ambiguities :
void g(const A&);
B b;
g(b);  // error ! - ambiguous : calls constructor from B or cast to A ?
The problem with ambiguities is that you can miss them at the beginning, and if it is the client who runs into one and he doesn't have the source code, he may have no way around the problem (note : maybe g(A(b)), but it will cost a constructor call).
This can also happen with f(int) and f(char) called with a double, or with multiple inheritance :
class Base1 { public: int doIt(); };
class Base2 { public: void doIt(); };
class Derived : public Base1, public Base2 {} d;
d.doIt();  // error - ambiguous
You have to explicitly specify which one of the functions you want to call : d.Base1::doIt(). Even modifying accessibility (one private and the other one public) won't change anything, and that's normal, because otherwise changing accessibility would change which function is called by the same code, and that's very bad !
The compiler automatically generates some functions, such as operator= and the copy constructor, if you don't provide them. So if you want to forbid them, you have to declare them private and not define them (to cover member and friend functions) : it will create a link error if someone tries to use them, even implicitly.
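A short sketch of the technique, with a hypothetical class Uncopyable. The static_assert lines rely on the C++11 header <type_traits> and are only there to make the effect visible at compile time, they are not part of the technique itself.

```cpp
#include <type_traits>

class Uncopyable {
public:
    Uncopyable() {}

private:
    // Declared private and never defined: clients get a compile error,
    // members and friends get a link error.
    Uncopyable(const Uncopyable&);
    Uncopyable& operator=(const Uncopyable&);
};

// Outside the class, the type is reported as neither copy-constructible
// nor copy-assignable.
static_assert(!std::is_copy_constructible<Uncopyable>::value,
              "copy construction should be forbidden");
static_assert(!std::is_copy_assignable<Uncopyable>::value,
              "copy assignment should be forbidden");
```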
Define your global constants in a struct, to avoid clashes between libraries (note : why not use namespaces ??)
A handle is a pointer or a reference. If a const member function returns a handle to internal data, then it is possible to modify a const object through that handle ! One way around it is to return the result by value, or to return a handle to a copy of the data, but it costs time, and creating copies of objects leads to risks of memory leaks. Another way is to return a const pointer or reference : it is both fast and safe, but it is not the same thing and may restrict callers unnecessarily (note: I don't agree with these drawbacks, the user can make a copy himself if he needs to modify it).
Because if you do so, you change the access level of the returned member (so what was the point of giving it this access level ?). If you have no choice, try at least to return a const handle.
If you return a reference to a local object, the object disappears when leaving the function, so using it is undefined behavior (typically a segmentation fault). If you return a pointer initialized by a new within the function, you have to ask the user to delete it, which is not reasonable, and sometimes impossible because of temporary objects (for example the result of operator+ in s=a+b+c). (note: in some cases, you can try to put the result in a class member and return a reference to it, if you want to avoid the creation of a new object).
If you want to use different constants for different classes, you can't use a global const variable or a #define, because it can't differ from one class to another, nor a const member, because (at least with older compilers) it must be initialized outside the declaration of the class and thus is not known at compilation time. The solution is to use an enum :
class X {
    enum { BUFSIZE=100 };
    char buffer[BUFSIZE];
};
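The enum trick lets two classes use the same constant name with different values. SmallBuffer and LargeBuffer are hypothetical names, and the members are public only so that the sizes can be inspected.

```cpp
class SmallBuffer {
public:
    enum { BUFSIZE = 100 };   // compile-time constant, local to SmallBuffer
    char buffer[BUFSIZE];
};

class LargeBuffer {
public:
    enum { BUFSIZE = 4096 };  // same name, different value, no clash
    char buffer[BUFSIZE];
};
```

Each class gets an array dimensioned by its own BUFSIZE, and client code can refer to the values as SmallBuffer::BUFSIZE and LargeBuffer::BUFSIZE.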
Inlining avoids the cost of a function call, but it can dramatically increase the size of the code, which can be a problem on systems with limited memory, and can slow the program down a lot by leading to pathological paging behavior on systems with virtual memory.
But there is more. The inline directive is just a request, and the compiler can decide not to inline the function (for example if it is recursive, or too long). When it is not inlined, as it must be defined in the header, the compiler will treat it as static in order to avoid link problems when several source files include the same header. Then there will be several copies of the code (and you still pay the cost of function calls). Moreover if you take the address of the function somewhere, the compiler will also generate the body of the function. And debuggers cannot step through inline functions (note: with VisualStudio and g++, inlining is just disabled in debug mode …).
A good methodology with inline functions is to inline only obvious functions at the beginning (getters & setters), then find which functions are called often, and try to inline them, checking in the warning messages of the compiler that they are really inlined.
C++ doesn't really separate interface from implementation, because private stuff is declared with public stuff. Thus if you modify the implementation of a class, all files which use this class will be recompiled, even if the interface which is the only important thing didn't change. And for big projects it can be painful.
There are two solutions for separating interface from implementation :
But don't dismiss these methods because they have a cost : use them during development to minimize the impact on clients when the implementation changes, and replace the interface and implementation classes with one concrete class for production use if there is a real difference in speed or size.
References must refer to an object (no null reference), and that's why they must be initialized. Indeed you don't have to check if a reference parameter is null or not, contrary to pointers. But references can't be reassigned to refer to different objects.
Because they are easier to parse (both for humans and tools), and because it can avoid errors.
static_cast<type>(expression) : normal cast, for example double to int.
const_cast to cast away the constness or volatileness of an expression :
void update(X *px);
const X x;
update(const_cast<X*>(&x));
dynamic_cast to perform safe casts down or across an inheritance hierarchy (base to derived objects) : it returns NULL if it fails (or throws an exception when casting references).
reinterpret_cast for implementation-defined casts (rarely portable), for example casting between function pointer types :
typedef void (*FuncPtr)();  // a FuncPtr is a pointer to "void foo()"
FuncPtr funcPtrArray[10];
int doSomething();
funcPtrArray[0] = reinterpret_cast<FuncPtr>(&doSomething);  // can work, but can yield incorrect results too !
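Among the casts listed above, dynamic_cast is the one whose result has to be tested. A minimal sketch, assuming a hypothetical hierarchy Animal, Dog and Cat :

```cpp
class Animal {
public:
    virtual ~Animal() {}  // dynamic_cast needs a polymorphic base class
};

class Dog : public Animal {
public:
    int legs() const { return 4; }
};

class Cat : public Animal {};

// Returns the number of legs if the animal is really a Dog, -1 otherwise.
int legsIfDog(Animal* a) {
    Dog* d = dynamic_cast<Dog*>(a);  // null pointer when a does not point to a Dog
    return d ? d->legs() : -1;
}
```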
That is to say, don't handle an array of derived class objects through a pointer to the base class if the two classes don't have the same size, because indexing will use the size of the base class. This causes problems when you write your loop, or when you delete the array (the compiler generates a loop, and the language specification says the result is undefined).
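One common way around the problem, which is an assumption of this sketch rather than something stated in the notes, is to index an array of pointers to the base class. Every element then has the size of a pointer, whatever the size of the object it points to. Base, Derived and sumValues are hypothetical names.

```cpp
#include <cstddef>

class Base {
public:
    Base() : id(1) {}
    virtual ~Base() {}
    virtual int value() const { return id; }
protected:
    int id;
};

class Derived : public Base {
public:
    Derived() : extra(2) {}
    virtual int value() const { return id + extra; }
private:
    int extra;
};

// Indexes an array of POINTERS to Base: the stride is sizeof(Base*),
// so it does not depend on sizeof(Derived).
int sumValues(const Base* const* items, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += items[i]->value();  // virtual call reaches Derived::value
    return total;
}
```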
Virtual functions work with virtual tables (one per class) containing pointers to functions, and virtual table pointers (one per object) pointing to the right virtual table. Hence they increase the size of objects and the per-class data, and reduce performance : because of indirections (but that's almost nothing), and mainly because they prevent inlining (except if the function is called on an object and not through a pointer or a reference, but that's almost never the case).
Multiple inheritance leads to more per-class data (special virtual tables must be generated for base classes), increases the size of objects (multiple virtual table pointers within a single object), and the runtime invocation cost of virtual functions grows slightly (offsets of virtual table pointers are more complicated to calculate).
Moreover if you declare base classes virtual (which you must do to avoid data replication if there is more than one inheritance path to a base class), it increases the size of objects by adding several pointers to the virtual base class.
RTTI (RunTime Type Identification) stores a type_info object in the virtual table (so it only works if there are virtual functions in the class, and slightly increases the per-class data), and the typeid operator lets us discover information about objects at runtime.
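A minimal sketch of the typeid operator on a polymorphic class. Vehicle, Car and isCar are hypothetical names.

```cpp
#include <typeinfo>

class Vehicle {
public:
    virtual ~Vehicle() {}  // at least one virtual function, so RTTI sees the dynamic type
};

class Car : public Vehicle {};

// True when the object referred to is really a Car,
// even though it is seen through a Vehicle reference.
bool isCar(const Vehicle& v) {
    return typeid(v) == typeid(Car);
}
```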
To conclude, it is important to understand the costs of these features, but also to understand that if we need them we will pay for them, so there is no point in trying to emulate them another way. You can nevertheless have legitimate reasons to bypass the compiler-generated services, for example because pointers to virtual tables can make it difficult to store C++ objects in databases or to move them across process boundaries, but your own version will be less efficient !