Rationale Behind C++ Commandments (5) – OOP design

The idea of bundling code and program into a layout (classes) and injecting it with different data (objects) leads to a ‘new’ way (newer than C) of organizing our programs through the worldview of objects.

Every unit is seen as

• a state: all member variables
• possible actions: methods = member functions.

that is ready to interact with other objects.

Encapsulation (through access control)

The first improvement upon OOP is privacy (data encapsulation). You can have finer controls of what to share and with who. In C++, your options are:

• public: everybody
• private: only within yourself (internal use)
• protected: only shared with descendants (inheritance discussed below)

Granting certain class as friend (anywhere in the class declaration with friend class F) exposes the non-public sections specifically to the friend F. This is often a ‘loophole’ to access control that finds few legitimate uses other than testing.

friend functions are traditionally used in binary (2-input) operator overloading, but the modern wisdom is to screw it and just leave it out there as free functions!

protected has very few good uses other than preventing heap delete through base pointer non-polymorphically (child destructor not called: BAD) by making the base destructor non-public (i.e. meaning it’d be impossible to have base objects on stack) while letting the child chain the parent’s destructor (child can’t access it if it’s marked as private).

protected member variables are almost always a bad idea.

Inheritance

The second improvement is to allow classes to build on top of existing ones. What gets interesting (and difficult) is when the child ‘improve’ on the parent by either by replacing what they have (member variables) and what they do (methods) with their own.

Static data members inherit REFERENCES to the parent!

Inheritance AT LEAST always inherits an interface (can optionally inherit implementation).

Whenever the member (function or variable) name is used in any form (even with different argument types or signatures), the parent member with the same name will be hidden. The behavior is called shadowing, and it applies unless you’ve overridden ALL versions (signatures) of virutal parent methods which shares the same function name mentioned in child.

• Any non-overriden method with the same name as the parent appearing in the child will shadow all parent methods with the same name regardless of whether they are declared virtual and overriden at child.
• You can unhide parent methods with the same name (but different signature) by using Parent::f(..) declared at the child class.
• Shadowing implies there’s always one parent version and one child version stored separately under all conditions {static or non-static}x{function or variable}
• Static members don’t really ‘shadow’ because there’s only one global storage for each (parent and child) if you declare the same variable name again in the child. There’s nothing to hide because you cannot cast or slice a namespace! With static members, you have to be explicit about which class you are calling from with SRO like Parent::var or Child::var so there’s no potential for ambiguities.

Overriding

Just like C, C++ uses static binding that takes the programmer’s word for it for their declared types, especially through handles. Overriding is a concept only needed when you plan to upcast your objects (child accessed through pointer/reference) to handle a broader class of objects but intend to the underlying object’s own version (usually child) of the methods (especially destructors) called by default.

We do this by declaring the parent method virtual and implement the child versions (must be of the same function signature). Overriding only make sense for non-static methods because

• data members cannot be overridden (it’d confusing if it’s possible. We down-delegate functions/behavior but not the data/state). It’s better off hiding data members behind getters/setters to declare the intention.
• static members and methods behaves like static variable/functions (living in .data or .bss) using namespaces, so we can only refer to them with SRO by the class names like Parent::f() and Child::a, not a class type like Parent p; p.f() and Child c; c.a. There’s no object c for you to upcast to Parent so there’s place for polymorphic behavior.

Overriding involves leaving clues in objects so the upcasted references can figure out the correct methods of the underlying objects to call. In C++ it’s done with having a vtable (pointers to overridable methods, often stored in .rodata with string literals) for each class in the hierarchy and each object contains a pointer to the vtable that matches its underlying class.

[38] virtual only applies to methods’ signatures (function name and the data types in the argument list). vtable do not keep track of argument’s default values (if assigned) for efficiency (it’ll always read the static upcast, aka parent methods’ default values).

Classes (after considering inheritance)

Design relationships

• class behaves through public methods
• Inheritance at least always inherits an interface
• IS-A relationship is done with public-inheritance
• … (incomplete, will update later)

47 total views

Rationale Behind C++ Commandments (4) – Method Signature System

Function signature system, which allows users to use the same function name in different functions as long as they differ in the combination of

• input arguments types
• const modifiers counts as a different input argument type
• object const-ness (whether it’s const-method or not) – this only make sense with classes

and C++ will figure out what to call by matching the call with the available combinations (signatures).

C does not allow the same function name to be used in different places, so under the hood, it’s done through name mangling (generating a unique ‘under-the-hood’ function name based on the signature). This mechanism has a lot of implications that a professional programmer should observe:

• since C does not mangle its names in the object code, they’ll need to be wrapped around with extern “C” block in a C++ program so C++ won’t pervert (mangle) their function names with input arguments.
• [24] parameter defaulting might be ambiguous with another function that does not have the said parameter (the compiler will cry about it)
• [26] access controls/levels must play no part in resolving signatures because access level must not change the meaning of a program!

• free functions (free functions are at the root namespace), as well as
• classes (the name of the class itself is the namespace)

48 total views

Rationale Behind C++ Commandments (3) – Classes came from emulating POD data types through struct and namespaces

In structured programming (like C and C++), the building abstractions is program (functions) and data (variables).

Under the hood, especially in von-Neumann architecture’s perspective, functions and variables are both just data (a stream of numbers) that can be moved and manipulated the same way just like data. It’s all up to how the program designer and the hardware choose to give meaning to the bit stream.

Namespaces

In C, we can only scope our variables 3 ways: global, static (stays within same file/translation unit) and local. Sharing variables across functions in different translation units can only be done through

• globals (pollutes namespace and it’s difficult to keep track of who is doing what to the variables and the state at any time)
• passing (the more solid way that gives tighter control and clearer data flow, but managing how to pass many variables in many places is messy, even with struct syntax)

Bundling program with data gives a new way to tightly control the scope of variables: you can specify a group functions allowed to share the same set of variables in the bundle WITHOUT PASSING arguments.

The toolchain modified to recognize the user-defined scope boundaries which bundles program and data into packages, thus reducing root namespace pollution. The is implemented as namespace keyword in C++

Organizing with namespaces is basically justifying the mentality of using globals (in place of passing variables around intended functions) except it’s in a more controlled manner to keep the damages at bay. The same nasty things with gloabls can still appear if we didn’t design the namespace boundaries tightly so certain functions have access to variables that’s not intended for it.

Therefore, namespaces works nearly identical to a super-simple purely static class (see below) except you lose inheritance and access modifiers in classes in exchange for allowing anonymous namespaces.

Basically namespaces + structs + inheritance + encapsulation (access modifiers) = classes

Classes

Classes extends the idea of namespaces by allowing objects (each assigned their own storage space for the variables following the same variable layout) to be instantiated, so they behave like POD (Plain Old Data) in C. We should observe that when overloading operators

• [15] allow (a=b)=c chaining by returning *this for operator=
• [21] disallow rvalue assignment (a+b)=c by returning const object

In the most primitive form (no dynamic binding and types, aka virtuals and RTTI), function (method) info is not stored within instantiated objects as the compiler will sort out what classes/namespace they belong to. So it screams struct in C!

C struct is what makes (instantiates) objects from classes!

Note that C structs do not allow ‘static fields’ because static members is solely a construct of namespaces idea in C++! C++ has chosen to expand structs to be synonymous to classes that defaults to private access (if not specified) so code written as C structs behaves as expected in C++.

48 total views

‘Static classes’ are unlike instantiable (object-bearing) classes in many ways

Technically there’s no static class in C, but a class with all members and functions declared static.

Static classes are like namespaces in many ways. Because no object is constructed (it’s just holding a bunch of variables and functions in the free space), a lot of features and syntax with regular classes do not make sense with static classes.

Because no objects are instantiated

• No constructors or destructors (no objects to make/destroy)
• No operator overloading (you need an instantiation to pass arguments to operator methods)
• No overriding because there are no objects for you to upcast
(nor there’s an object to store the vtable from the virtual keyword)!

Static members and methods are treated as free objects scoped by namespaces

• Like C, static members variables live in .bss (not explicitly initialized ones will be zero-initialized) or .data (initialized) sections, not on stack/heap!
Exception: static const int is internally seen as enum, which the compiler uses it to plug values in the code instead of allocating space for it.
• Therefore the syntax is pretty much like free static/global variables
• No constructor to build member variables within the class definition, so they must be defined OUTSIDE the class definition at the top level (just like static/globals), with a SRO (scope resolution operator).
• Static methods acts like (and function overloads the same way as) free functions.
That’s why we often use static methods for helpers.

Namespaces has no access modifiers (public/protected/private/friend), but in return only namespaces can be unnamed/anonymous (which behaves as private)!

Namespaces cannot be inherited, but static classes can!

• Inherited members ARE REFERENCES to the parent!
There’s no extra copies of underlying data if that member is successfully inherited (not shadowed)!
• Members (function or variables) can only be shadowed in the child (never overridden since it’s not an object), which creates a NEW stack variable and hid the reference to the parent member

Static class’s inheritance behavior is the same across static classes object-bearing classes! It’s actually more explicit with static members as you’ll need two declarations outside the classes if you shadow.

I am pointing this out to show that inheriting static classes IS NOT cloning namespaces! Static classes behaves as if it’s just ONE CHILD object created on the .bss/.data section (the section for static variables).

This means unlike object-bearing classes, the static class Parent cannot exist on its own if its children are defined!

C++ rules are almost always sensible and coherent; but when combined, sometimes the implications could be surprising on the first sight! When we try to extrapolate expected behaviors in C++, very often we have to think not in terms of the convenient syntax, but the implications of its ground rules (a lot of them stems from C)!

56 total views

Rationale Behind C++ Commandments (2) – Philosophy of C

Everything is seen as a bitstream

• pointers are just integers to memory locations
– [25] integer and pointers might be indistinguishable in signature resolution
• code (CPU instructions) and addresses are treated the same way as a stream of data
– concept of function pointers leads to lambda (functors)
– classes came from structs containing data and function pointers (combined with namespaces)!
• unchecked type declarations: the compiler trusts your interpretations of data
– leads to run-time features such as overriding (virtual methods)
• handles (pointers and references) has unrestricted power
– [29] can const_cast it away if the handle is exposed (bad idea)

Performance-first design choice

• do not pay performance penalty for features not used
– static compilation and binding by default
– unchecked type declarations (see above)
• static compilation: the compiler tries to know everything at compile time
• static binding by default (cheapest)
– pay extra to use virtual methods (overriding)
– [38] default parameter values are statically bound and not stored in vtable (i.e. overridden child method’s default values are ignored and parent’s default values are used ONLY WHEN called through up-casts)
• inline is at the mercy of the optimizer (which can choose to emit an object if decided inlining is counter-productive). Mechanism that forces a function pointer to exist (pointing the function, virtual functions creates the pointer in vtable)

Toolchain

1. preprocessor (parser & macros)
2. compiler (create object files per translation unit, which is .c file in C)
– access control (encapsulation) extends the old trick of emulating private in C++ through macros by marking functions as static (local within translation unit) in C.

Templates behaves like a combination of macros (copy-and-paste with parameters) except it’s spread across the toolchain like inline optimizations:

• Code bloat (one copy per type combination)
• Can only live in the header files (it’s a template, not realized code, so no object is emitted like a .cpp file)

Parsing (language design)

• most vexing parse [Effective STL Item 6]: if something can be interpreted as a function declaration, it will be interpreted as a function declaration

Plain Old Data Types (C++ classes tried to emulate in their operator overloading behavior)

• [15] allow (a=b)=c chaining by returning *this for operator=
• [21] disallow rvalue assignment (a+b)=c by returning const object

44 total views

Rationale Behind C++ Commandments (1) – Introduction

If you’ve programmed in (or studied) C++ long enough, like you have read Scott Meyers’s Effective C++, which is a book organized in, jokingly, commandments like ‘thou shall make destructors virtual’. There’s a lot of stuff to remember.

I’ve found an approach to make the ideas stick: by understanding the rationale behind these commandments through the lens of ‘What would you do if you were to make C++ (features) out of C?

C++ is not a language designed from scratch. A lot of quirks and oddities in C++ came straight from the philosophy and the language features naturally available in C. With the right jargons (concepts), you will find a lot of the seemingly counter-intuitive behavior ‘it ought to be like this because of (insert design choice here)‘.

This is what we are going to explore in the “Rational Behind C++ Commandments” (RBCC) blog post series which came from my notes when I was going through Scott Meyer’s book. Once you get the ideas, you should be able to come up with the rules in Effective C on your own (so you don’t have to blindly remember them).

46 total views

Oversimplified: Getting rid of data in STL containers Summary of Item 9 in "Effective STL"

Unless deleting a known range of elements directly through iterators (no conditions to match), which rangeerase() method can be used directly, targeting specific key/value/predicate requires understanding of the container’s underlying data structure.

I’d like to give a summary of Item#9 in “Effective STL” by defining the names and concepts so the complicated rules can be sensibly deduced by a few basic facts.

The words ‘remove‘ and ‘erase‘ has very specific meaning for STL that are not immediately intuitive.

There is a remove() method for lists, which is an old STL naming inconsistency (they should have called it erase() like for associative containers). Treat it as a historical mistake.

The usage is easy to remember once you understand it with the right wording above:

Note that there are two steps for sequential (contiguous+lists) containers , hence the erase-remove idiom. It’s really two steps:

auto tail = remove(c.begin(), c.end(), T);
c.erase(tail, c.end());

but they can be combined in one line since the tail is only useful at one place. Hence

c.erase( remove(c.begin(), c.end(), T), c.end() );

Lists provides a efficient shortcut method (see table below) since linked-lists does not need to be rearranged (just short the pointers).

Never try range-based remove_?() for associative containers. It is a data corruption trap if any attempt is made to use anything named remove on associative containers.

The trap used to be possible since <algorithms> and containers were separate, but newer C++ protects you from the trap by checking if the element you are moving is of a MoveAssignable type. Since associative containers’ keys cannot be modified (without a rescan), the elements are not move-assignable.

As for erasing through for-loops (necessary if you want to sneak in an extra step while iterating), C++11 now returns an iterator following the last erased element uniformly across all containers. This helps to preserve the running iterator that gets invalidated immediately after the erase through i=c.erase(i);

* For brevity, I twisted the term unordered here to mean that the native (implementation) data order is dependent on the data involved.

When I said ‘cannot rearrange’, I meant ‘cannot efficiently rearrange’, since there are no cheap O(1) next() or prev() traversal.

It’s a mess to simply copy one element over another (during rearrangement), leaving orphans there, and re-balance a BST or re-hash a hash-map. Nobody wants to go through such pains to remove element when there are tons of more direct ways out there.

954 total views

Understanding the difference between recognized arrays and pointers 'Recognized' means sizeof(array_name) gives the underlying allocated size

array≠ pointer:

pointer only contains a memory location,
while an array already has memory allocated to hold the data.

The confusion comes from the fact that array names are always seen as pointers anywhere in C, but when an array name is referred in places that the scope happens know the allocated size, namely

• Global arrays: everybody knows the size
• Local arrays: only the instantiating function knows its size.

, the array name itself has a superpower that pointers lack: report the underlying allocated data size (NOT pointer size) using sizeof(array).

Definition: An array is ‘recognized‘ if the array name is used in the scope that knows the underlying data size.

Corollary: Calling the array name with sizeof() gives the underlying allocated data size.

Examples of consequences that can be derived from the definition above:

• Heap allocations always return a pointer type, NOT an array name!
So heap arrays are never recognized.
• VLA in C99 are considered local stack arrays, so it’s recognized
• x[] is just a cosmetic shorthand for *x: it doesn’t prevent any recognized array from decaying into a pointer across boundary.
• The storage duration (static or not) does not matter. e.g.
• Heap pointers at global level are not recognized arrays
• Static local array still loses the recognition across function boundaries
(unless passed carefully by data type T (&array)[N]).

Most often recognized arrays cannot be aliased without decaying into a pointer. However, we can bind a recognized array to a reference to an array, which is a completely different type. Example:

int v[]{1,2,3,4};
int (&w)[4]=v;  // w is a reference to an array of size 4

int* p = v;     // Decays v to a pointer. Size information lost.
// int &w[4]=v; // Does not compile: this means an array of 4 references.


Note that the syntax requires a bracket for reference name. Omitting it will lead the compiler to misinterpret it as an array of references, which cannot* be compiled.

This means contrary to common beliefs, you can pass a recognized array across functions through reference, but this is rarely done because of the hassle of explicitly entering the number of elements (4 for the example above) as part of the data type. This can still be done through templates/constexpr, but for such inconvenience, we’re better off using std::vector (or std::array if you want near zero overhead).

However, so far I haven’t found a way to re-recognize an array from a pointer. That means there is no way to keep a local array’s recognition across function boundaries in C since it does not have references like C++.

To summarize with a usage example: this post has described the entire logic needed to decide whether sizeof(x)/sizeof(x[0]) gives you the number of array elements, or how many times your machine pointer type is bigger than the element storage.

* references must be bound on creation. Declaring an array of references means you want to bound references in batches. There are no mechanisms to do so as of C++14.

1,123 total views

begin() and end() is defined for arrays in C++11 and above

I was a little confused the first time I saw range-based for-loop in C++11 on “Tour of C++” that it works right out of the box for both recognized array and STL containers and yet the text says it requires begin() and end() to be defined for the object for range-based for-loop to run.

I later learned that despite typical STL usage example writes v.begin(), v.end(), the most bulletproof way is to write begin(v), end(v) instead (Herb Sutter recommends it). Then I started to suspect that C++11 must have defined free-form (non-member) begin(), end() functions that takes in arbitrary recognized arrays. I pulled up my code editor and ran this:

#include <iostream>
int main()
{
int v[4]={1,2,3,4};
std::cout << *(std::crend(v)-1) << std::endl;

return 0;
}


It compiled and ran uneventfully, printing ‘1’ as expected (I’m using crend(), to see if they implemented the more obscure ones). It makes more sense now why range-based for-loop works for arbitrary recognized arrays without making an exception to the begin(), end() requirement.

To confirm that it is the case (since “Tour of C++” didn’t say anything about why arbitrary array works for range-based for loop), I looked up the STL source code from libc++ in LLVM, namely <iterator>, and saw this:

template <class T, size_t N> constexpr T* begin(T (&array)[N]);

Bingo! There’s a mechanism to do so. But before I close, Stephan T. Lavavej (Mr. STL) mentioned that the template quoted above is no longer required (by the standard) to implement range-based for-loop in C++11.

Now the conclusion becomes that begin(), end() that takes in recognized arrays exist (which completes the logic behind range-based for-loop), but the range-based for loop can (and typically will) handle recognized arrays without these templated functions defined.

930 total views

Do not help the compiler at the expense of readability Unless you read the assembly code emitted at the bottleneck and did benchmarks

Compilers has gotten smarter and smarter nowadays that they’d be able to analyze our code for common patterns (or logically deduce away steps that doesn’t have to be  performed at runtime).

Matt Godbolt gave a nice presentation at CppCon 2017 named “What Has My Compiler Done for Me Lately?”. Through observing the emitted assembly code at different optimization levels, he showed that the compiler doesn’t need to be micromanaged (through performance hacks in our code) anymore, as it will emit instructions as the performance-hacked code intended when it is better to do so.

It means the compiler writers already know our bag of performance hack tricks way better than we do. Their efforts spared us from premature optimization and leave us more time to find a better data structure or algorithm to solve the problem.

What I got from the lecture is NOT that we are free to write clumsy code and let the compiler sort it out (though it occasionally can, like factoring a loop doing simple arithmetic series into a one line closed form solution), but we don’t have to make difficult coding choices to accommodate performance anymore.

The most striking facts I learned from his lecture are

• The compiler can emit a one-line CPU instruction that does not have a corresponding native operation C/C++ if your hardware architecture supports it. (e.g. clang can convert a whole loop that counts the number of set bits into just ‘popcnt eax, edi‘)
• Through Link-Time Optimization (LTO), we don’t have to pay the performance penalty for language features that are ultimately necessary for the current compilation (e.g. virtuals are automatically dropped if the linker finds that nowhere in the output currently needs it)

With such LTO,  why not do away the virtual specifier and make everything unspecified virtual by default anyway (like Java)? For decades, we’ve been making up stories that some classes are not meant to be derived (like STL containers), but the underlying motive is that we don’t want to pay for vtable if we don’t have to.

Instead of confusing new programmers about when should they make a method virtual (plenty of rule-of-thumbs became dogma), focus on telling them whenever they (choose to upcast a reference/pointer to the parent anywhere in their code and) invoke the destructor through the parent reference/pointer, they will pay a ‘hefty’ price of vtable and vptr.

I don’t think anybody (old codebase) will get harmed by turning on virtuals by default and let the linker decide if those virtuals can be dropped. If it changes anything, it might turn buggy code with the wrong destructor called into correct code which runs slower and takes up more space. In terms of correctness, this change might break low-level hacks that expects the objects to be of certain size (e.g. alignment) without vptr.

Even better, add a class specifier that mandates that all uses of its child must not invoke vtable (have the compiler catch that) unless explicitly overridden (the users decide to pay for the vtable). This way the compiler can warn about performance and space issues for the migration.

The old C++’s ideal was “you only pay for the language features you used (written)”, but as compilers gets better, we might be able change it to “you pay extra only for the language features that are actually used (in the finally generated executable) with your permission”.

I’d also like to add Return Value Optimization (RVO) into my list of compiler advances that changes the way we code. C++11 added move semantics, but I think it’s something that the compiler in the future could be able to manage themselves. Even with an old C++ compiler like the one shipped with VisualDSP 5.0, the copy constructor was not called (yes, skipping it is legal even if the copy constructor has side effects) when I do this:

Matrix operator+(const Matrix& a, const Matrix& b)
{
Matrix c(a.dim);
// ... for all element i, c.raw[i] = a.raw[i]+b.raw[i]
return c;
}
Matrix c = a + b;

Actually, the compiler at that time was not that smart about RVO, the actual code I wrote originally had two return branches, which defeats RVO (it’s a defined behavior by the specs):

Matrix operator+(Matrix a, Matrix b)
{
Dims m = a.dims;
if( m == b.dims ) // Both inputs must have same dimensions
{
Matrix c(m); // Construct matrix c with same dimension as a
// ... for all i, c.raw[i] = a.raw[i] + b.raw[i]
return c;
}
else
{
return Matrix::dummy; // A static member, which is a Matrix object
}
}

Matrix operator+(Matrix a, Matrix b)
{
Dims m = a.dims;
if( m == b.dims ) // Both inputs must have same dimensions
{
Matrix c(m); // Construct matrix c with same dimension as a
// ... for all i, c.raw[i] = a.raw[i] + b.raw[i]
}
else
{
Matrix c = Matrix::dummy; // or just "Matrix c";
}
return c;
}

I think days are counting before C++ compilers can do “copy-on-write” like MATLAB does if independent compilation are no longer mandatory!

Given my extensive experience with MATLAB, I’d say it took me a while to get used designing my code with “copy-on write” behavior in mind. Always start with expressive, maintainable, readable and correct code keeping in mind the performance concerns only happens under certain conditions (i.e. passed object gets modified inside the function).

If people start embracing the mentality of letting the compiler do most of the mechanical optimization, we’ll move towards a world that debugging work are gradually displaced by performance-bottleneck hunting. In my view, anything that can be done systematically by programming (like a boilerplate code or idioms) can eventually be automated by better compiler/linker/IDE and language design. It’s the high-level business logic that needs a lot of software designers/engineers to translate fuzzy requirements into concrete steps.

Matt also developed a great website (http://godbolt.org/) that compiles your code repeatedly on the fly and shows you the corresponding assembly code. Here’s an example of how I use it to answer my question of “Should I bother to use std::div() if I want both the quotient and remainder without running the division twice?”:

The website also included a feature to share the pasted code through an URL.

As seen from the emitted assembly code, the answer is NO. The compiler can figure out that I’m repeating the division twice and do only one division and use the quotient (stored in eax) and remainder (stored in edx). Trying to enforce one division through std::div() requires an extra function call, which is strictly worse.

The bottom line: don’t help the compiler! Modern compiler does context free optimizations better than we do. Use the time and energy to rethink about the architecture and data structure instead!

975 total views