Rationale Behind C++ Commandments (5) – OOP design

The idea of bundling code and program into a layout (classes) and injecting it with different data (objects) leads to a ‘new’ way (newer than C) of organizing our programs through the worldview of objects.

Every unit is seen as

  • a state: all member variables
  • possible actions: methods = member functions.

that is ready to interact with other objects.


Encapsulation (through access control)

The first improvement upon OOP is privacy (data encapsulation). You can have finer controls of what to share and with who. In C++, your options are:

  • public: everybody
  • private: only within yourself (internal use)
  • protected: only shared with descendants (inheritance discussed below)

Granting certain class as friend (anywhere in the class declaration with friend class F) exposes the non-public sections specifically to the friend F. This is often a ‘loophole’ to access control that finds few legitimate uses other than testing.

friend functions are traditionally used in binary (2-input) operator overloading, but the modern wisdom is to screw it and just leave it out there as free functions!

protected has very few good uses other than preventing heap delete through base pointer non-polymorphically (child destructor not called: BAD) by making the base destructor non-public (i.e. meaning it’d be impossible to have base objects on stack) while letting the child chain the parent’s destructor (child can’t access it if it’s marked as private).

protected member variables are almost always a bad idea.


Inheritance

The second improvement is to allow classes to build on top of existing ones. What gets interesting (and difficult) is when the child ‘improve’ on the parent by either by replacing what they have (member variables) and what they do (methods) with their own.

Static data members inherit REFERENCES to the parent!

Inheritance AT LEAST always inherits an interface (can optionally inherit implementation).

Base implementation MUST NOT be inheritedpure virtual methods
Base implementation inherited by defaultvirtual
Base implementation MUST be inheritednon-virtual (and not shadow it)

Shadowing

Whenever the member (function or variable) name is used in any form (even with different argument types or signatures), the parent member with the same name will be hidden. The behavior is called shadowing, and it applies unless you’ve overridden ALL versions (signatures) of virutal parent methods which shares the same function name mentioned in child.

  • Any non-overriden method with the same name as the parent appearing in the child will shadow all parent methods with the same name regardless of whether they are declared virtual and overriden at child.
  • You can unhide parent methods with the same name (but different signature) by using Parent::f(..) declared at the child class.
  • Shadowing implies there’s always one parent version and one child version stored separately under all conditions {static or non-static}x{function or variable}
  • Static members don’t really ‘shadow’ because there’s only one global storage for each (parent and child) if you declare the same variable name again in the child. There’s nothing to hide because you cannot cast or slice a namespace! With static members, you have to be explicit about which class you are calling from with SRO like Parent::var or Child::var so there’s no potential for ambiguities.

Overriding

Just like C, C++ uses static binding that takes the programmer’s word for it for their declared types, especially through handles. Overriding is a concept only needed when you plan to upcast your objects (child accessed through pointer/reference) to handle a broader class of objects but intend to the underlying object’s own version (usually child) of the methods (especially destructors) called by default.

We do this by declaring the parent method virtual and implement the child versions (must be of the same function signature). Overriding only make sense for non-static methods because

  • data members cannot be overridden (it’d confusing if it’s possible. We down-delegate functions/behavior but not the data/state). It’s better off hiding data members behind getters/setters to declare the intention.
  • static members and methods behaves like static variable/functions (living in .data or .bss) using namespaces, so we can only refer to them with SRO by the class names like Parent::f() and Child::a, not a class type like Parent p; p.f() and Child c; c.a. There’s no object c for you to upcast to Parent so there’s place for polymorphic behavior.

Overriding involves leaving clues in objects so the upcasted references can figure out the correct methods of the underlying objects to call. In C++ it’s done with having a vtable (pointers to overridable methods, often stored in .rodata with string literals) for each class in the hierarchy and each object contains a pointer to the vtable that matches its underlying class.

[38] virtual only applies to methods’ signatures (function name and the data types in the argument list). vtable do not keep track of argument’s default values (if assigned) for efficiency (it’ll always read the static upcast, aka parent methods’ default values).


Classes (after considering inheritance)

Design relationships

  • class behaves through public methods
  • Inheritance at least always inherits an interface
  • IS-A relationship is done with public-inheritance
  • … (incomplete, will update later)

 86 total views

Rationale Behind C++ Commandments (3) – Classes came from emulating POD data types through struct and namespaces

In structured programming (like C and C++), the building abstractions is program (functions) and data (variables).

Under the hood, especially in von-Neumann architecture’s perspective, functions and variables are both just data (a stream of numbers) that can be moved and manipulated the same way just like data. It’s all up to how the program designer and the hardware choose to give meaning to the bit stream.


Namespaces

In C, we can only scope our variables 3 ways: global, static (stays within same file/translation unit) and local. Sharing variables across functions in different translation units can only be done through

  • globals (pollutes namespace and it’s difficult to keep track of who is doing what to the variables and the state at any time)
  • passing (the more solid way that gives tighter control and clearer data flow, but managing how to pass many variables in many places is messy, even with struct syntax)

Bundling program with data gives a new way to tightly control the scope of variables: you can specify a group functions allowed to share the same set of variables in the bundle WITHOUT PASSING arguments.

The toolchain modified to recognize the user-defined scope boundaries which bundles program and data into packages, thus reducing root namespace pollution. The is implemented as namespace keyword in C++

Organizing with namespaces is basically justifying the mentality of using globals (in place of passing variables around intended functions) except it’s in a more controlled manner to keep the damages at bay. The same nasty things with gloabls can still appear if we didn’t design the namespace boundaries tightly so certain functions have access to variables that’s not intended for it.

Therefore, namespaces works nearly identical to a super-simple purely static class (see below) except you lose inheritance and access modifiers in classes in exchange for allowing anonymous namespaces.

Basically namespaces + structs + inheritance + encapsulation (access modifiers) = classes


Classes

Classes extends the idea of namespaces by allowing objects (each assigned their own storage space for the variables following the same variable layout) to be instantiated, so they behave like POD (Plain Old Data) in C. We should observe that when overloading operators

  • [15] allow (a=b)=c chaining by returning *this for operator=
  • [21] disallow rvalue assignment (a+b)=c by returning const object

In the most primitive form (no dynamic binding and types, aka virtuals and RTTI), function (method) info is not stored within instantiated objects as the compiler will sort out what classes/namespace they belong to. So it screams struct in C!

C struct is what makes (instantiates) objects from classes!

Note that C structs do not allow ‘static fields’ because static members is solely a construct of namespaces idea in C++! C++ has chosen to expand structs to be synonymous to classes that defaults to private access (if not specified) so code written as C structs behaves as expected in C++.

 85 total views

Rationale Behind C++ Commandments (2) – Philosophy of C

Everything is seen as a bitstream

  • pointers are just integers to memory locations
    – [25] integer and pointers might be indistinguishable in signature resolution
  • code (CPU instructions) and addresses are treated the same way as a stream of data
    – concept of function pointers leads to lambda (functors)
    – classes came from structs containing data and function pointers (combined with namespaces)!
  • unchecked type declarations: the compiler trusts your interpretations of data
    – leads to run-time features such as overriding (virtual methods)
  • handles (pointers and references) has unrestricted power
    – [29] can const_cast it away if the handle is exposed (bad idea)

Performance-first design choice

  • do not pay performance penalty for features not used
    – static compilation and binding by default
    – unchecked type declarations (see above)
  • static compilation: the compiler tries to know everything at compile time
  • static binding by default (cheapest)
    – pay extra to use virtual methods (overriding)
    – [38] default parameter values are statically bound and not stored in vtable (i.e. overridden child method’s default values are ignored and parent’s default values are used ONLY WHEN called through up-casts)
  • inline is at the mercy of the optimizer (which can choose to emit an object if decided inlining is counter-productive). Mechanism that forces a function pointer to exist (pointing the function, virtual functions creates the pointer in vtable)

Toolchain

  1. preprocessor (parser & macros)
  2. compiler (create object files per translation unit, which is .c file in C)
    – access control (encapsulation) extends the old trick of emulating private in C++ through macros by marking functions as static (local within translation unit) in C.
  3. linker (combine object files and adjust the addresses)

Templates behaves like a combination of macros (copy-and-paste with parameters) except it’s spread across the toolchain like inline optimizations:

  • Code bloat (one copy per type combination)
  • Can only live in the header files (it’s a template, not realized code, so no object is emitted like a .cpp file)

Parsing (language design)

  • most vexing parse [Effective STL Item 6]: if something can be interpreted as a function declaration, it will be interpreted as a function declaration

Plain Old Data Types (C++ classes tried to emulate in their operator overloading behavior)

  • [15] allow (a=b)=c chaining by returning *this for operator=
  • [21] disallow rvalue assignment (a+b)=c by returning const object

 81 total views

Rationale Behind C++ Commandments (1) – Introduction

If you’ve programmed in (or studied) C++ long enough, like you have read Scott Meyers’s Effective C++, which is a book organized in, jokingly, commandments like ‘thou shall make destructors virtual’. There’s a lot of stuff to remember.

I’ve found an approach to make the ideas stick: by understanding the rationale behind these commandments through the lens of ‘What would you do if you were to make C++ (features) out of C?

C++ is not a language designed from scratch. A lot of quirks and oddities in C++ came straight from the philosophy and the language features naturally available in C. With the right jargons (concepts), you will find a lot of the seemingly counter-intuitive behavior ‘it ought to be like this because of (insert design choice here)‘.

This is what we are going to explore in the “Rational Behind C++ Commandments” (RBCC) blog post series which came from my notes when I was going through Scott Meyer’s book. Once you get the ideas, you should be able to come up with the rules in Effective C on your own (so you don’t have to blindly remember them).

 86 total views

Using Dart’s C-style syntax to make chain of lambda functions concrete

Start with an example lambda calculus expression f\equiv\lambda x.(y+x). It is:

  • [programming view] a function with x as an input argument and it uses y from the outer workspace (called a free-variable). f(var x) { return y+x; }
  • [mathematical view] can also be written as f(x) = y+x where y is seen as a fixed/snapshotted value relative to the expression.
g(int y) => (int x) => y + x

Lambda calculus is right-associative

g(int y) => ( (int x) => y + x )

Unrolling in C-style it will give better insights to the relationships between layers

g(int y) {
  return (int x) { return y + x; };
}

Note that (int x) { return y + x; } is a functor. To emphasize this, the same code can be rewritten by assigning the name f to it and returning f:

g(int y) {
  f(int x) { return y + x; };
  return f;
}

Use the C++11 style syntax so that it doesn’t look like nested function implementation body instead of a functor nested inside a function:

g(int y) {
  var f = (int x) { return y + x; };
  return f;
}

Note that conceptually what we are doing with the wrapper cascade is indeed nested functions. However, in the wrapper, we spit out a functor (which did most of the partial work) so the user can endow/evaluating it with the last piece of needed info/input:

g(int y) {
  f(int x) { 
   return y + x;
  };
  return f;
}

More commonly as seen in Dart docs, this formatting shows a (binder/capturing) wrapper function returning its own nested function:

g(int y) {
  return (int x) { 
           return y + x;
         };
}

 94 total views

Understanding the difference between recognized arrays and pointers 'Recognized' means sizeof(array_name) gives the underlying allocated size

array≠ pointer:

pointer only contains a memory location,
while an array already has memory allocated to hold the data.


The confusion comes from the fact that array names are always seen as pointers anywhere in C, but when an array name is referred in places that the scope happens know the allocated size, namely

  • Global arrays: everybody knows the size
  • Local arrays: only the instantiating function knows its size.

, the array name itself has a superpower that pointers lack: report the underlying allocated data size (NOT pointer size) using sizeof(array).


Definition: An array is ‘recognized‘ if the array name is used in the scope that knows the underlying data size.

Corollary: Calling the array name with sizeof() gives the underlying allocated data size.

Examples of consequences that can be derived from the definition above:

  • Heap allocations always return a pointer type, NOT an array name!
    So heap arrays are never recognized.
  • VLA in C99 are considered local stack arrays, so it’s recognized
  • x[] is just a cosmetic shorthand for *x: it doesn’t prevent any recognized array from decaying into a pointer across boundary.
  • The storage duration (static or not) does not matter. e.g.
    • Heap pointers at global level are not recognized arrays
    • Static local array still loses the recognition across function boundaries
      (unless passed carefully by data type T (&array)[N]).

Most often recognized arrays cannot be aliased without decaying into a pointer. However, we can bind a recognized array to a reference to an array, which is a completely different type. Example:

int v[]{1,2,3,4};
int (&w)[4]=v;  // w is a reference to an array of size 4

int* p = v;     // Decays v to a pointer. Size information lost.
// int &w[4]=v; // Does not compile: this means an array of 4 references.

Note that the syntax requires a bracket for reference name. Omitting it will lead the compiler to misinterpret it as an array of references, which cannot* be compiled.

This means contrary to common beliefs, you can pass a recognized array across functions through reference, but this is rarely done because of the hassle of explicitly entering the number of elements (4 for the example above) as part of the data type. This can still be done through templates/constexpr, but for such inconvenience, we’re better off using std::vector (or std::array if you want near zero overhead).

However, so far I haven’t found a way to re-recognize an array from a pointer. That means there is no way to keep a local array’s recognition across function boundaries in C since it does not have references like C++.


To summarize with a usage example: this post has described the entire logic needed to decide whether sizeof(x)/sizeof(x[0]) gives you the number of array elements, or how many times your machine pointer type is bigger than the element storage.


* references must be bound on creation. Declaring an array of references means you want to bound references in batches. There are no mechanisms to do so as of C++14.

 

 1,161 total views

Do not help the compiler at the expense of readability Unless you read the assembly code emitted at the bottleneck and did benchmarks

Compilers has gotten smarter and smarter nowadays that they’d be able to analyze our code for common patterns (or logically deduce away steps that doesn’t have to be  performed at runtime).

Matt Godbolt gave a nice presentation at CppCon 2017 named “What Has My Compiler Done for Me Lately?”. Through observing the emitted assembly code at different optimization levels, he showed that the compiler doesn’t need to be micromanaged (through performance hacks in our code) anymore, as it will emit instructions as the performance-hacked code intended when it is better to do so.

It means the compiler writers already know our bag of performance hack tricks way better than we do. Their efforts spared us from premature optimization and leave us more time to find a better data structure or algorithm to solve the problem.

What I got from the lecture is NOT that we are free to write clumsy code and let the compiler sort it out (though it occasionally can, like factoring a loop doing simple arithmetic series into a one line closed form solution), but we don’t have to make difficult coding choices to accommodate performance anymore.

The most striking facts I learned from his lecture are

  • The compiler can emit a one-line CPU instruction that does not have a corresponding native operation C/C++ if your hardware architecture supports it. (e.g. clang can convert a whole loop that counts the number of set bits into just ‘popcnt eax, edi‘)
  • Through Link-Time Optimization (LTO), we don’t have to pay the performance penalty for language features that are ultimately necessary for the current compilation (e.g. virtuals are automatically dropped if the linker finds that nowhere in the output currently needs it)

With such LTO,  why not do away the virtual specifier and make everything unspecified virtual by default anyway (like Java)? For decades, we’ve been making up stories that some classes are not meant to be derived (like STL containers), but the underlying motive is that we don’t want to pay for vtable if we don’t have to.

Instead of confusing new programmers about when should they make a method virtual (plenty of rule-of-thumbs became dogma), focus on telling them whenever they (choose to upcast a reference/pointer to the parent anywhere in their code and) invoke the destructor through the parent reference/pointer, they will pay a ‘hefty’ price of vtable and vptr.

I don’t think anybody (old codebase) will get harmed by turning on virtuals by default and let the linker decide if those virtuals can be dropped. If it changes anything, it might turn buggy code with the wrong destructor called into correct code which runs slower and takes up more space. In terms of correctness, this change might break low-level hacks that expects the objects to be of certain size (e.g. alignment) without vptr.

Even better, add a class specifier that mandates that all uses of its child must not invoke vtable (have the compiler catch that) unless explicitly overridden (the users decide to pay for the vtable). This way the compiler can warn about performance and space issues for the migration.

The old C++’s ideal was “you only pay for the language features you used (written)”, but as compilers gets better, we might be able change it to “you pay extra only for the language features that are actually used (in the finally generated executable) with your permission”.


I’d also like to add Return Value Optimization (RVO) into my list of compiler advances that changes the way we code. C++11 added move semantics, but I think it’s something that the compiler in the future could be able to manage themselves. Even with an old C++ compiler like the one shipped with VisualDSP 5.0, the copy constructor was not called (yes, skipping it is legal even if the copy constructor has side effects) when I do this:

Matrix operator+(const Matrix& a, const Matrix& b)
{
  Matrix c(a.dim);
  // ... for all element i, c.raw[i] = a.raw[i]+b.raw[i]
  return c;
}
Matrix c = a + b;

Actually, the compiler at that time was not that smart about RVO, the actual code I wrote originally had two return branches, which defeats RVO (it’s a defined behavior by the specs):

Matrix operator+(Matrix a, Matrix b)
{
  Dims m = a.dims;
  if( m == b.dims ) // Both inputs must have same dimensions
  {
    Matrix c(m); // Construct matrix c with same dimension as a
    // ... for all i, c.raw[i] = a.raw[i] + b.raw[i]
    return c;
  } 
  else 
  {
    return Matrix::dummy; // A static member, which is a Matrix object
  }
}

To take advantage of RVO, I had to reword my code

Matrix operator+(Matrix a, Matrix b)
{
  Dims m = a.dims;
  if( m == b.dims ) // Both inputs must have same dimensions
  {
    Matrix c(m); // Construct matrix c with same dimension as a
    // ... for all i, c.raw[i] = a.raw[i] + b.raw[i]
  } 
  else 
  {
    Matrix c = Matrix::dummy; // or just "Matrix c";
  }
  return c;
}

I think days are counting before C++ compilers can do “copy-on-write” like MATLAB does if independent compilation are no longer mandatory!

Given my extensive experience with MATLAB, I’d say it took me a while to get used designing my code with “copy-on write” behavior in mind. Always start with expressive, maintainable, readable and correct code keeping in mind the performance concerns only happens under certain conditions (i.e. passed object gets modified inside the function).

If people start embracing the mentality of letting the compiler do most of the mechanical optimization, we’ll move towards a world that debugging work are gradually displaced by performance-bottleneck hunting. In my view, anything that can be done systematically by programming (like a boilerplate code or idioms) can eventually be automated by better compiler/linker/IDE and language design. It’s the high-level business logic that needs a lot of software designers/engineers to translate fuzzy requirements into concrete steps.


Matt also developed a great website (http://godbolt.org/) that compiles your code repeatedly on the fly and shows you the corresponding assembly code. Here’s an example of how I use it to answer my question of “Should I bother to use std::div() if I want both the quotient and remainder without running the division twice?”:

The website also included a feature to share the pasted code through an URL.

As seen from the emitted assembly code, the answer is NO. The compiler can figure out that I’m repeating the division twice and do only one division and use the quotient (stored in eax) and remainder (stored in edx). Trying to enforce one division through std::div() requires an extra function call, which is strictly worse.

The bottom line: don’t help the compiler! Modern compiler does context free optimizations better than we do. Use the time and energy to rethink about the architecture and data structure instead!

 1,015 total views,  1 views today

C Traps and Pitfalls

Here’s a concise paper describing common C programming pitfalls by Andrew Koening (www.literateprogramming.com/ctraps.pdf) corresponding to be book with the same title.

As a reminder to myself, I’ll spend this page summarizing common mistakes and my history with it.

Here are the mistakes that I don’t make because of certain programming habits:

  • Operator precedence: I use enough parenthesis to not rely on operator precedence
  • Pointer dereferencing: I always do *(p++) instead of *p++ unless it’s idiomatic.
  • for() or if() statements executing only first line: I always surround the block with {} even if it’s just one line. Too often we need to inject an extra line and without {} it becomes a trap.
  • Undefined side effect order: I never do something like y[i++]=x[i]
  • char* p, q: very tempting since C++ style emphasize on pointer as a type over whether the variable is a pointer. I almost never declare multiple variables in one line.
  • Macro repeating side effects: use inline functions instead whenever possible. Use templates in C++.
  • Unexpected macro associations: guard expressions with (). Use typedef.

Did these once before, adjusted my programming habits to avoid it:

  • counting down in for() loop with unsigned running variable: I stick with signed running variables in general. If I’m forced to use unsigned, I’ll remind myself that I can only stop AFTER hitting 1, but not 0 (i.e. i=0 never got executed).

Haven’t got a chance to run into these, but I’ll program defensively:

  • Integer overflow: do a<b instead of (a-b)<0. Calculate mean by adding halfway length to the smaller number (i.e. (a+b)/2 == a + (b-a)/2 given a<b). Shows up in binary search.
  • Number of bits to shift is always unsigned (i.e. -1 is a big number!)

What I learned from the paper:

  • stdio buffer on stack (registered with setbuf()) freed before I/O flushed: use static buffer (or just make sure the buffer lives outside the function call).
  • char type might be signed (128 to 255 are -128 to -1) so it sign extends during upcast. Use unsigned char go guarantee zero extend for upcasting.
  • toupper()/tolower() might be implemented as a simple macro (no checks, incorrect /w side effects)
  • Can index into a string literal: "abcdefg"[3] gives 'd'

Mistakes that I usually make when I switch back from full-time MATLAB programming:

  • Logical negation using ~ operator instead of ! operator.

Common mistakes I rarely make because of certain understanding:

  • Forgetting to break at every case in switch block. It’s hard to forget once you’re aware of the Duff’s device.
  • sizeof(a[])/sizeof(a[0]) after passing into a function does not give array length: hard to get it wrong once you understand that array (declared on stack) has meta-info that cannot be accessed beyond the stack level it’s initialized.

 1,356 total views