characters (\d digits,\w word (i.e. letter/digit/underscore), \s whitespace).
[]character classes (define rules over what characters are accepted, unlike the . wildcard) [3-7] hypen inside [] bracket can specify ranges to mean things such as `[3,4,5,6,7]` [^ ...] is the mirror of it to exclude the mentioned characters
|choices (think of it as OR)
Complement (i.e. everything but) version are capitalized, such as \D is everything not a \d
whitespaces (\n newline, \t tab,
Modifiers
repetition quantifiers (? 0~1 times, + at least once, * any times, {match how many times})
(? ...)inline modifiers alters behaviors such as how newlines, case sensitivity, whether (...) captures or just groups, and comments within patterns are handled
Positioning rules
anchors (^ begins with, $ ends with)
\b word boundary
Output behavior
(...)capturing group, (?: ...)non-capturing group
\(index)content of previous matched groups/chunks referred to by indices. This feature generates derived new content instead of just extracting
(?( = | <= | ! | <! ) ...assertions...)lookaroundsskips the contents mentioned in ...assertion... before/after the pattern so you can toss out the matched assertion from your capture results.
(?s) Also match newline characters (‘single-line’ or DOTALL mode)
Starting with (?s)flag (also called inline modifiers) expands the . (dot) single character pattern to ALSO match multiple lines (not by default).
Useful for extracting the contents of HTML blocks blindly and post-process it elsewhere
(?m) Pattern starts over as a new string for each line (‘multi-line’ mode)
Starting with (?m) flag tells anchors ^ (begin with) and $ (end with) to
Assertions: use lookarounds to skip (not capture) patterns (?( = | <= | ! | <! ) assertion pattern)
< is lookbehind, no prefix-character is lookahead. -ahead/-behind refers to WHERE the you want TO CAPTURE relative to the assertion pattern, NOT what you want to assert (match and throw) away (inside the (? ...) )
= (positive) asserts the pattern inside the lookaround bracket, ! (negative) asserts the pattern inside the lookaround bracket MUST BE FALSE.
Assertions are very useful for getting to the meat you really want to capture rather than sifting through patterns introduced solely for making assertions that you intended to throw away
For GUI development, we often start with controls (or widgets) that user interact with and it’ll emit/run the callback function we registered for the event when the event happens.
Most of the time we just want to read the state/properties of certain controls, process them, and update other controls to show the result. Model-View-Controller (MVC) puts strict boundaries between interaction, data processing and display.
The most common schematic for MVC is a circle showing the cycle of Controller->Model->View, but in practice, it’s the controller that’s the brains. The view can simultaneously accept user interactions, such as a editable text box or a list. The model usually don’t directly update the view directly on its own like the idealized diagram.
From https://www.educative.io/blog/mvc-tutorial
With MVC, basically we are concentrating the control’s callbacks to the controller object instead of just letting each control’s callback interact with the data store (model) and view in an unstructured way.
When learning Flutter, I was exposed to the Redux pattern (which came from React). Because the tutorials was designed around the language features of Dart, the documentation kind of obscured the essence of the idea (why do we want to do this) as it dwelt on the framework (structure can be refactored into a package). The docs talked a lot about boundaries but wasn’t clearly why they have to be meticulously observed, which I’ll get to later.
The core inspiration in Redux/BLoC is taking advantage of the concept of ‘listening to a data object for changes’ (instead of UI controls/widget events)!
Instead of having the UI control’s callback directly change other UI control’s state (e.g. for display), we design a state vector/dictionary/struct/class that holds contents (state variables) that we care. It doesn’t have to map 1-1 to input events or 1-1 to output display controls.
When an user interaction (input) event emitted a callback, the control’s callback do whatever it needs to produce the value(s) for the relevant state variable(s) and change the state vector. The changed state vector will trigger the listener that scans for what needs to be updated to reflect the new state and change the states of the appropriate view UI controls.
This way the input UI controls’ callbacks do not have to micromanage what output UI controls to update, so it can focus on the business logic that generates the content pool that will be picked up by the view UI controls to display the results. In Redux, you are free to design your state variables to match more closely to the input data from UI controls or output/view controls’ state. I personally prefer a state vector design that is closer to the output view than input controls.
The intuition above is not the complete/exact Redux, especially with Dart/Flutter/React. We also have to to keep the state in ONE place and make the order of state changes (thus behavior) predictable!
Actions and reducers are separate. Every input control fires a event (action signal) and we’ll wait until the reducers (registered to the actions) to pick it up during dispatch() instead of jumping on it. This way there’s only ONE place that can change states. Leave all the side effects in the control callback where you generate the action. No side effects (like changing other controls) allowed in reducers!
Reducers do not update the state in place (it’s read only). Always generate a new state vector to replace the old one (for performance, we’ll replace the state vector if we verified the contents actually changed). This will make timing predictable (you are stepping through state changes one by one)
In Javascript, there isn’t really a listener actively listening state variable changes. Dispatch (which will be called every time the user interacts using control) just runs through all the listeners registered at the very end after it has dispatched all the reducers. In MATLAB, you can optionally set the state vector to be Observable and attach the change listener callback instead of explicitly calling it within dispatch.
Here is an example of a MATLAB class that captures the spirit of Redux. I added a 2 second delay to emulate long operations and used enableDisableFig() to avoid dealing with queuing user interactions while it’s going through a long operation.
classdef ReduxStoreDemo < handle
% Should be made private later
properties (SetAccess = private, SetObservable)
state % {count}
end
methods (Static)
% Made static so reducer cannot call dispatch and indirectly do
% side effect or create loops
function state = reducer(state, action)
% Can use str2fun(action) here or use a function map
switch action
case 'increment'
fprintf('Wait 2 secs before incrementing\n');
pause(2)
state.count = state.count + 1;
fprintf('Incremented\n');
end
end
end
% We keep all the side-effect generating operations (such as
% temporarily changing states in the GUI) in dispatch() so
% there's only ONE PLACE where state can change
methods
function dispatch(obj, action, src, evt)
% Disable all figures during an interaction
figures = findobj(groot, 'type', 'figure');
old_fig_states = arrayfun(@(f) enableDisableFig(f, 'off'), figures);
src.String = 'Wait ...';
new_state = ReduxStoreDemo.reducer(obj.state, action);
% Don't waste cycles updating nops
if( ~isequal(new_state, obj.state) )
% MATLAB already have listeners attached.
% So no need to scan listeners like React Redux
obj.state = new_state;
end
% Re-enable figure obj.controls after it's done
arrayfun(@(f, os) enableDisableFig(f, os), figures, old_fig_states);
src.String = 'Increment';
end
end
methods
function obj = ReduxStoreDemo()
figure();
obj.state.count = 0;
h_1x = uicontrol('style', 'text', 'String', '1x Box', ...
'Units', 'Normalized', ...
'Position', [0.1 0.3, 0.2, 0.1], ...
'HorizontalAlignment', 'left');
addlistener(obj, 'state', ...
'PostSet', @(varargin) obj.update_count_1x( h_1x , varargin{:}));
uicontrol('style', 'pushbutton', 'String', 'Increment', ...
'Units', 'Normalized', ...
'Position', [0.1 0.1, 0.15, 0.1], ...
'Callback', @(varargin) obj.dispatch('increment', varargin{:}));
% Force trigger the listeners to reflect the initial state
obj.state = obj.state;
end
end
%% These are 'renders' registered when the uiobj.controls are created
% Should stick to reading off the state. Do not call dispatch here
% (just leave it for the next action to pick up the consequentials)
methods
% The (src, event) is useless for listeners because it's not the
% uicontrol handle but the state property's metainfo (access modifiers, etc)
function update_count_1x(obj, hObj, varargin)
hObj.String = num2str(obj.state.count);
end
end
end
The idea of bundling code and program into a layout (classes) and injecting it with different data (objects) leads to a ‘new’ way (newer than C) of organizing our programs through the worldview of objects.
Every unit is seen as
a state: all member variables
possible actions: methods = member functions.
that is ready to interact with other objects.
Encapsulation (through access control)
The first improvement upon OOP is privacy (data encapsulation). You can have finer controls of what to share and with who. In C++, your options are:
public: everybody
private: only within yourself (internal use)
protected: only shared with descendants (inheritance discussed below)
Granting certain class as friend (anywhere in the class declaration with friend class F) exposes the non-public sections specifically to the friend F. This is often a ‘loophole’ to access control that finds few legitimate uses other than testing.
friend functions are traditionally used in binary (2-input) operator overloading, but the modern wisdom is to screw it and just leave it out there as free functions!
protected has very few good uses other than preventing heap deletethrough base pointer non-polymorphically (child destructor not called: BAD) by making the base destructor non-public (i.e. meaning it’d be impossible to have base objects on stack) while letting the child chain the parent’s destructor (child can’t access it if it’s marked as private).
The second improvement is to allow classes to build on top of existing ones. What gets interesting (and difficult) is when the child ‘improve’ on the parent by either by replacing what they have (member variables) and what they do (methods) with their own.
Inheritance AT LEAST always inherits an interface (can optionally inherit implementation).
Base implementation MUST NOT be inherited
pure virtual methods
Base implementation inherited by default
virtual
Base implementation MUST be inherited
non-virtual (and not shadow it)
Shadowing
Whenever the member (function or variable) name is used in any form (even with different argument types or signatures), the parent member with the same name will be hidden. The behavior is called shadowing, and it applies unless you’ve overridden ALL versions (signatures) of virutal parent methods which shares the same function name mentioned in child.
Any non-overriden method with the same name as the parent appearing in the child will shadow all parent methods with the same name regardless of whether they are declared virtual and overriden at child.
You can unhide parent methods with the same name (but different signature) by using Parent::f(..) declared at the child class.
Shadowing implies there’s always one parent version and one child version stored separately under all conditions {static or non-static}x{function or variable}
Static members don’t really ‘shadow’ because there’s only one global storage for each (parent and child) if you declare the same variable name again in the child. There’s nothing to hide because you cannot cast or slice a namespace! With static members, you have to be explicit about which class you are calling from with SRO like Parent::var or Child::var so there’s no potential for ambiguities.
Overriding
Just like C, C++ uses static binding that takes the programmer’s word for it for their declared types, especially through handles. Overriding is a concept only needed when you plan to upcast your objects (child accessed through pointer/reference) to handle a broader class of objects but intend to the underlying object’s own version (usually child) of the methods (especially destructors) called by default.
We do this by declaring the parent method virtual and implement the child versions (must be of the same function signature). Overriding only make sense for non-static methods because
data members cannot be overridden (it’d confusing if it’s possible. We down-delegate functions/behavior but not the data/state). It’s better off hiding data members behind getters/setters to declare the intention.
static members and methods behaves like static variable/functions (living in .data or .bss) using namespaces, so we can only refer to them with SRO by the class names like Parent::f() and Child::a, not a class type like Parent p; p.f() and Child c; c.a. There’s no object c for you to upcast to Parent so there’s place for polymorphic behavior.
Overriding involves leaving clues in objects so the upcasted references can figure out the correct methods of the underlying objects to call. In C++ it’s done with having a vtable (pointers to overridable methods, often stored in .rodata with string literals) for each class in the hierarchy and each object contains a pointer to the vtable that matches its underlying class.
[38] virtual only applies to methods’ signatures (function name and the data types in the argument list). vtable do not keep track of argument’s default values (if assigned) for efficiency (it’ll always read the static upcast, aka parent methods’ default values).
Function signature system, which allows users to use the same function name in different functions as long as they differ in the combination of
input arguments types
const modifiers counts as a different input argument type
object const-ness (whether it’s const-method or not) – this only make sense with classes
and C++ will figure out what to call by matching the call with the available combinations (signatures).
C does not allow the same function name to be used in different places, so under the hood, it’s done through name mangling (generating a unique ‘under-the-hood’ function name based on the signature). This mechanism has a lot of implications that a professional programmer should observe:
since C does not mangle its names in the object code, they’ll need to be wrapped around with extern “C” block in a C++ program so C++ won’t pervert (mangle) their function names with input arguments.
[24] parameter defaulting might be ambiguous with another function that does not have the said parameter (the compiler will cry about it)
[26] access controls/levels must play no part in resolving signatures because access level must not change the meaning of a program!
C++ resolve function overloading using signatures within its local namespace. Function overloading works for both
free functions (free functions are at the root namespace), as well as
classes (the name of the class itself is the namespace)
In structured programming (like C and C++), the building abstractions is program (functions) and data (variables).
Under the hood, especially in von-Neumann architecture’s perspective, functions and variables are both just data (a stream of numbers) that can be moved and manipulated the same way just like data. It’s all up to how the program designer and the hardware choose to give meaning to the bit stream.
Namespaces
In C, we can only scope our variables 3 ways: global, static (stays within same file/translation unit) and local. Sharing variables across functions in different translation units can only be done through
globals (pollutes namespace and it’s difficult to keep track of who is doing what to the variables and the state at any time)
passing (the more solid way that gives tighter control and clearer data flow, but managing how to pass many variables in many places is messy, even with struct syntax)
Bundling program with data gives a new way to tightly control the scope of variables: you can specify a group functions allowed to share the same set of variables in the bundle WITHOUT PASSING arguments.
The toolchain modified to recognize the user-defined scope boundaries which bundles program and data into packages, thus reducing root namespace pollution. The is implemented as namespace keyword in C++
Organizing with namespaces is basically justifying the mentality of using globals (in place of passing variables around intended functions) except it’s in a more controlled manner to keep the damages at bay. The same nasty things with gloabls can still appear if we didn’t design the namespace boundaries tightly so certain functions have access to variables that’s not intended for it.
Therefore, namespaces works nearly identical to a super-simple purely static class (see below) except you lose inheritance and access modifiers in classes in exchange for allowing anonymous namespaces.
Classes extends the idea of namespaces by allowing objects (each assigned their own storage space for the variables following the same variable layout) to be instantiated, so they behave like POD (Plain Old Data) in C. We should observe that when overloading operators
[15] allow (a=b)=c chaining by returning *this for operator=
[21] disallow rvalue assignment (a+b)=c by returning const object
In the most primitive form (no dynamic binding and types, aka virtuals and RTTI), function (method) info is not stored within instantiated objects as the compiler will sort out what classes/namespace they belong to. So it screams struct in C!
C struct is what makes (instantiates) objects from classes!
Note that C structs do not allow ‘static fields’ because static members is solely a construct of namespaces idea in C++! C++ has chosen to expand structs to be synonymous to classes that defaults to private access (if not specified) so code written as C structs behaves as expected in C++.