Dictionary of equivalent/analogous concepts in programming languages

CommonCC++MATLABPython
Variable arguments<stdarg.h>
T f(...)
Packed in va_arg
Very BAD!

Cannot overload
when signatures are uncertain.
varargin
varargout

Both packed as cells.

MATLAB does not have named arguments
*args (simple, stored as tuples)

**kwargs (specify input by keyword, stored as a dictionary)
Referencing
N/A
operator[](_) is for references
subsindex
subsassgn


[_] is for concat
{_} is for (un)pack
__getitem__()
__setitem__()
Default
values
N/ASupportedNot supported.
Manage with inputParser() or
newer arguments
Non-intuitive static data behavior. Stick to None or immutables.
Name-Value
Argument
Matching
Old way:
.., 'PropName', Value
and parse varargin

Since R2021a:
Name=Value
options in arguments
Name=Value
**kwargs
Major
Dimension
RowRowColumnRow (Native/Numpy)
Column for Pandas
ConstnessconstconstOnly in classesN/A (Consenting adults)
Variable
Aliasing
PointersReferencesNO! Rely on Copy-on-write
(No in-place functions*)

Handle classes under limited circumstances
References
= assignmentCopy one
element
Values: Copy
References: Bind
New Copy
Copy-on-write
NO VALUES
Bind references only
(could be to unnamed objects)
Chained
access operators
N/ADifficult to operator overload it rightDifficult to get it right. MATLAB had some chaining bugs with dataset() as well.Chains correctly natively
Assignment
expressions
(assignment evaluates to assigned lvalue)
==N/ANamed Expression :=
Version ManagementverLessThan()
isMATLABReleaseOlderThan
virtenv (Virtual Environment)
Exponentiation<math.h>
pow()
<cmath>
pow()
^**
Stream
(Conveyor belt mechanism. Saves memory)
I/O (std, file, sockets)
iterator in
STL containers
MATLAB doesn’t do references. Just increment indices.iterators (uni-directional only)
iter(): __iter__()
next(): __next__()
Loopingfor(init, cont_cond, next)C-style

for(auto running: iterable)
for k = array to iterate
list-comp

for (index, thing) in enumerate(lists)
Since MATLAB doesn’t do references, iterators (by extension generators) and functions that do in-place operations do not make sense (unless you bend it very hard with anti-patterns such as handles and dbstack).

Data Types

CommonCC++MATLABPython
SetsN/Astd::setOnly set operations, not set data type{ , , ...}
Dictionariesstd::unordered_map– Dynamic fieldnames
(qualified varnames as keys)
containers.Map() or dictionary() since R2022b
Dictionaries
{key:value}
(Native)
Heterogeneous containerscells {}lists (mutable)
tuples (immutable)
Structured
Heterogeneous containers
table()
dataset() [Old]

Mix in classes
Pandas Dataframe
Array,
Matrices &
Tensors
Native [ , ; , ]Numpy/PyTorch
Recordsstructclass
(members)
dynamic field (structs)
properties (class)

getfield()/setfield()
No structs
(use dicts)

attribute (class)
getattr()/setattr()
Type deductionN/AautoNativeNative
Type extractionN/Adecltype() for compile time (static)

typeid() for RTTI (runtime)
class()type()
Native sets operations in Python are not stable and there’s no option to use stable algorithm like MATLAB does. Consider installing orderly-set package.

Array Operations

CommonMATLABPython
Repeatrepmat()[] * N
np.repeat()
Logical IndexingNativeList comprehension
Boolean Indexing (Numpy)
Equally spaced numbersInternally colon():
start:step:end

linspace/logspace
range(begin, past_end, step)
produces an iterator

list(range()) or tuple(range())
iterates to realize the vector
Equally spaced indexingMATLAB has no generators,
so produced vector only
[start:past_end:step] is internally
slice() which produces a slice object, not range/lists/tuple. Faster but not iterable
Shallow copyDeep copy-on-writeSlice: x = y[:]
copy.copy()
Deep copyDeep copy-on-writecopy.deepcopy()

Editor Syntax

CommonCC++MATLABPython
Commenting/* ... */

// (only for newer C)
// (single line)

/* ... */ (block)
% (single line)

(Block):
%{
...
%}
# (single line)

""" or '''
is docstring which might be undersirably picked up
Reliable multi-line
commenting
(IDE)
Ctrl+(Shift)+R(Windows), / (Mac or Linux)[Spyder]:
Ctrl+1(toggle), 4(comment), 5(uncomment)
Code cell
(IDE)
%%[Spyder]:
# %%
Line
Continuation
\\...\
Console
Precision
format%precision (IPython)
Clear variablesclear / clearvars%reset -sf (IPython)
Macros only make sense in C/C++. This makes code less transparent and is frowned upon in higher level programming languages. Even its use in C++ should be limited. Use inline functions whenever possible.

Python is messy about the workspace, so if you just delete

Object Oriented Programming Constructs

CommonC++MATLABPython
Getters
Setters
No native syntax.

Name mangle (prefix or suffix) yourself to manage
Define methods:
get.x
set.x
Getter:
@property
def x(self): ...


Setter:
@x.setter
def x(self, value): ...
DeletersMembers can’t be
changed on the fly
Members can’t be
changed on the fly
Deleter (removing attributes
dynamically by del)
Overloading
(Dispatch function by signature)
OverloadingOverload only by
first argument
@overload (Static type)
@singledispath
@multipledispatch
Initializing class variablesInitializer Lists
Constructor
ConstructorConstructor
ConstructorClassName()
Does not return
(*this is implicit)
obj=ClassName(...)
MUST output the constructed object
__init__(self, ...)
Object to be constructed is 1st argument
Destructor~ClassName()delete()__del__()
Special
methods
Special member functions(no name)
method that control specific behaviors
Magic/Dunder methods
Operator overloadingoperatoroperator methods to defineDunder methods
Resource
Self-cleanup
RIAAonCleanup(): make a dummy object with cleanup operation as destructor to be removed when it goes out of scopewith Context Managers
Naming for the object itselfClass: (class’s own name by SRO ::)
Instance: *this
Class: (class’s own name)
Instance: obj (or any output name defined in constructor)
Class: cls
Instance: self
(Recommended PEP8 names)
Python allows adding members (attributes) on the fly with setattr(), which includes methods. MATLAB’s dynamicprops allows adding properties (data members) on the fly with addprop

onCleanup() does not work reliably on Python because MATLAB’s object destructor time is deterministic (MATLAB specifically do not garbage collect user objects to avoid this mess. It only garbage collects PODs) while Python leaves it up to garbage collector.

*this is implicitly passed in C++ and not spelled out in the method declaration. The self object must be the first argument in the instance method’s signature/prototype for both MATLAB and Python.

Functional Programming Constructs

CommonC++MATLABPython
Function as
variable
Functors
(Function Objects)
operator()
Function HandleCallables
(Function Objects)
__call__()
Lambda
Syntax
Lambda
[capture](inputs) {expr} -> optional trailing return type
Anonymous Function
@(inputs) expr
Lambda
lambda inputs: expr
Closure
(Early binding): an
instance of function objects
Capture [] only as necessary.

Early binding [=] is capture all.
Early binding ONLY for anonymous functions (lambda).

Late binding for function handles to loose or nested functions.
Late binding* by default, even for Lambdas.

Can capture Po through default values
lambda x,P=Po: x+P
(We’re relying users to not enter the captured/optional input argument)
Concepts of Early/Late Binding also apply to non-lambda functions. It’s about when to access (usually read) the ‘global’ or broader scope (such as during nested functions) variables that gets recruited as a non-input variable that’s local to the function itself.

An instance of a function object is not a closure if there’s any parameter that’s late bound. All lambdas (anonymous functions) in MATLAB are early bound (at creation).

The more proper way (without creating an extra optional argument that’s not supposed to be used, aka defaults overridden) to convert late binding to early binding (by capturing variables) is called partial application, where you freeze the parameters (to be captured) by making them inputs to an outer layer function and return a function object (could be lambda) that uses these parameters.

The same trick (partial application) applies to bind (capture) variables in simple/nested function handles in MATLAB which do behave the same way (early binding) like anonymous functions (lambda).

Currying is partial application one parameter at a time, which is tedious way to stay faithful to pure functional programming.

List comprehension is a shorthand syntax for transform/map() and copy_if/remove_if/filter() in one shot, but not accumulate/reduce(). MATLAB and C/C++ does not have listcomp, but listcomp is not specific to Python. Even Powershell has it.

Listcomp syntax, if wrapped in round brackets like (x**x for x in range(5)), gives a generator. Wrapping in square bracket is the shortcut of casting the generator into a list, so [x**x for x in range(5)] is the same as list(x**x for x in range(5)).

Coroutines / Asynchronous Programming

MATLAB natively does not support coroutines.

CommonC++20Python
GeneratorsInput IteratorsFunctions that yield value_to_spit_out_on_next
(Implicitly return a generator/functor with iter and next)
CoroutinesFunctions that value_accepted_from_outside = yield
Send value to the continuation by g.send(user_input)

async/await (native coroutines)

Matrix Arrays

The way Numpy requires users to specify matrices with a bracket for every row drives me nuts. Not only there’s a lot of typing, the superfulous brackets reinforce C’s idea of row-major which is horrendous to people with a proper math background who see matrices as column-major \mathbf{A}_{r,c}. Pytorch is the same.

Once you are trained in APL/MATLAB’s matrix world-view, you’ll discover going back to the world where matrices aren’t first class citizens is clumsy AF.

With Python, you lose the clutter free readability where your MATLAB code is one step away from the matrix equations in your scientific computing work, despite a lot of the features that addresses frequent use patterns are implemented earlier in Python than MATLAB.

Don’t believe those who haven’t lived and breathed MATLAB tell you Python is strictly superior. No it isn’t. They just didn’t know what they were missing as they haven’t made the intellectual leap in MATLAB yet. Python is very convenient as a swiss-army knife but scientific computing is an afterthought in Python’s language design.

The only way to use MATLAB-like semi-colon to change rows only works for np.matrix() type, which they plan to deprecate. For now one can cast matrix into array like np.array(np.matrix(matrix_string)).

Even numpy’s ndarray (or matrix to be deprecated) are CONCEPTUALLY equivalent to a matrix of cells in MATLAB. There isn’t native numerical matrices like in MATLAB that doesn’t have the overhead of unpacking arbitrary data types. You don’t want to do numerical matrices in MATLAB with cell matrices as it’s insanely slow.

You get away without the unpacking penalty in Numpy if all the contents of the ndarray happens to have the same dtype (such as numerical), aka known to be uniform. In other words, MATLAB’s matrices are uniform if it’s formed by [] and heterogeneous if formed by {}, while for Python [] is context-dependent, kept track of by dtype.

ConceptMATLABNumpy
Construction[8,9;6,4]np.array([[8,9],[6,4]])
Size by dimensionsize()A.shape
Concatenate
within existing dimensions
[A;B] or vertcat()
[A,B] or horzcat()
cat(dim, A, B, ...)
np.vstack()
np.hstack()
np.concatenate(list, dim)
Concatenate expanding
to 3D (expand in last dimension)
cat(3, A, B, ...)np.dstack()
‘d’ for depth (3rd dimension)
Concatenate
expanding dimensions
cat(newdim, A, B, ...)
then permute()
np.stack([A, ..], expand_at_axis)
np.array([A, ..]) expands at first
dimension as outermost bracket
refers to first dimension
Tilingrepmat()np.tile()
Fill with same valuerepmat()np.full()
Fill with ones/zerosones(), zeros()np.ones(), np.zeros()
Fill minicking another
array’s size
repmat(x, size(B))
ones(x, size(B))

zeros(x, size(B))
np.full_like(B, x)
np.ones_like(B)
np.zeros_like(B)
PreallocateAny of the above
(Must be initialized)
np.empty()
np.empty_like()
UNINITIALIZED
repelem() is just repmat() with the repetition by axes vector expanded out as variable input arguments one per dimension. Using ones vector to broadcast a singleton instead of repmat() is horrendously inefficient and non-intuitive.

Heterogeneous Data Structures

Heterogeneous Data Structures are typically column major as it is a concept that derives from Structs of Arrays (SoA) and people typically expect columns to have the same data type from spreadsheets.

While Pandas offers a lot of useful features that I’ve easily implemented with wrappers in MATLAB, the indexing syntax of Pandas/Python is awkward and confusing. It’s due to the nature that matrix is a first-class citizen in MATLAB while it’s an afterthought in Python.

Python does not have the { } cell pack/unpack operator in MATLAB, so in Pandas, you select the Series object (think of it as a supercharged list with conveniences such as handling missing values and keeping track of row/column labels) then call its .values attribute.

However, Pandas is a lot more advanced than MATLAB in terms of using multiple columns as keys and have more tools to exploit multi-key row names (row names not mandatory in MATLAB but mandatory in Pandas). In the old days I had to write my own MATLAB function with unique(.., 'rows') exploit its index output to build unique keys under the hood.

ConceptMATLABPython (Pandas
Dataframe)
RowsObservations (dataset())
Row (table())
Rows
index
ColumnsVariablesColumns
Select rows/columnsT(rows, cols)T.loc[r, col_name]
T.iloc[r,c]

Caveats:

– single index
(not wrapped in list)
have content extracted

iloc on LHS cannot
expand table but loc can, but it can only inject 1 row
Extract one columnT{:, c}T[c].values
Extract one entryT{r, c}T.at[r,col_name]
T.iat[r,c]

Faster than loc/iloc
Show first few rowsT(1:5, :)T.head()
Ordinalcategorical()
ordinal()
Categorical()
Index()
Getting column names/labelsT.Properties.VariableNames
(returns cellstr() only)
T.columns
(returns Index() or RangeIndex())
Getting row
names/labels
T.Properties.RowNamesT.index
Move columns
by name
movevars() since R2023a
Rename columnsrenamevars() since R2020aT.rename(columns={source:target})
Rename rowsT.Properties.RowNamesT.rename(index={source:target})
Reorder or partial selectionT[rows, cols]T.reindex(columns=..., index=...)
New labels will autofill by NaN
Select columnsT[:, cols]T[list_of_cols]
Blindly concatenate columns of 2 tables[T1 T2]

If you defined optional rownames, they must match. You can delete it with T.Properties.RowNames = {}
Pandas assign row indices (labels) by default.

Mismatched row labels do not combine in the same row. Consider reset_index() or overwrite the row indices of one table with another, like
pd.concat([T1, T2.set_index(T1.index)]
Format exportwritetable().to_*()
MATLAB tables does not support ranging through column names (such as 'apple':'grapes') yet Pandas DataFrame support it. I don’t think it’s fine to use it in the interpreter to poke around, but this is just asking for confusing logic bugs when the columns are moved around and the programmer has a false sense of security knowing exactly what’s where because they are using only names.

Dataframe is a little smarter than MATLAB’s table() in terms of managing column names and indices as it’s tracked with Index() type which is the same idea as MATLAB’s ordinal() ordered categorical type, where uniques names are mapped to unique indices and it’s the indices under the hood. This is how 'apple':'grapes' can work in Python but not MATLAB.

MATLAB T.Properties.VariableNames is a little clumsy. I usually implement a consistent interface called varnames() that’d output the same cellstr() headings whether it’s struct, dataset or table objects.

MATLAB’s table() by default do not make up row names. Pandas make up row names by default sequentially.

MATLAB table() do requires qualified string characters as variable names. Dataframe doesn’t care what labels you use as long as Index() takes it. It can get confusing because you can have a number 1 and ‘1’ as column headers at the same time and they look the same when displayed in the console.

Loading

Dart Language: late binding in lambda (no capture syntax)

Dart’s documentation at the time of writing (2021-10-27) is not as detailed as I hoped for so I don’t know if the Lambda (anonymous function) is late-binding or early-binding.

For those who don’t know what late/early-binding is:

Bindingearlylate
Expressionclosed (closure)open (not a closure)
Self-containedyesno
Free symbols
(relevant variables
in outer workspace)
bound/captured at creation
(no free symbols)
left untouched until evaluated/executed
(has free symbols)
Snapshotduring creationduring execution

Late-binding is almost a always a gotcha as it’s not natural. Makes great Python interview problems since the combination of ‘everything is a reference‘ and ‘reference to unnamed objects‘ spells trouble with lazy evaluation.

I’d say unlike MATLAB, which has a more mature thinking that is not willing to trade a slow-but-right program for a fast-but-wrong program, Python tries to squeeze all the new cool conceptual gadgets without worrying about how to avoid certain toxic combinations.

I wrote a small program to find out Dart’s lambda is late-bound just like Python:

void main()
{
  int y=3;
    var f = (int x) => y+x;	
    // can also write it as 
    // f(int x) => y+x;
  y=1000;	

  print( f(10) );
}
The result showed 1010, which means Dart use open lambdas (late-binding)

LanguageBinding rules for lambda (anonymous functions)
DartLate binding
Need partial application (discussed below) to enforce early binding.
PythonLate binding.
Can capture (bind early) by endowing running (free) variables (symbols) with defaults based on workspace variables
C++11Early binding (no access to caller workspace variables not captured within the lambda)
MATLABEarly binding (universal snapshot, like [=] capture with C++11 lambdas)

I tried to see if there’s a ‘capture’ syntax in Dart like in C+11 but I couldn’t find any.

I also tried to see if I can endow a free symbol with a default (using an outer workspace variable) in Dart, but the answer is no because unlike Python, default value expressions are required to be constexpr (literals or variables that cannot change during runtime) in Dart.

So far the only way to do early binding is with Partial Application (currying is different: it’s doing partial application of multiple variables by breaking it into a cascade of function compositions when you partially substitute one free variable/symbol at a time):

  1. Add an extra outer function layer (I call it the capture/binder wrapper) putting all free variables you want to bind as input arguments (which shadows the variables names outside the lambda expression if you chose not to rename the parameter variables to avoid conceptual confusion),
  2. then immediately EVALUATE the wrapper with the with the outer workspace variables (free variables) to capture them into the functor (closed lambda, or simply closure) which binder wrapper spits out.

Note I used a different style of lambda syntax in the partial application code example below

void main()
{
  var y=3;
    f(int x) => y+x;
	
    // Partial Application
    g(int w) => (int x) => w+x;	
    // Meat: EVALUATING g() AT y saves the content of y in h()
    var h = g(y);
	
  y=1000;	

  // y is free in f 
  print( f(10) );	

  // y is not involved in g until now (y=1000)	
  print( g(y)(10) );		

  // y is bounded for h when it was still y=3
  print( h(10) );
}

Only h() captures y when it’s still 3. The snapshot won’t happen if you cascade it with other lambda (since the y remains free as they show up in the lambda expression).

You MUST evaluate it at y at the instance you want y captured. In other words, you can defer capturing y until later points in your code, though most likely you’d want to do it right after the lambda expression was declared.


As a side note, I could have used ‘y’ instead of ‘w’ as parameter in the partial application statement

g(int y) => (int x) => y + x

but the ‘y‘ inside g() has shadowed (no longer refers to) the ‘y‘ in the outer workspace!

What makes it confusing here is that quite a few authors think it’s helpful (mnemonics-wise) to use the free variable (outer scope) name you are going to inject into the dummy (argument) as the name of the dummy! This gives the false impression of the variable being free while it’s bounded (through feeding it as a parameter in the wrapper/outer function)!

Scoping rules is a nasty source of confusion in understanding lambda calculus so I decide to give it a different name ‘w‘. I’m generally dismissive of shadowing under any circumstances, to the extent that I find Python’s @ syntactic sugar for decorators shadowing the underlying function which could have been reused somewhere (because it’s implicitly doing f=g(f) instead doing h=g(f). I hate it when you see the function with name f then f was repurposed to not mean the definition of f you saw right below but you have to keep in mind that it’s actually a g(f)). See my rants here.

Notational ambiguity, even if resolvable, is NOT helpful at all here especially when there are so many level of abstractions squeezed into so few symbols! People jumping into learning the subject quickly for the first time should not have unnecessarily keep track of the obscure scoping rules in their head to resolve the ambiguities!

Loading

Reading Lambda Calculus

Complex expressions in lambda calculus notation here is hard to parse in the head correctly until the day I started using Dart Language and noticed the connection between => lambda notation and C-style function layout. Here’s what I learned:

  • The expressions are right-associative because the right-most expression is the deepest nested function
  • Function application (feeding values to the arguments) always starts from left-to-right because you can only gain entry to a nest of functions starting from the outermost level, peeling it layer by later like an onion. So if applying arg to statement is denoted by (statement arg), the order of input would be (((statement arg1) arg2) arg3)
  • Names showing up on the parameter list of the nearest layer sticks to it (local to the function) and excludes everybody else (variable shadowing): this means the same name have absolutely no bearing as a free variable or being a free variable anywhere, so you can (and should) replace it with a unique name.

Here’s the calculus syntax in the book:

This example is the function application used above


Here’s where reusing the variable x in multiple places to mean different things gets confusing as hell, and the author (as well as the nomenclature) do not emphasize the equivalence to variable shadowing rules:

This can be rewritten as having

  • \lambda w . w^2
  • \lambda z . (z+1)
  • \lambda f . \lambda g . \lambda x.(f (g x) )

Upon application,

((\lambda f . \lambda g . \lambda x.(f \quad (g  \quad x) ) \qquad \lambda w . w^2) \qquad \lambda z . (z+1))

\implies (\lambda g . \lambda x.(\lambda w . w^2 \qquad (g \quad x) ) \qquad \lambda z . (z+1))

\implies \lambda x.(\lambda w . w^2 \qquad (\lambda z . (z+1) \quad x) )

which clearly reads for an input of x, it’ll be incremented first then squared. Resolving further:

\implies \lambda x.(\lambda w . w^2 \quad (x+1) )

\implies \lambda x.(x+1)^2

Fucking evils of variable shadowing! Not only it makes the code hard to read and reason, it also make the mathematics hard to follow!

Loading

Using Dart’s C-style syntax to make chain of lambda functions concrete

Start with an example lambda calculus expression f\equiv\lambda x.(y+x). It is:

  • [programming view] a function with x as an input argument and it uses y from the outer workspace (called a free-variable). f(var x) { return y+x; }
  • [mathematical view] can also be written as f(x) = y+x where y is seen as a fixed/snapshotted value relative to the expression.
g(int y) => (int x) => y + x

Lambda calculus is right-associative

g(int y) => ( (int x) => y + x )

Unrolling in C-style it will give better insights to the relationships between layers

g(int y) {
  return (int x) { return y + x; };
}

Note that (int x) { return y + x; } is a functor. To emphasize this, the same code can be rewritten by assigning the name f to it and returning f:

g(int y) {
  f(int x) { return y + x; };
  return f;
}

Use the C++11 style syntax so that it doesn’t look like nested function implementation body instead of a functor nested inside a function:

g(int y) {
  var f = (int x) { return y + x; };
  return f;
}

Note that conceptually what we are doing with the wrapper cascade is indeed nested functions. However, in the wrapper, we spit out a functor (which did most of the partial work) so the user can endow/evaluating it with the last piece of needed info/input:

g(int y) {
  f(int x) { 
   return y + x;
  };
  return f;
}

More commonly as seen in Dart docs, this formatting shows a (binder/capturing) wrapper function returning its own nested function:

g(int y) {
  return (int x) { 
           return y + x;
         };
}

Loading

Styles for (Lambda) anonymous function (named or unnamed) in Dart Language

Official Dart docs is sometimes too simple to provide ultimate answers for various language questions. I discoverd an alternative syntax for named lambda here and here. In Dart,

(args) => expr

is the shorthand for

(args) { 
  return expr 
}

So the Lambda-style (C++11) names variable by assigning it to a messy functor type (inferred with var in Dart and auto in C++11):

var f = (args) => expr

Which can also be rewritten using C-style function pointer-like declaration except

  1. the body expr is included (which is not allowed in C),
  2. return must be explicit in curly braces block.
  3. arrow notation => takes whatever the expr evaluates to (which can be null for statements like print)
  4. it’s simply the function name f in Dart instead of pointer (*f) in C
[optional return type] f(args) { 
  return expr;
}

which can be simplified with the arrow operator

[optional return type] f(args) => { expr }

Loading