Common | C | C++ | MATLAB | Python | |
Variable arguments | <stdarg.h> T f(...) Packed in va_list, read with va_arg | Very BAD! Cannot overload when signatures are uncertain. | varargin varargout Both packed as cells. MATLAB does not have named arguments | *args (simple, stored as tuples) **kwargs (specify input by keyword, stored as a dictionary) | |
Referencing | N/A | operator[] | (_) is for references (subsindex) [_] is for concat {_} is for (un)pack | __getitem__() | |
Default values | N/A | Supported | Not supported. Manage with inputParser() or the newer arguments block | Non-intuitive static data behavior: defaults are evaluated once, at function definition. Stick to None or immutables. | |
Name-Value Argument Matching | | | Old way: ..., 'PropName', Value and parse varargin Since R2021a: Name=Value options in arguments | Name=Value **kwargs | |
Major Dimension | Row | Row | Column | Row (Native/Numpy) Column for Pandas | |
Constness | const | const | Only in classes | N/A (Consenting adults) | |
Variable Aliasing | Pointers | References | NO! Rely on Copy-on-write (No in-place functions*) Handle classes under limited circumstances | References | |
= assignment | Copy one element | Values: Copy References: Bind | New Copy Copy-on-write | NO VALUES Bind references only (could be to unnamed objects) | |
Chained access operators | N/A | Difficult to operator overload it right | Difficult to get it right. MATLAB had some chaining bugs with dataset() as well. | Chains correctly natively | |
Assignment expressions (assignment evaluates to assigned lvalue) | = | = | N/A | Named Expression := | |
Version Management | | | verLessThan() isMATLABReleaseOlderThan() | venv / virtualenv (Virtual Environment) | |
Exponentiation | <math.h> pow() | <cmath> pow() | ^ | ** | |
Stream (Conveyor belt mechanism. Saves memory) | I/O (std, file, sockets) | iterators in STL containers | MATLAB doesn’t do references. Just increment indices. | iterators (uni-directional only) iter(): __iter__() next(): __next__() | |
Looping | for(init; cont_cond; next) | C-style for(auto running: iterable) | for k = array to iterate | list-comp for (index, thing) in enumerate(lists) | |
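The Python cell in the "Default values" row is easier to see in code. Here is a minimal sketch; `append_bad` and `append_good` are made-up names for illustration only.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list (static-like data).
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # None is immutable; a fresh list is created on each call instead.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


bad_first = list(append_bad(1))    # snapshot after the first call
bad_second = list(append_bad(2))   # the item from the first call is still there
good_first = append_good(1)
good_second = append_good(2)       # starts from an empty list again
```

The `None` sentinel is the usual workaround: the default is only a marker, and the real mutable object is built inside the function body.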
Data Types
Common | C | C++ | MATLAB | Python |
Sets | N/A | std::set | Only set operations, not set data type | { , , ...} |
Dictionaries | | std::unordered_map | – Dynamic fieldnames (qualified varnames as keys) – containers.Map() or dictionary() since R2022b | Dictionaries {key:value} (Native) |
Heterogeneous containers | | | cells {} | lists (mutable) tuples (immutable) |
Structured Heterogeneous containers | | | table() dataset() [Old] Mix in classes | Pandas Dataframe |
Array, Matrices & Tensors | | | Native [ , ; , ] | Numpy/PyTorch |
Records | struct | class (members) | dynamic field (structs) properties (class) getfield()/setfield() | No structs (use dicts) attribute (class) getattr()/setattr() |
Type deduction | N/A | auto | Native | Native |
Type extraction | N/A | decltype() for compile time (static) typeid() for RTTI (runtime) | class() | type() |
orderly-set package.
Array Operations
Common | MATLAB | Python | |
Repeat | repmat() | [] * N np.repeat() | |
Logical Indexing | Native | List comprehension Boolean Indexing (Numpy) | |
Equally spaced numbers | start:step:end (internally colon()) linspace()/logspace() | range(begin, past_end, step) produces a lazy range object list(range()) or tuple(range()) iterates to realize the vector | |
Equally spaced indexing | MATLAB has no generators, so produced vector only | [start:past_end:step] is internally slice() which produces a slice object, not range/lists/tuple. Faster but not iterable | |
Shallow copy | Deep copy-on-write | Slice: x = y[:] copy.copy() | |
Deep copy | Deep copy-on-write | copy.deepcopy() |
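The shallow and deep copy rows on the Python side can be sketched as below. The variable names are arbitrary; the point is which copies still see a change made through the original name.

```python
import copy

y = [[1, 2], [3, 4]]

alias = y                    # plain assignment: binds another name, no copy
shallow_slice = y[:]         # new outer list, inner lists are shared
shallow_copy = copy.copy(y)  # same effect as the slice for a list
deep = copy.deepcopy(y)      # inner lists are duplicated too

y[0][0] = 99                 # mutate a nested (shared) object
y.append([5, 6])             # mutate the outer list only

outer_lengths = (len(alias), len(shallow_slice), len(shallow_copy), len(deep))
```

After the two mutations, the alias sees both changes, the shallow copies see only the nested change, and the deep copy sees neither.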
Editor Syntax
Common | C | C++ | MATLAB | Python | ||
Commenting | /* ... */ // (only for newer C) | // (single line) /* ... */ (block) | % (single line) %{ ... %} (block) | # (single line) """ or ''' is docstring which might be undesirably picked up | ||
Reliable multi-line commenting (IDE) | | | Ctrl+(Shift)+R (Windows), / (Mac or Linux) | [Spyder]: Ctrl+ 1 (toggle), 4 (comment), 5 (uncomment) | ||
Code cell (IDE) | | | %% | [Spyder]: # %% | ||
Line Continuation | \ | \ | ... | \ | ||
Console Precision | | | format | %precision (IPython) | ||
Clear variables | | | clear / clearvars | %reset -sf (IPython) | ||
Python is messy about the workspace, so if you just delete
Object Oriented Programming Constructs
Common | C++ | MATLAB | Python | ||
Getters Setters | No native syntax. Name mangle (prefix or suffix) yourself to manage | Define methods: get.x set.x | Getter: @property Setter: @x.setter | ||
Deleters | Members can’t be changed on the fly | Members can’t be changed on the fly | Deleter (removing attributes dynamically by del) | ||
Overloading (Dispatch function by signature) | Overloading | Overload only by first argument | @overload (Static type) | ||
Initializing class variables | Initializer Lists Constructor | Constructor | Constructor | ||
Constructor | ClassName() Does not return ( *this is implicit) | obj=ClassName(...) MUST output the constructed object | __init__(self, ...) Object to be constructed is 1st argument | ||
Destructor | ~ClassName() | delete() | __del__() | ||
Special methods | Special member functions | (no name) method that control specific behaviors | Magic/Dunder methods | ||
Operator overloading | operator | operator methods to define | Dunder methods | ||
Resource Self-cleanup | RAII | onCleanup(): make a dummy object with cleanup operation as destructor to be removed when it goes out of scope | with Context Managers | ||
Naming for the object itself | Class: (class’s own name by scope resolution operator ::) Instance: *this | Class: (class’s own name) Instance: obj (or any output name defined in constructor) | Class: cls Instance: self (Recommended PEP8 names) |
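The "Resource Self-cleanup" row for Python can be sketched with a hand-written context manager. `Resource` and `log` are hypothetical names used only for this illustration; the cleanup runs when the `with` block exits, even on an exception, instead of depending on when the object happens to be destroyed.

```python
log = []


class Resource:
    def __init__(self, name):
        self.name = name

    def __enter__(self):
        # Acquire the resource when the with-block is entered.
        log.append(f"open {self.name}")
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always runs when the with-block is left, error or not.
        log.append(f"close {self.name}")
        return False  # do not swallow the exception


try:
    with Resource("A") as r:
        log.append(f"use {r.name}")
        raise ValueError("boom")
except ValueError:
    log.append("handled")
```

The cleanup entry is logged before the exception reaches the `except` clause, which is the deterministic ordering that RAII and onCleanup() give in C++ and MATLAB.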
setattr(), which includes methods. MATLAB’s dynamicprops allows adding properties (data members) on the fly with addprop.
onCleanup() does not work reliably on Python because MATLAB’s object destructor timing is deterministic (MATLAB specifically does not garbage collect user objects to avoid this mess. It only garbage collects PODs) while Python leaves it up to the garbage collector.
*this is implicitly passed in C++ and not spelled out in the method declaration. The self object must be the first argument in the instance method’s signature/prototype for both MATLAB and Python.
Functional Programming Constructs
Common | C++ | MATLAB | Python | ||
Function as variable | Functors (Function Objects) operator() | Function Handle | Callables (Function Objects) __call__() | ||
Lambda Syntax | Lambda [capture](inputs) -> optional trailing return type {expr} | Anonymous Function @(inputs) expr | Lambda lambda inputs: expr | ||
Closure (Early binding): an instance of function objects | Capture [] only as necessary. Early binding [=] is capture all. | Early binding ONLY for anonymous functions (lambda). Late binding for function handles to loose or nested functions. | Late binding* by default, even for Lambdas. Can capture Po through default values lambda x,P=Po: x+P (We’re relying on users to not enter the captured/optional input argument) |
An instance of a function object is not a closure if there’s any parameter that’s late bound. All lambdas (anonymous functions) in MATLAB are early bound (at creation).
The more proper way (without creating an extra optional argument that’s not supposed to be used, aka defaults overridden) to convert late binding to early binding (by capturing variables) is called partial application, where you freeze the parameters (to be captured) by making them inputs to an outer layer function and return a function object (could be lambda) that uses these parameters.
The same trick (partial application) applies to bind (capture) variables in simple/nested function handles in MATLAB, which do not behave the same way (early binding) as anonymous functions (lambda).
Currying is partial application one parameter at a time, which is a tedious way to stay faithful to pure functional programming.
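The binding behaviors described in these notes can be compared side by side in Python. This is only a sketch: `make_adder` is a made-up helper for the outer-function approach, and `functools.partial` from the standard library is one ready-made way to do partial application.

```python
from functools import partial

P = 10

late = lambda x: x + P                   # P is looked up when called (late binding)
early_default = lambda x, P=P: x + P     # P frozen through a default value


def make_adder(p):
    # Partial application by hand: p is an input to the outer function,
    # and the returned lambda captures that particular p.
    return lambda x: x + p


early_factory = make_adder(P)
early_partial = partial(lambda p, x: x + p, P)   # freezes the first argument

P = 100  # rebind the outer variable after the functions were created

results = (late(1), early_default(1), early_factory(1), early_partial(1))
```

Only the first function follows the rebinding of `P`; the other three keep the value seen at creation. The default-value version still lets a caller override `P` by passing a second argument, which is the weakness mentioned above.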
List comprehension is a shorthand syntax for transform/map() and copy_if/remove_if/filter() in one shot, but not accumulate/reduce(). MATLAB and C/C++ do not have listcomp, but listcomp is not specific to Python. Even PowerShell has it.
Listcomp syntax, if wrapped in round brackets like (x**x for x in range(5)), gives a generator. Wrapping in square brackets is a shortcut for casting the generator into a list, so [x**x for x in range(5)] is the same as list(x**x for x in range(5)).
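A short sketch of the two points above: the round-bracket form is a lazy generator that can be consumed only once, and a listcomp with an `if` clause does the map and the filter in one expression. The variable names are arbitrary.

```python
gen = (x**x for x in range(5))             # generator: nothing computed yet
from_brackets = [x**x for x in range(5)]   # list comprehension
from_cast = list(x**x for x in range(5))   # generator cast to a list

first_pass = list(gen)    # consumes the generator
second_pass = list(gen)   # already exhausted, so nothing is left

# map + filter in one shot versus the spelled-out equivalent
evens_squared = [x * x for x in range(10) if x % 2 == 0]
spelled_out = list(map(lambda x: x * x, filter(lambda x: x % 2 == 0, range(10))))
```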
Coroutines / Asynchronous Programming
MATLAB natively does not support coroutines.
Common | C++20 | Python |
Generators | Input Iterators | Functions that yield value_to_spit_out_on_next (Implicitly return a generator/functor with iter() and next()) |
Coroutines | | Functions that value_accepted_from_outside = yield Send value to the continuation by g.send(user_input) async/await (native coroutines) |
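The generator-based coroutine pattern in the Python column (`value = yield` together with `g.send()`) might look like this minimal sketch. `running_total` is a made-up example function.

```python
def running_total():
    total = 0
    while True:
        # Hand `total` to the caller, pause, then resume with whatever
        # the caller passes in through send().
        received = yield total
        total += received


g = running_total()
primed = next(g)        # run to the first yield; required before send()
after_five = g.send(5)
after_seven = g.send(7)
g.close()               # stop the paused generator
```

The first `next(g)` is needed because a value can only be sent to a generator that is already paused at a `yield`.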
Matrix Arrays
The way Numpy requires users to specify matrices with a bracket for every row drives me nuts. Not only is there a lot of typing, the superfluous brackets reinforce C’s idea of row-major, which is horrendous to people with a proper math background who see matrices as column-major. Pytorch is the same.
Once you are trained in APL/MATLAB’s matrix world-view, you’ll discover going back to the world where matrices aren’t first class citizens is clumsy AF.
With Python, you lose the clutter-free readability where your MATLAB code is one step away from the matrix equations in your scientific computing work, even though a lot of the features that address frequent use patterns were implemented earlier in Python than in MATLAB.
Don’t believe those who haven’t lived and breathed MATLAB when they tell you Python is strictly superior. No it isn’t. They just don’t know what they are missing, as they haven’t made the intellectual leap in MATLAB yet. Python is very convenient as a Swiss-army knife, but scientific computing is an afterthought in Python’s language design.
The MATLAB-like semicolon for changing rows only works for the np.matrix() type, which they plan to deprecate. For now one can cast the matrix into an array like np.array(np.matrix(matrix_string)).
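As a sketch of that cast, assuming NumPy is installed and still ships `np.matrix` (the warning filter is only there because constructing a matrix may emit a `PendingDeprecationWarning`):

```python
import warnings

import numpy as np

with warnings.catch_warnings():
    # np.matrix is slated for deprecation; silence the notice for this demo.
    warnings.simplefilter("ignore", PendingDeprecationWarning)
    from_string = np.array(np.matrix("8 9; 6 4"))   # MATLAB-like row separator

from_brackets = np.array([[8, 9], [6, 4]])          # one bracket per row
```

Both spellings end up as the same plain 2-by-2 ndarray.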
Even numpy’s ndarray (or matrix, to be deprecated) is CONCEPTUALLY equivalent to a matrix of cells in MATLAB. There are no native numerical matrices like in MATLAB that avoid the overhead of unpacking arbitrary data types. You don’t want to do numerical matrices in MATLAB with cell matrices as it’s insanely slow.
You get away without the unpacking penalty in Numpy if all the contents of the ndarray happen to have the same dtype (such as numerical), aka known to be uniform. In other words, MATLAB’s matrices are uniform if formed by [] and heterogeneous if formed by {}, while for Python [] is context-dependent, kept track of by dtype.
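A small sketch of how the same square brackets lead to different storage depending on dtype, assuming NumPy is installed. The comparison to MATLAB cells is the analogy from the text, not an exact equivalence.

```python
import numpy as np

uniform = np.array([8, 9, 6, 4])                    # one numeric dtype for all entries
promoted = np.array([1, 2.5])                       # ints are promoted to float
mixed = np.array([8, "nine", None], dtype=object)   # boxed Python objects, cell-like

kinds = (uniform.dtype.kind, promoted.dtype.kind, mixed.dtype.kind)
```

The `object` dtype is where the per-element unpacking cost comes back.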
Concept | MATLAB | Numpy |
Construction | [8,9;6,4] | np.array([[8,9],[6,4]]) |
Size by dimension | size() | A.shape |
Concatenate within existing dimensions | [A;B] or vertcat() [A,B] or horzcat() cat(dim, A, B, ...) | np.vstack() np.hstack() np.concatenate(list, dim) |
Concatenate expanding to 3D (expand in last dimension) | cat(3, A, B, ...) | np.dstack() ‘d’ for depth (3rd dimension) |
Concatenate expanding dimensions | cat(newdim, A, B, ...) then permute() | np.stack([A, ..], expand_at_axis) np.array([A, ..]) expands at first dimension as outermost bracket refers to first dimension |
Tiling | repmat() | np.tile() |
Fill with same value | repmat() | np.full() |
Fill with ones/zeros | ones(), zeros() | np.ones(), np.zeros() |
Fill mimicking another array’s size | repmat(x, size(B)) zeros(size(B)) | np.full_like(B, x) np.ones_like(B) np.zeros_like(B) |
Preallocate | Any of the above (Must be initialized) | np.empty() np.empty_like() UNINITIALIZED |
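The concatenation and fill rows on the Numpy side can be checked by looking at the resulting shapes. A minimal sketch, assuming NumPy is installed; `A` and `B` are arbitrary 2-by-2 examples.

```python
import numpy as np

A = np.array([[8, 9], [6, 4]])
B = np.zeros_like(A)                     # same shape and dtype as A, all zeros

v_shape = np.vstack([A, B]).shape        # like [A;B]
h_shape = np.hstack([A, B]).shape        # like [A,B]
d_shape = np.dstack([A, B]).shape        # like cat(3, A, B)

stack_first = np.stack([A, B], axis=0)   # new leading axis
stack_last = np.stack([A, B], axis=2)    # new trailing axis
sevens = np.full_like(A, 7)              # like repmat(7, size(A))
```

`np.array([A, B])` gives the same result as stacking at axis 0, because the outermost bracket is the first dimension, while stacking at the last axis matches `np.dstack()`.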
For a scalar, repelem() is just repmat() with the repetition-by-axes vector expanded out as variable input arguments, one per dimension. Using a ones vector to broadcast a singleton instead of repmat() is horrendously inefficient and non-intuitive.
Heterogeneous Data Structures
Heterogeneous Data Structures are typically column major as it is a concept that derives from Structs of Arrays (SoA) and people typically expect columns to have the same data type from spreadsheets.
While Pandas offers a lot of useful features that I’ve easily implemented with wrappers in MATLAB, the indexing syntax of Pandas/Python is awkward and confusing. That’s because the matrix is a first-class citizen in MATLAB while it’s an afterthought in Python.
Python does not have the { } cell pack/unpack operator in MATLAB, so in Pandas, you select the Series object (think of it as a supercharged list with conveniences such as handling missing values and keeping track of row/column labels) then call its .values attribute.
However, Pandas is a lot more advanced than MATLAB in terms of using multiple columns as keys and has more tools to exploit multi-key row names (row names are not mandatory in MATLAB but mandatory in Pandas). In the old days I had to write my own MATLAB function with unique(.., 'rows') and exploit its index output to build unique keys under the hood.
Concept | MATLAB | Python (Pandas Dataframe) |
Rows | Observations (dataset()) Row (table()) | Rows index |
Columns | Variables | Columns |
Select rows/columns | T(rows, cols) | T.loc[r, col_name] T.iloc[r,c] Caveats: – single index (not wrapped in list) have content extracted – iloc on LHS cannot expand table but loc can, but it can only inject 1 row |
Extract one column | T{:, c} | T[c].values |
Extract one entry | T{r, c} | T.at[r,col_name] T.iat[r,c] Faster than loc/iloc |
Show first few rows | T(1:5, :) | T.head() |
Ordinal | categorical() ordinal() | Categorical() Index() |
Getting column names/labels | T.Properties.VariableNames (returns cellstr() only) | T.columns (returns Index() or RangeIndex() ) |
Getting row names/labels | T.Properties.RowNames | T.index |
Move columns by name | movevars() since R2018a | |
Rename columns | renamevars() since R2020a | T.rename(columns={source:target}) |
Rename rows | T.Properties.RowNames | T.rename(index={source:target}) |
Reorder or partial selection | T(rows, cols) | T.reindex(columns=..., index=...) New labels will autofill by NaN |
Select columns | T(:, cols) | T[list_of_cols] |
Blindly concatenate columns of 2 tables | [T1 T2] If you defined optional rownames, they must match. You can delete it with T.Properties.RowNames = {} | Pandas assigns row indices (labels) by default. Mismatched row labels do not combine in the same row. Consider reset_index() or overwrite the row indices of one table with another, like pd.concat([T1, T2.set_index(T1.index)], axis=1) |
Format export | writetable() | .to_*() |
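A minimal sketch of a few rows from the Pandas column, assuming pandas is installed. `T1` and `T2` are made-up two-row tables, and `axis=1` is what makes `pd.concat()` join side by side rather than stack rows.

```python
import pandas as pd

T1 = pd.DataFrame({"a": [1, 2]}, index=["r1", "r2"])
T2 = pd.DataFrame({"b": [3, 4]})          # row labels default to 0, 1

column_values = T1["a"].values            # roughly T{:, 'a'} in MATLAB
by_label = T1.at["r1", "a"]               # one entry, by labels
by_position = T1.iat[0, 0]                # one entry, by integer positions

# Row labels differ, so nothing lines up: 4 rows padded with NaN.
mismatched = pd.concat([T1, T2], axis=1)

# Overwrite T2's row labels with T1's so the rows combine.
aligned = pd.concat([T1, T2.set_index(T1.index)], axis=1)
```

The mismatched result has one row per distinct label from either table, which is easy to miss when the tables are large.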
MATLAB does not support selecting a range of columns by name (such as 'apple':'grapes'), yet Pandas DataFrame supports it. I think it’s fine to use it in the interpreter to poke around, but this is just asking for confusing logic bugs when the columns are moved around and the programmer has a false sense of security knowing exactly what’s where because they are using only names.
Dataframe is a little smarter than MATLAB’s table() in terms of managing column names and indices, as they are tracked with the Index() type, which is the same idea as MATLAB’s ordinal() ordered categorical type: unique names are mapped to unique indices, and it’s the indices under the hood. This is how 'apple':'grapes' can work in Python but not MATLAB.
MATLAB’s T.Properties.VariableNames is a little clumsy. I usually implement a consistent interface called varnames() that’d output the same cellstr() headings whether it’s struct, dataset or table objects.
MATLAB’s table() by default does not make up row names. Pandas makes up row names sequentially by default.
MATLAB’s table() does require qualified strings as variable names. Dataframe doesn’t care what labels you use as long as Index() takes it. It can get confusing because you can have a number 1 and '1' as column headers at the same time and they look the same when displayed in the console.