GUI Paradigms (1): Redux (Flutter/React) translated to MATLAB

For GUI development, we often start with controls (or widgets) that user interact with and it’ll emit/run the callback function we registered for the event when the event happens.

Most of the time we just want to read the state/properties of certain controls, process them, and update other controls to show the result. Model-View-Controller (MVC) puts strict boundaries between interaction, data processing and display.

The most common schematic for MVC is a circle showing the cycle of Controller->Model->View, but in practice, it’s the controller that’s the brains. The view can simultaneously accept user interactions, such as a editable text box or a list. The model usually don’t directly update the view directly on its own like the idealized diagram.

MVC with User Action
From https://www.educative.io/blog/mvc-tutorial

With MVC, basically we are concentrating the control’s callbacks to the controller object instead of just letting each control’s callback interact with the data store (model) and view in an unstructured way.


When learning Flutter, I was exposed to the Redux pattern (which came from React). Because the tutorials was designed around the language features of Dart, the documentation kind of obscured the essence of the idea (why do we want to do this) as it dwelt on the framework (structure can be refactored into a package). The docs talked a lot about boundaries but wasn’t clearly why they have to be meticulously observed, which I’ll get to later.

The core inspiration in Redux/BLoC is taking advantage of the concept of ‘listening to a data object for changes’ (instead of UI controls/widget events)!

Instead of having the UI control’s callback directly change other UI control’s state (e.g. for display), we design a state vector/dictionary/struct/class that holds contents (state variables) that we care. It doesn’t have to map 1-1 to input events or 1-1 to output display controls.

When an user interaction (input) event emitted a callback, the control’s callback do whatever it needs to produce the value(s) for the relevant state variable(s) and change the state vector. The changed state vector will trigger the listener that scans for what needs to be updated to reflect the new state and change the states of the appropriate view UI controls.

This way the input UI controls’ callbacks do not have to micromanage what output UI controls to update, so it can focus on the business logic that generates the content pool that will be picked up by the view UI controls to display the results. In Redux, you are free to design your state variables to match more closely to the input data from UI controls or output/view controls’ state. I personally prefer a state vector design that is closer to the output view than input controls.


The intuition above is not the complete/exact Redux, especially with Dart/Flutter/React. We also have to to keep the state in ONE place and make the order of state changes (thus behavior) predictable!

  • Actions and reducers are separate. Every input control fires a event (action signal) and we’ll wait until the reducers (registered to the actions) to pick it up during dispatch() instead of jumping on it. This way there’s only ONE place that can change states. Leave all the side effects in the control callback where you generate the action. No side effects (like changing other controls) allowed in reducers!
  • Reducers do not update the state in place (it’s read only). Always generate a new state vector to replace the old one (for performance, we’ll replace the state vector if we verified the contents actually changed). This will make timing predictable (you are stepping through state changes one by one)

In Javascript, there isn’t really a listener actively listening state variable changes. Dispatch (which will be called every time the user interacts using control) just runs through all the listeners registered at the very end after it has dispatched all the reducers. In MATLAB, you can optionally set the state vector to be Observable and attach the change listener callback instead of explicitly calling it within dispatch.

https://gist.github.com/gaearon/ffd88b0e4f00b22c3159

Here is an example of a MATLAB class that captures the spirit of Redux. I added a 2 second delay to emulate long operations and used enableDisableFig() to avoid dealing with queuing user interactions while it’s going through a long operation.

classdef ReduxStoreDemo < handle

    % Should be made private later
    properties (SetAccess = private, SetObservable)
        state % {count}
    end
        
    methods (Static)
        % Made static so reducer cannot call dispatch and indirectly do
        % side effect or create loops
        function state = reducer(state, action)        
            % Can use str2fun(action) here or use a function map
            switch action
                case 'increment'
                    fprintf('Wait 2 secs before incrementing\n');
                    pause(2)
                    state.count = state.count + 1;
                    fprintf('Incremented\n');
            end                
        end        
    end
    
    % We keep all the side-effect generating operations (such as
    % temporarily changing states in the GUI) in dispatch() so
    % there's only ONE PLACE where state can change
    methods
        function dispatch(obj, action, src, evt)
            % Disable all figures during an interaction
            figures = findobj(groot, 'type', 'figure');
            old_fig_states = arrayfun(@(f) enableDisableFig(f, 'off'), figures);      
            src.String = 'Wait ...';
            
                new_state = ReduxStoreDemo.reducer(obj.state, action);
                
                % Don't waste cycles updating nops 
                if( ~isequal(new_state, obj.state) )
                    % MATLAB already have listeners attached.
                    % So no need to scan listeners like React Redux            
                    obj.state = new_state;
                end                
                
            % Re-enable figure obj.controls after it's done
            arrayfun(@(f, os) enableDisableFig(f, os), figures, old_fig_states);                        
            src.String = 'Increment';
        end
    end
    
    methods
        function obj = ReduxStoreDemo()
            figure();            
            obj.state.count = 0;                        
            
            h_1x  = uicontrol('style', 'text', 'String', '1x Box', ...
                              'Units', 'Normalized', ...
                              'Position', [0.1 0.3, 0.2, 0.1], ...
                              'HorizontalAlignment', 'left');                          
            addlistener(obj, 'state', ...
                        'PostSet', @(varargin) obj.update_count_1x( h_1x , varargin{:})); 
                        
            uicontrol('style', 'pushbutton', 'String', 'Increment', ...
                      'Units', 'Normalized', ...
                      'Position', [0.1 0.1, 0.15, 0.1], ...
                      'Callback', @(varargin) obj.dispatch('increment', varargin{:}));                             
            
            % Force trigger the listeners to reflect the initial state
            obj.state = obj.state;
        end
    end
    
    %% These are 'renders' registered when the uiobj.controls are created
    % Should stick to reading off the state. Do not call dispatch here
    % (just leave it for the next action to pick up the consequentials)
    methods
        % The (src, event) is useless for listeners because it's not the 
        % uicontrol handle but the state property's metainfo (access modifiers, etc)
        function update_count_1x(obj, hObj, varargin)            
            hObj.String = num2str(obj.state.count);
        end        
    end
    
end

 4 total views

MATLAB repeating arrays (elementwise array replication, interleaved ‘repmat’)

Since MATLAB R2015b, there’s a new feature called repelem(V, dim1, dim2, ...) which repeats each element by dimX times over dimension X. If N (dim1) is scalar, each V is uniformly repeated by N times. If N is a vector, it has to be the same length as V and each element of N says how many times the corresponding element in V is repeated.

Here are some historical ways of doing it (as mentioned in MATLAB array manipulation tips)

The scalar case (repeat uniformly) can be emulated by a Kronecker product multiplying everything with 1 (self):

kron(V, ones(N,1))
Just replace all the elements b with 1 so we are left with elements of A repeating the way we wanted

Kron method is conceptually smart but it has unnecessary arithmetic (multiply by 1). Nonetheless this method is reasonable fast until TMW finally developed a built-in function for it that outperforms all the tricks people have accumulated over decades.

The vector case (each element is repeated a different number of times according to vector N) is basically decoding Run-Length Encoding (RLE), aka counts to placements, which you can download maturely written programs on MATLAB File Exchange (FEX). There are a bunch of cumsum/diff/accumarray/reshape tricks but at the end of the day, they are RLE decoding in vectorized forms.


There’s a name for almost each recurring problem that we can think of in MALTAB. Before jumping in and implementing your for loop, ask around and try to find the right keyword/terms to describe your problem! >99.9% of the time your problem is not new!

The most odd-ball MATLAB algorithm scenario I’ve ever came across that requires original thought is the ‘Jenga Matrix‘ (I coined the name) while I was working at Stanford University Medical School as a research assistant for MADIT-CRT.

MATLAB’s OOP was not mature at that time, so dataset() objects didn’t surface. The reason for the ‘Jenga Matrix’ was to create ‘sparse cells’ which uses a sparse matrix with non-zero indices mapping to a cell vector so I can make a table (that’s approximately the guts of heterogenous data structure).

As I remove elements of the ‘sparse cell matrix’, I didn’t want holes in it to accumulate so I’ll have to periodically compact the underlying cell vector and shift the indices to reflect the indices after compacting. Normally if you have to mess with these kind of ingenious indexing algorithms, you are working on some generic abstractions/tools rather than the business logic itself.

There’s no ultimate correct way to implement something in MATLAB, but there are tons of bad ways that is strictly worse under all circumstances! Being smart with these little toy (Cody) problems like array manipulation do not really show practical proficiency in MATLAB. Anybody can spend a day or two to solve a genuinely new algorithm puzzle or just ask around in the forums if you run into it once in a blue moon. Who cares if you can do it 5 times faster if it’s just <1% of the development time?

Most of your time should be spent on using MATLAB to succinctly and intuitively describe your business logic (which requires exploring and understanding your project requirements deeply), and hide the boring background work with generic abstractions (e.g. RDBMS and RLE)! People should be able to read your function and variable names and form a clear picture of what your codebase is trying to achieve instead of stumbling over smart-ass idioms that’s not immediately obvious (which should buried in the lowest level of generic tool functions if you had to develop it in-house).

Even a mathematician in Linear Algebra using MATLAB for 40 years doesn’t mean he’s good at MATLAB! The real MATLAB skills are keeping up with MATLAB has to offer for a variety of scenarios relevant to the task at hand (or know enough abstract concepts like functional programming, OOP, database, etc, to be able to find out the right tools quickly), which is a hell lot of knowledge considering MATLAB covered most common scenario imaginable (the vast majority of MATLAB users aren’t aware of the full offerings and used MATLAB the wrong/hard way)!

 13 total views

Data Relationships of Spreadsheets: Relational Database vs. Heterogenous Data Tables

This blog post is development in process. Will fill in the details missing details (especially pandas) later. Some of the MATLAB syntax are inaccurate in the sense that it’s just a description that is context dependent (such as column names can be cellstr, char string or linear/logical indices).

From data relationship point of view, relation database (RDMBS), heterogenous data tables (MATLAB’s dataset/table or Python Panda’s Dataframe) are the same thing. But a proper database have to worry about concurrency issues and provide more consistency tools (ACID model).

Heterogenous data tables are almost always column-oriented database (mainly for analyzing data) where MySQL and Postgres are row-store database. You can think of column-store database as Struct of Arrays (SoA) and row-store database as Array of Struct (AoS). Remember locality = performance: in general, you want to put the stuff you frequently want to access together as close to each other as possible.

Mechanics:

ConceptsSQLMATLAB tablePandas Dataframe
tablesFROM(work with T)(work with df)
columns
variables
fields
SELECT T.(field)
T(:, cols/varnames)
rows
records
WHERE

HAVING
T( cond(T), : )

T_grp( cond(T_grp), : )
conditionsNOT
IS
IN
BETWEEN
~
==, isequal*()
ismember()
a<=b & b<=c
Inject table to
another table
INSERT INTO t2
SELECT vars FROM t1
WHERE rows
T2(end+(1:#rows), vars) = T1(rows, vars)
(Doable, throws warning)

Insert record/rowINSERT INTO t (c1, c2, ..)
VALUES (v1, v2, ..)
T=[T; {v1, v2, ...}]
(Cannot default for unspecified column*)
update records/elementsUPDATE table
SET column = content
WHERE row_cond
T.(col)(row_cond) = content
New table
from selection
SELECT vars
INTO t2
FROM t1
WHERE rows
T2 = T1(rows, vars)
clear tableTRUNCATE TABLE tT( :, : )=[]
delete rowsDELETE FROM t WHERE cond
(if WHERE is not specified, it kills all rows one by one with consistency checks. Avoid it and use TRUNCATE TABLE instead)
T( cond, : ) = []
* I developed sophisticated tools to allow partial row insertion, but it’s not something TMW supports right out of the box. This involves overloading the default value generator for each data type then extract the skeleton T( [], : ) to identify the data types.

Core database concepts:

ConceptsSQLMATLAB (table/dataset)Pandas (Dataframe)
linear indexCREATE INDEX idx ON T (col)T.idx = (1:size(T,1))'
group indexCREATE UNIQUE INDEX idx ON T (cols)[~, T.idx] = sortrows(T, cols)
(old implementation is grp2idx())
set operationsUNION
INTERSET
union()
intersect()
setdiff(), setxor()
sortORDER BYsortrows()
uniqueSELECT DISTINCTunique()
reduction
aggregration
F()@reductionFunctions
groupingGROUP BYSpecifying ‘GroupingVariables’ in varfun(), rowfun(), etc.
partitioning(set partition option in Table Definition)T1=T(:, {'key', varnames_1})
T2=T(:, {'key', varnames_2})
joins[type] JOIN*join(T1, T2, ...)df.join(df2, …)
cartesian productCROSS JOIN
(misnomer, no keys)
T_cross = [repelem(T1, size(T2,1), 1), repmat(T2, [size(T1,1), 1])]
Function programming concepts map (linear index), filter (logical index), reduce (summary & group) are heavily used with databases

Formal databases has a Table Definition (Column Properties) that must be specified ahead of time and can be updated in-place later on (think of it as static typing). Heterogenous Data Tables can figure most of that out on the fly depending on context (think of it as dynamic typing). This impacts:

  • data type (creation and conversion)
  • unspecified entries (NULL).
    Often NaN in MATLAB native types but I extended it by overloading relevant data types with a isnull() function and consistently use the same interface
  • default values
  • keys (Indices)

SQL features not offered by heterogenous data tables yet:

  • column name aliases (AS)
  • wildcard over names (*)
  • pattern matching (LIKE)

SQL features that are unnatural with heterogeneous data tables’ syntax:

  • implicitly filter a table with conditions in another table sharing the same key.
    It’s an implied join(T, T_cond)+filter operation in MATLAB. Often used with ANY, ALL, EXISTS

Fundamentally heterogenous data types expects working with snapshots that doesn’t update often. Therefore they do not offer active checking (callbacks) as in SQL:

  • Invariant constraints (CHECK, UNIQUE, NOT NULL, Foreign key).
  • Auto Increment
  • Virtual (dependent) tables (CREATE VIEW)

Know these database/spreadsheet concepts:

  • Tall vs wide tables

Language logistics (not related to database)

ConceptsSQLMATLAB (table/dataset)Pandas (Dataframe)
Partial displayMySQL: LIMIT
Oracle: FETCH FIRST
T( 1:10, : )df.head()
Comments-- or /* */% or %{ %}# or """"""
functionCREATE PROCEDURE fcnfunction [varargout{:}]=fcn(varargin{:})def fcn:
caseCASE WHEN THEN ELSE ENDswitch case end(no case structure, use dictionary)
Null if no resultsIFNULL ( statement )function X=null_if_empty(T, cond)
X=T( cond, : );
if( isempty(X) ) X=NaN;
Replace nullsISNULL(col, target_val)T.col(isnan(T.col)) = target_val
T = standardizeMissing( T, ... )

 35 total views,  1 views today

MATLAB assign index for unique rows

MATLAB’s dataset/table objects’ internals often involves identifying unique contents and assigning a unique (grouping) index to it so the indices can be mapped or joined without actually going through the contents of each row.

In the old days when I were using dataset(), the first generation of table() objects before the rewrite, there is a tool called grp2idx() which assigns the same number to identical items regardless of data types. It was part of Statistics Toolbox (needs to pay extra for it) and it does not work if you have multiple columns that you want to assign an unique index unless the ROWS are identical.

Upon inspection. grp2idx() is overrated. There are two ways to get it without paying for the toolbox:

  • double(categorical(X)): cast a categorical type (technically you can use nominal/ordinal, but it’s part of statistics toolbox)
  • Use the 2nd output argument for sort() or sortrows() function. I recommend sortrows() because it’s can be overloaded on table() objects and it works on multiple rows.

 25 total views

Anonymous Functions (MATLAB) vs Lambdas (Python) Anonymous Functions in MATLAB is closure while Lambdas in Python are not

Lambdas in Python does not play by the same rules as anonymous functions in MATLAB

  • MATLAB takes a snapshot of (capture) the workspace variables involved in the anonymous function AT the time the anonymous function handle is created, thus the captured values will live on afterwards (by definition a proper closure).
  • Lambda in Python is NOT closure! [EDIT: I’ll need to investigate the definition of closure more closely before I use the term here] The free variables involved in lambda expressions are simply read on-the-fly (aka, the last state) when the functor is executed.

It’s kind of a mixed love-and-hate situation for both. Either design choice will be confusing for some use cases. I was at first thrown off by MATLAB’s anonymous function’s full variable capture behavior, then after I get used to it, Python’s Lambda’s non-closure tripped me. Even in the official FAQ, it address the surprise that people are not getting what they expected creating lambdas in a for-loop.

To enable capture in Python, you assign the value you wanted to capture to a lambda input argument (aka, using a bound variable as an intermediary and initialize it with the free variable that needs to be captured), then use the intermediary in the expression. For example:

lambda: ser.close()      # does not capture 'ser'
lambda s=ser: s.close()  # 'ser' is captured by s.

I usually keep the usage of nested functions to the minimum, even in MATLAB, because effectively it’s kind of a compromised ‘global’ between nested levels, or a little bit like protected classes in C++. It breaks encapsulation (intentionally) for functions in your inner circle (nest).

It’s often useful for coding up GUI in MATLAB quick because you need to share access to the UI controls within the same group. For GUI that gets more complicated, I actually avoided nested functions altogether and used *appdata() to share UI object handles.

Functors of nested functions are closures in both MATLAB and Python! Only Lambdas in Python behave slightly differently.

 729 total views

Handling resources that needs to be cleaned up in Python and MATLAB Files, sockets, ports, etc.

Using try/catch to handle resource (files, ports, etc) cleanup is out of fashion in both MATLAB and Python. In MATLAB, it uses a very slick idea based on closures to let you sneak your cleanup function (as a function object) into an onCleanup object that will be executed (in destructor) when that object is cleaned up by being out of scope.

Python does not provide the same mechanism. Instead, it relies on the resource class (like file IO or PySerial) to implement as a Context Manager (has __enter__) and provide the cleanup in the manager’s __exit__ method. Then you use the with keyword with the returned resource object put after as keyword, like this:

with File('test.txt', 'w') as f:
    f.write('SPFCCsMfT!')

The body of with-block will not run (and therefore object f won’t be created) if the with-statement throws an exception. Unfortunately, it’s a fill-or-kill (or try/finally) instead of try/catch. So if the resource failed to open, resource object f is simply not created. No other clue is generated. This is what I hate about the with-statement. There are two ways to kind of get around it, but they are not reliable and might cause other bugs if you don’t keep track of the variable names in the local context:

  1. Check for the existence of the resource object
    if 'f' in dir(): del f   # Avoid name conflicts
    with File() as f
        print("Done");
    if not 'f' in dir():
        print('Cannot create file');
  2. Use the body code to indicate that the with-block is executed
    isSuccess = false     # Signaling
    with File() as f
        isSuccess = true
        print("Done')
    if not isSuccess
        print("Cannot create file");
    
    
  3. Back to the old way
    try:
        f = File();
    except:
        print("Cannot open file")
    else:
        cleanup_obj = onCleanup(lambda x = f: x.custom_cleanup())
        # run core code that uses resource f
    

Actually there’s another mess here. In PySerial, creating the Serial object with a wrong port string will throw an exception as well, which with-as statements cannot handle. Therefore you’ll need to do both:

try:
    ser = serial.Serial(dev_str)
except:
    print(dev_str + " not accessible (either the wrong port of something else opened it)");
else:
    with ser:
        # meat

If your resource initializer does not have context manager built in, and you want a quick-and-dirty solution (given your cleanup is a one-liner). Use my library (lang.py) that recreates onCleanup():

"""
@author: [2019-04-23] Hoi Wong 
"""
class onCleanup:
    '''make sure you 'capture' the lambdas by initializing an intermediate running variable
       e.g. lambda s=ser: s.close()
       lambda: ser.close() will NOT work as ser is not 'captured''''
    def __init__(self, functor):
        self.task = functor;
    def __del__(self):
        self.task()

Then you can use the old way without nesting try/except:

try:
    f = File()
except:
    print("Cannot open file")
else:
    cleanup_obj = onCleanup(lambda x = f: x.custom_cleanup())
    # run core code that uses resource f

Check with the provider of your resource initializer to see if context manager is already implemented. Use onCleanup() only when you don’t have this facility and you don’t want to build a whole context manager (even with decorators) for a one-liner cleanup.

 533 total views

Python startup management

The startup script is simply startup.m in whatever folder MATLAB start with.

Now how about Python? For plain Python (anything that you launch in command line, NOT Spyder though), you’ll need to ADD a new environment variable PYTHONSTARTUP to point to your startup script (same drill for Windows and Linux).

For Spyder, it’s Tools>Preferences>IPython console>Startup>”Run a file”:

but you don’t need that if you already have new environment variable PYTHONSTARTUP correctly setup.

 

 802 total views

MATLAB and Python paths

MATLAB’s path() is equal to Python’s sys.path().


To add paths in MATLAB, use the obviously named function addpath(). Supply the optional -end argument if you don’t want any potential shadowing (i.e. the folder to import has lower priority if there’s an existing function with the same name).

I generally avoid userpath() or the graphical tools because the results are sticky (persists between sessions). The best way is to exclusively manage your paths with startup.m so you always know what you are getting. If you want full certainty, you can start with restoredefaultpath() in MATLAB.


Python’s suggested these as equivalents of MATLAB’s addpath():

sys.path.insert(0, folder_to_add_to_path)
sys.path.append(folder_to_add_to_path)

but just like MATLAB’s addpath() which works with strings only (not cellstr), these Python options do not work correctly  with Python lists because the methods in sys.path are as primitive as doing [sys.path, new_stuff]:

  1. This means you’ll end up with list of lists if you supplied Python lists as inputs to the above
    (MATLAB will throw an exception if you try to feed it with cellstr instead of polluting your path space with garbage)
  2. This also means it doesn’t check for duplicates! It’ll keep stacking entries!

To address the first problem, we use sys.path.extend() instead. It’s like doing addpath(..., '-end') in MATLAB. If you want it to be inserted at the front (higher priority, shadows existing), you’ll need sys.path = list_of_new_paths + sys.path. For MATLAB, you can make a path string like DOS by using pathsep:

addpath(strjoin(cellstr_of_paths, pathsep)))

Note that  sys.path.extend() expect iterables so if you feed it a string, which Python will consider it a list of characters, you will get a bunch of one character paths inserted!

On the other hand, DO NOT TRY to get around it in Python with the same trick like MATLAB by doing sys.path.append( ';'.join(path_list)). Python recognize sys.path as a list, NOT a one long string like MATLAB/Windows path, despite insert() and append() accepts only strings!

Aargh!

The second problem (which does NOT happen in MATLAB) is slightly more work. You’ll need to subtract out the existing paths before you add to it so that you won’t drag your system down by casually adding paths as you see fit. One way to do it:

def keep_only_new_set_of_paths(p):
    return set(p)-set(sys.path)

You should organize your programs and libraries in a directory tree structure and use code to crawl the right branch into a path list! Don’t let the lack of built-in support to tempt you to organize files in a mess. Keep the visuals clean as mental gymnastics/overheads can seriously distract you from the real work such as thinking through the requirements and coming up with the right architecture and data structures. If you constantly need to jump a few hoops to do something, do it only once or twice using the proper way (aka, NOT copying-and-pasting boilerplate code), and reuse the infrastructure.

At my previous workplaces, they had dozens and dozens of MATLAB files including all laying flat in one folder. The first thing I did when I join a new team is showing everybody this idiom that recursively adds everything under the folder into MATLAB paths:

addpath(genpath())

Actually the built-in support for recursive directory search sucks for both MATLAB and Python.  Most often what we need is just a list of full paths for a path pattern that we search recursively, basically dir/w/s *. None of them has this right out of the box. They both make you go through the comprehensive data structure returned (let it be tuples from os.walk() in Python or dir() in MATLAB) and so some manipulations to get to this form.

genpath() itself is slow and ugly. It’s basically a recursive wrapper around dir() that cleans up garbage like '.' and '..'.  Instead of getting a newline character, a different row (as a char array) or a different cell (as cellstr), you get semi-colons (;) as pathsep in between. Nonetheless, I still use it because despite I have recursive path tools in my own libraries, I’ll need to load the library first in my startup file, which requires a recursive path tool like genpath(). This bootstraps me out of a chicken-and-egg problem without too much ugly syntax.


Most people will tell you to do a os.walk() and use listcomp to get it in the typical full path form, but I’m not settling for distracting syntax like this. People in the community suggested using glob for a relatively simple alternative to genpath()

Here’s a cleaner way:

def list_subfolders_recursively(p):
    p = p + '/**/' 
    return glob.glob(p, recursive=True);

It’s also worth noting that Python follows Linux’s file search pattern where directory terminates with a filesep (/) while MATLAB’s dir() command follows the OS, which in Windows, it’s *..

Both MATLAB and Python uses ** to mean regardless of levels, but you’ll have to turn on the recursive=True in glob manually. ** is already implied to be recursive in MATLAB’s dir() command.


Considering there’s quite a bit of plumbing associated with weak set of sys.path methods provided in Python, I created a qpath.py next to my startup.py:

''' This is the quick and dirty version to bootstrap startup.py
Should use files.py that issue direct OS calls for speed'''

import sys
import glob

def list_subfolders_recursively(p):
    p = p + '/**/' 
    return glob.glob(p, recursive=True);

def keep_only_new_set_of_paths(p):
    return set(p)-set(sys.path)

def set_of_new_subfolders_recursively(p):
    return keep_only_new_set_of_paths( list_subfolders_recursively(p) )

def add_paths_recursively_bottom(p):
    sys.path.extend(set_of_new_subfolders_recursively(p));

def add_paths_recursively_top(p):
    # operator+() does not take sets
    sys.path = list(set_of_new_subfolders_recursively(p)) + sys.path;

In order to be able to import my qpath module at startup.py before it adds the path, I’ll have put qpath.py in the same folder as startup.py, and request startup.py to add the folder where it lives to the system path (because your current Python working folder might be different from PYTHONSTARTUP) so it recognizes qpath.py.

This is the same technique I came up with for managing localized dependencies in MATLAB: I put the dependencies under the calling function’s folder, and use the path of the .m file for the function as the anchor-path to add paths inside the function. In MATLAB, it’s done this way:

function varargout = f(varargin)
  anchor_path = fileparts( mfilename('fullpath') );
  addpath( genpath(fullfile(anchor_path, 'dependencies')) );
  % Body code goes here

Analogously,

  • Python has __file__ variable (like the good old preprocessor days in C) in place of mfilename().
  • MATLAB’s  mfilename('fullpath') always gives the absolute path, but Python’s  __file__ is absolute if it’s is not in sys.path yet, and relative if it’s already in it.
  • So to ensure absolute path in Python, apply os.path.realpath(__file__). Actually this is a difficult feature to implement in MATLAB. It’s solved by a MATLAB FEX entry called GetFullPath().
  • Python os.path.dirname is the direct equivalent of fileparts() if you just take the first argument.

and in my startup.py (must be in the same folder as pathtools.py):

import os
import sys

sys.path.append(os.path.dirname(os.path.realpath(__file__)))

import pathtool

user_library_path = 'D:/Python/Libraries';
pathtool.add_paths_recursively_bottom(user_library_path)

This way I can make sure all the paths are deterministic and none of the depends on where I start Python.


Now I feel like Python is as mature as Octave. It’s usable, but it’s missing a lot of thoughtful features compared to MATLAB. Python’s entire ecosystem like at least 10 years behind MATLAB in terms of user friendliness. However, Python made it up with some pretty advanced language features that MATLAB doesn’t have, but nonetheless, we are still stuck with quite a bit of boilerplate code in Python, which decreases the expressiveness of the language (I’m a proponent of self-documenting code: variable and function names and their organization should be carefully designed to tell the story; comments are reserved for non-obvious tricks)

 1,604 total views

Getting pyinstaller 3.4 to work with Python 3.7

Python is an excellent language, but given that it’s free, it also comes with a lot of conspicuous loose-ends that you will not expect in commercially supported platforms like MATLAB.

Don’t expect everything to work right out of the box in Python. Everything is like 98% there, with the last 2% frustrate the heck out of you when you are rushing to get from point A to point B and you have to iron out a few dozen kinks before you can really start working.

When I tried use pyinstaller (v3.4) to compile my Python (v3.7) program into an executable, I ended up having to jump through a bunch of hoops:

  • pip install pyinstaller gives:
    ModuleNotFoundError: No module named 'cffi'
  • Then I looked up and installed cffi
    pip install cffi
  • After the dependency was addressed manually (it shouldn’t )  pip install pyinstaller worked
  • Then I tried to compile my first Python executable with pyinstaller, and I got this exception:
    File "C:\Python37\lib\site-packages\win32ctypes\core\cffi\_advapi32.py", line 198
    
            ^
        SyntaxError: invalid syntax
  • I searched the exact string and learned that pyinstaller (v3.4) is not ready for Python 3.7 yet! How come pip installer didn’t check for it? I opened up the offending file and looked for line 198 and saw this:
    c_creds.CredentialBlobSize = \
    
        ffi.sizeof(blob_data) - ffi.sizeof('wchar_t')

    It’s a freaking line continuation character \ (actually the extraneous CR before CRLF) that rooster-blocked it.

  • I just deleted the line continuation and merged the two lines, and saved _advapi32.py, then I was able to compile my Python v3.7 code (using pyinstaller 3.4) with no issues.

This is not something you’ll experience as a MATLAB user. The same company, TMW, wrote the MATLAB compiler as well as the rest. The toolbox/packages are released together in one piece so breaking changes that causes failure for the most obvious use case are caught before they get out of the door.

Another example of breaking changes that I ran into: ipdb does not allow you to move cursor backward.

Again, this is the cost associated with free software and access to the latest updates and new features without waiting for April/October (it’s the MATLAB regular release cycle). If hassle and the extra engineering time far exceed licensing MATLAB licensing costs, MATLAB is a better choice, especially if software is just a chore to get your company from point A to point B, and you are willing to pay big bucks to get there quickly and reliably.

Even with free software on the table, your platform choice is always determined by:

  • how much your time is worth wrestling problems
  • how much flexibility do you need (for customizing to your needs)
  • how much you are willing to pay for the licenses and support

In any case, the community did good work. Please consider sponsoring PyInstaller and PSF if you profit immensely from their work.

 1,410 total views