How I learned MATLAB inside out

Back when I was a struggling graduate student working 3 university jobs to stay afloat, one of the job was to build a multi-center data collection system that take cares of remote data upload, store it in a database and visualize the waveforms and records.

I surveyed other platforms and languages for a couple of months and finally settled on MATLAB because at the time MATLAB had the most convenient data types (cells), data loading/saving (‘.mat’ files so I don’t have to manage the datatypes/format), external interfaces, and most importantly MATLAB FileExchange (FEX) pretty much cover every generic idea I can think of. With MATLAB, my work is pretty much down to coding the high level ‘business’ logic.

And no, I wasn’t biased towards MATLAB at the time because I have a signal processing background. I didn’t know much about MATLAB’s programming support back then other than number crunching (just like the 90% of the public who misundestood the power of the language), so I wouldn’t choose it for a software project at that level of complexity without much research.

Learning the guts of MATLAB, architecturing and coding the entire system took me only 3 months (well, it included a 1 month non-stop 16 hours a day, 7-days a week shut-in programming). Not a shabby platform for a project that is supposed to take 4 years. In fact, the rest of the time was spent

  1. reading the last owners *#@&ed up perl code and fighting to get the fragile linux setup to work on other machines, then reverse the entire project requirements from the source code because there wasn’t any documentation and the previous owners graduated.
  2. reconstruct Guidant’s half-finished (done by a 3rd party then abandoned) binary data reader by finishing the hardest part of the incomplete XSLT code.

The rest that has to do with MATLAB was relatively easy once I learned the main ideas through their documentation, newsgroups, Loren’s blog and the official support.

To understand and appreciate the beauty of MATLAB and use it effectively, you have to get past the following hurdles:

  1. Basic data structures: cell, struct and language features.
  2. Vectorization: use for-loops only in limited cases
  3. Anonymous function (Lambdas), cellfun(), arrayfun(), bsxfun(), structfun()
  4. Overloading and OOP
  5. Tables (Heterogeneous data structures) and Categorical objects (Nominal, Ordinal)

or else you are coding it like a C programmer: a complete waste of MATLAB’s license fees. If you know these 5 aspects of MATLAB well and would say there are strictly superior options out there, please let me know in the comments section and I’ll look into it.

 

Loading

MATLAB Quirks: cellfun() high-performance trap

cellfun() is a powerful function in MATLAB which mirrors the idea of applying a ‘map’ operation (as in functional programming) to monads (cells: a datatype that holds any arbitrary data type, including itself).

One common use is to identify which cells are empty so you can ignore them. Typically, you can do

index.emptyCells = cellfun(@isempty, C);

According to the help (H1) page, there is a limited set of strings that you can use in place of the function handle (functor) for backward compatibility:

 
A = CELLFUN('fun', C), where 'fun' is one of the following strings,
returns a logical or double array A the elements of which are computed 
from those of C as follows:
 
   'isreal'     -- true for cells containing a real array, false
                   otherwise 
   'isempty'    -- true for cells containing an empty array, false 
                   otherwise 
   'islogical'  -- true for cells containing a logical array, false 
                   otherwise 
   'length'     -- the length of the contents of each cell 
   'ndims'      -- the number of dimensions of the contents of each cell
   'prodofsize' -- the number of elements of the contents of each cell
 
A = CELLFUN('size', C, K) returns the size along the K-th dimension of
the contents of each cell of C.
 
A = CELLFUN('isclass', C, CLASSNAME) returns true for each cell of C
that contains an array of class CLASSNAME.  Unlike the ISA function, 
'isclass' of a subclass of CLASSNAME returns false.

Turns out these functions have their native implementation and runs super-fast. But I got burned once when I tried to use this on cells containing table() objects:

index.emptyCells = cellfun('isempty', cellsOfTables);

I meant to find out if I have zero-row or zero-column (empty) table objects with that. It gave me all false even when my cells have these 0-by-0 tables. What a Terrible Failure?! Turns out these ‘backward compatibility’ native implementations (I guess they already have cellfun() before having function handles) looks at the raw data stream like PODs (Plain Old Datatypes) as a C program would do.

A table() object has lots of stuff stored in it like variable (column) names, so there’s no way a program looking at an arbitrary binary stream without accounting for such data type will consider that object empty. It’s up to the overloaded isempty() or numel() of the class to tell what is empty and what’s not, but it needs to be called by the function handle to establish which method to call.

Lesson learned: don’t use those special string based functor in cellfun() unless you know for sure it’s a POD. Otherwise it will get you the wrong answer at the speed of light.

Loading

Make your MATLAB code work everywhere

MATLAB is used in many different setups that it’s hard to expect every line of your code you write generalizes to all other versions, platform and deployments. The following code examples helps you identify what envirnoment you are in and so you can branch accordingly:

% 32 bit vs 64 bit MATLAB
% This exploits the fact that mex is on the same platform as your MATLAB
is64bit = strcmpi( mexext(), 'mexw64');

% Handling different MATLAB versions
if( verLessThan('matlab', yourVersionString) )
  error('This code requires %s or above to run!', yourVersionString);
end

% Headless (Parallel) workers
% This exploits the fact that JVM is not loaded for parallel workers. 
% Starting MATLAB with '/nojvm' will be considered headless
isHeadless = ~usejava('desktop');

% DCOM / ActiveX /Automation Server
isDCOM = enableservice('AutomationServer');

% Test for compiled MATLAB:
isdeployed()

Loading