Python 3 Scientific Installation To-do List

Although I am a big fan of MATLAB, it’s time for me to really try out Python so I can fairly compare the pros and cons of both languages.

The first tiny hurdle for Python is its scattered installation process for Windows. I thought Python(x,y) will give me everything in one place, but turns out the Spyder is stuck in Python 2.7. To install Python 3.7, I’ll need to do it from scratch. Here are the steps:

  1. Download official Python 3 (https://www.python.org/downloads/windows/). You will need that for the “pip” package manager
  2. In command prompt anywhere,
    pip install scipy
    It might complain that {python37}/scripts is not in PATH variable, but I checked and the folder is already in PATH. Can safely ignore it.
  3. I took the advice to upgrade PIP. Can run it anywhere in command prompt
    python -m pip install –upgrade pip
  4. Now I’ll need Spyder3, a MATLAB-like IDE. Qt5 is one of the pre-req:
    pip3 install PyQt5
  5. And finally Spyder3
    pip install Sypder
  6. PyVISA is the analog of “Instrument Control Toolbox” in MATLAB.
    pip install pyvisa
  7. Turns out that only NumPy and IPython is installed with SciPy, not the entire ecosystem.
    pip3 install pandas
    pip3 install matplotlib
    If you know the power of dataset/table objects in MATLAB like I do, you’ll jump for dataframes in panadas.
  8. SymPy, the analog of MATLAB’s symoblic math toolbox, needs to be installed separately
    pip3 install sympy
  9. IPython gives the ‘notebook’ feel in Mathematica, MathCAD and Maple, where the returned results are directly pasted in the same area where your command/syntax is. I rarely cared for it because I usually want the max visual real estate for my plots.

pip does not install icons in your start menu. So I’ll need to manually create a shortcut
“{Python37}/Scripts/spyder3.exe”

.py files are not associated with Spyder3 (normally it’ll just directly run the python script with python3). I usually manually change the association in Windows to Sypder3.

11 total views, 1 views today

TDS 500~800 series Monochrome CRT Driver Repair

The most common mode of CRT display failure in the TDS 500~800 series (Monochrome models) is the flyback transformer. The symptom is that after leaving the screen on for a couple of hours, the screen started stretching vertically until it disappears.

It also happens the failure only happens to a batch of CRT boards. The batch I’ve seen looks new (with modern markings that tells you the purpose of the trimpots) and lightly used, so I’m sure it’s infant mortality. Here’s the broken CRT driver board to be repaired:

There’s practically nowhere you can find this obsolete replacement because it’s not a common configuration. I sourced a batch from China that claimed the part number, and I spent whole day cursing the vendor when the flyback transformer arrived. Here’s what I saw:

The one on the right was the broken original flyback transformer, and two on the left were my new orders. Not only that the shapes are completely different, the number of pins doesn’t even match. WTF?!!!

The seller told me that it works. Forget about how to fit that in the board for a moment. How the f*** am I supposed to know which pins goes to which spot? Not to mention there are 11 nibs  when the original only has 8+1 (actually 7+1, pin#8 is not used). I said nibs instead of pins because not all of them are populated with a pin, and the pins that are missing were not even consistent across the transformers in the batch.

I cannot even guess with a multimeter because it’s not a simple, uniform transformer. Even if I know which ones are connected, I could have ruined the whole thing by having the wrong number of coil turns/inductances because I switched a pair or two!

I had to push the seller really hard for him to dig up the actually mapping and draw me the pinouts on the pictures I’ve sent him (I’m sure the whole batch will be trash if I could not communicate with them in Chinese). The ‘product’ must have been designed and the manufacturing line ran by a bunch of village idiots. Nothing is right about it other than the windings inside are electrically usable (can’t even say compatible because I need to hack it really hard to get the correct display). Here’s the pinout:

Here’s another transformer that doesn’t have the ground pin (unnumbered), turns out the transformer works without it:

The space inside the oscilloscope case is pretty tight, and I managed to find one orientation that lines up with the case nicely, but it’s ugly as hell:

I held it down with hot-glue, caulk to stabilize it. A rubber band was put over it so that if the glue fails, the transformer won’t roll inside the compartment causing mayhem (later units I used cable ties since rubber band might deteriorate with heat. You get the idea.):

I have a few of these transformers available for $80/pc. Unless you have the time on your hands to figure out the details, the learning curve is so steep that I wouldn’t recommend DIY.

I hate to see perfectly good units going to the landfill for stupid reasons. Although it’s less than what my time is worth, for $350, I can provide all the necessary part and do the CRT driver surgery if you anybody wants to revive a unit.

If time is pressing and you don’t want to ship the unit back and forth, I have ready-made CRT driver boards for $300 (you’ll need to trade in the bad CRT driver board). Of course, you have to do the CRT screen adjustments yourself since each tube is different.

247 total views, 2 views today

MATLAB Techniques: variadic (variable input and output) arguments

I’ve seen a lot of ugly implementations from people trying to deal with variable number of input and output arguments. The most horrendous one I’ve seen so far came from MIT’s Physionet’s WFDB MATLAB Toolbox. Here’s a snippet showing how wfdbdesc.m handles variable input and output arguments:

function varargout=wfdbdesc(varargin)
% [siginfo,Fs,sigClass]=wfdbdesc(recordName)
...
%Set default pararamter values
inputs={'recordName'};
outputs={'siginfo','Fs','sigClass'};

for n=1:nargin
    if(~isempty(varargin{n}))
        eval([inputs{n} '=varargin{n};'])
    end
end
...
if(nargout>2)
    %Get signal class
    sigClass=getSignalClass(siginfo,config);
end
for n=1:nargout
    eval(['varargout{n}=' outputs{n} ';'])
end

 

The code itself reeks a very ‘smart’ beginner who didn’t RTFM. The code is so smart (shows some serious thoughts):

  • Knows to use nargout to control varargout to avoid the side effects when no output is requested
  • [Cargo cult practice]: (unnecessarily) track your variable names
  • [Cargo cult practice]: using varargin so it can be symmetric to varargout (also handled similarly). varargout might have a benefit mentioned above, but there is absolutely no benefit to use varargin over direct variable names when you are not forwarding or use inputParser().
  • [Cargo cult practice]: tries to be efficient to skip processing empty inputs. Judicially non-symmetric this time (not done to output variables)!

but yet so dumb (hell of unwise, absolutely no good reason for almost every ‘thoughtful’ act put in) at the same time. Definitely MIT: Make it Tough!

This code pattern is so wrong in many levels:

  • Unnecessarily obscuring the names by using varargin/varargout
  • Managing a list of variable names manually.
  • Loop through each item of varargin and varargout cells unnecessarily
  • Use eval() just to do simple cell assignments! Makes me cringe!

Actually, eval() is not even needed to achieve all the remaining evils above. Could have used S_in = cell2struct(varargin) and varargout=struct2cell(S_out) instead if one really wants to control the list of variable names manually!


The hurtful sins above came from not knowing a few common cell packing/unpacking idioms when dealing with varargin and varargout, which are cells by definition. Here are the few common use cases:

  1. Passing variable arguments to another function (called perfect forwarding in C++): remember C{:} unpacks to comma separated lists!
    function caller(varargin)
       callee(varargin{:});
  2. Limiting the number of outputs to what is actually requested: remember [C{:}] on left hand side (assignment) means the outputs are distributed as components of C that would have been unpacked as comma separated lists, i.e. [C{:}] = f(); means [C{1}, C{2}, C{3}, ...] = f();
    function varargout = f()
    // This will output no arguments when not requested,
    // avoiding echoing in command prompt when the call is not terminated by a semicolon
        [varargout{1:nargout}] = eig(rand(3));
  3. You can directly modify varargin and varargout by cells without de-referencing them with braces!
    function varargout = f(varargin)
    // This one is effectively deal()
        varargout = varargin(1:nargout);
    end
    
    function varargout = f(C)
    // This one unpacks each cell's content to each output arguments
        varargout = C(1:nargout);
    end

One good example combining all of the above is to achieve the no-output argument example in #2 yet neatly return the variables in the workspace directly by name.

function [a, b] = f()
// Original way to code: will return a = 4 when "f()" is called without a semicolon
    a = 4;
    b = 2;
end

function varargout = f()
// New way: will not return anything even when "f()" is called without a semicolon
    a = 4;
    b = 2;
    varargout = manage_return_arguments(nargout, a, b);
end

function C = manage_return_arguments(nargs, varargin)
    C = varargin(1:nargs);
end

I could have skipped nargs in manage_return_arguments() and use evalin(), but this will make the code nastily non-transparent. As a bonus, nargs can be fed with min(nargout, 3) instead of nargout for extra flexibility.


With the technique above, wfdbdesc.m can be simply rewritten as:

function varargout = wfdbdesc(recordName)
% varargout: siginfo, Fs, sigClass
...
varargout = manage_return_arguments(nargout, siginfo, Fs, sigClass);

Unless you are forwarding variable arguments (with technique#1 mentioned above), input arguments can be (and should be) named explicitly. Using varargin would not help you avoid padding the unused input arguments anyway, so there is absolutely no good reason to manage input variables with a flexible list. MATLAB already knows to skip unused arguments at the end as long as the code doesn’t need it. Use exist('someVariable', 'var') instead.

 

 

225 total views, 3 views today

Oversimplified: Getting rid of data in STL containers Summary of Item 9 in "Effective STL"

Unless deleting a known range of elements directly through iterators (no conditions to match), which rangeerase() method can be used directly, targeting specific key/value/predicate requires understanding of the container’s underlying data structure.

I’d like to give a summary of Item#9 in “Effective STL” by defining the names and concepts so the complicated rules can be sensibly deduced by a few basic facts.


The words ‘remove‘ and ‘erase‘ has very specific meaning for STL that are not immediately intuitive.

Lives in Target to match Purpose
remove_?() <algorithm> required Rearrange wanted elements to front
erase() container not accepted Blindly deleting range/position given

There is a remove() method for lists, which is an old STL naming inconsistency (they should have called it erase() like for associative containers). Treat it as a historical mistake.

The usage is easy to remember once you understand it with the right wording above:

algorithm + container contiguous lists associative
remove_?(): move front Step 1 Step 1
(Use remove_?() method)
unordered*: cannot rearrange
(Use erase(key) directly)
erase(): trim tail Step 2 Step 2
(Use remove_?() method)
Use after find_?()
(Use erase(key) directly)

Note that there are two steps for sequential (contiguous+lists) containers , hence the erase-remove idiom. e.g. auto tail = remove(c.begin(), c.end(), T); c.erase(tail, c.end);. Lists provides a efficient shortcut method (see table below) since linked-lists does not need to be rearranged (just short the pointers).

one-shot methods contiguous lists associative
by content N/A: use erase-remove idiom remove(T) erase(key)
by predicate N/A: use erase-remove_if idiom remove_if(pred) N/A: Use for-loops for now
No erase_if() yet as of C++17.

Never try range-based remove_?() for associative containers. It is a data corruption trap if any attempt is made to use anything named remove on associative containers.

The trap used to be possible since <algorithms> and containers were separate, but newer C++ protects you from the trap by checking if the element you are moving is of a MoveAssignable type. Since associative containers’ keys cannot be modified (without a rescan), the elements are not move-assignable.


As for erasing through for-loops (necessary if you want to sneak in an extra step while iterating), C++11 now returns an iterator following the last erased element uniformly across all containers. This helps to preserve the running iterator that gets invalidated immediately after the erase through i=c.erase(i);


* For brevity, I twisted the term unordered here to mean that the native (implementation) data order is dependent on the data involved.

When I said ‘cannot rearrange’, I meant ‘cannot efficiently rearrange’, since there are no cheap O(1) next() or prev() traversal.

It’s a mess to simply copy one element over another (during rearrangement), leaving orphans there, and re-balance a BST or re-hash a hash-map. Nobody wants to go through such pains to remove element when there are tons of more direct ways out there.

200 total views, 1 views today

Understanding the difference between recognized arrays and pointers 'Recognized' means sizeof(array_name) gives the underlying allocated size

array≠ pointer:

pointer only contains a memory location,
while an array already have memory allocated to hold the data.


The confusion comes from the fact that array names are always seen as pointers anywhere in C, but when an array name is referred in places that the scope happens know the allocated size, namely

  • Global arrays: everybody knows the size
  • Local arrays: only the instantiating function knows its size.

, the array name itself has a superpower that pointers lack: report the underlying allocated data size (NOT pointer size) using sizeof(array).


Definition: An array is ‘recognized‘ if the array name is used in the scope that knows the underlying data size.

Corollary: Calling the array name with sizeof() gives the underlying allocated data size.

Examples of consequences that can be derived from the definition above:

  • Heap allocations always return a pointer type, NOT an array name!
    So heap arrays are never recognized.
  • VLA in C99 are considered local stack arrays, so it’s recognized
  • x[] is just a cosmetic shorthand for *x: it doesn’t prevent any recognized array from decaying into a pointer across boundary.
  • The storage duration (static or not) does not matter. e.g.
    • Heap pointers at global level are not recognized arrays
    • Static local array still loses the recognition across function boundaries
      (unless passed carefully by data type T (&array)[N]).

Most often recognized arrays cannot be aliased without decaying into a pointer. However, we can bind a recognized array to a reference to an array, which is a completely different type. Example:

int v[]{1,2,3,4};
int (&w)[4]=v;  // w is a reference to an array of size 4

int* p = v;     // Decays v to a pointer. Size information lost.
// int &w[4]=v; // Does not compile: this means an array of 4 references.

Note that the syntax requires a bracket for reference name. Omitting it will lead the compiler to misinterpret it as an array of references, which cannot* be compiled.

This means contrary to common beliefs, you can pass a recognized array across functions through reference, but this is rarely done because of the hassle of explicitly entering the number of elements (4 for the example above) as part of the data type. This can still be done through templates/constexpr, but for such inconvenience, we’re better off using std::vector (or std::array if you want near zero overhead).

However, so far I haven’t found a way to re-recognize an array from a pointer. That means there is no way to keep a local array’s recognition across function boundaries in C since it does not have references like C++.


To summarize with a usage example: this post has described the entire logic needed to decide whether sizeof(x)/sizeof(x[0]) gives you the number of array elements, or how many times your machine pointer type is bigger than the element storage.


* references must be bound on creation. Declaring an array of references means you want to bound references in batches. There are no mechanisms to do so as of C++14.

 

234 total views, 2 views today

begin() and end() is defined for arrays in C++11 and above

I was a little confused the first time I saw range-based for-loop in C++11 on “Tour of C++” that it works right out of the box for both recognized array and STL containers and yet the text says it requires begin() and end() to be defined for the object for range-based for-loop to run.

I later learned that despite typical STL usage example writes v.begin(), v.end(), the most bulletproof way is to write begin(v), end(v) instead (Herb Sutter recommends it). Then I started to suspect that C++11 must have defined free-form (non-member) begin(), end() functions that takes in arbitrary recognized arrays. I pulled up my code editor and ran this:

#include <iostream>
int main()
{
    int v[4]={1,2,3,4};
    std::cout << *(std::crend(v)-1) << std::endl;

    return 0;
}

It compiled and ran uneventfully, printing ‘1’ as expected (I’m using crend(), to see if they implemented the more obscure ones). It makes more sense now why range-based for-loop works for arbitrary recognized arrays without making an exception to the begin(), end() requirement.

To confirm that it is the case (since “Tour of C++” didn’t say anything about why arbitrary array works for range-based for loop), I looked up the STL source code from libc++ in LLVM, namely <iterator>, and saw this:

template <class T, size_t N> constexpr T* begin(T (&array)[N]);

Bingo! There’s a mechanism to do so. But before I close, Stephan T. Lavavej (Mr. STL) mentioned that the template quoted above is no longer required (by the standard) to implement range-based for-loop in C++11.

Now the conclusion becomes that begin(), end() that takes in recognized arrays exist (which completes the logic behind range-based for-loop), but the range-based for loop can (and typically will) handle recognized arrays without these templated functions defined.

 

175 total views, 1 views today

Super-simplified: Programming high performance code by considering cache

  • Code/data locality (compactness, % of each cache line that gets used)

  • Predictable access patterns: pre-fetch (instructions and data) friendly. This explains branching costs, why linear transversal might be faster than trees at smaller scales because of pointer chasing, why bubble sort is the fastest if the chunks fit in the cache.

  • Avoid false sharing: shared cache line unnecessarily with other threads/cores (due to how the data is packed) might have cache invalidating each other when anyone writes.

153 total views, no views today

Super-simplified: What is a topology

‘Super-simplified’ is my series of brief notes that summarizes what I have learned so I can pick it up at no time. That means summarizing an hour of lecture into a few takeaway points.

These lectures complemented my gap in understanding open sets in undergrad real analysis, which I understood it under the narrow world-view of the real line.


X: Universal set

Topology ≡ open + \left\{\varnothing, X\right\}

Open ≡ preserved under unions, and finite intersections.

Why finite needed for intersections only? Infinite intersections can squeeze open edge points to limit points, e.g. \bigcap^{\infty}_{n}(-\frac{1}{n},\frac{1}{n}) = \left\{0\right\}.

Never forget that \left\{\varnothing, X\right\} is always there because it might not have properties that the meat open set B doesn’t have. e.g. a discrete topology of \mathbb{Q} on (0,1) = B \subseteq universal set X=\mathbb{R} means for any irrational point, \mathbb{R} is the only open-neighborhood (despite it looks far away) because they cannot be ‘synthesized*’ from \mathbb{Q} using operation that preserves openness.

* ‘synthesized’ in here means constructed from union and/or finite intersections.


[Bonus] What I learned from real line topology in real analysis 101:

  1. Normal intuitive cases
  2. Null and universal set are clopen
  3. Look into rationals (countably infinite) and irrationals (uncountable)
  4. Blame Cantor (sets)!

 

 

212 total views, no views today

EMC PCB Layout Notes

  • Implicit RLC (potentially filters) and antennas formed by traces
  • Large ground/voltage planes serves as EMI shield, low impedance path current sink
  • True differential signals can be generated by current sources
  • Decouple with ferrite beads if radiation inevitable by geometry/placement
  • Avoid / minimize large current swings on analog plane (e.g. buffer digital signals)
  • Star ground when splitting sections: don’t let heavy digital current sink through analog ground by cascading the grounds.
  • Don’t really need to split planes as long as large digital current’s preferred return paths are localized and far away from the analog section.
  • AGND/DGND refers to the grounds responsible for different sections of a mixed-signal IC. Has nothing to do with which actual ground to tie to. (e.g. DGND pin in ADC chips still goes to analog ground plane as it has low switching current)

221 total views, 2 views today