Watch out for ‘const’ methods in Python

One thing I find not quite intuitive after switching to Python is that I constantly have to look up whether a method updates the object in place or returns a different object (of the same type) that I have to assign back to the input variable myself.

An example would be string and bytes objects. replace() sounds like an updating method, but it’s actually a ‘const’ method (a term borrowed from C++ to say that the method has no side effects on the object): it does not change the state of the object.

I initially thought this had to do with whether the object is immutable or not, but when I tried it on bytearray objects (which are mutable), replace() behaved consistently with the identically named methods of the immutable types (bytes, str): you need to assign the output back to the same name (basically bind the name to the temporary and throw away the original).

bts = b'test'
bts.replace(b'es', b'oas')       # result is only echoed in the interactive session (accessible via _); bts is unchanged
bts = bts.replace(b'es', b'oas') # actually updates bts
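The behaviour described above can be checked with a few assertions. This is only a minimal sketch: the variable names are made up for illustration, and list.sort() is included purely as a contrast, as one of the built-in methods that does mutate in place.

```python
# Which methods mutate in place, and which return a new object?
text = "test"
raw = b"test"
buf = bytearray(b"test")   # mutable, unlike str and bytes

# replace() never mutates, even on a mutable bytearray: it returns a new object.
new_buf = buf.replace(b"es", b"oas")
assert buf == bytearray(b"test")        # original untouched
assert new_buf == bytearray(b"toast")
assert new_buf is not buf

# Same story for the immutable types.
assert text.replace("es", "oas") == "toast" and text == "test"
assert raw.replace(b"es", b"oas") == b"toast" and raw == b"test"

# By contrast, in-place methods on mutable objects usually return None.
nums = [3, 1, 2]
result = nums.sort()
assert result is None and nums == [1, 2, 3]
```

A rough rule of thumb that follows from this: if a built-in method returns None, it most likely modified the object; if it returns something, check whether the original changed before relying on it.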

 


Python startup management

In MATLAB, the startup script is simply startup.m in whatever folder MATLAB starts in.

Now how about Python? For plain Python (anything that you launch from the command line, NOT Spyder though), you’ll need to ADD a new environment variable PYTHONSTARTUP that points to your startup script (same drill for Windows and Linux). Note that the variable is only read when Python starts in interactive mode.

For Spyder, it’s Tools>Preferences>IPython console>Startup>“Run a file”, but you don’t need that if you already have the PYTHONSTARTUP environment variable set up correctly.
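To check whether the variable is actually set and points somewhere sensible, something like the following sketch can help. The function name describe_startup_script is hypothetical; only os.environ and the PYTHONSTARTUP variable name come from Python itself.

```python
import os

def describe_startup_script(env=None):
    """Report what PYTHONSTARTUP points to (hypothetical helper)."""
    env = os.environ if env is None else env
    path = env.get("PYTHONSTARTUP")
    if not path:
        return "PYTHONSTARTUP is not set"
    if not os.path.isfile(path):
        return "PYTHONSTARTUP points to a missing file: " + path
    return "startup script: " + path

print(describe_startup_script())
```

The env parameter only exists so the function can be tried with a plain dictionary instead of the real environment.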

 


MATLAB and Python paths

MATLAB’s path() is the equivalent of Python’s sys.path (which is a plain list, not a function).


To add paths in MATLAB, use the obviously named function addpath(). Supply the optional -end argument if you don’t want any potential shadowing (i.e. the folder to import has lower priority if there’s an existing function with the same name).

I generally avoid userpath() or the graphical tools because the results are sticky (they persist between sessions). The best way is to manage your paths exclusively with startup.m so you always know what you are getting. If you want full certainty, you can start with restoredefaultpath() in MATLAB.


Python users usually suggest these as equivalents of MATLAB’s addpath():

sys.path.insert(0, folder_to_add_to_path)
sys.path.append(folder_to_add_to_path)

but just like MATLAB’s addpath(), which works with strings only (not cellstr), these Python options do not work correctly with Python lists, because the list methods on sys.path are as primitive as doing [sys.path, new_stuff]:

  1. This means you’ll end up with a list nested inside the list if you supply a Python list as input to the above
    (MATLAB will throw an exception if you try to feed it a cellstr, instead of polluting your path space with garbage)
  2. This also means it doesn’t check for duplicates! It’ll keep stacking entries!
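Both problems are easy to reproduce on a stand-in list, so the real sys.path stays clean. The folder names below are hypothetical.

```python
# Work on a stand-in list so the real sys.path is not polluted.
search_path = ["/usr/lib/python"]
new_folders = ["/opt/a", "/opt/b"]   # hypothetical folders

# Problem 1: append() with a list adds ONE nested (useless) entry.
search_path.append(new_folders)
assert search_path == ["/usr/lib/python", ["/opt/a", "/opt/b"]]

# Problem 2: nothing checks for duplicates.
search_path = ["/usr/lib/python"]
search_path.append("/opt/a")
search_path.append("/opt/a")
assert search_path.count("/opt/a") == 2
```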

To address the first problem, we use sys.path.extend() instead. It’s like doing addpath(..., '-end') in MATLAB. If you want it to be inserted at the front (higher priority, shadows existing), you’ll need sys.path = list_of_new_paths + sys.path. For MATLAB, you can make a path string like DOS by using pathsep:

addpath(strjoin(cellstr_of_paths, pathsep))

Note that sys.path.extend() is still not polymorphic: it expects an iterable, so if you feed it a single string, which Python treats as a sequence of characters, you will get a bunch of one-character paths inserted!
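The string pitfall, and one possible guard against it, can be sketched as follows. The helper as_path_list is a made-up name for illustration, not something Python provides.

```python
# extend() iterates over its argument, so a lone string is split into characters.
search_path = []
search_path.extend("/opt")
assert search_path == ["/", "o", "p", "t"]

def as_path_list(paths):
    """Hypothetical helper: wrap a lone string so extend() sees one path."""
    return [paths] if isinstance(paths, str) else list(paths)

search_path = []
search_path.extend(as_path_list("/opt"))
search_path.extend(as_path_list(["/a", "/b"]))
assert search_path == ["/opt", "/a", "/b"]
```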

On the other hand, DO NOT TRY to get around it in Python with the same trick as in MATLAB by doing sys.path.append(';'.join(path_list)). Python treats sys.path as a list of separate folder strings, NOT one long delimited string like the MATLAB/Windows path, even though insert() and append() only make sense with single strings!

Aargh!

The second problem (which does NOT happen in MATLAB) is slightly more work. You’ll need to subtract out the existing paths before you add to it so that you won’t drag your system down by casually adding paths as you see fit. One way to do it:

def keep_only_new_set_of_paths(p):
    return set(p)-set(sys.path)
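One caveat with the set subtraction is that sets do not keep order, and order decides which folder shadows which. If that matters, an order-preserving variant could look like this sketch (the function name and the explicit existing argument are my own choices for illustration):

```python
def keep_only_new_paths_in_order(candidates, existing):
    """Drop entries already in `existing` and repeated candidates, keeping order."""
    seen = set(existing)
    fresh = []
    for p in candidates:
        if p not in seen:
            seen.add(p)      # also guards against duplicates within candidates
            fresh.append(p)
    return fresh

print(keep_only_new_paths_in_order(["/a", "/b", "/a", "/c"], ["/b"]))
```

In real use, existing would be sys.path; it is passed explicitly here so the function can be tried without touching the interpreter's path.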

You should organize your programs and libraries in a directory tree structure and use code to crawl the right branch into a path list! Don’t let the lack of built-in support tempt you to organize files in a mess. Keep the visuals clean, as mental gymnastics/overheads can seriously distract you from the real work, such as thinking through the requirements and coming up with the right architecture and data structures. If you constantly need to jump through a few hoops to do something, do it only once or twice the proper way (aka, NOT copying-and-pasting boilerplate code), and reuse the infrastructure.

At my previous workplaces, they had dozens and dozens of MATLAB files all laying flat in one folder. The first thing I do when I join a new team is show everybody this idiom that recursively adds everything under a folder to the MATLAB path:

addpath(genpath(folder))

Actually the built-in support for recursive directory search sucks in both MATLAB and Python. Most often what we need is just a list of full paths for a path pattern that we search recursively, basically dir/w/s *. Neither of them has this right out of the box. They both make you go through the comprehensive data structure returned (be it tuples from os.walk() in Python or the struct array from dir() in MATLAB) and do some manipulations to get to this form.

genpath() itself is slow and ugly. It’s basically a recursive wrapper around dir() that cleans up garbage like '.' and '..'. Instead of getting a newline character, a different row (as a char array) or a different cell (as cellstr), you get semicolons (;) as pathsep in between. Nonetheless, I still use it: even though I have recursive path tools in my own libraries, I’d need to load the library first in my startup file, which itself requires a recursive path tool like genpath(). This bootstraps me out of a chicken-and-egg problem without too much ugly syntax.


Most people will tell you to do an os.walk() and use a listcomp to get it into the typical full path form, but I’m not settling for distracting syntax like this. People in the community suggested using glob as a relatively simple alternative to genpath().

Here’s a cleaner way:

import glob

def list_subfolders_recursively(p):
    p = p + '/**/'
    return glob.glob(p, recursive=True)

It’s also worth noting that Python follows Linux’s file search pattern, where a directory pattern terminates with a filesep (/), while MATLAB’s dir() command follows the OS, which on Windows means the *. pattern.

Both MATLAB and Python use ** to mean any number of levels, but you’ll have to turn on recursive=True in glob manually. ** is already implied to be recursive in MATLAB’s dir() command.
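The effect of the trailing separator and of recursive=True can be seen on a throwaway folder tree. This is just a sketch: the folder and file names are arbitrary, and tempfile is only used so nothing is left behind.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build root/a/b plus one file that should NOT show up in the results.
    os.makedirs(os.path.join(root, "a", "b"))
    open(os.path.join(root, "a", "f.txt"), "w").close()

    pattern = os.path.join(root, "**", "")   # trailing separator: folders only
    with_recursion = glob.glob(pattern, recursive=True)
    without_recursion = glob.glob(pattern, recursive=False)

    # Express the hits relative to root so they are easy to read.
    found = sorted(os.path.relpath(p, root) for p in with_recursion)

print(found)
print(len(without_recursion), "hit(s) without recursive=True")
```

Without recursive=True, ** degrades to an ordinary * and only matches a single level, which is why the flag has to be switched on by hand.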


Considering there’s quite a bit of plumbing associated with the weak set of list methods available on sys.path in Python, I created a qpath.py next to my startup.py:

''' This is the quick and dirty version to bootstrap startup.py
Should use files.py that issue direct OS calls for speed'''

import sys
import glob

def list_subfolders_recursively(p):
    # Trailing separator makes glob return folders only
    p = p + '/**/'
    return glob.glob(p, recursive=True)

def keep_only_new_set_of_paths(p):
    # Set difference drops anything already on sys.path (order is not preserved)
    return set(p) - set(sys.path)

def set_of_new_subfolders_recursively(p):
    return keep_only_new_set_of_paths(list_subfolders_recursively(p))

def add_paths_recursively_bottom(p):
    # Lower priority: appended after the existing entries
    sys.path.extend(set_of_new_subfolders_recursively(p))

def add_paths_recursively_top(p):
    # operator+() does not take sets
    sys.path = list(set_of_new_subfolders_recursively(p)) + sys.path

In order to be able to import my qpath module in startup.py before it adds the paths, I have to put qpath.py in the same folder as startup.py, and have startup.py add the folder where it lives to the system path (because your current Python working folder might be different from where PYTHONSTARTUP points) so it recognizes qpath.py.

This is the same technique I came up with for managing localized dependencies in MATLAB: I put the dependencies under the calling function’s folder, and use the path of the .m file for the function as the anchor-path to add paths inside the function. In MATLAB, it’s done this way:

function varargout = f(varargin)
  anchor_path = fileparts( mfilename('fullpath') );
  addpath( genpath(fullfile(anchor_path, 'dependencies')) );
  % Body code goes here

Analogously,

  • Python has the __file__ variable (like the good old preprocessor days in C) in place of mfilename().
  • MATLAB’s mfilename('fullpath') always gives the absolute path, but Python’s __file__ is absolute if it is not in sys.path yet, and relative if it’s already in it.
  • So to ensure an absolute path in Python, apply os.path.realpath(__file__). Actually this is a difficult feature to implement in MATLAB. It’s solved by a MATLAB FEX entry called GetFullPath().
  • Python’s os.path.dirname() is the direct equivalent of fileparts() if you just take the first output.
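Put together, the Python counterpart of the MATLAB anchor-path idiom might look like the sketch below. The function names and the default 'dependencies' folder name mirror the MATLAB example but are otherwise my own assumptions; the caller is expected to pass its own __file__.

```python
import os

def anchor_folder(file_path):
    """Absolute folder containing file_path (pass __file__ from the caller)."""
    return os.path.dirname(os.path.realpath(file_path))

def dependency_folder(file_path, name="dependencies"):
    """Folder of localized dependencies next to the calling file (hypothetical layout)."""
    return os.path.join(anchor_folder(file_path), name)

# A relative input still yields an absolute anchor.
print(anchor_folder(os.path.join("pkg", "mod.py")))
```

Inside a module, sys.path.append(dependency_folder(__file__)) would then play the role of addpath(fullfile(anchor_path, 'dependencies')).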

and in my startup.py (must be in the same folder as qpath.py):

import os
import sys

# Make the folder containing this startup script importable first
sys.path.append(os.path.dirname(os.path.realpath(__file__)))

import qpath

user_library_path = 'D:/Python/Libraries'
qpath.add_paths_recursively_bottom(user_library_path)

This way I can make sure all the paths are deterministic and none of them depends on where I start Python.


Now I feel like Python is about as mature as Octave. It’s usable, but it’s missing a lot of thoughtful features compared to MATLAB. Python’s entire ecosystem feels at least 10 years behind MATLAB in terms of user friendliness. Python makes up for it with some pretty advanced language features that MATLAB doesn’t have, but we are still stuck with quite a bit of boilerplate code, which decreases the expressiveness of the language (I’m a proponent of self-documenting code: variable and function names and their organization should be carefully designed to tell the story; comments are reserved for non-obvious tricks).


Getting pyinstaller 3.4 to work with Python 3.7

Python is an excellent language, but given that it’s free, it also comes with a lot of conspicuous loose-ends that you will not expect in commercially supported platforms like MATLAB.

Don’t expect everything to work right out of the box in Python. Everything is like 98% there, with the last 2% frustrating the heck out of you when you are rushing to get from point A to point B and you have to iron out a few dozen kinks before you can really start working.

When I tried to use pyinstaller (v3.4) to compile my Python (v3.7) program into an executable, I ended up having to jump through a bunch of hoops:

  • pip install pyinstaller gives:
    ModuleNotFoundError: No module named 'cffi'
  • Then I looked up and installed cffi
    pip install cffi
  • After the dependency was addressed manually (it shouldn’t have to be), pip install pyinstaller worked
  • Then I tried to compile my first Python executable with pyinstaller, and I got this exception:
    File "C:\Python37\lib\site-packages\win32ctypes\core\cffi\_advapi32.py", line 198
    
            ^
        SyntaxError: invalid syntax
  • I searched the exact string and learned that pyinstaller (v3.4) is not ready for Python 3.7 yet! How come pip didn’t check for it? I opened up the offending file, looked for line 198, and saw this:
    c_creds.CredentialBlobSize = \
    
        ffi.sizeof(blob_data) - ffi.sizeof('wchar_t')

    It’s a freaking line continuation character \ (actually the extraneous CR before CRLF) that rooster-blocked it.

  • I just deleted the line continuation and merged the two lines, and saved _advapi32.py, then I was able to compile my Python v3.7 code (using pyinstaller 3.4) with no issues.
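If you would rather locate such stray carriage returns programmatically than eyeball them, a sketch along these lines could help. It assumes, as described above, that the culprit is a doubled CR in front of the LF; the function names and the sample bytes are made up for illustration.

```python
def find_doubled_cr_lines(data):
    """Return 1-based, LF-delimited line numbers whose ending is CR CR LF."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        if line.endswith(b"\r\r"):
            bad.append(number)
    return bad

def normalize_line_endings(data):
    """Collapse CR CR LF into a plain CR LF."""
    return data.replace(b"\r\r\n", b"\r\n")

# Hypothetical sample: a line continuation followed by a doubled CR.
sample = b"x = \\\r\r\n    1 + 2\r\n"
print(find_doubled_cr_lines(sample))
```

Read the file with open(path, 'rb') so Python does not translate the line endings before you get to inspect them. The line numbers count LF-terminated lines only, so they may not match the number reported in a traceback.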

This is not something you’ll experience as a MATLAB user. The same company, TMW, wrote the MATLAB compiler as well as the rest. The toolboxes/packages are released together in one piece, so breaking changes that cause failures in the most obvious use cases are caught before they get out of the door.

Another example of breaking changes that I ran into: ipdb does not allow you to move cursor backward.

Again, this is the cost associated with free software and access to the latest updates and new features without waiting for April/October (the regular MATLAB release cycle). If the hassle and the extra engineering time far exceed MATLAB licensing costs, MATLAB is a better choice, especially if software is just a chore to get your company from point A to point B, and you are willing to pay big bucks to get there quickly and reliably.

Even with free software on the table, your platform choice is always determined by:

  • how much your time spent wrestling with problems is worth
  • how much flexibility you need (for customizing to your needs)
  • how much you are willing to pay for the licenses and support

In any case, the community did good work. Please consider sponsoring PyInstaller and PSF if you profit immensely from their work.


Picking an IDE for Python

The native features in MATLAB are very good most of the time: I’ve yet to hear of anybody spending time shopping for an IDE outside the official one.

Atom has the feel of Maple/MathCAD, and Jupyter Notebook has the feel of Mathematica. Spyder feels like MATLAB the most, but it’s hugely primitive.

IDLE is more miserable than a command prompt. It doesn’t even have the decency to recall command history with up arrow. It’s like freaking DOS before loading doskey.com. Not to mention that single clicking on the window won’t set the cursor to the active command line, which you have to scroll all the way down to click on the bottom line. WTF! I’d rather use the command prompt and give up meaningless syntax coloring.

IPython (in Spyder) is unbearably slow (compared to MATLAB’s editor, which I consider slow to the extent that it’s only marginally bearable for the interactive features it offers), but at least it is usable, unlike IDLE, and most importantly the output display is pprint (pretty printer) formatted so it’s legible. Just type locals() and see what kind of sh*t Python spits out in IDLE/cmd.exe and you’ll see what I mean.

I simply cannot live without the who/whos provided in IPython, but I still don’t like that it shows the accessible functions/modules along with the variables (I know, Python doesn’t tell them apart). Nonetheless it’s still weak because these are automagics that don’t return the results as Python data (they just print). Spyder’s ‘variable explorer’ is the only place I can find that doesn’t include loaded functions/modules. Python should have provided facilities to get the user-introduced variables exclusively, and left the modules to a different function, like MATLAB’s import command that shows imported packages/classes.
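A rough approximation of “user variables only, returned as data” can be filtered out of a namespace dictionary by hand. This is a sketch, not an IPython feature: the function name user_variables is hypothetical, and the filter (skip underscore names, modules, functions and classes) is just one reasonable guess at what counts as a variable.

```python
import types

def user_variables(namespace):
    """Return a dict of names that look like plain data (hypothetical filter)."""
    skipped_types = (types.ModuleType, types.FunctionType,
                     types.BuiltinFunctionType, types.MethodType, type)
    return {name: value for name, value in namespace.items()
            if not name.startswith("_") and not isinstance(value, skipped_types)}

# Small stand-in namespace; in a session you would pass globals() instead.
demo_namespace = {"x": 1, "types": types, "f": lambda: 0, "_hidden": 2, "s": "hi"}
print(user_variables(demo_namespace))
```

In a real IPython session, shell-provided entries such as In, Out, exit and quit would still slip through and need their own exclusions.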

However, the pretty printer doesn’t even come close to MATLAB in terms of the amount of dirty work disp() does to format the text and make it easy to read. Keys in a dictionary shown by the pretty printer are not right-aligned like the field names of a MATLAB struct, which is what lets us easily tell keys and values apart. For example:

MATLAB struct shows:
          name: 'S'
          size: [9 1]
         bytes: 7765
         class: 'struct'
        global: 0
        sparse: 0
       complex: 0
       nesting: [1×1 struct]
    persistent: 0

Python with Pretty Printer shows:
{'__name__': '__main__',
 '__doc__': 'Automatically created module for IPython interactive environment',
 '__package__': None,
 '__loader__': None,
 '__spec__': None,
 '__builtin__': <module 'builtins' (built-in)>,
 '__builtins__': <module 'builtins' (built-in)>,
 '_ih': ['', 'locals()'],
 '_oh': {},
 '_dh': ['C:\\Users\\Administrator'],
 'In': ['', 'locals()'],
 'Out': {},
 'get_ipython': <bound method InteractiveShell.get_ipython of <ipykernel.zmqshell.ZMQInteractiveShell object at 0x00000000059B7828>>,
 'exit': <IPython.core.autocall.ZMQExitAutocall at 0x5a3b198>,
 'quit': <IPython.core.autocall.ZMQExitAutocall at 0x5a3b198>,
 '_': '',
 '__': '',
 '___': '',
 '_i': '',
 '_ii': '',
 '_iii': '',
 '_i1': 'locals()'}
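Getting the MATLAB-style alignment in Python takes only a few lines. This is a minimal sketch with a made-up function name; it handles flat dictionaries only and makes no attempt to shorten long values the way disp() does.

```python
def format_like_struct(mapping):
    """Right-align keys on the colon, similar to how MATLAB displays a struct."""
    width = max((len(str(key)) for key in mapping), default=0)
    return "\n".join(f"{str(key):>{width}}: {value!r}"
                     for key, value in mapping.items())

info = {"name": "S", "bytes": 7765, "persistent": 0}
print(format_like_struct(info))
```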

I often convert things to MATLAB dataset() because the disp() method is excellent, such as struct2dataset(ver()). table/disp() is nice, but I think they overdid it by defaulting to fancy rich text that bolds the header, which makes it orders of magnitude slower, and it’s not using the limited visual space effectively to show more data. Python still has a lot more to do in the user-friendliness department.


Obscure differences between Kanji and Chinese characters

People who already know Chinese characters are often said to have the advantage of being able to pick up Japanese quickly. However, to learn it properly, in addition to the difference between infix (English, Chinese) and reverse Polish (Japanese) notations, it also comes with quite a bit of baggage. It’s the differences that require work to observe, such as:

  • some are made-up ‘Chinese’ characters (和製漢語),
  • some are written slightly differently, including artistic variations,
  • some have a completely different meaning,
  • some have the opposite preference for which character of a pair to use when simplifying,
  • some have drastically different overtones even though they technically mean the same thing,
  • there is a mixture of simplified and traditional characters: occasionally a character written like simplified Chinese means something totally different from the traditional Chinese one, such as 机(つくえ), which means desk, vs 機(キ), which means machines or chances depending on the context,
  • the roles of historical and modern writings are randomly reversed.

学習 is a good example. Modern Chinese considers 学 to be more colloquial (e.g. 学武功) and 習 to be more formal (e.g. 習武). Japanese is the other way round for 学ぶ and 習う: 学ぶ has a more serious tone.


Actually, the kinds of variations mentioned above also apply to regional differences in Chinese languages (such as Taiwanese, Cantonese and Mandarin). Most places agree to write Chinese in a way that can be read directly using Mandarin so that we can at least communicate on paper. So as time goes by, we lost the ability to write in Taiwanese and Cantonese. I hope it’ll change, as both dialects are very colorful. Re-expressing them in Mandarin will take away all the flavors in them.

It’s evident that humans can pick up more than one language, so there is no reason to compromise dialects in the process of standardization. People advocating to kill other languages are simpletons who believe in the kind of logic supporting a competitive system: you find ways to make your peers do worse to stay ahead, instead of improving yourself.

Different regions occasionally have different preferences for character order in phrases. Basically we have to watch out for all kinds of combinations. Like 介紹 is used in the same order for Taiwanese/Cantonese/Mandarin to mean introduction, but it’s reversed 紹介(しょうかい) in Japanese. To make it a total mindfuck, Mandarin sticks with 客人 for guests, which is used the same way as Japanese’s 客人(きゃくにん), Taiwanese mostly says 人客, while Cantonese uses both with slight overtones: 客人 is usually used as a particular noun (e.g. 呢位客人) while 人客 is often used as a collective noun (e.g. 人客嚟齊未?), most likely because 客人 sounds more formal than 人客.


Putting traditional and simplified Chinese aside, different regions have different preferences for Chinese characters. I couldn’t tell the difference between traditional Chinese characters used in Hongkong/Macau (港澳繁體) and Taiwan (台灣正體) on Wikipedia, and later learned that it was because I’ve been randomly mixing both all along and nobody ever pointed it out.

裏/着 (Hongkong) vs 裡/著 (Taiwan) are good examples. For these two, modern Japanese sided with Hongkong in the character choices for 裏(うら) and 着(ちゃく). On the other hand, 峰(みね) in Japanese sided with Taiwan’s preferred writing 峰, while 峯 is the ‘officially’ preferred writing in Hongkong.

I remember writing 峰 most of the time even when I was a kid, and only used 峯 for names that specifically call for it. We respect the original writing for names. The situation is similar in Japanese: 沢(さわ/たく) is used in most cases, and 澤(サワ) is reserved for names that specifically request to be written in this form. The only difference is that I used the official character 峯 exclusively for names, while using the off-label 峰 for the rest.

Speaking of names, there are some similar-looking characters that have the same Japanese reading (かな) but are actually different in both writing and meaning. 斉藤 and 斎藤 are different, but they are easily confused by native Japanese speakers who don’t have any Chinese language background. Here’s the table for comparison:

Character             | 齊/齐・斉           | 齋/斋・斎
Meaning               | Gathered, organized | Plain, house, recitations
Cantonese             | chai (cai4)         | jaai (zaai1)
Taiwanese             | tsè                 | tsai
Mandarin              | qi2                 | zhai1
Japanese (音読み:さい) | 斉しい・等しく       | いつき・(潔斎)物忌み

The bottom line is: as languages evolve, different regions have different preferences about what they can be sloppy about and what they must be meticulous about. They also reorder/tweak things to make them flow smoothly with their dialect. This means traps for those learning a new language that is close to what they’ve already mastered.

I came across a document called 常用漢字表 released by the Agency for Cultural Affairs (文化庁) that explains all the quirks of Kanji that I was carefully collecting on my own while taking classes. I wish I had it back in the day. Here’s the link, but I also saved a local copy of 常用漢字表 in case their website moves around in the future.


Malware deleting TrustedInstaller.exe, therefore crippling Windows

My sister’s computer was infected with a bunch of stubborn malware. Even after cleaning the offending files, a lot of things wouldn’t work.

Windows Update, sfc /scannow, and DISM /Online /Cleanup-Image all failed for unknown reasons, which I found was somehow related to the “Windows Modules Installer” service not running.

I saw something weird in services.msc: “Windows Modules Installer” didn’t exist. I know its underlying service name is “TrustedInstaller”, and noticed that a service named as such was there, but it could not be started, nor was there any descriptive information.

So I searched registry for “TrustedInstaller” and got to its entry. I noticed these two:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\TrustedInstaller]
"DisplayName"="@%SystemRoot%\\servicing\\TrustedInstaller.exe,-100"
"Description"="@%SystemRoot%\\servicing\\TrustedInstaller.exe,-101"

It means the meaningful names and descriptions I normally see in services.msc are string resources loaded from the underlying service executable file (the -100 and -101 are resource IDs inside TrustedInstaller.exe). I checked “C:\Windows\servicing” and found that “TrustedInstaller.exe” was not there at all! Of course you cannot start a service when the file does not exist at the promised path (ImagePath).

I searched the hard drive and found only one instance of the file stored somewhere (like C:\Windows\winsxs\x86_microsoft-windows-trustedinstaller_31bf3856ad364e35_6.1.7600.16385_none_90e389a7ae7a4b6c) and I tried to move the file to “C:\Windows\servicing”. However, the ownership of and the write permissions for “C:\Windows\servicing” belong to the “TrustedInstaller” account, not “Administrator”, so I took ownership, gave Administrator full rights, then moved the file over.

Everything worked after that! Just the mere trick of deleting TrustedInstaller.exe is enough to make the user miserable trying to clean the system up! “sfc /scannow” or the like requires TrustedInstaller/WIM to be working in the first place, so you cannot use it to repair TrustedInstaller/WIM problems.


Floppy Disk Drive Ribbon Cable Orientation

Hooking up a floppy drive today after a decade of disuse, I followed the notch/key on the connector/cable, but that turned out to be incorrect! It turns out I should do the opposite: force the key into the side without the notch (or trim the key)!

So stick with the conventional wisdom that the ribbon’s pin 1 (marked) should always stay close to the power connector, regardless of whether it’s IDE or FDD (3.5″ or 5.25″), EVEN IF FOOLPROOF MECHANISMS TELL YOU OTHERWISE!

 

 

 


Option 005 “Vertical Output” port of 54600 series oscilloscopes (54616B, 54616C, etc): a secret backdoor feature that new oscilloscopes lack

Over the last year, I got a couple of requests for a 54616B that specifically asked for a “vertical output” port at the back. I have never seen an oscilloscope that came with such a port, including the few hundred first generation 54600s I acquired from many different sources.

I got curious and looked it up. It turns out it’s a secondary feature of a relatively obscure option (only mentioned in the manuals; I have never seen one) called Option 005, which lets you analyze (like count lines) and trigger on common TV signals, like PAL/NTSC/SECAM, which are way obsolete today. It also seems that none of the customers asking specifically for the “vertical output” port at the back know that it is a super rare option that is normally not included, so they must be using it for something other than analog TV signal analysis.

A closer look at the user guide shows that the “vertical output” port duplicates the signal source that the scope is triggering on (e.g. channel 1), limited to what is seen by the oscilloscope. It is a secondary feature that lets you chain your signal to instruments like spectrum analyzers for further analysis.

I tried the feature myself by chaining the output to another oscilloscope. Even when the waveform was off-screen for the current vertical volts/div, the vertical output port waveform did not clip. I also played around with the 1MΩ and 50Ω input impedance settings for a 50MHz square wave. Based on what gets the square wave badly distorted, I can confirm that the vertical output signal is the analog signal after the attenuator (the amplitude changes only with Volts/div settings that cause relay clicks) but before the ADC, assuming a 50Ω load.

Wait! An oscilloscope that duplicates the input analog signals after being processed by the front end (post-attenuator, pre-ADC) to an external output port?! I don’t have to mess with the original signal path by splitting the signal (passively) or make an amplifier to duplicate the signal? Wow! How come it’s not standard (or at least a purchasable option) in modern oscilloscopes? I’d like to see what’s going on with the analog waveform before the scope processes it! Not only is it very educational, it also allows other instruments to get an accurate view of what the oscilloscope is seeing. Neat!

Installing the Option 005 is not difficult if you happen to have an unobtainium Option 005 case with labels, and the entire kit with all the necessary interconnects. However, it’s like a unicorn and I’ve never seen one. Drilling professional looking holes for it is a nightmare as we don’t have the dimensions. The hardware is also insanely hard to get, as it was made for a specialized crowd at the time and practically nobody cares about analog TV signals nowadays. Even when I can get the cards, they are most often missing the interconnects. The ribbon cable is missing for nearly all of them, and if you get a standard ribbon cable, you’ll realize the plastic retainer gets in the way of a screw on the main acquisition board, so the Option 005 card won’t slide in unless you trim some of the plastic off. PITA!

Nowadays I am already spoiled by high end gear like the MSO6054A and 13GHz Infiniiums (like the DSO81304A), but none of them has a convenient analog, post-attenuator output like a first generation 54600 with an Option 005. Given that the hardware is scarce, I’ll save it for the top of the line first generation 54600 series, namely the 54616B and 54616C.

For those who have this special need (tapping into the pre-ADC signals up to 500MHz), I can custom build these Option 005 units for you, depending on parts availability. Call me at 949-682-8145 or reach me at my business website www.humgar.com.


The mess converting decibels to voltages in test instruments (dBm, dBW, W, dBV, V)

Complex conversions between decibels and physical quantities have always been a rich source of confusion. The reason is that dB(something) is actually a loaded word with hidden assumptions:

  • dB always works on base-10
  • dB is always a relative (dimensionless) POWER quantity, the convenience scaling factor is always 10. It does NOT make sense directly on non-power quantities.
  • dB(something) is always with respect to a quantity (the something), and the reference quantity is often not written in full. Since there is an implicit reference, dB(something) can be mapped to absolute quantities.

If you are a multi-disciplinary techie like me (math, electronics, programming, computers), it’d frustrate the hell out of you when you talk to people who have been working exclusively in a narrow field for at least a decade and have a table of commonly used numbers memorized: they act like you are supposed to know how to get the numbers in the dB-variant that they use, rather than explaining to you what the field-specific assumptions are (likely because they forgot about them).

I hope this post will clear up the confusion by working out an example in test instrumentation, most commonly in RF as well, converting dBm to Volts.


Before I start, I’ll clarify the most common form of beginner confusion in EE and physics: converting between dB and voltages:

\mathrm{dB}= 20\log_{10}(V)

This looks like a definition of decibel, except the scaling factor is 20 magically for Volts. It is correct (under very commonly used assumptions) as well. Most people take it as an equivalent definition of decibels, and throw away these important assumptions behind it:

  • the reference is 1V,
  • and the resistance* (common to the voltage of interest and the reference voltage) gets cancelled

and run into troubles when they venture into those dB-variants like dBm. Technically the above should be written as dBV, but I have seen very few people use the clearer term.

The decibel formula for voltage came from

\mathrm{dBV} = 10\log_{10}(\frac{P}{P_{ref}})

where P = \frac{V^2}{R} and P_{ref} = \frac{1^2}{R}, you get

\mathrm{dBV} = 10\log_{10}(\frac{V^2/R}{1^2/R})

The R gets cancelled out and you get

\mathrm{dBV} = 10\log_{10}(V^2)

People moved the squaring out and lumped (multiplied) it with the scaling factor 10:

\mathrm{dBV} = 20\log_{10}(V)

So the whole reason why it is 20 instead of 10 is simply because P\propto V^2, and \log(V^2) \equiv 2\log(V).
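The cancellation of R is easy to confirm numerically. The sketch below computes dBV the long way (through power, with an arbitrary resistance) and through the 20-log shortcut; the function names and the test values are made up for illustration.

```python
import math

def dbv_from_power_ratio(v_rms, r_ohm):
    """dBV the long way: power in R relative to the power of 1 V in the same R."""
    p = v_rms ** 2 / r_ohm
    p_ref = 1.0 ** 2 / r_ohm
    return 10 * math.log10(p / p_ref)

def dbv_shortcut(v_rms):
    """dBV with the squaring folded into the factor of 20."""
    return 20 * math.log10(v_rms)

for r_ohm in (4.0, 50.0, 600.0):
    # The resistance drops out: both routes agree for any R.
    assert math.isclose(dbv_from_power_ratio(0.5, r_ohm), dbv_shortcut(0.5))

print(dbv_shortcut(0.5))
```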


Now back to the business converting dBm to dBV or Volts.

First of all dBm is dB(mW), NOT dB(mV). The RF/telecom people are just too lazy to write out the most important part: the physical quantity expressly, because nearly all the time, it’s the power that matters to them.

However, I often need to connect an RF generator to a high bandwidth oscilloscope, so the very self-centered RF/telecom nomenclature starts to become problematic when people from different fields need to talk to each other. Oscilloscopes see everything in volts. RF sees everything in power, often in dB.

Then we get to the (mW) part, which means the reference quantity in the definition is 1mW, which is a physical quantity with dimensions. Then how are we going to convert it to Volts? You cannot jump to the shortcut formula I illustrated above with the 20 factor this time because the reference is in mW and your quantity is in Volts.

You’ll need to convert power to voltages. To do so, you’ll need to know voltages induced by power ‘dissipated’ through a ‘resistance’ across a component (load). The missing gap is that you will need to know the load ‘resistance’ before the conversion. With that, you can use P = V^2/R, or rewritten as V^2 = PR when it’s more convenient.

All RF-related test instruments and bench function generators typically have a 50Ω output impedance, which means they also assume a matching 50Ω load since, mathematically, it provides the maximum power transfer (sadly the power is split evenly between the load and what is wasted at the instrument’s output impedance). For convenience, the amplitude you see on the instrument control panel refers to the amplitude at a 50Ω load, not what the instrument pumps out internally (that’s why you see 2Vpp when your function generator says 1Vpp if you hook it up to a low-end oscilloscope that presents 1MΩ by default).
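The 1Vpp versus 2Vpp effect is just a voltage divider between the instrument's output impedance and the load. A minimal sketch, assuming an ideal source that internally drives twice the panel setting behind a purely resistive 50Ω output (the function name and parameters are hypothetical):

```python
def load_voltage(panel_setting, load_ohm, source_ohm=50.0):
    """Voltage seen at the load for a given panel amplitude setting."""
    # The panel value assumes a matched load, so the internal source is 2x.
    open_circuit = 2.0 * panel_setting
    # Plain resistive divider between output impedance and load.
    return open_circuit * load_ohm / (load_ohm + source_ohm)

print(load_voltage(1.0, 50.0))    # matched load: the panel setting itself
print(load_voltage(1.0, 1e6))     # 1 MOhm scope input: nearly double
```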

Since we are dealing with continuous wave (not transient power), all amplitude quantities on RF test instruments are in RMS (power or voltage) unless otherwise specified. So the quantities we have for dBm is

\mathrm{dBm} = 10\log_{10}(\frac{P_{rms}}{1\mathrm{mW}})

when written in terms of voltages,

\mathrm{dBm} = 10\log_{10}(\frac{V^{2}_{rms}/50\Omega}{1\mathrm{mW}})

Instead of splitting it into 3 terms and immediately grouping the constants, I’d like to first convert dBm to dBW:

\mathrm{dBW} = 10\log_{10}(P/1\mathrm{W})

\mathrm{dBm} = 10\log_{10}(P/0.001\mathrm{W})

The linear quantity in dBm is artificially scaled 1000 times bigger than in dBW, to put it on a comfortable scale for working with smaller signals. Therefore dBm is always 30dB higher than dBW (the smaller the reference, the bigger the relative numbers look).

So back to the above in dBm, we subtract 30dB to get to dBW:

\mathrm{dBm} = \mathrm{dBW} + 30\mathrm{dB}

where

\mathrm{dBW} = 10\log_{10}(V^{2}_{rms}/50\Omega)

We can separate the load and put it on the left hand side

\mathrm{dBW} + 10\log_{10}(50\Omega) = 10\log_{10}(V^{2}_{rms})

The right hand side is dBV, and you can think of the load as scaling the power up into (inducing) the voltage-squared quantity (V^2 = PR, or \log(V^2) = \log(P) + \log(R)).

10\log_{10}(50\Omega) is 16.9897dB; for most purposes I’ll just say the load lifts the dBW by 17dB when turning it into dBV.

Having both together,

\mathrm{dBW} + 17\mathrm{dB} = \mathrm{dBV}
\mathrm{dBW} = \mathrm{dBm} - 30\mathrm{dB}

\mathrm{dBm} - 30\mathrm{dB} + 17\mathrm{dB} = \mathrm{dBV}
(This is how you should remember it, so you can replace the +17dB for 50Ω with
10\log_{10}(R) when you work on other applications, like 600Ω, 4Ω, 8Ω for audio.)

Basically:

-30dB to undo the milli- prefix (the small reference value bloated the numbers)
+17dB to account for the load inducing the voltage by burning Watts

The end result (for the 50Ω case):

\mathrm{dBV} = \mathrm{dBm} - 13\mathrm{dB}

Then you can convert dBV to V_{rms}:

\mathrm{dBV} = 10\log_{10}(V^2_{rms}/1^2) = 20\log_{10}(V_{rms})

V_{rms} = 10^{\frac{\mathrm{dBV}}{20}}

V_{rms} = 10^{\frac{\mathrm{dBm}-13\mathrm{dB}}{20}}
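The whole chain (dBm to dBW to dBV to volts) fits in a few lines of Python. This is a sketch of the derivation above rather than a library routine: the function names are made up, the load is assumed purely resistive, and it defaults to 50Ω but can be swapped for other applications.

```python
import math

def dbm_to_vrms(dbm, load_ohm=50.0):
    """Convert a dBm power level into RMS volts across a resistive load."""
    dbw = dbm - 30.0                          # undo the milli- prefix
    dbv = dbw + 10.0 * math.log10(load_ohm)   # load lifts power to voltage-squared
    return 10.0 ** (dbv / 20.0)

def vrms_to_dbm(v_rms, load_ohm=50.0):
    """Inverse direction, straight from the definition of dBm."""
    power_watts = v_rms ** 2 / load_ohm
    return 10.0 * math.log10(power_watts / 1e-3)

# 0 dBm is 1 mW; into 50 ohms that is sqrt(0.001 * 50) volts RMS.
assert math.isclose(dbm_to_vrms(0.0), math.sqrt(0.05))
# Round trip through both functions, also for a 600 ohm audio load.
assert math.isclose(vrms_to_dbm(dbm_to_vrms(7.0, 600.0), 600.0), 7.0)

print(dbm_to_vrms(0.0))
```

Keeping the -30dB and the 10·log10(R) steps separate in the code mirrors the way the formula is meant to be remembered, instead of hard-coding the 13dB that only holds for 50Ω.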

Phew! That’s a lot of steps to get to something this simple. So the moral of the story is that these assumptions cannot be ignored:

  • The quantity is always power in dB, not voltages
  • dB(mW) has a reference of 1mW. The smaller the reference, the bigger the numbers
  • RMS voltages and power are used in RF
  • 50Ω is the load required to convert from power to voltages

Keysight already has a derivation, but it’s just a bunch of equations. The gap I want to fill in this blog post is that people find this so confusing they’d rather believe a formula or a table pulled from the internet: it doesn’t have to be this way once you realize that there’s a bunch of overlooked assumptions.


* Technically I should call it (load) impedance Z, as in RF, capacitive and inductive elements are nearly always involved, but I want to make it appealing to those with high school physics background.
