Installing SoftEther VPN Server in Asus Merlin WRT

I’m experimenting with SoftEther because its user experience and interface are a lot more polished than WireGuard’s and OpenVPN’s, and it has wide platform coverage. The Windows administration interfaces are sensibly designed (organized conceptually), unlike the Linux software culture that basically pretends to have a user interface when it’s just a step away from editing the raw config file. SoftEther’s documentation also has nice graphical illustrations and is oriented around use cases. The best part of the docs is that they are short and to the point.

Here I have a use case that’s not quite as common among SoftEther users, so I might as well do a quick write-up so that if I run into this again in the future, I don’t have to redo the research.

Use case: router doubling as VPN server

I made a diagram based on my current understanding of how an Ethernet router works and what needs to be done to have the router double as a SoftEther VPN server.

No guarantee that the model and the lingo are correct, and I’d appreciate comments to help me improve as I’m still learning (my background is in algorithms and non-networking software development, more on the DSP side plus light embedded systems, so this area is new to me).

  • The part shaded in blue is the new part we are building.
  • Installing SoftEther from Entware is just the square block.
  • You need to modprobe tun to let the SoftEther Server Admin software create the TAP port (a made-up Ethernet card).
  • When the TAP port (say tap0) is created, what SoftEther ‘bridges’ is NOT the LAN, but the orange link in the diagram that goes to the virtual hub (which belongs to a VPN server instance). This lingo confusion cost me days.
  • The ‘bridge’ on SoftEther’s side tells the incoming VPN connection which ‘Ethernet card’ (which turns out to be the TAP interface) on the host computer should act on its behalf.
  • It felt odd that SoftEther did not ask me which local network the TAP interface should join, so I suspected the TAP was just sitting there not talking to anybody. That turned out to be why my incoming VPN connection succeeded but I was not getting DHCP assignments.
  • I bit the bullet and worked out how a Linux router works by treating it as a computer with 5 Ethernet cards. One important piece of the puzzle is that the 4 LAN ports aren’t talking directly to the WAN; instead they form a bridge (software switch), and the bridge represents them and talks to the processed WAN traffic.
  • So the missing link is the double line on the diagram where I add the TAP interface to the LAN bridge, namely brctl addif br0 tap_tap0. SoftEther adds a tap_ prefix to the TAP interfaces it creates, so tap0 in SoftEther shows up as tap_tap0 on the router.
  • One more non-obvious thing: you need to run the brctl command a few seconds (using a sleep delay) after the SoftEther VPN Server service starts, and nowhere else. The TAP has to exist before you put it on the LAN bridge, and SoftEther correctly keeps the TAP alive only as long as it is needed, which is very responsible.

What to watch out for in this use case

SoftEther’s interface does support creating a TAP adapter, but it shows scary warnings because this is an unusual setting.

TAP depends on the TUN module being loaded first, but Merlin-WRT’s firmware does not load it out of the box.

Some other websites tell you to install the packages ip-full (for the ip command) and OpenVPN (for the TAP adapter), but this is not necessary in some newer releases of Merlin-WRT. It’s all there, just waiting for you to modprobe tun (load the TUN/TAP kernel drivers) before you can create TAP adapters.

If you don’t have the TUN module loaded first, the newly created bridge will show ‘Error’ with no explanation, which is confusing.

I figured out that this was the missing part causing the Error status by learning how TAP interfaces are created on Linux. I speculated that the Windows remote server admin interface (Server Manager) calls this under the hood:

ip tuntap add dev {YOUR TAP DEVICE NAME GOES HERE} mode tap

and tried to imitate the call and researched the error messages.
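Here is a minimal sketch of that kind of check, so the ‘Error’ status is less of a mystery next time. has_tun_device is a made-up helper name, and I am assuming the TUN/TAP driver exposes the usual /dev/net/tun character device once the module is loaded (your firmware may differ). The throwaway TAP name taptest is arbitrary.

```shell
# Hypothetical helper: succeeds if the TUN/TAP device node exists.
# The path defaults to the conventional /dev/net/tun (an assumption).
has_tun_device() {
    dev="${1:-/dev/net/tun}"
    [ -c "$dev" ]
}

# On the router, roughly what I did by hand (commented out on purpose):
#   modprobe tun
#   has_tun_device && echo "TUN/TAP ready" || echo "TUN/TAP missing"
#   ip tuntap add dev taptest mode tap   # create a throwaway TAP
#   ip tuntap del dev taptest mode tap   # and remove it again
```

If the ip tuntap line fails before modprobe tun and works after it, you have found the same missing piece I did.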

The next hard part is that the TAP adapter created by SoftEther is not tied to anything in the router when freshly created by “Local Bridge Setting”! It’s as if you had just added an extra network card to a computer and set up the drivers: it doesn’t interact with anything on your network until you plug a cable into the right port!

As the last step you will need to SSH into the router to put in brctl addif br0 tap_tap0.

Obvious preparations

  1. Prepare USB storage (format it with amtm) to host Entware if not already done
  2. Install Entware (amtm has an installer for it)
  3. Install softethervpn5-server through Entware (opkg install softethervpn5-server)

Enable TUN/TAP drivers

The TUN/TAP kernel module is not loaded by default, so we somehow need to add modprobe tun to the startup scripts.

Out of the box the router is read-only so you cannot get it to remember the startup scripts unless you turn on /jffs, a small (like 64MB) onboard non-volatile memory to store user data such as startup scripts.

After you turn on /jffs, you will see tapping points to the startup process provided by executable files located in /jffs/scripts:

If you haven’t installed anything that has written to services-start (the earliest point), you can install spdmerlin (from amtm), a tool that adds a customized router admin page with a dashboard of admin goodies. It will create services-start and make it executable if it’s not already there, so you can tap in the modprobe tun line.

If you want to do this yourself, make sure you spell ‘services’ with the plural ‘s’ (the pre-existing ‘service-event’ which the ‘service’ is singular might tempt you to imitate it, which is incorrect) and chmod +x services-start to make the script executable.

I use nano to sneak modprobe tun into /jffs/scripts/services-start (I also tried init-start and it works too since modprobe is very early kernel stuff). Do whatever that’s convenient for you as long as you can sneak modprobe tun in:

I recommend rebooting right away and then running lsmod | grep tun to make sure the module is indeed loaded. If you can’t spare a reboot (which takes about 5 minutes), you can simply run modprobe tun at the terminal right away and trust the startup script to do it on the next reboot.
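If you would rather script the edit than open nano, here is a minimal sketch. add_tun_hook is a made-up helper name, and it assumes the scripts folder is /jffs/scripts unless you pass another directory. It creates services-start if missing, adds modprobe tun only once, and makes the file executable.

```shell
# Hypothetical helper: ensure <dir>/services-start exists, is executable,
# and contains a "modprobe tun" line exactly once.
add_tun_hook() {
    script="${1:-/jffs/scripts}/services-start"
    # Create the script with a shebang if nothing has created it yet
    [ -f "$script" ] || printf '#!/bin/sh\n' > "$script"
    # Append the line only if it is not already there (leading blank
    # line guards against a file that lacks a trailing newline)
    grep -qx 'modprobe tun' "$script" || printf '\nmodprobe tun\n' >> "$script"
    chmod +x "$script"
}

# Usage on the router:
#   add_tun_hook              # edits /jffs/scripts/services-start
#   modprobe tun              # load it now without rebooting
#   lsmod | grep tun          # confirm the module is loaded
```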

Use SoftEther Server Manager to remotely configure the softethervpn5-server installed on the router

The server program on the router did not ask for a password, yet SoftEther asks for it. This UI design is actually a little confusing. Turns out you enter an empty password on the first access/run and the user interface will ask you to create a proper password (just like some routers’ admin pages do).

The first time you set it up, you will be greeted by a Wizard which I cannot find again. This wizard is equivalent to ‘Create a Virtual Hub’ -> [‘Manage Virtual Hub’ -> Add Users] -> Local Bridge Setting. However, you want to skip the last step (create bridge) in the wizard because the wizard version caters to basic users and doesn’t offer the option to make a TAP adapter for the bridge.

By exiting the wizard at the bridge creation step, you’ve created the ‘Virtual Hub’ (SoftEther treats a ‘Virtual Hub’ as an instance of a VPN server, and you can run several in parallel. Confusing lingo for beginners, but it might make sense within the logic of the architecture). Click on ‘Local Bridge Setting’ to finish the step that was not done by the Wizard.

Bridging is a matter of hardware, so it’s universal across all Virtual Hubs (or VPN server instances). This is why it’s at the top level, outside your VPN server instance (Virtual Hub) configs.

SoftEther is trying to be helpful, but we know we are doing something unusual here (using the router itself as a computer that plugs into the router). Don’t get scared by the warning; just click Yes to continue.

If you remember to start the TUN kernel module, the Status should turn from Error to Operating in a split second. If it stays in the Error, go back and check if you have TUN running properly.

Oversimplified view of LAN bridges

From an end-user perspective, a bridge can be thought of in terms of switches, even though the order of evolution is the other way round.

You can think of a bridge as a switch where the computer that hosts it gets a free ride on the switch without the extra physical switch NIC port, physical ethernet cable, and physical device/computer NIC port. I called it ‘implied’ in the diagram on top of this post.

Say for example with 1 ethernet adapter computer A connects to the upstream (say Internet and home network managed by a router above) and let’s call it NIC-A. Then we install an extra network card/interface called NIC-B that’s for serving other devices.

By creating a bridge BR0 formed by NIC-A and NIC-B, you create the illusion of NIC-A behaving as two network cards, NIC-A and NIC-B, with 2 cables connected directly to the upstream LAN, even though only 1 card (NIC-A) is physically connected to it. So within the bridge BR0, Computer A acts as a software switch on which it gets a free ride (implied), and the downstream Computer B rides on the switch served by Computer A.

Add the TAP interface to the existing LAN bridge

brctl is the command line interface for managing bridges

addif adds an interface to the bridge. Here’s the manual page for the syntax:

In this forum post, user miscell reversed the order and typed the interface first, which gets you an ioctl error complaining that you’re trying to write to an unwritable file.
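To avoid that argument-order mistake, here is a small sketch of a guard around brctl addif. add_to_bridge and the SYSNET variable are made-up names; the check relies on my understanding that Linux exposes a bridge subfolder under /sys/class/net/<name> only for bridge devices, which I have not verified on every firmware.

```shell
# Hypothetical wrapper enforcing "bridge first, interface second".
# SYSNET can be overridden for testing; it defaults to /sys/class/net.
add_to_bridge() {
    bridge="$1"
    iface="$2"
    net="${SYSNET:-/sys/class/net}"
    # Refuse to run if the first argument is not actually a bridge
    if [ ! -d "$net/$bridge/bridge" ]; then
        echo "'$bridge' is not a bridge; usage: add_to_bridge <bridge> <interface>" >&2
        return 1
    fi
    brctl addif "$bridge" "$iface"
}

# Usage: add_to_bridge br0 tap_tap0
# Afterwards, brctl show lists the bridge and its member interfaces.
```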

However, since we are on a router, these changes won’t stick after a reboot, so we need to put the command somewhere persistent. It turns out to be a colossal pain in the butt to figure this out because the tap0 adapter is correctly programmed to exist only while the SoftEther VPN server service is running and to disappear when the service stops. In other words, the TAP adapters created and managed by SoftEther are ephemeral.

Since the TAP does not exist before the SoftEther VPN Server service (S05vpnserver) starts and vanishes when the service stops, the ONLY place you should attach the bridging operation is within the start) section of /opt/etc/init.d/S05vpnserver, right after the core service has completely finished starting so the TAP is fully created. I monitored the output of ifconfig and realized I needed a few seconds of delay before adding the TAP interface to the bridge because the TAP interface has to exist first. Add the highlighted line in the right place

with the chunk repeated here if your LAN bridge is called br0 and the TAP is called tap0 in SoftEther:

sleep 3
brctl addif br0 tap_tap0
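The fixed sleep 3 worked for me, but a slower router might need longer. As a hedged alternative, this sketch polls for the interface instead of guessing the delay. wait_for_iface and SYSNET are made-up names, and I am assuming interfaces show up under /sys/class/net on your firmware.

```shell
# Hypothetical helper: wait up to <tries> seconds for an interface to appear.
# SYSNET can be overridden for testing; it defaults to /sys/class/net.
wait_for_iface() {
    iface="$1"
    tries="${2:-10}"
    net="${SYSNET:-/sys/class/net}"
    while [ "$tries" -gt 0 ]; do
        [ -e "$net/$iface" ] && return 0   # interface exists, stop waiting
        sleep 1
        tries=$((tries - 1))
    done
    return 1                               # gave up
}

# In the start) section, in place of the two lines above:
#   wait_for_iface tap_tap0 10 && brctl addif br0 tap_tap0
```

The brctl call only runs if the TAP actually appeared, so a failed service start does not leave you with a confusing brctl error.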

I also tested it, and it seems the bridge association is removed when the TAP adapter is cleaned up as the service stops, so I didn’t bother to add a brctl delif to the stop) section.

/opt actually points to your Entware folder (I chose not to show the raw path because it contains my USB partition label, which you’d have to substitute with your own), so the data is not volatile and lives in your USB Entware storage. Basically the SoftEther VPN server service script lives in Entware’s init.d as S05vpnserver.

Double-check the naming on your router with, say, ifconfig instead of trusting the tap_ prefix, which might not be universal across routers. Also check that your router’s LAN bridge is indeed named br0 and replace the interface names accordingly. You can also adapt this to other routers as long as you know where to sneak in the startup scripts.


Bonus: Firewall instructions

The firewall rules in MerlinWRT just quit working so the table I entered doesn’t do anything when I turn the firewall on. It doesn’t seem like it’s placing firewall exceptions the way I intended.

There’s also another weird behavior: if the port is blocked by the firewall, the server admin program intermittently still connects, but it connects to a blank-state server (blank config). WTF!

You won’t run into these problems if the firewall is turned off, but if you want to keep the firewall on, here are the SoftEther VPN Server Firewall instructions.

Suggestion to SoftEther: Add a LAN bridging UI to the TAP option

Since this is an unusual concept, I copied the diagram from 3.6 Local Bridges – SoftEther VPN Project and overlaid it to illustrate that the ‘Local (VPN) Bridge’ has nothing to do with your LAN bridge, which the TAP adapter must join to do anything useful.

Right now there’s too little help on this topic, which SoftEther considers advanced. It turns out that putting SoftEther on a router isn’t too uncommon a thing to ask for once people find out that it’s not impossible.

It’d save us who want to put SoftEther on a Linux router a lot of grief if SoftEther has an extra UI section in the dialog with a pulldown menu that states what bridge it can optionally join:

This is better done inside SoftEther instead of outside it because users then do not have to anticipate the names of the TAP adapters that administrators create in the UI. Don’t worry that this extra option of adding it to a LAN bridge could confuse new users; the lack of such an option is way more confusing because a TAP adapter gets created that connects to nothing, which shoves new users into a dead end!

In the worst case you can throw a dialog box when users choose a non-blank item from the bridge list saying that this is for advanced users and make sure you know what you are doing (it’d be helpful to remind this could be used for installing SoftEther server on a Linux router).

Extras (feel free to skip it): First-time Wizard

Under no circumstances should you pick one of the LAN ports like /eth0 to bridge. It makes no sense (btw /eth0 is usually the WAN interface); I tried it just out of curiosity and it bricked the router with a boot loop (luckily there’s self-recovery to a fresh state after a few crashes).

The wizard isn’t that useful as soon as you notice the so called ‘VPN server (instance)’ is called ‘Virtual Hub’ and the buttons on the screen make intuitive sense that requires little explanation.


NTLite Post-Setup Trap (Machine vs User)

There is a conceptual trap in NTLite’s registry and post-setup section: anything user-specific is not handled properly until an account is established (say, created by NTLite itself)! You might even run into this dead end if you tell NTLite to install user-interactive programs before an account exists.

How can you install software that did not come with Windows without an account in place and Windows will know where to put the user-specific registry settings? Silent installers might get stuck as they don’t know how to handle it gracefully (such as aborting the particular installation and move on). So you might need to hard reset to interrupt the confused silent installers until there are none left.

Machine vs User

First of all, NTLite’s UI does not educate the user on the concept of ‘Machine’ vs ‘User’, nor on the implications and the bad things that happen if you conflate the two.

TLDR: ‘Machine’ vs ‘User’ in the post-setup section is a matter of WHEN (before or after a *user* account is active), not a matter of WHICH (machine-wide or user-specific)! This is what the UI in NTLite doesn’t tell you and I had to figure it out on my own!

More specifically, according to the forum, ‘Machine’ refers to a special SYSTEM account (a kind of service account) which you cannot interact with the way you can with the built-in Administrator (a typical user account that’s a member of the interactive user group). When your user-interactive programs/installers try to write as the SYSTEM account (which is also where their settings go), it’s hard to predict how they will react when they learn the hard way that they shouldn’t/couldn’t/wouldn’t!

Of course if an operation is purely system-wide with no user-specific components, you can eagerly put them in the ‘Machine’ bin if you are 100% on top of it and know the operation has ZERO side effects/dependencies that are user-specific, but this is not necessary nor helpful to do so if it’s not a scenario that you absolutely have to. e.g. enabling an account before you have an active account is a good use of the ‘Machine’ section.

Machine refers to system-wide operations that are not tied to specific accounts, something like adding an account (or enabling built-in administrator account before the NTlite bug was fixed) or enabling Powershell scripts to run in the unattended process (yes, it’s disabled by default which is super-annoying).

You can think of it as if you are booting into Windows Recovery Environment (which is a kind of WinPE) and enter things in the command prompt (before you hook to a specific installation of Windows and log on as a specific user if applicable). Whatever that won’t work in WinPE/RE, you shouldn’t put it in the ‘Machine’ section of post-setup either.

Just like you don’t want to install Microsoft Office through that minimal Windows scaffolding (WinPE), the ‘Machine’ section is not where you manage things that interact with the user.

Keep the ‘Machine’ section to a minimum and restrict it to things whose implications you fully understand. ‘Machine’ is the place for things that would leave you with a chicken-and-egg problem if you didn’t do them before a user account is established/active. Don’t use this section if you don’t have to.

Say, it also doesn’t hurt to move HKLM (machine-wide) registry operations to the User section (doing it from a user account) as long as that account has administrative privileges. More things can go wrong when you run operations before the system even has an active account.

User refers to what goes on after a *user* account is established. It’s like logging into your account (like Administrator) and start running programs there. If you’d install that program or run that command after you logged in as a user if you were to do this unautomated, this is the section where you should put in such commands.

User interface

The UI of post-setup is easy to miss/misunderstand. It’s so badly organized that it leads people to do the wrong thing and land in cryptic errors or produce output images that don’t behave the way they anticipated. It’s another one of those design choices that are convenient for the programmer and miserable for the user.

This clumsy UI design encourages the users to randomly dump the commands/programs with no regard to the distinction between ‘Machine’ and ‘User’ sections onto the Post-Setup page. For ages I thought programs go to the top and commands go to the bottom!

This is actually how the UI is structured: you are editing the page like a spreadsheet in Ribbon-enabled Microsoft Excel, not going through an installation wizard!

NTLite subdivides the post-setup screen into two halves (tables).

What’s so clumsy about this is that they don’t let you double click and add a new text command entry in the relevant sections (machine vs user) yet they let you drag and drop files into the sections! You also cannot drag-and-drop (move and insert) lines and you have to rely on “Move Up”/”Move Down” button. That’s the shortcut I’d take if I’m in a hurry to rush the program out to meet a hard deadline and there’s only 24 hours in a day, but yuck!

What makes even less sense is that you can highlight a line and hit delete, yet you can’t right-click on the line for a context menu to remove it. Instead you have to look for the ‘Remove’ button in the ribbon if you wish to delete a line with your mouse:

The same goes for ‘Select All’: the shortcut Ctrl+A works, so I never paid much attention to the ribbon bar, which caused me to overlook the distinction between the two tabs for ages!

To add text commands, you have to use the ‘Add’ button from the ribbon, but you have to watch out which tab you are in (circled in the screenshot above with the matching color code)! The ‘Add’ in the ‘Task – Machine’ tab looks exactly the same as in the ‘Task – User’ tab except the ‘Reset’ button says ‘(Machine)’ instead of ‘(User)’! WTF. This would logically make sense if you were editing an Excel spreadsheet, except that we aren’t! It defies the user’s expectation that this is a step-by-step wizard, not a config file editor!

This means if you click on the ‘Add’ button from the wrong tab, the entry goes into the wrong section. Guess what? People tend to go with the first thing they see without reading every detail, so every text command tends to go to the top half, which is the ‘Machine’ section! WTF.

It’s a terrible design that makes structural sense only to a designer trying to save the work of a ‘double-click and type’ UI by squeezing the clumsy menu-button ‘Add/Reset’ mechanism into the Microsoft Ribbon paradigm! You don’t want the ribbon tabs to look almost identical and use the tab for the ‘state’ information (in this case, the state info is “Does this command belong to the Machine section or the User section?”). It’s just setting the user up for failure.

Suggestion

I think it’d make more sense to simply split Post-Setup into two pages: Post-Setup (Machine) and Post-Setup (User), so that the page order follows the order of execution. This is the least-effort path from the developer’s perspective and it will promote discussion about the difference between the Machine and User sections, which is essential to make sure the output works (or works as intended)!

The UI design in NTLite’s Post-Setup sucked so hard that we might be better off just editing a text cmd file where the user can drag and drop a file into the text editor for the full path. Then have a cmd file massager/transformer/parser that strips the source paths and copies the files into a vault at sources\$OEM$\$$\Setup.

If it’s a PowerShell script, just add a banner that tells the user to call powershell.exe (first token) and pass the path of the script as a parameter. I learned the hard way that NTLite isn’t doing anything to bypass Microsoft’s new hardening that doesn’t allow PowerShell scripts to run by default. So the UI adds no value to PowerShell script handling either, as I had to run this in Post-Setup before anything else so that the PowerShell scripts added to Post-Setup actually get executed:

reg add "HKLM\SOFTWARE\Microsoft\PowerShell\1\ShellIds\Microsoft.PowerShell" /t REG_SZ  /v ExecutionPolicy /d "Unrestricted" /f

The present UI design for Post-Setup is simply counterproductive! All it does is add constraints to pretend to have a structure, where the structure adds no benefit to the use case. Splitting the page would be one of the least-effort paths for the developer, even if somebody argues that it’s beneficial to put the ‘Machine’ section and ‘User’ section on the same page. At least people would know why there are two sections and that they are not fungible!


Getting sound to work on Ubuntu VM guests in Hyper-V

It’s frustrating that there are no simple packages provided by Microsoft or Ubuntu to get something as basic as sound working on Hyper-V. It’s nuts that we still have to deal with this kind of integration bullshit in 2023 when people are claiming that Linux is usable! At the time of writing, there are still way too many rough edges on Linux that are clearly not the users’ fault!

To understand why things are the way they are, we first have to understand that Hyper-V talks to the VM over RDP under the hood, which brings the many advantages RDP provides. RDP is way less sluggish in video and sound quality than the damned X11 server, which people spent little time providing for Windows. MobaXterm is the only decent free X11 server for Windows in terms of user experience, but it’s not the first one that populates search results.

In other words, we’ll need to set up Linux to stream the apps through RDP, not X11 to take advantage of Hyper-V manager’s interface (specifically vmconnect.exe under the hood). In Linux, this area is not maturely developed and the author of xrdp is not keen to make polished packages so most of the time we get pointed to the struggle of compiling the source code.

Given RDP is not native to Linux, xrdp did not rebuild the guts of X11 to map natively to RDP. Instead it launches a bare-minimum RDP session with nothing on it other than a basic client (sesman) that you can use as a VNC client or X11 client that streams to the bare-minimum RDP session, so there’s an extra layer of indirection. You are not running RDP natively. That RDP session is for you to call X11.

Since X11 does not natively stream audio, you need a (networked) sound server that streams the audio to the bare-minimum RDP layer of xrdp, and then have that layer relay it over RDP with pulseaudio-module-xrdp (a PulseAudio module, not a kernel module, which you need to build from source at the time of writing since there are no packages for it).

By the way, if you get this xrdp+pulseaudio ordeal working, you also get (x)RDP working for the linux even if you don’t use vmconnect to connect to it, because it’s RDP under the hood anyway.

I followed the messy (and often broken) instructions from multiple sources to build and install pulseaudio-module-xrdp, but it turned out this wasn’t enough. There’s no sound and the anticipated sound device didn’t show up in Ubuntu and all I see is a dummy sound driver.


Turns out there are many pieces of the puzzle scattered across different blogs, and each blog either has typos, is missing a key component, or has URLs that changed so it’s broken. Here’s an overview of what you need to do:

  • Replace PipeWire completely with PulseAudio on Ubuntu
  • RDP Enhanced Session is required for sound. This is true for Windows guests as well. Linux doesn’t have RDP, so you’ll need xrdp before you can even talk about Enhanced Session. Vsocket is the transport Enhanced Session talks over. On the Linux side we configure xrdp to listen on vsocket instead of raw RDP over port 3389. On the Windows side we enable Enhanced Session (if not already enabled) and set the transport to hv-socket (the Windows side of vsocket).
  • Since xrdp ‘cheats’ by redirecting X11 instead of implementing RDP from the core, you’ll need to relay the PulseAudio output from Ubuntu itself to the RDP layer, which is done by pulseaudio-module-xrdp. Unfortunately it does not come with xrdp and there’s no package, so you have to build it yourself and then install your compiled product. Remember that source code repos are disabled by default, so you need to enable them first before following any build instructions.

Get rid of PipeWire Completely & Install PulseAudio

Griffon’s IT library provided the insight that we need to take out PipeWire (the competitor of PulseAudio) completely and replace it with PulseAudio. After that I got it working. Here’s a path to his tutorial:

XRDP – Bring back xRDP sound redirection on Ubuntu 22.10 – Griffon’s IT Library (c-nergy.be)

His tutorial included the script he made to install a more recent xrdp he built but the link is now broken. So what I did turned out to be necessary after all.

His tutorial basically stops/disables/masks the hell out of PipeWire so it’s dead and deader, and makes sure users cannot install it later and displace PulseAudio. It’s a lot of gymnastics because systemctl disable and mask do not take wildcards, so you have to find each service/daemon named pipewire. I’ll take this shortcut instead:

sudo apt purge pipewire
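For anyone who would rather follow the tutorial’s stop/disable/mask route than purge, here is a rough sketch of how the missing wildcard could be worked around. pipewire_units is a made-up helper name, and I am assuming the PipeWire units run as user services whose names start with pipewire (I did not go this route myself).

```shell
# Hypothetical filter: keep only unit names starting with "pipewire"
# from "systemctl list-unit-files" style output on stdin.
pipewire_units() {
    awk '$1 ~ /^pipewire/ { print $1 }'
}

# Usage (commented out since it changes your session services):
#   systemctl --user list-unit-files --no-legend | pipewire_units |
#   while read -r unit; do
#       systemctl --user disable --now "$unit"
#       systemctl --user mask "$unit"
#   done
```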

An optional tool to check whether an audio server is running is:

pactl info
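The line to look at in that output is ‘Server Name’. As far as I know, a real PulseAudio reports itself as pulseaudio, while PipeWire’s compatibility layer reports something like ‘PulseAudio (on PipeWire …)’. Here is a small sketch that pulls out just that line; audio_server_name is a made-up helper name.

```shell
# Hypothetical helper: print the "Server Name" value from pactl info output
# supplied on stdin.
audio_server_name() {
    sed -n 's/^Server Name: //p'
}

# Usage:
#   pactl info | audio_server_name
```

If the name still mentions PipeWire after the purge, PulseAudio has not taken over yet.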

Do some housekeeping and install both pulseaudio and xrdp if not already done. pavucontrol is for controlling the volume, which is often needed:

sudo apt update
sudo apt install pulseaudio pavucontrol xrdp

Then enable PulseAudio and start it immediately (the --now switch follows enable with start) without rebooting:

systemctl --user enable --now pulseaudio.service pulseaudio.socket

Configure Ubuntu and xrdp for Enhanced Session (nothing to do with PulseAudio itself)

I got the clue from this blog: How to install Ubuntu 20.04 on Hyper-V with enhanced session | by Francesco Tonini | Medium, but some of the details changed as time moves on so I’ll document it here for the state of the art in 2023.

Part 1 [DONE BY DEFAULT NOW]: Change xrdp to receive vsocket instead of raw RDP

TLDR: No actionable item here. Included here for educational purposes only as it was a required step before.

Enhanced Session connects over vsocket instead of the raw port 3389 that xrdp.ini defaults to out of the box. Griffon’s IT library gave the instruction to replace port=3389 with port=vsock://-1:3389 in xrdp.ini. However, this is already taken care of in the new install.sh script for linux-vm-tools:

The more recent xrdp already defaulted to use_vsock=false. Discussion: Cannot establish RDP connection to Ubuntu VM made with Hyper-V Quick Create · Issue #1260 · neutrinolabs/xrdp (github.com)
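For reference only, since install.sh now handles it, the manual edit described above could be sketched like this. set_xrdp_vsock_port is a made-up helper name, and /etc/xrdp/xrdp.ini is the location I assume for the config file. It keeps a .bak copy next to the original.

```shell
# Hypothetical helper: switch xrdp from raw port 3389 to vsock.
# The ini path defaults to /etc/xrdp/xrdp.ini (an assumption).
set_xrdp_vsock_port() {
    ini="${1:-/etc/xrdp/xrdp.ini}"
    cp "$ini" "$ini.bak" &&
        sed 's|^port=3389$|port=vsock://-1:3389|' "$ini.bak" > "$ini"
}

# Usage (from a root shell, since the file is owned by root):
#   set_xrdp_vsock_port
#   systemctl restart xrdp
```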

Part 2: install linux-vm-tools (think of it as vsocket driver)

linux-vm-tools is “Hyper-V Linux Guest VM Enhancements”, originally developed by Microsoft, which dropped support for it. It was then picked up (forked) by Hinara to support Ubuntu 22.04. Hinara’s version is more up to date.

So the first thing you do is to download (can use wget, curl, aria whatever you like as the downloader) the install.sh of the latest/appropriate version (right now it’s Ubuntu 22.04):

wget https://raw.githubusercontent.com/Hinara/linux-vm-tools/ubuntu20-04/ubuntu/22.04/install.sh

The answer in Enable Sound Output Hyper-V (ubuntuforums.org) is a little outdated as it involves a hack to rename 20.04 to 22.04 and keep the same script.

Note that the folder name at the top level says 20.04 (branch name) but at the lower level says 22.04 (folder name). So be prepared the path might change in the future if the author figured it’s better to use a consistent branch name later. The safest bet is to go to the Github page and discover the latest version then click raw to get the direct link to use with wget (or any file downloader).

Downloads by default are not executable for your safety, so enable the execute attribute with chmod +x:

sudo chmod +x install.sh

and of course execute the install.sh after you checked it’s all kosher:

sudo ./install.sh

The last 2 lines of the script tell you to RUN IT AGAIN AFTER REBOOT. It’s easy to overlook given the text doesn’t stand out after the user has been bombarded with lots of verbose info. Make sure you follow it!

Configure Hyper-V for Enhanced Session (Host)

On the Windows side, you’ll need to enable the Windows version of vsocket (HvSocket) to communicate with the vsocket on the guest. Do not type the curly braces when you substitute {your VM’s name}:

Set-VM {your VM's name} -EnhancedSessionTransportType HvSocket

The chaos trying to build pulseaudio-module-xrdp

The scripts in Deploy a Linux VM on Hyper-V with Sound 20.04 Edition – techbloggingfool.com do not use the build script’s default folder choices, so an upstream change that added ‘+dfsg1’ to the folder name broke his folder scheme. Instead of fixing it, I followed the official instructions on the GitHub page but took advantage of the one very useful piece he provided: enabling the source code repos:

He has command-line scripts that enable the source code repos too. Given how little confidence I have in Linux developers coordinating naming scheme changes with people downstream, I’ll stick with the GUI, which gives a consistent interface.
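If you still want a command-line route, here is a minimal sketch of my own (not his script). It assumes the source repos are present in /etc/apt/sources.list as commented-out ‘# deb-src’ lines, which may not hold for every Ubuntu release. enable_deb_src is a made-up helper name, and it keeps a .bak copy.

```shell
# Hypothetical helper: uncomment "# deb-src ..." lines in an apt sources file.
# The path defaults to /etc/apt/sources.list (an assumption).
enable_deb_src() {
    list="${1:-/etc/apt/sources.list}"
    cp "$list" "$list.bak" &&
        sed 's/^# *deb-src /deb-src /' "$list.bak" > "$list"
}

# Usage (from a root shell, since the file is owned by root):
#   enable_deb_src
#   apt update
```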

After source repos are allowed, get the build tools (official instructions for Ubuntu):

sudo apt install build-essential dpkg-dev libpulse-dev git autoconf libtool

Grab the pulseaudio-module-xrdp source code and go into that folder:

git clone https://github.com/neutrinolabs/pulseaudio-module-xrdp.git
cd pulseaudio-module-xrdp

Run install_pulseaudio_sources_apt_wrapper.sh under the scripts folder:

./scripts/install_pulseaudio_sources_apt_wrapper.sh

(I tried the non ‘_wrapper’ version next to it. It works too, but it doesn’t relieve you from the rest of the build and install process). This is taken straight from Build on Debian or Ubuntu · neutrinolabs/pulseaudio-module-xrdp Wiki (github.com):

./bootstrap && ./configure PULSE_DIR=$HOME/pulseaudio.src
make

PULSE_DIR=$HOME/pulseaudio.src because this is the default in the wrapper script above if no arguments are specified. Stick with the defaults (working at home folder) as it’s not wise to trust people to coordinate their naming scheme consistently. The defaults are likely tested more thoroughly.

And the last step is to install your hard work (README · neutrinolabs/pulseaudio-module-xrdp Wiki (github.com)):

sudo make install

You can OPTIONALLY check your work to see if the PulseAudio modules (these are not kernel modules) are indeed installed:

ls $(pkg-config --variable=modlibexecdir libpulse) | grep xrdp
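A slightly stricter version of the check above, as a sketch only: it looks for the two module files by name instead of grepping for ‘xrdp’. The file names module-xrdp-sink.so and module-xrdp-source.so are an assumption about what the build installs, and the function name check_xrdp_modules is made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether the xrdp PulseAudio modules exist in a module directory.
# The two file names below are assumptions about what the build installs.
check_xrdp_modules() {
    dir="$1"
    missing=0
    for m in module-xrdp-sink.so module-xrdp-source.so; do
        if [ -f "$dir/$m" ]; then
            echo "found: $m"
        else
            echo "missing: $m"
            missing=1
        fi
    done
    # Exit status 0 only when both files are present
    return "$missing"
}

# Typical use (asks pkg-config where PulseAudio keeps its modules):
#   check_xrdp_modules "$(pkg-config --variable=modlibexecdir libpulse)"
```

The non-zero exit status when a file is missing makes it easy to chain the check into a longer script.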

[Optional Cleanup] If you don’t have anything in your home folder whose name starts with pulseaudio that you’d like to keep, you can clean up by removing those files with a wildcard, along with install.sh:

sudo rm -rf ~/pulseaudio*
sudo rm ~/install.sh

Summary

Here’s a consolidated ‘script’ to show the complexity. I do not recommend copying and pasting it as the dependencies might change and it might break.

# Change audio server in Ubuntu
sudo apt -y update
sudo apt -y purge pipewire
sudo apt -y install pulseaudio pavucontrol xrdp
systemctl --user enable --now pulseaudio.service pulseaudio.socket

# [Optional] check if PulseAudio was installed correctly
pactl info

# Build pulseaudio-module-xrdp and install the PulseAudio modules
# Stick with the official instructions, which assume the home folder
cd ~
sudo apt -y install build-essential dpkg-dev libpulse-dev git autoconf libtool
git clone https://github.com/neutrinolabs/pulseaudio-module-xrdp.git
cd pulseaudio-module-xrdp
./scripts/install_pulseaudio_sources_apt_wrapper.sh
./bootstrap && ./configure PULSE_DIR=$HOME/pulseaudio.src
make
sudo make install

# [Optional] check if pulseaudio-module-xrdp was installed correctly
ls $(pkg-config --variable=modlibexecdir libpulse) | grep xrdp

# Install linux-vm-tools which enables Enhanced Session
# Give linux-vm-tools its own folder to avoid confusion
mkdir -p ~/linux-vm-tools && cd $_
wget https://raw.githubusercontent.com/Hinara/linux-vm-tools/ubuntu20-04/ubuntu/22.04/install.sh
sudo chmod +x install.sh
sudo ./install.sh

# The last 2 lines of install.sh's screen output tell you to reboot and run it again
# This is automated below by making an icon in the GNOME desktop's autostart folder
# that will self-destruct after first launch
# (create the autostart folder first; it may not exist on a fresh install)
mkdir -p ~/.config/autostart
cat > ~/.config/autostart/startonce.desktop <<EOF
[Desktop Entry]
Type=Application
Name=startonce.desktop
Exec=gnome-terminal -- sh -c '~/linux-vm-tools/install.sh && rm -rf ~/pulseaudio-module-xrdp ~/pulseaudio.src ~/linux-vm-tools ~/.config/autostart/startonce.desktop && init 0'
EOF

sudo reboot

I also noticed a quirk: the first reboot after everything’s installed might be a little too fast, in which case you need to restart once more. So instead I have the desktop shortcut shut down your VM after it’s done and have you manually start it again, so it works right the first time.

I’ve created a GitHub repo for my own convenience. Feel free to use it however you like, but I’m not responsible for any damage it might cause. Better to read through the code with the help of this blog page, understand what it does, and then decide if you want to try it on your setup. I tried to make it robust, but it’s designed to be installed on freshly installed Ubuntu guest VMs.

https://github.com/wonghoi/enhanced_session_linux/


Intro to Proxmox (KVM/Qemu)

Among the common virtual machine solutions (Hyper-V, VirtualBox, Xen (Citrix), VMware, KVM/Qemu), KVM/Qemu has the reputation of being lean and mean. However, it’s not easy to set up and use.

Microsoft’s Hyper-V has an interesting approach: if you disable Hyper-V, Windows runs as raw Windows. When you enable Hyper-V, your host Windows becomes a VM session in disguise that runs locally and has first dibs on the hardware, so the overhead is unavoidable even if you haven’t launched any VM.

I tried the QEmu port on Windows and it was faster than Hyper-V even without KVM (I used HAXM). However, QEmu’s Windows port is clearly not polished. It retains a lot of the KVM lingo, which makes no sense there; it’s messy to set up TAP (VMs having direct access to my own network without any translation, as if they had their own network card plugged into my router); and it bluescreens my Windows.

So the next attempt is to try QEmu/KVM on Linux. QEmu/KVM pretty much operates as a simple executable with a gazillion command line switches. Anything you can find that is a little user friendly is just a wrapper that passes the correct command line switches.

Most of the UIs are not polished enough to organize the available features into intuitively arranged common use cases like HP/Agilent/Keysight/Windows does. Most of the time it takes as much effort to use their interface as to understand the help text spit out by the --help switch, and a lot of common use scenarios are overlooked, so the user ends up having to look up the switches or enter the commands in the shell themselves.

Libvirt is simply an XML tree that’s a near-direct translation of the command line arguments, so the entries feel more professional than saving the command line text in a shell script or batch file. I’m not impressed: it’s just more fluff, as the structure does not help users avoid studying the command line switches’ manual pages.

QtEmu had promise (the interface looked like a half-developed version of VirtualBox’s Manager), but the project is no longer maintained and the evolution of Qemu’s command line switches has escaped it, so a lot of the functions are broken or missing.

virt-manager does not have a compiled Windows binary port. There is a virt-viewer for connecting to the sessions from Windows, but it’s command line, like SPICE. This is beyond frustrating. If you want something polished like a Hyper-V manager on Windows to manage Proxmox or Linux QEmu clusters, forget about it.

Proxmox is very incomplete when compared with Microsoft’s offerings, but it’s still better than nothing. At least it has a web interface that VNCs the screen back to you.

You still cannot avoid actually understanding the long list of Qemu command switches. The UI only translates a few basic use cases into raw command switches, and it shows you the raw string with the expectation that you’d figure out what to fix if it accidentally generated an invalid combination.

There are quite a lot of features (like setting up TAP) that still require shell command line entries. For those, Proxmox provides a shell in the web interface so you don’t have to SSH separately into the Linux host.

Remember whatever Proxmox provided, it’s largely a pass through of the command line (switches) interface. Don’t get your hopes up assuming all the things a normal user would expect are there. You are expected to fill in the gaps by a lot of Googling. The UI is useful only if you know what you are looking for.

There are a few things Proxmox did right but overall it’s not an experience comparable to Microsoft’s management interface.

USB mapping

I have to give Proxmox credit for implementing PCI and USB passthrough/mapping in the UI. Doing it from the command line is often a pain in the butt: you have to dig up the device’s unique ID string and hack together the right reference to attach the hardware.
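For a sense of what the UI saves you from, here is a rough sketch of the manual route: extract the vendor:product ID from one line of lsusb output and print (not run) the matching qm command. The sample lsusb line, the VM ID 100, the usb0 slot and both function names are placeholders for illustration, not something taken from my setup.

```shell
#!/bin/sh
# Sketch: pull the vendor:product ID out of one line of `lsusb` output.
# Expects the usual "Bus 001 Device 004: ID 046d:c52b Some Device" layout.
usb_id_from_lsusb_line() {
    printf '%s\n' "$1" |
        sed -n 's/^.* ID \([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\).*$/\1/p'
}

# Print (do not run) the qm command that would pass the device to a VM.
# The VM ID and the usb0 slot are placeholders.
usb_passthrough_cmd() {
    vmid="$1"
    id="$(usb_id_from_lsusb_line "$2")"
    [ -n "$id" ] || return 1
    echo "qm set $vmid -usb0 host=$id"
}

# Sample line for illustration only
usb_passthrough_cmd 100 "Bus 001 Device 004: ID 046d:c52b Logitech, Inc. Unifying Receiver"
```

Printing the command instead of executing it lets you eyeball the ID before touching the VM’s configuration.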

VNC over WebUI is horrendous

I often hate people for squeezing every UI into a browser when they shouldn’t. VNC is one of them. The reason is that when you are remote controlling another computer, you need to make sure most keystrokes go to the VM, not the host, unless released. You cannot do this from within a browser, as Alt-F4 closes the client’s browser, not the window inside the guest VM!


Set up and Usage notes for Proxmox (KVM/Qemu)

Bump 1: Dick move from Proxmox

The first thing that tripped me up with Proxmox is that the download, despite the claim that it’s free if you don’t use their enterprise repository, comes configured as the Enterprise (paid) version out of the box, with no option to download one that’s already configured as the free/community edition.

It’s a dick move to greet ALL new users with this, hoping to scare them to consider a subscription:

I don’t think frustrating people who are trying to learn/explore the software will make them want to pay for a subscription. The best this dick move can do is to scare new users away, as they might think they did something wrong when they get something they didn’t expect. I certainly thought of throwing out Proxmox, had there been better options out there, when I ran into this, as I’m still evaluating whether I should go with Qemu or Hyper-V.

First of all, this scary message doesn’t actually block you from using Proxmox. It’s just that you don’t get updates until you either pay for their enterprise repositories or switch to the free repositories. At least you can still use the interface to gain shell access, which we’ll need to fix it (or you can go to the physical computer and enter the same thing in the local text terminal).

The difference between enterprise and free is just which servers the update repositories point to. Getting the latest and greatest is not necessarily a plus for enterprise users, so they let free users take the risks first and provide feedback so they can polish their software. Fair enough. Great model.


There are two parts to fixing this ordeal:

  • Configuring the update repositories to the free no-subscription repositories (Functional issue, and it’s per node, including slave nodes)
  • Removing the nag screen (Cosmetic, and it’s the overall Proxmox, aka the main node hosting the Proxmox management interface)

Fix Subscription Scare Part 1: Updating repositories locations

Basically, what you need to do is edit the files whose names are shown below (underlined in red), each corresponding to a repository URL path you want to change:

/etc/apt/sources.list.d/pve-enterprise.list is not necessary for free users, so you can simply comment all the lines out (there’s only one line)
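Commenting the line out can also be scripted. This is a minimal sketch, assuming the file holds plain one-line ‘deb ...’ entries; the function name disable_apt_list is made up, and it keeps a .bak copy of the original next to the file.

```shell
#!/bin/sh
# Sketch: comment out every active "deb" line in an APT list file.
# A .bak copy of the original is kept next to it.
disable_apt_list() {
    file="$1"
    cp "$file" "$file.bak" || return 1
    # Prefix active entries with "# "; lines already commented are untouched
    sed 's/^[[:space:]]*deb /# &/' "$file.bak" > "$file"
}

# Typical use on a Proxmox node (run as root):
#   disable_apt_list /etc/apt/sources.list.d/pve-enterprise.list
```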

/etc/apt/sources.list is the link to the core repository for Proxmox updates. Instead of blindly following exact instructions, which can go stale as versions progress, open the URL http://download.proxmox.com/debian/ and see what’s out there. What’s not spelled out in the web admin interface is the intermediate folder called ‘dists‘

bookworm is the latest Debian version’s code name at the time of writing

and obviously we pick the branch/sub-folder that says no-subscription (there’s no enterprise here since it belongs to a different root URL), but you still have to get the name right for the ‘Components’

You can open it with a text editor like nano:

nano /etc/apt/sources.list

and edit the Proxmox repository line (remember to skip the ‘dists‘ intermediate folder). Each space-separated word after the root URL is just a subfolder you see in the folder structure. If Debian has released a new version/codename, you might also want to update the first 3 lines of Debian repositories to match the new codename (and folder structure, if they rearranged it).
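To make the mapping concrete, here is a small sketch that derives the sources.list entry from the folder URL you browsed to. It assumes the URL has the shape root/dists/codename/component/ and that the entry is a plain one-line ‘deb‘ line; the function name url_to_apt_line is made up for illustration.

```shell
#!/bin/sh
# Sketch: derive the sources.list entry from the folder URL you browsed to.
# Assumes the URL has the shape <root>/dists/<codename>/<component>/
url_to_apt_line() {
    url="${1%/}"                 # strip one trailing slash, if present
    root="${url%%/dists/*}"      # everything before /dists/
    rest="${url#*/dists/}"       # "<codename>/<component>"
    codename="${rest%%/*}"
    component="${rest#*/}"
    echo "deb $root $codename $component"
}

url_to_apt_line "http://download.proxmox.com/debian/pve/dists/bookworm/pve-no-subscription/"
```

For the URL above this prints deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription, which is the shape of line the file expects.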

Ceph is an optional feature (not installed) yet it’s configured to be enterprise as well, so for consistency, we might want to change it to the no-subscription (free) version as well. The latest codename for ceph that was published at the time of writing is “quincy” (there’s nothing in the “reef” folder), so we click on it.

Again the ‘dists’ is boilerplate and not spelled out (so we don’t enter it) in the entries of the repository sources file.

bookworm is the current Debian version for that

and we see a “no-subscription” folder which is the one we want obviously. We can just guess by sensible names you’d choose if you were the developer.

You can again open a text editor like nano to edit the repository location file as shown in the web admin UI

nano /etc/apt/sources.list.d/ceph.list

And finally, disable /etc/apt/sources.list.d/pve-enterprise.list

Under the hood, it’s basically Proxmox adding a # (comment sign) to disable the line in /etc/apt/sources.list.d/pve-enterprise.list, similar to the procedure we did by hand:

Hit Reload and you are done with this subscription scare tactic.

Actually, for consistency, you can build your own pve-no-subscription.list to replace pve-enterprise.list: replace ‘enterprise‘ in the root URL with ‘download‘, update the Debian codename (at the time of writing it’s ‘bookworm‘), and change the components folder from ‘pve-enterprise‘ to ‘pve-no-subscription‘, which translates to crawling this repository path: http://download.proxmox.com/debian/pve/dists/bookworm/pve-no-subscription/

There’s nothing fancy and hard-coded about these names. It’s basically the URL of where the update files are stored with an intermediate folder ‘dists‘ sandwiched between the root URL and the tokens (separated by spaces) which are basically subfolder names. All it does is to attach a ‘/dists/‘ after the root URL and replace the rest of the ‘/‘ with spaces.
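The rule in the paragraph above is simple enough to write down as a few lines of shell. This is a sketch that assumes the plain one-line layout ‘deb root-url codename component‘ with a single component; the function name apt_line_to_url is made up for illustration.

```shell
#!/bin/sh
# Sketch: turn a one-line APT source entry into the folder URL it points at.
# Input layout assumed: deb <root-url> <codename> <component>
apt_line_to_url() {
    # Split the entry on whitespace; $1 becomes the literal "deb"
    set -- $1
    root="${2%/}"   # drop a trailing slash on the root URL, if any
    echo "$root/dists/$3/$4/"
}

apt_line_to_url "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription"
```

This prints http://download.proxmox.com/debian/pve/dists/bookworm/pve-no-subscription/, which you can paste into a browser to confirm the folder really exists before saving the file.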

It simply looks like the developers of the web admin UI haven’t had the time to make the table entries clickable and editable yet, and only got as far as the enable/disable button that comments out the lines in the file. You’ll see similar UI deficiencies in a lot of places later, where you have to go to the shell and do it yourself after researching the concepts.

Fix Subscription Scare Part 2: Removing the nag screen

Even after you fixed the repository locations, it’s only per node, yet the nag screen is at the top admin UI level. The Updates/Repositories interface won’t show error messages (undownloadable repositories) anymore, but the nag screen still needs to be addressed.

Luckily somebody wrote a script to deal with it, hosted on Github:

# Download script
wget https://raw.githubusercontent.com/foundObjects/pve-nag-buster/master/install.sh
# Good practice to read ANY unknown script to make sure there's no shenanigans 1st

# Then run the script with sudo
sudo bash install.sh

This blog shows the mechanisms in case some changes break the script above.

TAP Networking not in the Web UI

A TAP interface is necessary for the Ethernet card on the VM to interact directly with the router connected to the physical hardware (PHY/NIC) but with a different identity, which puts it on the same level as the other physical computers on your network. This is often useful when you want to host servers.

I’ve adapted the instructions from Extremecoders on GitHub here, as the default Ethernet device names are different:

  • Debian doesn’t use the /eth0 naming scheme anymore. It’s /enp4s0 on my machine
  • /br0 now has a prefix “vm” in front of it since it’s a virtual bridge. Proxmox created this by default. The ‘bridge’ in this case is not the Ethernet bridge we understand in Windows (which bridges two interfaces together as one); instead it’s just a virtual Ethernet switch. Once I knew this twist, I understood how to set TAP up

Since we are using the /vmbr0 which is already set up, we can skip the bridge creation and adding the physical network card /enp4s0 to the /vmbr0 ‘bridge’ (virtual switch).

The core step is to create a TAP interface. Let’s call it /tap0

tunctl -t tap0

You don’t need the “-u (username)” part unless you want to assign ownership of this specific TAP interface to a specific user.

Then you need to add this TAP (/tap0 in this example) to the ‘bridge’ (/vmbr0 in this example). ‘addif’ means ‘add interface’. ‘brctl’ means ‘bridge control’.

brctl addif vmbr0 tap0

Make sure the physical card (/enp4s0), the TAP interface (/tap0) and the ‘bridge’ (/vmbr0) are all up. Then assign an IP to the ‘bridge’ /vmbr0. If you are acquiring a new IP address from DHCP, use the DHCP client (dhclient):

dhclient -v vmbr0
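The steps above can be collected into one script. This is a sketch, not a tested recipe: it swaps in the iproute2 commands (ip tuntap, ip link) for tunctl and brctl, which do the same job and are more likely to be installed already, and it fills in the ‘bring everything up’ step. The interface names are the ones from this post, and the run helper with its DRY_RUN switch is made up so you can preview the commands before running them as root.

```shell
#!/bin/sh
# Sketch: the TAP steps as one script, using iproute2 instead of tunctl/brctl.
# Interface names are the ones used in this post; change them for your host.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"
TAP="${TAP:-tap0}"
BRIDGE="${BRIDGE:-vmbr0}"
NIC="${NIC:-enp4s0}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_tap() {
    # Same job as: tunctl -t tap0
    run ip tuntap add dev "$TAP" mode tap
    # Same job as: brctl addif vmbr0 tap0
    run ip link set dev "$TAP" master "$BRIDGE"
    # Bring up the physical card, the TAP interface and the bridge
    for ifc in "$NIC" "$TAP" "$BRIDGE"; do
        run ip link set dev "$ifc" up
    done
    # Ask DHCP for an address on the bridge
    run dhclient -v "$BRIDGE"
}

setup_tap
```

Run it once as is to read the printed commands, then run it again as root with DRY_RUN=0 to actually apply them.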

Pools for resources

You can create the virtual hard disks right where you are configuring your VM, but you can only delete them from a Pool viewer. This is the same as VirtualBox. ‘local-lvm’ is a bunch of virtual hard drive images that you need to mount to act like a hard disk drive. Your VM images live in /dev/pve

It’s a little more rigid than VirtualBox, where you can directly point to the CD image. In Proxmox you have to upload the CD image to a pool before referring to it in the VM’s settings. ‘local’ is just a folder of files (specifically /var/lib/vz), and the CD images go to /var/lib/vz/template/iso.

Actually, local and local-lvm are both defined at the root level called ‘Datacenter’.
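Since ‘local’ is just a folder, one alternative to uploading through the web UI is copying the ISO into that folder from the host’s shell. The following is a sketch under that assumption; the function name stage_iso, the ISO_DIR override and the example file name are made up for illustration.

```shell
#!/bin/sh
# Sketch: copy an ISO into the folder backing Proxmox's 'local' storage.
# The destination defaults to the path mentioned above; pass a second
# argument or set ISO_DIR to use another folder.
stage_iso() {
    src="$1"
    dest_dir="${2:-${ISO_DIR:-/var/lib/vz/template/iso}}"
    case "$src" in
        *.iso) ;;
        *) echo "not an .iso file: $src" >&2; return 1 ;;
    esac
    [ -d "$dest_dir" ] || { echo "no such folder: $dest_dir" >&2; return 1; }
    # Print the final path on success
    cp -- "$src" "$dest_dir/" && echo "$dest_dir/$(basename -- "$src")"
}

# Typical use on the Proxmox host (the file name is a placeholder):
#   stage_iso ~/ubuntu.iso
```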

Default CPU

By default Proxmox chooses x86-64-v2-AES for you, which might have better compatibility. I had trouble with the Windows port of Qemu not supporting the ‘host’ CPU type because my CPU is too new for it, but the Linux Qemu-KVM has no trouble recognizing my new CPU under the ‘host’ type. Look into the extra CPU flags that match whether you are using an AMD or Intel CPU.
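If you are not sure which vendor’s flags to look at, the host can tell you. This is a small sketch that reads the vendor_id field from /proc/cpuinfo; the function name cpu_vendor is made up, and the VM ID 100 in the trailing comment is a placeholder.

```shell
#!/bin/sh
# Sketch: report the CPU vendor from a cpuinfo-style file
# (defaults to /proc/cpuinfo on the host).
cpu_vendor() {
    file="${1:-/proc/cpuinfo}"
    vendor="$(sed -n 's/^vendor_id[[:space:]]*:[[:space:]]*//p' "$file" | head -n 1)"
    case "$vendor" in
        GenuineIntel) echo intel ;;
        AuthenticAMD) echo amd ;;
        *)            echo unknown ;;
    esac
}

cpu_vendor

# To hand a VM the host's own CPU model instead of the default
# (VM ID 100 is a placeholder):
#   qm set 100 --cpu host
```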
