I thought it might be useful to start a more general thread to discuss Linux-based audio stuff that doesn't really fall into the other topics: things like OS configuration, audio drivers, routing, and utilities.
I'm going to start with a discussion of Linux audio setup because, well, it's a good place to start!
At the end of the day, the Linux audio architecture isn't that dissimilar from Windows or Mac, but there are some practical differences which are important.
Firstly, unlike Windows, but more like Mac (or iOS/Android), when you purchase a new audio interface, there aren't any "drivers" to download and install for Linux. This is because, if the device is supported on Linux, the audio drivers are built-in.
How does this work?
For most of us, we will be using audio interfaces which connect via USB. The USB standard includes a way for a USB device to report its capabilities to the host computer when it is plugged in. This is done using a "device descriptor". Within the specification for USB Audio, the device descriptor describes all of the inputs, outputs, controls, formats and sample rates for the device.
By using the device descriptor, Linux can "suss out" what a particular audio interface is, and automatically adapt to it. This is all handled by a part of Linux called "ALSA" (Advanced Linux Sound Architecture).
In the sections below, I have described some "nerdy" detail. This is the sort of stuff that gives Linux the reputation for being complex or requiring a high level of expertise. In reality, you don't generally need to know this stuff.
One of the differences on Linux, compared to Windows and Mac, is that this sort of detail is exposed and easily accessible, whilst on other platforms it's largely hidden.
I've hidden this behind detail tags because it really is only for the inquisitive. You really don't need to know or care about it.
Example of device descriptor for a Boss Katana GO
The following is an example of a device descriptor for a Boss Katana Go. The output is from the Linux command lsusb -v and shows the header with the device manufacturer, product name and other details.
Bus 001 Device 016: ID 0582:0304 Roland Corp. KATANA:GO
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 1.10
bDeviceClass 0 [unknown]
bDeviceSubClass 0 [unknown]
bDeviceProtocol 0
bMaxPacketSize0 64
idVendor 0x0582 Roland Corp.
idProduct 0x0304 KATANA:GO
bcdDevice 1.00
iManufacturer 1 BOSS
iProduct 2 KATANA:GO
iSerial 0
bNumConfigurations 1
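For the inquisitive, here is a minimal Python sketch of how those 18 bytes map onto the fields listed above. The field order follows the standard USB device descriptor layout; the function name `parse_device_descriptor` and the hand-built `katana_raw` bytes are assumptions for illustration (the bytes are reconstructed from the lsusb output, not captured from the device).

```python
import struct

# Field order of the standard 18-byte USB device descriptor.
DEVICE_DESCRIPTOR_FIELDS = (
    "bLength", "bDescriptorType", "bcdUSB", "bDeviceClass",
    "bDeviceSubClass", "bDeviceProtocol", "bMaxPacketSize0",
    "idVendor", "idProduct", "bcdDevice", "iManufacturer",
    "iProduct", "iSerialNumber", "bNumConfigurations",
)

# Little-endian: single bytes (B) and 16-bit words (H), 18 bytes in total.
DEVICE_DESCRIPTOR_FORMAT = "<BBHBBBBHHHBBBB"


def parse_device_descriptor(raw: bytes) -> dict:
    """Unpack a raw device descriptor into a dict of named fields."""
    values = struct.unpack(DEVICE_DESCRIPTOR_FORMAT, raw[:18])
    return dict(zip(DEVICE_DESCRIPTOR_FIELDS, values))


# Assumption: bytes rebuilt by hand from the lsusb output above.
katana_raw = struct.pack(
    DEVICE_DESCRIPTOR_FORMAT,
    18, 1, 0x0110, 0, 0, 0, 64, 0x0582, 0x0304, 0x0100, 1, 2, 0, 1,
)

desc = parse_device_descriptor(katana_raw)
print(f"{desc['idVendor']:04x}:{desc['idProduct']:04x}")
```

The vendor and product IDs printed here are the same pair that lsusb shows in its `ID` column.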
This is followed by a Configuration descriptor, which contains key information about the device configuration, such as its power requirements and the number of Interfaces.
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 0x00f9
bNumInterfaces 4
bConfigurationValue 1
iConfiguration 0
bmAttributes 0xc0
Self Powered
MaxPower 0mA
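As a rough illustration of how the "Self Powered" line is derived, the sketch below decodes the `bmAttributes` byte. The bit positions are those of the standard USB configuration descriptor; the function name `decode_config_attributes` is hypothetical, and the raw power value of 0 is an assumption inferred from the "0mA" shown by lsusb.

```python
def decode_config_attributes(bm_attributes: int, b_max_power: int) -> dict:
    """Decode bmAttributes and the raw bMaxPower byte of a configuration."""
    return {
        "self_powered": bool(bm_attributes & 0x40),   # bit 6
        "remote_wakeup": bool(bm_attributes & 0x20),  # bit 5
        # For USB 2.0 and earlier devices the raw value is in 2 mA units.
        "max_power_ma": b_max_power * 2,
    }


# 0xc0 as reported above; raw bMaxPower assumed to be 0 (lsusb shows 0mA).
info = decode_config_attributes(0xC0, 0)
print(info)
```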
This is followed by Interface sections which describe the various capabilities. Firstly, the device topology is described by the "AudioControl" Interfaces. This is the "map" of how audio flows in the device:
- terminal IDs 1 & 3 are the Playback path (PC → device)
- terminal IDs 4 & 7 are the Capture path (device → PC)
- The device supports 16-bit audio at 48 kHz. Note that this device has a fixed resolution. Other devices may offer multiple options.
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 0
bInterfaceClass 1 Audio
bInterfaceSubClass 1 Control Device
bInterfaceProtocol 0
iInterface 0
AudioControl Interface Descriptor:
bLength 11
bDescriptorType 36
bDescriptorSubtype 1 (HEADER)
bcdADC 1.00
wTotalLength 0x0035
bInCollection 3
baInterfaceNr(0) 1
baInterfaceNr(1) 2
baInterfaceNr(2) 3
AudioControl Interface Descriptor:
bLength 12
bDescriptorType 36
bDescriptorSubtype 2 (INPUT_TERMINAL)
bTerminalID 1
wTerminalType 0x0101 USB Streaming
bAssocTerminal 0
bNrChannels 2
wChannelConfig 0x0003
Left Front (L)
Right Front (R)
iChannelNames 0
iTerminal 0
AudioControl Interface Descriptor:
bLength 9
bDescriptorType 36
bDescriptorSubtype 3 (OUTPUT_TERMINAL)
bTerminalID 3
wTerminalType 0x0602 Digital Audio Interface
bAssocTerminal 0
bSourceID 1
iTerminal 0
AudioControl Interface Descriptor:
bLength 12
bDescriptorType 36
bDescriptorSubtype 2 (INPUT_TERMINAL)
bTerminalID 4
wTerminalType 0x0602 Digital Audio Interface
bAssocTerminal 0
bNrChannels 2
wChannelConfig 0x0003
Left Front (L)
Right Front (R)
iChannelNames 0
iTerminal 0
AudioControl Interface Descriptor:
bLength 9
bDescriptorType 36
bDescriptorSubtype 3 (OUTPUT_TERMINAL)
bTerminalID 7
wTerminalType 0x0101 USB Streaming
bAssocTerminal 0
bSourceID 4
iTerminal 0
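To make the "map" concrete, here is a small Python sketch that follows each output terminal back to its source and labels the path. The terminal IDs, types and `bSourceID` values are transcribed from the descriptors above; the function name `classify_paths` is hypothetical, and the sketch assumes every output terminal is fed directly by an input terminal (true here, but other devices may place mixer or feature units in between).

```python
USB_STREAMING = 0x0101  # wTerminalType for the USB side of a path

# bTerminalID -> wTerminalType
input_terminals = {1: 0x0101, 4: 0x0602}
# bTerminalID -> (wTerminalType, bSourceID)
output_terminals = {3: (0x0602, 1), 7: (0x0101, 4)}


def classify_paths(inputs, outputs):
    """Return (direction, source terminal ID, output terminal ID) tuples."""
    paths = []
    for out_id, (out_type, source_id) in sorted(outputs.items()):
        if inputs[source_id] == USB_STREAMING:
            direction = "playback"  # audio enters from USB: PC to device
        elif out_type == USB_STREAMING:
            direction = "capture"   # audio leaves over USB: device to PC
        else:
            direction = "internal"
        paths.append((direction, source_id, out_id))
    return paths


paths = classify_paths(input_terminals, output_terminals)
for direction, source_id, out_id in paths:
    print(f"{direction}: terminal {source_id} -> terminal {out_id}")
```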
The "AudioStreaming" interfaces describe the device inputs and outputs, including the streaming format, sample rates, data transfer mechanism, and other technical capabilities. Each of these includes an "endpoint" descriptor, which represents the audio stream itself.
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 1
bAlternateSetting 0
bNumEndpoints 0
bInterfaceClass 1 Audio
bInterfaceSubClass 2 Streaming
bInterfaceProtocol 0
iInterface 5 KATANA:GO AUDIO
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 1
bAlternateSetting 1
bNumEndpoints 1
bInterfaceClass 1 Audio
bInterfaceSubClass 2 Streaming
bInterfaceProtocol 0
iInterface 5 KATANA:GO AUDIO
AudioStreaming Interface Descriptor:
bLength 7
bDescriptorType 36
bDescriptorSubtype 1 (AS_GENERAL)
bTerminalLink 1
bDelay 0 frames
wFormatTag 0x0001 PCM
AudioStreaming Interface Descriptor:
bLength 11
bDescriptorType 36
bDescriptorSubtype 2 (FORMAT_TYPE)
bFormatType 1 (FORMAT_TYPE_I)
bNrChannels 2
bSubframeSize 2
bBitResolution 16
bSamFreqType 1 Discrete
tSamFreq[ 0] 48000
Endpoint Descriptor:
bLength 9
bDescriptorType 5
bEndpointAddress 0x0d EP 13 OUT
bmAttributes 13
Transfer Type Isochronous
Synch Type Synchronous
Usage Type Data
wMaxPacketSize 0x00cc 1x 204 bytes
bInterval 1
bRefresh 0
bSynchAddress 0
AudioStreaming Endpoint Descriptor:
bLength 7
bDescriptorType 37
bDescriptorSubtype 1 (EP_GENERAL)
bmAttributes 0x00
bLockDelayUnits 0 Undefined
wLockDelay 0x0000
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 2
bAlternateSetting 0
bNumEndpoints 0
bInterfaceClass 1 Audio
bInterfaceSubClass 2 Streaming
bInterfaceProtocol 0
iInterface 6 KATANA:GO AUDIO
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 2
bAlternateSetting 1
bNumEndpoints 1
bInterfaceClass 1 Audio
bInterfaceSubClass 2 Streaming
bInterfaceProtocol 0
iInterface 6 KATANA:GO AUDIO
AudioStreaming Interface Descriptor:
bLength 7
bDescriptorType 36
bDescriptorSubtype 1 (AS_GENERAL)
bTerminalLink 7
bDelay 0 frames
wFormatTag 0x0001 PCM
AudioStreaming Interface Descriptor:
bLength 11
bDescriptorType 36
bDescriptorSubtype 2 (FORMAT_TYPE)
bFormatType 1 (FORMAT_TYPE_I)
bNrChannels 2
bSubframeSize 2
bBitResolution 16
bSamFreqType 1 Discrete
tSamFreq[ 0] 48000
Endpoint Descriptor:
bLength 9
bDescriptorType 5
bEndpointAddress 0x8e EP 14 IN
bmAttributes 13
Transfer Type Isochronous
Synch Type Synchronous
Usage Type Data
wMaxPacketSize 0x00cc 1x 204 bytes
bInterval 1
bRefresh 0
bSynchAddress 0
AudioStreaming Endpoint Descriptor:
bLength 7
bDescriptorType 37
bDescriptorSubtype 1 (EP_GENERAL)
bmAttributes 0x00
bLockDelayUnits 0 Undefined
wLockDelay 0x0000
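The numbers in these descriptors hang together arithmetically. The sketch below works through it, assuming one packet per 1 ms frame (the kernel log further down reports this as a full-speed device, and `bInterval` is 1). The function name `nominal_packet_bytes` is hypothetical, and reading the gap between 192 and 204 bytes as headroom is my interpretation rather than something the descriptor states.

```python
def nominal_packet_bytes(sample_rate_hz, channels, subframe_bytes,
                         packets_per_second=1000):
    """Bytes of audio carried by one packet at the nominal sample rate."""
    bytes_per_sample_frame = channels * subframe_bytes
    sample_frames_per_packet = sample_rate_hz // packets_per_second
    return sample_frames_per_packet * bytes_per_sample_frame


# Values from the FORMAT_TYPE descriptor: 48000 Hz, 2 channels, 2-byte subframes.
nominal = nominal_packet_bytes(48000, 2, 2)

# wMaxPacketSize is 0x00cc = 204 bytes; each sample frame is 2 * 2 = 4 bytes.
max_sample_frames = 0x00CC // (2 * 2)

print(f"nominal packet: {nominal} bytes")
print(f"largest packet: {max_sample_frames} sample frames")
```

So a packet normally carries 48 stereo sample frames (192 bytes), and the endpoint allows for up to 51.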
This device has MIDI capability (used by the control app to configure the device, switch patches and so on). This is described in the "MIDIStreaming" Interface blocks.
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 3
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 1 Audio
bInterfaceSubClass 3 MIDI Streaming
bInterfaceProtocol 0
iInterface 4 Generic Audio MIDI 1.0
MIDIStreaming Interface Descriptor:
bLength 7
bDescriptorType 36
bDescriptorSubtype 1 (HEADER)
bcdADC 1.00
wTotalLength 0x0041
MIDIStreaming Interface Descriptor:
bLength 6
bDescriptorType 36
bDescriptorSubtype 2 (MIDI_IN_JACK)
bJackType 1 Embedded
bJackID 16
iJack 8 KATANA:GO MIDI IN
MIDIStreaming Interface Descriptor:
bLength 6
bDescriptorType 36
bDescriptorSubtype 2 (MIDI_IN_JACK)
bJackType 2 External
bJackID 32
iJack 10 KATANA:GO MIDI OUT
MIDIStreaming Interface Descriptor:
bLength 9
bDescriptorType 36
bDescriptorSubtype 3 (MIDI_OUT_JACK)
bJackType 1 Embedded
bJackID 48
bNrInputPins 1
baSourceID( 0) 32
BaSourcePin( 0) 0
iJack 10 KATANA:GO MIDI OUT
MIDIStreaming Interface Descriptor:
bLength 9
bDescriptorType 36
bDescriptorSubtype 3 (MIDI_OUT_JACK)
bJackType 2 External
bJackID 64
bNrInputPins 1
baSourceID( 0) 16
BaSourcePin( 0) 0
iJack 8 KATANA:GO MIDI IN
Endpoint Descriptor:
bLength 9
bDescriptorType 5
bEndpointAddress 0x03 EP 3 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 0
bRefresh 0
bSynchAddress 0
MIDIStreaming Endpoint Descriptor:
bLength 5
bDescriptorType 37
bDescriptorSubtype 1 (Invalid)
bNumEmbMIDIJack 1
baAssocJackID( 0) 16
Endpoint Descriptor:
bLength 9
bDescriptorType 5
bEndpointAddress 0x84 EP 4 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 0
bRefresh 0
bSynchAddress 0
MIDIStreaming Endpoint Descriptor:
bLength 5
bDescriptorType 37
bDescriptorSubtype 1 (Invalid)
bNumEmbMIDIJack 1
baAssocJackID( 0) 48
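The endpoint lines in the listings above are decoded from just two bytes each. As a sketch, the Python below reproduces that decoding for the four endpoints of this device. The bit layout is the standard USB endpoint descriptor one; the function name `decode_endpoint` is hypothetical. Note that the synchronisation field is only meaningful for isochronous endpoints.

```python
TRANSFER_TYPES = ("Control", "Isochronous", "Bulk", "Interrupt")
SYNC_TYPES = ("None", "Asynchronous", "Adaptive", "Synchronous")


def decode_endpoint(address: int, attributes: int):
    """Decode bEndpointAddress and bmAttributes into readable values."""
    number = address & 0x0F                      # bits 0-3: endpoint number
    direction = "IN" if address & 0x80 else "OUT"  # bit 7: direction
    transfer = TRANSFER_TYPES[attributes & 0x03]   # bits 0-1: transfer type
    sync = SYNC_TYPES[(attributes >> 2) & 0x03]    # bits 2-3: sync type
    return number, direction, transfer, sync


# (bEndpointAddress, bmAttributes) pairs taken from the lsusb output above.
endpoints = [(0x0D, 13), (0x8E, 13), (0x03, 2), (0x84, 2)]
decoded = [decode_endpoint(addr, attrs) for addr, attrs in endpoints]
for entry in decoded:
    print(entry)
```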
What Linux sees when an audio interface is plugged in
When an audio interface is plugged into a Linux machine, it is detected, its device descriptor is read, and that descriptor is used to create an audio device automatically. This can be seen in the kernel logs:
[68225.749930] usb 1-8.1.1.2: new full-speed USB device number 16 using xhci_hcd
[68225.836791] usb 1-8.1.1.2: New USB device found, idVendor=0582, idProduct=0304, bcdDevice= 1.00
[68225.836802] usb 1-8.1.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[68225.836806] usb 1-8.1.1.2: Product: KATANA:GO
[68225.836810] usb 1-8.1.1.2: Manufacturer: BOSS
[68225.845499] usb 1-8.1.1.2: Quirk or no altset; falling back to MIDI 1.0
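If you want to pick these events out of a longer log, a regular expression is enough. This is a minimal sketch that assumes the log lines have the same shape as the ones above; the names `NEW_DEVICE` and `find_new_devices` are hypothetical, and the sample text is copied from this post rather than read from a live system.

```python
import re

# Matches the "New USB device found" line and captures port, vendor, product.
NEW_DEVICE = re.compile(
    r"usb (?P<port>[\d.-]+): New USB device found, "
    r"idVendor=(?P<vid>[0-9a-f]{4}), idProduct=(?P<pid>[0-9a-f]{4})"
)


def find_new_devices(log_text: str):
    """Return (port, vendor ID, product ID) for each newly found device."""
    return [(m["port"], m["vid"], m["pid"])
            for m in NEW_DEVICE.finditer(log_text)]


sample_log = """\
[68225.749930] usb 1-8.1.1.2: new full-speed USB device number 16 using xhci_hcd
[68225.836791] usb 1-8.1.1.2: New USB device found, idVendor=0582, idProduct=0304, bcdDevice= 1.00
[68225.836806] usb 1-8.1.1.2: Product: KATANA:GO
"""

found = find_new_devices(sample_log)
print(found)
```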
And, as it's a USB device, it can be seen by running the lsusb command:
$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 0a5c:21e8 Broadcom Corp. BCM20702A0 Bluetooth 4.0
Bus 001 Device 003: ID 0451:8442 Texas Instruments, Inc.
Bus 001 Device 004: ID 1397:00d4 BEHRINGER International GmbH XR18
Bus 001 Device 005: ID 0bda:5411 Realtek Semiconductor Corp. RTS5411 Hub
Bus 001 Device 006: ID 05e3:0610 Genesys Logic, Inc. Hub
Bus 001 Device 007: ID 046d:c52b Logitech, Inc. Unifying Receiver
Bus 001 Device 008: ID 0bda:5411 Realtek Semiconductor Corp. RTS5411 Hub
Bus 001 Device 009: ID 0b05:18f3 ASUSTek Computer, Inc. AURA LED Controller
Bus 001 Device 010: ID 0451:82ff Texas Instruments, Inc.
Bus 001 Device 011: ID 046d:085b Logitech, Inc. Logitech Webcam C925e
Bus 001 Device 012: ID 1050:0407 Yubico.com Yubikey 4/5 OTP+U2F+CCID
Bus 001 Device 013: ID 076b:3021 OmniKey AG CardMan 3021 / 3121
Bus 001 Device 014: ID 056a:00d1 Wacom Co., Ltd CTH-460 [Bamboo Pen & Touch]
Bus 001 Device 015: ID 31e3:1232 Wooting Wooting Two HE (ARM)
Bus 001 Device 016: ID 0582:0304 Roland Corp. KATANA:GO
Note that the device has two numbers against it. These are the vendor ID (0582 = Roland Corp.) and the product ID (0304 = KATANA:GO).
The full device descriptor can be displayed using lsusb -v -d 0582:0304
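As a small illustration of how the vendor and product IDs can be used, the Python sketch below filters a plain lsusb listing by vendor ID. It assumes the line format shown above; the names `LSUSB_LINE` and `find_by_vendor` are hypothetical, and the sample text is a few lines copied from the listing in this post.

```python
import re

# One line of plain `lsusb` output: bus, device number, vendor:product, name.
LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d{3}) Device (?P<dev>\d{3}): "
    r"ID (?P<vid>[0-9a-f]{4}):(?P<pid>[0-9a-f]{4})\s*(?P<name>.*)"
)


def find_by_vendor(lsusb_output: str, vendor_id: str):
    """Return (device number, product ID, name) for one vendor's devices."""
    matches = []
    for line in lsusb_output.splitlines():
        m = LSUSB_LINE.match(line.strip())
        if m and m["vid"] == vendor_id:
            matches.append((m["dev"], m["pid"], m["name"]))
    return matches


sample = """\
Bus 001 Device 004: ID 1397:00d4 BEHRINGER International GmbH XR18
Bus 001 Device 010: ID 0451:82ff Texas Instruments, Inc.
Bus 001 Device 016: ID 0582:0304 Roland Corp. KATANA:GO
"""

roland = find_by_vendor(sample, "0582")
print(roland)
```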
At the command line, the aplay command can be used to list playback devices:
$ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 1: XR18 [XR18], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 2: PCH [HDA Intel PCH], device 0: ALC1220 Analog [ALC1220 Analog]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 2: PCH [HDA Intel PCH], device 1: ALC1220 Digital [ALC1220 Digital]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 3: NVidia [HDA NVidia], device 3: HDMI 0 [DELL U3818DW]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 3: NVidia [HDA NVidia], device 7: HDMI 1 [HDMI 1]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 3: NVidia [HDA NVidia], device 8: HDMI 2 [HDMI 2]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 3: NVidia [HDA NVidia], device 9: HDMI 3 [HDMI 3]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 4: KATANAGO [KATANA:GO], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
Similarly, capture devices can be listed with arecord:
$ arecord -l
**** List of CAPTURE Hardware Devices ****
card 0: C925e [Logitech Webcam C925e], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 1: XR18 [XR18], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 2: PCH [HDA Intel PCH], device 0: ALC1220 Analog [ALC1220 Analog]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 2: PCH [HDA Intel PCH], device 2: ALC1220 Alt Analog [ALC1220 Alt Analog]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 4: KATANAGO [KATANA:GO], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
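The card and device numbers in these listings are what ALSA device names such as `hw:4,0` are built from (the amidi output uses the same scheme with a subdevice added). The sketch below extracts them from a listing; it assumes the line format shown above, and the names `CARD_LINE` and `alsa_hw_names` are hypothetical.

```python
import re

# "card N: ID [Name], device M: ..." as printed by aplay -l / arecord -l.
CARD_LINE = re.compile(
    r"card (?P<card>\d+): (?P<id>\S+) \[(?P<name>[^\]]*)\], "
    r"device (?P<device>\d+): "
)


def alsa_hw_names(listing: str):
    """Return (hw name, card name) pairs for each card/device line."""
    result = []
    for line in listing.splitlines():
        m = CARD_LINE.search(line)
        if m:
            result.append((f"hw:{m['card']},{m['device']}", m["name"]))
    return result


sample = """\
card 2: PCH [HDA Intel PCH], device 1: ALC1220 Digital [ALC1220 Digital]
Subdevices: 1/1
card 4: KATANAGO [KATANA:GO], device 0: USB Audio [USB Audio]
Subdevices: 1/1
"""

devices = alsa_hw_names(sample)
print(devices)
```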
And MIDI devices with amidi:
$ amidi -l
Dir Device Name
IO hw:1,0,0 XR18 MIDI 1
IO hw:4,0,0 KATANA:GO MIDI OUT
When a device is plugged in, it is normally detected automatically and will show up in the desktop audio settings and in the audio controls.
On modern desktop Linux using PipeWire (I'll explain this in a later post), it will also show up in the PipeWire graph, which I view using qpwgraph.
So, for most people, audio interfaces on Linux are "plug'n'play".
There are exceptions, especially with older devices:
In the past, a lot of vendors were lazy or secretive with their devices, and wouldn't fully describe them in the USB device descriptor. They would then ship a downloadable "driver" (which, in many cases, was a text file containing a structured description of the stuff which should have been in the USB device descriptor).
Sometimes they would implement things in non-standard ways or add capabilities which were outside the standards (aka "quirks") and which could not be discovered using standard USB audio semantics.
A lot of these devices won't work on Linux unless the proprietary stuff is reverse-engineered and implemented in ALSA. The good news is that this has been done for a lot of devices. I, myself, have helped the ALSA developers implement the quirks for a range of Roland devices, including the Katana amps.
But, increasingly, vendors are moving towards more standard USB audio devices because they've realised that proprietary shenanigans prevent these devices from being used on phones and tablets, which are now a big market for them.
Cheers,
Keith


















