On Keyboard Shortcuts
Background
In computing, a keyboard shortcut also known as hotkey is a series of one or several keys to quickly invoke a software program or perform a preprogrammed action.
— From Wikipedia
Whether you know it or not, you are already using keyboard shortcuts. Pressing Ctrl+C to copy text, then Ctrl+V to paste? That’s a shortcut. Shift + Arrow keys to select a text region? Another shortcut. Ctrl + Arrow keys to jump words? Yet another shortcut.
Generally speaking, a shortcut works by pressing a modifier key to activate a layer, and then another, ordinary key to invoke an action within that layer. So Ctrl activates the control layer of keys, whereas the Windows key activates the window management layer (think minimizing/maximizing windows, switching workspaces, etc.).
Software engineers tend to customize their computers (and input devices), for better or worse. A popular choice among software developers is Vim-like keybindings, which, among many other things, map the arrow keys to h, j, k and l (left, down, up and right, respectively). “Why such hassle?”, you might ask. Well, the cost of moving your hands to reach other keys adds up (context switching on a small scale). By keeping your hands in the same position on the keyboard, you save time and keep your focus.
Levels of keyboard customization
There are several levels at which you can apply your keyboard shortcut logic:
1. At the keyboard itself.
2. At the interface between kernel- and userspace.
3. At the application layer.
Most applications nowadays encourage you to stick to the third approach. We’ll look at all three individually, but first let’s step back a little and look at how a keypress gets into userspace in the first place.
- When a key is pressed or released, the keyboard generates a scan code and sends it to the computer, typically over USB.
- The scan code is translated by the keyboard driver(s) (i.e. in the kernel) to a keycode using a keycode table. The driver(s) also keep track of some keyboard state (e.g. whether Capslock is activated or not). It might seem counter-intuitive, but the computer tracks the Capslock state (i.e. tells the keyboard to switch the LED on and off), not the keyboard. So if the Capslock LED doesn’t respond anymore, it’s a good indicator that your kernel has panicked.
- The kernel provides a userspace interface for the key events. On Linux these live under /dev/input/event<i>. This interface is then grabbed by some software, usually the window manager, which then intercepts certain key presses and forwards the rest to some application.
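To make the last step more tangible, here is a minimal Python sketch of how a program could decode the records the kernel exposes under /dev/input. It assumes the `struct input_event` layout of 64-bit Linux (two longs for the timestamp, then type, code and value); the function names `decode_event` and `watch` are made up for this example.

```python
import struct

# Assumed layout of `struct input_event` on 64-bit Linux: two longs for the
# timestamp, then type (u16), code (u16) and value (s32).
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_KEY = 0x01  # event type used for key presses and releases
KEY_STATES = {0: "release", 1: "press", 2: "autorepeat"}


def decode_event(raw: bytes):
    """Turn one raw record into (keycode, state); return None for non-key events."""
    _sec, _usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    if ev_type != EV_KEY:
        return None
    return code, KEY_STATES.get(value, "unknown")


def watch(device_path: str) -> None:
    """Print key events from a device node (usually requires root to read)."""
    with open(device_path, "rb") as device:
        while chunk := device.read(EVENT_SIZE):
            event = decode_event(chunk)
            if event is not None:
                print(event)
```

Calling `watch` with the path of your keyboard's event device (the number differs per machine) prints one tuple per key press and release. Keycode 58, for instance, is Capslock.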
With this knowledge at hand, let’s explore different keyboard shortcut approaches.
Shortcutting the keyboard itself
Projects like QMK leave the computer as it is and focus on reprogramming the keyboard itself. By writing firmware for the keyboard and manipulating the keymap, you make the computer receive custom-tailored scancodes in the first place. If pressing Alt_R+l on the keyboard sends the scancode for Right, well, then that’s the key the computer is gonna assume the user pressed. This is powerful but requires a compatible keyboard.
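The layer idea behind such firmware can be sketched in a few lines. The snippet below is an illustration in Python only: real QMK keymaps are written in C and compiled into the firmware, and the names `BASE_LAYER`, `NAV_LAYER` and `key_sent_to_host` are invented for this example. It also assumes the layer key itself is never reported to the computer.

```python
# Hypothetical two-layer keymap, for illustration only.
BASE_LAYER = {"h": "h", "j": "j", "k": "k", "l": "l"}
NAV_LAYER = {"h": "Left", "j": "Down", "k": "Up", "l": "Right"}
LAYER_KEY = "Alt_R"  # assumption: held to activate NAV_LAYER, not sent to the host


def key_sent_to_host(physical_key, held):
    """Return the key the keyboard reports for a physical key press.

    `held` is the set of keys that are currently held down.
    """
    if LAYER_KEY in held and physical_key in NAV_LAYER:
        return NAV_LAYER[physical_key]
    # Keys without an entry in the base layer are passed through unchanged.
    return BASE_LAYER.get(physical_key, physical_key)
```

With Alt_R held, `key_sent_to_host("l", {"Alt_R"})` yields Right, so the computer never learns that an l was involved.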
Shortcutting at the interface between kernel- and userspace
This approach intercepts key events after the kernel driver translates them but before most userspace applications (like the window manager or specific apps) see them. Tools like kmonad, interception-tools, or kbct operate at this level.
They work by reading directly from the kernel’s input event devices (/dev/input/eventX). For instance, kbct specifically grabs the raw keyboard device (e.g., /dev/input/event2). It then processes these events according to its configuration (remapping keys, creating layers, defining macros) and outputs modified events to a new virtual input device.
Crucially, this prevents the Window Manager (WM) or other applications from reading the original, raw keyboard events from /dev/input/event2. Instead, the WM typically grabs a different input device (e.g., /dev/input/eventY, often a composite device or the virtual one created by the tool) which receives the processed key events.
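The processing step in the middle can be sketched as a small state machine. This is not how kbct or any of the other tools is implemented; it is a simplified Python illustration that leaves out grabbing the device and writing to the virtual one, and only shows the remapping itself. The keycode numbers are the ones Linux defines in linux/input-event-codes.h; the `Remapper` class is hypothetical.

```python
# Linux keycodes (from linux/input-event-codes.h).
KEY_H, KEY_J, KEY_K, KEY_L = 35, 36, 37, 38
KEY_RIGHTALT = 100
KEY_UP, KEY_LEFT, KEY_RIGHT, KEY_DOWN = 103, 105, 106, 108

NAV_LAYER = {KEY_H: KEY_LEFT, KEY_J: KEY_DOWN, KEY_K: KEY_UP, KEY_L: KEY_RIGHT}
RELEASE, PRESS = 0, 1  # value 2 means autorepeat


class Remapper:
    """Hypothetical remapper: Alt_R activates a navigation layer."""

    def __init__(self):
        self.layer_active = False
        self.pressed_as = {}  # physical keycode -> keycode emitted on press

    def process(self, code, value):
        """Return the (code, value) events to forward for one raw event."""
        if code == KEY_RIGHTALT:
            # Swallow the layer key itself and only remember its state.
            self.layer_active = value != RELEASE
            return []
        if value == PRESS:
            out = NAV_LAYER.get(code, code) if self.layer_active else code
            self.pressed_as[code] = out
        elif value == RELEASE:
            # Release whatever was emitted on press, even if the layer
            # key has been let go in the meantime.
            out = self.pressed_as.pop(code, code)
        else:
            out = self.pressed_as.get(code, code)
        return [(out, value)]
```

Remembering what each key was pressed as avoids a stuck Right key when Alt_R is released before l.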
This allows for powerful, system-wide customization that works consistently across all applications, regardless of whether they have their own shortcut settings or what the window manager tries to do. You define your shortcuts once, and they work everywhere – from the terminal to the browser to your IDE.
Shortcutting at the application layer
This is the most common approach. Your web browser, text editor, IDE, file manager – they all come with their own set of default shortcuts and usually provide a way to customize them. You might change your IDE’s “build” shortcut or configure your terminal emulator to use Ctrl+Shift+C/V for copy/paste.
The advantage is ease of use; settings are often graphical and specific to the task at hand. The major disadvantage is inconsistency. A shortcut learned in one application likely won’t work in another. Furthermore, these shortcuts only work when the application is focused and often cannot override system-level or window manager shortcuts.
Don’t reinvent the wheel
While deep customization is tempting, it’s often wise to build upon existing conventions. Standard OS shortcuts (Ctrl+C, Ctrl+S, Alt+Tab) are ubiquitous for a reason – they provide a baseline familiarity across systems. Similarly, adopting established paradigms like Vim keybindings (hjkl navigation, modal editing) within supporting applications leverages a well-documented and widely understood system, rather than creating a bespoke setup only you understand. Start with the defaults and override specific things that genuinely impede your workflow.
When too much customization sucks
Deep customization, especially across multiple layers, can become a liability. Consider these scenarios:
- You regularly switch the software you use: If your shortcuts rely heavily on application-specific settings (Layer 3), switching to a new browser, IDE, or even terminal emulator means rebuilding your shortcut muscle memory or spending time reconfiguring the new tool. System-wide customization (Layer 1 or 2) mitigates this.
- You regularly (have to) use computers other than your own: Your intricate .vimrc, QMK firmware, or kbct config won’t be on your colleague’s machine or that server you SSH into. Relying solely on heavy customization makes using other systems frustrating and inefficient. Sticking closer to defaults, or using easily portable configurations, helps.
- You like to keep the abstraction layers clearly separated: As mentioned, shortcuts can be defined at the hardware, kernel interface, or application level. By customizing shortcuts heavily at the application layer, you risk breaking the abstraction principle of computer science! Why? Because you’re making the application responsible for interpreting input in a way that might conflict with or duplicate efforts happening at lower levels (like the window manager or a tool like kbct). This can lead to confusing behavior where shortcuts sometimes work and sometimes don’t, depending on focus or which tool intercepts the keypress first. Defining shortcuts at the lowest appropriate level (e.g., system-wide navigation via Layer 2 tools) often leads to more predictable and maintainable behavior.
Sane shortcuts
Instead of completely rewriting the keyboard’s behavior, focus on high-impact, ergonomic improvements. Good candidates often involve:
- Navigation: Bringing arrow keys, Home/End, PageUp/PageDown closer to the home row. Mapping Caps Lock (often unused) to Ctrl or Esc is also popular. Using a Layer 2 tool like kbct to map Alt_R + [hjkl] to arrow keys system-wide is a common and effective pattern.
- Text input: Defining shortcuts for frequently typed symbols or code snippets. Remapping keys to better suit alternative keyboard layouts (like Colemak or Dvorak) if the OS support isn’t sufficient.
The goal is to reduce hand movement and make frequent actions faster, without creating a system so alien that it hinders you elsewhere.
Benefits
Consider the example of switching browser tabs. Many applications use Ctrl+Tab or Ctrl+PageUp/Down. Let’s say you want a consistent way, perhaps using arrow keys like Ctrl_L+Alt_L+Left/Right. This works, but requires moving your hand to the arrow key cluster.
By mapping Alt_R+[hjkl] to the arrow keys using a Layer 2 tool (like kbct, intercepting /dev/input/eventX), this becomes a breeze. Ctrl_L+Alt_L + Alt_R+l (which the system sees as Ctrl_L+Alt_L+Right) switches tabs without moving your right hand.
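A toy model in Python shows why the application needs no knowledge of the custom layer. The binding table `TAB_ACTIONS` and both function names are hypothetical; the only assumption taken from the text is that Ctrl_L+Alt_L+Left/Right is bound to tab switching.

```python
NAV_LAYER = {"h": "Left", "j": "Down", "k": "Up", "l": "Right"}

# Hypothetical application bindings: the application only knows arrow keys.
TAB_ACTIONS = {
    frozenset({"Ctrl_L", "Alt_L", "Left"}): "previous tab",
    frozenset({"Ctrl_L", "Alt_L", "Right"}): "next tab",
}


def seen_by_application(held):
    """Translate the physically held keys into what the application receives."""
    if "Alt_R" not in held:
        return frozenset(held)
    # The layer key is swallowed and hjkl are replaced by arrow keys.
    return frozenset(NAV_LAYER.get(key, key) for key in held if key != "Alt_R")


def action_for(held):
    """Look up the application action for a set of held keys, if any."""
    return TAB_ACTIONS.get(seen_by_application(held))
```

Here `action_for({"Ctrl_L", "Alt_L", "Alt_R", "l"})` resolves to the next-tab action, while the same chord without Alt_R matches nothing.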
- Unlike application-specific extensions like Vimium (which is absolutely great but limited), this Layer 2 approach works system-wide. It functions across all Chrome tabs (including settings, PDF views, extension pages where Vimium might not run), in other browsers, in your file manager, your IDE – everywhere. The shortcut is handled before the application even sees it.
- You learn one shortcut (Ctrl_L+Alt_L + the Right equivalent) that you can apply conceptually across many applications, enhancing consistency and reducing cognitive load. While the exact key combination might rely on your custom mapping, the underlying action shortcut (Ctrl+Alt+Arrows for tab/window switching) is often standard, making it somewhat transferable.
By choosing the right level of abstraction for your shortcuts, you can significantly improve your workflow efficiency and comfort.