Written by c0d3x
on June 13, 2024

On Keyboard Shortcuts

Background

In computing, a keyboard shortcut also known as hotkey is a series of one or several keys to quickly invoke a software program or perform a preprogrammed action.

— From Wikipedia

Whether you know it or not, you are already using keyboard shortcuts. Pressing Ctrl+C to copy text, then Ctrl+V to paste? That’s a shortcut. Shift + Arrow keys to select a text region? Another shortcut. Ctrl + Arrow keys to jump words? Another shortcut.

Generally speaking, a shortcut works by pressing a modifying key to activate a layer, and then another common key to invoke an action within that layer. So Ctrl activates the controlling layer of keys, whereas the Windows key activates the window management layer (think minimizing/maximizing windows, switching workspaces, etc.).
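
The layer idea can be modeled as a two-level lookup: the held modifier selects a table, and the second key selects an action inside it. Here is a minimal Python sketch; the layer names and action strings are illustrative assumptions, not any real system's bindings.

```python
# Hypothetical layer tables: modifier -> {key -> action}.
LAYERS = {
    "ctrl": {"c": "copy", "v": "paste"},
    "super": {"up": "maximize window", "down": "minimize window"},
}


def resolve(modifier, key):
    """Return the action bound to `key` in the layer that `modifier` activates.

    With no modifier held, the key simply types itself.
    """
    if modifier is None:
        return f"type {key!r}"
    return LAYERS.get(modifier, {}).get(key, "unbound")


print(resolve("ctrl", "c"))    # copy
print(resolve(None, "c"))      # type 'c'
print(resolve("super", "c"))   # unbound
```

The same key ("c") yields three different results depending on which layer is active, which is all a shortcut really is.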

Software engineers tend to customize their computers (and input devices), for better or worse. A popular choice among software developers is Vim-like keybindings, which, among many other things, map the left, down, up and right arrow keys to h, j, k and l respectively. “Why the hassle?”, you might ask. Well, the cost of moving your hand to the arrow keys and back adds up (context switching on a small scale). By keeping your hands at the same position on the keyboard, you save time and focus.

Levels of Keyboard Customization

There are several levels at which you can apply your keyboard shortcut logic:

1. At the keyboard itself
2. At the interface between kernel and userspace
3. At the application layer

Most applications nowadays encourage you to stick to approach 3. We’ll look at the three approaches individually, but first let’s step back and look at how a keypress gets into userspace in the first place.

1. When a key is pressed or released, the keyboard generates a scan code and sends it to the computer, typically over the USB protocol stack.
2. The scan code is translated by the keyboard driver(s) (i.e., in the kernel) to a keycode using a keycode table. The driver(s) also keep track of some keyboard state (e.g., whether caps lock is activated or not).¹
3. The kernel provides a userspace interface for the key events. On Linux these live under /dev/input/event<i>. This interface is then grabbed by some software, usually the window manager, which intercepts certain key presses and forwards the rest to the focused application.
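
To make the last step concrete, here is a sketch of decoding one record as it would be read from such a device file. It assumes the 64-bit Linux layout of struct input_event (two native longs for the timestamp, then type, code and value) and uses constants from the kernel's input-event-codes.h. The sample bytes are built inside the script, so it runs without a keyboard device or root access.

```python
import struct

# Assumed layout of struct input_event on 64-bit Linux:
# long tv_sec, long tv_usec, unsigned short type, unsigned short code, int value
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_KEY = 0x01          # event type for key presses and releases
KEY_CAPSLOCK = 58      # keycode from input-event-codes.h
KEY_STATES = {0: "released", 1: "pressed", 2: "autorepeat"}


def decode_event(raw):
    """Decode one raw input_event record; return None for non-key events."""
    _sec, _usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    if ev_type != EV_KEY:
        return None
    return code, KEY_STATES.get(value, "unknown")


# Synthetic record standing in for bytes read from /dev/input/event<i>.
sample = struct.pack(EVENT_FORMAT, 0, 0, EV_KEY, KEY_CAPSLOCK, 1)
print(decode_event(sample))  # (58, 'pressed')
```

A real reader would open the device file in binary mode and read EVENT_SIZE bytes at a time, which usually requires elevated permissions.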

With this knowledge at hand, let’s explore different keyboard shortcut approaches.

Programming the Keyboard Itself

Projects like QMK leave the computer as it is and focus on reprogramming the keyboard itself. By writing firmware for the keyboard and manipulating its keymap, you make the computer receive custom-tailored scancodes in the first place. If pressing Alt_R+l on the keyboard sends the scancode for Right, then that’s the key the computer will assume the user pressed. This is powerful but requires a compatible keyboard.
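
Conceptually, such firmware holds a stack of keymap layers and resolves each physical key against the active layer, falling through to the base layer where the upper layer leaves a position unassigned. The Python sketch below only models that lookup. It is not QMK code (QMK keymaps are written in C), and the key names and the TRANSPARENT marker are assumptions for illustration.

```python
TRANSPARENT = None  # hypothetical marker: "defer to the layer below"

# Layer 0 is the base layer; layer 1 is active while Alt_R is held.
KEYMAP = [
    {"h": "h", "j": "j", "k": "k", "l": "l", "a": "a"},
    {"h": "Left", "j": "Down", "k": "Up", "l": "Right", "a": TRANSPARENT},
]


def resolve_key(physical_key, alt_r_held):
    """Return the key the firmware would report to the computer."""
    top = 1 if alt_r_held else 0
    # Walk from the active layer down to the base layer.
    for layer in range(top, -1, -1):
        mapped = KEYMAP[layer].get(physical_key, TRANSPARENT)
        if mapped is not TRANSPARENT:
            return mapped
    return None  # key is not bound on any layer


print(resolve_key("l", alt_r_held=True))   # Right
print(resolve_key("l", alt_r_held=False))  # l
print(resolve_key("a", alt_r_held=True))   # a
```

Because the translation happens before anything leaves the keyboard, the computer never learns that "l" was involved at all.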

System-Level Key Interception

This approach intercepts key events after the kernel driver translates them but before most userspace applications (like the window manager or specific apps) see them. Tools like kmonad, interception-tools, or kbct operate at this level.

They work by reading directly from the kernel’s input event devices (/dev/input/eventX). For instance, kbct specifically grabs the raw keyboard device (e.g., /dev/input/event2). It then processes these events according to its configuration (remapping keys, creating layers, defining macros) and outputs modified events to a new virtual input device.

Crucially, this prevents the window manager (WM) or other applications from reading the original, raw keyboard events from /dev/input/event2. Instead, the WM typically grabs a different input device (e.g., /dev/input/eventY, often a composite device or the virtual one created by the tool) which receives the processed key events.
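
The core of such a tool is a function from incoming key events to outgoing ones. The sketch below models only that transformation in plain Python, with Caps Lock acting as a held layer key that turns h/j/k/l into arrow keys. The binding choice and the remap function are assumptions for illustration and do not reflect the configuration format of kmonad, interception-tools, or kbct; grabbing the real device and creating the virtual one are left out. The keycodes are taken from the kernel's input-event-codes.h.

```python
KEY_CAPSLOCK = 58
KEY_H, KEY_J, KEY_K, KEY_L = 35, 36, 37, 38
KEY_UP, KEY_LEFT, KEY_RIGHT, KEY_DOWN = 103, 105, 106, 108

# Assumed binding: while Caps Lock is held, h/j/k/l act as arrow keys.
ARROW_LAYER = {KEY_H: KEY_LEFT, KEY_J: KEY_DOWN, KEY_K: KEY_UP, KEY_L: KEY_RIGHT}


def remap(events):
    """Translate (keycode, value) pairs; value is 1 for press, 0 for release."""
    layer_active = False
    held = {}  # physical keycode -> keycode emitted when it was pressed
    out = []
    for code, value in events:
        if code == KEY_CAPSLOCK:
            layer_active = value != 0   # swallow the layer key itself
            continue
        if value == 0:
            # Release whatever was emitted on press, even if the layer changed.
            emitted = held.pop(code, code)
        else:
            emitted = ARROW_LAYER.get(code, code) if layer_active else code
            held[code] = emitted
        out.append((emitted, value))
    return out


typed = [(KEY_CAPSLOCK, 1), (KEY_L, 1), (KEY_CAPSLOCK, 0), (KEY_L, 0), (KEY_H, 1)]
print(remap(typed))  # [(106, 1), (106, 0), (35, 1)]
```

Tracking which code was emitted on press matters: if the layer key is released first, the arrow key must still receive its release event, otherwise downstream software would see it as stuck.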

This allows for powerful, system-wide customization that works consistently across all applications, regardless of whether they have their own shortcut settings or what the window manager tries to do. You define your shortcuts once, and they work everywhere – from the terminal to the browser to your IDE.

Application-Level Shortcuts

This is the most common approach. Your web browser, text editor, IDE, file manager – they all come with their own set of default shortcuts and usually provide a way to customize them. You might change your IDE’s “build” shortcut or configure your terminal emulator to use Ctrl+Shift+C/V for copy/paste.

The advantage is ease of use; settings are often graphical and specific to the task at hand. The major disadvantage is inconsistency. A shortcut learned in one application likely won’t work in another. Furthermore, these shortcuts only work when the application is focused and often cannot override system-level or window manager shortcuts.
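
The focus dependence and the precedence of window manager shortcuts can be sketched as a dispatch order. All bindings and application names below are hypothetical and chosen only to show the effect.

```python
# Hypothetical bindings for illustration only.
WM_BINDINGS = {"Alt+Tab": "switch window"}
APP_BINDINGS = {
    "terminal": {"Ctrl+Shift+C": "copy", "Alt+Tab": "next pane"},
    "browser": {"Ctrl+T": "new tab"},
}


def dispatch(chord, focused_app):
    """Return (handler, action) for a chord; the window manager gets first pick."""
    if chord in WM_BINDINGS:
        return "wm", WM_BINDINGS[chord]
    action = APP_BINDINGS.get(focused_app, {}).get(chord)
    if action is not None:
        return focused_app, action
    return focused_app, "unhandled"


print(dispatch("Ctrl+Shift+C", "terminal"))  # ('terminal', 'copy')
print(dispatch("Ctrl+Shift+C", "browser"))   # ('browser', 'unhandled')
print(dispatch("Alt+Tab", "terminal"))       # ('wm', 'switch window')
```

The terminal's own Alt+Tab binding is never reached, and its copy shortcut means nothing once the browser has focus.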

Don’t Reinvent the Wheel

While deep customization is tempting, it’s often wise to build upon existing conventions. Standard OS shortcuts (Ctrl+C, Ctrl+S, Alt+Tab) are ubiquitous for a reason – they provide a baseline familiarity across systems. Similarly, adopting established paradigms like Vim keybindings (hjkl navigation, modal editing) within supporting applications leverages a well-documented and widely understood system, rather than creating a bespoke setup that only you understand.

By customizing shortcuts heavily at the application layer, you risk breaking the abstraction principle of computer science! You’re making the application responsible for interpreting input in a way that could have been defined at lower levels (like the window manager or system-level tools).

This is where operating system design philosophy becomes crucial. Windows and Linux systems generally reserve the Windows key (or Super key) for window management and OS functions. Applications rarely override these shortcuts, which makes system-wide behavior predictable. In contrast, macOS’s fragmented approach, where Command serves both OS and application functions, creates inconsistency and conflicts: applications can override system shortcuts such as ⌘+Q for quit. A dedicated OS/window manager key ensures that core system navigation remains consistent across all applications, preventing the chaos of competing shortcut schemes.

Start with the defaults and override only specific things that genuinely impede your workflow.

Conclusion

By choosing the right level of abstraction for your shortcuts, you can significantly improve your workflow efficiency and comfort. The key is understanding where each level of customization makes sense and building upon established conventions rather than reinventing the wheel.

¹ It might seem counter-intuitive, but the computer tracks the caps lock state (i.e., tells the keyboard to activate and deactivate the LED), not the keyboard. So if caps lock doesn’t respond anymore, it’s a good indicator that your kernel has panicked.
