On Keyboard Shortcuts

Background

In computing, a keyboard shortcut also known as hotkey is a series of one or several keys to quickly invoke a software program or perform a preprogrammed action.

— From Wikipedia

Whether you know it or not, you are already using keyboard shortcuts. Pressing Ctrl+C to copy text, then Ctrl+V to paste? That’s a shortcut. Shift + Arrow keys to select a text region? Another shortcut. Ctrl + Arrow keys to jump words? Yet another shortcut.

Generally speaking, a shortcut works by holding a modifier key to activate a layer, and then pressing an ordinary key to invoke an action within that layer. So Ctrl activates the control layer of keys, whereas the Windows key activates the window management layer (think minimizing/maximizing windows, switching workspaces, etc.).

Software engineers tend to customize their computers (and input devices), for better or worse. A popular choice amongst software developers is Vim-like keybindings, which, amongst many other things, map the Left, Down, Up and Right arrow keys to h, j, k and l respectively. “Why such a hassle?”, you might ask. Well, the cost of moving your hands to reach other keys adds up (context switching on a small scale). By keeping your hands in the same position on the keyboard, you save time and keep your focus.

Levels of keyboard customization

There are several levels at which you can apply your keyboard shortcut logic.

  1. At the keyboard itself.
  2. At the interface between kernel- and userspace.
  3. At the application layer.

Most applications nowadays encourage you to stick to approach 3. We’ll look at the three approaches individually, but first let’s step back a little and look at how a keypress gets into userspace in the first place.

  1. When a key is pressed or released, the keyboard generates a scan code and sends it to the computer, typically via the USB protocol stack.
  2. The keyboard driver(s) in the kernel translate the scan code to a keycode using a keycode table. The driver(s) also keep track of some keyboard state (e.g. whether Capslock is active). It might seem counter-intuitive, but the computer, not the keyboard, tracks the Capslock state and tells the keyboard when to switch the LED on and off. So if the Capslock LED no longer responds, it’s a good indicator that your kernel has panicked.
  3. The kernel provides a userspace interface for the key events. On Linux these live under /dev/input/eventX. This interface is then grabbed by some software, usually the window manager, which intercepts certain key presses and forwards the rest to some application.
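To make step 3 more concrete, here is a minimal Python sketch of what reading such an event device could look like. It assumes the 64-bit Linux layout of the kernel's input_event record (two 8-byte timestamp fields, a 16-bit type, a 16-bit code and a 32-bit value); on 32-bit systems the layout differs. The names decode_event and read_key_events are made up for this illustration.

```python
import struct
from typing import Iterator, NamedTuple

# Assumed layout of one input_event record on 64-bit Linux:
# seconds (8 bytes), microseconds (8 bytes), type (u16), code (u16), value (s32).
EVENT_FORMAT = "=qqHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)  # 24 bytes with this format

EV_KEY = 1  # event type for key presses/releases (linux/input-event-codes.h)


class InputEvent(NamedTuple):
    seconds: int
    microseconds: int
    type: int
    code: int   # the keycode, e.g. 30 for KEY_A
    value: int  # 0 = release, 1 = press, 2 = autorepeat


def decode_event(raw: bytes) -> InputEvent:
    """Decode one fixed-size record as read from an event device."""
    return InputEvent(*struct.unpack(EVENT_FORMAT, raw))


def read_key_events(path: str) -> Iterator[InputEvent]:
    """Yield key events from an event device.

    Reading these devices usually requires root or membership
    in the group that owns them.
    """
    with open(path, "rb") as device:
        while True:
            raw = device.read(EVENT_SIZE)
            if len(raw) < EVENT_SIZE:
                break
            event = decode_event(raw)
            if event.type == EV_KEY:
                yield event
```

Iterating over read_key_events("/dev/input/event2") would then yield one record per key press, release or autorepeat, where event2 is only an example and the right device number depends on your system.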

With this knowledge at hand, let’s explore different keyboard shortcut approaches.

Shortcutting the keyboard itself

Projects like QMK leave the computer as it is, and focus on reprogramming the keyboard itself. By writing firmware for the keyboard and manipulating the keymap, the computer receives custom-tailored scancodes in the first place. If pressing Alt_R+l on the keyboard sends the scancode for Right, well then that’s the key the computer is gonna assume the user pressed. This is powerful but requires a compatible keyboard.
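The idea of a firmware keymap can be illustrated with a toy model. The following Python sketch is not QMK code (QMK firmware is written in C) and the tiny one-row matrix and layer names are hypothetical. It only shows the general principle: the firmware looks up a physical key position in the highest active layer and falls through to lower layers where a position is marked transparent.

```python
from typing import List, Optional

# Marker for "this layer has no opinion, use the layer below".
TRANSPARENT = None

Layer = List[List[Optional[str]]]

# Hypothetical 1x5 key matrix: the physical h, j, k, l and ; keys.
BASE_LAYER: Layer = [["h", "j", "k", "l", ";"]]
# Navigation layer, imagined to be active while Alt_R is held.
NAV_LAYER: Layer = [["Left", "Down", "Up", "Right", TRANSPARENT]]


def resolve(row: int, col: int, active_layers: List[Layer]) -> str:
    """Return the key to report for a physical position.

    Layers are searched from the highest (last) to the lowest (first);
    transparent entries fall through to the next layer down.
    """
    for layer in reversed(active_layers):
        key = layer[row][col]
        if key is not TRANSPARENT:
            return key
    raise LookupError(f"no layer defines position ({row}, {col})")


# Physical l key without and with the navigation layer active.
print(resolve(0, 3, [BASE_LAYER]))
print(resolve(0, 3, [BASE_LAYER, NAV_LAYER]))
```

Because the lookup happens inside the keyboard, the computer only ever sees the resolved key and needs no configuration of its own.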

Shortcutting at the interface between kernel- and userspace

This approach intercepts key events after the kernel driver translates them but before most userspace applications (like the window manager or specific apps) see them. Tools like kmonad, interception-tools, or kbct operate at this level.

They work by reading directly from the kernel’s input event devices (/dev/input/eventX). For instance, kbct specifically grabs the raw keyboard device (e.g., /dev/input/event2). It then processes these events according to its configuration (remapping keys, creating layers, defining macros) and outputs modified events to a new virtual input device.

Crucially, this prevents the window manager (WM) or other applications from reading the original, raw keyboard events from /dev/input/event2. Instead, the WM typically grabs a different input device (e.g., /dev/input/eventY, often a composite device or the virtual one created by the tool) which receives the processed key events.
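The processing step in the middle of that pipeline can be sketched as a pure function over key events. The Python sketch below deliberately leaves out the device I/O (grabbing the raw device and writing to a virtual one) and is not how kbct or any other tool is implemented. It assumes a single navigation layer in which Alt_R+[hjkl] produces the arrow keys, and it assumes that Alt_R itself is swallowed so the system never sees it. The key codes are the ones from linux/input-event-codes.h; KeyEvent and remap are names made up for this example.

```python
from typing import Dict, Iterable, Iterator, NamedTuple

# Key codes from linux/input-event-codes.h.
KEY_LEFTCTRL, KEY_LEFTALT, KEY_RIGHTALT = 29, 56, 100
KEY_H, KEY_J, KEY_K, KEY_L = 35, 36, 37, 38
KEY_UP, KEY_LEFT, KEY_RIGHT, KEY_DOWN = 103, 105, 106, 108

RELEASE, PRESS, REPEAT = 0, 1, 2


class KeyEvent(NamedTuple):
    code: int
    value: int  # RELEASE, PRESS or REPEAT


LAYER_KEY = KEY_RIGHTALT
LAYER_MAP = {KEY_H: KEY_LEFT, KEY_J: KEY_DOWN, KEY_K: KEY_UP, KEY_L: KEY_RIGHT}


def remap(events: Iterable[KeyEvent]) -> Iterator[KeyEvent]:
    """Translate raw key events into the events the system should see."""
    layer_active = False
    # Physical key code -> code emitted when that key was pressed, so a
    # release matches its press even if the layer key was let go first.
    held: Dict[int, int] = {}
    for ev in events:
        if ev.code == LAYER_KEY:
            # Assumption of this sketch: the layer key is never forwarded.
            layer_active = ev.value != RELEASE
            continue
        if ev.value == PRESS:
            out = LAYER_MAP.get(ev.code, ev.code) if layer_active else ev.code
            held[ev.code] = out
        elif ev.value == RELEASE:
            out = held.pop(ev.code, ev.code)
        else:  # autorepeat keeps whatever was emitted on press
            out = held.get(ev.code, ev.code)
        yield KeyEvent(out, ev.value)
```

Feeding in the presses Ctrl_L, Alt_L, Alt_R, l yields Ctrl_L, Alt_L, Right on the output side, which is what the window manager would then read from the virtual device. Remembering the emitted code per physical key is a design choice that avoids stuck keys when the layer key is released before the remapped key.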

This allows for powerful, system-wide customization that works consistently across all applications, regardless of whether they have their own shortcut settings or what the window manager tries to do. You define your shortcuts once, and they work everywhere – from the terminal to the browser to your IDE.

Shortcutting at the application layer

This is the most common approach. Your web browser, text editor, IDE, file manager – they all come with their own set of default shortcuts and usually provide a way to customize them. You might change your IDE’s “build” shortcut or configure your terminal emulator to use Ctrl+Shift+C/V for copy/paste.

The advantage is ease of use; settings are often graphical and specific to the task at hand. The major disadvantage is inconsistency. A shortcut learned in one application likely won’t work in another. Furthermore, these shortcuts only work when the application is focused and often cannot override system-level or window manager shortcuts.
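The focus dependence and the inability to override the window manager follow from the order in which key presses are dispatched. Here is a small Python sketch of that order. All bindings, application names and action strings are hypothetical, and real window managers and toolkits are considerably more involved.

```python
from typing import Dict, FrozenSet, Optional

Chord = FrozenSet[str]


def chord(*keys: str) -> Chord:
    """A set of keys held at the same time, e.g. chord("Ctrl", "Tab")."""
    return frozenset(keys)


# Hypothetical window manager bindings: these are checked first.
WM_BINDINGS: Dict[Chord, str] = {
    chord("Alt", "Tab"): "wm: switch window",
}

# Hypothetical per-application bindings: only the focused app is consulted.
APP_BINDINGS: Dict[str, Dict[Chord, str]] = {
    "browser": {
        chord("Ctrl", "Tab"): "browser: next tab",
        chord("Alt", "Tab"): "browser: never reached",
    },
    "terminal": {
        chord("Ctrl", "Shift", "C"): "terminal: copy",
    },
}


def dispatch(pressed: Chord, focused_app: str) -> Optional[str]:
    """Return the action triggered by a chord, or None if nothing handles it."""
    if pressed in WM_BINDINGS:
        # The window manager intercepts the chord; the app never sees it.
        return WM_BINDINGS[pressed]
    return APP_BINDINGS.get(focused_app, {}).get(pressed)
```

With these bindings, Ctrl+Tab switches tabs only while the browser is focused, and the browser's own Alt+Tab binding is shadowed by the window manager.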

Don’t reinvent the wheel

While deep customization is tempting, it’s often wise to build upon existing conventions. Standard OS shortcuts (Ctrl+C, Ctrl+S, Alt+Tab) are ubiquitous for a reason – they provide a baseline familiarity across systems. Similarly, adopting established paradigms like Vim keybindings (hjkl navigation, modal editing) within supporting applications leverages a well-documented and widely understood system, rather than creating a bespoke setup only you understand. Start with the defaults and override specific things that genuinely impede your workflow.

When too much customization sucks

Deep customization, especially across multiple levels, can become a liability. Consider the following scenario.

By customizing shortcuts heavily at the application layer, you risk breaking the abstraction principle of computer science! Why? Because you’re making the application responsible for interpreting input in a way that might conflict with or duplicate efforts happening at lower levels (like the window manager or a tool like kbct). This can lead to confusing behavior where shortcuts sometimes work and sometimes don’t, depending on focus or which tool intercepts the keypress first. Defining shortcuts at the lowest appropriate level (e.g., system-wide navigation via level 2 tools) often leads to more predictable and maintainable behavior.

Sane shortcuts

Instead of completely rewriting the keyboard’s behavior, focus on high-impact, ergonomic improvements.

The goal is to reduce hand movement and make frequent actions faster, without creating a system so alien that it hinders you elsewhere.

Benefits

Consider the example of switching browser tabs. Many applications use Ctrl+Tab or Ctrl+PageUp/Down. Let’s say you want a consistent binding instead, perhaps one using the arrow keys like Ctrl_L+Alt_L+Left/Right. This works, but requires moving your hand to the arrow key cluster.

By mapping Alt_R+[hjkl] to the arrow keys using a level 2 tool (like kbct, intercepting /dev/input/eventX), this becomes a breeze. Pressing Ctrl_L+Alt_L together with Alt_R+l (which the system sees as Ctrl_L+Alt_L+Right) switches tabs without moving your right hand away from the home row.

By choosing the right level of abstraction for your shortcuts, you can significantly improve your workflow efficiency and comfort.