Anatomy of a Terminal Emulator
To link to this post, it is advisable to use the preview URL: https://poor.dev/terminal-anatomy
The terminal is a ubiquitous platform that has been fairly stable for many years. There are plenty of resources out there for understanding its inner workings, but most of them are either fairly arcane or offer deep knowledge about a very specific area. This post aims to bridge this gap by offering a gentle and broad introduction to the terminal emulator as a platform for development.
We’ll talk about the different parts of the terminal and how they interact, build a small program to read input from the shell and understand how it’s interpreted, discuss how to create a user interface in the terminal and finally see how we can use all of this to cause some mischief.
I’ve used Rust for the code examples, but tried to keep them as short and simple as possible so that they’re also comprehensible to non-Rust programmers. I’ve also provided explanations of the relevant parts of the code. This post describes how terminal emulators work on Unix and Unix-like systems (e.g. Linux and macOS). Some parts might also be relevant to other operating systems.
While this is aimed at those new to developing terminal applications, I tried to go into enough depth in certain areas to keep things interesting for old terminal hounds as well. If you found a mistake in this post, or feel something has been over-simplified or hand-waved away, feel free to hit me up on Twitter.
Different parts of the terminal
In relation to the terminal emulator, one often hears terms such as “the shell” (or command-line) and pty. Let’s talk a little about all of these and their relationship.
The terminal emulator, as its name suggests, carries with it a lot of history. In this post though, we’re going to concentrate on how things work today. I’ve provided some links for those interested in how things came to be this way at the bottom of this article.
Both the terminal emulator (eg. gnome terminal, alacritty, rxvt-unicode) and the shell (eg. bash, zsh, fish) are executable applications that run on our machine.
The terminal emulator is a graphical application whose role is to interpret data coming from the shell and display it on screen. This display is often textual, but not always.
The shell provides an interface to the operating system, allowing the user to interact with its file-system, run processes and often have access to basic scripting capabilities.
These two programs are connected together by the pty (pseudoterminal) which provides a bi-directional asynchronous communication channel between the two. One channel of the pty represents the communication directed from the terminal emulator to the shell (STDIN) and the other channel refers to the communication directed from the shell to the terminal emulator (STDOUT). When a user types text into the terminal, it is sent to the shell through the pty’s STDIN channel, and when the shell would like to display text to the user on their terminal emulator, it is sent to it through the pty’s STDOUT channel.
The pty has two sides (we will refer to them as “primary” and “secondary”, although the official documentation refers to them by other names). On the primary side is the terminal emulator, and on the secondary side is the shell - though conceivably any program that expects to be connected to a terminal can be opened on the pty’s secondary side (eg. the terminal emulator can open “vim” or “top” directly on the secondary side of the pty when creating it).
Let’s look at an example of how this whole system works:
How does the terminal emulator interpret and display the data from the shell?
When the shell sends text to the terminal, it uses a set of instructions haphazardly gathered over the years under the title “ANSI escape codes”. These are used whenever the shell needs to send anything more involved than plain text (e.g. changing colors or styles, or moving the cursor). To see exactly how these work, let’s write a small Rust program that spawns a pty and starts the machine’s default shell on its secondary side. Then we can see what the shell sends us.
fn read_from_fd(fd: RawFd) -> Option<Vec<u8>> {
    unimplemented!()
}

fn spawn_pty_with_shell(default_shell: String) -> RawFd {
    unimplemented!()
}

fn main() {
    let default_shell = std::env::var("SHELL")
        .expect("could not find default shell from $SHELL");
    let stdout_fd = spawn_pty_with_shell(default_shell);
    let mut read_buffer = vec![];
    loop {
        match read_from_fd(stdout_fd) {
            Some(mut read_bytes) => {
                read_buffer.append(&mut read_bytes);
            }
            None => {
                println!("{:?}", String::from_utf8(read_buffer).unwrap());
                std::process::exit(0);
            }
        }
    }
}
Here we start by getting the path to the system’s default shell from the SHELL environment variable and start it in a new process on the secondary side of a pty (in a function that we’ll flesh out below).
This function returns the STDOUT file descriptor of the primary side of the pty. We start reading from it in the read_from_fd function (which we’ll also flesh out below). We read until there’s no more data to read (presumably the process ended) and then we print out all the data we’ve read.
Let’s flesh out the spawn_pty_with_shell function:
use nix::pty::forkpty;
use nix::unistd::ForkResult;
use std::os::unix::io::RawFd;
use std::process::Command;

fn spawn_pty_with_shell(default_shell: String) -> RawFd {
    match forkpty(None, None) {
        Ok(fork_pty_res) => {
            let stdout_fd = fork_pty_res.master; // primary
            if let ForkResult::Child = fork_pty_res.fork_result {
                // I'm the secondary part of the pty
                Command::new(&default_shell)
                    .spawn()
                    .expect("failed to spawn");
                std::thread::sleep(std::time::Duration::from_millis(2000));
                std::process::exit(0);
            }
            stdout_fd
        }
        Err(e) => {
            panic!("failed to fork {:?}", e);
        }
    }
}
forkpty is a libc function that forks the current process, starts a pty and places the child part of the fork on the secondary side of the pty. We use the excellent nix wrapper around the bare C function to deal with any unsafe code for us.
The code on the Ok side of the match statement runs in both the parent process (our main program) and the child process. We distinguish between the two by the ForkResult, and so in the child process, we run the default shell, sleep for 2 seconds to let it load and then exit. In the parent process, we return the stdout file descriptor to the program so that it can read what information the child sends it.
Now, let’s flesh out the read_from_fd function:
use nix::unistd::read;

fn read_from_fd(fd: RawFd) -> Option<Vec<u8>> {
    // https://linux.die.net/man/7/pipe
    let mut read_buffer = [0; 65536];
    let read_result = read(fd, &mut read_buffer);
    match read_result {
        Ok(bytes_read) => Some(read_buffer[..bytes_read].to_vec()),
        Err(_e) => None,
    }
}
Here we accept the file descriptor that we got from the spawn_pty_with_shell function above. We pass it to the read system call along with a mutable buffer. The call reads at most as many bytes as the buffer can hold (65536 in our case), places them in the buffer and returns the number of bytes read.
Assuming a successful read, we then allocate a vector from the filled portion of our read_buffer and return it to our main program.
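One detail worth keeping in mind: depending on the platform, the end of a stream can surface either as an error or as a successful read of zero bytes. The following is only a sketch of how both cases could be treated as "no more data". It is not part of the program above: to stay self-contained it uses the standard library's Read trait instead of the nix read call, and the names read_chunk and read_all are made up for this illustration.

```rust
use std::io::Read;

/// Reads one chunk from `source` into `buffer`.
/// Returns `None` once the stream has ended, whether that shows up as an
/// error or as a zero-length read (end-of-file).
fn read_chunk<R: Read>(source: &mut R, buffer: &mut [u8]) -> Option<Vec<u8>> {
    match source.read(buffer) {
        // A successful read of 0 bytes means there is nothing left to read.
        Ok(0) => None,
        // Copy only the filled portion of the buffer.
        Ok(bytes_read) => Some(buffer[..bytes_read].to_vec()),
        // For this sketch, any error is treated as the end of the stream.
        Err(_) => None,
    }
}

/// Drains `source` chunk by chunk, mirroring the loop in `main` above.
fn read_all<R: Read>(source: &mut R) -> Vec<u8> {
    // Deliberately tiny buffer so that several reads are needed.
    let mut scratch = [0u8; 4];
    let mut collected = Vec::new();
    while let Some(mut chunk) = read_chunk(source, &mut scratch) {
        collected.append(&mut chunk);
    }
    collected
}
```

Anything that implements Read can be passed in, for example a std::io::Cursor wrapping a byte vector, which makes the loop easy to try out without spawning a pty.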
Here’s the full source for this program
Running this program, we get the following output:
^[(B^[[m\r^[[KWelcome to fish, the friendly interactive shell\r\n^[]0;fish /home/aram^[[m^[[97m^[[46m⋊>^[[m^[[33m ~^[[m^[[K^[[67C^[[38;2;85;85;85m10:21:15^[[m\r^[[5C
Almost readable, right? This is the initial output of my default shell, fish.
It’s plain text with a smattering of those “ANSI escape codes” we talked about at the beginning of this section.
Let’s take a dive and see how we as a terminal emulator would interpret it. Note that the ^[ character is used here for convenience to denote an ASCII escape character.
This is a slightly redacted output. The actual output given by the fish shell includes some repetitions and instructions that I felt weren’t relevant to this post.
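To make the first step of that interpretation concrete, here is a minimal sketch of how the output could be split into plain text and instructions. It is an illustration under simplifying assumptions, not how any particular terminal emulator is implemented: the Token type and tokenize function are invented for this example, and only instructions of the form ESC [ parameters final-character (the "CSI" family, which covers most of the output above) are parsed in full. Sequences such as ^[(B or ^[]0;... carry extra bytes that this sketch does not consume.

```rust
/// One piece of shell output, as a terminal emulator might classify it.
#[derive(Debug, PartialEq)]
enum Token {
    /// Plain text to draw at the cursor position.
    Text(String),
    /// A CSI instruction: ESC [ <parameters> <final character>.
    Csi { params: String, action: char },
    /// Any other escape sequence; only the character after ESC is kept.
    OtherEscape(char),
}

/// Splits `input` into text runs and escape sequences.
fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut text = String::new();
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c != '\x1b' {
            text.push(c);
            continue;
        }
        // An escape character ends the current run of plain text.
        if !text.is_empty() {
            tokens.push(Token::Text(std::mem::take(&mut text)));
        }
        match chars.next() {
            Some('[') => {
                let mut params = String::new();
                for p in chars.by_ref() {
                    // A character in the range '@'..='~' ends a CSI sequence.
                    if ('@'..='~').contains(&p) {
                        tokens.push(Token::Csi { params, action: p });
                        break;
                    }
                    params.push(p);
                }
            }
            Some(other) => tokens.push(Token::OtherEscape(other)),
            None => {}
        }
    }
    if !text.is_empty() {
        tokens.push(Token::Text(text));
    }
    tokens
}
```

Feeding it the prompt portion of the output, "\x1b[97m\x1b[46m⋊>\x1b[m", gives two style instructions (97m selects a bright white foreground, 46m a cyan background), the text ⋊>, and a final m with no parameters that resets all styles.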
This is a pretty simple example, but any textual interface that runs in the terminal - regardless of which language it was written in - works in this way.
One can even send these instructions directly to one’s own terminal using the echo command. Go ahead and try this out:
echo -e "I am some \033[38;5;9mred text!"
“033” is the octal representation of the ASCII escape character.
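The same instruction can be produced from a program. Here is a small Rust sketch of that idea. The constant ESC and the function red are names made up for this example, and unlike the echo command it also appends a reset instruction so that later text is not colored.

```rust
/// The ASCII escape character. Octal 033, hexadecimal 1b and decimal 27
/// all denote the same byte.
const ESC: char = '\x1b';

/// Wraps `text` in the 256-color foreground instruction used by the echo
/// example (color 9), followed by an instruction that resets all styles.
fn red(text: &str) -> String {
    format!("{}[38;5;9m{}{}[m", ESC, text, ESC)
}
```

Printing the result, for instance with println!("I am some {}", red("red text!")), should show the same colored text as the echo command when run in a terminal that supports 256 colors.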
Just for fun, let’s look at a more involved example. Here’s a big lump of text that was meticulously prepared just for this purpose:
[1H[J[8;20H[38;2;167;216;255m!_[B[2D|*`~-.,[B[7D|.-~^`[2;51H!_[B[2D|*~=-.,[B[7D|_,-'`[14;12H!_[B[2D|*`--,[B[6D|.-'[16;34H!_[B[2D|*`~-.,[B[7D|_,-~`[14;43H!_[B[2D|*`~-.,[B[7D|_.-"`[11;20H[38;2;227;176;110m|[B[D|[5;51H|[B[D|[17;12H|[19;34H|[17;43H|[18;11H[38;2;190;97;107m/^\[B[4D/ \[B[6D/, \[B[8D/#" \[B[10D/##_ _ \[7;50H/^\[B[4D/ \[B[6D/, \[B[8D/#" \[B[10D/##_ _ \[18;42H[38;2;190;97;107m/^\[B[4D/ \[B[6D/, \[B[8D/#" \[B[10D/##_ _ \[22;25H[38;2;167;216;255m_____[B[8D0~{_ _ _}~0[B[9D| , |[B[7D| ((* |[B[7D| ` |[B[6D`-.-`[30;25H[38;2;227;176;110m_,--,_[B[7D/ | | \[B[8D| | | |[B[8D| | <&>[B[8D| | | |[B[8D| | | |[28;12H_[B[2D/+\[B[4D|+|+|[B[5D|+|+|[B[5D|+|+|[B[5D^^^^^[28;43H_[B[2D/+\[B[4D|+|+|[B[5D|+|+|[B[5D|+|+|[B[5D^^^^^[25;57H___[B[3D/+\[B[3D|+|[B[3D|+|[B[3D^^^[11;45H[38;2;112;117;121m_[3C_[3C_[3C_[B[14D[ ]_[ ]_[ ]_[ ][12;15H_[3C_[3C_[3C_[3C_[13;14H[ ]_[ ]_[ ]_[ ]_[ ] |_=_-=_ - =_|[14;15H|_=_ =-_-_ = =_|[14;46H|=_= - |[15;18H_- _ |[15;50H= [] |[16;16H|= [] |[16;49H_- |[17;16H|_=- - |[17;46H|=_- |[18;16H|=_= - |[18;46H|_ - =[] |[19;7H_[19;15H_|_=- _ _ _| _[19;37H_[19;46H|=_- |[20;6H[ ][20;16H[ ]_[ ]_[ ]_[ ]_[ ]_[ ]_[20;47H[ ]=- |[21;7H|[21;17H_=-___=__=__- =-_ -=_[21;48H| _ [] |[22;6H_[3C_[3C_[3C_-_ =[22;37H_[3C_[3C_[3C_ - |\[23;5H[ ]_[ ]_[ ]_[ ]=_[23;36H[ ]_[ ]_[ ]_[ ]=- | \[24;5H|_=__-_=-_ =_|-=_[24;36H|_=-___-_ =-__|_ | \[25;6H| _- =- |-_[25;37H|= _= | - |[3C\[26;6H|= -_= |= _[26;37H|_-=_ |=_ |[3C|[27;6H| =_ - |_ = _[27;37H| =_ = = |=_- |[3C|[28;6H|-_=-[28;18H|=_ = |=_= -[28;49H| = |[3C|[29;6H|=_-[29;18H| -= |_=-[29;49H|=_ |[3C|[30;6H|=_[30;18H|= - -[30;37H|_=[30;49H| -_ |= |[31;6H| -[31;18H|-_=[31;37H|=_[31;49H|-=_ |_-/[32;6H|=_=[32;18H| =_=[32;37H|_-[32;49H|_ = |=/[33;6H| _[33;18H|= -[33;37H|=_=[33;49H|_=- |/[34;6H|=_ = | =_-_[34;37H| =_ | -_ |[35;6H|_=-_ |=_=[35;37H|=_= |=- |[36;1H[38;2;163;189;141m^^^^^^^^^^`^`^^`^`^`^^^""""""""^`^^``^^`^^`^^`^`^``^`^``^``^^
Let’s see how the terminal emulator would interpret it:
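As a rough model of that process, the sketch below keeps a grid of characters and a cursor, and obeys the handful of instructions that appear in the text above: H (move the cursor to a row and column), A, B, C and D (move the cursor up, down, forward and back), and J (erase from the cursor to the end of the screen). It is a simplified illustration rather than a faithful terminal: the Screen type is invented for this example, colors (m) are ignored, text does not wrap, J ignores its parameter, and every instruction is expected to start with an escape character followed by [.

```rust
/// A tiny model of the screen: a grid of characters plus a cursor.
struct Screen {
    cells: Vec<Vec<char>>,
    row: usize, // zero-based cursor row
    col: usize, // zero-based cursor column
}

impl Screen {
    fn new(rows: usize, cols: usize) -> Self {
        Screen { cells: vec![vec![' '; cols]; rows], row: 0, col: 0 }
    }

    /// Interprets `input`, drawing text and obeying a few CSI instructions.
    fn feed(&mut self, input: &str) {
        let mut chars = input.chars();
        while let Some(c) = chars.next() {
            if c != '\x1b' {
                self.put(c);
                continue;
            }
            if chars.next() != Some('[') {
                continue; // not a CSI sequence; ignored in this sketch
            }
            let mut params = String::new();
            for p in chars.by_ref() {
                // A character in the range '@'..='~' ends the instruction.
                if ('@'..='~').contains(&p) {
                    self.apply(&params, p);
                    break;
                }
                params.push(p);
            }
        }
    }

    /// Writes one character at the cursor and advances it (no wrapping).
    fn put(&mut self, c: char) {
        if self.row < self.cells.len() && self.col < self.cells[self.row].len() {
            self.cells[self.row][self.col] = c;
            self.col += 1;
        }
    }

    fn apply(&mut self, params: &str, action: char) {
        // Missing, zero or unparsable parameters are treated as 1.
        let mut numbers = params
            .split(';')
            .map(|p| p.parse::<usize>().unwrap_or(1).max(1));
        let first = numbers.next().unwrap_or(1);
        let last_row = self.cells.len().saturating_sub(1);
        let last_col = self.cells.first().map_or(0, |r| r.len().saturating_sub(1));
        match action {
            // Rows and columns in the instruction are one-based.
            'H' => {
                let column = numbers.next().unwrap_or(1);
                self.row = (first - 1).min(last_row);
                self.col = (column - 1).min(last_col);
            }
            'A' => self.row = self.row.saturating_sub(first),
            'B' => self.row = (self.row + first).min(last_row),
            'C' => self.col = (self.col + first).min(last_col),
            'D' => self.col = self.col.saturating_sub(first),
            'J' => {
                let (row, col) = (self.row, self.col);
                for (r, line) in self.cells.iter_mut().enumerate().skip(row) {
                    let start = if r == row { col } else { 0 };
                    for cell in line.iter_mut().skip(start) {
                        *cell = ' ';
                    }
                }
            }
            _ => {} // colors and everything else are ignored here
        }
    }

    /// Returns the screen contents, one string per row, without trailing spaces.
    fn lines(&self) -> Vec<String> {
        self.cells
            .iter()
            .map(|line| line.iter().collect::<String>().trim_end().to_string())
            .collect()
    }
}
```

For example, on a 3 by 6 screen, feeding "\x1b[1H\x1b[J\x1b[2;3H!_\x1b[B\x1b[2D|*" clears the screen, writes !_ starting at row 2, column 3, then moves down one row and back two columns so that |* lands directly underneath it. This is the same down-and-back pattern the text above uses to stack the lines of each small drawing.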