What Is an Operating System?
The three jobs of an OS — virtualize hardware, manage concurrency, persist state — and why each one matters.
Summary
An operating system is the layer of software that sits between user programs and the bare hardware, doing three jobs and only three jobs: it virtualizes the CPU, memory, and devices so each program sees its own private machine; it manages concurrency so multiple programs can share that machine without corrupting each other’s state; and it persists data through file systems and storage stacks that survive crashes. Everything else a kernel does — scheduling, paging, file caching, signals, sockets — is a specific mechanism in service of one of those three jobs.
The reason this framing matters for an interview is that almost any OS question reduces to “which of the three jobs is this about, and which mechanism is in play?” Get the framing right and you can recover the details from first principles instead of memorising them.
Why it matters
Without the OS, every program would have to know how to drive the disk controller, lay out physical memory, and yield the CPU politely to other programs. That world existed — early mainframes ran one program at a time, and that program owned everything. The OS exists because the alternative is unworkable: programs would step on each other’s memory, hog the CPU forever, and crash the machine on the smallest bug.
The job isn’t free. Every system call crosses a privilege boundary, every page fault is a trap into the kernel, and every context switch costs cache and TLB locality: the incoming process finds the caches warmed for someone else, and on hardware without tagged TLBs its translations are flushed outright. A modern OS spends real cycles to provide its abstractions. The design tension that recurs across every topic is safety and isolation vs. raw performance, and the answer is almost always “pay a little for safety, claw back the performance with caching and clever data structures.”
How it works
Job one: virtualize hardware
The CPU is one physical resource; the OS turns it into the appearance of many. Each process gets its own private execution context (registers, stack, program counter), and the kernel switches between them on a timer interrupt or whenever the running process blocks. From the program’s point of view it has the CPU to itself; in reality it gets some fraction of wall-clock time. The same trick applies to memory (every process sees a flat virtual address space; the MMU maps it onto physical pages) and to devices (each process opens its own file descriptor; the kernel multiplexes the underlying device).
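The memory half of that illusion can be observed from user space. The following is a minimal sketch, assuming a Unix-like system where os.fork is available; the function name demo_private_address_space and the value 100 are illustrative choices, not part of any real API. After the fork, parent and child hold the same variable at the same virtual address, yet a write in one is invisible to the other.

```python
import os


def demo_private_address_space():
    """Fork a child, let it modify a variable, and report both views."""
    counter = [0]  # lives in this process's virtual address space

    # A pipe is kernel-managed shared state: the only way the child
    # can tell the parent what it saw.
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: same virtual addresses, but its own private copy of the data.
        os.close(read_fd)
        counter[0] += 100
        os.write(write_fd, str(counter[0]).encode())
        os.close(write_fd)
        os._exit(0)  # leave immediately without running the parent's cleanup

    # Parent: the child's write never touched this copy.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return counter[0], child_value


if __name__ == "__main__":
    parent_value, child_value = demo_private_address_space()
    print(f"parent sees {parent_value}, child saw {child_value}")
```

The only channel between the two processes here is the pipe, which the kernel owns. That is the general shape of isolation: private by default, shared only through an interface the OS mediates.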
Job two: manage concurrency
Once multiple processes (or threads within a process) share state, you need primitives that make access atomic. The kernel itself is the worst offender — interrupt handlers, kernel threads, and the scheduler all touch shared data. Locks, condition variables, semaphores, RCU, lockless atomics — the OS both provides these to user space (via pthreads and friends) and uses them internally to keep its own data structures consistent.
The bugs that come from getting concurrency wrong (deadlock, race conditions, priority inversion) are some of the hardest in the field, which is why a working understanding of synchronization is what separates “uses threads” from “ships correct threaded code.”
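As a small illustration of the simplest of those primitives, here is a sketch of a shared counter protected by a mutex, using Python's threading module. The function name count_with_lock and its parameters are hypothetical. The increment is a read-modify-write sequence, so without the lock two threads could read the same old value and one update would be lost.

```python
import threading


def count_with_lock(num_threads=4, increments_per_thread=10_000):
    """Increment one shared counter from several threads under a mutex."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(increments_per_thread):
            # Critical section: the read, add, and store on `total`
            # happen while no other thread can hold the lock.
            with lock:
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total


if __name__ == "__main__":
    print(count_with_lock())
```

The lock trades throughput for correctness, which is the same safety-versus-performance tension the kernel faces with its own data structures.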
Job three: persist state
RAM forgets on power loss; disks remember. The OS bridges the gap with a file system that turns named directory hierarchies and byte streams into block-level reads and writes against the device. On top of that, it has to handle crashes — a partial write must not leave the file system inconsistent, which is why journaling, soft updates, and copy-on-write file systems all exist.
The persistence stack is layered: device driver → block layer → file system → page cache → system call. Each layer has a clear contract; getting the contracts wrong is how data is lost.
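Applications face the same crash-consistency problem one level up, and a common user-space answer mirrors the copy-on-write idea: write a complete new copy, force it to the device, then switch names in a single step. The sketch below assumes a POSIX system, where os.replace maps to an atomic rename and a directory can be opened and fsynced; atomic_write is a hypothetical helper name, not a standard library function.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` so that a reader sees
    either the old contents or the new contents, never a mixture."""
    directory = os.path.dirname(os.path.abspath(path))

    # The temp file must live in the same directory (same file system)
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # user-space buffer -> page cache
            os.fsync(tmp.fileno())  # page cache -> storage device
        os.replace(tmp_path, path)  # atomic switch from old name to new data
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Flush the directory entry too, so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "settings.txt")
        atomic_write(target, b"version=1\n")
        atomic_write(target, b"version=2\n")
        with open(target, "rb") as f:
            print(f.read())
```

Each call in the sketch leans on a contract from a different layer: flush hands data to the page cache, fsync pushes it through the block layer to the device, and the rename relies on the file system's atomicity guarantee.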
The mechanisms that make all three possible
Hardware provides the building blocks: privilege levels (user mode vs. kernel mode), traps and interrupts, an MMU for address translation, a programmable timer for preemption, DMA controllers for asynchronous I/O. The OS is the policy layer that uses these to enforce isolation, schedule work, and route I/O. The mechanism / policy split is the second mental model worth keeping: paging is a mechanism, LRU page replacement is a policy; the trap instruction is a mechanism, the choice of which system call number maps to read is a policy.
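The split between mechanism and policy can be made concrete with a toy page-replacement simulator. This is a sketch, not how any real kernel implements paging: count_page_faults and its policy parameter are hypothetical names, and the reference string is a commonly used textbook example. The loop that detects faults and evicts a page plays the role of the mechanism; the single line that decides whether a hit refreshes a page's position is the policy.

```python
from collections import OrderedDict


def count_page_faults(references, num_frames, policy="lru"):
    """Simulate a fixed number of page frames and count the faults.

    `frames` is ordered from next eviction victim (front) to the page
    that will be evicted last (back).
    """
    if policy not in ("lru", "fifo"):
        raise ValueError(f"unknown policy: {policy}")

    frames = OrderedDict()
    faults = 0
    for page in references:
        if page in frames:
            # Policy: LRU refreshes a page on every hit, FIFO ignores hits.
            if policy == "lru":
                frames.move_to_end(page)
            continue
        # Mechanism: a miss is a fault; make room if needed, then load.
        faults += 1
        if len(frames) >= num_frames:
            frames.popitem(last=False)  # evict the page at the front
        frames[page] = None
    return faults


if __name__ == "__main__":
    refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
    print("LRU faults: ", count_page_faults(refs, 3, "lru"))   # 12
    print("FIFO faults:", count_page_faults(refs, 3, "fifo"))  # 15
```

Swapping the policy changes the fault count from 12 to 15 on this reference string without touching the fault-handling loop, which is the property the split is meant to buy.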
Variants and trade-offs
The other recurring trade-off is generality vs. specialisation. A general-purpose OS (Linux, macOS) is asked to run text editors, databases, games, and embedded controllers — its scheduler and memory manager must be reasonable across all of them. A specialised OS (a real-time kernel for an airbag controller, a hypervisor for cloud VMs, a unikernel for one network function) discards generality and gets dramatically better behaviour on its target workload.
Why hybrid kernels are mostly marketing
Windows NT and XNU (macOS / iOS) are sometimes called “hybrid” because they have microkernel-flavoured IPC inside a monolithic-feeling kernel. In practice both run their major subsystems in kernel mode for performance, which makes them monolithic with extra steps. The interesting work on minimal kernels today is in seL4 (a formally verified microkernel) and the unikernel space (MirageOS, Unikraft), not in mainstream desktop kernels.
A third axis worth noting: OS for one machine vs. OS for many. The classic OS literature assumes one host, but today’s “operating system” frequently means a cluster scheduler (Kubernetes, Borg, Mesos) layered on top of per-node Linuxes. The same three jobs reappear — virtualize the cluster CPU, manage concurrent jobs, persist their state — just at a higher altitude.
When this is asked in interviews
Almost always as the warm-up. “What does an operating system do?” is a question the interviewer expects to dispatch in two minutes, and the strong answer is the three-jobs framing followed by a concrete example of each. The weak answer is a list — “scheduler, file system, drivers, virtual memory” — without the unifying structure.
The follow-ups divide along seniority:
- “Walk me through what happens when I run cat file.txt.” Tests whether you can stitch together a system call, the VFS lookup, a disk read, the page cache, and the write to a terminal device. Foundational role.
- “What does the kernel actually do during a context switch?” Tests CPU virtualization depth. Mid-level role.
- “Why is fork+exec the UNIX pattern and not a single spawn call?” Tests historical fluency plus the ability to reason about composition. Senior role.
- “When would you reach for a microkernel over a monolithic one?” Tests architectural judgement. Staff and above.
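For the first of those follow-ups, it helps to have the user-space side of cat in your head before describing the kernel side. The sketch below is a simplified stand-in, not the real cat source: it uses Python's thin wrappers over open, read, write, and close, and the function name, the out_fd parameter, and the 4096-byte chunk size are assumptions made for illustration.

```python
import os
import tempfile


def cat(path, out_fd=1, chunk_size=4096):
    """Copy a file to `out_fd` (1 is standard output); return bytes written."""
    # open(2): the kernel resolves the path through the VFS and
    # returns a file descriptor.
    fd = os.open(path, os.O_RDONLY)
    total = 0
    try:
        while True:
            # read(2): served from the page cache on a hit, from the
            # device on a miss.
            chunk = os.read(fd, chunk_size)
            if not chunk:  # an empty read means end of file
                break
            view = memoryview(chunk)
            # write(2) may accept fewer bytes than offered, so loop.
            while len(view) > 0:
                written = os.write(out_fd, view)
                view = view[written:]
                total += written
    finally:
        os.close(fd)  # close(2): release the descriptor
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        sample = os.path.join(workdir, "file.txt")
        with open(sample, "wb") as f:
            f.write(b"hello from the page cache\n")
        cat(sample)
```

Each commented call is one trap into the kernel, so the sketch doubles as an outline for the answer: path lookup at open, page cache and block layer at read, terminal or pipe driver at write.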
A second context where this question appears is in systems infrastructure interviews — kernel teams, hypervisor teams, database storage engineers. There the bar is higher: you’re expected to talk about specific kernel subsystems (epoll, io_uring, the page cache, the block layer) and to have opinions about trade-offs the textbooks gloss over.
Related concepts