CPU Scheduling — FIFO, SJF, STCF, RR
Four classic scheduling disciplines, their turnaround / response-time trade-offs, and where each breaks down.
What it is#
A CPU scheduler decides, at every dispatch point, which runnable process gets the CPU next. The four classic disciplines — FIFO (first-in, first-out), SJF (shortest job first), STCF (shortest time-to-completion first, the preemptive form of SJF), and RR (round-robin) — are the vocabulary every real scheduler is built on. MLFQ, CFS, and the various BSD priority queues are all hybrids and refinements of these four.
The disciplines optimise different metrics. Turnaround time is “when did the job finish minus when did it arrive” — a throughput-style measure. Response time is “when did the job first get the CPU minus when did it arrive” — an interactivity measure. No single discipline minimises both, and the trade-off between them is the recurring theme of every real-world scheduler.
When to use it#
You rarely deploy these in their pure form on a general-purpose OS — production schedulers are all adaptive. But you reach for the right pure discipline in two situations:
- Specialised systems. A batch cluster scheduler that knows job runtimes (because users submitted them) can run SJF and minimise mean turnaround. A real-time scheduler with hard deadlines uses earliest-deadline-first, which is the deadline generalisation of SJF. A simple bounded thread pool that processes tasks in submission order is FIFO.
- Interview reasoning. “Given workload X, which discipline minimises Y?” is a classic question. Knowing the four disciplines and their failure modes lets you reason from first principles instead of guessing.
A separate use for the vocabulary is diagnosing real scheduler behaviour. When Linux’s CFS feels unfair under a particular workload, the answer often reduces to “this workload looks like the case where naive SJF starves” or “this workload looks like the case where RR’s slice is too long.”
How it works#
FIFO (also called FCFS — first-come, first-served)#
The simplest discipline: jobs run in the order they arrived. No preemption. Easy to implement (one queue, push on the back, pop from the front). Optimal mean turnaround time only if all jobs are the same length — otherwise a single long job at the head (“the convoy effect”) makes every later short job wait.
Example: jobs A, B, C arrive at t=0 with runtimes 100, 10, 10. FIFO finishes at 100, 110, 120 → mean turnaround = 110. SJF on the same workload finishes B, C, A at 10, 20, 120 → mean turnaround = 50. Same total CPU work; very different mean turnaround.
SJF (shortest job first)#
Schedule the job with the smallest remaining runtime first. Non-preemptive (or assumed non-preemptive in the classic statement). Provably optimal for mean turnaround time when all jobs arrive at the same instant. The catch: you have to know each job’s length, and a long job that’s already running can’t be cut off when a short one arrives.
STCF (shortest time-to-completion first)#
The preemptive version. At every scheduling event (job arrival, job completion, timer tick) the scheduler re-picks the job with the smallest remaining runtime. If a long job is running and a short one arrives, the short one preempts. Optimal mean turnaround even when jobs arrive at different times.
Example: A arrives at t=0 with runtime 100, B and C arrive at t=10 each with runtime 10. STCF runs A for 10, switches to B (10), switches to C (10), resumes A (90). A finishes at 120, B at 20, C at 30 → mean turnaround = (20 + 30 + 120) / 3 ≈ 56.7. Non-preemptive SJF would have finished A first (100), then B (110), then C (120) → mean = 110.
RR (round-robin)#
Each runnable job gets a fixed time quantum (typically 10–100 ms historically; 1–10 ms on modern systems). At the end of the quantum the running job is preempted, moved to the back of the queue, and the next one runs. RR optimises mean response time — every job gets its first slice within N × quantum where N is the queue length. Turnaround is bad: every job pays the full sum of everyone else’s quanta.
The quantum is the design knob. Too short and switch overhead dominates; too long and RR degenerates toward FIFO and response time suffers. “A few times the cost of a context switch” is the right order of magnitude — long enough that overhead is bounded, short enough that interactive jobs feel snappy.
A worked comparison#
Jobs A, B, C of length 5, 5, 5 arriving at t=0.
FIFO : A A A A A B B B B B C C C C C turnaround: 5, 10, 15 → mean 10 response : 0, 5, 10 → mean 5
RR(q=1): A B C A B C A B C A B C A B C turnaround: 13, 14, 15 → mean 14 response : 0, 1, 2 → mean 1Same workload, very different metrics. The choice is which metric matters more.
Variants#
SJF / STCF without runtime oracles#
You usually don’t know how long a job will take. The practical workarounds:
- Estimate from history. An EWMA over previous bursts of the same process predicts the next CPU burst. Used in the BSD scheduler family.
- Discover by observation. Run every job for a small slice, observe whether it blocks (I/O-bound, short burst) or uses its slice (CPU-bound, long burst), demote CPU-bound jobs to a lower priority. This is exactly what MLFQ does.
RR variants#
- Weighted RR. Each job’s quantum is proportional to a weight (priority, share). The simplest form of fair-share scheduling.
- Hierarchical RR. Groups (cgroups, classes) round-robin between each other, and within each group jobs round-robin between themselves.
- Deficit RR. Each job has a “deficit” counter; it gets to run until the deficit is exhausted, then refills. Handles non-uniform service costs (variable-length packets in network scheduling).
Priority-based extensions#
Multilevel queues stack RR (or FIFO) at each priority level. Strict priority means a higher-priority queue empties before a lower-priority queue runs at all. The risk is starvation of low-priority jobs; the standard mitigation is priority boost (MLFQ) or fair-share (CFS, lottery).
Real-time disciplines#
- Rate Monotonic (RM). Static priorities assigned by period — shorter period, higher priority. Optimal among static-priority schedulers for periodic tasks.
- Earliest Deadline First (EDF). Dynamic priority by deadline. Optimal among dynamic-priority schedulers; equivalent to STCF for deadline-driven workloads.
Trade-offs#
Specific trade-offs to keep in mind:
- Knowledge requirement. SJF needs job lengths; STCF needs them online; RR needs nothing. The less the scheduler needs to know, the more robust it is to unfamiliar workloads.
- Preemption overhead vs. fairness. Short quanta improve fairness and response but multiply switch overhead. Modern CFS solves this by using a virtual-runtime ordering instead of fixed quanta — see the lottery / CFS page.
- Fairness vs. throughput. A perfectly fair scheduler hurts throughput on heterogeneous workloads (an I/O-bound and a CPU-bound task get equal CPU, but the I/O-bound one is wasting its share waiting). Real schedulers privilege interactive (I/O-bound) tasks.
Common pitfalls#
- Reasoning about scheduling without naming the metric. “Is RR better than FIFO?” has no answer until you say “for what?”. For mean response, RR; for mean turnaround on equal-length jobs, FIFO; for mean turnaround on mixed jobs, neither — SJF.
- Forgetting starvation. Pure SJF / STCF starves long jobs in workloads with a steady stream of short arrivals. Pure priority queues starve low-priority jobs. Every production scheduler has anti-starvation machinery (aging, boost, lottery, vruntime).
- Treating the quantum as cost-free. A 1 ms quantum on a system with 5 microsecond context switches costs 0.5% of CPU just to switching. A 100 microsecond quantum costs 5%. The quantum interacts with the switch cost, and the switch cost has changed over decades.
- Assuming I/O time is free. Jobs that block on I/O aren’t using the CPU but they’re not “running” either — they’re in the BLOCKED state and the scheduler picks from READY. Schedulers that treat I/O-bound tasks naively (as if they were CPU-hungry) hurt interactive workloads.
- Confusing turnaround with throughput. They’re related but not identical. Throughput is jobs-per-second; turnaround is per-job wall time. SJF improves both for mixed workloads; RR can improve neither (response time isn’t throughput).
Why doesn't Linux use STCF given it's provably optimal?
Because the proof assumes you know remaining runtimes, which you almost never do on a general-purpose system. The CFS approach — give every runnable task an equal share of CPU weighted by nice — sidesteps the prediction problem entirely. It’s not optimal for turnaround on any specific workload, but it’s robust across all workloads, and robustness beats peak optimality when you’re shipping to billions of unknown machines.
Related building blocks#
- The Process Abstraction — what’s being scheduled.
- Context Switching — the cost paid every preemption.
- Multi-Level Feedback Queue (MLFQ) — the practical refinement of these four.
- Lottery Scheduling and Linux CFS — the fair-share family that replaced naive RR.
- Multi-CPU Scheduling — what changes when there’s more than one CPU.