Design a Smart Parking Agent — Agentic

Scenario

You’re handed an existing smart parking system at an urban garage:

Sensors at every spot — magnetic / IR / camera-based; they emit occupied | empty events with timestamps.
A ticketing / payment system — issues entry tickets, charges at exit, processes monthly passes.
A mobile app — currently lets users find the garage, check availability totals, and pay.
A backend API — exposes sensor state, ticket state, and historical occupancy.

The product team wants to evolve this into a proactive, intent-aware parking agent. The vision: a driver opens the app and says “I need to park near the convention centre at 2pm for three hours,” and the agent handles the rest: reserving a spot, guiding the driver in, helping them find their car later, warning them when their session is about to expire, and recommending whether to extend or move.

Your job: design the agent system. Walk through the architecture and the decisions you’d defend on a whiteboard.

Constraints

What’s fixed:

The physical sensors and the payment system can’t be replaced — too much CapEx already sunk. The agent has to work with their existing event streams and APIs.
Privacy is non-negotiable. Camera-based sensors must not leak license plates or personal data to the LLM. Whatever the agent sees, it has to be pre-redacted.
The mobile app team has bandwidth for one new screen. Anything beyond a chat-style input and a result view needs a phased plan.
Cost ceiling: the team can spend roughly $0.50 per parking session on agent inference. Above that, the unit economics break.
Latency — the user expects a reply in under 3 seconds for “find me a spot now” requests. Longer is okay for non-urgent flows.

What’s variable:

Which LLM(s) to use. Choice of single-agent vs multi-agent. Memory layer. Tool surface. Whether to use a hosted model or run on-device for some flows.
Whether to add new sensors or APIs over time.
The exact UX shape, within the “one new screen” budget.

What’s wishful (don’t assume):

Real-time intent inference from camera feeds — privacy alone rules this out.
Perfect occupancy prediction — sensors fail, drivers don’t always end sessions cleanly.
That the user always asks clearly. They will not.

Approach

A reasonable architecture, in broad strokes:

               ┌─────────────────────────────────────┐
               │             Mobile app              │
               │     (existing + one new screen)     │
               └──────────────────┬──────────────────┘
                                  │
                             ┌────▼────┐
                             │  Intent │  ← classifies user request
                             │  Router │      into a flow
                             └────┬────┘
                                  │
           ┌──────────────────────┼─────────────────────────┐
           ▼                      ▼                         ▼
   ┌───────────────┐    ┌──────────────────┐    ┌──────────────────────┐
   │ Reservation   │    │ In-session help  │    │ Session Management   │
   │ agent (ReAct) │    │ agent (ReAct)    │    │ agent (event-driven) │
   └───────┬───────┘    └─────────┬────────┘    └───────────┬──────────┘
           │                      │                         │
           └──────────────────────┼─────────────────────────┘
                                  ▼
                       ┌─────────────────────┐
                       │ Tool surface        │
                       │ (read-only + write) │
                       └──────────┬──────────┘
                                  ▼
          ┌──────────────────────────────────────────────┐
          │ Existing systems (sensors, tickets, payment) │
          └──────────────────────────────────────────────┘

Three sub-agents because the flows are genuinely different:

Reservation agent. “I want to park near X at Y.” Synchronous; latency-bounded; needs spot search, availability prediction, hold/reserve.
In-session help agent. “Where is my car?” “How long do I have left?” “Can I extend?” Synchronous, with a tighter latency budget; needs ticket lookup and location recall.
Session-management agent. Background; event-driven. Fires when the session is nearing expiry, when overstays are detected, when an unusual exit pattern shows up. Asynchronous; user-facing only via push notification.
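
As a rough illustration of the event-driven flow, the sketch below decides whether a background tick should push a notification. The `Session` fields, the 15-minute warning window, and the 5-minute overstay grace period are all assumptions made for the example; the scenario does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical thresholds; tune against real session data.
WARN_BEFORE = timedelta(minutes=15)
OVERSTAY_GRACE = timedelta(minutes=5)


@dataclass
class Session:
    session_id: str
    expires_at: datetime
    spot_occupied: bool   # latest sensor state for the session's spot
    warned: bool = False  # True once an expiry warning has been pushed


def next_notification(session: Session, now: datetime) -> Optional[str]:
    """Return the kind of push to send, or None to stay quiet."""
    # Overstay: the spot is still occupied after expiry plus the grace period.
    if session.spot_occupied and now > session.expires_at + OVERSTAY_GRACE:
        return "overstay"
    # Expiry warning: sent at most once, inside the warning window.
    remaining = session.expires_at - now
    if not session.warned and timedelta(0) <= remaining <= WARN_BEFORE:
        return "expiring_soon"
    return None
```

Returning `None` by default keeps the agent quiet unless a rule fires, which matters because this flow interrupts the user rather than answering them.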

A single shared intent router classifies each user input and dispatches it to the right sub-agent. The router is cheap (small model, simple prompt); the sub-agents use larger models.
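
A minimal sketch of the dispatch step, assuming the router returns a flow label plus a confidence score. The keyword-based `classify` function is only a stand-in for the small routing model, and the flow names, confidence threshold, and handler signatures are hypothetical. This sketch routes only the two user-initiated flows; the session-management agent is assumed to be triggered by events instead.

```python
from typing import Callable, Dict, Tuple

# Hypothetical flow labels for the user-initiated flows plus a fallback.
RESERVATION = "reservation"
IN_SESSION = "in_session_help"
HUMAN_SUPPORT = "human_support"


def classify(text: str) -> Tuple[str, float]:
    """Stand-in for the small routing model: returns (flow, confidence)."""
    t = text.lower()
    if any(k in t for k in ("where is my car", "extend", "how long")):
        return IN_SESSION, 0.9
    if any(k in t for k in ("park", "spot", "reserve")):
        return RESERVATION, 0.9
    return HUMAN_SUPPORT, 0.0


def route(text: str,
          handlers: Dict[str, Callable[[str], str]],
          min_confidence: float = 0.6) -> str:
    """Dispatch to a sub-agent, or to human support when unsure."""
    flow, confidence = classify(text)
    if confidence < min_confidence or flow not in handlers:
        flow = HUMAN_SUPPORT
    return handlers[flow](text)
```

The low-confidence branch is the important part: an unclassifiable request goes to the human-support flow instead of being forced into a sub-agent.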

Design decisions to make

The decisions a good design conversation surfaces:

Single agent vs multi-agent. Arguments for single: simpler, one prompt, no router. Arguments for multi: different latency budgets, different tool surfaces, different system prompts. Lean multi-agent here — the flows are genuinely distinct.
What’s the memory layer? Session-scoped working memory is mandatory (the agent needs to remember what the user asked five seconds ago). Long-term per-user memory (“user prefers covered spots”) is an upsell — defer to v2.
What’s the tool surface? Roughly: find_available_spots, reserve_spot, lookup_active_session, extend_session, recall_car_location, send_notification, predict_availability_at_time. Each one wraps existing-system API calls. The model never touches sensors directly.
Latency strategy. Streaming the response (emitting tokens before the full answer is ready) improves perceived latency. Pre-computing the top-K spots near common destinations gives the reservation agent a cheap fast path. A small, fast model for intent routing lets the app acknowledge the request quickly, on the order of 100 ms, which leaves most of the 3-second budget for the sub-agent.
Eval harness. A frozen test set of real user requests (or realistic synthetic ones) graded against expected end-states: “did the agent find a spot?”, “did it understand the constraint?”, “did it surface the right warning?”. Without this you can’t tell whether prompt changes are improvements.
Privacy and PII handling. A pre-call validator strips license plates, names, and phone numbers from any data the agent sees. A post-call validator scans the agent’s output for PII and blocks it before sending. Treat every model call as potentially adversarial.
Failure-mode design. When the intent router can’t classify, hand off to a human-support flow. When the reservation agent can’t find a spot, fall back to listing all available spots (the old experience). Agents fail; the system shouldn’t.
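
The pre-call and post-call validators from the privacy item above could be sketched as follows. The regular expressions are illustrative assumptions only: real plate and phone formats vary by region, the phone pattern will also catch other long digit runs, and names are not handled here (they would more likely be dropped as structured fields before any text is built). `guarded_call` and `BLOCKED_REPLY` are hypothetical names.

```python
import re
from typing import Callable

# Illustrative patterns only; not a complete PII detector.
PLATE = re.compile(r"\b[A-Z]{2,3}[- ]?\d{3,4}\b")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

BLOCKED_REPLY = "Sorry, I can't share that."


def redact(text: str) -> str:
    """Pre-call validator: mask plates and phone numbers."""
    text = PLATE.sub("[PLATE]", text)
    return PHONE.sub("[PHONE]", text)


def contains_pii(text: str) -> bool:
    """Post-call validator: detect anything that still looks like PII."""
    return bool(PLATE.search(text) or PHONE.search(text))


def guarded_call(model: Callable[[str], str], raw_context: str) -> str:
    """Wrap a model call so it sees only redacted input and cannot emit PII."""
    reply = model(redact(raw_context))
    return BLOCKED_REPLY if contains_pii(reply) else reply
```

Running the output check even though the input was already redacted follows the "potentially adversarial" stance: the model is not trusted to be PII-free just because its context was.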

Trade-offs to discuss

Common interviewer probes and the trade-offs to surface:

On-device intent routing. Cheaper, faster, more private. Costs: smaller model means worse routing on edge cases; harder to update; one more thing to ship in the app.

Hosted intent routing. More capable, easier to update. Costs: network round-trip, per-call cost, harder privacy story.

Other axes:

Tools that write vs tools that read. Reservations are write actions: the agent commits the user to a billable session. Require user confirmation before executing them; never let the model auto-reserve without user assent. Read-only tools (availability lookup) need no such gate.
Synchronous vs background agents. The session-management agent runs in the background; its actions are pushed to the user, not pulled. That’s a different UX contract — the user is being interrupted, not interacted with. Restraint matters.
Per-spot prediction vs aggregate prediction. Predicting whether a specific spot will be free in 20 minutes is hard; predicting the aggregate free count for the garage is much easier. The reservation flow only needs “yes/no, will there be a spot?” — aggregate is enough. Don’t over-engineer.
One LLM for everything vs per-flow models. The reservation agent might need a more capable model; the in-session help agent could run on a small one. A cost-optimised design tiers the models.
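
To make the model-tiering argument concrete, here is a back-of-the-envelope cost check against the $0.50 ceiling. Every number in `profiles` (call counts, token counts, and per-million-token prices) is a made-up placeholder, not a quote from any provider; only the budget comes from the constraints.

```python
from dataclasses import dataclass
from typing import List

BUDGET_PER_SESSION_USD = 0.50  # from the constraints


@dataclass
class CallProfile:
    name: str
    calls_per_session: int
    tokens_in: int           # prompt tokens per call
    tokens_out: int          # completion tokens per call
    usd_per_mtok_in: float   # hypothetical price per million input tokens
    usd_per_mtok_out: float  # hypothetical price per million output tokens

    def cost(self) -> float:
        per_call = (self.tokens_in * self.usd_per_mtok_in
                    + self.tokens_out * self.usd_per_mtok_out) / 1_000_000
        return per_call * self.calls_per_session


def session_cost(profiles: List[CallProfile]) -> float:
    return sum(p.cost() for p in profiles)


# Placeholder tiering: small model for routing and help, larger for reservation.
profiles = [
    CallProfile("router", 6, 300, 10, 0.10, 0.40),
    CallProfile("reservation", 4, 3000, 400, 3.00, 15.00),
    CallProfile("in_session_help", 4, 1500, 200, 0.50, 2.00),
]
total = session_cost(profiles)
print(f"${total:.4f} per session, within budget: {total <= BUDGET_PER_SESSION_USD}")
```

With these placeholder figures the reservation agent dominates the total, which is the usual shape: the tier you choose for the most capable flow decides whether the budget holds.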

Evaluation criteria

A passing answer:

Has a clear architecture — components named, data flow visible, sub-agents (if used) have distinct responsibilities.
Names specific tools — not “the agent talks to the system” but “the agent calls find_available_spots(lat, lng, time_window)”.
Handles failure — what happens when sensors lie, when the model hallucinates a spot, when the user’s intent is ambiguous.
Respects the constraints — privacy, latency, cost. The interviewer will probe each.
Includes evaluation — how do you know the agent works? Don’t punt this question.
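
One possible shape for the evaluation piece is a frozen set of requests graded against expected end-states. In this sketch the `Case` structure, the end-state keys, and the agent's calling convention (request string in, dict of end-state fields out) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Case:
    request: str
    expected: Dict[str, Any]  # end-state fields that must match


def grade(agent: Callable[[str], Dict[str, Any]],
          cases: List[Case]) -> Dict[str, Any]:
    """Run every case and report the pass rate plus the failing requests."""
    if not cases:
        raise ValueError("empty test set")
    failures = []
    for case in cases:
        actual = agent(case.request)
        # A case passes only if every expected field matches exactly.
        if not all(actual.get(k) == v for k, v in case.expected.items()):
            failures.append(case.request)
    passed = len(cases) - len(failures)
    return {"pass_rate": passed / len(cases), "failures": failures}
```

Running the same frozen cases before and after a prompt change gives a comparable pass rate, and the list of failing requests shows where a regression came from.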

A strong answer adds:

A phased rollout (v1: read-only chat, v2: reservation flow, v3: background management) instead of trying to ship everything at once.
Observability — every agent call traced, every action logged, every reservation auditable.
A failure-mode catalogue — “what if the model says the spot is at row 3 but it’s actually at row 5? Who eats the user’s frustration?”
Acknowledgement of the cost cap — if the design’s per-session cost is over $0.50, the design is invalid.

The one design decision interviewers keep coming back to

The reservation action. Should the agent automatically commit a reservation when the user says “I need to park near the convention centre at 2pm”? Most candidates’ first instinct is yes, because that seems to be the point of the agent. Most experienced reviewers will push back: a reservation is a write, it costs the user money, and the agent will sometimes misread the request (assume at best around 95% accuracy in 2026). The right answer is “the agent proposes, the user confirms.” This pattern, in which the agent proposes and a human confirms the write, generalises to most consequential-action agent designs. Internalise it.
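
The propose-then-confirm pattern can be enforced in the tool layer rather than left to the prompt. The sketch below is one way to do that, assuming each tool is registered with a `writes` flag; `Tool`, `Proposal`, and `ToolGate` are hypothetical names, and the tool functions would wrap the existing-system APIs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    fn: Callable[..., Any]
    writes: bool  # True if the call changes state or bills the user


@dataclass
class Proposal:
    tool: str
    args: Dict[str, Any]


class ToolGate:
    """Runs read-only tools directly; turns write calls into proposals."""

    def __init__(self, tools: Dict[str, Tool]) -> None:
        self.tools = tools

    def call(self, name: str, **args: Any) -> Any:
        tool = self.tools[name]
        if tool.writes:
            # Nothing is executed yet; the app shows this to the user.
            return Proposal(name, args)
        return tool.fn(**args)

    def confirm(self, proposal: Proposal) -> Any:
        """Execute a write only after the user has accepted the proposal."""
        return self.tools[proposal.tool].fn(**proposal.args)
```

Because the model can only reach `call`, a mistaken or hallucinated `reserve_spot` request produces a proposal on screen, not a charge.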