Setting Up and Grounding an Agent — Agentic

Goal

This writeup gets you from an empty directory to a running agent that does something useful in your environment — read a file, hit an API, write a result. It’s the smallest workable hello-world loop with the Agent Development Kit (ADK), and it covers the boring-but-load-bearing parts that the ADK quickstart skips: where credentials go, how tools register, how the system prompt gets organised, and how to test the loop is alive before you spend any time on capability.

By the end you should have a single-agent ADK project that runs locally, talks to a model, calls one tool, and exits cleanly. From there, every other implementation in this topic (Eureka reward loop, multimodal web agent, loop control) is an extension of the same scaffold.

Prerequisites

Before you start:

Python 3.11+. ADK targets recent Python and uses Annotated types and pydantic v2 heavily. Older versions may install, but you’ll fight typing edge cases.
A package manager. pip works; uv is faster. Examples below use uv.
Model credentials. At least one of:
- Google Gemini API key (set as GOOGLE_API_KEY). The most thoroughly tested provider with ADK.
- OpenAI API key (OPENAI_API_KEY) — works via ADK’s LiteLLM adapter.
- Anthropic API key (ANTHROPIC_API_KEY) — same, via LiteLLM.
Read Agent Development Kit Overview. This writeup assumes you know what the ADK primitives are.
Read What Is an AI Agent? and The ReAct Loop for conceptual grounding.

Step-by-step

1. Project skeleton

Set up the directory:

mkdir hello-agent && cd hello-agent
uv init
uv add google-adk python-dotenv
mkdir -p tools prompts eval
touch agent.py main.py .env tools/__init__.py

Your directory now looks like:

hello-agent/
├── pyproject.toml
├── .env                  (gitignored)
├── agent.py
├── main.py
├── tools/
│   └── __init__.py
├── prompts/
└── eval/

2. Environment variables

Put credentials in .env — never inline in code:

GOOGLE_API_KEY=AIzaSy...
# Optional: pick one of these if you're not using Gemini
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

Add .env to .gitignore immediately. A committed credential is among the most common security incidents in agent projects.

Load them in main.py:

from dotenv import load_dotenv
load_dotenv()

# only after load_dotenv()
from agent import build_agent
from google.adk.runners import Runner

The import order matters — ADK reads env vars at import time for some providers, so loading the .env first avoids the “works locally, broken in CI” class of bug.

3. Define your first tool

Tools are Python functions with typed signatures and docstrings. Both reach the model: types become the JSON schema, docstrings become the human-readable description. This one lives in tools/clock.py:

from datetime import datetime, timezone

def current_utc_time() -> str:
    """Return the current UTC time as an ISO 8601 string.

    Call this when the user asks for the current time, or when a
    downstream step needs a fresh timestamp.
    """
    return datetime.now(timezone.utc).isoformat()

Three rules:

Name the function for the verb. current_utc_time says what it does. get_time_v2 says nothing.
The docstring is for the model. Write it like you’d write a function description in a library doc. Include when to call it, not just what it does.
Return types should be simple. Strings, dicts, lists. The return value is serialised before it reaches the model anyway, so complex objects only cause confusion.
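To see what the framework has to work with, you can introspect a tool the same way a schema generator would. The sketch below uses only the standard library; describe_tool is a hypothetical helper, and ADK’s real schema generation is more involved than this.

```python
import inspect
from datetime import datetime, timezone
from typing import get_type_hints

def current_utc_time() -> str:
    """Return the current UTC time as an ISO 8601 string.

    Call this when the user asks for the current time.
    """
    return datetime.now(timezone.utc).isoformat()

def describe_tool(fn) -> dict:
    """Collect the name, parameter types, and docstring a model would see."""
    hints = get_type_hints(fn)
    params = {
        name: getattr(hints.get(name), "__name__", str(hints.get(name)))
        for name in inspect.signature(fn).parameters
    }
    return {
        "name": fn.__name__,
        "parameters": params,
        "returns": getattr(hints.get("return"), "__name__", None),
        "description": inspect.getdoc(fn) or "",
    }

print(describe_tool(current_utc_time))
```

If the printed description would not tell a new colleague when to call the function, it will not tell the model either.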

4. Write the system prompt

Keep it in a file, not inline; here that file is prompts/system.md. The reason: prompts evolve, and you want diff-able history.

You are a small utility agent. You answer the user's question using
the tools available to you. Rules:

1. If the user's question requires the current time or a fresh
   timestamp, call `current_utc_time`. Do not guess the time from
   conversational context.
2. If the user's question can be answered from your own knowledge
   without a tool call, answer directly. Do not call tools you don't
   need.
3. Keep answers terse. One or two sentences.
4. If you cannot answer with the tools available, say so plainly.

The format is rule-shaped, not prose. Each rule is a directive the model can check against its own behaviour.
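Because each rule names a tool explicitly, it is cheap to check that the prompt and the tool list stay in sync as both evolve. A sketch of such a check follows; unmentioned_tools is a hypothetical helper name.

```python
import re
from collections.abc import Callable, Iterable

def unmentioned_tools(prompt: str, tools: Iterable[Callable]) -> list[str]:
    """Return names of tools that the prompt never mentions as a whole word."""
    missing = []
    for tool in tools:
        # Lookarounds stop `utc_time` from matching inside `current_utc_time`.
        pattern = rf"(?<!\w){re.escape(tool.__name__)}(?!\w)"
        if not re.search(pattern, prompt):
            missing.append(tool.__name__)
    return missing
```

Run against the contents of prompts/system.md and the list passed as tools=, an empty result means every registered tool has at least one rule that names it.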

5. Build the agent

from pathlib import Path
from google.adk.agents import LlmAgent
from tools.clock import current_utc_time

def build_agent() -> LlmAgent:
    system_prompt = (Path(__file__).parent / "prompts" / "system.md").read_text()

    return LlmAgent(
        name="hello_agent",               # must be a valid Python identifier (no hyphens)
        model="gemini-2.0-flash",         # fast & cheap for dev
        instruction=system_prompt,
        tools=[current_utc_time],
        # optional knobs we'll come back to:
        # output_schema, generate_content_config, callbacks, ...
    )

A few specifics worth knowing:

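name is used in logs and traces; pick a useful one. ADK requires it to be a valid Python identifier, so use underscores rather than hyphens.
model is a Gemini model name when passed as a plain string. For other providers, wrap a LiteLLM model string instead, for example LiteLlm(model="openai/gpt-4o") or LiteLlm(model="anthropic/claude-3-5-sonnet-latest") from google.adk.models.lite_llm (this needs the litellm package installed). The rest of the agent stays untouched.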
tools is a list of plain Python callables — ADK introspects the signatures.
instruction is the system prompt.

6. Run it

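# main.py (complete)
import asyncio

from dotenv import load_dotenv
load_dotenv()  # must run before the ADK imports below

from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

from agent import build_agent

APP, USER, SESSION = "hello-agent", "local-user", "session-1"

async def main() -> None:
    # The Runner keeps conversation state in a session service.
    sessions = InMemorySessionService()
    # create_session is a coroutine in ADK 1.x; older releases expose it as a plain call.
    await sessions.create_session(app_name=APP, user_id=USER, session_id=SESSION)
    runner = Runner(agent=build_agent(), app_name=APP, session_service=sessions)

    user_query = "What's the current UTC time, and what day of the week is it?"
    message = types.Content(role="user", parts=[types.Part(text=user_query)])

    # run_async yields one event per step: model turns, tool calls, tool results.
    steps = 0
    async for event in runner.run_async(user_id=USER, session_id=SESSION, new_message=message):
        steps += 1
        if event.is_final_response() and event.content and event.content.parts:
            print("Final answer:", event.content.parts[0].text)
    print("Steps taken:", steps)  # counted as events

if __name__ == "__main__":
    asyncio.run(main())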

Run it:

uv run python main.py

On the first run, main.py loads credentials, builds the agent, and sends your query to the model. The model emits a tool call to current_utc_time, ADK executes it, the result flows back to the model, and the model composes the final answer.

If it works, you have a hello-world loop. If it doesn’t, the Common pitfalls sections below cover failure-mode triage.
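One check that needs neither a model nor credentials is calling the tool directly, which separates tool bugs from model behaviour. A sketch, with the tool repeated so the block runs on its own; check_clock_tool is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

def current_utc_time() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()

def check_clock_tool() -> None:
    """Raise AssertionError if the tool's output is not a fresh UTC timestamp."""
    raw = current_utc_time()
    parsed = datetime.fromisoformat(raw)             # must parse as ISO 8601
    assert parsed.utcoffset() == timedelta(0), raw   # must carry a UTC offset
    age = datetime.now(timezone.utc) - parsed
    assert timedelta(0) <= age < timedelta(seconds=5), raw  # must be fresh

check_clock_tool()
print("clock tool OK")
```

If this passes and the agent still misbehaves, the problem is in the prompt, the tool description, or the model call rather than in the tool body.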

7. Grounding: connecting to your actual environment

So far the hello-world agent touches nothing in your environment. The next step is grounding: giving the agent tools that read and (carefully) write to your systems. Typical first-real-tool patterns:

Read-only filesystem tool. A read_file(path: str) -> str that’s scoped to a project directory. Trivial to add, immediately useful.
Read-only API tool. An HTTP client wrapping one of your internal APIs. Pass auth via env var; the tool reads it on each call.
Vector search tool. If you have a knowledge base, a search_docs(query: str, k: int=5) -> list[dict]. Read-only and cheap to call.

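# tools/files.py
from pathlib import Path

# Resolve once so the containment check compares canonical paths.
ROOT = Path("/abs/path/to/project").resolve()  # configure via env var in real code

def read_project_file(relative_path: str) -> str:
    """Read a file from the project directory. Returns the file's
    contents as a string. Path is interpreted relative to the
    project root; absolute paths and `..` traversal are rejected.

    Call this when the user asks about the contents of a project file.
    """
    path = (ROOT / relative_path).resolve()
    # Compare path components, not string prefixes: a prefix test would
    # accept a sibling directory such as /abs/path/to/project-old.
    if not path.is_relative_to(ROOT):
        return "ERROR: path outside project root"
    if not path.is_file():
        return f"ERROR: not found: {relative_path}"
    try:
        return path.read_text()
    except (OSError, UnicodeDecodeError) as exc:
        # Return a readable error instead of letting a stack trace reach the model.
        return f"ERROR: could not read {relative_path}: {exc}"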

Path traversal handling matters from the first tool. The agent will at some point try to read ../../../etc/passwd because the user asked it to or because it hallucinated; your tool needs to refuse.
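A guard like this is easy to get subtly wrong, so it is worth exercising against hostile inputs in a throwaway directory. The sketch below uses a hypothetical read_scoped_file that takes the root as a parameter for testability. It checks containment with Path.is_relative_to (Python 3.9+), which compares path components; a plain string-prefix test would accept a sibling directory such as project-old.

```python
import tempfile
from pathlib import Path

def read_scoped_file(root: Path, relative_path: str) -> str:
    """Read a file under `root`; return an ERROR string for anything else."""
    root = root.resolve()
    path = (root / relative_path).resolve()
    if not path.is_relative_to(root):
        return "ERROR: path outside project root"
    if not path.is_file():
        return f"ERROR: not found: {relative_path}"
    return path.read_text()

with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    sibling = Path(tmp) / "project-old"   # shares a string prefix with project
    project.mkdir()
    sibling.mkdir()
    (project / "notes.txt").write_text("hello")
    (sibling / "secret.txt").write_text("do not leak")

    results = {
        "inside": read_scoped_file(project, "notes.txt"),
        "traversal": read_scoped_file(project, "../../../etc/passwd"),
        "absolute": read_scoped_file(project, str(sibling / "secret.txt")),
        "sibling": read_scoped_file(project, "../project-old/secret.txt"),
        "missing": read_scoped_file(project, "nope.txt"),
    }

for case, outcome in results.items():
    print(f"{case}: {outcome}")
```

Only the "inside" case should return file contents; every other case should come back as an ERROR string the model can reason over.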

# agent.py: register the new tool alongside the clock tool
from tools.clock import current_utc_time
from tools.files import read_project_file

def build_agent() -> LlmAgent:
    # ...
    return LlmAgent(
        # ...
        tools=[current_utc_time, read_project_file],
    )

And teach the agent about it by extending the system prompt:

5. To read a file from the project, call `read_project_file` with a
   relative path. Do not invent file contents; if a file isn't
   accessible, say so.

Code structure

The agent at the end of this writeup:

hello-agent/
├── .env                  (credentials)
├── pyproject.toml
├── agent.py              (build_agent — assembly only)
├── main.py               (entry point)
├── tools/
│   ├── __init__.py
│   ├── clock.py
│   └── files.py
├── prompts/
│   └── system.md
└── eval/
    └── (empty for now)

A few structural rules worth setting on day one:

agent.py is assembly, not logic. The build function imports tools and prompts, wires them into an LlmAgent, and returns it. No tool definitions inline. No prompt strings inline.
One file per tool, or a small group. If tools/files.py grows past a few hundred lines, split by sub-domain. Every tool description costs context on every step, so it is tempting to consolidate two similar tools into one with a mode parameter. That is usually the wrong trade: the model struggles to tell when to use each mode.
Prompts as markdown files. They diff cleanly, they’re easy to share with non-engineers, and you can render them in a UI for inspection.
Eval folder, even when empty. Day-one placeholder; you’ll thank yourself when the first regression hits.

# agent.py — final assembly form
from pathlib import Path
from google.adk.agents import LlmAgent

from tools.clock import current_utc_time
from tools.files import read_project_file

def build_agent() -> LlmAgent:
    system_prompt = (Path(__file__).parent / "prompts" / "system.md").read_text()

    return LlmAgent(
        name="hello_agent",  # valid Python identifier, no hyphens
        model="gemini-2.0-flash",
        instruction=system_prompt,
        tools=[current_utc_time, read_project_file],
    )
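The empty eval/ folder can get its first occupant early. Below is a sketch of a day-one regression harness. run_cases and EvalCase are hypothetical names, and ask stands for any callable that takes a query and returns the agent’s final answer as a string (here a stub, so the block runs without a model).

```python
from collections.abc import Callable
from dataclasses import dataclass

@dataclass(frozen=True)
class EvalCase:
    query: str
    must_contain: str   # substring expected in the final answer

def run_cases(ask: Callable[[str], str], cases: list[EvalCase]) -> list[str]:
    """Return one failure message per case whose answer lacks the expected text."""
    failures = []
    for case in cases:
        answer = ask(case.query)
        if case.must_contain.lower() not in answer.lower():
            failures.append(f"{case.query!r}: expected {case.must_contain!r}, got {answer!r}")
    return failures

# Stub standing in for the real agent call.
def fake_ask(query: str) -> str:
    return "It is 2024-01-01T00:00:00+00:00 UTC." if "time" in query else "I can't answer that."

cases = [
    EvalCase("What's the current UTC time?", "UTC"),
    EvalCase("What's the weather on Mars?", "can't"),
]
print(run_cases(fake_ask, cases))
```

Substring matching is crude, but it is enough to catch the first regression, and the case list can grow as the agent gains tools.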

Loop control and exit conditions

Even for hello-world, set step and time bounds.

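# main.py: with safety knobs
import asyncio
from google.adk.agents.run_config import RunConfig
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
from agent import build_agent

async def main() -> None:
    sessions = InMemorySessionService()
    await sessions.create_session(app_name="hello-agent", user_id="local-user", session_id="s1")
    runner = Runner(agent=build_agent(), app_name="hello-agent", session_service=sessions)
    message = types.Content(role="user", parts=[types.Part(text="What's the current UTC time?")])

    config = RunConfig(max_llm_calls=10)  # step cap: the run aborts with an error past ten model calls
    async with asyncio.timeout(30):       # time cap: raises TimeoutError (Python 3.11+)
        async for event in runner.run_async(user_id="local-user", session_id="s1",
                                            new_message=message, run_config=config):
            if event.is_final_response() and event.content and event.content.parts:
                print(event.content.parts[0].text)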

For a single-tool agent answering one question, ten steps is overkill — you’ll see the agent finish in two or three. But the cap is what protects you from a bug, a misbehaving tool, or a clever user that pushes the agent into a loop. Default to safe; loosen only when measurements justify it.

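Three more controls are worth wiring up. In current google-adk releases these are not Runner arguments, so treat the names below as a starting point and check them against your installed version:

Exit predicate. Check your success condition inside the event loop and break when it holds. Use it for task-specific success conditions (“stop when the agent has produced an answer matching the expected schema”).
Per-step callback. LlmAgent accepts before_model_callback, after_model_callback, before_tool_callback and after_tool_callback. Use them for logging, observability, or to inspect intermediate state during development.
Token cap. generate_content_config on the agent can set max_output_tokens per model response; a cap across the whole run means summing token usage from the events yourself. Useful when the agent has rich context (long files, many tools) that could blow up cost.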

The Loop Control and Exit Conditions writeup goes deeper into the design of these.

Common pitfalls

Pitfalls that hit first-time setups most often:

Credentials in the wrong place. Hardcoded in main.py, hardcoded in tests, accidentally logged in traces. Use .env from day one; gitignore it; never log the full env.
Tools without docstrings. The model can’t tell when to call them. Always include a one-line “call this when…” line in the docstring.
System prompt as a wall of text. A 2000-word prompt confuses the model and burns context. Numbered-rule format works better.
No step cap. A buggy tool or a confused agent can run until your token budget is exhausted. The cap costs nothing to set and is the single most important safety knob.
Mixed concerns in agent.py. When tools, prompts, and runner config all live in agent.py, the file balloons and resists testing. Keep agent.py as assembly only.
Wrong development model. Iterating on a top-tier model is slow and expensive. Use gemini-2.0-flash or gpt-4o-mini while you’re getting the loop running; evaluate on the production model.
Not catching tool exceptions. If a tool throws, the model sees a stack trace as the observation — useless and confusing. Catch in the tool and return a structured "ERROR: ..." string the model can reason over.
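The last pitfall can be handled once, in a decorator, rather than in every tool. A sketch follows; safe_tool is a hypothetical name. functools.wraps preserves the name, docstring, and (for inspect.signature) the parameters, but whether a given ADK version introspects wrapped functions the same way is an assumption worth verifying.

```python
import functools
from collections.abc import Callable

def safe_tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Wrap a tool so exceptions come back as short "ERROR: ..." strings.

    functools.wraps copies the name and docstring and sets __wrapped__, so
    inspect.signature() on the wrapper still reports the original parameters.
    """
    @functools.wraps(fn)
    def wrapper(*args, **kwargs) -> str:
        try:
            return fn(*args, **kwargs)
        except Exception as exc:  # broad on purpose: no stack trace may reach the model
            return f"ERROR: {type(exc).__name__}: {exc}"
    return wrapper

@safe_tool
def divide(numerator: float, denominator: float) -> str:
    """Divide two numbers. Call this when the user asks for a quotient."""
    return str(numerator / denominator)

print(divide(1, 0))  # ERROR: ZeroDivisionError: division by zero
```

Catching expected failures inside the tool still gives better messages; the decorator is a backstop for the ones you did not anticipate.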

Common pitfalls (extended: debugging)

When the agent doesn’t do what you expect:

Look at the trajectory, not just the final answer. The runner exposes the per-step history, so walk through what the agent thought, what tool it called, and what came back. The failure is almost always visible in the trajectory.
Re-read your tool’s docstring. Did you tell the model when to call it? Did you tell it what to expect back? The first round of debugging is usually fixing the tool descriptions, not the prompt.
Check the system prompt for contradictions. Two rules that pull in different directions confuse the model. Tighten until each rule is independently checkable.
Try a stronger model. If the agent makes a reasoning error a smarter model wouldn’t, swap to the production-tier model temporarily. If the bug disappears, your prompt or tools have room to improve before the smaller model is reliable.
Try a weaker model. If the agent works on the strong model but breaks on the cheap one, that’s a signal, not a result. Production-quality agents need to work on the model you’ll actually deploy.

Why so much emphasis on the tool docstring?

Every tool description is in the model’s context for every step. A vague docstring is reread on every step, and each time the model has to work out afresh when the tool applies. Cost aside, the model’s behaviour is noisier when tool descriptions are imprecise: it’ll call the tool when it shouldn’t, miss it when it should, or hallucinate parameters. A clear, narrow docstring is the single highest-leverage piece of prompt engineering in an agent system. Spend more time on it than you think you should.
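Some of that discipline can be automated. Below is a sketch of a docstring lint that flags tools missing a description or a “call this when” cue. The helper name, the cue phrases, and the word threshold are all assumptions to adapt to your own conventions.

```python
import inspect
from collections.abc import Callable, Iterable

# Assumed cue phrases; extend to match how your team writes docstrings.
WHEN_CUES = ("call this when", "use this when", "use this to")

def lint_tool_docstrings(tools: Iterable[Callable], min_words: int = 8) -> dict[str, list[str]]:
    """Map each problematic tool name to a list of docstring complaints."""
    problems: dict[str, list[str]] = {}
    for tool in tools:
        doc = (inspect.getdoc(tool) or "").strip()
        issues = []
        if not doc:
            issues.append("missing docstring")
        else:
            if len(doc.split()) < min_words:
                issues.append(f"docstring shorter than {min_words} words")
            if not any(cue in doc.lower() for cue in WHEN_CUES):
                issues.append("no 'call this when...' guidance")
        if issues:
            problems[tool.__name__] = issues
    return problems
```

A lint like this only checks that guidance exists, not that it is good; reading the descriptions as the model would is still the real review.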