Setting Up and Grounding an Agent
Wiring an agent to its environment — env vars, tool registration, prompt scaffolding, and the smallest workable hello-world loop.
Goal#
This writeup gets you from an empty directory to a running agent that does something useful in your environment — read a file, hit an API, write a result. It’s the smallest workable hello-world loop with the Agent Development Kit (ADK), and it covers the boring-but-load-bearing parts that the ADK quickstart skips: where credentials go, how tools register, how the system prompt gets organised, and how to test the loop is alive before you spend any time on capability.
By the end you should have a single-agent ADK project that runs locally, talks to a model, calls one tool, and exits cleanly. From there, every other implementation in this topic (Eureka reward loop, multimodal web agent, loop control) is an extension of the same scaffold.
Prerequisites#
Before you start:
- Python 3.11+. ADK targets recent Python and uses
Annotatedtypes andpydanticv2 heavily. Older Pythons will compile but you’ll fight typing edge cases. - A package manager.
pipworks;uvis faster. Examples below useuv. - Model credentials. At least one of:
- Google Gemini API key (set as
GOOGLE_API_KEY). The most thoroughly tested provider with ADK. - OpenAI API key (
OPENAI_API_KEY) — works via ADK’s LiteLLM adapter. - Anthropic API key (
ANTHROPIC_API_KEY) — same, via LiteLLM.
- Google Gemini API key (set as
- Read Agent Development Kit Overview. This writeup assumes you know what the ADK primitives are.
- Read What Is an AI Agent? and The ReAct Loop for conceptual grounding.
Step-by-step#
1. Project skeleton#
Set up the directory:
mkdir hello-agent && cd hello-agentuv inituv add google-adk python-dotenvmkdir -p tools prompts evaltouch agent.py main.py .env tools/__init__.pyYour directory now looks like:
hello-agent/├── pyproject.toml├── .env (gitignored)├── agent.py├── main.py├── tools/│ └── __init__.py├── prompts/└── eval/2. Environment variables#
Put credentials in .env — never inline in code:
GOOGLE_API_KEY=AIzaSy...# Optional: pick one of these if you're not using GeminiOPENAI_API_KEY=sk-...ANTHROPIC_API_KEY=sk-ant-...Add .env to .gitignore immediately. The single most common security incident in agent projects is a committed credential.
Load them in main.py:
from dotenv import load_dotenvload_dotenv()
# only after load_dotenv()from agent import build_agentfrom google.adk.runners import RunnerThe import order matters — ADK reads env vars at import time for some providers, so loading the .env first avoids the “works locally, broken in CI” class of bug.
3. Define your first tool#
Tools are Python functions with typed signatures and docstrings. Both are read by the model — types become the JSON schema, docstrings become the human-readable description.
from datetime import datetime, timezone
def current_utc_time() -> str: """Return the current UTC time as an ISO 8601 string.
Call this when the user asks for the current time, or when a downstream step needs a fresh timestamp. """ return datetime.now(timezone.utc).isoformat()Three rules:
- Name the function for the verb.
current_utc_timesays what it does.get_time_v2says nothing. - The docstring is for the model. Write it like you’d write a function description in a library doc. Include when to call it, not just what it does.
- Return types should be simple. Strings, dicts, lists. The model serialises back to text anyway; complex objects cause confusion.
4. Write the system prompt#
Keep it in a file, not inline. The reason: prompts evolve, and you want diff-able history.
You are a small utility agent. You answer the user's question usingthe tools available to you. Rules:
1. If the user's question requires the current time or a fresh timestamp, call `current_utc_time`. Do not guess the time from conversational context.2. If the user's question can be answered from your own knowledge without a tool call, answer directly. Do not call tools you don't need.3. Keep answers terse. One or two sentences.4. If you cannot answer with the tools available, say so plainly.The format is rule-shaped, not prose. Each rule is a directive the model can check against its own behaviour.
5. Build the agent#
from pathlib import Pathfrom google.adk.agents import LlmAgentfrom tools.clock import current_utc_time
def build_agent() -> LlmAgent: system_prompt = (Path(__file__).parent / "prompts" / "system.md").read_text()
return LlmAgent( name="hello-agent", model="gemini-2.0-flash", # fast & cheap for dev instruction=system_prompt, tools=[current_utc_time], # explicit knobs we'll come back to: # max_steps, exit_predicate, output_schema, ... )A few specifics worth knowing:
nameis used in logs and traces; pick a useful one.modelis a string; you can switch to"openai:gpt-4o"(via LiteLLM) or"anthropic:claude-3-5-sonnet-latest"without touching the rest of the agent.toolsis a list of plain Python callables — ADK introspects the signatures.instructionis the system prompt.
6. Run it#
# main.py (continued)from dotenv import load_dotenvload_dotenv()
from google.adk.runners import Runnerfrom agent import build_agent
def main() -> None: agent = build_agent() runner = Runner(agent=agent, app_name="hello-agent")
user_query = "What's the current UTC time, and what day of the week is it?" result = runner.run(user_query)
print("Final answer:", result.final_response) print("Steps taken:", len(result.steps))
if __name__ == "__main__": main()Run it:
uv run python main.pyThe first run will: load credentials, instantiate the agent, send your query to the model, the model will emit a tool call to current_utc_time, ADK will execute it, the result flows back to the model, the model composes the final answer.
If it works, you have a hello-world loop. If it doesn’t, the next section is the failure-mode triage.
7. Grounding — connecting to your actual environment#
The “hello-world” agent is unrelated to your environment. The next step is grounding — giving the agent tools that read and (carefully) write to your systems. Typical first-real-tool patterns:
- Read-only filesystem tool. A
read_file(path: str) -> strthat’s scoped to a project directory. Trivial to add, immediately useful. - Read-only API tool. An HTTP client wrapping one of your internal APIs. Pass auth via env var; the tool reads it on each call.
- Vector search tool. If you have a knowledge base, a
search_docs(query: str, k: int=5) -> list[dict]. Read-only and cheap to call.
from pathlib import Path
ROOT = Path("/abs/path/to/project") # configure via env var in real code
def read_project_file(relative_path: str) -> str: """Read a file from the project directory. Returns the file's contents as a string. Path is interpreted relative to the project root; absolute paths and `..` traversal are rejected. """ path = (ROOT / relative_path).resolve() if not str(path).startswith(str(ROOT)): return "ERROR: path outside project root" if not path.exists(): return f"ERROR: not found: {relative_path}" return path.read_text()Path traversal handling matters from the first tool. The agent will at some point try to read ../../../etc/passwd because the user asked it to or because it hallucinated; your tool needs to refuse.
Register it:
from tools.clock import current_utc_timefrom tools.files import read_project_file
def build_agent() -> LlmAgent: # ... return LlmAgent( # ... tools=[current_utc_time, read_project_file], )And teach the agent about it by extending the system prompt:
5. To read a file from the project, call `read_project_file` with a relative path. Do not invent file contents; if a file isn't accessible, say so.Code structure#
The agent at the end of this writeup:
hello-agent/├── .env (credentials)├── pyproject.toml├── agent.py (build_agent — assembly only)├── main.py (entry point)├── tools/│ ├── __init__.py│ ├── clock.py│ └── files.py├── prompts/│ └── system.md└── eval/ └── (empty for now)A few structural rules worth setting on day one:
agent.pyis assembly, not logic. The build function imports tools and prompts, wires them into anLlmAgent, and returns it. No tool definitions inline. No prompt strings inline.- One file per tool, or a small group. If
tools/files.pygrows past a few hundred lines, split by sub-domain. The agent’s tool description budget is paid per tool, so consolidating two similar tools into one with amodeparameter is usually wrong — the model can’t tell when to use each mode. - Prompts as markdown files. They diff cleanly, they’re easy to share with non-engineers, and you can render them in a UI for inspection.
- Eval folder, even when empty. Day-one placeholder; you’ll thank yourself when the first regression hits.
# agent.py — final assembly formfrom pathlib import Pathfrom google.adk.agents import LlmAgent
from tools.clock import current_utc_timefrom tools.files import read_project_file
def build_agent() -> LlmAgent: system_prompt = (Path(__file__).parent / "prompts" / "system.md").read_text()
return LlmAgent( name="hello-agent", model="gemini-2.0-flash", instruction=system_prompt, tools=[current_utc_time, read_project_file], )Loop control and exit conditions#
Even for hello-world, set step and time bounds.
# main.py — with safety knobsfrom google.adk.runners import Runner, RunnerConfig
def main() -> None: agent = build_agent() config = RunnerConfig( max_steps=10, timeout_seconds=30, ) runner = Runner(agent=agent, app_name="hello-agent", config=config)
result = runner.run("What's the current UTC time?") print(result.final_response)For a single-tool agent answering one question, ten steps is overkill — you’ll see the agent finish in two or three. But the cap is what protects you from a bug, a misbehaving tool, or a clever user that pushes the agent into a loop. Default to safe; loosen only when measurements justify it.
Runner also accepts:
exit_predicate— a callable run after each step. ReturnTrueto stop. Use it for task-specific success conditions (“stop when the agent has produced an answer matching the expected schema”).on_stepcallback — fired after each step. Use it for logging, observability, or to inspect intermediate state during development.max_tokens— token cap across the run. Useful when the agent has rich context (long files, many tools) that could blow up cost.
The Loop Control and Exit Conditions writeup goes deeper into the design of these.
Common pitfalls#
Pitfalls that hit first-time setups most often:
- Credentials in the wrong place. Hardcoded in
main.py, hardcoded in tests, accidentally logged in traces. Use.envfrom day one; gitignore it; never log the full env. - Tools without docstrings. The model can’t tell when to call them. Always include a one-line “call this when…” line in the docstring.
- System prompt as a wall of text. A 2000-word prompt confuses the model and burns context. Numbered-rule format works better.
- No
max_steps. A buggy tool or a confused agent can run until your token budget is exhausted. The cap costs nothing to set and is the single most important safety knob. - Mixed concerns in
agent.py. When tools, prompts, and runner config all live inagent.py, the file balloons and resists testing. Keepagent.pyas assembly only. - Wrong development model. Iterating on a top-tier model is slow and expensive. Use
gemini-2.0-flashorgpt-4o-miniwhile you’re getting the loop running; evaluate on the production model. - Not catching tool exceptions. If a tool throws, the model sees a stack trace as the observation — useless and confusing. Catch in the tool and return a structured
"ERROR: ..."string the model can reason over.
Common pitfalls (extended — debugging)#
When the agent doesn’t do what you expect:
- Look at the trajectory, not just the final answer.
runner.runreturns the per-step history. Walk through what the agent thought, what tool it called, what came back. The failure is almost always visible in the trajectory. - Re-read your tool’s docstring. Did you tell the model when to call it? Did you tell it what to expect back? The first round of debugging is usually fixing the tool descriptions, not the prompt.
- Check the system prompt for contradictions. Two rules that pull in different directions confuse the model. Tighten until each rule is independently checkable.
- Try a stronger model. If the agent makes a reasoning error a smarter model wouldn’t, swap to the production-tier model temporarily. If the bug disappears, your prompt or tools have room to improve before the smaller model is reliable.
- Try a weaker model. If the agent works on the strong model but breaks on the cheap one, that’s a signal, not a result. Production-quality agents need to work on the model you’ll actually deploy.
Why so much emphasis on the tool docstring?
Every tool description is in the model’s context for every step. A vague docstring is read by every call, and every call has to re-derive when to use the tool. Time-budget aside, the model’s behaviour is noisier when tool descriptions are imprecise — it’ll call the tool when it shouldn’t, miss it when it should, or hallucinate parameters. A clear, narrow docstring is the single highest-leverage piece of prompt engineering in an agent system. Spend more time on it than you think you should.
Related implementations#