02 / ENGINEERING CASE STUDY · Autonomous Agent · Three-Layer Memory · 2026 · LIVE

RELO

Back Office Bot for real estate — 27 AI tools, three-layer memory, shipped in a 6-day competition sprint on a multi-month Next.js 16 + Supabase foundation I built solo.

BUILT BY DAVID RAJNOHA  ·  MARCH 2026
02
Project Index
Role
Solo Builder & Architect
Industry
Real Estate / PropTech
Year
2026
Stack
Next.js · Supabase · OpenAI · pgvector
27
AI Tools
1,700+
Tests Passing
6
Sprint Days
3×
Memory Layers
03
The Problem
Context
Real estate back-office
Surface area
7 tools · 3 inboxes
Cost of context loss
~4 hrs / agent / week

Back-office work for Czech real estate is death by a thousand tabs — CRM notes, ČÚZK Katastr lookups, ISIR insolvency checks, Sreality feeds, tenant email threads, contract redlines, and appointment juggling, all requiring context the agent has already explained six times this week.

Off-the-shelf LLMs forget between sessions. Custom agents hallucinate under tool-load. What was missing was an agent that remembered — a teammate, not a toy.

“An agent that forgets is a feature request. An agent that remembers is a hire.”
The Challenge

Keep 27 tools coherent, keep memory cheap, ship the sprint.

A back-office agent needs to browse listings, draft replies, read contracts, schedule viewings, and log everything to the CRM — without token budgets exploding or tool-selection collapsing into noise.

Single-context LLMs break down past ~20 tools. Vector stores alone leak irrelevant chunks. And carrying 1,700+ tests through a 6-day sprint meant the architecture had to be wrong exactly zero times.

The Solution

A three-layer memory system with a nightly consolidation pass — autoDream.

Working → Episodic → Long-term. The agent runs a multi-step loop (stopWhen: stepCountIs(8)) over all 27 typed tools. Every turn and tool call appends to the L3 activity log; each night, autoDream distills salient traces into L2 semantic memory (pgvector embeddings + entities).

Result: on Monday morning the agent already knows Vinohradská 42 had an unresolved ČÚZK výpis, and that the buyer's agent replies slowly on Tuesdays.

04
System Architecture

Three layers of memory, one nightly consolidator.

FIG. 01 — MEMORY TOPOLOGY
Memory & Tool Graph
user → planner → tools → memory · fan-out: 27
RELO memory and tool architecture diagram: user input enters an agent loop (gpt-5.4-mini, stopWhen: stepCountIs(8)) that can call any of the 27 typed tools (CRM · Katastr · ISIR · Email · Kalendář) in a multi-step chain. Every step appends to the append-only L3 activity log; a nightly autoDream consolidation pass (03:00 UTC) distills the log into L2 episodic memory (pgvector semantic search), which is retrieved alongside L1 working context (~8k tokens) on the next turn, producing the response and actions.
L1 · Working

Current session context

Held in-prompt. ~8k tokens of the last user turn, tool results, and active plan. Flushed when the conversation ends.
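A minimal sketch of what "flushed, budget-capped working memory" can look like in practice: keep only the newest messages that fit under a rough token budget. The `Msg` shape and the ~4-chars-per-token estimate are illustrative assumptions, not RELO's actual implementation.

```typescript
// Sketch of an L1 working-memory trim: walk backwards through the
// conversation and keep the newest messages under a token budget.
type Msg = { role: "user" | "assistant" | "tool"; content: string };

// Crude estimate: ~4 characters per token (assumption for the sketch).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function trimWorkingMemory(history: Msg[], budget = 8_000): Msg[] {
  const kept: Msg[] = [];
  let used = 0;
  // Walk backwards so the most recent turns survive the cut.
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i].content);
    if (used + cost > budget) break;
    kept.unshift(history[i]);
    used += cost;
  }
  return kept;
}
```

The key property is that the cut is silent and total: anything that falls off L1 is gone unless it was also written to L3 and consolidated into L2.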

L2 · Episodic

Semantic memory · pgvector

Postgres + pgvector semantic search. Consolidated embeddings with entity tags — retrievable by similarity, entity, or conversation id. Rebuilt nightly by autoDream.
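In miniature, the similarity retrieval that pgvector's cosine-distance operator performs looks like the sketch below. The `Episode` shape and function names are illustrative, not RELO's schema; in production this ranking happens inside Postgres, not in application code.

```typescript
// Miniature of cosine-similarity retrieval over the L2 store —
// equivalent in spirit to:
//   SELECT text FROM episodes ORDER BY embedding <=> $1 LIMIT k;
type Episode = { text: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function recallTopK(query: number[], store: Episode[], k = 3): Episode[] {
  return [...store]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```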

L3 · Long-term

Activity log · append-only

Every turn, tool call, argument, and outcome appended to a structured audit log. Immutable source of truth — feeds autoDream; survives schema changes; replayable.
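The append-only discipline can be sketched as a log whose entries are created but never mutated, so the whole history stays replayable for the consolidator. Field names here are illustrative assumptions, not RELO's actual log schema.

```typescript
// Sketch of an append-only L3 activity log: entries only ever get
// appended, never updated or deleted, so the log can be replayed.
type ActivityEntry = {
  seq: number;                     // monotonic position in the log
  ts: string;                      // ISO timestamp
  kind: "turn" | "tool_call";
  payload: unknown;
};

class ActivityLog {
  private entries: ActivityEntry[] = [];

  append(kind: ActivityEntry["kind"], payload: unknown): ActivityEntry {
    const entry: ActivityEntry = {
      seq: this.entries.length,
      ts: new Date().toISOString(),
      kind,
      payload,
    };
    this.entries.push(entry);
    return entry;
  }

  // Replay hands every entry, in order, to a consumer (e.g. a
  // consolidation pass like autoDream).
  replay(consume: (e: ActivityEntry) => void): void {
    for (const e of this.entries) consume(e);
  }
}
```

Because nothing downstream depends on the log's shape staying queryable, schema changes in L1/L2 never invalidate it.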

/lib/memory/autodream.ts
// Nightly consolidation pass
export async function autoDream(userId: string) {
  const episodes = await getEpisodesSince(userId, "-24h");
  const traces   = await summarize(episodes, { model: "gpt-5.4-mini" });
  for (const t of traces) {
    await longterm.upsert({
      embedding: await embed(t.text),
      entities:  t.entities,
      weight:    t.salience,
    });
  }
}
/lib/agent/loop.ts
// Multi-step agent loop — all 27 typed tools in context
export async function runAgent(turn: Turn) {
  const memory = await recall(turn, { store: "l2" });
  return streamText({
    model: openai("gpt-5.4-mini"),
    tools: allTools,
    stopWhen: stepCountIs(8),
    onStepFinish: (s) => appendActivity("l3", s),
    messages: [...memory, ...turn.messages],
  });
}
05
Stack
PRIMARY MODEL · 01
OpenAI
GPT-5.4 mini
Agent loop + autoDream summarizer. Vercel AI SDK v6 · tool calls · streaming · structured outputs. Falls back to gpt-5.4 for long-context summarization.
APP · 02
Next.js 16 + TypeScript
App Router, Server Actions, streaming RSC. Thin edge layer, fat lib/.
DATA · 03
PostgreSQL + Supabase
Episodic store, vector index (pgvector), auth. One database, zero other services.
SDK · 04
Vercel AI SDK
Tool streams, typed schemas.
UI · 05
Tailwind CSS
Design tokens, zero CSS frameworks.
RUNTIME · 06
Vercel
Next.js runtime, streaming responses.
06
The 6-Day Sprint

324 commits, one at a time — Saturday brief to Thursday ship.

Competition sprint — 6 days from Saturday brief (March 21) to Thursday deadline (March 26), on a multi-month Next.js 16 + Supabase + OpenAI foundation I'd built solo beforehand. The days below cover the sprint itself, not the full project.

01
SAT · FOUNDATION
Auth, schema, deploy path
29 commits · auth + RLS · base schema

Brief landed on Saturday; started the same day. Email/password auth with per-user data isolation, migration 004 (views rebuild), Vercel framework config, TypeScript strict cleanup. Last commit 23:52.

02
SUN · AGENT CORE V1
Chat Completions switch, safe-tool wrapper, first 13 tools
15 commits · 13 tools · safe-tool wrapper

Switched to Chat Completions API, wrapped every tool in a recovery strategy (safe-tool wrapper), landed the first 13 typed tools, file upload system with xlsx + csv parsing. Dashboard + chat UI plumbing. First working end-to-end turn Sunday night.
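The safe-tool wrapper idea can be sketched as follows: every tool call is wrapped so that a throw becomes structured data the model can react to (retry, rephrase, report) instead of killing the loop. The shapes and names below are illustrative assumptions, not RELO's actual wrapper.

```typescript
// Sketch of a "safe-tool wrapper": failures become values, so the
// agent loop always gets a result it can reason about.
type ToolResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string; hint: string };

function safeTool<A, T>(
  name: string,
  run: (args: A) => Promise<T>,
  hint = "Retry with adjusted arguments or tell the user what failed.",
) {
  return async (args: A): Promise<ToolResult<T>> => {
    try {
      return { ok: true, value: await run(args) };
    } catch (e) {
      // Surface the failure as data — the loop keeps going.
      return { ok: false, error: `${name}: ${(e as Error).message}`, hint };
    }
  };
}
```

The design choice is recovery over strictness: a flaky upstream (Katastr timing out, an email API rate-limiting) degrades one step, not the whole turn.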

03
MON · PLATFORM DEPTH
Monitoring, analytics, integrations
48 commits · +20K LOC · monitoring

Biggest day by LOC. sReality monitoring, Lead Pipeline, Analytics V2, dashboard depth, ČÚZK + ISIR wiring. Agent loop hardened: streamText with stopWhen: stepCountIs(8) over all typed tools.

04
TUE · TOOLING DEPTH
More tools, Czech-region search, tests
58 commits · +7K LOC · tool depth

Expanded tool surface, diacritics-insensitive search across tools, sReality region ID fixes (10 of 14 were wrong out of the box), Gmail draft visibility, monitor limits. Review hardening + first wave of contract tests.
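Diacritics-insensitive search for Czech text is typically done by Unicode-decomposing, stripping combining marks, and lowercasing, so "vinohradska" matches "Vinohradská". The sketch below shows the technique; the function names are illustrative, not RELO's code.

```typescript
// Fold away diacritics: NFD-decompose, drop combining marks
// (U+0300–U+036F), lowercase. "Vinohradská" → "vinohradska".
function fold(s: string): string {
  return s.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
}

function matchesQuery(haystack: string, query: string): boolean {
  return fold(haystack).includes(fold(query));
}
```

With this, matchesQuery("Vinohradská 42, Praha", "vinohradska") returns true even though the stored string carries the accent.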

05
WED · THREE-LAYER MEMORY
autoDream + L2 pgvector consolidation
96 commits · +11K LOC · autoDream v1

L3 activity log, L2 pgvector retrieval, and the nightly consolidation job. First autoDream pass distilled several hundred activity rows into a retrievable semantic index. Cosine threshold calibrated twice. Design polish + focus rings landed the same day.

06
THU · SHIP
1,700+ tests + V4 Thinking + 27 tools
78 commits · +13K LOC · shipped · 1,700+ tests ✓

V4 Thinking layer (Think-Plan-Execute), 27-tool SSOT, E2E Playwright (137 tests across 9 sections), sidebar health monitor, native PDF extraction, manual CRUD for all sections. Deadline met Thu EOD — shipped to Vercel.
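The Think-Plan-Execute pattern behind a "Thinking layer" separates one planning call from sequential execution of its steps. This is a generic sketch of the pattern under assumed shapes, not RELO's V4 implementation.

```typescript
// Sketch of Think-Plan-Execute: one model call produces a rationale
// and an ordered step list; the steps are then executed one by one.
type Step = { tool: string; args: Record<string, unknown> };
type Plan = { rationale: string; steps: Step[] };

async function thinkPlanExecute(
  goal: string,
  plan: (goal: string) => Promise<Plan>,     // think + plan (model call)
  execute: (step: Step) => Promise<unknown>, // run one tool step
): Promise<unknown[]> {
  const p = await plan(goal);
  const results: unknown[] = [];
  for (const step of p.steps) {
    results.push(await execute(step));
  }
  return results;
}
```

Separating the plan from execution makes each phase testable in isolation — the planner against expected step lists, the executor against tool contracts.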

07
FRI · DEMO
Post-deadline polish + demo reel
3 commits · demo reel · post-ship

Demo reel for the competition eval. Recorded it three times because the agent got smarter between takes. Added Vercel Analytics, fixed mobile chat workspace. Window closes; judges take over.

07
Live Demo

Watch RELO handle a Monday-morning inbox.

RELO back-office agent dashboard — real production UI
↳ Production UI · live URL below · sensitive data stubbed.
Open live product
08
Results & Learnings

Results

  • R/01 · Placed 5th of 70 teams — top 7% at the competition. Judges called out the three-layer memory as the clearest technical differentiator.
  • R/02 · Hard 8-step cap, zero runaway loops across the agent — the safe-tool wrapper makes every failure recoverable, no matter which of the 27 tools threw.
  • R/03 · 1,700+ tests green under sprint pressure across 70 files — unit, per-tool contract, per-memory-layer retrieval, and scripted multi-turn replay. The architecture was wrong zero times.
  • R/04 · Czech-native integrations as a competitive moat — ČÚZK Katastr, ISIR, Valuo/CMA wired from day one. Locale expertise global competitors can't shortcut.

Learnings

  • L/01
    Consolidation is harder than retrieval.

    autoDream's salience function needed three rewrites before it stopped over-weighting the latest episode.

  • L/02
    Typed tools + safe-tool wrapper.

    Every tool has a Zod schema and a recovery strategy — the agent retries or rephrases instead of breaking the loop. Self-healing beat strict validation.

  • L/03
    Test the loop, not the turn.

    Single-turn tests were green while the full replay was broken — invest in scripted multi-turn harnesses early.

  • L/04
    If I did it again: start with the evaluation harness, not the agent.

    The hours spent retro-fitting tests on Day 4 would have paid back on Day 1.
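One way to stop a salience function from over-weighting the latest episode is to decay recency exponentially and blend it with an importance signal instead of letting recency dominate. The sketch below is an assumed scoring scheme with made-up weights and half-life, not RELO's actual salience function.

```typescript
// Sketch of a recency-tempered salience score: recency halves every
// `halfLifeHours` and contributes only `recencyWeight` of the total,
// so a fresh-but-trivial episode can't outrank an old important one.
function salience(
  ageHours: number,
  importance: number,       // 0..1, e.g. from the summarizer
  halfLifeHours = 72,
  recencyWeight = 0.4,
): number {
  const recency = Math.pow(0.5, ageHours / halfLifeHours); // 1 → 0 over time
  return recencyWeight * recency + (1 - recencyWeight) * importance;
}
```

Under these weights, a brand-new trivial episode (age 0, importance 0.1) scores 0.46, while a three-day-old important one (age 72 h, importance 0.9) scores 0.74 — the opposite of the naive recency-dominated ordering.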