Chatbot agent end-to-end test¶

A step-by-step guide that builds an agent, designs conversations, runs chats as test personas, and verifies that everything landed correctly in memory. Follow every step — the expected results at the end are specific and testable.

Choose your test path¶

This tutorial supports two ways to run the chats. The agent and conversation design (Parts 1–3) are the same; only the run-and-verify steps differ.

	Path A — Portal Chat tab	Path B — Claude Code (MCP)
Plays the chatbot	Your configured LLM (Anthropic / OpenAI / GLM)	Claude Code's own model
Cost	LLM API costs per turn	Free
Iteration speed	Reload Chat tab between edits	Edit + test in same session
Fidelity	Exact production model	Claude Code, not the production model
Best for	Final validation	Iterating on prompts, stages, extraction

Pick Path A when you have a Hadron portal account with an LLM API key and want production-fidelity validation. Pick Path B when you want a fast, free loop while you iterate. You can do both — Path B for development, Path A as a final pass.

The Path A and Path B sections are presented together using tabs in Parts 4 and 5. Switch tabs based on what you chose; the steps inside each tab are self-contained.

What you will build¶

By the end of this test you will have:

1 agent ("Sage") with a system memory and a knowledge memory.
2 conversations: onboarding (learn about the user) and strategy (help with a business problem).
3 stages per conversation, each with an extractionSpec that pulls structured data from the chat.
2+ personas (Alex Chen and Maria Santos), each with their own per-user memory created on first chat.
Chat transcripts stored in each persona's memory, including message history, extracted data, summaries, and stage transitions.

What you need¶

A Hadron portal account at hadronmemory.com with admin access to an organization.
Claude Code installed with the Hadron MCP server connected (used in Part 3 for both paths to author conversation designs).

Path A — Portal Chat

An API key for one of: Anthropic, OpenAI, or GLM.
A Workstation app (only needed for Part 3 / authoring conversations).
That's it for Path A — chats run in the browser.

Path B — Claude Code (MCP)

A Workstation app with the Hadron MCP proxy connected (see Connecting an MCP host).
You do not need an LLM API key. Claude Code plays the chatbot itself, so no production LLM is called.

Part 1 — Create the agent (Portal)¶

Go to your org page → Create Chatbot Agent.
Fill in:
- Agent name: Sage
- Description: AI mentor for small business owners
- Visibility: Personal

On the System Prompt step, enter:

You are Sage, a warm and practical AI mentor for small business
owners. You ask clarifying questions before giving advice. You are
direct but encouraging. When you don't know something, say so.

On the Memories step:
- System memory: auto-named Sage System (URN: sage-system) — leave as-is.
- Knowledge memory: keep checked. Auto-named Sage Knowledge (URN: sage-knowledge).
Review and create.
On the agent detail page → Settings tab, check that surfaces includes mcp (chatbot agents default to ["api"] only). Add mcp if missing — Part 3 needs it.

Checkpoint: On the agent detail page you should see:

Agent "Sage" with system memory set.
Two memories listed: Sage System and Sage Knowledge (read-write).
The Chatbot Control tab showing one conversation (setup) with one stage (onboard).

Path A — Portal Chat

Configure the LLM provider on the agent.

Go to Settings → AI configuration → Configure:

Pick your provider (e.g. OpenAI).
Enter model (e.g. gpt-4o-mini).
Paste your API key.
Save, then Test. Expect "Works" with a short reply.

Checkpoint: The Settings panel now shows "Configured" with your provider, model, and a masked key. The Chat tab should NOT appear yet (no non-setup conversations exist).

Path B — Claude Code (MCP)

Wire your Workstation app to the Sage agent so Claude Code can reach it.

Go to Apps → your Workstation app → Settings tab → Agents.
Add the Sage agent with the Develop role (read/write).

No LLM provider configuration is needed for Path B — Claude Code plays the chatbot itself.

Checkpoint: In Claude Code, ask:

List my memories using h-list-memories.

You should see Sage System and Sage Knowledge.

Part 2 — Set the active memory (Claude Code)¶

Open Claude Code in the working directory of the Workstation app you configured. Set the active memory to the Sage system memory:

Set the active memory to Sage System.

All subsequent node operations in Part 3 will target this memory.

Part 3 — Design conversations (Claude Code + MCP)¶

Both paths author the conversations the same way: through Claude Code with the Hadron MCP tools.

3.1 — Verify the wizard scaffold¶

The wizard already created a setup conversation with one onboard stage. Confirm:

List all nodes with prefix conversations.

Expected — 3 nodes:

conversations             (system)
conversations:setup       (system) — data: { isSetup: true, stageOrder: ["onboard"] }
conversations:setup:onboard  (system) — data: { promptRef: "prompts:setup:onboard", extractionSpec: [...] }

Read the node at prompts:setup:onboard with raw: true.

You should see the wizard's default onboard prompt.

3.2 — Create the `onboarding` conversation¶

Create a conversation called onboarding in the Sage system memory with 3 stages: welcome, background, and goals. This is not a setup conversation (isSetup: false).

Stage order: welcome → background → goals.

welcome stage:

promptRef: prompts:onboarding:welcome

extractionSpec:

memory.name (string): "The user's full name"

memory.location (string): "City and state/country"

Prompt content: "Greet the user warmly. Ask for their name and where they're based. Set next_stage to background when done. Use the respond tool."

background stage:

promptRef: prompts:onboarding:background

extractionSpec:

memory.business_type (string): "What kind of business they run"

memory.business_age (string): "How long they've been in business"

memory.team_size (string): "Number of employees or solo"

Prompt content: "Ask about their business: what do they do, how long have they been at it, team size? Summarize before moving on. Set next_stage to goals when done. Use the respond tool."

goals stage:

promptRef: prompts:onboarding:goals

extractionSpec:

memory.top_goal (string): "Their #1 business goal right now"

memory.biggest_challenge (string): "The main obstacle to that goal"

Prompt content: "Ask the user what their #1 business goal is right now, and the biggest challenge in the way. Reflect back what you heard. Tell them you'll switch to strategy mode. Set next_stage to null. Use the respond tool."

Checkpoint: Run h-list-nodes with prefix conversations:onboarding. You should see 4 nodes (parent + 3 stages). Read the parent's data and verify:

{ "isSetup": false, "stageOrder": ["welcome", "background", "goals"] }

Read each stage's data and verify each has a promptRef and an extractionSpec with the listed fields.
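
Once you have the node data in hand, the checkpoint above can also be checked mechanically. The following is a minimal Python sketch, not part of Hadron. It assumes you have copied the parent's and each stage's data into plain dictionaries, and that every extractionSpec entry is an object with a `field` key. That entry shape is an assumption for illustration (the scaffold output only shows `extractionSpec: [...]`), so adjust `spec_fields` to match what h-read-node actually returns.

```python
from typing import Any

# Expected design for the onboarding conversation, taken from section 3.2.
EXPECTED_ONBOARDING = {
    "welcome": ["memory.name", "memory.location"],
    "background": ["memory.business_type", "memory.business_age", "memory.team_size"],
    "goals": ["memory.top_goal", "memory.biggest_challenge"],
}


def spec_fields(extraction_spec: list[dict[str, Any]]) -> list[str]:
    """Return the field names in an extractionSpec.

    ASSUMPTION: each entry looks like {"field": "memory.name", ...}.
    """
    return [entry["field"] for entry in extraction_spec]


def check_conversation(
    name: str,
    parent_data: dict[str, Any],
    stage_data: dict[str, dict[str, Any]],
    expected: dict[str, list[str]],
) -> list[str]:
    """Return human-readable problems; an empty list means the design matches."""
    problems: list[str] = []
    if parent_data.get("isSetup") is not False:
        problems.append(f"{name}: isSetup should be false")
    if parent_data.get("stageOrder") != list(expected):
        problems.append(f"{name}: stageOrder is {parent_data.get('stageOrder')}")
    for stage, fields in expected.items():
        data = stage_data.get(stage)
        if data is None:
            problems.append(f"{name}:{stage}: stage node missing")
            continue
        want_ref = f"prompts:{name}:{stage}"
        if data.get("promptRef") != want_ref:
            problems.append(f"{name}:{stage}: promptRef should be {want_ref}")
        missing = sorted(set(fields) - set(spec_fields(data.get("extractionSpec", []))))
        if missing:
            problems.append(f"{name}:{stage}: extractionSpec missing {missing}")
    return problems
```

The same function works for the strategy conversation if you pass its own expected mapping.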

3.3 — Create the `strategy` conversation¶

Create a conversation called strategy with 3 stages: diagnose, options, action-plan. Not a setup conversation. Stage order: diagnose → options → action-plan.

diagnose stage:

promptRef: prompts:strategy:diagnose

extractionSpec:

memory.current_problem (string): "The specific problem being discussed"

memory.problem_severity (string): "How urgent: low, medium, high"

Prompt: "Ask the user to describe a specific business problem. Probe: when did it start, what have they tried, how urgent? Set next_stage to options when done. Use the respond tool."

options stage:

promptRef: prompts:strategy:options

extractionSpec:

memory.options_discussed (string): "Comma-separated list of options"

memory.preferred_option (string): "Which option the user leaned toward"

Prompt: "Suggest 2-3 concrete options. For each, give a one-sentence pro and con. Ask which resonates most. Set next_stage to action-plan when the user picks. Use the respond tool."

action-plan stage:

promptRef: prompts:strategy:action-plan

extractionSpec:

memory.next_steps (string): "Agreed next steps, semicolon-separated"

memory.timeline (string): "When they'll start and any deadlines"

Prompt: "Turn the preferred option into 2-4 concrete next steps with a rough timeline. Confirm with the user. Set next_stage to null when agreed. Use the respond tool."

Checkpoint: h-list-nodes with prefix conversations:strategy should show 4 nodes. Verify stageOrder and each stage's extractionSpec.

3.4 — Verify all prompts exist¶

List all nodes with prefix prompts.

You should see at minimum:

prompts
prompts:setup
prompts:setup:onboard
prompts:onboarding
prompts:onboarding:welcome
prompts:onboarding:background
prompts:onboarding:goals
prompts:strategy
prompts:strategy:diagnose
prompts:strategy:options
prompts:strategy:action-plan
prompts:partials
prompts:partials:metadata-spec

Each leaf prompt should have non-empty content.
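
The conversation and stage prompt paths in the list above follow directly from each conversation's stage order, so the list can be derived instead of typed by hand. The Python sketch below is a hypothetical helper, not a Hadron tool. It assumes you paste the paths returned by h-list-nodes into a list of strings. It does not cover the `prompts:partials` nodes, which are not tied to a stage.

```python
# Conversation name -> stage order, as designed in Part 3.
DESIGN = {
    "setup": ["onboard"],
    "onboarding": ["welcome", "background", "goals"],
    "strategy": ["diagnose", "options", "action-plan"],
}


def expected_prompt_paths(conversations: dict[str, list[str]]) -> list[str]:
    """Build the prompt node paths implied by each conversation's stage order."""
    paths = ["prompts"]
    for conversation, stages in conversations.items():
        paths.append(f"prompts:{conversation}")
        paths.extend(f"prompts:{conversation}:{stage}" for stage in stages)
    return paths


def missing_prompts(listed: list[str], conversations: dict[str, list[str]]) -> list[str]:
    """Return expected prompt paths that are absent from the listed node paths."""
    present = set(listed)
    return [path for path in expected_prompt_paths(conversations) if path not in present]
```

An empty result from `missing_prompts` means every stage has a prompt node. You still need to read each leaf to confirm its content is non-empty.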

Part 4 — Run chats as test personas¶

This is where the two paths diverge. Pick the tab that matches the path you chose at the top.

Path A — Portal Chat

Go back to the portal. Open the Sage agent detail page.

Checkpoint: The Chat tab should now be visible (the agent has AI config + non-setup conversations).

4.1 — Persona 1: Alex Chen¶

Open the Chat tab. Click New chat.

The agent should greet you (the welcome turn from the first non-setup conversation, onboarding).

Play the role of Alex Chen:

Turn	You (as Alex)	What to watch for
1	"Hi! I'm Alex Chen, based in Portland, Oregon."	Agent should extract name + location.
2	"I run a specialty coffee shop. Been at it about 2 years. Just me and two part-time baristas."	Agent should extract business_type, business_age, team_size. Stage transition from `welcome` → `background` (stage toast).
3	"My goal is to break even consistently — we're profitable some months but not others. Biggest challenge is foot traffic dropping in winter."	Agent should extract top_goal + biggest_challenge. Stage transition to `goals`.
4	(Agent should wrap up onboarding and suggest switching to strategy.)	The conversation may end here. Start a new chat and select the `strategy` conversation if the agent doesn't switch automatically.
5	"My winter foot traffic drops 40%. I've tried seasonal drinks but it didn't move the needle."	Diagnose: current_problem + problem_severity.
6	(Respond to the agent's options.) Pick whichever sounds best.	Extract options_discussed + preferred_option.
7	Confirm the action plan the agent proposes.	Extract next_steps + timeline.

Stage transitions to watch for: welcome → background → goals in the onboarding chat. diagnose → options → action-plan in the strategy chat. Each transition shows a stage toast in the UI.

4.2 — Persona 2: Maria Santos¶

Click New chat again. Play Maria — a freelance graphic designer who wants to grow beyond solo work.

Turn	You (as Maria)	What to watch for
1	"Hey, I'm Maria Santos, I'm in Austin, Texas."	Name + location extracted.
2	"I'm a freelance graphic designer. 4 years in, still solo — no employees."	Business info extracted. Stage transition.
3	"Goal: double my revenue this year. Challenge: can't take on more clients without help, but hiring feels risky."	Goal + challenge extracted.
4	Start a strategy chat. "I'm stuck doing everything myself — design, invoicing, client calls. 60-hour weeks, can't scale."	Problem + severity.
5–7	Follow the agent through options and action plan.	Full stage progression.

Maria is a different user only if you can log in as a second account. If not, both personas share the same per-user memory — the chats themselves stay separate.

4.3 — Partial data extraction test¶

Start one more chat. Withhold information deliberately:

Give your name; refuse to say where you're based ("I'd rather not say").
Give your business type; dodge the team size question.

The fields you shared should be populated; the ones you withheld should be absent or null.

Path B — Claude Code (MCP)

Claude Code will play the chatbot (generate responses from the compiled prompt) while you type the user's messages.

Key concepts before you start¶

h-start-chat starts a chat and returns the compiled prompt + tool schema. Claude Code reads the prompt and generates the welcome message.
h-process-chat-response sends the chatbot's response back to Hadron. Pass message, data (extracted fields), next_stage.
h-send-chat-message sends the user's reply and returns an updated prompt + history for the next chatbot response.
The chatbot's response must use the respond tool shape: Claude Code generates message, data, next_stage — not free-form text.
All data goes to memory automatically. Do not use h-add-node or h-update-node to save user data — the Chat API handles it.
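
To make the respond shape concrete, here is a small Python sketch that assembles the three fields named above before they are passed to h-process-chat-response. The function name and the empty-message check are illustrative assumptions. The authoritative schema is whatever h-start-chat returns in its tool definition.

```python
from typing import Any, Optional


def build_respond_payload(
    message: str,
    data: Optional[dict[str, Any]] = None,
    next_stage: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble the message, data, and next_stage fields of a respond call.

    Raises ValueError when the message is blank, because a respond call
    without user-facing text is almost certainly a mistake.
    """
    if not message.strip():
        raise ValueError("message must not be empty")
    return {
        "message": message,
        "data": dict(data or {}),  # copy so the caller's dict is not shared
        "next_stage": next_stage,  # None means "stay in the current stage"
    }


# The welcome turn: nothing extracted yet, stay in the welcome stage.
welcome = build_respond_payload("Hi, I'm Sage. What's your name, and where are you based?")
```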

4.1 — Persona 1: Alex Chen (onboarding → strategy)¶

Start the onboarding chat:

Start a chat with the Sage agent for user alex-chen, conversation onboarding. Use h-start-chat.

Read the returned systemMessage and tools. Generate a welcome message as Sage. Call h-process-chat-response with:

message: your welcome text

data: {} (no data to extract yet)

next_stage: null (stay in welcome stage)

Show me the welcome message and pause.

Write down the chat ID returned by h-start-chat (e.g. chats:20260507-abc12345-onboarding). You'll need it later.

Continue through the turns:

Turn	Alex says	Extract	Transition to
1	"Hi! I'm Alex Chen, based in Portland, Oregon."	`memory.name`, `memory.location`	`background`
2	"I run a specialty coffee shop. 2 years. Me + 2 part-time baristas."	`memory.business_type`, `memory.business_age`, `memory.team_size`	`goals`
3	"Break even consistently. Winter foot traffic drops 40%."	`memory.top_goal`, `memory.biggest_challenge`	null (end)

Each turn: h-send-chat-message → generate response → h-process-chat-response. Watch for stageTransitioned: true and newStageName.
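
It can help to predict what each h-process-chat-response call should report before you run it. The Python sketch below is a local stand-in for one chat and does not call Hadron. It assumes that a stage changes only when next_stage names a different stage in the conversation's stageOrder, and that null values in data are ignored. Both rules are assumptions chosen to match the tables in this guide, not a description of Hadron's implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class LocalChatModel:
    """Predicts stage changes and accumulated data for a single chat."""

    stage_order: list[str]
    stage: str = ""
    data: dict[str, Any] = field(default_factory=dict)
    transitions: int = 0

    def __post_init__(self) -> None:
        # A chat starts in the first stage of the conversation.
        if not self.stage:
            self.stage = self.stage_order[0]

    def process(self, data: dict[str, Any], next_stage: Optional[str]) -> dict[str, Any]:
        """Merge extracted fields and apply the requested stage change."""
        # Skip null values so withheld fields stay absent.
        self.data.update({k: v for k, v in data.items() if v is not None})
        moved = next_stage in self.stage_order and next_stage != self.stage
        if moved:
            self.stage = next_stage
            self.transitions += 1
        return {"stageTransitioned": moved, "newStageName": self.stage if moved else None}


# Alex's onboarding chat: welcome turn, then the three turns from the table.
alex = LocalChatModel(["welcome", "background", "goals"])
alex.process({}, None)
alex.process({"memory.name": "Alex Chen", "memory.location": "Portland, Oregon"}, "background")
alex.process({"memory.business_type": "specialty coffee shop"}, "goals")
alex.process({"memory.top_goal": "break even consistently"}, None)
```

After these calls `alex.transitions` is 2, which is the number of transitions a full three-stage conversation should report.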

Then start a new chat for the strategy conversation:

Start a new chat with Sage for user alex-chen, conversation strategy. Generate the welcome, process it. Then continue.

Turn	Alex says	Extract	Transition to
1	Winter foot traffic problem	`memory.current_problem`, `memory.problem_severity: "high"`	`options`
2	Responds to Sage's options	`memory.options_discussed`, `memory.preferred_option`	`action-plan`
3	Confirms the action plan	`memory.next_steps`, `memory.timeline`	null (end)

4.2 — Persona 2: Maria Santos (onboarding only)¶

Start a chat with Sage for user maria-santos, conversation onboarding. Same flow as Alex.

Turn	Maria says	Extract
Welcome	(Sage greets)	—
1	"Hey, I'm Maria Santos, Austin, Texas."	`memory.name`, `memory.location`
2	"Freelance graphic designer. 4 years, solo."	`memory.business_type`, `memory.business_age`, `memory.team_size`
3	"Double revenue. Can't scale without help; hiring feels risky."	`memory.top_goal`, `memory.biggest_challenge`

4.3 — Partial data test¶

Start a chat with Sage for user partial-test, conversation onboarding.

Give your name: "I'm Pat." → extract memory.name: "Pat".
Refuse location: "I'd rather not say." → omit memory.location or send null.
Give business type: "I sell handmade candles." → extract memory.business_type.
Dodge team size: "It's complicated." → omit memory.team_size.
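
The rule in the steps above (send what the user shared, omit what they withheld) can be expressed as a one-line filter. This Python sketch is illustrative only. Using `None` to mark a refusal is an assumption made for the example.

```python
from typing import Any, Optional


def drop_withheld(answers: dict[str, Optional[Any]]) -> dict[str, Any]:
    """Keep only fields the user actually answered; None marks a refusal or dodge."""
    return {name: value for name, value in answers.items() if value is not None}


# Pat's answers from the partial data test.
pat_answers = {
    "memory.name": "Pat",
    "memory.location": None,       # "I'd rather not say."
    "memory.business_type": "handmade candles",
    "memory.team_size": None,      # "It's complicated."
}

pat_data = drop_withheld(pat_answers)
```

`pat_data` then holds only `memory.name` and `memory.business_type`, which is what the data argument should carry across Pat's turns.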

Part 5 — Verify the results¶

Path A — Portal Chat

5.1 — Check the Chat tab¶

Open the Chat tab. All your chats should be in the sidebar, newest first. Each shows a title (auto-derived from the first user message) and the conversation name. Click each chat — the full message history should be intact.

5.2 — Check per-user memories¶

Per-user memories are private and scoped to the agent. To find them, use Claude Code:

List my memories. Set the active memory to my user memory for the Sage agent.

Or read chat data directly:

List all nodes with prefix chats in my user memory for the Sage agent.

You should see one chat node per session, each with a messages child holding the transcript and a data field carrying the extracted fields.

5.3 — Check extracted data¶

For Alex Chen's onboarding chat, the chat node's data should contain:

{
  "memory.name": "Alex Chen",
  "memory.location": "Portland, Oregon",
  "memory.business_type": "specialty coffee shop",
  "memory.business_age": "2 years",
  "memory.team_size": "3 (1 owner + 2 part-time baristas)",
  "memory.top_goal": "break even consistently",
  "memory.biggest_challenge": "winter foot traffic drop"
}

For the partial-data chat, withheld fields should be absent or null.

5.4 — Check stage transitions¶

Read the chat node's data:

Onboarding chats: progression through welcome → background → goals.
Strategy chats: diagnose → options → action-plan.

If the agent didn't transition when expected:

Did the LLM include next_stage in the respond tool call?
Does the stage's extractionSpec match what the LLM returned?
Is stageOrder correct on the conversation node?

Path B — Claude Code (MCP)

5.1 — Verify chat transcripts¶

List all nodes with prefix chats in the user memory for alex-chen.

You should see:

2 chat nodes (one onboarding, one strategy).
Under each: a messages node with the full transcript.

Read the messages node for Alex's onboarding chat.

Verify the message history contains all turns (user + assistant).

5.2 — Verify extracted data¶

Read the data of Alex's onboarding chat node.

Look for:

{
  "memory.name": "Alex Chen",
  "memory.location": "Portland, Oregon",
  "memory.business_type": "specialty coffee shop",
  "memory.business_age": "2 years",
  "memory.team_size": "3 (1 owner + 2 part-time)",
  "memory.top_goal": "break even consistently",
  "memory.biggest_challenge": "winter foot traffic drop"
}

Same for Alex's strategy chat (6 fields), Maria's onboarding (7 fields), and the partial-data chat (memory.name = "Pat", memory.business_type = "handmade candles", others absent or null).
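
Because the wording of extracted values depends on the model, comparing exact strings tends to be brittle. The Python sketch below checks only which fields are present and which are absent or null. It assumes you have copied a chat node's data into a dictionary. The function name is hypothetical.

```python
from typing import Any, Iterable


def check_extraction(
    actual: dict[str, Any],
    required: Iterable[str],
    withheld: Iterable[str] = (),
) -> list[str]:
    """Return problems found in a chat node's data; an empty list means it passed."""
    problems: list[str] = []
    for name in required:
        # A required field must exist and hold a non-empty value.
        if actual.get(name) in (None, ""):
            problems.append(f"missing: {name}")
    for name in withheld:
        # A withheld field may be absent or explicitly null, nothing else.
        if actual.get(name) is not None:
            problems.append(f"should be absent or null: {name}")
    return problems


# The partial-data chat: two fields shared, two withheld.
pat_node_data = {"memory.name": "Pat", "memory.business_type": "handmade candles", "memory.location": None}
pat_problems = check_extraction(
    pat_node_data,
    required=["memory.name", "memory.business_type"],
    withheld=["memory.location", "memory.team_size"],
)
```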

5.3 — Verify stage transitions¶

Chat	Expected transitions
Alex onboarding	welcome → background → goals (2)
Alex strategy	diagnose → options → action-plan (2)
Maria onboarding	welcome → background → goals (2)
Partial test	welcome → background (1, may stop early)

Expected end state¶

Item	Count	How to verify
Agent	1 (Sage)	Agent detail page
System memory	1 (Sage System)	`h-list-memories`
Knowledge memory	1 (Sage Knowledge)	`h-list-memories`
Conversations	3 (setup + onboarding + strategy)	`h-list-nodes` prefix `conversations`, depth 1
Stages	3 per real conversation (6 total)	`h-list-nodes` per conversation
Prompts	6+ (one per stage)	`h-list-nodes` prefix `prompts`
Chats	4+ (2 Alex, 1–2 Maria, 1 partial)	Chat tab sidebar (Path A) / `h-list-nodes` prefix `chats` (Path B)
Per-user memories	1 per user	Created by `h-start-chat`
Extracted data	Varies per chat	`data` field on chat nodes
Stage transitions	2 per full conversation	Stage toasts (Path A) / `h-process-chat-response` return values (Path B)

Test report checklist¶

Troubleshooting¶

Path A: Chat tab doesn't appear — Three conditions must all hold: (1) system memory is set, (2) LLM provider is configured and tested, (3) at least one non-setup conversation exists (the wizard's setup conversation has isSetup: true and doesn't count).

Path B: h-list-memories shows nothing — Your app doesn't have the agent attached, or the agent doesn't have mcp in its surfaces.

Path B: h-start-chat says "agent has no system memory" — The agent's systemMemoryId isn't set. Check agent Settings.

Path B: h-start-chat says "no non-setup conversation found" — You haven't created onboarding or strategy yet, or one of them has isSetup: true. Only the wizard's setup conversation should have that flag.

Stage didn't transition — The LLM must include next_stage in the respond tool call. If the prompt is vague ("move on when ready"), the LLM may not emit the field. Make the instruction explicit: "set next_stage to <name> when done."

Extraction data is missing — Check two things: (1) the stage's extractionSpec lists the fields, (2) the LLM's response (or your h-process-chat-response call in Path B) includes those fields in data. The Chat API stores what's sent — it doesn't infer from the message text.

Path A: "No AI config saved yet" on test — Either save the config and then click Test, or test unsaved values directly: enter provider, model, and key, then click Test before saving.

Path A: MCP tools don't see the agent — The agent needs mcp in its surfaces. Chatbot agents default to ["api"] only. Add mcp in Settings.

Path B: Claude Code generates weird responses — Claude Code is playing the chatbot, not being one. If responses don't match the persona, just tell it what to say. The point is testing the Chat API flow (extraction, transitions), not the quality of Claude's roleplay.
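
The three Chat tab conditions from the first troubleshooting entry can be written as a checklist function, which is handy when several are failing at once. This Python sketch models the agent with a hypothetical `AgentState` record that you fill in by hand from the portal. It does not query Hadron.

```python
from dataclasses import dataclass


@dataclass
class AgentState:
    """Hand-entered facts about the agent (hypothetical structure)."""

    has_system_memory: bool
    llm_configured_and_tested: bool
    conversations: dict[str, bool]  # conversation name -> isSetup flag


def chat_tab_blockers(agent: AgentState) -> list[str]:
    """Return the unmet conditions that keep the Chat tab hidden."""
    blockers: list[str] = []
    if not agent.has_system_memory:
        blockers.append("system memory is not set")
    if not agent.llm_configured_and_tested:
        blockers.append("LLM provider is not configured and tested")
    # all() is also True for an empty dict, so "no conversations" is caught too.
    if all(agent.conversations.values()):
        blockers.append("no non-setup conversation exists")
    return blockers
```

Right after Part 1 on Path A, the only blocker should be the missing non-setup conversation, which matches the checkpoint there.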

Data model reference (Path B)¶

Useful for verifying state via MCP:

Concept	Where it's stored	How to inspect
Conversations, stages	System memory, `conversations:*`	`h-list-nodes` / `h-read-node`
Prompts	System memory, `prompts:*`	`h-read-node` with `raw: true`
Extraction specs	`data.extractionSpec` on stage nodes	`h-read-node` on the stage
Chat sessions	Per-user memory, `chats:*`	`h-list-nodes` in user memory
Message history	Per-user memory, `chats:<id>:messages`	`h-read-node`
Extracted user data	`data` field on chat / memory nodes	`h-read-node`
Knowledge content	Knowledge memory, any structure	`h-list-nodes` / `h-read-node`

The content field is markdown text; the data field is structured JSON. Don't mix them up.
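
When reading transcripts, the message history path in the table above is built from the chat ID. The Python helper below is a small sketch of that pattern. It accepts the ID with or without the `chats:` prefix, which is an assumption based on the single example ID shown in Part 4.

```python
def messages_path(chat_id: str) -> str:
    """Return the node path holding a chat's message history."""
    # Accept both "chats:<id>" and a bare "<id>".
    node = chat_id if chat_id.startswith("chats:") else f"chats:{chat_id}"
    return f"{node}:messages"
```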

Automated testing with personas¶

After running through this manually, automate it with test personas. Define Alex and Maria once, re-run them whenever you change the chatbot.

See test-personas.md for the full guide. The "live test" mode (POST /api/agent-chat/test-persona) runs all turns using the agent's real LLM and produces a pass/fail report — same fidelity as Path A, but automatic.

Related docs¶

test-personas.md — automated testing with predefined personas (free dry-run + paid live test)
conversation-routing.md — topics, goals, edges, and the routing engine
portal-chat-testing.md — portal Chat tab smoke-test checklist
building-a-chatbot-agent.md — creating a chatbot from scratch
node-types.md — when to use system vs. other node types
template-syntax.md — Mustache template resolution rules
