Chatbot agent end-to-end test¶
A step-by-step guide that builds an agent, designs conversations, runs chats as test personas, and verifies that everything landed correctly in memory. Follow every step — the expected results at the end are specific and testable.
Choose your test path¶
This tutorial supports two ways to run the chats. The agent and conversation design (Parts 1–3) are the same; only the run-and-verify steps differ.
| | Path A — Portal Chat tab | Path B — Claude Code (MCP) |
|---|---|---|
| Plays the chatbot | Your configured LLM (Anthropic / OpenAI / GLM) | Claude Code's own model |
| Cost | LLM API costs per turn | Free |
| Iteration speed | Reload Chat tab between edits | Edit + test in same session |
| Fidelity | Exact production model | Claude Code, not the production model |
| Best for | Final validation | Iterating on prompts, stages, extraction |
Pick Path A when you have a Hadron portal account with an LLM API key and want production-fidelity validation. Pick Path B when you want a fast, free loop while you iterate. You can do both — Path B for development, Path A as a final pass.
The Path A and Path B sections are presented together using tabs in Parts 4 and 5. Switch tabs based on what you chose; the steps inside each tab are self-contained.
What you will build¶
By the end of this test you will have:
- 1 agent ("Sage") with a system memory and a knowledge memory.
- 2 conversations: `onboarding` (learn about the user) and `strategy` (help with a business problem).
- 3 stages per conversation, each with an `extractionSpec` that pulls structured data from the chat.
- 2+ personas (Alex Chen and Maria Santos), each with their own per-user memory created on first chat.
- Chat transcripts stored in each persona's memory, including message history, extracted data, summaries, and stage transitions.
What you need¶
Path A (Portal Chat tab):
- A Hadron portal account at hadronmemory.com with admin access to an organization.
- Claude Code installed with the Hadron MCP server connected (used in Part 3 for both paths to author conversation designs).
- An API key for one of: Anthropic, OpenAI, or GLM.
- A Workstation app (only needed for Part 3 / authoring conversations).
- That's it for Path A; chats run in the browser.
Path B (Claude Code):
- A Workstation app with the Hadron MCP proxy connected (see Connecting an MCP host).
- You do not need an LLM API key. Claude Code plays both the user and the chatbot, and no production LLM is called.
Part 1 — Create the agent (Portal)¶
- Go to your org page → Create Chatbot Agent.
- Fill in:
  - Agent name: `Sage`
  - Description: `AI mentor for small business owners`
  - Visibility: Personal
- On the System Prompt step, enter:
- On the Memories step:
  - System memory: auto-named `Sage System` (URN: `sage-system`) — leave as-is.
  - Knowledge memory: keep checked. Auto-named `Sage Knowledge` (URN: `sage-knowledge`).
- Review and create.
- On the agent detail page → Settings tab, check that surfaces includes `mcp` (chatbot agents default to `["api"]` only). Add `mcp` if missing — Part 3 needs it.
Checkpoint: On the agent detail page you should see:
- Agent "Sage" with system memory set.
- Two memories listed: the system memory and `Sage Knowledge` (read-write).
- The Chatbot Control tab showing one conversation (`setup`) with one stage (`onboard`).
Path A (Portal Chat tab): configure the LLM provider on the agent.
Go to Settings → AI configuration → Configure:
- Pick your provider (e.g. OpenAI).
- Enter model (e.g. `gpt-4o-mini`).
- Paste your API key.
- Save, then Test. Expect "Works" with a short reply.
Checkpoint: The Settings panel now shows "Configured" with your provider, model, and a masked key. The Chat tab should NOT appear yet (no non-setup conversations exist).
Path B (Claude Code): wire your Workstation app to the Sage agent so Claude Code can reach it.
- Go to Apps → your Workstation app → Settings tab → Agents.
- Add the Sage agent with the Develop role (read/write).
No LLM provider configuration is needed for Path B — Claude Code plays the chatbot itself.
Checkpoint: In Claude Code, ask:
List my memories using `h-list-memories`.
You should see Sage System and Sage Knowledge.
Part 2 — Set the active memory (Claude Code)¶
Open Claude Code in the working directory of the Workstation app you configured. Set the active memory to the Sage system memory:
Set the active memory to Sage System.
All subsequent node operations in Part 3 will target this memory.
Part 3 — Design conversations (Claude Code + MCP)¶
Both paths author the conversations the same way: through Claude Code with the Hadron MCP tools.
3.1 — Verify the wizard scaffold¶
The wizard already created a setup conversation with one onboard stage.
Confirm:
List all nodes with prefix `conversations`.
Expected — 3 nodes:
conversations (system)
conversations:setup (system) — data: { isSetup: true, stageOrder: ["onboard"] }
conversations:setup:onboard (system) — data: { promptRef: "prompts:setup:onboard", extractionSpec: [...] }
Read the node at `prompts:setup:onboard` with `raw: true`.
You should see the wizard's default onboard prompt.
3.2 — Create the onboarding conversation¶
Create a conversation called `onboarding` in the Sage system memory with 3 stages: `welcome`, `background`, and `goals`. This is not a setup conversation (`isSetup: false`).

Stage order: welcome → background → goals.

welcome stage:
- `promptRef`: `prompts:onboarding:welcome`
- `extractionSpec`:
  - `memory.name` (string): "The user's full name"
  - `memory.location` (string): "City and state/country"
- Prompt content: "Greet the user warmly. Ask for their name and where they're based. Use the respond tool."

background stage:
- `promptRef`: `prompts:onboarding:background`
- `extractionSpec`:
  - `memory.business_type` (string): "What kind of business they run"
  - `memory.business_age` (string): "How long they've been in business"
  - `memory.team_size` (string): "Number of employees or solo"
- Prompt content: "Ask about their business: what do they do, how long have they been at it, team size? Summarize before moving on. Set `next_stage` to `goals` when done. Use the respond tool."

goals stage:
- `promptRef`: `prompts:onboarding:goals`
- `extractionSpec`:
  - `memory.top_goal` (string): "Their #1 business goal right now"
  - `memory.biggest_challenge` (string): "The main obstacle to that goal"
- Prompt content: "Ask the user what their #1 business goal is right now, and the biggest challenge in the way. Reflect back what you heard. Tell them you'll switch to strategy mode. Set `next_stage` to null. Use the respond tool."
Checkpoint: Run `h-list-nodes` with prefix `conversations:onboarding`.
You should see 4 nodes (parent + 3 stages). Read the parent's data and
verify that `isSetup` is `false` and `stageOrder` is `["welcome", "background", "goals"]`.
Then read each stage's data and verify that each has a `promptRef` and an
`extractionSpec` with the listed fields.
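If you copy the node data out of `h-read-node`, the checkpoint can also be checked mechanically. The following is a minimal sketch, assuming the parent and stage data are available as plain Python dicts with the `isSetup`, `stageOrder`, `promptRef`, and `extractionSpec` keys shown in this guide. The function name `check_conversation` is hypothetical, and the sketch only checks that each `extractionSpec` is non-empty because the stored shape of a spec entry is not shown here.

```python
from typing import Any


def check_conversation(
    name: str,
    parent_data: dict[str, Any],
    stage_data: dict[str, dict[str, Any]],
    expected_order: list[str],
) -> list[str]:
    """Return a list of problems; an empty list means the checkpoint passes."""
    problems: list[str] = []

    # The parent node must be a non-setup conversation with the right stage order.
    if parent_data.get("isSetup") is not False:
        problems.append(f"{name}: isSetup should be false")
    if parent_data.get("stageOrder") != expected_order:
        problems.append(f"{name}: stageOrder is {parent_data.get('stageOrder')!r}")

    # Each stage needs a promptRef that follows the prompts:<conversation>:<stage>
    # pattern used in this guide, plus a non-empty extractionSpec.
    for stage in expected_order:
        data = stage_data.get(stage)
        if data is None:
            problems.append(f"{name}:{stage}: stage node missing")
            continue
        expected_ref = f"prompts:{name}:{stage}"
        if data.get("promptRef") != expected_ref:
            problems.append(f"{name}:{stage}: promptRef should be {expected_ref}")
        if not data.get("extractionSpec"):
            problems.append(f"{name}:{stage}: extractionSpec is empty or missing")
    return problems


order = ["welcome", "background", "goals"]
parent = {"isSetup": False, "stageOrder": order}
stages = {
    s: {"promptRef": f"prompts:onboarding:{s}", "extractionSpec": [{"placeholder": True}]}
    for s in order
}
print(check_conversation("onboarding", parent, stages, order))  # prints []
```

The same function can be reused for the `strategy` conversation by passing its name and stage order.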
3.3 — Create the strategy conversation¶
Create a conversation called `strategy` with 3 stages: `diagnose`, `options`, `action-plan`. Not a setup conversation. Stage order: diagnose → options → action-plan.

diagnose stage:
- `promptRef`: `prompts:strategy:diagnose`
- `extractionSpec`:
  - `memory.current_problem` (string): "The specific problem being discussed"
  - `memory.problem_severity` (string): "How urgent: low, medium, high"
- Prompt: "Ask the user to describe a specific business problem. Probe: when did it start, what have they tried, how urgent? Use the respond tool."

options stage:
- `promptRef`: `prompts:strategy:options`
- `extractionSpec`:
  - `memory.options_discussed` (string): "Comma-separated list of options"
  - `memory.preferred_option` (string): "Which option the user leaned toward"
- Prompt: "Suggest 2-3 concrete options. For each, give a one-sentence pro and con. Ask which resonates most. Set `next_stage` to `action-plan` when the user picks. Use the respond tool."

action-plan stage:
- `promptRef`: `prompts:strategy:action-plan`
- `extractionSpec`:
  - `memory.next_steps` (string): "Agreed next steps, semicolon-separated"
  - `memory.timeline` (string): "When they'll start and any deadlines"
- Prompt: "Turn the preferred option into 2-4 concrete next steps with a rough timeline. Confirm with the user. Set `next_stage` to null when agreed. Use the respond tool."
Checkpoint: h-list-nodes with prefix conversations:strategy should
show 4 nodes. Verify stageOrder and each stage's extractionSpec.
3.4 — Verify all prompts exist¶
List all nodes with prefix `prompts`.
You should see at minimum:
prompts
prompts:setup
prompts:setup:onboard
prompts:onboarding
prompts:onboarding:welcome
prompts:onboarding:background
prompts:onboarding:goals
prompts:strategy
prompts:strategy:diagnose
prompts:strategy:options
prompts:strategy:action-plan
prompts:partials
prompts:partials:metadata-spec
Each leaf prompt should have non-empty content.
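The expected prompt keys follow directly from the conversation and stage names, so a missing prompt can be spotted by diffing. This is a small sketch, assuming the `prompts:<conversation>:<stage>` naming used in this guide; the helper names `expected_prompt_keys` and `missing_prompt_keys` are hypothetical, and the wizard's `prompts:partials` nodes are not derived here.

```python
def expected_prompt_keys(conversations: dict[str, list[str]]) -> set[str]:
    """Build the prompt node keys implied by a {conversation: [stages]} mapping."""
    keys = {"prompts"}
    for conversation, stages in conversations.items():
        keys.add(f"prompts:{conversation}")
        keys.update(f"prompts:{conversation}:{stage}" for stage in stages)
    return keys


def missing_prompt_keys(listed_keys: list[str], conversations: dict[str, list[str]]) -> list[str]:
    """Return expected keys that do not appear in the h-list-nodes output."""
    return sorted(expected_prompt_keys(conversations) - set(listed_keys))


SAGE_CONVERSATIONS = {
    "setup": ["onboard"],
    "onboarding": ["welcome", "background", "goals"],
    "strategy": ["diagnose", "options", "action-plan"],
}
```

Paste the keys returned by the listing into `missing_prompt_keys(listed, SAGE_CONVERSATIONS)`; an empty result means every stage prompt node is present. It does not check that the content is non-empty.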
Part 4 — Run chats as test personas¶
This is where the two paths diverge. Pick the tab that matches the path you chose at the top.
Path A (Portal Chat tab)
Go back to the portal. Open the Sage agent detail page.
Checkpoint: The Chat tab should now be visible (the agent has AI config + non-setup conversations).
4.1 — Persona 1: Alex Chen¶
Open the Chat tab. Click New chat.
The agent should greet you (the welcome turn from the first non-setup
conversation, onboarding).
Play the role of Alex Chen:
| Turn | You (as Alex) | What to watch for |
|---|---|---|
| 1 | "Hi! I'm Alex Chen, based in Portland, Oregon." | Agent should extract name + location. |
| 2 | "I run a specialty coffee shop. Been at it about 2 years. Just me and two part-time baristas." | Agent should extract business_type, business_age, team_size. Stage transition from welcome → background (stage toast). |
| 3 | "My goal is to break even consistently — we're profitable some months but not others. Biggest challenge is foot traffic dropping in winter." | Agent should extract top_goal + biggest_challenge. Stage transition to goals. |
| 4 | (Agent should wrap up onboarding and suggest switching to strategy.) | The conversation may end here. Start a new chat and select the strategy conversation if the agent doesn't switch automatically. |
| 5 | "My winter foot traffic drops 40%. I've tried seasonal drinks but it didn't move the needle." | Diagnose: current_problem + problem_severity. |
| 6 | (Respond to the agent's options.) Pick whichever sounds best. | Extract options_discussed + preferred_option. |
| 7 | Confirm the action plan the agent proposes. | Extract next_steps + timeline. |
Stage transitions to watch for: welcome → background → goals
in the onboarding chat. diagnose → options → action-plan in the
strategy chat. Each transition shows a stage toast in the UI.
4.2 — Persona 2: Maria Santos¶
Click New chat again. Play Maria — a freelance graphic designer who wants to grow beyond solo work.
| Turn | You (as Maria) | What to watch for |
|---|---|---|
| 1 | "Hey, I'm Maria Santos, I'm in Austin, Texas." | Name + location extracted. |
| 2 | "I'm a freelance graphic designer. 4 years in, still solo — no employees." | Business info extracted. Stage transition. |
| 3 | "Goal: double my revenue this year. Challenge: can't take on more clients without help, but hiring feels risky." | Goal + challenge extracted. |
| 4 | Start a strategy chat. "I'm stuck doing everything myself — design, invoicing, client calls. 60-hour weeks, can't scale." | Problem + severity. |
| 5–7 | Follow the agent through options and action plan. | Full stage progression. |
Maria is a different user only if you can log in as a second account. If not, both personas share the same per-user memory — the chats themselves stay separate.
4.3 — Partial data extraction test¶
Start one more chat. Withhold information deliberately:
- Give your name; refuse to say where you're based ("I'd rather not say").
- Give your business type; dodge the team size question.
The fields you shared should be populated; the ones you withheld should be absent or null.
Path B (Claude Code)
Claude Code will play the chatbot (generate responses from the compiled prompt) while you type the user's messages.
Key concepts before you start¶
- `h-start-chat` starts a chat and returns the compiled prompt + tool schema. Claude Code reads the prompt and generates the welcome message.
- `h-process-chat-response` sends the chatbot's response back to Hadron. Pass `message`, `data` (extracted fields), `next_stage`.
- `h-send-chat-message` sends the user's reply and returns an updated prompt + history for the next chatbot response.
- The chatbot's response must use the `respond` tool shape: Claude Code generates `message`, `data`, `next_stage` — not free-form text.
- All data goes to memory automatically. Do not use `h-add-node` or `h-update-node` to save user data — the Chat API handles it.
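To make the `respond` shape concrete, here is a minimal sketch of a local helper that assembles the three fields named above before they are passed to `h-process-chat-response`. The helper `build_respond_args` and its optional `spec_fields` guard are hypothetical and run only on your side; they are not part of the Hadron API, and the server's own validation rules are not described in this guide.

```python
from typing import Any, Iterable, Optional


def build_respond_args(
    message: str,
    data: Optional[dict[str, Any]] = None,
    next_stage: Optional[str] = None,
    spec_fields: Optional[Iterable[str]] = None,
) -> dict[str, Any]:
    """Assemble the message / data / next_stage arguments for one chatbot turn."""
    if not isinstance(message, str) or not message.strip():
        raise ValueError("message must be non-empty text")

    payload_data = dict(data or {})
    # Local guard (an assumption, not server behavior): reject fields that the
    # current stage's extractionSpec does not list, to catch typos early.
    if spec_fields is not None:
        unknown = sorted(set(payload_data) - set(spec_fields))
        if unknown:
            raise ValueError(f"fields not in the stage's extractionSpec: {unknown}")

    if next_stage is not None and not isinstance(next_stage, str):
        raise ValueError("next_stage must be a stage name or None")

    return {"message": message, "data": payload_data, "next_stage": next_stage}


welcome_turn = build_respond_args("Hi, I'm Sage. What's your name?", {}, None)
```

For the welcome turn, `data` stays empty and `next_stage` stays `None`, matching the first call in the walkthrough that follows.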
4.1 — Persona 1: Alex Chen (onboarding → strategy)¶
Start the onboarding chat:
Start a chat with the Sage agent for user `alex-chen`, conversation `onboarding`. Use `h-start-chat`.

Read the returned `systemMessage` and `tools`. Generate a welcome message as Sage. Call `h-process-chat-response` with:
- `message`: your welcome text
- `data`: {} (no data to extract yet)
- `next_stage`: null (stay in welcome stage)

Show me the welcome message and pause.
Write down the chat ID returned by `h-start-chat` (e.g. `chats:20260507-abc12345-onboarding`). You'll need it later.
Continue through the turns:
| Turn | Alex says | Extract | Transition to |
|---|---|---|---|
| 1 | "Hi! I'm Alex Chen, based in Portland, Oregon." | memory.name, memory.location | `background` |
| 2 | "I run a specialty coffee shop. 2 years. Me + 2 part-time baristas." | memory.business_type, memory.business_age, memory.team_size | `goals` |
| 3 | "Break even consistently. Winter foot traffic drops 40%." | memory.top_goal, memory.biggest_challenge | `null` (end) |
Each turn: h-send-chat-message → generate response → h-process-chat-response. Watch for stageTransitioned: true and newStageName.
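The per-turn loop can be pictured as the sketch below. In Path B Claude Code makes these tool calls itself, so this is only an illustration of the order of operations: `run_turns`, the three injected callables, and the `chat_id` argument are hypothetical stand-ins for `h-send-chat-message`, the model's reply, and `h-process-chat-response`. Only the `message`, `data`, `next_stage`, `stageTransitioned`, and `newStageName` names come from this guide.

```python
from typing import Any, Callable, Optional

# One scripted turn: (user text, data to extract, next_stage or None).
Turn = tuple[str, dict[str, Any], Optional[str]]


def run_turns(
    chat_id: str,
    turns: list[Turn],
    send_message: Callable[[str, str], Any],
    generate_reply: Callable[[Any], str],
    process_response: Callable[[str, dict[str, Any]], dict[str, Any]],
) -> list[dict[str, Any]]:
    """Drive scripted turns and collect the result of each processed response."""
    results: list[dict[str, Any]] = []
    for user_text, data, next_stage in turns:
        # 1. Send the user's reply; get back the updated prompt and history.
        prompt = send_message(chat_id, user_text)
        # 2. Produce the chatbot's message from that prompt.
        reply = generate_reply(prompt)
        # 3. Hand the respond-shaped payload back for storage.
        result = process_response(
            chat_id, {"message": reply, "data": data, "next_stage": next_stage}
        )
        if result.get("stageTransitioned"):
            print(f"transitioned to {result.get('newStageName')}")
        results.append(result)
    return results
```

Keeping the three steps as injected callables means the same loop can be exercised with stubs, without touching a real chat.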
Then start a new chat for the strategy conversation:
Start a new chat with Sage for user `alex-chen`, conversation `strategy`. Generate the welcome, process it. Then continue.
| Turn | Alex says | Extract | Transition to |
|---|---|---|---|
| 1 | Winter foot traffic problem | memory.current_problem, memory.problem_severity: "high" | `options` |
| 2 | Responds to Sage's options | memory.options_discussed, memory.preferred_option | `action-plan` |
| 3 | Confirms the action plan | memory.next_steps, memory.timeline | `null` (end) |
4.2 — Persona 2: Maria Santos (onboarding only)¶
Start a chat with Sage for user `maria-santos`, conversation `onboarding`. Same flow as Alex.
| Turn | Maria says | Extract |
|---|---|---|
| Welcome | (Sage greets) | — |
| 1 | "Hey, I'm Maria Santos, Austin, Texas." | memory.name, memory.location |
| 2 | "Freelance graphic designer. 4 years, solo." | memory.business_type, memory.business_age, memory.team_size |
| 3 | "Double revenue. Can't scale without help; hiring feels risky." | memory.top_goal, memory.biggest_challenge |
4.3 — Partial data test¶
Start a chat with Sage for user `partial-test`, conversation `onboarding`.
- Give your name: "I'm Pat." → extract `memory.name: "Pat"`.
- Refuse location: "I'd rather not say." → omit `memory.location` or send null.
- Give business type: "I sell handmade candles." → extract `memory.business_type`.
- Dodge team size: "It's complicated." → omit `memory.team_size`.
Part 5 — Verify the results¶
Path A (Portal Chat tab)
5.1 — Check the Chat tab¶
Open the Chat tab. All your chats should be in the sidebar, newest first. Each shows a title (auto-derived from the first user message) and the conversation name. Click each chat — the full message history should be intact.
5.2 — Check per-user memories¶
Per-user memories are private and scoped to the agent. To find them, use Claude Code:
List my memories. Set the active memory to my user memory for the Sage agent.
Or read chat data directly:
List all nodes with prefix `chats` in my user memory for the Sage agent.
You should see chat nodes per session, with a messages child holding
the transcript and a data field carrying the extracted fields.
5.3 — Check extracted data¶
For Alex Chen's onboarding chat, the chat node's data should contain:
{
"memory.name": "Alex Chen",
"memory.location": "Portland, Oregon",
"memory.business_type": "specialty coffee shop",
"memory.business_age": "2 years",
"memory.team_size": "3 (1 owner + 2 part-time baristas)",
"memory.top_goal": "break even consistently",
"memory.biggest_challenge": "winter foot traffic drop"
}
For the partial-data chat, withheld fields should be absent or null.
5.4 — Check stage transitions¶
Read the chat node's data:
- Onboarding chats: progression through `welcome` → `background` → `goals`.
- Strategy chats: `diagnose` → `options` → `action-plan`.
If the agent didn't transition when expected:
- Did the LLM include `next_stage` in the respond tool call?
- Does the stage's `extractionSpec` match what the LLM returned?
- Is `stageOrder` correct on the conversation node?
Path B (Claude Code)
5.1 — Verify chat transcripts¶
List all nodes with prefix `chats` in the user memory for `alex-chen`.
You should see:
- 2 chat nodes (one onboarding, one strategy).
- Under each: a `messages` node with the full transcript.
Read the messages node for Alex's onboarding chat.
Verify the message history contains all turns (user + assistant).
5.2 — Verify extracted data¶
Read the data of Alex's onboarding chat node.
Look for:
{
"memory.name": "Alex Chen",
"memory.location": "Portland, Oregon",
"memory.business_type": "specialty coffee shop",
"memory.business_age": "2 years",
"memory.team_size": "3 (1 owner + 2 part-time)",
"memory.top_goal": "break even consistently",
"memory.biggest_challenge": "winter foot traffic drop"
}
Check the same for Alex's strategy chat (6 fields), Maria's onboarding chat (7 fields,
the same keys as Alex's), and the partial-data chat (`memory.name` = "Pat",
`memory.business_type` = "handmade candles", others absent or null).
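The comparison above can be scripted once the chat node's `data` has been copied out. The sketch below is one possible check, assuming `data` is a flat dict keyed by the `memory.*` field names; the function `check_extraction` is hypothetical. It uses a case-insensitive substring match rather than strict equality, because the stored wording can vary between runs (for example the `memory.team_size` value).

```python
from typing import Any, Iterable


def check_extraction(
    data: dict[str, Any],
    expected: dict[str, str],
    withheld: Iterable[str] = (),
) -> list[str]:
    """Return a list of problems; an empty list means the data looks right."""
    problems: list[str] = []

    # Fields the persona shared must be present and contain the expected text.
    for field, want in expected.items():
        got = data.get(field)
        if got is None or want.lower() not in str(got).lower():
            problems.append(f"{field}: expected something containing {want!r}, got {got!r}")

    # Fields the persona withheld must be absent or null.
    for field in withheld:
        if data.get(field) is not None:
            problems.append(f"{field}: should be absent or null, got {data[field]!r}")
    return problems


partial = {"memory.name": "Pat", "memory.business_type": "handmade candles", "memory.location": None}
print(
    check_extraction(
        partial,
        expected={"memory.name": "Pat", "memory.business_type": "candles"},
        withheld=["memory.location", "memory.team_size"],
    )
)  # prints []
```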
5.3 — Verify stage transitions¶
| Chat | Expected transitions |
|---|---|
| Alex onboarding | welcome → background → goals (2) |
| Alex strategy | diagnose → options → action-plan (2) |
| Maria onboarding | welcome → background → goals (2) |
| Partial test | welcome → background (1, may stop early) |
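The transition counts in the table can be derived from the values returned by `h-process-chat-response`. The following is a rough sketch, assuming each result is a dict carrying `stageTransitioned` and `newStageName` and that a chat starts in the first stage of `stageOrder`; the helpers `stage_path` and `path_matches` are hypothetical names.

```python
from typing import Any


def stage_path(first_stage: str, results: list[dict[str, Any]]) -> list[str]:
    """Rebuild the sequence of stages a chat passed through."""
    path = [first_stage]
    for result in results:
        new_stage = result.get("newStageName")
        if result.get("stageTransitioned") and new_stage:
            path.append(new_stage)
    return path


def path_matches(path: list[str], stage_order: list[str], allow_early_stop: bool = False) -> bool:
    """Compare the observed path with stageOrder; optionally accept a prefix."""
    if allow_early_stop:
        return path == stage_order[: len(path)]
    return path == stage_order


onboarding_order = ["welcome", "background", "goals"]
observed = stage_path(
    "welcome",
    [
        {"stageTransitioned": True, "newStageName": "background"},
        {"stageTransitioned": True, "newStageName": "goals"},
        {"stageTransitioned": False},
    ],
)
transition_count = len(observed) - 1  # 2 for a full onboarding chat
```

For the partial-data chat, pass `allow_early_stop=True` so a path that stops at `background` still counts as valid.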
Expected end state¶
| Item | Count | How to verify |
|---|---|---|
| Agent | 1 (Sage) | Agent detail page |
| System memory | 1 (Sage System) | h-list-memories |
| Knowledge memory | 1 (Sage Knowledge) | h-list-memories |
| Conversations | 3 (setup + onboarding + strategy) | h-list-nodes prefix conversations, depth 1 |
| Stages | 3 per real conversation (6 total) | h-list-nodes per conversation |
| Prompts | 6+ (one per stage) | h-list-nodes prefix prompts |
| Chats | 4+ (2 Alex, 1–2 Maria, 1 partial) | Chat tab sidebar (Path A) / h-list-nodes prefix chats (Path B) |
| Per-user memories | 1 per user | Created by h-start-chat |
| Extracted data | Varies per chat | data field on chat nodes |
| Stage transitions | 2 per full conversation | Stage toasts (Path A) / h-process-chat-response return values (Path B) |
Test report checklist¶
- Agent created with correct system + knowledge memory
- `mcp` in agent surfaces
- Conversations and stages match the spec above
- All 6 stage prompts created with correct content
- Each stage has `extractionSpec` in `data`
- Path A: Chat tab appeared after AI config + conversations were set up
- Path B: Workstation app has Sage agent attached with Develop role
- Alex Chen onboarding: 7 fields extracted, 2 stage transitions
- Alex Chen strategy: 6 fields extracted, 2 stage transitions
- Maria Santos onboarding: 7 fields extracted, 2 stage transitions
- Partial data chat: present fields correct, absent fields null/missing
- Chat transcripts intact (messages preserved on reload)
- Per-user memory exists and contains chat nodes
- Any bugs, unexpected behavior, or unclear steps noted
Troubleshooting¶
Path A: Chat tab doesn't appear — Three conditions must all hold:
(1) system memory is set, (2) LLM provider is configured and tested,
(3) at least one non-setup conversation exists (the wizard's setup
conversation has isSetup: true and doesn't count).
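The three conditions can be written as a small predicate, which helps when working out which one is failing. This is only a sketch of the rule as stated here, not the portal's actual code; `chat_tab_blockers` and its parameters are hypothetical, and the conversation dicts are assumed to carry the `isSetup` flag from the conversation node data.

```python
from typing import Any, Optional


def chat_tab_blockers(
    system_memory_id: Optional[str],
    llm_configured_and_tested: bool,
    conversations: list[dict[str, Any]],
) -> list[str]:
    """List the unmet conditions; an empty list means the Chat tab should appear."""
    blockers: list[str] = []
    if not system_memory_id:
        blockers.append("system memory is not set")
    if not llm_configured_and_tested:
        blockers.append("LLM provider is not configured and tested")
    # The wizard's setup conversation (isSetup: true) does not count.
    if not any(not c.get("isSetup", False) for c in conversations):
        blockers.append("no non-setup conversation exists")
    return blockers
```

Right after the wizard finishes, only the setup conversation exists, so the third condition is the one reported until Part 3 is complete.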
Path B: h-list-memories shows nothing — Your app doesn't have the
agent attached, or the agent doesn't have mcp in its surfaces.
Path B: h-start-chat says "agent has no system memory" — The agent's
systemMemoryId isn't set. Check agent Settings.
Path B: h-start-chat says "no non-setup conversation found" — You
haven't created onboarding or strategy yet, or one of them has
isSetup: true. Only the wizard's setup conversation should have that
flag.
Stage didn't transition — The LLM must include next_stage in the
respond tool call. If the prompt is vague ("move on when ready"), the LLM
may not emit the field. Make the instruction explicit: "set next_stage
to <name> when done."
Extraction data is missing — Check two things: (1) the stage's
extractionSpec lists the fields, (2) the LLM's response (or your
h-process-chat-response call in Path B) includes those fields in data.
The Chat API stores what's sent — it doesn't infer from the message text.
Path A: "No AI config saved yet" on test — Either save the config first and then click Test, or enter provider, model, and key and click Test before saving; the Test button also works on unsaved values.
Path A: MCP tools don't see the agent — The agent needs mcp in its
surfaces. Chatbot agents default to ["api"] only. Add mcp in Settings.
Path B: Claude Code generates weird responses — Claude Code is playing the chatbot, not being one. If responses don't match the persona, just tell it what to say. The point is testing the Chat API flow (extraction, transitions), not the quality of Claude's roleplay.
Data model reference (Path B)¶
Useful for verifying state via MCP:
| Concept | Where it's stored | How to inspect |
|---|---|---|
| Conversations, stages | System memory, `conversations:*` | `h-list-nodes` / `h-read-node` |
| Prompts | System memory, `prompts:*` | `h-read-node` with `raw: true` |
| Extraction specs | `data.extractionSpec` on stage nodes | `h-read-node` on the stage |
| Chat sessions | Per-user memory, `chats:*` | `h-list-nodes` in user memory |
| Message history | Per-user memory, `chats:<id>:messages` | `h-read-node` |
| Extracted user data | `data` field on chat / memory nodes | `h-read-node` |
| Knowledge content | Knowledge memory, any structure | h-list-nodes / h-read-node |
The content field is markdown text; the data field is structured JSON.
Don't mix them up.
Automated testing with personas¶
After running through this manually, automate it with test personas. Define Alex and Maria once, re-run them whenever you change the chatbot.
See test-personas.md for the full guide.
The "live test" mode (POST /api/agent-chat/test-persona) runs all turns
using the agent's real LLM and produces a pass/fail report — same fidelity
as Path A, but automatic.
Related docs¶
- test-personas.md — automated testing with predefined personas (free dry-run + paid live test)
- conversation-routing.md — topics, goals, edges, and the routing engine
- portal-chat-testing.md — portal Chat tab smoke-test checklist
- building-a-chatbot-agent.md — creating a chatbot from scratch
- node-types.md — when to use `system` vs. other node types
- template-syntax.md — Mustache template resolution rules