Chatbot agent end-to-end test

A step-by-step guide that builds an agent, designs conversations, runs chats as test personas, and verifies that everything landed correctly in memory. Follow every step — the expected results at the end are specific and testable.

Choose your test path

This tutorial supports two ways to run the chats. The agent and conversation design (Parts 1–3) are the same; only the run-and-verify steps differ.

Aspect | Path A — Portal Chat tab | Path B — Claude Code (MCP)
Plays the chatbot | Your configured LLM (Anthropic / OpenAI / GLM) | Claude Code's own model
Cost | LLM API costs per turn | Free
Iteration speed | Reload Chat tab between edits | Edit + test in same session
Fidelity | Exact production model | Claude Code, not the production model
Best for | Final validation | Iterating on prompts, stages, extraction

Pick Path A when you have a Hadron portal account with an LLM API key and want production-fidelity validation. Pick Path B when you want a fast, free loop while you iterate. You can do both — Path B for development, Path A as a final pass.

The Path A and Path B sections are presented together using tabs in Parts 4 and 5. Switch tabs based on what you chose; the steps inside each tab are self-contained.

What you will build

By the end of this test you will have:

  • 1 agent ("Sage") with a system memory and a knowledge memory.
  • 2 conversations: onboarding (learn about the user) and strategy (help with a business problem).
  • 3 stages per conversation, each with an extractionSpec that pulls structured data from the chat.
  • 2+ personas (Alex Chen and Maria Santos), each with their own per-user memory created on first chat.
  • Chat transcripts stored in each persona's memory, including message history, extracted data, summaries, and stage transitions.

What you need

  • A Hadron portal account at hadronmemory.com with admin access to an organization.
  • Claude Code installed with the Hadron MCP server connected (used in Part 3 for both paths to author conversation designs).
  • Path A only: an API key for one of Anthropic, OpenAI, or GLM.
  • Path A only: a Workstation app (needed only for Part 3, authoring conversations). That's all for Path A; chats run in the browser.
  • Path B only: a Workstation app with the Hadron MCP proxy connected (see Connecting an MCP host).
  • Path B only: no LLM API key is needed. Claude Code plays both the user and the chatbot, so no production LLM is called.

Part 1 — Create the agent (Portal)

  1. Go to your org page → Create Chatbot Agent.
  2. Fill in:
    • Agent name: Sage
    • Description: AI mentor for small business owners
    • Visibility: Personal
  3. On the System Prompt step, enter:
    You are Sage, a warm and practical AI mentor for small business
    owners. You ask clarifying questions before giving advice. You are
    direct but encouraging. When you don't know something, say so.
    
  4. On the Memories step:
    • System memory: auto-named Sage System (URN: sage-system) — leave as-is.
    • Knowledge memory: keep checked. Auto-named Sage Knowledge (URN: sage-knowledge).
  5. Review and create.
  6. On the agent detail page → Settings tab, check that surfaces includes mcp (chatbot agents default to ["api"] only). Add mcp if missing — Part 3 needs it.

Checkpoint: On the agent detail page you should see:

  • Agent "Sage" with system memory set.
  • Two memories listed, including Sage Knowledge (read-write).
  • The Chatbot Control tab showing one conversation (setup) with one stage (onboard).

Path A: Configure the LLM provider on the agent.

Go to Settings → AI configuration → Configure:

  • Pick your provider (e.g. OpenAI).
  • Enter model (e.g. gpt-4o-mini).
  • Paste your API key.
  • Save, then Test. Expect "Works" with a short reply.

Checkpoint: The Settings panel now shows "Configured" with your provider, model, and a masked key. The Chat tab should NOT appear yet (no non-setup conversations exist).

Path B: Wire your Workstation app to the Sage agent so Claude Code can reach it.

  1. Go to Apps → your Workstation app → Settings tab → Agents.
  2. Add the Sage agent with the Develop role (read/write).

No LLM provider configuration is needed for Path B — Claude Code plays the chatbot itself.

Checkpoint: In Claude Code, ask:

List my memories using h-list-memories.

You should see Sage System and Sage Knowledge.


Part 2 — Set the active memory (Claude Code)

Open Claude Code in the working directory of the Workstation app you configured. Set the active memory to the Sage system memory:

Set the active memory to Sage System.

All subsequent node operations in Part 3 will target this memory.


Part 3 — Design conversations (Claude Code + MCP)

Both paths author the conversations the same way: through Claude Code with the Hadron MCP tools.

3.1 — Verify the wizard scaffold

The wizard already created a setup conversation with one onboard stage. Confirm:

List all nodes with prefix conversations.

Expected — 3 nodes:

conversations             (system)
conversations:setup       (system) — data: { isSetup: true, stageOrder: ["onboard"] }
conversations:setup:onboard  (system) — data: { promptRef: "prompts:setup:onboard", extractionSpec: [...] }

Read the node at prompts:setup:onboard with raw: true.

You should see the wizard's default onboard prompt.

3.2 — Create the onboarding conversation

Create a conversation called onboarding in the Sage system memory with 3 stages: welcome, background, and goals. This is not a setup conversation (isSetup: false).

Stage order: welcome → background → goals.

welcome stage:

  • promptRef: prompts:onboarding:welcome
  • extractionSpec:
    • memory.name (string): "The user's full name"
    • memory.location (string): "City and state/country"
  • Prompt content: "Greet the user warmly. Ask for their name and where they're based. Use the respond tool."

background stage:

  • promptRef: prompts:onboarding:background
  • extractionSpec:
    • memory.business_type (string): "What kind of business they run"
    • memory.business_age (string): "How long they've been in business"
    • memory.team_size (string): "Number of employees or solo"
  • Prompt content: "Ask about their business: what do they do, how long have they been at it, team size? Summarize before moving on. Set next_stage to goals when done. Use the respond tool."

goals stage:

  • promptRef: prompts:onboarding:goals
  • extractionSpec:
    • memory.top_goal (string): "Their #1 business goal right now"
    • memory.biggest_challenge (string): "The main obstacle to that goal"
  • Prompt content: "Ask the user what their #1 business goal is right now, and the biggest challenge in the way. Reflect back what you heard. Tell them you'll switch to strategy mode. Set next_stage to null. Use the respond tool."

Checkpoint: Run h-list-nodes with prefix conversations:onboarding. You should see 4 nodes (parent + 3 stages). Read the parent's data and verify:

{ "isSetup": false, "stageOrder": ["welcome", "background", "goals"] }

Read each stage's data and verify each has a promptRef and an extractionSpec with the listed fields.
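
The checkpoint above can also be checked mechanically once you have copied the node data out of Claude Code. The following is a minimal Python sketch, not part of Hadron: `check_conversation` is a hypothetical helper, and it assumes each `extractionSpec` is a list of entries with a `field` key. The guide only shows the spec as `[...]`, so adjust the lookup if your nodes use a different shape.

```python
def check_conversation(parent_data, stages, expected_order, expected_fields):
    """Return a list of problems; an empty list means the design matches.

    parent_data:     data of the conversation node
    stages:          stage name -> data of that stage node
    expected_fields: stage name -> set of field names the spec should list
    ASSUMPTION: extractionSpec is a list of dicts with a "field" key.
    """
    problems = []
    if parent_data.get("isSetup") is not False:
        problems.append("isSetup should be false")
    if parent_data.get("stageOrder") != expected_order:
        problems.append(f"stageOrder is {parent_data.get('stageOrder')!r}")
    for name in expected_order:
        data = stages.get(name)
        if data is None:
            problems.append(f"{name}: stage node missing")
            continue
        if not data.get("promptRef"):
            problems.append(f"{name}: promptRef missing")
        listed = {entry.get("field") for entry in data.get("extractionSpec", [])}
        missing = expected_fields.get(name, set()) - listed
        if missing:
            problems.append(f"{name}: extractionSpec lacks {sorted(missing)}")
    return problems


# Values copied by hand from the h-read-node output (only one stage pasted so far).
parent = {"isSetup": False, "stageOrder": ["welcome", "background", "goals"]}
stages = {
    "welcome": {
        "promptRef": "prompts:onboarding:welcome",
        "extractionSpec": [{"field": "memory.name"}, {"field": "memory.location"}],
    },
}
expected = {"welcome": {"memory.name", "memory.location"}}

for problem in check_conversation(parent, stages, parent["stageOrder"], expected):
    print(problem)
```

With only the welcome stage pasted in, the sketch reports the two stages it has not been given yet; paste the remaining stage data to get an empty report.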

3.3 — Create the strategy conversation

Create a conversation called strategy with 3 stages: diagnose, options, action-plan. Not a setup conversation. Stage order: diagnose → options → action-plan.

diagnose stage:

  • promptRef: prompts:strategy:diagnose
  • extractionSpec:
    • memory.current_problem (string): "The specific problem being discussed"
    • memory.problem_severity (string): "How urgent: low, medium, high"
  • Prompt: "Ask the user to describe a specific business problem. Probe: when did it start, what have they tried, how urgent? Use the respond tool."

options stage:

  • promptRef: prompts:strategy:options
  • extractionSpec:
    • memory.options_discussed (string): "Comma-separated list of options"
    • memory.preferred_option (string): "Which option the user leaned toward"
  • Prompt: "Suggest 2-3 concrete options. For each, give a one-sentence pro and con. Ask which resonates most. Set next_stage to action-plan when the user picks. Use the respond tool."

action-plan stage:

  • promptRef: prompts:strategy:action-plan
  • extractionSpec:
    • memory.next_steps (string): "Agreed next steps, semicolon-separated"
    • memory.timeline (string): "When they'll start and any deadlines"
  • Prompt: "Turn the preferred option into 2-4 concrete next steps with a rough timeline. Confirm with the user. Set next_stage to null when agreed. Use the respond tool."

Checkpoint: h-list-nodes with prefix conversations:strategy should show 4 nodes. Verify stageOrder and each stage's extractionSpec.

3.4 — Verify all prompts exist

List all nodes with prefix prompts.

You should see at minimum:

prompts
prompts:setup
prompts:setup:onboard
prompts:onboarding
prompts:onboarding:welcome
prompts:onboarding:background
prompts:onboarding:goals
prompts:strategy
prompts:strategy:diagnose
prompts:strategy:options
prompts:strategy:action-plan
prompts:partials
prompts:partials:metadata-spec

Each leaf prompt should have non-empty content.
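
The expected prompt IDs follow the `prompts:<conversation>:<stage>` pattern used by the promptRef values in 3.2 and 3.3, so the list can be derived from the conversation design. A small Python sketch under that assumption; `expected_prompt_ids` and `missing_prompts` are illustrative helpers, not Hadron tools, and the `prompts:partials` nodes are deliberately left out because they are not tied to a stage.

```python
def expected_prompt_ids(conversations):
    """conversations: conversation name -> ordered list of stage names."""
    ids = ["prompts"]
    for conv, stage_names in conversations.items():
        ids.append(f"prompts:{conv}")
        ids.extend(f"prompts:{conv}:{stage}" for stage in stage_names)
    return ids


def missing_prompts(listed_ids, conversations):
    """Return the expected IDs that do not appear in the h-list-nodes output."""
    listed = set(listed_ids)
    return [pid for pid in expected_prompt_ids(conversations) if pid not in listed]


DESIGN = {
    "setup": ["onboard"],
    "onboarding": ["welcome", "background", "goals"],
    "strategy": ["diagnose", "options", "action-plan"],
}

# Paste the IDs returned by "List all nodes with prefix prompts" here.
listed = ["prompts", "prompts:setup", "prompts:setup:onboard", "prompts:onboarding"]
print(missing_prompts(listed, DESIGN))
```

An empty result means every stage has a prompt node; it does not check that the prompt content is non-empty, which still needs a read of each leaf.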


Part 4 — Run chats as test personas

This is where the two paths diverge. Pick the tab that matches the path you chose at the top.

Path A: Portal Chat tab

Go back to the portal. Open the Sage agent detail page.

Checkpoint: The Chat tab should now be visible (the agent has AI config + non-setup conversations).

4.1 — Persona 1: Alex Chen

Open the Chat tab. Click New chat.

The agent should greet you (the welcome turn from the first non-setup conversation, onboarding).

Play the role of Alex Chen:

Turn | You (as Alex) | What to watch for
1 | "Hi! I'm Alex Chen, based in Portland, Oregon." | Agent should extract name + location.
2 | "I run a specialty coffee shop. Been at it about 2 years. Just me and two part-time baristas." | Agent should extract business_type, business_age, team_size. Stage transition from welcome → background (stage toast).
3 | "My goal is to break even consistently — we're profitable some months but not others. Biggest challenge is foot traffic dropping in winter." | Agent should extract top_goal + biggest_challenge. Stage transition to goals.
4 | (Agent should wrap up onboarding and suggest switching to strategy.) | The conversation may end here. Start a new chat and select the strategy conversation if the agent doesn't switch automatically.
5 | "My winter foot traffic drops 40%. I've tried seasonal drinks but it didn't move the needle." | Diagnose: current_problem + problem_severity.
6 | (Respond to the agent's options.) Pick whichever sounds best. | Extract options_discussed + preferred_option.
7 | Confirm the action plan the agent proposes. | Extract next_steps + timeline.

Stage transitions to watch for: welcome → background → goals in the onboarding chat; diagnose → options → action-plan in the strategy chat. Each transition shows a stage toast in the UI.

4.2 — Persona 2: Maria Santos

Click New chat again. Play Maria — a freelance graphic designer who wants to grow beyond solo work.

Turn | You (as Maria) | What to watch for
1 | "Hey, I'm Maria Santos, I'm in Austin, Texas." | Name + location extracted.
2 | "I'm a freelance graphic designer. 4 years in, still solo — no employees." | Business info extracted. Stage transition.
3 | "Goal: double my revenue this year. Challenge: can't take on more clients without help, but hiring feels risky." | Goal + challenge extracted.
4 | Start a strategy chat. "I'm stuck doing everything myself — design, invoicing, client calls. 60-hour weeks, can't scale." | Problem + severity.
5–7 | Follow the agent through options and action plan. | Full stage progression.

Maria is a different user only if you can log in as a second account. If not, both personas share the same per-user memory — the chats themselves stay separate.

4.3 — Partial data extraction test

Start one more chat. Withhold information deliberately:

  • Give your name; refuse to say where you're based ("I'd rather not say").
  • Give your business type; dodge the team size question.

The fields you shared should be populated; the ones you withheld should be absent or null.

Path B: Claude Code (MCP)

Claude Code will play the chatbot (generating responses from the compiled prompt) while you type the user's messages.

Key concepts before you start

  • h-start-chat starts a chat and returns the compiled prompt + tool schema. Claude Code reads the prompt and generates the welcome message.
  • h-process-chat-response sends the chatbot's response back to Hadron. Pass message, data (extracted fields), next_stage.
  • h-send-chat-message sends the user's reply and returns an updated prompt + history for the next chatbot response.
  • The chatbot's response must use the respond tool shape: Claude Code generates message, data, next_stage — not free-form text.
  • All data goes to memory automatically. Do not use h-add-node or h-update-node to save user data — the Chat API handles it.
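
The respond shape described above has three parts: message, data, and next_stage. The Python sketch below only illustrates that shape and a couple of sanity checks you might apply before a turn is processed. `build_respond_call` is a hypothetical helper; how the values are actually passed to h-process-chat-response is handled by Claude Code and the MCP server, and is not shown here.

```python
def build_respond_call(message, data=None, next_stage=None, stage_order=None):
    """Assemble the three fields the guide lists for h-process-chat-response.

    stage_order is optional; when given, next_stage must be one of its entries
    or None. This check is an illustration, not Hadron's own validation.
    """
    if not isinstance(message, str) or not message.strip():
        raise ValueError("message must be non-empty text")
    if stage_order is not None and next_stage is not None and next_stage not in stage_order:
        raise ValueError(f"unknown stage: {next_stage}")
    return {"message": message, "data": dict(data or {}), "next_stage": next_stage}


ORDER = ["welcome", "background", "goals"]

# Welcome turn: nothing to extract yet, stay in the current stage.
welcome = build_respond_call("Hi, I'm Sage. What's your name?", stage_order=ORDER)

# Turn 1: name and location extracted, move on to the background stage.
turn1 = build_respond_call(
    "Great to meet you, Alex. Tell me about your business.",
    data={"memory.name": "Alex Chen", "memory.location": "Portland, Oregon"},
    next_stage="background",
    stage_order=ORDER,
)
print(welcome)
print(turn1)
```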

4.1 — Persona 1: Alex Chen (onboarding → strategy)

Start the onboarding chat:

Start a chat with the Sage agent for user alex-chen, conversation onboarding. Use h-start-chat.

Read the returned systemMessage and tools. Generate a welcome message as Sage. Call h-process-chat-response with:

  • message: your welcome text
  • data: {} (no data to extract yet)
  • next_stage: null (stay in welcome stage)

Show me the welcome message and pause.

Write down the chat ID returned by h-start-chat (e.g. chats:20260507-abc12345-onboarding). You'll need it later.

Continue through the turns:

Turn | Alex says | Extract | Transition to
1 | "Hi! I'm Alex Chen, based in Portland, Oregon." | memory.name, memory.location | background
2 | "I run a specialty coffee shop. 2 years. Me + 2 part-time baristas." | memory.business_type, memory.business_age, memory.team_size | goals
3 | "Break even consistently. Winter foot traffic drops 40%." | memory.top_goal, memory.biggest_challenge | null (end)

Each turn: h-send-chat-message → generate response → h-process-chat-response. Watch for stageTransitioned: true and newStageName.

Then start a new chat for the strategy conversation:

Start a new chat with Sage for user alex-chen, conversation strategy. Generate the welcome, process it. Then continue.

Turn | Alex says | Extract | Transition to
1 | Winter foot traffic problem | memory.current_problem, memory.problem_severity: "high" | options
2 | Responds to Sage's options | memory.options_discussed, memory.preferred_option | action-plan
3 | Confirms the action plan | memory.next_steps, memory.timeline | null (end)

4.2 — Persona 2: Maria Santos (onboarding only)

Start a chat with Sage for user maria-santos, conversation onboarding. Same flow as Alex.

Turn | Maria says | Extract
Welcome | (Sage greets) | (none)
1 | "Hey, I'm Maria Santos, Austin, Texas." | memory.name, memory.location
2 | "Freelance graphic designer. 4 years, solo." | memory.business_type, memory.business_age, memory.team_size
3 | "Double revenue. Can't scale without help; hiring feels risky." | memory.top_goal, memory.biggest_challenge

4.3 — Partial data test

Start a chat with Sage for user partial-test, conversation onboarding.

  • Give your name: "I'm Pat." → extract memory.name: "Pat".
  • Refuse location: "I'd rather not say." → omit memory.location or send null.
  • Give business type: "I sell handmade candles." → extract memory.business_type.
  • Dodge team size: "It's complicated." → omit memory.team_size.

Part 5 — Verify the results

Path A: Portal Chat tab

5.1 — Check the Chat tab

Open the Chat tab. All your chats should be in the sidebar, newest first. Each shows a title (auto-derived from the first user message) and the conversation name. Click each chat — the full message history should be intact.

5.2 — Check per-user memories

Per-user memories are private and scoped to the agent. To find them, use Claude Code:

List my memories. Set the active memory to my user memory for the Sage agent.

Or read chat data directly:

List all nodes with prefix chats in my user memory for the Sage agent.

You should see chat nodes per session, with a messages child holding the transcript and a data field carrying the extracted fields.

5.3 — Check extracted data

For Alex Chen's onboarding chat, the chat node's data should contain:

{
  "memory.name": "Alex Chen",
  "memory.location": "Portland, Oregon",
  "memory.business_type": "specialty coffee shop",
  "memory.business_age": "2 years",
  "memory.team_size": "3 (1 owner + 2 part-time baristas)",
  "memory.top_goal": "break even consistently",
  "memory.biggest_challenge": "winter foot traffic drop"
}

For the partial-data chat, withheld fields should be absent or null.

5.4 — Check stage transitions

Read the chat node's data:

  • Onboarding chats: progression through welcome → background → goals.
  • Strategy chats: diagnose → options → action-plan.

If the agent didn't transition when expected:

  1. Did the LLM include next_stage in the respond tool call?
  2. Does the stage's extractionSpec match what the LLM returned?
  3. Is stageOrder correct on the conversation node?

Path B: Claude Code (MCP)

5.1 — Verify chat transcripts

List all nodes with prefix chats in the user memory for alex-chen.

You should see:

  • 2 chat nodes (one onboarding, one strategy).
  • Under each: a messages node with the full transcript.

Read the messages node for Alex's onboarding chat.

Verify the message history contains all turns (user + assistant).

5.2 — Verify extracted data

Read the data of Alex's onboarding chat node.

Look for:

{
  "memory.name": "Alex Chen",
  "memory.location": "Portland, Oregon",
  "memory.business_type": "specialty coffee shop",
  "memory.business_age": "2 years",
  "memory.team_size": "3 (1 owner + 2 part-time)",
  "memory.top_goal": "break even consistently",
  "memory.biggest_challenge": "winter foot traffic drop"
}

Same for Alex's strategy chat (6 fields), Maria's onboarding (7 fields, the same set as Alex's), and the partial-data chat (memory.name = "Pat", memory.business_type = "handmade candles", others absent or null).
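
Because the stored values are generated during the chat, their exact wording can vary (for example, the team size may or may not mention "baristas"). A hedged way to check them is to look for a key phrase in each field rather than an exact match, and to treat withheld fields as passing when they are absent or null. The Python sketch below assumes you have pasted the chat node's data into a dict; `check_extracted` is a hypothetical helper, not a Hadron tool.

```python
def check_extracted(actual, expected, withheld=()):
    """Compare a chat node's data against expectations.

    expected: field -> phrase that should appear in the stored value
              (case-insensitive substring match, since wording may vary).
    withheld: fields that must be absent or null.
    Returns a list of problems; empty means the chat passed.
    """
    problems = []
    for field, phrase in expected.items():
        value = actual.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif phrase.lower() not in str(value).lower():
            problems.append(f"{field}: got {value!r}")
    for field in withheld:
        if actual.get(field) is not None:
            problems.append(f"{field}: should be absent or null")
    return problems


# Data pasted from the partial-data chat node (location stored as null, team size omitted).
partial_chat = {
    "memory.name": "Pat",
    "memory.location": None,
    "memory.business_type": "handmade candles",
}
print(check_extracted(
    partial_chat,
    expected={"memory.name": "Pat", "memory.business_type": "candles"},
    withheld=["memory.location", "memory.team_size"],
))
```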

5.3 — Verify stage transitions

Chat | Expected transitions
Alex onboarding | welcome → background → goals (2)
Alex strategy | diagnose → options → action-plan (2)
Maria onboarding | welcome → background → goals (2)
Partial test | welcome → background (1, may stop early)
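
The transition counts in the table can be reproduced by replaying the next_stage value from each chatbot turn. The Python sketch below is an illustration only: `replay_stages` is a hypothetical helper, and it assumes that a null next_stage never moves the chat to another stage (the guide uses null both for "stay" on the welcome turn and for "end" on the last stage).

```python
def replay_stages(stage_order, next_stage_values):
    """Return the list of stages visited, starting from the first stage.

    next_stage_values: the next_stage sent on each chatbot turn, in order.
    ASSUMPTION: None (null) never causes a move to another stage.
    """
    current = stage_order[0]
    path = [current]
    for nxt in next_stage_values:
        if nxt is None or nxt == current:
            continue
        if nxt not in stage_order:
            raise ValueError(f"unknown stage: {nxt}")
        current = nxt
        path.append(current)
    return path


onboarding = ["welcome", "background", "goals"]

# Alex: welcome turn (null), then background, goals, and null to end.
alex_path = replay_stages(onboarding, [None, "background", "goals", None])
# Partial test: the chat may stop after the first transition.
partial_path = replay_stages(onboarding, [None, "background"])

print(alex_path, len(alex_path) - 1)
print(partial_path, len(partial_path) - 1)
```

The number of transitions is the path length minus one, which gives 2 for a full conversation and 1 for the partial test.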

Expected end state

Item | Count | How to verify
Agent | 1 (Sage) | Agent detail page
System memory | 1 (Sage System) | h-list-memories
Knowledge memory | 1 (Sage Knowledge) | h-list-memories
Conversations | 3 (setup + onboarding + strategy) | h-list-nodes prefix conversations, depth 1
Stages | 3 per real conversation (6 total) | h-list-nodes per conversation
Prompts | 6+ (one per stage) | h-list-nodes prefix prompts
Chats | 4+ (2 Alex, 1–2 Maria, 1 partial) | Chat tab sidebar (Path A) / h-list-nodes prefix chats (Path B)
Per-user memories | 1 per user | Created by h-start-chat
Extracted data | Varies per chat | data field on chat nodes
Stage transitions | 2 per full conversation | Stage toasts (Path A) / h-process-chat-response return values (Path B)

Test report checklist

  • Agent created with correct system + knowledge memory
  • mcp in agent surfaces
  • Conversations and stages match the spec above
  • All 6 stage prompts created with correct content
  • Each stage has extractionSpec in data
  • Path A: Chat tab appeared after AI config + conversations were set up
  • Path B: Workstation app has Sage agent attached with Develop role
  • Alex Chen onboarding: 7 fields extracted, 2 stage transitions
  • Alex Chen strategy: 6 fields extracted, 2 stage transitions
  • Maria Santos onboarding: 7 fields extracted, 2 stage transitions
  • Partial data chat: present fields correct, absent fields null/missing
  • Chat transcripts intact (messages preserved on reload)
  • Per-user memory exists and contains chat nodes
  • Any bugs, unexpected behavior, or unclear steps noted

Troubleshooting

Path A: Chat tab doesn't appear — Three conditions must all hold: (1) system memory is set, (2) LLM provider is configured and tested, (3) at least one non-setup conversation exists (the wizard's setup conversation has isSetup: true and doesn't count).
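
The three conditions can be written down as a small checklist function. This Python sketch only restates the rule above so you can see which condition is blocking; `chat_tab_blockers` and its arguments are hypothetical and do not query the portal.

```python
def chat_tab_blockers(has_system_memory, llm_configured_and_tested, conversations):
    """Return the unmet conditions; an empty list means the Chat tab should appear.

    conversations: list of conversation data dicts, each with an isSetup flag.
    """
    blockers = []
    if not has_system_memory:
        blockers.append("system memory is not set")
    if not llm_configured_and_tested:
        blockers.append("LLM provider is not configured and tested")
    if not any(not conv.get("isSetup", False) for conv in conversations):
        blockers.append("no non-setup conversation exists")
    return blockers


# State at the end of Part 1 (Path A): only the wizard's setup conversation exists.
print(chat_tab_blockers(True, True, [{"isSetup": True}]))

# State after Part 3: onboarding and strategy have been created.
print(chat_tab_blockers(True, True, [{"isSetup": True}, {"isSetup": False}, {"isSetup": False}]))
```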

Path B: h-list-memories shows nothing — Your app doesn't have the agent attached, or the agent doesn't have mcp in its surfaces.

Path B: h-start-chat says "agent has no system memory" — The agent's systemMemoryId isn't set. Check agent Settings.

Path B: h-start-chat says "no non-setup conversation found" — You haven't created onboarding or strategy yet, or one of them has isSetup: true. Only the wizard's setup conversation should have that flag.

Stage didn't transition — The LLM must include next_stage in the respond tool call. If the prompt is vague ("move on when ready"), the LLM may not emit the field. Make the instruction explicit: "set next_stage to <name> when done."

Extraction data is missing — Check two things: (1) the stage's extractionSpec lists the fields, (2) the LLM's response (or your h-process-chat-response call in Path B) includes those fields in data. The Chat API stores what's sent — it doesn't infer from the message text.

Path A: "No AI config saved yet" on test — Either save the config first and then click Test, or enter the provider, model, and key and click Test to try the unsaved values before saving.

Path A: MCP tools don't see the agent — The agent needs mcp in its surfaces. Chatbot agents default to ["api"] only. Add mcp in Settings.

Path B: Claude Code generates weird responses — Claude Code is playing the chatbot, not being one. If responses don't match the persona, just tell it what to say. The point is testing the Chat API flow (extraction, transitions), not the quality of Claude's roleplay.

Data model reference (Path B)

Useful for verifying state via MCP:

Concept | Where it's stored | How to inspect
Conversations, stages | System memory, conversations:* | h-list-nodes / h-read-node
Prompts | System memory, prompts:* | h-read-node with raw: true
Extraction specs | data.extractionSpec on stage nodes | h-read-node on the stage
Chat sessions | Per-user memory, chats:* | h-list-nodes in user memory
Message history | Per-user memory, chats:<id>:messages | h-read-node
Extracted user data | data field on chat / memory nodes | h-read-node
Knowledge content | Knowledge memory, any structure | h-list-nodes / h-read-node

The content field is markdown text; the data field is structured JSON. Don't mix them up.

Automated testing with personas

After running through this manually, automate it with test personas. Define Alex and Maria once, re-run them whenever you change the chatbot.

See test-personas.md for the full guide. The "live test" mode (POST /api/agent-chat/test-persona) runs all turns using the agent's real LLM and produces a pass/fail report — same fidelity as Path A, but automatic.