Assets vs. References

Hadron has two ways to point at content that lives outside a memory's knowledge graph:

  • References: papers, URLs, legislation, books, prior conversations. Citable sources, with their own metadata, that many nodes and memories may cite.
  • Assets: uploaded files (PDFs, images, scanned documents). Resources the user owns that the agent can read or hand back.

The two look similar at first glance: both are "the user has a thing, we want to point at it from the memory." But under the hood they sit in different places, and the reasons are worth knowing.

What's in the database

| Aspect | References | Assets |
| --- | --- | --- |
| Storage | Node rows with nodeType: "reference" | A dedicated Asset table |
| Cross-memory | Yes: citation edges across memories | No: single-memory in v1 |
| Bytes location | None (textual abstract in content) | Object storage (R2 / MinIO) |
| Cross-references | Real graph edges (cites, supports, contradicts, …) | Foreign key from a node, not edges |
| Search | Full-text searchable | Excluded from default search |
| Lifecycle | Approximately immutable | Soft-delete + 24h restore + janitor |
| Encryption at rest | Per-node (covers data + content) | Column-level on storageKey only |
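
To make the table concrete, here is a minimal TypeScript sketch of the two record shapes and of the asset lifecycle row (soft-delete, 24h restore, janitor). Only nodeType, content, data, and storageKey come from the table above; every other field name, the scanStatus values, and the helpers canRestore and janitorCandidates are illustrative assumptions, not Hadron's actual schema or API.

```typescript
// Hypothetical shapes: only nodeType, content, data, and storageKey are
// named in the table; everything else is an assumption for illustration.
interface ReferenceNode {
  id: string;
  memoryId: string;
  nodeType: "reference";
  content: string; // textual abstract; a reference stores no bytes
  data: { doi?: string; authors?: string[]; publicationYear?: number };
}

interface Asset {
  id: string;
  memoryId: string; // single-memory in v1
  storageKey: string; // pointer into object storage
  mimeType: string;
  scanStatus: "pending" | "clean" | "rejected"; // assumed values
  deletedAt: Date | null; // soft-delete marker; null means live
}

// The 24h restore window from the Lifecycle row.
const RESTORE_WINDOW_MS = 24 * 60 * 60 * 1000;

// A soft-deleted asset can be restored while it is inside the window.
function canRestore(asset: Asset, now: Date): boolean {
  if (asset.deletedAt === null) return false; // live assets need no restore
  return now.getTime() - asset.deletedAt.getTime() <= RESTORE_WINDOW_MS;
}

// Assets a janitor pass could purge: soft-deleted and past the window.
function janitorCandidates(assets: Asset[], now: Date): Asset[] {
  return assets.filter((a) => a.deletedAt !== null && !canRestore(a, now));
}
```

Keeping deletedAt on the asset row, rather than on a node, is what lets the lifecycle logic stay out of the knowledge graph in this sketch.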

Why the split

The brainstorming session that settled this (in hadron-concept/brainstorming/2026-04-28-asset-upload-design/) opened with the question "should assets be a node type, like references?" and closed with three load-bearing reasons for going the other way:

  1. Normalization. References and assets have diverging schemas. Reference metadata (DOI, authors, publication year) has nothing in common with asset metadata (storage key, MIME type, scan status). Single-table inheritance with diverging subtypes is a recognized anti-pattern when the subtypes barely share fields.

  2. Scale and access patterns. A research-heavy memory may accumulate hundreds or thousands of asset uploads — the memory is dehydrated knowledge, but assets are raw material. Mixing them in the Node table inflates the table and competes with knowledge nodes for index budget. References grow more slowly and look more like the rest of the knowledge graph in their access patterns.

  3. Containment is foreign-key-shaped, not graph-shaped. Citation chains are real graph traversals: paper A cites paper B, which contradicts paper C. Asset relationships are typically containment ("this recipe uses that image"): a foreign key from the recipe to the asset, with an optional label. Treating containment as a graph edge buys generality the use case doesn't need, at a storage and query cost we would definitely pay.
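
The third reason can be sketched in a few lines of TypeScript. The first function walks citation edges transitively, which is the graph-shaped case; the second is a flat lookup by node id, which is all containment needs. The names Edge, AssetLink, reachableReferences, and assetsForNode are hypothetical, and modelling the foreign key as a link record is an assumption made for illustration.

```typescript
// Hypothetical in-memory model; names and shapes are assumptions.
type EdgeKind = "cites" | "supports" | "contradicts";

interface Edge {
  from: string;
  to: string;
  kind: EdgeKind;
}

// Graph-shaped: breadth-first walk over outgoing edges, following chains
// of any length (A cites B, B contradicts C, ...).
function reachableReferences(start: string, edges: Edge[]): string[] {
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  const reached: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const edge of edges) {
      if (edge.from === current && !seen.has(edge.to)) {
        seen.add(edge.to);
        reached.push(edge.to);
        queue.push(edge.to);
      }
    }
  }
  return reached;
}

// Foreign-key-shaped: a node points at an asset, with an optional label.
interface AssetLink {
  nodeId: string;
  assetId: string;
  label?: string;
}

// Containment is a single filter by node id; there is nothing to traverse.
function assetsForNode(nodeId: string, links: AssetLink[]): string[] {
  return links.filter((link) => link.nodeId === nodeId).map((link) => link.assetId);
}
```

The traversal needs a visited set and a queue because chains can be long or cyclic; the containment lookup needs neither.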

What this means for builders

  • A reference can be cited from anywhere — many memories, many nodes, traversable as a graph. If you want a paper to appear in a citation network, it's a reference.
  • An asset belongs to one memory, that of the user who uploaded it. Its existence is tied to that per-agent memory; deleting the memory deletes its assets.
  • You don't need to choose between them at upload time. The asset-upload tool always creates an asset. A future "promote this to a reference" flow will create a reference node citing the asset's URN — composing the two primitives without conflating them.
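
Since the promotion flow is described only as future work, the following is a speculative TypeScript sketch of how the two primitives could compose, not a description of any shipped behavior. The URN scheme urn:hadron:asset:<id>, the sourceUrn field, and the function names are all assumptions invented for illustration.

```typescript
// Speculative sketch: the URN format and all names here are assumptions.
interface UploadedAsset {
  id: string;
  memoryId: string;
  mimeType: string;
}

interface PromotedReference {
  nodeType: "reference";
  memoryId: string;
  content: string; // textual abstract, as for any reference
  data: { title: string; sourceUrn: string };
}

// Hypothetical URN scheme; the real format is not specified in this page.
function assetUrn(asset: UploadedAsset): string {
  return `urn:hadron:asset:${asset.id}`;
}

// Builds a new reference node that points at the asset by URN.
// The asset itself is left untouched: it stays an asset.
function promoteToReference(
  asset: UploadedAsset,
  title: string,
  abstract: string,
): PromotedReference {
  return {
    nodeType: "reference",
    memoryId: asset.memoryId,
    content: abstract,
    data: { title, sourceUrn: assetUrn(asset) },
  };
}
```

In this sketch the reference carries only a pointer, so the bytes stay in object storage under the asset's lifecycle while the reference can take part in citation edges.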