Searching Memory

Natural-language queries, thresholds, scoping, and score interpretation.

search() takes a natural-language query and returns ranked memories.

Basic usage

results = client.search("what does the user prefer?")

for r in results.results:
    print(r.score, r.content)

Parameters

client.search(
    query,                              # required, natural language
    *,
    actor_id=None,                      # optional — filter to one actor
    limit=10,                           # max results (default 10)
    threshold=0.0,                      # minimum similarity (default 0.0)
    include_source_events=False,        # attach source events to results
)

query

Plain English. You don't need to phrase it like a keyword search — "things the user mentioned about their dog" works.

actor_id / actorId

Pass it to scope the search to one user/agent. Omit it to search across every actor in the organization. In most user-facing flows you want to pass the current actor.

limit

How many results to return. Memsy may return fewer if not enough pass the threshold. Valid range: 1–100.

threshold

A floor on the similarity score — results scoring below it are dropped. The SDK default is 0.0 (no filter), so out of the box you get the top limit results ranked by relevance.

Raise the threshold to cut noise. The right value depends on tier:

  • Free (no reranking): raw retrieval scores cluster in 0.0–0.1 even for strong matches. Stay near 0.0–0.05.
  • Pro and above (with Cohere reranking): scores are normalised to 0.0–1.0. A threshold around 0.3–0.5 becomes a meaningful "good match" floor; 0.8+ returns only near-exact matches.

Scores are not comparable across queries — 0.6 for query A is not the same thing as 0.6 for query B. Treat threshold as a relative tuning knob, not an absolute quality bar.
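
As a starting point, you can derive the threshold from the plan tier using the ranges above. The helper below is illustrative, not part of the SDK — the name and exact cutoffs are our own, and you should tune them per workload:

```python
def default_threshold(plan: str) -> float:
    """Pick a starting similarity floor from the plan tier (hypothetical cutoffs)."""
    if plan == "free":
        # Raw retrieval scores cluster low on the free tier, so keep the floor near zero.
        return 0.05
    # Reranked tiers produce normalised 0.0-1.0 scores; 0.4 is a reasonable
    # "good match" floor to start tuning from.
    return 0.4
```

You would then pass the result as threshold= to client.search() and adjust from there.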

include_source_events / includeSourceEvents

When true, each result additionally includes the source events the memory was extracted from. Useful for showing provenance in a UI.

The result shape

results.results  # list[SearchResult]

r.id                # stable memory ID
r.content           # natural-language memory, e.g. "User prefers dark mode"
r.score             # float; higher = closer match
r.metadata          # dict | None — server-attached fields (type, kind, …)
r.strength          # float — reinforcement strength, bounded 0.0–5.0
r.source_metadata   # list[dict] — user metadata from the originating event(s)

strength is a reinforcement signal bounded 0.0–5.0 by the platform's policy ceiling; it rises by a small amount each time a memory is returned by search() and decays slowly when unused. Not a probability — don't normalise it into the [0, 1] range.
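
As a mental model only — the actual increment and decay schedule is internal to the platform — a bounded reinforcement signal behaves roughly like this (all numbers here are hypothetical):

```python
# Toy model of a bounded reinforcement signal. The bump and decay values
# are illustrative, NOT Memsy's actual policy.
STRENGTH_CEILING = 5.0

def reinforce(strength: float, bump: float = 0.1) -> float:
    """Raise strength when a memory is retrieved, clamped to the ceiling."""
    return min(STRENGTH_CEILING, strength + bump)

def decay(strength: float, factor: float = 0.99) -> float:
    """Let unused memories drift back toward zero."""
    return max(0.0, strength * factor)
```

The takeaway is the clamping: values near 5.0 stop growing, so strength ranks how often a memory proves useful rather than measuring anything probabilistic.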

source_metadata carries the JSON metadata you passed at ingest, attached to the originating event(s) the memory was derived from. Up to 5 entries; each entry is {event_id, metadata: {...}} for valid JSON-object payloads or {event_id, raw: "..."} otherwise. Use it for tagging, filtering, or correlating back to your application's own records.
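
Because each entry is either {event_id, metadata} or {event_id, raw}, it's worth normalising before correlating back to your own records. A small helper (our own, not part of the SDK) that keeps only the parsed-object entries:

```python
def event_metadata(entries: list) -> dict:
    """Map event_id -> metadata dict, skipping entries that only carry a raw payload."""
    return {
        e["event_id"]: e["metadata"]
        for e in entries
        if "metadata" in e  # raw-string entries lack this key
    }
```

Keep the raw entries around separately if you need to debug non-object payloads.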

Reading usage and rate-limit info

Every response carries observability metadata:

results = client.search("preferences", actor_id="user_42")

if results.usage:
    print(f"Plan: {results.usage.plan}")
    print(f"Search queries: {results.usage.search_queries}/{results.usage.search_queries_limit}")

if results.rate_limit:
    print(f"Remaining this window: {results.rate_limit.remaining}")

See Usage and Rate Limits for every field.

Common patterns

Personalisation at prompt time

memories = client.search(
    query=user_message,
    actor_id=user_id,
    limit=5,
    threshold=0.4,
).results

system_preamble = "Relevant context:\n" + "\n".join(f"- {m.content}" for m in memories)

Cross-actor search (admin tools)

Leave actor_id/actorId unset and you'll search every actor in the org. Good for analytics dashboards and support tooling; almost never what you want in an end-user-facing agent loop.

Empty results

search() always returns a response object. If nothing passed the threshold, results.results is []. Handle that case in your UI.
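
A minimal sketch of that fallback when building prompt context, assuming result objects expose .content as described above (the stub objects here stand in for real SearchResult instances):

```python
from types import SimpleNamespace  # stand-in for SearchResult in this sketch

def context_block(memories: list) -> str:
    """Render memories as a prompt preamble; return empty string when nothing matched."""
    if not memories:
        # Nothing passed the threshold — omit the section rather than show noise.
        return ""
    return "Relevant context:\n" + "\n".join(f"- {m.content}" for m in memories)
```

Returning an empty string lets the caller skip the preamble entirely instead of injecting a header with no content under it.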

Limitations

Search only returns memories that have finished processing. Very recent events may still be in the queue — see Async Processing.

Next

  • Run the same thing from asyncio (Python) → Async client.