Searching Memory

Natural-language queries, thresholds, scoping, and score interpretation.

search() takes a natural-language query and returns ranked memories.

Basic usage

results = client.search("what does the user prefer?")

for r in results.results:
    print(r.score, r.content)

Parameters

client.search(
    query,                              # required, natural language
    *,
    actor_id=None,                      # optional — filter to one actor
    limit=10,                           # max results (default 10)
    threshold=0.0,                      # minimum similarity (default 0.0)
    include_source_events=False,        # attach source events to results
)

query

Plain English. You don't need to phrase it like a keyword search — "things the user mentioned about their dog" works.

actor_id / actorId

Pass it to scope the search to one user/agent. Omit it to search across every actor in the organization. In most user-facing flows you want to pass the current actor.

limit

How many results to return. Memsy may return fewer if not enough pass the threshold. Valid range: 1–100.

threshold

A floor on the similarity score — results scoring below it are dropped. The SDK default is 0.0 (no filter), so out of the box you get the top limit results ranked by relevance.

Raise the threshold to cut noise. The right value depends on tier:

  • Free (no reranking): raw retrieval scores cluster in 0.0–0.1 even for strong matches. Stay near 0.0–0.05.
  • Pro and above (with Cohere reranking): scores are normalised to 0.0–1.0. A threshold around 0.3–0.5 becomes a meaningful "good match" floor; 0.8+ returns only near-exact matches.

Scores are not comparable across queries — 0.6 for query A is not the same thing as 0.6 for query B. Treat threshold as a relative tuning knob, not an absolute quality bar.
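
As a starting point, you can derive the threshold from the plan tier using the ranges above. The helper below is illustrative, not part of the SDK — the name and exact cutoffs are our own, and you should tune them per workload:

```python
def default_threshold(plan: str) -> float:
    """Pick a starting similarity floor from the plan tier (hypothetical cutoffs)."""
    if plan == "free":
        # Raw retrieval scores cluster low on the free tier, so keep the floor near zero.
        return 0.05
    # Reranked tiers produce normalised 0.0-1.0 scores; 0.4 is a reasonable
    # "good match" floor to start tuning from.
    return 0.4
```

You would then pass the result as threshold= to client.search() and adjust from there.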

include_source_events / includeSourceEvents

When true, each result additionally includes the source events the memory was extracted from. Useful for showing provenance in a UI.

The result shape

results.results  # list[SearchResult]

r.id                # stable memory ID
r.content           # natural-language memory, e.g. "User prefers dark mode"
r.score             # float; higher = closer match
r.metadata          # dict | None — server-attached fields (type, kind, …)
r.strength          # float — reinforcement strength, bounded 0.0–5.0
r.source_metadata   # list[dict] — user metadata from the originating event(s)

strength is a reinforcement signal bounded 0.0–5.0 by the platform's policy ceiling; it rises by a small amount each time a memory is returned by search() and decays slowly when unused. Not a probability — don't normalise it into the [0, 1] range.
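
As a mental model only — the actual increment and decay schedule is internal to the platform — a bounded reinforcement signal behaves roughly like this (all numbers here are hypothetical):

```python
# Toy model of a bounded reinforcement signal. The bump and decay values
# are illustrative, NOT Memsy's actual policy.
STRENGTH_CEILING = 5.0

def reinforce(strength: float, bump: float = 0.1) -> float:
    """Raise strength when a memory is retrieved, clamped to the ceiling."""
    return min(STRENGTH_CEILING, strength + bump)

def decay(strength: float, factor: float = 0.99) -> float:
    """Let unused memories drift back toward zero."""
    return max(0.0, strength * factor)
```

The takeaway is the clamping: values near 5.0 stop growing, so strength ranks how often a memory proves useful rather than measuring anything probabilistic.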

source_metadata carries the JSON metadata you passed at ingest, attached to the originating event(s) the memory was derived from. Up to 5 entries; each entry is {event_id, metadata: {...}} for valid JSON-object payloads or {event_id, raw: "..."} otherwise. Use it for tagging, filtering, or correlating back to your application's own records.
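
Because each entry is either {event_id, metadata} or {event_id, raw}, it's worth normalising before correlating back to your own records. A small helper (our own, not part of the SDK) that keeps only the parsed-object entries:

```python
def event_metadata(entries: list) -> dict:
    """Map event_id -> metadata dict, skipping entries that only carry a raw payload."""
    return {
        e["event_id"]: e["metadata"]
        for e in entries
        if "metadata" in e  # raw-string entries lack this key
    }
```

Keep the raw entries around separately if you need to debug non-object payloads.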

Reading usage and rate-limit info

Every response carries observability metadata:

results = client.search("preferences", actor_id="user_42")

if results.usage:
    print(f"Plan: {results.usage.plan}")
    print(f"Search queries: {results.usage.search_queries}/{results.usage.search_queries_limit}")

if results.rate_limit:
    print(f"Remaining this window: {results.rate_limit.remaining}")

See Usage and Rate Limits for every field.

Common patterns

Personalisation at prompt time

memories = client.search(
    query=user_message,
    actor_id=user_id,
    limit=5,
    threshold=0.4,
).results

system_preamble = "Relevant context:\n" + "\n".join(f"- {m.content}" for m in memories)

Cross-actor search (admin tools)

Leave actor_id/actorId unset and you'll search every actor in the org. Good for analytics dashboards and support tooling; almost never what you want in an end-user-facing agent loop.

Empty results

search() always returns a response object. If nothing passed the threshold, results.results is []. Handle that case in your UI.
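
A minimal sketch of that fallback when building prompt context, assuming result objects expose .content as described above (the stub objects here stand in for real SearchResult instances):

```python
from types import SimpleNamespace  # stand-in for SearchResult in this sketch

def context_block(memories: list) -> str:
    """Render memories as a prompt preamble; return empty string when nothing matched."""
    if not memories:
        # Nothing passed the threshold — omit the section rather than show noise.
        return ""
    return "Relevant context:\n" + "\n".join(f"- {m.content}" for m in memories)
```

Returning an empty string lets the caller skip the preamble entirely instead of injecting a header with no content under it.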

Limitations

Search only returns memories that have finished processing. Very recent events may still be in the queue — see Async Processing.

Next

  • Run the same thing from asyncio (Python) → Async client.