Usage and Rate Limits

Track plan consumption from response headers automatically.

Every successful response from the Memsy API carries HTTP headers describing your current usage and rate-limit posture. The SDKs parse them into two typed objects on every response:

  • usage — a UsageInfo, from X-Usage-* and X-Plan headers.
  • rate_limit (Python) / rateLimit (Node) — a RateLimitInfo, from X-RateLimit-* headers.

The rate-limit triple is populated on every authenticated response (its dimension is api_calls); usage may still be None / null on routes that don't carry usage headers (e.g. /health).

Accessing usage

result = client.search("query")

if result.usage:
    print(f"Plan: {result.usage.plan}")
    print(f"API calls: {result.usage.api_calls}/{result.usage.api_calls_limit}")
    print(f"Search queries: {result.usage.search_queries}/{result.usage.search_queries_limit}")

Fields on UsageInfo (Python snake_case / Node camelCase):

| Python | Node | Description |
|---|---|---|
| api_calls | apiCalls | Total API calls this billing period. |
| api_calls_limit | apiCallsLimit | API call limit for your plan. |
| events_ingested | eventsIngested | Events ingested this period. |
| events_ingested_limit | eventsIngestedLimit | Ingestion limit. |
| memory_stored | memoryStored | Memories currently stored. |
| memory_stored_limit | memoryStoredLimit | Storage limit. |
| llm_tokens | llmTokens | LLM tokens consumed during extraction. |
| llm_tokens_limit | llmTokensLimit | Token limit. |
| search_queries | searchQueries | Search queries issued this period. |
| search_queries_limit | searchQueriesLimit | Query limit. |
| plan | plan | Your plan name (e.g. "free", "pro"). |

Accessing rate limits

if result.rate_limit:
    print(f"Remaining: {result.rate_limit.remaining}")
    print(f"Resets at:  {result.rate_limit.reset}")  # Unix timestamp

Fields on RateLimitInfo:

| Field | Description |
|---|---|
| limit | Plan limit for the api_calls dimension this billing period. |
| remaining | Calls remaining in the current period (clamped at 0). |
| reset | Unix timestamp (UTC) of the next billing-period boundary, i.e. the start of the next month. |
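
Because reset is a plain Unix timestamp, converting it to a timezone-aware datetime makes it easier to log or to work out how much of the billing period is left. The following is a minimal sketch: describe_reset is a hypothetical helper, not part of the Memsy SDK, and it assumes reset is a whole number of seconds since the epoch.

```python
from datetime import datetime, timezone

def describe_reset(reset, now=None):
    """Return (reset_time_utc, seconds_until_reset) for a Unix timestamp.

    Hypothetical helper, not part of the Memsy SDK. `now` can be injected
    (as an aware datetime) so the function is easy to test.
    """
    reset_at = datetime.fromtimestamp(int(reset), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Clamp at zero in case the boundary has already passed.
    seconds_left = max(0.0, (reset_at - now).total_seconds())
    return reset_at, seconds_left
```

With a response in hand, describe_reset(result.rate_limit.reset) returns the boundary as a UTC datetime together with the seconds remaining until it.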

Underlying headers

| Header | Field |
|---|---|
| X-Usage-ApiCalls | usage.api_calls |
| X-Usage-ApiCalls-Limit | usage.api_calls_limit |
| X-Usage-EventsIngested | usage.events_ingested |
| X-Usage-EventsIngested-Limit | usage.events_ingested_limit |
| X-Usage-MemoriesStored | usage.memory_stored |
| X-Usage-MemoriesStored-Limit | usage.memory_stored_limit |
| X-Usage-LlmTokens | usage.llm_tokens |
| X-Usage-LlmTokens-Limit | usage.llm_tokens_limit |
| X-Usage-SearchQueries | usage.search_queries |
| X-Usage-SearchQueries-Limit | usage.search_queries_limit |
| X-Plan | usage.plan |
| X-RateLimit-Limit | rate_limit.limit |
| X-RateLimit-Remaining | rate_limit.remaining |
| X-RateLimit-Reset | rate_limit.reset |
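
If you call the API without an SDK, the same mapping can be applied to the raw response headers by hand. The sketch below is one possible approach, not the SDKs' actual implementation: parse_memsy_headers and its helpers are hypothetical names, it returns plain dicts rather than UsageInfo / RateLimitInfo, and it assumes every numeric header carries an integer value.

```python
# Header-to-field mapping copied from the table above (lowercased for lookup).
USAGE_FIELDS = {
    "x-usage-apicalls": "api_calls",
    "x-usage-apicalls-limit": "api_calls_limit",
    "x-usage-eventsingested": "events_ingested",
    "x-usage-eventsingested-limit": "events_ingested_limit",
    "x-usage-memoriesstored": "memory_stored",
    "x-usage-memoriesstored-limit": "memory_stored_limit",
    "x-usage-llmtokens": "llm_tokens",
    "x-usage-llmtokens-limit": "llm_tokens_limit",
    "x-usage-searchqueries": "search_queries",
    "x-usage-searchqueries-limit": "search_queries_limit",
}
RATE_LIMIT_FIELDS = {
    "x-ratelimit-limit": "limit",
    "x-ratelimit-remaining": "remaining",
    "x-ratelimit-reset": "reset",
}

def _to_int(value):
    """Convert a header value to int, or None if missing or malformed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

def _collect(lowered, fields):
    """Build {field: int-or-None}; return None if no header was present."""
    found = {field: _to_int(lowered.get(header)) for header, field in fields.items()}
    return found if any(v is not None for v in found.values()) else None

def parse_memsy_headers(headers):
    """Return (usage, rate_limit) dicts; either may be None (assumed behaviour)."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    usage = _collect(lowered, USAGE_FIELDS)
    plan = lowered.get("x-plan")
    if usage is None and plan is not None:
        usage = {field: None for field in USAGE_FIELDS.values()}
    if usage is not None:
        usage["plan"] = plan
    return usage, _collect(lowered, RATE_LIMIT_FIELDS)
```

Returning None when no matching header is present mirrors the behaviour described for routes such as /health, where usage is absent rather than zero.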

Patterns

Log usage to your observability stack

import logging

log = logging.getLogger("memsy-usage")

def log_usage(usage):
    if usage is None:
        return
    log.info(
        "plan=%s api=%s/%s tokens=%s/%s",
        usage.plan, usage.api_calls, usage.api_calls_limit,
        usage.llm_tokens, usage.llm_tokens_limit,
    )

result = client.search("query")
log_usage(result.usage)

Alert before you hit a cap

def too_close(current, limit, ratio=0.9):
    return current is not None and limit is not None and current >= ratio * limit

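# result.usage can be None, so guard before reading its fields.
usage = result.usage
if usage is not None and too_close(usage.api_calls, usage.api_calls_limit):
    emit_alert("Memsy API call usage above 90% of plan limit")  # emit_alert: your own alerting hook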
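
The same check can be extended to every metered dimension at once. This is a minimal sketch under a few assumptions: dimensions_near_cap is a hypothetical helper, the field names are taken from the UsageInfo table, and a missing or non-positive limit is skipped rather than treated as a cap (how the API reports unlimited plans is not stated here).

```python
DIMENSIONS = (
    "api_calls",
    "events_ingested",
    "memory_stored",
    "llm_tokens",
    "search_queries",
)

def dimensions_near_cap(usage, ratio=0.9):
    """Return {dimension: fraction_used} for dimensions at or above `ratio`.

    Hypothetical helper, not part of the Memsy SDK.
    """
    if usage is None:
        return {}
    near = {}
    for name in DIMENSIONS:
        current = getattr(usage, name, None)
        limit = getattr(usage, f"{name}_limit", None)
        # Skip missing values and non-positive limits (avoids dividing by zero).
        if current is None or limit is None or limit <= 0:
            continue
        fraction = current / limit
        if fraction >= ratio:
            near[name] = fraction
    return near
```

An empty dict means nothing is close to its limit, so the result can be used directly in an if statement before emitting an alert per dimension.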

When usage is missing

Note

usage may be None (Python) or null (Node) on routes that don't carry usage headers (e.g. /health). That is not an error, so check for it before reading any usage field. rate_limit / rateLimit is populated on every authenticated response.
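
One way to cope with responses that lack usage is to keep the last snapshot you did see. The class below is a sketch only: UsageTracker is a hypothetical name, not an SDK feature, and it assumes result objects expose usage and rate_limit attributes as shown earlier.

```python
class UsageTracker:
    """Remember the most recent non-empty usage and rate-limit snapshots.

    Hypothetical helper, not part of the Memsy SDK.
    """

    def __init__(self):
        self.usage = None
        self.rate_limit = None

    def observe(self, result):
        """Record snapshots from `result` and return it unchanged."""
        usage = getattr(result, "usage", None)
        if usage is not None:
            # Only overwrite when the response actually carried usage headers.
            self.usage = usage
        rate_limit = getattr(result, "rate_limit", None)
        if rate_limit is not None:
            self.rate_limit = rate_limit
        return result
```

Wrapping calls as tracker.observe(client.search("query")) keeps tracker.usage pointing at the latest known values even after a response without usage headers.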
