Usage and Rate Limits
Track plan consumption from response headers automatically.
Every successful response from the Memsy API carries HTTP headers describing your current usage and rate-limit posture. The SDKs parse them into two typed objects on every response:
- usage: a UsageInfo, from X-Usage-* and X-Plan headers.
- rate_limit (Python) / rateLimit (Node): a RateLimitInfo, from X-RateLimit-* headers.
The rate-limit triple is populated on every authenticated response (its dimension is api_calls); usage may still be None / null on routes that don't carry usage headers (e.g. /health).
Accessing usage
    result = client.search("query")
    if result.usage:
        print(f"Plan: {result.usage.plan}")
        print(f"API calls: {result.usage.api_calls}/{result.usage.api_calls_limit}")
        print(f"Search queries: {result.usage.search_queries}/{result.usage.search_queries_limit}")

Fields on UsageInfo (Python snake_case / Node camelCase):
| Python | Node | Description |
|---|---|---|
| api_calls | apiCalls | Total API calls this billing period. |
| api_calls_limit | apiCallsLimit | API call limit for your plan. |
| events_ingested | eventsIngested | Events ingested this period. |
| events_ingested_limit | eventsIngestedLimit | Ingestion limit. |
| memory_stored | memoryStored | Memories currently stored. |
| memory_stored_limit | memoryStoredLimit | Storage limit. |
| llm_tokens | llmTokens | LLM tokens consumed during extraction. |
| llm_tokens_limit | llmTokensLimit | Token limit. |
| search_queries | searchQueries | Search queries issued this period. |
| search_queries_limit | searchQueriesLimit | Query limit. |
| plan | plan | Your plan name (e.g. "free", "pro"). |
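Because every counter in the table is paired with a limit, a single loop can turn a UsageInfo into per-dimension utilization. The sketch below is only an illustration, not part of the SDK: the helper name utilization and the SimpleNamespace stand-in are hypothetical, and it assumes the Python field names listed above hold numbers or None.

```python
from types import SimpleNamespace

# Counter names from the UsageInfo table; each has a matching "<name>_limit" field.
DIMENSIONS = ("api_calls", "events_ingested", "memory_stored", "llm_tokens", "search_queries")


def utilization(usage):
    """Return {dimension: fraction of limit used}, skipping missing or zero limits."""
    fractions = {}
    for name in DIMENSIONS:
        current = getattr(usage, name, None)
        limit = getattr(usage, f"{name}_limit", None)
        if current is None or not limit:
            continue  # value absent, or no usable limit for this dimension
        fractions[name] = current / limit
    return fractions


# Stand-in object with made-up numbers; with the SDK you would pass result.usage.
sample = SimpleNamespace(
    plan="free",
    api_calls=450, api_calls_limit=1000,
    llm_tokens=90_000, llm_tokens_limit=100_000,
    search_queries=10, search_queries_limit=None,
)
for name, fraction in utilization(sample).items():
    print(f"{name}: {fraction:.0%}")
```

Dimensions without a usable limit are skipped rather than reported as 0%, so a missing header is not mistaken for an idle quota.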
Accessing rate limits
    if result.rate_limit:
        print(f"Remaining: {result.rate_limit.remaining}")
        print(f"Resets at: {result.rate_limit.reset}")  # Unix timestamp

Fields on RateLimitInfo:
| Field | Description |
|---|---|
| limit | Plan limit for the api_calls dimension this billing period. |
| remaining | Calls remaining in the current period (clamped at 0). |
| reset | Unix timestamp (UTC) of the next billing-period boundary, i.e. the start of the next month. |
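Since reset is a Unix timestamp, turning it into a readable UTC time or a countdown needs only the standard library. This is a minimal sketch: the helper names reset_datetime and seconds_until_reset are hypothetical, and it assumes reset is expressed in seconds rather than milliseconds.

```python
from datetime import datetime, timezone


def reset_datetime(reset):
    """Convert a Unix timestamp (assumed to be in seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(reset, tz=timezone.utc)


def seconds_until_reset(reset, now=None):
    """Seconds left before the reset boundary, clamped at 0."""
    if now is None:
        now = datetime.now(timezone.utc).timestamp()
    return max(0.0, reset - now)


# Illustrative value: 1767225600 is 2026-01-01 00:00:00 UTC.
reset = 1767225600
print(reset_datetime(reset).isoformat())
print(f"{seconds_until_reset(reset):.0f} seconds until reset")
```

With the SDK you would pass result.rate_limit.reset instead of the hard-coded value.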
Underlying headers
| Header | Field |
|---|---|
| X-Usage-ApiCalls | usage.api_calls |
| X-Usage-ApiCalls-Limit | usage.api_calls_limit |
| X-Usage-EventsIngested | usage.events_ingested |
| X-Usage-EventsIngested-Limit | usage.events_ingested_limit |
| X-Usage-MemoriesStored | usage.memory_stored |
| X-Usage-MemoriesStored-Limit | usage.memory_stored_limit |
| X-Usage-LlmTokens | usage.llm_tokens |
| X-Usage-LlmTokens-Limit | usage.llm_tokens_limit |
| X-Usage-SearchQueries | usage.search_queries |
| X-Usage-SearchQueries-Limit | usage.search_queries_limit |
| X-Plan | usage.plan |
| X-RateLimit-Limit | rate_limit.limit |
| X-RateLimit-Remaining | rate_limit.remaining |
| X-RateLimit-Reset | rate_limit.reset |
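If you call the API without an SDK, the same fields can be read straight from the headers listed above. The following is a rough sketch and not the SDKs' actual parsing code: parse_usage_headers is a hypothetical helper, and it assumes the counters arrive as decimal integer strings. Header names are matched case-insensitively, as HTTP allows.

```python
# Lower-cased header names mapped to field names, taken from the table above.
USAGE_HEADERS = {
    "x-usage-apicalls": "api_calls",
    "x-usage-apicalls-limit": "api_calls_limit",
    "x-usage-eventsingested": "events_ingested",
    "x-usage-eventsingested-limit": "events_ingested_limit",
    "x-usage-memoriesstored": "memory_stored",
    "x-usage-memoriesstored-limit": "memory_stored_limit",
    "x-usage-llmtokens": "llm_tokens",
    "x-usage-llmtokens-limit": "llm_tokens_limit",
    "x-usage-searchqueries": "search_queries",
    "x-usage-searchqueries-limit": "search_queries_limit",
}
RATE_LIMIT_HEADERS = {
    "x-ratelimit-limit": "limit",
    "x-ratelimit-remaining": "remaining",
    "x-ratelimit-reset": "reset",
}


def _to_int(value):
    """Parse an integer header value; return None if it is missing or malformed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_usage_headers(headers):
    """Split raw response headers into (usage, rate_limit) dicts of parsed fields."""
    lowered = {name.lower(): value for name, value in headers.items()}
    usage = {f: _to_int(lowered[h]) for h, f in USAGE_HEADERS.items() if h in lowered}
    if "x-plan" in lowered:
        usage["plan"] = lowered["x-plan"]
    rate_limit = {f: _to_int(lowered[h]) for h, f in RATE_LIMIT_HEADERS.items() if h in lowered}
    return usage, rate_limit


# Hand-written example headers, not captured from a real response.
usage, rate_limit = parse_usage_headers({
    "X-Plan": "free",
    "X-Usage-ApiCalls": "12",
    "X-Usage-ApiCalls-Limit": "1000",
    "X-RateLimit-Remaining": "988",
})
print(usage)
print(rate_limit)
```

Headers that are absent simply produce no key, which mirrors the way usage can be empty on routes such as /health.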
Patterns
Log usage to your observability stack
    import logging

    log = logging.getLogger("memsy-usage")

    def log_usage(usage):
        if usage is None:
            return
        log.info(
            "plan=%s api=%s/%s tokens=%s/%s",
            usage.plan, usage.api_calls, usage.api_calls_limit,
            usage.llm_tokens, usage.llm_tokens_limit,
        )

    result = client.search("query")
    log_usage(result.usage)

Alert before you hit a cap
    def too_close(current, limit, ratio=0.9):
        return current is not None and limit is not None and current >= ratio * limit

    # result.usage can be None, so guard before reading its fields.
    # emit_alert stands for your own alerting hook.
    if result.usage and too_close(result.usage.api_calls, result.usage.api_calls_limit):
        emit_alert("Memsy API call usage above 90% of plan limit")

When usage is missing
Note
usage may be None (Python) or null (Node) on routes that don't carry usage headers (e.g. /health). That is not an error. rate_limit / rateLimit is populated on every authenticated response.
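One way to cope with responses that carry no usage is to remember the most recent non-empty value. The sketch below assumes only that a result exposes a usage attribute that may be None; the UsageTracker class is hypothetical and not part of the SDK.

```python
from types import SimpleNamespace


class UsageTracker:
    """Keeps the most recent non-None usage object seen on any response."""

    def __init__(self):
        self.last_usage = None

    def observe(self, result):
        """Record result.usage if present, then return the result unchanged."""
        usage = getattr(result, "usage", None)
        if usage is not None:
            self.last_usage = usage
        return result


tracker = UsageTracker()
# Stand-in results for illustration; with the SDK these would come from client calls.
tracker.observe(SimpleNamespace(usage=SimpleNamespace(plan="free", api_calls=12)))
tracker.observe(SimpleNamespace(usage=None))  # e.g. a route without usage headers
print(tracker.last_usage.plan, tracker.last_usage.api_calls)
```

Because observe returns its argument, it can wrap a call in place, for example tracker.observe(client.search("query")).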
Next
- What happens when you blow a cap → Error handling.
- How the SDK auto-handles 429s → Retries.