# Rate Limits

> Understanding and working with Prisme.ai API rate limits

Prisme.ai implements rate limiting to ensure platform stability and fair usage across all users. This page explains the rate limits in place, how to monitor your usage, and best practices for working within these limits.

## Rate Limit Overview

Most API endpoints have rate limits applied based on:

* **User or API Key**: Limits are tracked per authenticated user or API key
* **Endpoint category**: Different endpoint categories have different limits
* **Workspace**: Some limits are applied per workspace

The general limits are:

* Most API endpoints: 100 requests per minute
* Create/update operations: 60 requests per minute
* Search and list operations: 30 requests per minute

These are general guidelines. Specific endpoints may have custom rate limits based on their resource intensity.

While executing workspace automations, the Runtime enforces four distinct rate limits:

| Rate limit        | Default (per second) | Burst | Environment Variable     | Workspace Secret                 |
| ----------------- | -------------------- | ----- | ------------------------ | -------------------------------- |
| Automations       | 100                  | 400   | `RATE_LIMIT_AUTOMATIONS` | `prismeai_ratelimit_automations` |
| Event emits       | 30                   | 100   | `RATE_LIMIT_EMITS`       | `prismeai_ratelimit_emits`       |
| HTTP fetches      | 50                   | 200   | `RATE_LIMIT_FETCHS`      | `prismeai_ratelimit_fetchs`      |
| Repeat iterations | 1000                 | 4000  | `RATE_LIMIT_REPEATS`     | `prismeai_ratelimit_repeats`     |

These rate limits are applied per workspace and per Runtime instance. They are not shared across instances in a cluster deployment.

## Understanding Runtime Rate Limits

Each workspace has its own rate limits for automations, emits, fetches, and repeat loops.
These rate limits are local to each Runtime instance and are not shared between multiple Runtime instances. For example, with 2 Runtime instances and a workspace that has a rate limit of 100 automation executions/second, the workspace might reach 200 automations/second across both instances. To fully utilize this capacity, automations should distribute workload using events.

The burst rate represents the number of operations that can be executed during a momentary peak before being throttled to the normal rate. For example, an automation with a rate limit of 100/second and a burst of 400 can execute 400 operations in a burst, after which it will be throttled to 100 operations per second.

Rate limits are applied at different scopes:

Sequential execution on a single instance:

```yaml
slug: thisWillBeThrottled
do:
  - repeat:
      on: 2000
      do:
        - callSomeOtherAutomation: {}
```

This automation will be throttled to the single-instance limit because all operations run on the same instance.

Distributed execution across instances:

```yaml
do:
  - repeat:
      on: 2000
      do:
        - emit:
            event: triggerSomeOtherAutomation
            payload: {}
---
# The second automation:
slug: callSomeOtherAutomation
when:
  events:
    - triggerSomeOtherAutomation
do: []
```

This approach can leverage multiple instances since events can be processed by any available instance in the cluster.

## Rate Limit Headers

When making API requests, rate limit information is returned in the response headers:

* `X-RateLimit-Limit`: The maximum number of requests allowed in the current time window
* `X-RateLimit-Remaining`: The number of requests remaining in the current time window
* `X-RateLimit-Reset`: The time when the current rate limit window resets, in Unix epoch seconds
* `Retry-After`: Present only when rate limited; indicates the number of seconds to wait before retrying

## Rate Limit Response

When you exceed a rate limit, the API returns a 429 "Too Many Requests" response with details about the limit:

```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please retry after 30 seconds.",
    "details": {
      "limit": 100,
      "period": "60s",
      "retryAfter": 30
    },
    "requestId": "req-1234567890abcdef"
  }
}
```

For Runtime automations, a `payload.throttled` field in the `runtime.automations.executed` event indicates the throttling duration.

## Configuration Options

Rate limits can be configured globally using environment variables:

| Environment Variable           | Description                   | Default Value |
| ------------------------------ | ----------------------------- | ------------- |
| `RATE_LIMIT_AUTOMATIONS`       | Automations per second        | 100           |
| `RATE_LIMIT_EMITS`             | Event emits per second        | 30            |
| `RATE_LIMIT_FETCHS`            | HTTP fetches per second       | 50            |
| `RATE_LIMIT_REPEATS`           | Repeat iterations per second  | 1000          |
| `RATE_LIMIT_AUTOMATIONS_BURST` | Automations burst limit       | 400           |
| `RATE_LIMIT_EMITS_BURST`       | Event emits burst limit       | 100           |
| `RATE_LIMIT_FETCHS_BURST`      | HTTP fetches burst limit      | 200           |
| `RATE_LIMIT_REPEATS_BURST`     | Repeat iterations burst limit | 4000          |
| `RATE_LIMIT_DISABLED`          | Disable all rate limits       | false         |

Setting any of these environment variables to 0 disables the corresponding rate limit for all workspaces.
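To make the relationship between a per-second rate and its burst value in the table above more concrete, here is a minimal sketch that models the pair as a token bucket. This is an assumption made for illustration: the page does not state which algorithm the Runtime uses, and the `TokenBucket` class and `countAllowed` helper are hypothetical names. The Runtime also delays throttled operations rather than rejecting them, so a `false` result here should be read as "would be delayed".

```javascript
// Illustrative token-bucket model of a "rate + burst" limit.
// Assumption: this is NOT the Runtime's actual implementation.
class TokenBucket {
  constructor(ratePerSecond, burst) {
    this.ratePerSecond = ratePerSecond; // steady refill rate
    this.burst = burst;                 // bucket capacity
    this.tokens = burst;                // the bucket starts full
    this.lastRefillMs = 0;
  }

  // Try to consume one token at time `nowMs`; returns true if allowed.
  tryConsume(nowMs) {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.burst,
      this.tokens + elapsedSeconds * this.ratePerSecond
    );
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false; // in the Runtime this operation would be delayed
  }
}

// Count how many of `attempts` operations issued at the same instant pass.
function countAllowed(bucket, attempts, nowMs) {
  let allowed = 0;
  for (let i = 0; i < attempts; i++) {
    if (bucket.tryConsume(nowMs)) allowed++;
  }
  return allowed;
}

// Default automation values from the table: 100 per second, burst of 400.
const bucket = new TokenBucket(100, 400);
const firstBurst = countAllowed(bucket, 1000, 0);        // 400
const oneSecondLater = countAllowed(bucket, 1000, 1000); // 100
console.log({ firstBurst, oneSecondLater });
```

Under this model, a sudden peak can use the full burst allowance once, after which throughput settles at the per-second rate until the bucket has had time to refill.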
Rate limits can be configured per workspace using workspace secrets:

| Workspace Secret                       | Description                            | Default Value |
| -------------------------------------- | -------------------------------------- | ------------- |
| `prismeai_ratelimit_automations`       | Automations per second                 | 100           |
| `prismeai_ratelimit_emits`             | Event emits per second                 | 30            |
| `prismeai_ratelimit_fetchs`            | HTTP fetches per second                | 50            |
| `prismeai_ratelimit_repeats`           | Repeat iterations per second           | 1000          |
| `prismeai_ratelimit_automations_burst` | Automations burst limit                | 400           |
| `prismeai_ratelimit_emits_burst`       | Event emits burst limit                | 100           |
| `prismeai_ratelimit_fetchs_burst`      | HTTP fetches burst limit               | 200           |
| `prismeai_ratelimit_repeats_burst`     | Repeat iterations burst limit          | 4000          |
| `prismeai_ratelimit_disabled`          | Disable rate limits for this workspace | false         |

These workspace secrets are restricted to super admins. Regular workspace admins cannot modify these values.

## Best Practices

Track your API usage and rate limit headers to understand your consumption patterns:

```javascript
function checkRateLimits(response) {
  const limit = response.headers.get('X-RateLimit-Limit');
  const remaining = response.headers.get('X-RateLimit-Remaining');
  const reset = response.headers.get('X-RateLimit-Reset');

  // Skip responses that do not carry rate limit headers
  if (limit === null || remaining === null || reset === null) return;

  // X-RateLimit-Reset is in Unix epoch seconds; Date expects milliseconds
  const resetTime = new Date(Number(reset) * 1000).toLocaleTimeString();
  console.log(`Rate limits: ${remaining}/${limit} remaining, reset at ${resetTime}`);

  // Warn when fewer than 10% of the allowed requests remain
  if (parseInt(remaining, 10) < parseInt(limit, 10) * 0.1) {
    console.warn('Approaching rate limit!');
  }
}
```

When rate limited, implement exponential backoff with jitter:

```javascript
async function apiCallWithRetry(url, options, maxRetries = 5) {
  let retries = 0;

  while (retries < maxRetries) {
    try {
      const response = await fetch(url, options);
      checkRateLimits(response);

      if (response.status === 429) {
        // Use the Retry-After header when it is a number of seconds,
        // otherwise fall back to exponential backoff with jitter
        const retryAfter = parseInt(response.headers.get('Retry-After'), 10);
        const delay = Number.isNaN(retryAfter)
          ? Math.pow(2, retries) * 1000 + Math.random() * 1000
          : retryAfter * 1000;

        console.log(`Rate limited. Retrying after ${Math.round(delay)}ms`);
        await new Promise(resolve => setTimeout(resolve, delay));
        retries++;
        continue;
      }

      return response;
    } catch (error) {
      retries++;
      if (retries >= maxRetries) throw error;

      // Exponential backoff for network errors
      const delay = Math.pow(2, retries) * 1000 + Math.random() * 1000;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }

  // Every attempt was rate limited: fail explicitly instead of returning undefined
  throw new Error(`Still rate limited after ${maxRetries} retries`);
}
```

Design automations to distribute work effectively:

1. Use events to distribute processing across multiple Runtime instances
2. Batch operations where possible instead of making multiple single calls
3. Implement queuing for high-volume operations
4. Use parallel processing for independent operations

```yaml
# Example of batched processing
do:
  - repeat:
      on: '{{workQueue}}'
      batch:
        size: 3 # Process 3 items at once
        interval: 500 # Pause 500ms between batches
      do:
        - process:
            item: '{{item}}'
---
# Rather than:
do:
  - repeat:
      on: '{{workQueue}}'
      do:
        - process:
            item: '{{item}}'
```

Implement caching for frequently accessed data:

```javascript
// Simple in-memory cache
const cache = new Map();

async function fetchWithCache(url, options, ttlMs = 60000) {
  const cacheKey = `${url}:${JSON.stringify(options)}`;

  if (cache.has(cacheKey)) {
    const { data, expiry } = cache.get(cacheKey);
    if (expiry > Date.now()) {
      return data;
    }
    cache.delete(cacheKey);
  }

  const response = await fetch(url, options);

  // Do not cache error responses such as 429
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const data = await response.json();
  cache.set(cacheKey, {
    data,
    expiry: Date.now() + ttlMs
  });

  return data;
}
```

## Monitoring Throttling

You can monitor throttling in Runtime automations through the following methods:

Each automation execution generates a `runtime.automations.executed` event that includes throttling information:

```json
{
  "event": "runtime.automations.executed",
  "payload": {
    "automation": "my-automation",
    "workspace": "my-workspace",
    "duration": 1250,
    "throttled": 1000,
    "status": "success"
  }
}
```

Here `duration` is the total execution time in milliseconds and `throttled` is the time spent being throttled, also in milliseconds. If `throttled` is greater than zero, the automation was rate limited.

The Prisme.ai dashboard provides metrics on automation execution, including:

* Execution counts
* Average duration
* Throttling rates
* Error rates

## Common Rate Limit Scenarios

When processing large datasets, use batching and distributed processing:

* Split large datasets into manageable chunks
* Process chunks in parallel using events
* Implement checkpointing to resume interrupted processing
* Consider scheduled automations for very large datasets

For systems handling many user-triggered events:

* Implement client-side throttling for UI interactions
* Queue events server-side for processing
* Consider debouncing or deduplicating similar events
* Prioritize critical user actions in your processing queue

When synchronizing with external systems:

* Use webhooks where possible instead of polling
* Implement incremental synchronization (only changed data)
* Schedule large synchronization jobs during off-peak hours
* Prioritize critical data for real-time sync

For generating scheduled reports or analytics:

* Pre-compute and cache common metrics
* Generate reports during off-peak hours
* Split large reports into smaller segments
* Implement progressive loading for user interfaces

## Next Steps

* Learn about API security best practices
* Understand authentication methods
* Learn how to handle API errors