Rate Limiting Service Specification

The SmartSRED application implements a comprehensive dual-layer rate limiting and quota management system to protect backend resources, ensure fair usage, and support tier-based differentiation (FREE vs PRO users).

Layer 1: Short-term Rate Limiting (Redis + Redisson)

  • Purpose: Handle transient traffic spikes and prevent DDoS or brute-force attacks at the perimeter.
  • Technology: Redisson RRateLimiter using the Token Bucket algorithm.
  • Storage: Redis (in-memory, distributed). All limit definitions are centralized in the com.sred.ai.SREDSimplify.common.redis.* module.
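
To make the token bucket behaviour concrete, here is a minimal in-memory sketch. It is not Redisson's implementation and does not touch Redis; the class name TokenBucket and the injectable clock are assumptions made for illustration. In the service itself the equivalent check would go through Redisson's RRateLimiter (for example its tryAcquire() method), which keeps the bucket state in Redis so that all application instances share it.

```java
import java.util.function.LongSupplier;

/** In-memory token bucket for illustration only (hypothetical class, not Redisson code). */
final class TokenBucket {
    private final long capacity;       // maximum tokens, i.e. the allowed burst size
    private final long windowNanos;    // time needed to refill an empty bucket completely
    private final LongSupplier clock;  // nanosecond clock, injectable so tests can control time
    private double tokens;             // tokens currently available
    private long lastRefill;           // clock reading at the last refill

    TokenBucket(long capacity, long windowSeconds, LongSupplier clock) {
        this.capacity = capacity;
        this.windowNanos = windowSeconds * 1_000_000_000L;
        this.clock = clock;
        this.tokens = capacity;        // start with a full bucket
        this.lastRefill = clock.getAsLong();
    }

    /** Consumes one token and returns true if one is available; returns false otherwise. */
    synchronized boolean tryAcquire() {
        refill();
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    /** Adds tokens in proportion to the elapsed time, capped at the bucket capacity. */
    private void refill() {
        long now = clock.getAsLong();
        double added = (double) (now - lastRefill) * capacity / windowNanos;
        tokens = Math.min(capacity, tokens + added);
        lastRefill = now;
    }
}
```

With a capacity of 5 and a 60-second window, five requests pass immediately, the sixth is rejected, and one further request becomes available roughly every 12 seconds.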

Base Configuration Limits

The values below are defaults; each can be overridden at runtime through ConfigCat feature flags. Unauthenticated routes are limited per client IP address (PER_IP), and authenticated actions are limited per user ID (PER_USER).

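Limit Type          | Default Limit | Dimension | Window
--------------------|---------------|-----------|-------
RATE_AUTH_LOGIN     | 5 requests    | PER_IP    | 60s
RATE_AUTH_REGISTER  | 3 requests    | PER_IP    | 60s
RATE_AUTH_REFRESH   | 30 requests   | PER_IP    | 60s
RATE_AI_PRECHECK    | 10 requests   | PER_USER  | 60s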

Note: If an IP exceeds the authentication limits, the Redisson rate limiter rejects the request, and the Spring controller responds with a standard 429 Too Many Requests.
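
One way to keep these defaults in a single place is an enum that carries the limit, dimension, and window for each limit type. The sketch below copies the values from the table; the enum layout, the redisKey helper, and the "rate:" key prefix are assumptions for illustration and are not taken from the com.sred.ai.SREDSimplify.common.redis.* module.

```java
import java.time.Duration;

/** Dimension a limit is keyed on, as named in the table. */
enum Dimension { PER_IP, PER_USER }

/** Default limits from the table; a ConfigCat override would replace defaultLimit at runtime. */
enum RateLimitType {
    RATE_AUTH_LOGIN(5, Dimension.PER_IP, Duration.ofSeconds(60)),
    RATE_AUTH_REGISTER(3, Dimension.PER_IP, Duration.ofSeconds(60)),
    RATE_AUTH_REFRESH(30, Dimension.PER_IP, Duration.ofSeconds(60)),
    RATE_AI_PRECHECK(10, Dimension.PER_USER, Duration.ofSeconds(60));

    final int defaultLimit;
    final Dimension dimension;
    final Duration window;

    RateLimitType(int defaultLimit, Dimension dimension, Duration window) {
        this.defaultLimit = defaultLimit;
        this.dimension = dimension;
        this.window = window;
    }

    /**
     * Builds the limiter key for a request. The "rate:<type>:<subject>" layout is
     * hypothetical; the subject is the client IP or the user ID depending on the dimension.
     */
    String redisKey(String clientIp, String userId) {
        String subject = (dimension == Dimension.PER_IP) ? clientIp : userId;
        return "rate:" + name() + ":" + subject;
    }
}
```

Keying each limiter on both the limit type and the subject means a burst of login attempts from one IP does not consume that IP's budget for token refreshes.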

Layer 2: Long-term Quota Management (PostgreSQL)

  • Purpose: Track cumulative resource usage over extended periods (e.g., monthly limits on T661 generations) tied to billing.
  • Technology: JPA entities (Quota) tracking numerical thresholds.
  • Storage: PostgreSQL database ensuring transactional integrity when credits are consumed.

Enforcement Strategy

Before executing heavy AI tasks or Python service delegations, the QuotaService intervenes:

  1. Determines the user's current subscription tier (e.g., FREE_USER, STARTER_USER, ENTERPRISE_USER).
  2. Checks the usage counters for the requested operation in the current billing cycle.
  3. If the quota is exceeded, blocks the operation entirely and returns a specific business error indicating quota exhaustion to the frontend.
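
The three steps above can be sketched as follows. This is an in-memory stand-in, not the actual QuotaService: a HashMap replaces the JPA Quota entity and PostgreSQL, and the class names, the per-tier limits passed in by the caller, and the operation and billing-cycle strings are all hypothetical. A real implementation would perform the check and the increment inside one database transaction.

```java
import java.util.HashMap;
import java.util.Map;

/** Subscription tiers named in the specification. */
enum Tier { FREE_USER, STARTER_USER, ENTERPRISE_USER }

/** Business error signalling quota exhaustion (hypothetical name). */
class QuotaExceededException extends RuntimeException {
    QuotaExceededException(String message) { super(message); }
}

/** In-memory illustration of the quota check; stands in for the JPA-backed service. */
final class QuotaCheckSketch {
    private final Map<Tier, Integer> limitsPerCycle;            // hypothetical per-tier limits
    private final Map<String, Integer> usage = new HashMap<>(); // key: user|operation|cycle

    QuotaCheckSketch(Map<Tier, Integer> limitsPerCycle) {
        this.limitsPerCycle = limitsPerCycle;
    }

    /**
     * Consumes one unit of quota and returns the remaining amount for the cycle.
     * Throws QuotaExceededException when the tier's limit is already used up.
     */
    synchronized int consume(String userId, Tier tier, String operation, String billingCycle) {
        // Step 1: resolve the limit for the user's tier (tiers without an entry get 0).
        int limit = limitsPerCycle.getOrDefault(tier, 0);
        // Step 2: read the usage counter for this operation in the current cycle.
        String key = userId + "|" + operation + "|" + billingCycle;
        int used = usage.getOrDefault(key, 0);
        // Step 3: block the operation when the quota is exhausted.
        if (used >= limit) {
            throw new QuotaExceededException(
                "Quota exhausted for " + operation + " in cycle " + billingCycle);
        }
        usage.put(key, used + 1);
        return limit - used - 1;
    }
}
```

Because the counter key includes the billing cycle, usage starts from zero in each new cycle without a separate reset job.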