Rate Limiting

📖 Definition

Rate limiting controls the rate of API requests by capping the number of requests allowed per unit of time. It protects backend services from overload, keeps the system stable, and allocates resources fairly among callers. Common algorithms include token bucket, leaky bucket, and sliding window.
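
As an illustration of the first algorithm named above, here is a minimal token bucket sketch in Python. It is not Higress code: the class name `TokenBucket`, its parameters (`rate`, `capacity`, `cost`), and the injectable `clock` are assumptions made for this example. Tokens refill continuously at `rate` per second up to `capacity`, and each request spends tokens or is rejected.

```python
import time


class TokenBucket:
    """Illustrative token bucket (hypothetical helper, not a Higress API)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum tokens the bucket can hold
        self.clock = clock               # injectable so the logic can be tested
        self.tokens = float(capacity)    # start full, so an initial burst is allowed
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if enough are available."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In this sketch `capacity` bounds the size of a burst, while `rate` bounds the sustained request rate once the burst is spent.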

🔗 How Higress Uses This

Higress provides multi-dimensional rate limiting: fine-grained policies can be keyed on routes, headers, and parameters, and token-level rate limiting is supported for AI scenarios.

💡 Examples

  • Each API Key allows a maximum of 100 requests per minute
  • Limit call frequency by user ID to prevent abuse
  • AI interfaces perform quota control based on token consumption
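
The three examples above share one shape: a budget per key per time window, where the key may be an API key or user ID and the cost may be one request or a number of AI tokens. The following is a rough Python sketch of that idea using a sliding window. It is not how Higress implements it; `SlidingWindowLimiter`, `limit`, `window_seconds`, and `cost` are hypothetical names chosen for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Illustrative per-key sliding-window limiter (not a Higress API)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit                # budget per key per window
        self.window = window_seconds      # window length in seconds
        self.clock = clock                # injectable so the logic can be tested
        self.events = defaultdict(deque)  # key -> deque of (timestamp, cost)
        self.used = defaultdict(int)      # key -> cost consumed in current window

    def allow(self, key, cost=1):
        """Return True and record `cost` if the key still has budget."""
        now = self.clock()
        q = self.events[key]
        # Evict events that have slid out of the window.
        while q and q[0][0] <= now - self.window:
            _, old_cost = q.popleft()
            self.used[key] -= old_cost
        if self.used[key] + cost > self.limit:
            return False
        q.append((now, cost))
        self.used[key] += cost
        return True


# Example 1: at most 100 requests per minute per API key.
per_key = SlidingWindowLimiter(limit=100, window_seconds=60)
# Example 3: a per-user budget of AI tokens per minute; pass the token
# count of each call as `cost`, e.g. ai_quota.allow("user-1", cost=800).
ai_quota = SlidingWindowLimiter(limit=1000, window_seconds=60)
```

Keeping a timestamp per event makes the window exact at the price of memory proportional to the number of recent requests; a fixed-window counter would be cheaper but allows short bursts at window boundaries.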

⚙️ Configuration Example

YAML
# Higress Rate Limiting Configuration Example
plugins:
  - name: request-rate-limiter
    config:
      rate: 100
      burst: 200
      key: consumer
