What is Rate Limiting? - Higress Technical Glossary

📖 Definition

Rate limiting controls how many requests a client may send to an API within a given unit of time. It protects backend services from overload, keeps the system stable, and allocates shared resources fairly among callers. Common algorithms include token bucket, leaky bucket, and sliding window.
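To make one of these algorithms concrete, here is a minimal token bucket sketch in Python. It is illustrative only and does not describe how Higress implements rate limiting; the class name `TokenBucket` and its parameters (`rate_per_sec`, `capacity`, `clock`) are assumptions made for this example.

```python
import time


class TokenBucket:
    """Token bucket: tokens refill at a steady rate, up to a fixed capacity."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self.rate_per_sec = rate_per_sec   # tokens added per second
        self.capacity = capacity           # maximum tokens the bucket can hold
        self.clock = clock                 # injectable clock, useful for testing
        self.tokens = capacity             # start full so an initial burst is allowed
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens and return True if enough are available."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For example, `TokenBucket(rate_per_sec=100 / 60, capacity=100)` would admit roughly 100 requests per minute on average while tolerating a burst of up to 100 requests at once. The capacity is what separates a token bucket from a strict fixed-rate limiter: it bounds how much unused allowance a caller can save up.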

🔗 How Higress Uses This

Higress provides multi-dimensional rate limiting: fine-grained policies can be keyed on routes, request headers, and request parameters. For AI scenarios, it also supports token-level rate limiting, where the limit applies to tokens consumed rather than to the number of requests.
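One way to picture "multi-dimensional" limiting is that each request is mapped to a counter key built from the chosen dimensions. The Python sketch below shows that idea under stated assumptions: the function `rate_limit_key` and its parameters are hypothetical and are not part of any Higress API.

```python
from typing import Mapping, Optional


def rate_limit_key(route: str,
                   headers: Mapping[str, str],
                   params: Mapping[str, str],
                   header_name: Optional[str] = None,
                   param_name: Optional[str] = None) -> str:
    """Build a counter key from the route plus an optional header or parameter value."""
    parts = [f"route={route}"]
    if header_name is not None:
        # HTTP header names are case-insensitive, so compare in lower case.
        lowered = {k.lower(): v for k, v in headers.items()}
        value = lowered.get(header_name.lower(), "")
        parts.append(f"header:{header_name.lower()}={value}")
    if param_name is not None:
        value = params.get(param_name, "")
        parts.append(f"param:{param_name}={value}")
    return "|".join(parts)
```

Requests that produce the same key share one counter, so adding a dimension (for instance a header that carries an API key) splits a route-wide limit into one limit per distinct value.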

💡 Examples

1. Each API key is allowed at most 100 requests per minute.
2. Call frequency is limited per user ID to prevent abuse.
3. AI interfaces enforce quotas based on token consumption.
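The three examples above can all be modeled by one per-key limiter in which each request carries a cost: a cost of 1 counts requests, and a cost equal to the tokens consumed gives token-based quotas. The following Python sketch uses a sliding window log to show this. It is an assumption-laden illustration, not Higress code; `SlidingWindowLimiter`, `limit`, `window_sec`, and `cost` are names invented for the example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-key sliding window: total cost within the last `window_sec` must not exceed `limit`."""

    def __init__(self, limit: int, window_sec: float, clock=time.monotonic):
        self.limit = limit
        self.window_sec = window_sec
        self.clock = clock                  # injectable clock, useful for testing
        self.events = defaultdict(deque)    # key -> deque of (timestamp, cost)
        self.used = defaultdict(int)        # key -> total cost currently in the window

    def allow(self, key: str, cost: int = 1) -> bool:
        now = self.clock()
        queue = self.events[key]
        # Drop entries that have aged out of the window.
        while queue and now - queue[0][0] >= self.window_sec:
            _, old_cost = queue.popleft()
            self.used[key] -= old_cost
        if self.used[key] + cost > self.limit:
            return False                    # over the limit: reject without recording
        queue.append((now, cost))
        self.used[key] += cost
        return True


# Example 1: at most 100 requests per minute for each API key.
per_key = SlidingWindowLimiter(limit=100, window_sec=60)
# Example 3: a hypothetical quota of 10,000 tokens per minute for each user.
per_tokens = SlidingWindowLimiter(limit=10_000, window_sec=60)
```

A gateway would call `per_key.allow(api_key)` once per request, or `per_tokens.allow(user_id, cost=tokens_used)` once the token count of a call is known. The log-based window is exact but stores one entry per accepted request, so counter-based approximations are often preferred at high traffic volumes.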

⚙️ Configuration Example

YAML

# Higress Rate Limiting Configuration Example
plugins:
  - name: request-rate-limiter
    config:
      rate: 100
      burst: 200
      key: consumer

🔄 Related Terms

Circuit Breaker

API Gateway

Token

❓ FAQ

What is Rate Limiting?