AI Data Masking
Function Description
Interception and replacement of sensitive words in requests/responses
Data Handling Scope
- openai protocol: Request/response conversation content
- jsonpath: Only process specified fields
- raw: Entire request/response body
Sensitive Word Interception
- Directly intercept sensitive words in the data handling scope and return preset error messages
- Supports system’s built-in sensitive word library and custom sensitive words
Sensitive Word Replacement
- Replace sensitive words in request data with masked strings before passing to back-end services. Ensures that sensitive data does not leave the domain
- Some masked data can be restored after being returned by the back-end service
- Custom rules support standard regular expressions and grok rules, and replacement strings support variable substitution
Execution Properties
Plugin Execution Phase: Authentication Phase
Plugin Execution Priority: 991
Configuration Fields
Name | Data Type | Default Value | Description |
---|---|---|---|
deny_openai | bool | true | Intercept openai protocol |
deny_jsonpath | string | [] | Intercept specified jsonpath |
deny_raw | bool | false | Intercept raw body |
system_deny | bool | true | Enable built-in interception rules |
deny_code | int | 200 | HTTP status code when intercepted |
deny_message | string | Sensitive words found in the question or answer have been blocked | AI returned message when intercepted |
deny_raw_message | string | {“errmsg”:“Sensitive words found in the question or answer have been blocked”} | Content returned when not openai intercepted |
deny_content_type | string | application/json | Content type header returned when not openai intercepted |
deny_words | array of string | [] | Custom sensitive word list |
replace_roles | array | - | Custom sensitive word regex replacement |
replace_roles.regex | string | - | Rule regex (built-in GROK rule) |
replace_roles.type | [replace, hash] | - | Replacement type |
replace_roles.restore | bool | false | Whether to restore |
replace_roles.value | string | - | Replacement value (supports regex variables) |
Configuration Example
Sensitive Word Replacement Example
User Request Content
Please change curl http://172.20.5.14/api/openai/v1/chat/completions -H "Authorization: sk-12345" -H "Auth: test@gmail.com"
to POST method
Processed Request Large Model Content
curl http://***.***.***.***/api/openai/v1/chat/completions -H "Authorization: 48a7e98a91d93896d8dac522c5853948" -H "Auth: ****@gmail.com"
change to POST method
Large Model Returned Content
You want to convert a curl
GET request to a POST request, and this request is sending data to a specific API. Below is the modified curl
command to send as POST:
Here are the following modifications made:
-X POST
sets the request method to POST.-H "Content-Type: application/json"
sets theContent-Type
in the request header toapplication/json
, which is typically used to inform the server that the data you are sending is in JSON format.-d '{"key":"value"}'
sets the data to be sent, where'{"key":"value"}'
is a simple example of a JSON object. You need to replace it with the actual data you want to send.
Please note that you need to replace "key":"value"
with the actual data content you want to send. If your API accepts a different data structure or requires specific fields, please adjust this part according to your actual situation.
Processed Return to User Content
You want to convert a curl
GET request to a POST request, and this request is sending data to a specific API. Below is the modified curl
command to send as POST:
Here are the following modifications made:
-X POST
sets the request method to POST.-H "Content-Type: application/json"
sets theContent-Type
in the request header toapplication/json
, which is typically used to inform the server that the data you are sending is in JSON format.-d '{"key":"value"}'
sets the data to be sent, where'{"key":"value"}'
is a simple example of a JSON object. You need to replace it with the actual data you want to send.
Please note that you need to replace "key":"value"
with the actual data content you want to send. If your API accepts a different data structure or requires specific fields, please adjust this part according to your actual situation.
Related Notes
- In streaming mode, if the masked words are split across multiple chunks, restoration may not be possible
- In streaming mode, if sensitive words are split across multiple chunks, there may be cases where part of the sensitive word is returned to the user
- Grok built-in rule list: https://help.aliyun.com/zh/sls/user-guide/grok-patterns
- Built-in sensitive word library data source: https://github.com/houbb/sensitive-word/tree/master/src/main/resources