PII Masking Processor
The PII Masking Processor automatically detects and masks Personally Identifiable Information (PII) in data.
Below are the default types of PII that will be identified, along with examples:
Default Masked PII Types
Email:
john.doe@example.com
→*****@*****.com
US Phone (Formatted):
(123) 456-7890
→(***) ***-****
UK Phone Number:
+44 20 7946 0958
→+44 ** **** ****
URL:
https://www.example.com/path
→https://*****.com/****
Hostname:
server.example.com
→*****.example.com
Street Address:
123 Main St, Springfield, IL
→*** Main St, *******, **
Zip Code:
60601
→*****
IPv4:
192.168.1.1
→***.***.*.*
IPv6:
2001:db8::ff00:42:8329
→****:****::****:****:****
SSN (Spaces):
123 45 6789
→*** ** ****
SSN (Dashes):
123-45-6789
→***-**-****
By default, these PII types will be masked to ensure data privacy and security.
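To illustrate how pattern-based masking of this kind can work, here is a minimal Python sketch covering three of the types listed above (email, SSN, and IPv4). The regular expressions and the names `mask_pii`, `mask_email`, and `mask_digits` are assumptions made for this example; they are not Gathr's actual detection rules or API.

```python
import re

# Hypothetical patterns for illustration only; the product's real
# detection rules are not documented here.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)*\.(\w+)\b")
SSN = re.compile(r"\b\d{3}([- ])\d{2}\1\d{4}\b")   # dashes or spaces
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mask_digits(match):
    """Replace every digit in the match with '*', keeping separators."""
    return re.sub(r"\d", "*", match.group(0))


def mask_email(match):
    """Mask the local part and domain, keeping only the top-level domain."""
    return "*****@*****." + match.group(1)


def mask_pii(text):
    """Apply each pattern in turn and return the masked text."""
    text = EMAIL.sub(mask_email, text)
    text = SSN.sub(mask_digits, text)
    text = IPV4.sub(mask_digits, text)
    return text


print(mask_pii("john.doe@example.com"))  # *****@*****.com
print(mask_pii("123-45-6789"))           # ***-**-****
print(mask_pii("192.168.1.1"))           # ***.***.*.*
```

The sketch keeps separators such as `-`, `.`, and `@` so that the masked value still shows which PII type it was, which matches the style of the examples above.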
To enable PII Masking, use the Schema Type tab of any data source.
Once it is enabled, a PII Masking processor is automatically added to the pipeline flow on the canvas. The processor lists the columns selected for PII Masking from the source's incoming data.
Notes:
Columns detected for PII Masking are highlighted in blue under the detected current schema tab.
The option to enable or disable PII Masking is available under the gear icon of each PII Masking column.
Supported file formats for PII Masking are CSV, Parquet, JSON, Avro, and XML.
PII Masking Processor Configuration
The Select Output Field drop-down list shows the columns that have been enabled for PII Masking in the schema.
Select an output field, choose a Mask Type, and enter the masking character in the Add Masking Character column. The Mask Type options are described below:
Mask Type | Description |
---|---|
All | Masks all characters in the selected column. |
Alternate Character | Masks alternate characters in the selected column. |
Head Characters | Masks characters from the beginning of the data in the selected column. You must provide the number of characters to mask from the beginning. |
Trailing Characters | Masks characters from the end (rightmost part) of the data in the selected column. You must provide the number of characters to mask from the end. |
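The four mask types can be sketched in Python as follows. This is only an illustration of the behavior described in the table: the function name `apply_mask`, its parameters, and the mask type strings are hypothetical, and the choice to start Alternate Character masking at the first character is an assumption, since the table does not say which characters are kept.

```python
def apply_mask(value, mask_type, mask_char="*", count=0):
    """Mask a string according to one of the four mask types.

    Hypothetical helper for illustration; not the product's API.
    """
    if mask_type == "all":
        # Mask every character.
        return mask_char * len(value)
    if mask_type == "alternate":
        # Assumption: mask every second character, starting with the first.
        return "".join(
            mask_char if i % 2 == 0 else ch for i, ch in enumerate(value)
        )
    # Clamp the count so it never exceeds the string length.
    n = min(max(count, 0), len(value))
    if mask_type == "head":
        # Mask the first n characters.
        return mask_char * n + value[n:]
    if mask_type == "trailing":
        # Mask the last n characters.
        return value[:len(value) - n] + mask_char * n
    raise ValueError(f"unknown mask type: {mask_type}")


print(apply_mask("secret", "all"))                     # ******
print(apply_mask("secret", "alternate"))               # *e*r*t
print(apply_mask("123-45-6789", "head", count=3))      # ***-45-6789
print(apply_mask("123-45-6789", "trailing", count=4))  # 123-45-****
```

Clamping `count` to the string length means a value shorter than the requested count is fully masked rather than raising an error; a real implementation may handle that case differently.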