(data profiling) Option to exclude incompatible data types (column types) for a given search rule?

  • Question
  • Updated 4 months ago
  • Answered
Dear Community,

Is there an option to exclude particular data types from being searched by an individual search rule (per rule, not globally)?

For example, when searching for email addresses (in both column names and data), an option to exclude all columns of data type NUMBER, DATE, BINARY, or TIMESTAMP would be helpful.

Each search rule could run faster and produce better results if incompatible data types were excluded (an improvement in both quality and performance).

Because each sensitive attribute has a different set of data types to exclude, this option should be available separately for each search rule.
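To illustrate the request, here is a minimal Python sketch of what a per-rule exclusion could look like. It is not the product's API: `Column`, `SearchRule`, `excluded_types`, and `candidate_columns` are hypothetical names invented for this example, and the sample columns are made up.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str  # e.g. "VARCHAR", "NUMBER", "DATE"


@dataclass
class SearchRule:
    name: str
    # Hypothetical per-rule setting: data types this rule should skip.
    excluded_types: frozenset = field(default_factory=frozenset)

    def candidate_columns(self, columns):
        """Return only the columns whose data type is not excluded for this rule."""
        return [c for c in columns if c.data_type.upper() not in self.excluded_types]


# Made-up sample schema for the illustration.
columns = [
    Column("CONTACT_EMAIL", "VARCHAR"),
    Column("ORDER_TOTAL", "NUMBER"),
    Column("BIRTH_DATE", "DATE"),
    Column("SCAN", "BINARY"),
    Column("CREATED_AT", "TIMESTAMP"),
]

# Each rule carries its own exclusion list instead of sharing a global one.
email_rule = SearchRule("email", frozenset({"NUMBER", "DATE", "BINARY", "TIMESTAMP"}))
birthdate_rule = SearchRule("birthdate", frozenset({"NUMBER", "BINARY"}))

print([c.name for c in email_rule.candidate_columns(columns)])
print([c.name for c in birthdate_rule.candidate_columns(columns)])
```

With this shape, the email rule would only scan the VARCHAR column, while the birthdate rule would still scan DATE and TIMESTAMP columns.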

Is there an existing option that enables this for an individual search rule?

Looking forward to your suggestions,

Thanks in advance,

Adam Przybyslawski
Adam

Posted 4 months ago

Hims, Employee

Hi Adam,
Most data types that cannot contain PII/PHI are already excluded from the profiling scope, e.g. VARCHAR/CHAR columns of width 2 or 3 and Boolean/flag data types.

Data types that may contain sensitive data, such as DATE/TIMESTAMP (for birthdates) or BINARY (PDF/JPEG), are NOT excluded.

These properties are currently global and are not configurable at the job level. In the future, they should be user-configurable.
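As a rough illustration of the global behaviour described above, the scope check could be pictured like the Python sketch below. This is not the product's implementation: `in_profiling_scope` and both constants are hypothetical, and reading "2 or 3 width" as "declared width of at most 3" is an assumption.

```python
# Assumption: "2 or 3 width" is read as a declared width of at most 3.
MAX_SKIPPED_CHAR_WIDTH = 3
# Assumption: type names are illustrative; real names vary by database.
ALWAYS_SKIPPED_TYPES = {"BOOLEAN", "FLAG"}


def in_profiling_scope(data_type, width=None):
    """Return True if a column of this type would be profiled (global rule)."""
    t = data_type.upper()
    if t in ALWAYS_SKIPPED_TYPES:
        return False
    # Very short character columns are skipped as unlikely to hold PII/PHI.
    if t in {"CHAR", "VARCHAR"} and width is not None and width <= MAX_SKIPPED_CHAR_WIDTH:
        return False
    # DATE, TIMESTAMP, BINARY and everything else stay in scope.
    return True
```

Because the check takes no rule or job argument, the same answer applies to every search rule, which is the limitation raised in the question.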

Hope this helps.

Hims