(data profiling) Searching for personal data with weakly defined formats using profiling.

  • Question
  • Updated 4 months ago
  • Answered
Dear Delphix Community,

When the sensitive data to be discovered has a pre-defined format, REGEX-based profiling works fine. Examples of pre-defined-format data: an email address, an IBAN account number, etc.

When the sensitive data to be discovered has no defined format (e.g. human-related information such as first name or last name), regex usually does not give optimal results.
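
To make the contrast concrete, here is a minimal Python sketch of regex-based profiling on sampled column values. It is only an illustration and not how the Delphix profiler is implemented: the function name `regex_match_ratio`, the simplified email pattern, and the sample values are all assumptions.

```python
import re

# Simplified email pattern for illustration only (assumption, not a full validator).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def regex_match_ratio(values, pattern):
    """Return the fraction of non-empty values that fully match the pattern."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return 0.0
    hits = sum(1 for v in non_empty if pattern.fullmatch(v))
    return hits / len(non_empty)

# Hypothetical column samples.
emails = ["anna@example.com", "bob@example.org", "n/a"]
first_names = ["Anna", "Bob", "Katarzyna"]

email_ratio = regex_match_ratio(emails, EMAIL_RE)       # most values match
name_ratio = regex_match_ratio(first_names, EMAIL_RE)   # nothing matches
```

A column of emails scores high against its pattern, but there is no comparable pattern that separates first names from any other short text, which is the gap described above.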

My questions:
  • Is there an option to use valid-list-lookup data profiling (e.g. a list of 500 city names)?
  • Is there an option to write a custom data profiling plug-in in some programming language?
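
As a rough illustration of the list-lookup idea (not an existing Delphix feature), the Python sketch below flags columns whose sampled values mostly appear in a list of valid values. The names `lookup_match_ratio` and `profile_columns`, the 0.75 threshold, and the sample data are all hypothetical.

```python
def lookup_match_ratio(values, valid_values):
    """Return the fraction of non-empty values found in the valid list (case-insensitive)."""
    normalized = {v.strip().casefold() for v in valid_values}
    non_empty = [v.strip().casefold() for v in values if v and v.strip()]
    if not non_empty:
        return 0.0
    hits = sum(1 for v in non_empty if v in normalized)
    return hits / len(non_empty)

def profile_columns(columns, valid_values, threshold=0.75):
    """Return the names of columns whose match ratio meets the threshold."""
    return [
        name
        for name, values in columns.items()
        if lookup_match_ratio(values, valid_values) >= threshold
    ]

# A real list would hold e.g. 500 city names; four are enough for the sketch.
CITY_NAMES = ["Warsaw", "Berlin", "Paris", "Madrid"]

# Hypothetical sampled column values.
sample = {
    "col_a": ["warsaw", "Berlin ", "PARIS", "Madrid", "Rome"],
    "col_b": ["12345", "abc", "Paris", "", "x"],
}

flagged = profile_columns(sample, CITY_NAMES)
```

Using a match ratio with a threshold, rather than requiring every value to match, tolerates dirty data and values missing from the list. The threshold would need tuning per data set.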

Data profiling is planned to be executed on enterprise-scale systems with thousands of tables. High-quality profiling will save a lot of time on manual review of the results.


Looking forward to your suggestions.

Many thanks in advance,

best regards,

Adam Przybyslawski

Adam


Posted 5 months ago


Gary Hallam, Official Rep

Hi Adam,
Thank you for your suggestions.

Like you, I see comprehensive profiling as a fundamental feature of data protection. The recent release of our Masking API allows a much more automated approach to profiling and masking, which helps large enterprises profile large data estates.

The option to use list lookups is not currently available, but it has been requested in the past and has been accepted onto our feature enhancement backlog.

I like the idea of a custom profiling plug-in. This marries well with the ability to create custom masking algorithms. I did not see this in our enhancement request catalogue, so I will discuss this feature with product management.

Regards,
Gary

Adam

Dear Gary,

Thanks for the prompt and professional answer.

Have a nice day,

best regards,
Adam

Jonathan Prévost


Hello Adam,

We are also looking for the same thing, since our goal is to reduce manual intervention as much as possible. Is this just a dream, or is it feasible to run the profiler on tables and have close to 100% of the fields identified accurately?


Thanks

John