Delphix Products


Data Profiling - Data level vs Column level

  • 1.  Data Profiling - Data level vs Column level

    Posted 02-19-2018 11:55:00 AM
    Assume a scenario where I need to profile for birth dates. I write two regular expressions, one at the data level and one at the column level, and link both to the same domain.
    Now, when I run the profiler, what will happen?
      ---  For a few columns, both the data-level and column-level expressions will match, so which takes precedence? Or does Delphix assign the domain if both match?
      ---  For other columns such as Start Date and End Date, the data-level expression will find matches, but will these also be assigned the birth date domain?
    #Masking


  • 2.  RE: Data Profiling - Data level vs Column level

    Posted 02-19-2018 12:15:00 PM

    Hi Mayank,

    Generally speaking, the profiler executes the list of regexes defined in your profile set. From a logical perspective, you have to choose which option is more relevant for identifying your column (column name level or column data level) and associate it with your profiler.

    Now let’s assume you have both column_name_level and data_column_level defined in your profile for a column: the profiler will go through the regex list and execute them all, in this order.

    Regards,


    Mouhssine 



  • 3.  RE: Data Profiling - Data level vs Column level

    Posted 02-19-2018 12:30:00 PM
    Thanks, Mouhssine. In our case not all columns can be identified by name alone, so I definitely need to write the regex at the data level.
    However, as soon as I do that, it starts marking columns such as Start Date and End Date as sensitive too, since they are also dates in the same format as the birth date.

    Any ideas, then?
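
To make the problem above concrete, here is a minimal Python sketch of the two matching styles. It is not Delphix code and makes no claim about how the profiler resolves precedence; the expressions, the column names, and the sample values are all hypothetical and chosen only to illustrate why a data-level date expression also flags Start Date and End Date.

```python
import re

# Hypothetical expressions, not taken from any Delphix profile set.
COLUMN_NAME_REGEX = re.compile(r"(birth|dob)", re.IGNORECASE)  # column-name level
DATA_REGEX = re.compile(r"^\d{4}-\d{2}-\d{2}$")                # data level

# Hypothetical sample table: column name -> a few sampled values.
SAMPLE = {
    "BIRTH_DT": ["1984-03-12", "1990-11-02"],
    "START_DATE": ["2017-01-01", "2018-02-19"],
    "END_DATE": ["2017-12-31", "2018-12-31"],
    "COL_17": ["1975-06-30", "1962-09-14"],  # birth dates under an opaque name
}


def matches_by_name(columns):
    """Return the columns whose NAME matches the column-name expression."""
    return [name for name in columns if COLUMN_NAME_REGEX.search(name)]


def matches_by_data(columns):
    """Return the columns whose sampled VALUES all match the data expression."""
    return [
        name
        for name, values in columns.items()
        if values and all(DATA_REGEX.match(v) for v in values)
    ]


name_hits = matches_by_name(SAMPLE)
data_hits = matches_by_data(SAMPLE)
print("column-name level:", name_hits)
print("data level:", data_hits)
```

In this sketch the column-name expression finds only BIRTH_DT and misses COL_17, while the data expression finds COL_17 but also flags START_DATE and END_DATE, because the values alone do not distinguish a birth date from any other date in the same format.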


  • 4.  RE: Data Profiling - Data level vs Column level

    Posted 02-19-2018 12:37:00 PM

    Hi,

    Here is how I would do it.

    Create your own profile set and associate it with all the needed regular expressions (data and column), cf. https://docs.delphix.com/docs/delphix-masking/delphix-masking-engine-admin-guide/managing-profiler-s... "To add a Profiler Set".

    Once done, create a profiler job based on the newly created profile set, cf. https://docs.delphix.com/docs51/delphix-masking/delphix-masking-quick-start-guide/masking-engine-act... "Create a New Profile of Data Using the Masking Inventory", and make sure you choose the right profile set name in the profile sets field.

    Regards,

    Mouhssine



  • 5.  RE: Data Profiling - Data level vs Column level

    Posted 02-19-2018 12:40:00 PM
    I understand how to create the expressions and run the job :)
    What I wanted to know is whether you have any ideas for removing or reducing the false matches if I write a data-level expression for birth date.


  • 6.  RE: Data Profiling - Data level vs Column level

    Posted 02-19-2018 12:45:00 PM
    Hi,

    :)  This was the first part of my answer.

    Create your own profile set and associate it with all the needed regexps, or with a subset of them.

    For how to do that, cf. https://docs.delphix.com/docs/delphix-masking/delphix-masking-engine-admin-guide/managing-profiler-s... "To add a Profiler Set".

    Hope it's clear now.


    Regards,

    Mouhssine