Delphix Products

  • 1.  Data level profiling (EMAIL)

    Posted 05-10-2017 02:28:00 PM

    Hi,


    I'm trying to set up data-level profiling on the masking engine, but I can't get the job working. It terminates successfully, but the EMAIL columns are not tagged as sensitive.

    I'm using this regexp to profile email addresses
    \b[[:alnum:]]([-_.]?[[:alnum:]])*@[[:alnum:]]([-.]?[[:alnum:]])*\.([a-z]{2,4})\b

    The email addresses to profile are found in the medical_records and patient tables of the Delphix demo schema.

    Regards,

    Mouhssine
    #DemoEnvironment
    #Masking


  • 2.  RE: Data level profiling (EMAIL)

    Posted 05-12-2017 03:37:00 PM
    Hi Mouhssine,
    Does "- \b[[a-zA-Z0-9]]([-_.]?[[a-zA-Z0-9]])*@[[a-zA-Z0-9]]([-.]?[[a-zA-Z0-9]])*\.([a-z]{2,4})\b" work?


  • 3.  RE: Data level profiling (EMAIL)

    Posted 05-12-2017 04:12:00 PM
    Hi Jaclyn,

    I'll give it a try and keep you informed, though I think I tested it at first with no success. Please bear with me while I try again and report back.

    Mouhssine


  • 4.  RE: Data level profiling (EMAIL)

    Posted 05-14-2017 01:22:00 PM
    Hi Jaclyn,

    After testing this new regexp, nothing changed: I still can't profile emails.

    Email reg


    Profile


    Profiler job


    Results (2 tables profiled)



    But no column has been identified as sensitive, even though each table has one email column defined.



    Regards,

    Mouhssine


  • 5.  RE: Data level profiling (EMAIL)

    Posted 05-16-2017 12:22:00 PM
    Hi Jaclyn,

    I fixed it.

    After discussing this issue with Kersten, who gave me some great advice, I found that my profiling job didn't tag the column because of the sampling algorithm it uses.

    To be clearer, the profiler relies on a configuration file that sets some key values (NO_OF_ROWS and PERCENTAGE_REQUIRED=80). This means it looks at NO_OF_ROWS rows (default 100) in each column to profile, and at least 80% of them must match the defined regexp.

    I finally updated my tables to hold 100 records and used this regexp "\b[[a-zA-Z0-9]]([-_.]?[[a-zA-Z0-9]])*@[[a-zA-Z0-9]]([-.]?[[a-zA-Z0-9]])*\.([a-z]{2,4})\b" and voilà, things work like magic.
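
    For anyone hitting the same thing, here is a rough Python sketch of the sampling rule as I understand it. It is not the engine's actual code. The function name column_is_sensitive is made up for illustration, the regexp is rewritten with single brackets because Python's re module reads the doubled brackets differently, and dividing by NO_OF_ROWS rather than by the number of rows actually present is my assumption, based only on the fact that my small tables were never tagged.

```python
import re

# Python-flavoured rewrite of the pattern from this thread (single brackets).
# Assumption: the masking engine uses a different regex dialect.
EMAIL_RE = re.compile(
    r"\b[a-zA-Z0-9]([-_.]?[a-zA-Z0-9])*@[a-zA-Z0-9]([-.]?[a-zA-Z0-9])*\.([a-z]{2,4})\b"
)

# Values mentioned in the thread: sample size defaults to 100, threshold is 80%.
NO_OF_ROWS = 100
PERCENTAGE_REQUIRED = 80


def column_is_sensitive(values, pattern=EMAIL_RE,
                        no_of_rows=NO_OF_ROWS,
                        percentage_required=PERCENTAGE_REQUIRED):
    """Hypothetical model of the profiler's decision for one column.

    Takes up to no_of_rows values, counts how many match the pattern, and
    compares that count against percentage_required percent of no_of_rows
    (assumed denominator, so short columns can never reach the threshold).
    """
    sample = values[:no_of_rows]
    matches = sum(
        1 for v in sample if v is not None and pattern.search(str(v))
    )
    return matches * 100 >= percentage_required * no_of_rows


if __name__ == "__main__":
    few = ["user%d@example.com" % i for i in range(10)]
    many = ["user%d@example.com" % i for i in range(100)]
    print(column_is_sensitive(few))
    print(column_is_sensitive(many))
```

    Under these assumptions, a column with only 10 rows is never tagged even if every value is an email, which matches what I saw before I loaded 100 records.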

    Regards,

    Mouhssine


  • 6.  RE: Data level profiling (EMAIL)
    Best Answer

    Posted 05-16-2017 07:47:00 PM
    I'm glad it's all worked out!