Delphix Products

  • 1.  Delphix support for "Delta Lake" (delta.io)

    Posted 07-17-2023 08:07:00 AM

    Hi,
    Is anyone familiar with, or has anyone tried, using Delphix with Delta Lake (delta.io) and/or Hadoop?
    We're exploring use cases: what works well and what does not.
    I imagine it would be mostly about Continuous Compliance (masking).
    Please share. Thank you.



    ------------------------------
    Arkadiusz Slawek
    Senior Solution Delivery Manager
    Credit Suisse AG
    ------------------------------


  • 2.  RE: Delphix support for "Delta Lake" (delta.io)

    Posted 07-20-2023 02:03:00 AM

    Any in-place masking will be very slow, because those DB platforms are not built for updates; they are built for fast select and insert.

    Performance of on-the-fly jobs will be drastically better (select from A and insert into B).

    Another option is an Extract-Mask-Load approach: get the data files out of the DB and mask them as files (depending on the storage format used, you can mask either in-place or on-the-fly).
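
    To make the on-the-fly pattern on extracted files concrete, here is a minimal sketch in plain Python. It is not Delphix code or the Delphix API. It assumes the data has been exported to CSV as a stand-in for whatever storage format is actually used, and the names mask_value and mask_file_on_the_fly, the column list, and the key are all hypothetical.

```python
import csv
import hashlib
import hmac


def mask_value(key: bytes, value: str) -> str:
    """Hypothetical masking rule: a keyed hash, so equal inputs give equal outputs."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_file_on_the_fly(src_path, dst_path, sensitive_columns, key):
    """Read rows from file A and write masked rows to file B (A is never updated).

    Returns the number of data rows written.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        rows = 0
        for row in reader:
            # Only the listed columns are masked; everything else is copied as-is.
            for col in sensitive_columns:
                row[col] = mask_value(key, row[col])
            writer.writerow(row)
            rows += 1
    return rows
```

    The point of the sketch is only the shape of the job: a sequential read and a sequential write, with no updates against the source, which is the access pattern these platforms are good at.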



    ------------------------------
    Tino Pironti
    Masking SME
    Technical Manager
    Delphix
    ------------------------------



  • 3.  RE: Delphix support for "Delta Lake" (delta.io)

    Posted 07-20-2023 03:22:00 AM

    Hi Tino, 
    Thank you for the response.
    Some Delphix materials suggest that masking could work better on the original data sources, prior to loading into any data lake.
    What is your opinion on that?



    ------------------------------
    Arkadiusz Slawek
    Senior Solution Delivery Manager
    Credit Suisse AG
    ------------------------------



  • 4.  RE: Delphix support for "Delta Lake" (delta.io)

    Posted 07-24-2023 02:28:00 AM

    Yes, some customers take this approach. For certain rules of privacy legislation it is even mandatory (right to be forgotten), as records must be removed or masked after a certain period of inactivity. The process of getting data from the various applications into the data lake is already an ETL process, so adding masking to it isn't a big effort. Obviously, the algorithms used for masking should be deterministic and preserve uniqueness, so that the same value is masked identically in every source and keys still join after masking.
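
    As an illustration of what "deterministic and preserving uniqueness" can mean for a numeric key, here is a minimal Python sketch. It is not a Delphix algorithm. It assumes the IDs fit in 32 bits and uses a small keyed Feistel permutation, which is a bijection on that range, so two different inputs can never mask to the same output. The function name mask_id, the key, and the constants are hypothetical, and the construction is for illustration only, not a vetted cipher.

```python
import hashlib
import hmac

HALF_BITS = 16                    # each Feistel half is 16 bits, so the domain is 32-bit integers
HALF_MASK = (1 << HALF_BITS) - 1
ROUNDS = 4                        # illustrative choice, not a security recommendation


def _round(key: bytes, round_no: int, half: int) -> int:
    """Keyed round function: HMAC-SHA256 truncated to 16 bits."""
    msg = bytes([round_no]) + half.to_bytes(2, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "big")


def mask_id(key: bytes, value: int) -> int:
    """Map a 32-bit ID to another 32-bit ID.

    Deterministic: the same key and value always give the same result.
    Unique: a Feistel network is a permutation, so distinct inputs give distinct outputs.
    """
    if not 0 <= value < (1 << (2 * HALF_BITS)):
        raise ValueError("value outside the 32-bit domain")
    left, right = value >> HALF_BITS, value & HALF_MASK
    for r in range(ROUNDS):
        left, right = right, left ^ _round(key, r, right)
    return (left << HALF_BITS) | right
```

    Because the mapping depends only on the key and the input, the same customer ID masked in two different source systems yields the same masked ID, which is what keeps joins intact once the data lands in the lake.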



    ------------------------------
    Tino Pironti
    Masking SME
    Technical Manager
    Delphix
    ------------------------------