Delphix Products

 View Only
Expand all | Collapse all

I want to mask a Unique Identifier column and which is the best algorithm for this?

Mayank Ahluwalia

Mayank Ahluwalia04-01-2019 08:37:00 AM

  • 1.  I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 03-29-2019 07:34:00 AM
    I want to mask a column whose datatype is Unique Identifier (Not Null) but have repeatable values and the data is in the format of "586A3557-2243-4C06-BABE-004F2810BD2C" . I thought of trying SM algo but maximum length of each segment is only 4 in this algo. Any suggestions please!

    Kind Regards,
    Rajesh
    #Masking


  • 2.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?
    Best Answer

    Posted 03-29-2019 07:42:00 AM
    Hi Rajesh,

    The Segmented Mapping would fit perfectly for you as long as the total length of data never exceeds 36 characters.

    Looking at the example you gave, you can create 8 segments of length 4 each, alphanumeric and ignore the character '-'.

    Regards,
    Mayank


  • 3.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?
    Best Answer

    Posted 03-29-2019 12:52:00 PM
    Similar to the above scenario i have data which of data type varbinary - sample date "0x01000000C6934392AC8527F35257EDD0A414ADB9C58B6224B785FB1BD33694D4AE89797B" which would be the best algorithm.


  • 4.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?
    Best Answer

    Posted 03-29-2019 02:18:00 PM
    Hi Jayashree,

    I think the best one for your case will be the "Binary Lookup Algorithm".

    Regards,

    Mouhssine


  • 5.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 03-29-2019 02:23:00 PM
    Thanks Mouhssine. I missed mentioning in my earlier post that the column also has a PK/FK constraint attached. Please suggest.

    Thanks,
    Jayashree


  • 6.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 03-29-2019 08:12:00 PM
    If you need unique values post masking, one solution i can think of is to cast varbinary into varchar...masking with Segmented mapping and then update the masked varchar values by casting them back into varbinary


  • 7.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 03-30-2019 12:47:00 PM
    Thanks Mayank.

    Segmented Mapping can take only upto 9 segments, but the length of the above goes to more than that.
    Also once we convert varbinary to varchar, we are facing issue converting it back to varbinary. Can you please suggest if Tokenization or Mapping Algorithm would be a better choice. Your thoughts please.

    Thanks,
    Jayashree 


  • 8.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 03-30-2019 01:42:00 PM
    Hi Jayashree,

    Sorry for the delay.

    You still can use Secure lookup or mapping algorithm to mask the PK/FK column, with this requirement.

    - nomber of unique values in the lookup/mapping file have to be superior or at less equal to the number of rows in the column to be masked.

    - you will have to optimize the job regarding the size of SL file and number of rows to mask.

    I prefer the usage of SL instead of mapping algorithm.

    Regards,

    Mouhssine


  • 9.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?
    Best Answer

    Posted 03-30-2019 02:37:00 PM
    Thanks Mouhssine.

    Just to confirm we have about 38643560 rows and we have about 45000 unique rows, so i should be good creating a SL file with about 45000 or 50000 rows, which will make it unique and serve my purpose of FK/PK right?

    Sorry I am a beginner, asking so many doubts.

    Thanks,
    Jayashree




  • 10.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?
    Best Answer

    Posted 03-30-2019 03:10:00 PM
    Hi jayashree, Don’t mind we are here to help if we can :). Yes, I confirm your purpose if you create a file with 50000 it could be great this will let you 5000 more values in case of your database evolution. Regards, Mouhssine


  • 11.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 04-01-2019 07:24:00 AM
    Hi Mouhssine,

    My understanding of Secured Lookup Algorithm is that while it is deterministic and can preserve Referential Integrity, it may or may not provide uniqueness.

    So, in the above case, even if we create a lookup file with # of values > # of unique values in the column, does the Algorithm implementation guarantee uniqueness ?

    Regards,
    Mayank


  • 12.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?
    Best Answer

    Posted 04-01-2019 08:01:00 AM
    Hi Mayank,

    SL calculate a hash value with modelus so if we imagine the column values are unique and the same in SL file with the condition of having # of values > to number of values in the column we can still some how guarantee the uniqueness generated of masked value. (This mimics mapping algorithm)

    Mapping algorith guarantees a 1 to 1 mapping and works like SL, so this said SL/MAPPING could answer the need in ower case

    Regards,

    Mouhssine



  • 13.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 04-01-2019 08:37:00 AM
    Thank you for the explanation :)


  • 14.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 04-01-2019 12:14:00 PM
    Thanks for detailed explanation Mouhssine. But i am facing the below issues.

    1) I have a varbinary column varbinary(256). I tried secure Lookup but it is failing with the error
    Implicit conversion from nvarchar to varbinary is not allowed, use the convert function.

    2) I decided to try using Binary Lookup as mentioned, for which the lookup file has to be of the format xml, jpg or pdf. I created the lookup with a pdf lookup file. But this too is failing with the error. I even tried increasing the column width to varbinary(500) 

    Data Truncation.

    Data sample: 0x01000000FFFDFB622D17E4F4891B97D2785F79EA6D22B96FA45C9E07D5AFF3A2D25E6497


    Please suggest.

    Thanks,
    Jayashree


  • 15.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 04-01-2019 12:26:00 PM
    Hi Jayashree,

    I'm not masterd on SQL Server, but got the same behavior on postgres where i created an implicite cast that converts my PK colum of type uuid to varchar in order to be masked.

    You may have to look on (CAST / CONVERT) you colum to allow the masking process.

    P.S : what is the content of this colum is it a pic, pdf ....?

    Regards,

    Mouhssine


  • 16.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?

    Posted 04-01-2019 12:30:00 PM
    Hi Mouhssine,

    It is not pic or pdf. It contains a binary data sample data below. DataType is varbinary(256)

    0x01000000FFFDFB622D17E4F4891B97D2785F79EA6D22B96FA45C9E07D5AFF3A2D25E6497

    Thanks,
    Jayashree


  • 17.  RE: I want to mask a Unique Identifier column and which is the best algorithm for this?
    Best Answer

    Posted 04-01-2019 12:32:00 PM
    Hi,


    OK, so you have to look on (CAST / CONVERT) between varbinary and nvarchar to get the column masked using the SL algorithm.

    Regards,

    Mouhssine