Delphix Products


What is the best algorithm for Unique values

  • 1.  What is the best algorithm for Unique values

    Posted 07-13-2020 08:36:00 AM

    Hi Team,


    Currently we have a requirement to mask unique values in a PK column. We tried the algorithms provided by Delphix, but they did not serve the purpose: the algorithm does not produce unique values in the column, so the data masking job fails.

    We tried the NAME_TK algorithm, but it introduces special characters, which caused application functionality to fail.

    I created a Segment Mapping algorithm, but it has the constraint that it cannot mask more than 36 characters.


    Is there any other solution or algorithm that guarantees uniqueness, introduces no special characters, and can mask values longer than 36 characters? Alphanumeric output is fine.

    Thanks in advance.

    Ashok Kumar Athuluri
    Associate Engineer

  • 2.  RE: What is the best algorithm for Unique values

    Posted 07-13-2020 09:53:00 AM

    Hi Ashok,


    There are a lot of out-of-the-box algorithms that can solve your problem. I would suggest checking the following documentation:


    General documentation

    There is a table that describes our algorithms in scary detail (especially around tokenization).

    Date Shift

    Secure Lookup


    Credit Card

    Zip 4

    Secure Shuffle



    Rahim Cetinel
    Solution Architect | Delphix Blackbelt

  • 3.  RE: What is the best algorithm for Unique values

    Posted 07-14-2020 02:15:00 AM
    To mask a PK column you cannot use the following algorithms:
    - SL (Secure Lookup): SL does not preserve uniqueness.
    - Tokenisation: restricted by whether the length of the resulting value fits into the column.
    - Secure Shuffle: if the PK value itself contains sensitive information you should not use it, because shuffle only "shuffles rows" and does not obfuscate the values.

    What is required is an algorithm that is deterministic and preserves uniqueness, such as:
    - SM (Segment Mapping): use alphanumeric segments; restricted to 36 characters in length and requires an ignore-character list.
    - Mapping: create a list of distinct values whose count is larger than the distinct count of values in the column; restriction: run only one instance.
    - SM_UNI: custom algorithm, no length restriction, auto-configuring; configure the variables inputValue / outputValue (attached).
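
    To illustrate why a one-to-one mapping list preserves uniqueness, here is a minimal Python sketch of the idea. It is not the Delphix Mapping implementation; the function name build_mapping, the seed parameter, and the sample values are all hypothetical. It only shows that assigning each distinct input its own replacement, from a pool at least as large as the distinct count, cannot create duplicates.

```python
import random


def build_mapping(column_values, replacement_pool, seed=0):
    """Assign every distinct input value its own replacement value.

    Hypothetical helper for illustration only. The result is deterministic
    for a given seed because inputs and pool are sorted before shuffling.
    """
    distinct_inputs = sorted(set(column_values))
    pool = sorted(set(replacement_pool))

    # The pool must hold at least as many distinct values as the column,
    # otherwise two inputs would have to share one output.
    if len(pool) < len(distinct_inputs):
        raise ValueError("replacement pool is smaller than the distinct count")

    random.Random(seed).shuffle(pool)
    # zip pairs each distinct input with a different pool entry (one-to-one).
    return dict(zip(distinct_inputs, pool))


if __name__ == "__main__":
    pk_values = ["A100", "B200", "A100", "C300"]
    pool = ["K001", "K002", "K003", "K004"]
    mapping = build_mapping(pk_values, pool)
    masked = [mapping[v] for v in pk_values]
    # Equal inputs stay equal, different inputs stay different.
    print(masked)
```

    Because the mapping is built once from the full list of distinct values, this approach fits the "run only one instance" restriction mentioned above: two independent runs with different inputs could hand out the same replacement twice.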

    Tino Pironti
    Technical Services

  • 4.  RE: What is the best algorithm for Unique values

    Posted 07-14-2020 03:49:00 AM
    Hi Tino,

    Thanks for the detailed explanation, but the SM_UNI algorithm masks only numbers, not letters.

    The algorithm should mask both numbers and letters, irrespective of string length, and maintain uniqueness.

    Is there such an algorithm, or can we make a modification that satisfies the above criteria?

    Siddharth Jain
    Senior Support Analyst
    Delphix Community Members

  • 5.  RE: What is the best algorithm for Unique values

    Posted 07-14-2020 04:12:00 AM
    Your statement is false. SM_UNI does support alphanumeric values, and in addition to what standard SM supports it handles accented characters, Cyrillic, and any special characters. There is one option that restricts it to masking only numerics, but it is not active by default. Use the algorithm as it was attached.

    Tino Pironti
    Technical Services

  • 6.  RE: What is the best algorithm for Unique values
    Best Answer

    Posted 07-14-2020 04:19:00 AM
    Standard SM_UNI :


    Custom algorithm configuration:
    Input: inputValue
    Output: outputValue

    Explanation of options (editable in MASK step as above):

    - numeric_only true/false default=false
    If enabled, only the digits within the given string are masked. When masking data that contains only numerics, enabling this option is advised because it improves performance. For decimal types such as float, enabling the option is required.

    - leading0 true/false default=false
    If enabled, any leading consecutive zeros are ignored and masking starts with the first non-zero character. The option can also be used on NUMERIC columns to prevent the first masked character from becoming 0, which preserves uniqueness (explanation: 0111 is the same as 111, so the first character should never be masked to 0).

    - keep_first=N N=numeric value default=0
    If keep_first is set to 6, masking ignores the first 6 characters and starts with the 7th character.

    - keep_last=N N=numeric value default=0
    If keep_last is set to 6, masking ignores the last 6 characters.

    - filter_char true/false default=false
    If enabled, the input is filtered and any accented characters are replaced with the corresponding base character: é > e

    All options can be used in combination following the logic:


    The options are editable inside the algorithm in plain text.

    - Deterministic (!)
    - not limited to 36 characters like standard SM
    - preserves length
    - preserves case
    - preserves uniqueness
    - position specific logic
    - supports 0-9, a-z, A-Z, accented latin characters and cyrillic characters
    - characters not matching the supported character list are preserved (-_,=@$%£'"{}[]()™‹<> etc)
    - preserves punctuation / format
    - supports different styles of NULL / empty string (for example for Oracle type NUMBER)
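
    The properties listed above can be illustrated with a small position-specific substitution sketch in Python. This is not the SM_UNI source, which is not shown in this thread; the function mask, the key parameter, and the helper _permutation are assumptions made for illustration. Only the option names keep_first, keep_last and numeric_only come from the description above, and leading0, filter_char, and accented or Cyrillic characters are deliberately left out. The idea is that each position gets its own deterministic permutation per character class, so different inputs can never collide.

```python
import hashlib
import random
import string

# Character classes that get masked; anything else is passed through unchanged.
CLASSES = (string.digits, string.ascii_lowercase, string.ascii_uppercase)


def _permutation(alphabet, key, position):
    """Deterministic one-to-one map of an alphabet for one (key, position)."""
    digest = hashlib.sha256(f"{key}:{position}:{alphabet}".encode()).digest()
    shuffled = list(alphabet)
    random.Random(int.from_bytes(digest, "big")).shuffle(shuffled)
    return dict(zip(alphabet, shuffled))


def mask(value, key="demo-key", keep_first=0, keep_last=0, numeric_only=False):
    """Mask a string while preserving length, case, format and uniqueness."""
    if value is None or value == "":
        return value  # NULL and empty string are returned as-is
    end = len(value) - keep_last
    out = []
    for pos, ch in enumerate(value):
        if pos < keep_first or pos >= end:
            out.append(ch)  # inside the kept prefix or suffix
            continue
        for alphabet in CLASSES:
            is_digits = alphabet == string.digits
            if ch in alphabet and (is_digits or not numeric_only):
                # Same class in, same class out: digits stay digits,
                # upper case stays upper case.
                out.append(_permutation(alphabet, key, pos)[ch])
                break
        else:
            out.append(ch)  # unsupported character: preserved
    return "".join(out)


if __name__ == "__main__":
    for pk in ["AB-1234-xy", "AB-1235-xy", "0042"]:
        print(pk, "->", mask(pk, keep_first=2))
```

    Uniqueness follows from the construction: two different inputs of the same length differ in at least one position, and at that position the mapping is either a one-to-one permutation within a character class or the identity, so the outputs differ there too.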

    Tino Pironti
    Technical Services

  • 7.  RE: What is the best algorithm for Unique values

    Posted 07-14-2020 09:26:00 PM
    Edited by Siddharth Jain 07-14-2020 09:27:29 PM

    I tried to create a new algorithm from your attached custom algorithm, but it was not able to mask the values.
    Am I missing something I need to edit in the algorithm?

    Siddharth Jain
    Senior Support Analyst