You only need to create the empty schema in the database; Delphix creates the schema objects (tables, procedures, etc.).
Then upload the mapping values in the Delphix UI under Algorithms / your_algorithm / manage mappings. They are encrypted and saved in the table created by Delphix. The UI tells you how many unique values exist (duplicates are removed).
If you have 1 million unique values to mask, we suggest uploading 10% more values in "manage mappings".
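As a rough illustration of the sizing advice above, here is a minimal Python sketch. The function names and the integer `headroom_percent` parameter are hypothetical helpers for this example only, not part of any Delphix API. It counts distinct values the way a de-duplicating upload would and adds 10% headroom.

```python
def unique_count(values):
    """Count distinct values; duplicates collapse to one entry."""
    return len(set(values))


def recommended_mapping_count(unique_inputs, headroom_percent=10):
    """Unique inputs plus a safety margin, rounded up.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    extra = (unique_inputs * headroom_percent + 99) // 100  # ceiling division
    return unique_inputs + extra


if __name__ == "__main__":
    sample = ["1001", "1002", "1001", "1003"]
    print(unique_count(sample))
    print(recommended_mapping_count(1_000_000))
```

For 1 million unique inputs this suggests uploading 1,100,000 mapping values.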
This e-mail may contain information that is privileged or confidential. If you are not the intended recipient, please delete the e-mail and any attachments and notify us immediately.
Original Message:
Sent: 8/23/2024 9:18:00 AM
From: Horacio Rosende
Subject: RE: Scaling Limits and Resource Allocation for Mapping Algorithm Jobs
Hi Tino.
This message is just to keep the conversation thread complete.
Following your advice, we created a database in our lab with its own schema. This schema contains a table, named "export", which has two fields (input and output). We then populated this table with 20 million records.
Subsequently, we created our own algorithm using the mapping framework with the external database connection. This process completed without any issues. However, when running the masking job, we received the following error message: algorithm output: Failed to map value. No free mappings are available
Additionally, we noticed that Delphix automatically created two additional tables named mapping_algorithm_metadata and mappings in the database.
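For readers unfamiliar with pool-based mapping, the toy Python model below shows how an error of this kind can arise in general: each new input consumes one unused output, and once the pool is empty no further new inputs can be mapped. This is only an assumption-laden sketch; the class and exception names are hypothetical, it is not the Delphix implementation, and it is not a diagnosis of the failure described here.

```python
from collections import deque


class MappingPoolExhausted(Exception):
    """Raised when a new input arrives and no unused output remains."""


class ToyMappingPool:
    def __init__(self, outputs):
        # De-duplicate outputs while keeping their order.
        self._free = deque(dict.fromkeys(outputs))
        self._assigned = {}

    def map(self, value):
        # Repeated inputs always get the output assigned the first time.
        if value in self._assigned:
            return self._assigned[value]
        if not self._free:
            raise MappingPoolExhausted("No free mappings are available")
        out = self._free.popleft()
        self._assigned[value] = out
        return out


if __name__ == "__main__":
    pool = ToyMappingPool(["A", "B", "A"])  # only two unique outputs
    print(pool.map("x"), pool.map("y"), pool.map("x"))
    try:
        pool.map("z")  # third distinct input, pool is empty
    except MappingPoolExhausted as err:
        print(err)
```

In this model, a pool that is empty or smaller than the number of distinct inputs fails on the first input it cannot serve.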
We opened a support ticket. When it's solved, I'll report back to complete the thread.
Thanks!
------------------------------
Horacio Rosende
Innovation Manager
Grupo Net S.A.
------------------------------
Original Message:
Sent: 08-22-2024 11:22:37 AM
From: Horacio Rosende
Subject: Scaling Limits and Resource Allocation for Mapping Algorithm Jobs
Hello Tino,
Excellent news! We'll try it right now.
The Mapping Algorithm file seems to have a record limit for the "index file." We were able to load a 2M record CSV file without any issues, but we received this message with a 20M record file. You can see the attached images.
Should I open a support ticket to find out what the maximum record amount is? It would be nice if this max value were included in the algorithm documentation.
As soon as I have the time, we'll try this use case with a database connection with 20M records.
Thanks again!
------------------------------
Horacio Rosende
Innovation Manager
Grupo Net S.A.
Original Message:
Sent: 08-22-2024 10:49:35 AM
From: Tino Pironti
Subject: Scaling Limits and Resource Allocation for Mapping Algorithm Jobs
CharacterMapping is deterministic, i.e. for a given input value it gives you the same result on every execution. For your use case I can't see any reason to use mapping instead of CharacterMapping.
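To make "deterministic" and "preserves uniqueness" concrete, here is a toy Python sketch of a keyed, position-dependent character substitution. It is an illustration only and does not reproduce the actual dlpx-core CharacterMapping algorithm; `toy_char_map`, the `key` parameter, and the digit-only alphabet are assumptions made for this example.

```python
import random
import string


def toy_char_map(value, key="demo-key", alphabet=string.digits):
    """Replace each character using a per-position permutation of the alphabet.

    The permutation depends only on the key and the position, so the same
    input always yields the same output. Each permutation is a bijection,
    so two different inputs can never produce the same output.
    """
    out = []
    for pos, ch in enumerate(value):
        if ch not in alphabet:
            out.append(ch)  # characters outside the alphabet pass through
            continue
        perm = list(alphabet)
        random.Random(f"{key}:{pos}").shuffle(perm)  # seeded, repeatable shuffle
        out.append(perm[alphabet.index(ch)])
    return "".join(out)


if __name__ == "__main__":
    ids = [str(i).zfill(6) for i in range(1000)]
    masked = [toy_char_map(i) for i in ids]
    print(toy_char_map("000123") == toy_char_map("000123"))  # same input, same output
    print(len(set(masked)) == len(ids))  # no collisions among distinct inputs
```

Unlike a lookup-based mapping, this approach needs no stored table of pairs, which is why it scales independently of the number of values.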
Original Message:
Sent: 8/21/2024 3:00:00 PM
From: Horacio Rosende
Subject: RE: Scaling Limits and Resource Allocation for Mapping Algorithm Jobs
Hi Tino,
Thanks for the advice regarding the external database. Customer IDs are completely different; it's not a unique value. We'll use only one mapping algorithm for all CSV files.
We also evaluated tokenizing this data field. As far as we understand, since the tokenization algorithm is based on AES-128 encryption in CBC-CTS mode, we should always get the same result with an initialization vector set to 0, and collisions should be very rare. However, the customer prefers masking over tokenized strings.
Thanks again!
------------------------------
Horacio Rosende
Innovation Manager
Grupo Net S.A.
Original Message:
Sent: 08-21-2024 01:06:38 AM
From: Tino Pironti
Subject: Scaling Limits and Resource Allocation for Mapping Algorithm Jobs
The mapping algorithm has no fixed limit on the number of values. If you have only one mapping algorithm with 10 million records, it should work fine. If you have multiple different mapping algorithms, I would suggest using the external DB option with one dedicated schema per algorithm.
However, if the value you are masking is unique, I would suggest using dlpx-core:CM Alpha-Numeric for string columns and dlpx-core:CM Numeric for numeric columns, because CharacterMapping (CM) preserves uniqueness.
------------------------------
Tino Pironti
Masking SME
Technical Manager
Delphix
Original Message:
Sent: 08-20-2024 11:23:14 AM
From: Horacio Rosende
Subject: Scaling Limits and Resource Allocation for Mapping Algorithm Jobs
Hello everyone,
We are assisting a client in using the Mapping algorithm for a masking job, where they need to mask customer IDs and always obtain the same result. The required file with the pairs would have approximately 10 million records. The client plans to execute 60 jobs with CSV files of approximately 10M records and about 150 data fields.
We would like to know:
- What is the maximum number of records that the mapping file can reach?
- What should be the correct amount of memory to allocate for each job?
Thank you very much in advance.
------------------------------
Horacio GrupoNet
Innovation Manager
Grupo Net S.A.
------------------------------