Best setting to mask giant tables?

  • Question
  • Updated 5 months ago
  • Acknowledged
I am trying to mask tables with more than 1 million rows. I have tried several configurations, but the masking is very slow. Are there recommended settings for streams and threads? I tried using a Logical Key (LK), but there were no gains.

Rodrigo Pelucio


Posted 1 year ago


Gianpiero Piccolo

Hi Rodrigo,

Tables containing 1 or 2 million rows are not giant.
We are masking Oracle tables with hundreds of millions of rows. The mean speed is roughly 500k rows per minute.
What is your speed, and what is your database? Did you disable triggers and indexes?

Regards.
Gianpiero

Rodrigo Pelucio

Hi, Gianpiero
The database I am masking does not contain constraints or triggers.

The speed is low: 3,840 rows per minute.
B-tree indexes exist on the column I'm masking, and I'm using a credit card algorithm from Delphix itself.


Thanks

Gianpiero Piccolo

Try dropping the index, then mask, then re-create it.
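
To illustrate the drop, mask, re-create sequence, here is a minimal sketch. It makes several assumptions: it runs against an in-memory SQLite database rather than Oracle, the table `cards`, the index `idx_card_number`, and the helper `toy_mask` are hypothetical, and `toy_mask` is only a stand-in, not the Delphix credit card algorithm. The point is the ordering: the index is not maintained row by row during the bulk update, and is rebuilt once at the end.

```python
import sqlite3


def toy_mask(card_number):
    # Hypothetical placeholder mask (NOT the Delphix algorithm):
    # hide everything except the last four characters.
    return "*" * (len(card_number) - 4) + card_number[-4:]


def mask_with_index_rebuild(conn, table, column, index_name, mask_fn):
    """Drop the index, mask the column, then re-create the index."""
    cur = conn.cursor()
    # 1. Drop the index so the bulk update does not maintain it per row.
    cur.execute(f"DROP INDEX IF EXISTS {index_name}")
    # 2. Mask every value in the column.
    rows = cur.execute(f"SELECT rowid, {column} FROM {table}").fetchall()
    cur.executemany(
        f"UPDATE {table} SET {column} = ? WHERE rowid = ?",
        [(mask_fn(value), rowid) for rowid, value in rows],
    )
    # 3. Re-create the index once, over the already-masked data.
    cur.execute(f"CREATE INDEX {index_name} ON {table} ({column})")
    conn.commit()


# Small demo on an in-memory database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (id INTEGER PRIMARY KEY, card_number TEXT)")
conn.execute("CREATE INDEX idx_card_number ON cards (card_number)")
conn.executemany(
    "INSERT INTO cards (card_number) VALUES (?)",
    [("4111111111111111",), ("5500000000000004",)],
)
mask_with_index_rebuild(conn, "cards", "card_number", "idx_card_number", toy_mask)
masked = [row[0] for row in conn.execute("SELECT card_number FROM cards ORDER BY id")]
```

On a real Oracle database the same three steps would be issued as DDL around the masking job; how much time this saves depends on the table and index.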

Mouhssine SAIDI

Hi,

Is it a partitioned table? How many streams and threads did you configure for the job, and what memory configuration did you set for it?

Regards,

Mouhssine

Anita Dighe

Hi,

I am facing the same issue with a partitioned table, though the table size is 40 million. Inserting data into the destination is extremely slow.

Mouhssine SAIDI


Hi Anita,

My proposal for optimizing masking of your partitioned tables is to exploit the Oracle partition key.

The idea is to create one job per partition (using the partition key as the subset condition at job creation) and run them in parallel using script hooks, or the masking API if you are on 5.2.

I hope that's clear.
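
The one-job-per-partition idea can be sketched roughly as follows. This is an illustration only: `run_masking_job` is a hypothetical placeholder for whatever actually starts a job (a script hook or a masking API call), and the key column `REGION` and its values are made-up examples. No real Delphix API call is shown.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_subset_condition(key_column, partition_value):
    # Build the subset condition that restricts a job to one partition.
    return f"{key_column} = '{partition_value}'"


def run_masking_job(condition):
    # Hypothetical placeholder: a real version would start the masking job
    # created with this subset condition and wait for it to finish.
    return f"masked rows where {condition}"


def run_partition_jobs(key_column, partition_values, max_parallel=4):
    """Run one job per partition, at most max_parallel at a time."""
    conditions = [
        partition_subset_condition(key_column, value) for value in partition_values
    ]
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map returns results in the same order as the input conditions.
        return list(pool.map(run_masking_job, conditions))


# Example with an assumed partition key and three assumed partitions.
results = run_partition_jobs("REGION", ["EMEA", "APAC", "AMER"])
```

The `max_parallel` limit is there because running too many jobs at once can simply move the bottleneck to the database or the masking engine.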

Regards,

Mouhssine


Gianpiero Piccolo

Do not forget that the speed depends not only on the time spent on updates, but also on fetching data from the DB. If you have very wide tables (many columns), keep in mind that data from all columns has to be transferred from the DB server to the masking engine. To reduce this network traffic, you could use a custom SQL query to fetch only the columns that have to be masked plus the logical key (or the primary key if you didn't define a Logical Key).
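
As a rough sketch of what such a custom SQL query looks like, the helper below builds a SELECT that lists only the key columns and the columns to be masked. The function `build_fetch_sql` and the table and column names (`CUSTOMERS`, `CUSTOMER_ID`, `CARD_NUMBER`) are hypothetical, chosen only for illustration.

```python
def build_fetch_sql(table, key_columns, masked_columns):
    """Build a SELECT that fetches only the key and the masked columns."""
    selected = []
    # Key columns first, then masked columns, skipping any duplicates.
    for column in list(key_columns) + list(masked_columns):
        if column not in selected:
            selected.append(column)
    return f"SELECT {', '.join(selected)} FROM {table}"


# Assumed example: a wide CUSTOMERS table where only CARD_NUMBER is masked.
custom_sql = build_fetch_sql("CUSTOMERS", ["CUSTOMER_ID"], ["CARD_NUMBER"])
```

Compared with fetching every column, this keeps the data sent to the masking engine down to what the job actually needs.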

Regards.
Gianpiero