Salut Moussine and thanks for that.
Our client understands that having "clean" data is prerequisite for masking these columns, but is reluctant to commit resources to clean the source data. They do not want the masking job itself to clean the data before masking as that will disguise where they have a problem with incorrectly formatted data in the source database. The hope was that we could at least pin-point where the masking fails, which would give them an indication of exactly where the source data needs to be fixed.
It strikes us as an obvious exception that Delphix Masking should be capturing information about to include in the masking job log. Unfortunately it does not log exceptions at this level of granularity and so we had wondered if anyone in the community had any workarounds to capture the information from the masking job or its output.
What we have done is to amend the selection criteria for the ruleset to include a binary test that verifies the format of the data and its content. If the test is passed the column is masked in segments as expected. In a separate ruleset, if the same test is failed, we are masking with a default text string.
This stops potential leakage of unmasked values, at the same time providing a fairly straightforward means to pinpoint the source data which actually needs to be fixed. The downside is that it does mean scanning the same table.column two times in order to handle the two different masking outcomes as well as having to run a post masking script to report the rows that have columns masked with the default string.
Cheers,
Chris