Search for exact match doesn't produce results

  • Problem
  • Updated 2 years ago
  • Solved

Hello,

If I want to profile some tables by searching for an exact match with this regular expression

\b[C][o][l][o][m][b][o]\b

the job completes without errors but doesn't match anything at the data level.

But if I query the data directly with that same regular expression, it returns some rows.

Luigi

luigidep

Posted 2 years ago

Mouhssine SAIDI

Hi luigidep,

In my opinion, it's a syntax issue.

You're looking to profile the value "Colombo", so I suggest changing the regexp to something like

([C][o][l][o][m][b][o])

You can find some good samples here https://docs.delphix.com/pages/viewpa...

Regards,

Mouhssine
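To see how the two expressions in this exchange differ, here is a minimal Python sketch. It assumes Python's `re` module, which may not behave exactly like the regex engine used by the Delphix profiler, and the sample column values are invented for illustration.

```python
import re

# The two patterns quoted in the thread.
bounded = re.compile(r"\b[C][o][l][o][m][b][o]\b")
unbounded = re.compile(r"([C][o][l][o][m][b][o])")

# Hypothetical column values, invented for illustration.
samples = ["Colombo", "Luigi Colombo", "Colombo1", "MrColombo", "colombo"]


def matches(pattern, values):
    """Return the values in which the pattern is found anywhere."""
    return [v for v in values if pattern.search(v)]


bounded_hits = matches(bounded, samples)
unbounded_hits = matches(unbounded, samples)

print(bounded_hits)    # only values where "Colombo" stands as a whole word
print(unbounded_hits)  # also values where "Colombo" is embedded in a longer token
```

In this sketch, `\b` rejects values such as "Colombo1" and "MrColombo", where the name is attached to other word characters, while the version without `\b` accepts them. Both patterns are case-sensitive, so neither matches "colombo".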
luigidep

Judging by this warning, I would suppose that when Delphix searches at the data level it examines, by default, only a subset of the rows in each table. If Delphix finds no match within, for example, the first thousand rows of a table, it infers that the expression doesn't match. This happens when the distribution of first name and last name values in the columns is skewed rather than uniform, so we have no choice but to profile those columns manually.
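The sampling hypothesis in this post can be sketched in Python. This is only a model of the supposed behaviour, not Delphix's actual implementation: the `profile_column` function, its `sample_rows` parameter, and the data are all hypothetical, and the 1,000-row limit is just the example figure from the post.

```python
import re

pattern = re.compile(r"([C][o][l][o][m][b][o])")


def profile_column(values, pattern, sample_rows=None):
    """Return True if any examined row matches the pattern.

    sample_rows=None scans every row; otherwise only the first
    sample_rows rows are examined (hypothetical sampling model).
    """
    examined = values if sample_rows is None else values[:sample_rows]
    return any(pattern.search(v) for v in examined)


# Skewed hypothetical data: 5,000 non-matching rows, then the matching ones.
column = ["Rossi"] * 5000 + ["Colombo"] * 10

sampled = profile_column(column, pattern, sample_rows=1000)
full = profile_column(column, pattern)

print(sampled, full)
```

Under these assumptions the sampled run reports no match because every "Colombo" row lies beyond the first 1,000 rows, while the full scan finds one.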
luigidep

Hi Mouhssine,

Even if I use that syntax to catch the name, the Delphix profiler doesn't recognize it.

Nor does it if I edit the rule set with an SQL filter that matches only the one row containing that string.

The cause is probably the skewed distribution of the name values in the column, or perhaps the relative position of the name within the column field, which would require a multiline-matching regular expression.

It is also strange, in general, that a profiling job can return succeeded without producing results, as in this case, or return failed but with the profiling report compiled.


Luigi

luigidep

Hello,

I have solved my problem.

The failed job status was caused by setting the number of streams dedicated to the profiling job to twenty.

I have now changed it to two, and the jobs succeed.

The search for an exact match also produces results.

Thank you for your attention.

I would like to know the value of the NO_OF_ROWS parameter, which Kettle reads to determine the subset of rows to profile for each table in the rule set. This parameter could affect the profiling job.

Luigi

Gianpiero Piccolo

I'm interested in NO_OF_ROWS too. What is the default number and how can one modify it?
Thank you
Gianpiero
Mouhssine SAIDI

Hi,

You will have to contact support, who will help you with that. I think they will provide and load a new masking configuration file to define this.

Regards,

Mouhssine
Mouhssine SAIDI

Hi,

Glad that it's working now.

Just for my own information, which matching regexp did you use: yours, or the one I provided?

Regards,

Mouhssine
luigidep

Hello, I used this expression:

([cC][oO][lL][oO][mM][bB][oO]|[fF][eE][rR][rR][aA][rR][iI]|[rR][oO][sS][sS][iI]|[bB][iI][aA][nN][cC][hH][iI]|[sS][aA][lL][aA]|[vV][iI][lL][lL][aA]|[cC][aA][tT][tT][aA][nN][eE][oO]|[bB][rR][aA][mM][bB][iI][lL][lL][aA]|[rR][iI][vV][aA]|[fF][uU][mM][aA][gG][aA][lL][lL][iI]|[gG][aA][lL][lL][iI]|[lL][oO][cC][aA][tT][eE][lL][lL][iI]|[pP][oO][zZ][zZ][iI]|[mM][aA][rR][iI][aA][nN][iI]|[rR][oO][tT][aA]|[gG][aA][tT][tT][iI]|[bB][eE][rR][eE][tT][tT][aA]|[bB][aA][rR][bB][iI][eE][rR][iI]|[pP][aA][gG][aA][nN][iI]|[fF][eE][rR][rR][aA][rR][iI][oO])


Regards,

  Luigi
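An expression like the one in this post is tedious to type by hand. As a minimal sketch, assuming ASCII letter-only names and Python's `re` module rather than the profiler's own regex engine, the same bracketed form can be generated from a plain list of names. `bracket_pattern` is a hypothetical helper, and only the first three names from the post are used.

```python
import re


def bracket_pattern(names):
    """Build ([cC][oO]...|...) from plain names, one character class per letter."""
    alternatives = (
        "".join(f"[{ch.lower()}{ch.upper()}]" for ch in name)
        for name in names
    )
    return "(" + "|".join(alternatives) + ")"


names = ["colombo", "ferrari", "rossi"]  # first three names from the post
pattern = bracket_pattern(names)
print(pattern)

# The generated pattern matches any mix of upper and lower case.
print(bool(re.search(pattern, "COLOMBO")))
print(bool(re.search(pattern, "Bianchi")))  # not in this shortened list
```

Where a regex engine supports a case-insensitive flag, `(colombo|ferrari|rossi)` with that flag would match the same strings. Whether the Delphix profiler accepts such a flag is not stated in this thread.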

Mouhssine SAIDI

Hi,

Okay, so there is no \b, and the expression is a variant of the one I provided. Great.

Regards,

Mouhssine