Replication best practices?

  • Question
  • Updated 2 years ago
  • Answered

Hi all, quick question about replication. Say we have 4 Delphix Engines (DEs) and 3 different sources that should be available on all 4 DEs.

Is it better to spread the sources across different DEs and then replicate between them (more complex replication), or to keep it simple and have all sources on DE number 1 and replicate to the remaining DEs?


If we go for the simpler option, are there any constraints on DE number 1 (the master)?


Thanks

Ruben Lemos Rodrigues Catarrunas

Posted 2 years ago


Vikram Kulkarni, Employee

Hi,

There is no hard and fast rule about ingesting data into one engine versus multiple engines. It depends on your business needs.

If you can add all 3 sources to one engine (say, EngineA) and replicate from it to the other engines, that would be the simplest way to do it. There are no constraints on adding all of them to the same engine. The only requirement is sufficient storage to hold all the sources on that engine.

If network, infrastructure, or performance barriers prevent ingesting all the data on one engine, you might want to spread the sources across several engines and replicate between them.
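To make the difference in complexity concrete, here is a rough sketch that counts how many distinct engine-to-engine replication pairs each layout needs so that every source ends up on every engine. The engine and source names are hypothetical, and the function is plain Python for illustration only, not a Delphix API.

```python
def replication_links(engines, placement):
    """Return (home_engine, target_engine, source) triples needed so that
    every source is present on every engine."""
    links = []
    for source, home in placement.items():
        for target in engines:
            if target != home:
                links.append((home, target, source))
    return links


# Hypothetical names: 4 engines, 3 sources, as in the question.
engines = ["DE1", "DE2", "DE3", "DE4"]
hub = {"src_a": "DE1", "src_b": "DE1", "src_c": "DE1"}     # all sources on DE1
spread = {"src_a": "DE1", "src_b": "DE2", "src_c": "DE3"}  # one source per engine

# Collapse to distinct engine-to-engine pairs, ignoring which source is carried.
hub_pairs = {(home, target) for home, target, _ in replication_links(engines, hub)}
spread_pairs = {(home, target) for home, target, _ in replication_links(engines, spread)}

print(len(hub_pairs), len(spread_pairs))
```

With these assumptions the single-engine layout needs 3 pairs, all originating from DE1, while the spread layout needs 9 pairs originating from three different engines.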

The link below explains some of the use cases for replication:

https://docs.delphix.com/display/DOCS/Replication+Use+Cases


Thanks

Ranzo Taylor, Employee

A typical Delphix Engine will handle multiple production SnapSyncs. Of course, this is workload-specific.

You can then use the simpler replication topology. That would be the preferred approach.

One reason to consider the more complex model would be very high source change rates. You might then see CPU constraints on that first engine.

But that's unlikely. Many customers run 3+ dSources against a single engine; only about 1% of sources have change rates high enough to need their own engine. Try bringing on each source and monitoring CPU in the performance analytics screen once the source has fully loaded and moved into incremental SnapSync mode. You'll see CPU increase as SnapSync compresses new blocks. Stagger your SnapSync times to avoid large spikes.
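As a rough illustration of staggering, the sketch below spreads one daily start time per source evenly across a maintenance window. The source names, window start, and window length are assumptions for the example; the resulting times would still need to be entered into each source's SnapSync policy by whatever means you normally use.

```python
from datetime import datetime, timedelta


def staggered_start_times(sources, window_start="01:00", window_hours=6):
    """Assign each source a start time, evenly spaced across the window."""
    if not sources:
        raise ValueError("at least one source is required")
    start = datetime.strptime(window_start, "%H:%M")
    step = timedelta(hours=window_hours) / len(sources)
    return {
        name: (start + i * step).strftime("%H:%M")
        for i, name in enumerate(sources)
    }


# Hypothetical source names and a 6-hour overnight window.
schedule = staggered_start_times(["source_a", "source_b", "source_c"])
for name, start_time in schedule.items():
    print(f"{name}: SnapSync at {start_time}")
```

Even spacing is only a starting point. If one source takes noticeably longer to sync, widen the gap after it based on what the CPU graphs show.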

It's very likely that the less complicated model (Hub and Spoke) will work for you.