Virtualization Plugins


data sharing across multiple vFiles

  • 1.  data sharing across multiple vFiles

    Posted 08-04-2020 02:30:00 AM
    We are trying to automate the process of creating multiple instances of a database while provisioning.
    As part of this, we need to access data from previously created vFiles/VDBs.
    Is it possible to achieve this? As far as I know, the data created while creating a vFile/VDB remains as long as the VDB exists, and we cannot share it with another vFile/VDB.
    If I am missing anything here, please let me know.

    Thanks & Regards

    ------------------------------
    Ravi Nistala
    Community Member
    TDGlobal
    ------------------------------


  • 2.  RE: data sharing across multiple vFiles

    Posted 08-04-2020 12:39:00 PM
    Edited by Tom Walsh 08-04-2020 12:45:18 PM
    Hi Ravi,

    I am not 100% sure what you are trying to do, so apologies if the following does not apply to you...


    > We are trying to automate the process of creating multiple instances of a database while provisioning.

    Do you mean something like this: "Each single VDB is composed of multiple blobs of data. Each of these blobs might be synced from a different place and/or mounted to a different place"?

    Example: you want to virtualize a "media center". Your single dsource wants to sync from two different servers: one that stores music and one that stores video. When you provision this out as a VDB, you want the virtualized music to be mounted separately (maybe even on a different host) than the virtualized video.

    You can do this with the SDK. Briefly, on the sync/dsource side, you are in full control of what happens, so you could sync different bits of data from as many different places as you'd like. On the VDB side, you can use the `mount_specification` operation to say where each part of your dataset should be mounted.


    > we need to access data from previously created VFiles/VDB's.

    Or, maybe you mean you really do want multiple separate VDBs, but some of your VDBs might need to access data from another one of your VDBs?

    This is technically possible, although there is not really any special built-in support for it from our end. Suppose VDB2 needs to access data from VDB1. You'd just need two things:

    1) VDB1 needs to be enabled (this ensures that its data is mounted somewhere)
    2) VDB2 needs to be told where to find VDB1's mounted data (probably via the virtual source parameters)
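
    A minimal sketch of point 2, assuming the user supplies VDB1's mount location through a virtual source parameter. The parameter name `vdb1_mount_path`, the helper `build_vdb1_check_command`, and the example paths are all hypothetical, not part of the SDK. The helper only builds a shell command that VDB2's plugin code could run on the remote host (for example with `libs.run_bash`) to confirm that VDB1's data is mounted before relying on it. It uses `shlex.quote`, which requires Python 3.

```python
import shlex


def build_vdb1_check_command(vdb1_mount_path):
    """Return a shell command that succeeds only if the given directory exists.

    `vdb1_mount_path` stands in for a hypothetical virtual source parameter
    that the user fills in when provisioning VDB2.
    """
    if not vdb1_mount_path or not vdb1_mount_path.startswith("/"):
        raise ValueError("vdb1_mount_path must be an absolute path")
    # Quote the path so spaces or shell metacharacters cannot alter the command.
    return "test -d {}".format(shlex.quote(vdb1_mount_path))


# Example with a made-up mount path
command = build_vdb1_check_command("/mnt/provision/vdb1")
print(command)
```

    If the command fails on the remote host, VDB1 is probably disabled or was mounted somewhere else, and the plugin can report that to the user instead of continuing.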



    I am not sure if this answers your question.

    ------------------------------
    Tom Walsh
    Software Engineer
    Delphix
    ------------------------------



  • 3.  RE: data sharing across multiple vFiles

    Posted 08-04-2020 11:03:00 PM

    Hi Tom: Thank you for your response. I meant the second case you described. Suppose we have two or more VDBs/vFiles. There are variables in them that we need to access from other VDBs.

    A use case could be this: a server has multiple instances, and the instance information is stored in VDB1/vFile1 while provisioning. If the user wants to provision another instance of the database, how (if at all) can the plugin get that information from VDB1/vFile1?

     

    Hope I am clear this time. Sorry for the confusion.

    Thanks & Regards

    Ravi.n


  • 4.  RE: data sharing across multiple vFiles

    Posted 08-05-2020 09:24:00 AM
    Edited by Tom Walsh 08-05-2020 09:24:21 AM
    So, you are provisioning VDB2 from VDB1, and you need to know some information about the VDB from which you are provisioning?

    If so, this is typically done using your snapshot definition.

    The idea works like this:
    1) In your snapshot schema, you define whatever properties your code will need in order to do a provision.
    2) In your post-snapshot operations (both linked and virtual), you would take care to "save" all relevant information about the current state of the database.
    3) In your configure and reconfigure operations, you can then access this saved information however you need.

    There are probably two key points here:
    First, it doesn't really matter whether you are provisioning a new VDB from an existing VDB or from a dsource. Really, the only thing your code needs to care about is that you are provisioning from a snapshot.  Whether that's a dsource snapshot or a VDB snapshot shouldn't really matter.
    Second, the snapshot information contains the state of the system as it was when the snapshot was taken.  This means that you can still safely provision new VDBs from old snapshots, even if there have since been lots of changes to the dsource/VDB you are provisioning from.


    Let me walk through an example. Let's say your DBMS allows for two different on-disk data formats: LATIN-1 and UTF-8.  Further, let's assume that the end user can change this any time they want, and that you can always query the DBMS to learn which format it is currently using.

    For a case like this, you might define a property like this in your snapshot definition:
      "diskFormat": {
        "enum": ["LATIN-1", "UTF-8"]
      }

    Then, in your post-snapshot operation, you would need to "save" this format into your snapshot data (this example shows linked source code, but you'd also have to do something similar in your virtual source code).  Maybe that would look something like this:
    from dlpx.virtualization import libs
    from dlpx.virtualization.platform import Plugin
    from generated.definitions import SnapshotDefinition

    plugin = Plugin()

    @plugin.linked.post_snapshot()
    def linked_post_snapshot(staged_source, repository, source_config, snapshot_parameters):

      # Create a new snapshot data object
      snapshot = SnapshotDefinition()

      # Learn the current format of the data
      learn_format_command = "mydbms --get-disk-format"
      result = libs.run_bash(staged_source.staged_connection, learn_format_command)
      # Strip the trailing newline so the value matches the schema's enum
      disk_format = result.stdout.strip()

      # Save the current format to the new data object
      snapshot.disk_format = disk_format

      # <do any other work you need for your snapshot data here>

      return snapshot


    Finally, in your configure and reconfigure operations, you could now access this information.  The incoming "snapshot" parameter represents the state of the data at the time the snapshot was taken. That might look like this:
    @plugin.virtual.configure()
    def configure(virtual_source, repository, snapshot):
      # Learn the on-disk format of the data we are provisioning from
      disk_format = snapshot.disk_format

      # Tell our DBMS about the new dataset
      command = "mydbms add_dataset --path {} --format {}".format(virtual_source.parameters.mount_path, disk_format)
      libs.run_bash(virtual_source.connection, command)

      # <do any other work required here, then return the new VDB's source config>


    So, now suppose this series of events happens:
    1) User syncs a new dsource from a LATIN-1 dataset (this creates snapshot A on the dsource)
    2) User provisions VDB1 from that new dsource (this creates snapshot B on the VDB)
    3) User changes the disk format of VDB1 to be UTF-8
    4) User provisions VDB2 from VDB1 using snapshot B

    Even though VDB1 is currently using the UTF-8 format, it had been using LATIN-1 when the snapshot was taken.  So, the new VDB2's data will be starting out in LATIN-1 format.  The above strategy would let step 4 proceed correctly, because the snapshot data will tell your plugin code about the correct format.
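
    The four steps above can be traced with a small toy model. This is only an illustration of the idea that a snapshot records state at the time it is taken. The `Dataset` class and `provision` function are made up for this sketch and are not Delphix SDK classes.

```python
class Dataset:
    """Toy stand-in for a dsource or VDB; not a Delphix SDK class."""

    def __init__(self, name, disk_format):
        self.name = name
        self.disk_format = disk_format  # the current, live format
        self.snapshots = {}

    def take_snapshot(self, label):
        # Record the format as it is right now; later changes do not touch it.
        self.snapshots[label] = {"disk_format": self.disk_format}
        return self.snapshots[label]


def provision(name, snapshot):
    # A new VDB starts from the snapshot's saved state, not the parent's live state.
    return Dataset(name, snapshot["disk_format"])


dsource = Dataset("dsource", "LATIN-1")
snap_a = dsource.take_snapshot("A")   # step 1: sync creates snapshot A
vdb1 = provision("VDB1", snap_a)      # step 2: provision VDB1 from the dsource...
snap_b = vdb1.take_snapshot("B")      # ...which creates snapshot B on VDB1
vdb1.disk_format = "UTF-8"            # step 3: user changes VDB1's format
vdb2 = provision("VDB2", snap_b)      # step 4: provision VDB2 from snapshot B

print(vdb1.disk_format, vdb2.disk_format)
```

    Here VDB1 ends up reporting UTF-8 while VDB2 starts out as LATIN-1, because VDB2 was built from what snapshot B recorded rather than from VDB1's current state.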



    ------------------------------
    Tom Walsh
    Software Engineer
    Delphix
    ------------------------------



  • 5.  RE: data sharing across multiple vFiles

    Posted 08-06-2020 02:36:00 AM

    Hi Tom: I am trying to provision two VDBs/vFiles from a single dSource, using different snapshots taken at different times. Below are 2 snapshots from one dSource. What I want to do is provision one VDB/vFile with the June snapshot (say VDB1) and another with the Aug snapshot (say VDB2).

     

    Let us say that I created a file at /tmp/folder1/file1 when I provisioned VDB1. I am storing the values (folder1 & file1) while creating it.

    My question is:

    While provisioning VDB2, how will I know that I previously provisioned folder1 & file1?

    NOTE: I have kept the example simple to make it easy to understand, but in our scenario the path will vary and depends on the user's input.

     

    Along the same lines, is there a "global variable" declaration from an SDK perspective?


  • 6.  RE: data sharing across multiple vFiles
    Best Answer

    Posted 08-06-2020 09:57:00 AM
    Hi Ravi,

    The Delphix Engine does not give any special way for a VDB provision to learn things that happened during a different VDB's provision.

    In general, VDB2 should be "standalone" and not depend on VDB1 in any way.  Having two separate VDBs "share" data/storage on the remote host would be very tricky to get right.  Here are some of the pitfalls you'd have to worry about:


    • Who is in charge of deleting /tmp/folder1/file1?  Does it get deleted with VDB1? If so, does that cause problems for VDB2?  Or do the various VDBs need to coordinate with one another so that the last-deleted VDB will handle this cleanup?
    • What happens when VDB1 is controlled by a different remote user than VDB2?  How will permissions/ownership work so that both users can access this same directory?
    • What happens when VDB1 is migrated to a different remote host?  If there is data specific to VDB1 in /tmp/folder1/file1, will this need to be moved to the new host?  If there is shared VDB1/VDB2 data, will that need to be copied? How/when will your plugin code do that?

    Maybe you could outline what the problem is you're trying to solve?  Perhaps we could help explore other options that might work?


    Finally, global variables do not work the way you're probably hoping they do.  As with all Python code, you can declare and use global variables in your plugin code.  However, each plugin operation is run separately, and nothing is persisted across multiple operations. So, for example, you can't "save" data to a global variable in your pre_snapshot operation and then expect it to still be there during your post_snapshot operation.
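
    A rough way to picture this, as a sketch only: the snippet below models "each operation is run separately" by loading the plugin source into a fresh namespace for every call. That loading mechanism is an assumption made for illustration, not a description of how the Delphix Engine actually executes plugins. The names `PLUGIN_SOURCE` and `run_operation` are made up.

```python
# Toy plugin module with a global that one operation writes and another reads.
PLUGIN_SOURCE = '''
saved_value = None

def pre_snapshot():
    global saved_value
    saved_value = "set during pre_snapshot"
    return saved_value

def post_snapshot():
    return saved_value
'''


def run_operation(operation_name):
    # Load the plugin code from scratch, then call exactly one operation.
    namespace = {}
    exec(PLUGIN_SOURCE, namespace)
    return namespace[operation_name]()


first = run_operation("pre_snapshot")
second = run_operation("post_snapshot")
print(first)
print(second)
```

    The value assigned in `pre_snapshot` is visible inside that one call, but `post_snapshot` sees the global at its initial value again. Anything that has to outlive a single operation needs to be stored somewhere the platform persists, such as the snapshot data.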



    ------------------------------
    Tom Walsh
    Software Engineer
    Delphix
    ------------------------------



  • 7.  RE: data sharing across multiple vFiles

    Posted 08-07-2020 07:15:00 AM
    Thank you for the detailed response, Tom. Currently our requirement is such that we need information sharing between VDBs to automate the process. For now, we have put the onus of that information on the user: the user provides the information, and if any validation check fails, the user has to re-enter it.
    We are currently stuck on a different kind of issue. I will post a new message for it.

    ------------------------------
    Ravi Nistala
    Community Member
    TDGlobal
    ------------------------------