Tintri Cloud Connector avoids the problems described in the previous section with an architecture designed for incremental-forever backups. Each incremental is stored as a separate object to avoid read amplification on recovery. The first time a snapshot is replicated to the cloud, a full copy is sent. Every later replication is an incremental, and the process does not depend on having the previous snapshot in the chain stored locally. Unlike other products, Cloud Connector does not require a new full even if the cloud destination has been unconfigured for some period of time.
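To make the idea concrete, here is a minimal sketch of an incremental-forever chain in which every snapshot lands in its own object. It is an illustration only, not Cloud Connector's actual format or API: the in-memory `store`, the `replicate` and `restore` helpers, and the key layout are all hypothetical, and the sketch simply assumes the source already knows which blocks changed since the last replicated snapshot.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-in for an object store: key -> object.
ObjectStore = Dict[str, dict]


def replicate(store: ObjectStore, vm: str, snap_id: str,
              changed_blocks: Dict[int, bytes],
              parent: Optional[str]) -> str:
    """Write one snapshot as its own object holding only its changed blocks."""
    key = f"{vm}/{snap_id}"
    store[key] = {"parent": parent, "blocks": dict(changed_blocks)}
    return key


def restore(store: ObjectStore, key: str) -> Dict[int, bytes]:
    """Rebuild a snapshot by walking the chain from newest to oldest."""
    image: Dict[int, bytes] = {}
    cursor: Optional[str] = key
    while cursor is not None:
        obj = store[cursor]
        for block_no, data in obj["blocks"].items():
            # Newer objects win; older ones only fill in blocks not yet seen.
            image.setdefault(block_no, data)
        cursor = obj["parent"]
    return image


store: ObjectStore = {}
# The first replication is a full: every block is sent.
full = replicate(store, "vm1", "s0", {0: b"aaaa", 1: b"bbbb", 2: b"cccc"}, None)
# Every later replication carries only the blocks that changed.
inc1 = replicate(store, "vm1", "s1", {1: b"BBBB"}, full)
inc2 = replicate(store, "vm1", "s2", {2: b"CCCC"}, inc1)
restored = restore(store, inc2)
```

In this toy model, restoring `s2` touches only objects that belong to `vm1`, and no later snapshot ever forces another full.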
No other natively integrated cloud replication product can create a chain of snapshots that stretches across a cluster of on-premises primary storage nodes and public or private object storage.
This is possible because every copy of a snapshot uses the same metadata format, regardless of its location. The same set of APIs manages all copies of the data, and the location of a copy is transparent to the management framework for the purposes of moving, copying, restoring, reporting, expiration, and analytics.
Tintri Cloud Connector also differentiates itself from other products on data integrity. Any time data traverses networks that aren’t completely under your control, data integrity becomes even more critical. Tintri Cloud Connector maintains an end-to-end checksum on all live data. This checksum is updated as soon as a new write arrives, and it is preserved along with each snapshot at the time of creation.
The checksum is then replicated along with the snapshot and re-computed at the destination to verify that it is correct. To keep this efficient, the checksum over all logical bytes of an incremental can be re-computed by reading only the changed bytes. The checksum is validated again at the time of restore. These data integrity processes ensure that nothing is ever lost in translation.
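One way a checksum over all logical bytes can be refreshed by reading only the changed bytes is to build it from per-block checksums kept as metadata. The sketch below illustrates that general technique; it is an assumption for illustration, not a description of the checksum Tintri actually uses. The function names, the choice of CRC32 per block, and the SHA-256 digest over the block checksums are all hypothetical.

```python
import hashlib
import zlib
from typing import Dict, List


def block_sums(blocks: List[bytes]) -> List[int]:
    """Checksum every block; only needed once, for the full snapshot."""
    return [zlib.crc32(b) for b in blocks]


def combine(sums: List[int]) -> str:
    """Fold the per-block checksums into one checksum for the whole image."""
    digest = hashlib.sha256()
    for s in sums:
        digest.update(s.to_bytes(4, "big"))
    return digest.hexdigest()


def incremental_sums(parent_sums: List[int],
                     changed: Dict[int, bytes]) -> List[int]:
    """Reuse the parent's block checksums and re-read only changed blocks."""
    sums = list(parent_sums)
    for block_no, data in changed.items():
        sums[block_no] = zlib.crc32(data)
    return sums


base = [b"aaaa", b"bbbb", b"cccc"]
changed = {1: b"BBBB"}

# Checksum derived from the parent's metadata plus the changed block only.
from_delta = combine(incremental_sums(block_sums(base), changed))

# Checksum computed the expensive way, by reading every logical byte.
from_full_read = combine(block_sums([b"aaaa", b"BBBB", b"cccc"]))
```

Both paths arrive at the same value, so the receiver can verify the full logical image of an incremental while reading only the delta.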
Cloud Connector: efficiency at every level
Tintri Cloud Connector is designed to deliver efficiency at every level:
- It eliminates the transition points found in other solutions so that data flows with minimum impact on servers, storage, and networks.
- It operates at VM and container granularity so no extraneous data is ever replicated.
- It uses an incremental-forever architecture that ensures efficient backup and restore processes.
When you’re operating a data center at scale, with hundreds or thousands of virtual machines consuming resources, every percent of increased efficiency has a big impact. When Tintri introduced Cloud Connector, our goal was to create a cloud data protection solution that was extremely efficient. The first blog in this series explained how Cloud Connector eliminates transition points. Cutting out the middleman lets data flow directly from storage to the cloud, which reduces the overhead on servers and storage and means data never has to traverse the network more than once. My previous post talked about the importance of granularity: the ability to operate directly on VMs, vDisks, or containers, rather than storage LUNs or volumes, makes your operation more storage efficient and reduces network load. In this final post, I want to explain some additional efficiency features designed into Cloud Connector.
Challenges of cloud replication
While many products support replicating a snapshot to the cloud, the format of the snapshot makes a big difference. Beyond the issue of granularity discussed last time, there are some further points to consider:
- Does the snapshot format force every snapshot to be a full snapshot (not a synthetic full)?
- Does the snapshot format force a full snapshot every so often?
- Is each snapshot, whether delta or full, stored as a single object or blob?
- Are snapshot blocks from multiple VM snapshots combined into a larger cloud block (object)?
All of the above are characteristics of one competing cloud replication product or another. The space and performance cost of full snapshots compared with incrementals is fairly obvious: full snapshots consume more network bandwidth, take longer to replicate, and consume more storage in the cloud.
A less obvious problem arises when blocks from multiple snapshots that happen to be part of the same backup job are combined into larger cloud objects. Recovering a snapshot then incurs read amplification because you have to read back the blocks from the other snapshots that share the same cloud object. Recoveries are already among the most expensive operations, and reading more than you need adds overhead at every layer of the operation.
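A quick back-of-the-envelope model shows how that read amplification adds up. The sketch below is illustrative only: the `read_amplification` helper, the object layouts, and the 4 KiB block size are made up for the example, and it assumes a restore has to fetch each cloud object in full rather than reading a byte range from it.

```python
from typing import List, Tuple

Block = Tuple[str, int]  # (snapshot name, block size in bytes)


def read_amplification(objects: List[List[Block]], snapshot: str) -> float:
    """Bytes read divided by bytes needed when restoring one snapshot.

    Assumes every object holding at least one wanted block is read whole.
    """
    needed = 0
    read = 0
    for obj in objects:
        wanted = sum(size for name, size in obj if name == snapshot)
        if wanted:
            needed += wanted
            read += sum(size for _, size in obj)
    if needed == 0:
        raise ValueError(f"no blocks stored for snapshot {snapshot!r}")
    return read / needed


# Blocks from three VM snapshots packed together by one backup job.
packed = [
    [("vmA", 4096), ("vmB", 4096), ("vmC", 4096)],
    [("vmA", 4096), ("vmB", 4096)],
]

# The same vmA blocks stored in objects of their own.
separate = [
    [("vmA", 4096)],
    [("vmA", 4096)],
    [("vmB", 4096)],
]

packed_ratio = read_amplification(packed, "vmA")      # 2.5
separate_ratio = read_amplification(separate, "vmA")  # 1.0
```

In the packed layout, restoring `vmA` reads 20,480 bytes to recover 8,192, a factor of 2.5. With one snapshot per object the factor stays at 1.0.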