As data – and our reliance on it – grows, the way we store it becomes equally vital. Which is why more and more organizations are turning to the cloud.
Moving data to the cloud reduces your infrastructure, maintenance, and operations burden, and frees up valuable resources by turning capital expenses (capex) into operating expenses (opex).
But like many companies looking to migrate data to the cloud, you may still have lingering questions. Namely, how can you expect to move large quantities of data quickly, efficiently, and with as little disruption as possible?
To this end, the three major cloud service providers (CSPs) – Amazon Web Services (AWS), Azure, and Google Cloud Platform (GCP) – have new tools for online data migration to simplify sending your on-premises data to the cloud.
In this post, I’ll examine how these tools simplify and speed up the data transfer process, and take a closer look at each CSP’s respective tool.
Enhancing the data migration process through parallelizing writes
Older online methods of data migration, like secure file transfer protocol (SFTP), typically use a single thread to transfer data. While this is functional and valid, it doesn’t allow for top-of-the-line throughput, limiting the speed at which data moves to the cloud.
The newer tools, on the other hand, take advantage of parallelized, or multi-threaded, writes.
Think of it like a highway: if you’re moving data through a single lane, you can only go so fast. By adding lanes, or parallel writes, you improve write performance and, in turn, decrease the time it takes to transfer data.
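To make the highway analogy concrete, here is a minimal Python sketch that compares one lane with eight. It is not how any of the CSP tools are implemented: `write_chunk` is a hypothetical stand-in that sleeps to simulate network latency, and the chunk size, chunk count, and delay are made-up values.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def write_chunk(chunk):
    """Hypothetical stand-in for one network write; sleeps to simulate latency."""
    time.sleep(0.05)  # assumed per-write latency, not a measured value
    return len(chunk)

def transfer(chunks, lanes):
    """Send every chunk using `lanes` worker threads; return (bytes_sent, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        sent = sum(pool.map(write_chunk, chunks))
    return sent, time.perf_counter() - start

# Sixteen 1 KB chunks of dummy data
chunks = [b"x" * 1024 for _ in range(16)]

single_bytes, single_secs = transfer(chunks, lanes=1)
multi_bytes, multi_secs = transfer(chunks, lanes=8)
print(f"1 lane:  {single_bytes} bytes in {single_secs:.2f}s")
print(f"8 lanes: {multi_bytes} bytes in {multi_secs:.2f}s")
```

The same number of bytes arrives either way; the extra lanes simply let several writes wait on the network at the same time.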
The chart below provides an estimate on speeds and transfer times when working with online data replication.
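The arithmetic behind estimates like these is data size divided by effective bandwidth. The Python sketch below shows the calculation; the 10 TB data set, the link speeds, and the 80% utilization factor are illustrative assumptions rather than figures from any provider.

```python
def transfer_hours(data_tb, link_mbps, utilization=0.8):
    """Estimate the hours needed to move `data_tb` terabytes over a `link_mbps` link.

    `utilization` is an assumed fraction of the link that the transfer
    can actually use; real-world values vary.
    """
    bits = data_tb * 1e12 * 8                      # decimal terabytes to bits
    effective_bps = link_mbps * 1e6 * utilization  # usable bits per second
    return bits / effective_bps / 3600             # seconds to hours

for mbps in (100, 1000, 10000):
    print(f"10 TB at {mbps} Mbps: {transfer_hours(10, mbps):.1f} hours")
```

Because the relationship is linear, doubling the usable bandwidth halves the estimated transfer time.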
3 new tools from AWS, Azure, and GCP
The tool you use will depend on which CSP you’ve chosen, and while there are differences, each accelerates data transfer.
- AWS DataSync. This data transfer service simplifies moving data between on-premises storage and AWS. Key features include:
  - Parallelism and multi-threading, which can result in a data transfer performance increase of up to 10x
  - An on-premises component that’s simple to deploy and easy to manage
  - Encryption of transferred data
- Azure AzCopy. AzCopy is a command line tool used to copy or sync files to Azure storage. Version 10 is the most recent. Key features include:
  - Multi-threading and parallelism, which increase data throughput when replicating data between on-premises and Azure storage
  - Support for Windows, Linux, and Mac in version 10
  - Scripts that can run on a schedule to replicate data to defined Azure storage targets
- GCP Cloud Storage. GCP provides the gsutil command line utility to replicate or synchronize an on-premises volume to Google Cloud Storage. Key features include:
  - With a bucket already created, you download and configure gsutil to run once or on a schedule via a script
  - The rsync command in gsutil, run with the -d option, makes the data replicated to Google Cloud Storage match the source volume exactly
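As a rough sketch of what scripting these tools can look like, the Python helpers below build one command line per tool and print it without running anything. The task ARN, storage URLs, and local path are hypothetical placeholders. The sketch also assumes each CLI is already installed and authenticated, and that the DataSync task has been created beforehand.

```python
import shlex

def datasync_cmd(task_arn):
    """Start an existing AWS DataSync task (the task is created separately)."""
    return ["aws", "datasync", "start-task-execution", "--task-arn", task_arn]

def azcopy_cmd(source_dir, container_url):
    """Sync a local directory to an Azure Blob Storage container with AzCopy."""
    return ["azcopy", "sync", source_dir, container_url, "--recursive=true"]

def gsutil_cmd(source_dir, bucket_url):
    """Mirror a local directory to a Cloud Storage bucket with gsutil.

    -m runs operations in parallel, -r recurses into subdirectories, and
    -d deletes destination objects missing from the source, so use it with care.
    """
    return ["gsutil", "-m", "rsync", "-r", "-d", source_dir, bucket_url]

# Hypothetical placeholders; substitute your own values.
commands = [
    datasync_cmd("arn:aws:datasync:us-east-1:111122223333:task/task-EXAMPLE"),
    azcopy_cmd("/data/archive", "https://exampleaccount.blob.core.windows.net/backups"),
    gsutil_cmd("/data/archive", "gs://example-bucket"),
]

for cmd in commands:
    print(shlex.join(cmd))  # print only; pass cmd to subprocess.run() to execute
```

Printing the commands first makes it easy to review them before handing the script to a scheduler such as cron or Windows Task Scheduler.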
AzCopy and gsutil are free to download, while AWS DataSync is billed based on the amount of data transferred. Be sure to check each provider’s current pricing, including any fees for ingress and egress, before transferring large data sets in either direction.
The tools are different, but the goal is the same
Older methods of migrating data to the cloud, including copying via the console or using a third-party product, still exist, but with these new tools, CSPs are looking to reduce the operational overhead of migrating data.
In the end, your on-premises data can be sent to the cloud faster, more efficiently, and without impacting the applications or data you’re creating on premises.