AWS DataSync is a data transfer service that appears frequently in AWS exams. It synchronizes data efficiently between on-premises storage, other cloud platforms, and AWS storage services. Here's a concise summary of the key points covered in the lecture:
AWS DataSync Overview
- Purpose: Synchronizes large volumes of data across different locations.
- Use Cases: Migrating data from on-premises or other clouds into AWS, and moving data between AWS storage services.
- Protocols Supported: NFS, SMB, HDFS, among others.
- Agent Requirement: An agent is necessary for connections outside of AWS (e.g., on-premises or other clouds).
Key Features
- Target AWS Services: Amazon S3 (all storage classes, including Glacier), Amazon EFS, and Amazon FSx.
- Scheduling: DataSync operations can be scheduled hourly, daily, or weekly; they are not continuous.
- Metadata and Permissions: Preserves file permissions and metadata, including NFS POSIX permissions and SMB permissions. This matters for exam questions that focus on metadata preservation.
- Performance: A single DataSync agent can handle up to 10 gigabits per second (Gbps), and a bandwidth limit can be set to cap how much of the network the transfer uses.
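The scheduling, metadata, and bandwidth settings above correspond to parameters of a DataSync task. The following is a minimal sketch, assuming Python and the request shape of the DataSync `CreateTask` API as exposed by boto3. The helper name `build_task_request` and the ARNs are hypothetical placeholders, and the `rate(...)` schedule expressions are an assumption worth checking against the current DataSync documentation. The helper only builds the request dictionary, so it runs without AWS credentials.

```python
def build_task_request(source_arn, destination_arn, schedule="hourly", max_mbps=None):
    """Build keyword arguments for a DataSync CreateTask call (hypothetical helper)."""
    # Assumed schedule expressions: tasks run on a schedule, not continuously.
    schedules = {
        "hourly": "rate(1 hour)",
        "daily": "rate(1 day)",
        "weekly": "rate(7 days)",
    }
    if schedule not in schedules:
        raise ValueError(f"schedule must be one of {sorted(schedules)}")

    options = {
        # Keep POSIX permissions, ownership, and modification times on copy.
        "PosixPermissions": "PRESERVE",
        "Uid": "INT_VALUE",
        "Gid": "INT_VALUE",
        "Mtime": "PRESERVE",
        # -1 means no bandwidth limit.
        "BytesPerSecond": -1,
    }
    if max_mbps is not None:
        if max_mbps <= 0:
            raise ValueError("max_mbps must be positive")
        # Convert megabits per second to bytes per second.
        options["BytesPerSecond"] = int(max_mbps * 1_000_000 / 8)

    return {
        "SourceLocationArn": source_arn,
        "DestinationLocationArn": destination_arn,
        "Schedule": {"ScheduleExpression": schedules[schedule]},
        "Options": options,
    }


# Placeholder ARNs for illustration only.
request = build_task_request(
    "arn:aws:datasync:us-east-1:111122223333:location/loc-source",
    "arn:aws:datasync:us-east-1:111122223333:location/loc-destination",
    schedule="daily",
    max_mbps=100,
)
```

With boto3 installed and credentials configured, the dictionary could be passed on as `boto3.client("datasync").create_task(**request)`.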
Architectural Insights
- On-Premises to AWS: DataSync can synchronize data from on-premises servers using NFS or SMB protocols to AWS services like S3, EFS, or FSx.
- Bidirectional Sync: Data can also be synchronized from AWS back to on-premises; transfers are not limited to one direction.
- Network Capacity Concerns: For scenarios with limited network capacity, AWS Snowcone devices, which come with a pre-installed DataSync agent, can be used to physically transport data into AWS.
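Whether network capacity is "limited" comes down to simple arithmetic: dataset size divided by usable bandwidth. The sketch below is an illustration only. The function name `transfer_days`, the 80% utilization default, and the example figures are assumptions, not values from the lecture.

```python
def transfer_days(dataset_tb, link_mbps, utilization=0.8):
    """Estimate the days needed to move dataset_tb terabytes over a network link.

    dataset_tb  -- dataset size in terabytes (decimal, 10**12 bytes)
    link_mbps   -- link speed in megabits per second
    utilization -- fraction of the link the transfer may use (assumed 0.8)
    """
    if dataset_tb < 0 or link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid size, link speed, or utilization")
    total_bits = dataset_tb * 1e12 * 8
    usable_bits_per_second = link_mbps * 1e6 * utilization
    seconds = total_bits / usable_bits_per_second
    return seconds / 86_400  # seconds per day


# Example: 8 TB over a 100 Mbps link at 80% utilization takes roughly 9.3 days.
slow_link = transfer_days(8, 100)

# The same 8 TB at the full 10 Gbps quoted above takes under two hours.
fast_link = transfer_days(8, 10_000, utilization=1.0)
```

When the estimate runs to days or weeks on the available link, physically shipping the data on a device becomes the more practical option.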
AWS Storage Services Synchronization
- Between AWS Services: DataSync can also be used to synchronize data between different AWS storage services, maintaining metadata during the process.
- Scheduled Tasks: Synchronization is based on scheduled tasks, not performed continuously.
Key Takeaways