
How to Do S3 Glacier Backup: A Complete Guide for AWS Users

AWS partner dedicated to startups

  • 2000+ Clients
  • 5+ Years of Experience
  • $10M+ saved on AWS

Amazon S3 Glacier is one of the most cost-effective ways to store data long-term on AWS. Whether you are an IT administrator setting up a compliance archive, a developer automating backup pipelines, or a cloud architect designing a disaster recovery strategy, understanding how S3 Glacier backup works from end to end can save your organization a significant amount of money without sacrificing durability or security.

This guide covers everything you need to know: the different Glacier storage classes, how to back up data directly and via lifecycle policies, how retrieval works, cost considerations, and best practices for production-grade archiving.

What Is Amazon S3 Glacier?

Amazon S3 Glacier is a family of archive storage classes within Amazon Simple Storage Service (Amazon S3), designed for data that is rarely accessed but must be retained reliably over months or years. All Glacier classes offer 99.999999999% (eleven nines) durability by storing objects redundantly across multiple AWS Availability Zones within a given AWS Region.

AWS currently offers three Glacier storage classes, each targeting a different balance between cost and retrieval speed.

S3 Glacier Instant Retrieval is designed for data accessed roughly once per quarter that still needs millisecond response times when it is retrieved. Storage costs approximately $0.004 per GB per month, and retrieval fees apply per GB accessed. There is a minimum billable object size of 128 KB and a minimum storage duration of 90 days.

S3 Glacier Flexible Retrieval (formerly just “Amazon S3 Glacier”) is for archive data accessed once or twice a year where immediate access is not required. It stores at around $0.0036 per GB per month and offers three retrieval tiers: Expedited (1 to 5 minutes, charged per GB), Standard (3 to 5 hours, charged per GB), and Bulk (5 to 12 hours, free for large datasets). Objects are not available for real-time access and require a restore request first.

S3 Glacier Deep Archive is the lowest-cost storage option in the entire AWS ecosystem, at approximately $0.00099 per GB per month, roughly $1 per TB per month. It is intended for data retained for 7 to 10 or more years, accessed very rarely. Standard retrieval takes up to 12 hours and Bulk retrieval can take up to 48 hours. AWS recommends this class for data accessed less than once a year, such as long-term regulatory archives, compliance records, and scientific research datasets.

When Should You Use S3 Glacier for Backup?

Not every dataset belongs in Glacier. The tradeoff is clear: ultra-low storage costs in exchange for retrieval latency and retrieval fees. The right use cases include:

  • Compliance and regulatory archives: Industries like healthcare (HIPAA), finance (SEC Rule 17a-4), and legal services often require retaining records for 7 to 10 years or more. S3 Glacier Deep Archive is designed precisely for this.
  • Long-term database backups: Daily or weekly database snapshots that are unlikely to be needed, but must exist for disaster recovery or point-in-time restore requirements.
  • Media archives: Video production companies, news organizations, and broadcasters that retain raw footage or finished assets for years.
  • Log retention: Application and infrastructure logs that must be kept for auditing purposes but are almost never queried.
  • Scientific and research data: Genomic sequences, satellite imagery, and simulation outputs that are generated once and referred to rarely.

If you need to retrieve data within seconds or access it more than once a month, S3 Standard-Infrequent Access or S3 Intelligent-Tiering is a better choice. Glacier is for cold data, not warm data.

Method 1: Direct Upload to Glacier via the AWS CLI

The most straightforward way to send a backup to Glacier is to upload directly using the AWS Command Line Interface (AWS CLI) and specify the storage class at upload time.

Prerequisites

  • An AWS account with an IAM user or role that has s3:PutObject, s3:GetObject, s3:ListBucket, and s3:RestoreObject permissions on the target bucket (the restore permission matters later, when you recover archived objects).
  • The AWS CLI installed and configured (aws configure with your Access Key ID, Secret Access Key, and preferred AWS Region).
  • An existing S3 bucket. If you do not have one, create it:

bash

aws s3api create-bucket \
  --bucket my-glacier-backups \
  --region us-east-1

Enable bucket versioning to protect against accidental deletion or overwrite:

bash

aws s3api put-bucket-versioning \
  --bucket my-glacier-backups \
  --versioning-configuration Status=Enabled
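For reference, a minimal IAM policy covering the uploads, restores, and lifecycle configuration used in this guide might look like the following sketch (the bucket name is illustrative; scope the Resource entries to your own bucket):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BackupObjectAccess",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:RestoreObject"],
      "Resource": "arn:aws:s3:::my-glacier-backups/*"
    },
    {
      "Sid": "BucketAccess",
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketVersioning", "s3:PutLifecycleConfiguration"],
      "Resource": "arn:aws:s3:::my-glacier-backups"
    }
  ]
}
```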

Upload a Single File to S3 Glacier Flexible Retrieval

bash

aws s3 cp database-backup-2025-04-13.sql.gz \
  s3://my-glacier-backups/db/ \
  --storage-class GLACIER

Upload a Single File to S3 Glacier Deep Archive

bash

aws s3 cp legal-documents-archive.tar.gz \
  s3://my-glacier-backups/compliance/ \
  --storage-class DEEP_ARCHIVE

Sync an Entire Directory to Deep Archive

bash

aws s3 sync ./old-logs/ s3://my-glacier-backups/logs/2024/ \
  --storage-class DEEP_ARCHIVE \
  --exclude "*.tmp"

The --exclude flag prevents temporary files from being archived. The sync command only transfers files that have changed, so it is safe to schedule frequently without incurring unnecessary data transfer costs.
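To run that sync on a schedule, a crontab entry along these lines works (a config sketch; the paths, schedule, and log file are illustrative, and the cron user needs configured AWS credentials):

```shell
# m h dom mon dow  command -- archive to Deep Archive nightly at 02:30
30 2 * * * aws s3 sync /var/backups/old-logs/ s3://my-glacier-backups/logs/2024/ --storage-class DEEP_ARCHIVE --exclude "*.tmp" >> /var/log/glacier-sync.log 2>&1
```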

Upload to S3 Glacier Instant Retrieval

If you need archive-level pricing but millisecond access:

bash

aws s3 cp medical-records-q1.tar.gz \
  s3://my-glacier-backups/medical/ \
  --storage-class GLACIER_IR

Note: Ensure your AWS CLI is version 2.4.5 or later to support the GLACIER_IR storage class identifier.

Method 2: S3 Lifecycle Policies (Recommended for Production)

The best practice for most organizations is not to upload directly to Glacier, but to store data in S3 Standard first and then let lifecycle policies automatically transition objects to progressively cheaper storage classes as they age. This is the recommended approach because it avoids the operational complexity of manually choosing a storage class at upload time and ensures data is always in the most cost-effective tier for its age.

How S3 Lifecycle Policies Work

An S3 Lifecycle configuration is a set of rules attached to a bucket. Each rule specifies:

  • A filter (a prefix, object tag, or size) to scope which objects the rule applies to.
  • Transition actions that move objects to a cheaper storage class after a specified number of days.
  • Expiration actions that permanently delete objects after a specified number of days.

Creating a Lifecycle Policy via the AWS Management Console

  1. Sign in to the AWS Management Console and open the Amazon S3 console.
  2. Select your backup bucket from the bucket list.
  3. Click the Management tab.
  4. Click Create lifecycle rule.
  5. Name the rule (for example, BackupArchivePolicy) and optionally scope it to a prefix such as backups/.
  6. Under Lifecycle rule actions, enable Transition current versions of objects between storage classes.
  7. Add transition rules with the following structure (adjust days to your retention requirements):
    • After 30 days: transition to S3 Standard-Infrequent Access.
    • After 90 days: transition to S3 Glacier Flexible Retrieval.
    • After 365 days: transition to S3 Glacier Deep Archive.
  8. Optionally, add an expiration rule to delete objects after a defined retention period (for example, 2555 days or 7 years).
  9. Review and save the rule.

Creating a Lifecycle Policy via the AWS CLI

Save the following as lifecycle.json:

json

{
  "Rules": [
    {
      "ID": "BackupArchivePolicy",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "backups/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}

Apply it to your bucket:

bash

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-glacier-backups \
  --lifecycle-configuration file://lifecycle.json

AWS treats lifecycle transitions asynchronously, so objects may transition a day or two after the rule threshold. This is normal behavior and not a misconfiguration.

Transitioning Noncurrent Versions

If versioning is enabled, you should also add rules for noncurrent object versions. Note that the elements below are a fragment: they belong inside a rule in the Rules array of your lifecycle configuration, alongside the Transitions shown earlier:

json

{
  "NoncurrentVersionTransitions": [
    {
      "NoncurrentDays": 30,
      "StorageClass": "STANDARD_IA"
    },
    {
      "NoncurrentDays": 365,
      "StorageClass": "GLACIER"
    }
  ],
  "NoncurrentVersionExpiration": {
    "NoncurrentDays": 2555
  }
}

This keeps your current backups in Standard storage for fast recovery, and ages older versions into Glacier automatically.

Organizing Your S3 Backup Bucket

A consistent prefix structure makes backup management dramatically simpler and allows you to write lifecycle rules that target specific types of data.

A recommended prefix convention for backups:

s3://my-glacier-backups/
  db/YYYY/MM/DD/<database-name>/
  logs/YYYY/MM/DD/<service-name>/
  configs/YYYY/MM/DD/
  compliance/YYYY/<department>/

Using date-based prefixes means you can scope lifecycle rules to specific data types and make retrieval deterministic. To estimate retrieval costs before restoring a large archive, use the AWS Pricing Calculator; Amazon S3 Storage Lens can show you how much data currently sits in each storage class.
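As a sketch of how a backup script can build such a prefix at upload time (the bucket and database names are hypothetical; the upload command is shown commented because it requires live AWS credentials):

```shell
#!/bin/sh
# Build a date-based key prefix so lifecycle rules and restores can target it.
BUCKET="my-glacier-backups"            # hypothetical bucket
PREFIX="db/$(date +%Y/%m/%d)/prod-db"  # e.g. db/2025/04/13/prod-db
echo "$PREFIX"

# Upload under the computed prefix:
# aws s3 cp prod-db.sql.gz "s3://$BUCKET/$PREFIX/" --storage-class GLACIER
```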

Enabling Compliance and Security Controls

S3 Object Lock

For regulated industries, S3 Object Lock enforces write-once-read-many (WORM) protection, preventing objects from being deleted or overwritten for a defined retention period regardless of IAM permissions.

To enable Object Lock (must be done at bucket creation):

bash

aws s3api create-bucket \
  --bucket compliance-archive \
  --object-lock-enabled-for-bucket \
  --region us-east-1

Set a default retention policy:

bash

aws s3api put-object-lock-configuration \
  --bucket compliance-archive \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "COMPLIANCE",
        "Years": 7
      }
    }
  }'

Compliance mode is stricter than Governance mode: not even the AWS root account can delete a locked object before its retention date expires.

Apply a legal hold on specific objects for litigation or regulatory investigations:

bash

aws s3api put-object-legal-hold \
  --bucket compliance-archive \
  --key evidence/contract-2024.pdf \
  --legal-hold '{"Status":"ON"}'

Server-Side Encryption

All objects stored in S3 Glacier classes can be encrypted at rest. Use AWS Key Management Service (AWS KMS) for centrally managed encryption keys:

bash

aws s3 cp sensitive-backup.tar.gz \
  s3://compliance-archive/backups/ \
  --storage-class DEEP_ARCHIVE \
  --sse aws:kms \
  --sse-kms-key-id alias/my-backup-key

You can also set a bucket-level default encryption policy so every uploaded object is encrypted automatically without relying on the client to specify it.
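As a sketch, that bucket-level default can be applied with put-bucket-encryption (reusing the key alias from the example above; substitute your own bucket and key):

```shell
aws s3api put-bucket-encryption \
  --bucket compliance-archive \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "alias/my-backup-key"
      }
    }]
  }'
```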

CloudTrail Logging

Enable AWS CloudTrail data events on your backup bucket to maintain an immutable audit log of every API call, including who uploaded, retrieved, or attempted to delete objects. This is a prerequisite for many compliance frameworks, including SOC 2, PCI DSS, and ISO 27001.
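As a sketch, S3 data events for the backup bucket can be added to an existing trail with put-event-selectors (the trail name is illustrative):

```shell
aws cloudtrail put-event-selectors \
  --trail-name backup-audit-trail \
  --event-selectors '[{
    "ReadWriteType": "All",
    "IncludeManagementEvents": true,
    "DataResources": [{
      "Type": "AWS::S3::Object",
      "Values": ["arn:aws:s3:::my-glacier-backups/"]
    }]
  }]'
```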

How to Restore Data from S3 Glacier

Because S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive are archival classes, objects are not available for immediate download. You must first initiate a restore request, which creates a temporary copy in S3 Standard storage. The restore request specifies how many days the temporary copy should remain available before it is automatically deleted.

Initiating a Restore Request via AWS CLI

Standard retrieval (3 to 5 hours) from Glacier Flexible Retrieval:

bash

aws s3api restore-object \
  --bucket my-glacier-backups \
  --key db/2024/12/01/prod-db.sql.gz \
  --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}'

Bulk retrieval (5 to 12 hours, lower cost):

bash

aws s3api restore-object \
  --bucket my-glacier-backups \
  --key db/2024/12/01/prod-db.sql.gz \
  --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Bulk"}}'

Expedited retrieval (1 to 5 minutes, higher cost):

bash

aws s3api restore-object \
  --bucket my-glacier-backups \
  --key db/2024/12/01/prod-db.sql.gz \
  --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Expedited"}}'

Checking Restore Status

bash

aws s3api head-object \
  --bucket my-glacier-backups \
  --key db/2024/12/01/prod-db.sql.gz

The response will include a Restore field. When restoration is complete, it shows ongoing-request="false" along with the expiry date of the temporary copy. Once available, download the object using aws s3 cp or any S3-compatible tool.
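In a recovery script, that check can be wrapped in a small helper that inspects the Restore header value (a sketch; the bucket and key are the examples from this section, and the live AWS calls are shown commented):

```shell
#!/bin/sh
# Returns success once a head-object Restore header shows the restore finished.
restore_done() {
  case "$1" in
    *'ongoing-request="false"'*) return 0 ;;  # temporary copy is ready
    *) return 1 ;;                            # still in progress, or never requested
  esac
}

# Usage against a live bucket (requires configured AWS credentials):
# RESTORE=$(aws s3api head-object --bucket my-glacier-backups \
#   --key db/2024/12/01/prod-db.sql.gz --query Restore --output text)
# restore_done "$RESTORE" && aws s3 cp \
#   s3://my-glacier-backups/db/2024/12/01/prod-db.sql.gz .
```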

Bulk Restore with S3 Batch Operations

If you need to restore hundreds or thousands of archived objects at once, use Amazon S3 Batch Operations instead of running individual restore commands. Batch Operations lets you define a job that targets objects returned by an S3 Inventory report or a manifest file, significantly reducing the operational overhead of large-scale recovery scenarios.
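As a sketch of what such a job looks like via the s3control API (the account ID, role ARN, manifest location, and manifest ETag below are placeholders you must replace with real values; Batch Operations restore supports the Standard and Bulk tiers):

```shell
aws s3control create-job \
  --account-id 111122223333 \
  --operation '{"S3InitiateRestoreObject":{"ExpirationInDays":7,"GlacierJobParameters":{"Tier":"BULK"}}}' \
  --manifest '{"Spec":{"Format":"S3BatchOperations_CSV_20180820","Fields":["Bucket","Key"]},"Location":{"ObjectArn":"arn:aws:s3:::my-glacier-backups/manifests/restore-manifest.csv","ETag":"placeholder-etag"}}' \
  --report '{"Bucket":"arn:aws:s3:::my-glacier-backups","Format":"Report_CSV_20180820","Enabled":true,"Prefix":"batch-reports","ReportScope":"AllTasks"}' \
  --priority 10 \
  --role-arn arn:aws:iam::111122223333:role/batch-restore-role \
  --no-confirmation-required
```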

S3 Glacier Backup Cost Breakdown

Understanding the full cost model prevents billing surprises. Storage fees are only one component.

Storage Class                  | Storage Cost       | Min. Duration | Retrieval Time
S3 Standard                    | ~$0.023/GB/month   | None          | Milliseconds
S3 Standard-IA                 | ~$0.0125/GB/month  | 30 days       | Milliseconds
S3 Glacier Instant Retrieval   | ~$0.004/GB/month   | 90 days       | Milliseconds
S3 Glacier Flexible Retrieval  | ~$0.0036/GB/month  | 90 days       | Minutes to hours
S3 Glacier Deep Archive        | ~$0.00099/GB/month | 180 days      | 12 to 48 hours

Prices shown are approximate US East (N. Virginia) region rates. Actual pricing varies by region.

Beyond storage, watch for these additional cost components:

Minimum storage duration charges. If you delete an object in Glacier Flexible Retrieval before 90 days, you are charged for the remaining days of the minimum duration. For Deep Archive the minimum is 180 days. Deleting many small, short-lived objects from Glacier can therefore generate unexpected charges.

Minimum billable object size. Objects smaller than 128 KB in any Glacier class are charged as if they were 128 KB. If your backup strategy creates many small files, the effective per-byte cost is much higher than the headline rate.

Retrieval fees. Expedited retrieval from Glacier Flexible Retrieval costs more per GB than Standard or Bulk retrieval. For Deep Archive, Standard retrieval costs apply for the 12-hour tier. Always plan retrievals in advance and use Bulk retrieval for non-urgent recovery.

Lifecycle transition fees. Transitioning objects to Glacier via a lifecycle rule incurs a per-1,000-requests fee. Transitioning 100 million small objects generates meaningful transition costs before you save a dollar on storage.

Temporary copy charges. During a restore, you pay both the Glacier storage rate and S3 Standard storage for the temporary copy for the number of days you specified in Days. Set Days to the minimum needed for your restore operation.

Common Mistakes to Avoid

Archiving too many small objects. The 128 KB minimum billable size means small objects are disproportionately expensive. Before archiving, consider bundling small files into archives using tar or zip to reduce object count and retrieval complexity.
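A minimal sketch of the bundling step (the directory and archive names are illustrative, and the demo file stands in for real logs; the upload is shown commented because it requires live AWS credentials):

```shell
#!/bin/sh
# Bundle many small log files into a single compressed archive before upload.
SRC_DIR="${SRC_DIR:-/tmp/small-logs-demo}"
mkdir -p "$SRC_DIR"
printf 'demo\n' > "$SRC_DIR/app.log"   # stand-in for real log files

tar -czf /tmp/logs-bundle.tar.gz -C "$SRC_DIR" .

# One object instead of thousands:
# aws s3 cp /tmp/logs-bundle.tar.gz \
#   s3://my-glacier-backups/logs/2024/ --storage-class DEEP_ARCHIVE
```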

Deleting objects before minimum duration. Understand the minimum storage duration for the class you are using before creating expiration rules.

Not testing restore procedures. A backup that has never been successfully restored is not a verified backup. Test a full restore from Glacier at least annually, documenting retrieval times and costs so there are no surprises during an actual incident.

Ignoring retrieval costs for compliance audits. Compliance auditors occasionally request bulk retrieval of archived records. Plan for this and budget for retrieval costs if your retention policy spans multiple years and terabytes.

Using Glacier for frequently changing data. If your data changes daily or weekly, Glacier is the wrong tier. The minimum storage duration and retrieval delay make it unsuitable for active workloads.

Integrating S3 Glacier with AWS Backup

For organizations that want centralized backup management across multiple AWS services, AWS Backup provides a managed policy engine that can back up Amazon EC2, Amazon RDS, Amazon DynamoDB, Amazon EFS, and Amazon S3 data to Glacier automatically.

AWS Backup supports lifecycle rules that move recovery points to cold storage (backed by Glacier) after a configurable number of days, and deletes them after a defined retention period. This gives you a single pane of glass for backup policies, compliance reports, and restore operations across your entire AWS footprint without writing lifecycle JSON manually for each resource.

S3 Glacier vs. Other AWS Storage Services for Backup

Understanding where Glacier fits relative to other AWS backup-related services helps you make the right architecture decision.

Amazon EBS Snapshots capture point-in-time snapshots of Amazon EC2 block storage volumes, stored in S3 internally but managed through the EBS API. They are ideal for OS-level and disk-level recovery but are not as cost-effective as Glacier for long-term retention at scale.

Amazon RDS Automated Backups provide database-level backups managed entirely by AWS, with a retention window of up to 35 days. For longer retention, export snapshots to S3 and apply a Glacier lifecycle policy.

AWS Storage Gateway Tape Gateway lets on-premises backup applications (such as Veeam, Commvault, or Veritas NetBackup) write to virtual tapes that are automatically stored in S3 Glacier Flexible Retrieval or Deep Archive. This is useful for organizations migrating away from physical tape libraries without changing existing backup software workflows.

AWS Snowball Edge is an option for migrating large on-premises datasets into S3 when network transfer is impractical at the required scale; once imported, lifecycle rules can move the data on to Glacier.

Best Practices Summary

  • Use lifecycle policies rather than uploading directly to Glacier, unless the data is clearly cold from day one.
  • Adopt a consistent prefix naming convention that aligns lifecycle rules with business categories of data.
  • Enable bucket versioning to protect against accidental deletion.
  • Enable server-side encryption with AWS KMS for sensitive archives.
  • Enable AWS CloudTrail data events for audit logging.
  • Use S3 Object Lock in Compliance mode for regulated retention requirements.
  • Test restore procedures regularly and document retrieval times.
  • Bundle small files before archiving to minimize the per-object overhead of Glacier pricing.
  • Monitor costs with AWS Cost Explorer and S3 Storage Lens to identify buckets where objects are in the wrong storage class for their actual access patterns.

Conclusion

S3 Glacier backup is one of the most reliable and cost-effective long-term data retention solutions available in cloud computing today. By understanding the three Glacier storage classes, the lifecycle policy model, retrieval mechanics, and the full cost structure including minimum duration charges and retrieval fees, you can design a backup architecture that keeps your data safe, meets your compliance requirements, and does not generate unnecessary AWS spend.

Whether you are moving an on-premises tape archive to the cloud, building automated retention pipelines for regulatory compliance, or simply ensuring that years of database backups are preserved without the cost of keeping them in standard storage, S3 Glacier provides the durability, security, and pricing model to do it at scale.

Not sure which S3 Glacier tier is right for your workload?
Cloudvisor helps AWS teams cut storage costs without the guesswork. Get a free cost review and find out where you are overpaying.
Get in touch