s3_object fails to copy in AWS when source is larger than 5GiB #2117

Open
1 task done
colin-nolan opened this issue May 29, 2024 · 1 comment
Assignees
Labels
bug This issue/PR relates to a bug

Comments


colin-nolan commented May 29, 2024

Summary

amazon.aws.s3_object fails to copy objects within AWS when the source is larger than 5 GiB. We encountered this when copying between buckets (mode: copy with copy_src set), but it likely affects all copy usage.

I'd guess that switching to a multipart upload strategy is required for objects over 5 GiB.
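For context, the guess above can be sketched as follows: a single CopyObject request is capped at a 5 GiB source, whereas a multipart copy moves the object server-side in byte ranges via UploadPartCopy. This is a minimal sketch, not the module's implementation. The names `plan_copy_ranges` and `multipart_copy` and the 512 MiB part size are illustrative assumptions, and `client` is assumed to be a boto3 S3 client. Metadata, tags, encryption settings, and the 10,000-part limit are not handled.

```python
# Illustrative part size (assumption); S3 parts must be between 5 MiB and 5 GiB,
# except the last part, which may be smaller.
PART_SIZE = 512 * 1024 * 1024


def plan_copy_ranges(size, part_size=PART_SIZE):
    """Return inclusive 'bytes=start-end' ranges covering an object of `size` bytes."""
    if size <= 0 or part_size <= 0:
        raise ValueError("size and part_size must be positive")
    return [
        f"bytes={start}-{min(start + part_size, size) - 1}"
        for start in range(0, size, part_size)
    ]


def multipart_copy(client, src_bucket, src_key, dst_bucket, dst_key, part_size=PART_SIZE):
    """Copy an object server-side in parts and return the number of parts copied.

    `client` is assumed to behave like boto3.client("s3").
    """
    source = {"Bucket": src_bucket, "Key": src_key}
    size = client.head_object(**source)["ContentLength"]
    upload_id = client.create_multipart_upload(Bucket=dst_bucket, Key=dst_key)["UploadId"]
    try:
        parts = []
        for number, byte_range in enumerate(plan_copy_ranges(size, part_size), start=1):
            result = client.upload_part_copy(
                Bucket=dst_bucket,
                Key=dst_key,
                UploadId=upload_id,
                PartNumber=number,
                CopySource=source,
                CopySourceRange=byte_range,
            )
            parts.append({"PartNumber": number, "ETag": result["CopyPartResult"]["ETag"]})
        client.complete_multipart_upload(
            Bucket=dst_bucket,
            Key=dst_key,
            UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so the parts of the incomplete upload are not left behind.
        client.abort_multipart_upload(Bucket=dst_bucket, Key=dst_key, UploadId=upload_id)
        raise
    return len(parts)
```

With the 512 MiB part size assumed here, a 7 GiB object would be copied in 14 parts. boto3 also offers a managed `copy` method on the S3 client that performs a multipart copy when necessary, which may be the simpler route.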

Issue Type

Bug Report

Component Name

s3_object

Ansible Version

$ ansible --version
ansible [core 2.16.7]
  config file = <redacted>/ansible/ansible.cfg
  configured module search path = ['<redacted>/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = <redacted>/ansible/.venv/lib/python3.12/site-packages/ansible
  ansible collection location = <redacted>/.ansible/collections:/usr/share/ansible/collections
  executable location = <redacted>/ansible/.venv/bin/ansible
  python version = 3.12.3 (main, Apr  9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.3.9.4)] (<redacted>/ansible/.venv/bin/python)
  jinja version = 3.1.4
  libyaml = True

Collection Versions

$ ansible-galaxy collection list
# <redacted>/.ansible/collections/ansible_collections
Collection                               Version
---------------------------------------- -------
amazon.aws                               8.0.0  

# <redacted>/ansible/.venv/lib/python3.12/site-packages/ansible_collections
Collection                               Version
---------------------------------------- -------
amazon.aws                               7.6.0  
ansible.netcommon                        5.3.0  
ansible.posix                            1.5.4  
ansible.utils                            2.12.0 
ansible.windows                          2.3.0  
arista.eos                               6.2.2  
awx.awx                                  23.9.0 
azure.azcollection                       1.19.0 
check_point.mgmt                         5.2.3  
chocolatey.chocolatey                    1.5.1  
cisco.aci                                2.9.0  
cisco.asa                                4.0.3  
cisco.dnac                               6.13.3 
cisco.intersight                         2.0.9  
cisco.ios                                5.3.0  
cisco.iosxr                              6.1.1  
cisco.ise                                2.9.1  
cisco.meraki                             2.18.1 
cisco.mso                                2.6.0  
cisco.nxos                               5.3.0  
cisco.ucs                                1.10.0 
cloud.common                             2.1.4  
cloudscale_ch.cloud                      2.3.1  
community.aws                            7.2.0  
community.azure                          2.0.0  
community.ciscosmb                       1.0.9  
community.crypto                         2.20.0 
community.digitalocean                   1.26.0 
community.dns                            2.9.1  
community.docker                         3.10.1 
community.general                        8.6.1  
community.grafana                        1.8.0  
community.hashi_vault                    6.2.0  
community.hrobot                         1.9.2  
community.library_inventory_filtering_v1 1.0.1  
community.libvirt                        1.3.0  
community.mongodb                        1.7.4  
community.mysql                          3.9.0  
community.network                        5.0.2  
community.okd                            2.3.0  
community.postgresql                     3.4.1  
community.proxysql                       1.5.1  
community.rabbitmq                       1.3.0  
community.routeros                       2.15.0 
community.sap                            2.0.0  
community.sap_libs                       1.4.2  
community.sops                           1.6.7  
community.vmware                         4.4.0  
community.windows                        2.2.0  
community.zabbix                         2.4.0  
containers.podman                        1.13.0 
cyberark.conjur                          1.2.2  
cyberark.pas                             1.0.25 
dellemc.enterprise_sonic                 2.4.0  
dellemc.openmanage                       8.7.0  
dellemc.powerflex                        2.4.0  
dellemc.unity                            1.7.1  
f5networks.f5_modules                    1.28.0 
fortinet.fortimanager                    2.5.0  
fortinet.fortios                         2.3.6  
frr.frr                                  2.0.2  
gluster.gluster                          1.0.2  
google.cloud                             1.3.0  
grafana.grafana                          2.2.5  
hetzner.hcloud                           2.5.0  
hpe.nimble                               1.1.4  
ibm.qradar                               2.1.0  
ibm.spectrum_virtualize                  2.0.0  
ibm.storage_virtualize                   2.3.1  
infinidat.infinibox                      1.4.5  
infoblox.nios_modules                    1.6.1  
inspur.ispim                             2.2.1  
inspur.sm                                2.3.0  
junipernetworks.junos                    5.3.1  
kaytus.ksmanage                          1.2.1  
kubernetes.core                          2.4.2  
lowlydba.sqlserver                       2.3.2  
microsoft.ad                             1.5.0  
netapp.aws                               21.7.1 
netapp.azure                             21.10.1
netapp.cloudmanager                      21.22.1
netapp.elementsw                         21.7.0 
netapp.ontap                             22.11.0
netapp.storagegrid                       21.12.0
netapp.um_info                           21.8.1 
netapp_eseries.santricity                1.4.0  
netbox.netbox                            3.18.0 
ngine_io.cloudstack                      2.3.0  
ngine_io.exoscale                        1.1.0  
openstack.cloud                          2.2.0  
openvswitch.openvswitch                  2.1.1  
ovirt.ovirt                              3.2.0  
purestorage.flasharray                   1.28.0 
purestorage.flashblade                   1.17.0 
purestorage.fusion                       1.6.1  
sensu.sensu_go                           1.14.0 
splunk.es                                2.1.2  
t_systems_mms.icinga_director            2.0.1  
telekom_mms.icinga_director              1.35.0 
theforeman.foreman                       3.15.0 
vmware.vmware_rest                       2.3.1  
vultr.cloud                              1.12.1 
vyos.vyos                                4.1.0  
wti.remote                               1.0.5  

AWS SDK versions

$ pip show boto boto3 botocore
WARNING: Package(s) not found: boto
Name: boto3
Version: 1.34.99
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: <redacted>/ansible/.venv/lib/python3.12/site-packages
Requires: botocore, jmespath, s3transfer
Required-by: 
---
Name: botocore
Version: 1.34.99
Summary: Low-level, data-driven core of boto 3.
Home-page: https://github.com/boto/botocore
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: <redacted>/ansible/.venv/lib/python3.12/site-packages
Requires: jmespath, python-dateutil, urllib3
Required-by: boto3, s3transfer

Configuration

$ ansible-config dump --only-changed
CONFIG_FILE() = <redacted>/ansible/ansible.cfg
DEFAULT_INVENTORY_PLUGIN_PATH(<redacted>/ansible/ansible.cfg) = ['<redacted>/ansible/plugins/inventory']
DUPLICATE_YAML_DICT_KEY(<redacted>/ansible/ansible.cfg) = ignore
INVENTORY_IGNORE_EXTS(<redacted>/ansible/ansible.cfg) = ["{{(REJECT_EXTS + ('.orig'", '.cfg', "'.retry'))}}"]
INVENTORY_UNPARSED_IS_FAILED(<redacted>/ansible/ansible.cfg) = True

OS / Environment

N/A

Steps to Reproduce

- amazon.aws.s3_object:
    bucket: bucket-wanting-big-file
    mode: copy
    copy_src:
      bucket: bucket-with-big-file

Expected Results

Expected any objects over 5 GiB to be copied to the destination bucket in an idempotent manner.

Actual Results

Task failure, resulting in the traceback:

The full traceback is:
Traceback (most recent call last):
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1320, in copy_object_to_bucket
    s3.copy_object(aws_retry=True, **params)
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/module_utils/retries.py", line 105, in deciding_wrapper
    return retrying_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/module_utils/cloud.py", line 119, in _retry_wrapper
    return _retry_func(
           ^^^^^^^^^^^^
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/module_utils/cloud.py", line 68, in _retry_func
    return func()
           ^^^^^^
  File "<redacted>/ansible/.venv/lib/python3.12/site-packages/botocore/client.py", line 565, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<redacted>/ansible/.venv/lib/python3.12/site-packages/botocore/client.py", line 1021, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidRequest) when calling the CopyObject operation: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1579, in main
    func(module, s3, s3_v4, s3_object_params)
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1354, in s3_object_do_copy
    updated, result = copy_object_to_bucket(
                      ^^^^^^^^^^^^^^^^^^^^^^
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1331, in copy_object_to_bucket
    raise S3ObjectFailure(
S3ObjectFailure: Failed while copying object 7G.bin from bucket None.
fatal: [staging]: FAILED! => {
    "boto3_version": "1.34.99",
    "botocore_version": "1.34.99",
    "changed": false,
    "error": {
        "code": "InvalidRequest",
        "message": "The specified copy source is larger than the maximum allowable size for a copy source: 5368709120"
    },
    "invocation": {
        "module_args": {
            "access_key": "<redacted>",
            "aws_ca_bundle": null,
            "aws_config": null,
            "bucket": "<redacted>",
            "ceph": false,
            "content": null,
            "content_base64": null,
            "copy_src": {
                "bucket": "<redacted>",
                "object": null,
                "prefix": "",
                "version_id": null
            },
            "debug_botocore_endpoint_logs": false,
            "dest": null,
            "dualstack": false,
            "encrypt": true,
            "encryption_kms_key_id": null,
            "encryption_mode": "AES256",
            "endpoint_url": null,
            "expiry": 600,
            "headers": null,
            "ignore_nonexistent_bucket": false,
            "marker": "",
            "max_keys": 1000,
            "metadata": null,
            "mode": "copy",
            "object": null,
            "overwrite": "different",
            "permission": [],
            "prefix": "",
            "profile": null,
            "purge_tags": true,
            "region": null,
            "retries": 0,
            "secret_key": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
            "session_token": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
            "sig_v4": true,
            "src": null,
            "tags": null,
            "validate_bucket_name": true,
            "validate_certs": true,
            "version": null
        }
    },
    "msg": "Failed while copying object 7G.bin from bucket None.: An error occurred (InvalidRequest) when calling the CopyObject operation: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120",
    "response_metadata": {
        "host_id": "<redacted>",
        "http_headers": {
            "connection": "close",
            "content-type": "application/xml",
            "date": "Wed, 29 May 2024 12:59:34 GMT",
            "server": "AmazonS3",
            "transfer-encoding": "chunked",
            "x-amz-id-2": "<redacted>",
            "x-amz-request-id": "<redacted>"
        },
        "http_status_code": 400,
        "request_id": "<redacted>",
        "retry_attempts": 0
    }
}
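The limit quoted in the error, 5368709120 bytes, is exactly 5 GiB (5 × 1024³). As a rough sketch of how a caller might recognise this specific failure (for example, to decide whether to fall back to a multipart copy), the hypothetical helper below inspects an error dict shaped like the "error" field in the output above. The function name and the regular expression are assumptions based only on the message text shown here.

```python
import re

# 5 GiB in bytes; matches the figure quoted in the error message above.
COPY_OBJECT_LIMIT = 5 * 1024**3


def copy_source_limit(error):
    """Return the size limit (in bytes) quoted in a 'copy source too large' error.

    `error` is a dict with 'code' and 'message' keys, as in the task output above.
    Returns None if the error is not this particular failure.
    """
    if error.get("code") != "InvalidRequest":
        return None
    match = re.search(
        r"maximum allowable size for a copy source: (\d+)",
        error.get("message", ""),
    )
    return int(match.group(1)) if match else None
```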

Code of Conduct

  • I agree to follow the Ansible Code of Conduct
@colin-nolan colin-nolan changed the title s3_object fails to copy in AWS when source is larger than 5GiB s3_object fails to copy in AWS when source is larger than 5GiB May 29, 2024
@gravesm gravesm added bug This issue/PR relates to a bug and removed needs_triage labels Jun 4, 2024
@alinabuzachis alinabuzachis self-assigned this Jun 26, 2024
@colin-nolan
Author

@alinabuzachis many thanks for assigning yourself to this one. I understand that you undoubtedly have a lot to do, but I wondered if you could give an indication of whether/when this might sit on your roadmap?

Ideally, we would push a fix up from our side, but we're not currently in a great position to do this. I'm trying to determine whether we should invest in a temporary workaround, or just put up with manually syncing some of our larger data until a fix is in place.

Thanks again.
