Google Cloud Storage upload with s3cmd

September 7, 2022 1 By addshore

I recently had to upload a large (1TB) file from a Wikimedia server into a Google Cloud Storage bucket. gsutil was not available to me, but s3cmd was. So here is a little how to post for uploading to a Google Cloud Storage bucket using the S3 API and s3cmd.

S3cmd is a free command line tool and client for uploading, retrieving and managing data in Amazon S3 and other cloud storage service providers that use the S3 protocol, such as Google Cloud Storage or DreamHost DreamObjects.

https://s3tools.org/s3cmd

s3cmd was already installed on my system, but if you want to get it, see the download page.

Google Cloud secret & access keys

I took guidance from a post from the Bitmovin docs with some modifications.

You need to navigate to your Bucket > Settings > Interoperability.

The easiest path here is to hit the “Create a Key” button for your user account. This will immediately generate you an access key and secret that you can use below.

Service Account (optional)

You can also create a dedicated service account for s3cmd to use which I did in my case.

Click”Create key for service account”.

You can then pick an existing service account, or create a new one, your keys will then be shown.

You need to ensure that the service account has access to perform the needed operations within the bucket. So I added Storage Legacy Bucket Owner and Storage Legacy Object Owner as this bucket is dedicated for this porpose.

If you want to access a “requester pays” bucket you’ll also need to set something like Project Billing Manager on to the service account. You can read more about the fine-grained permissions on the Google Cloud docs.

Basic usage

Without creating any sort of configuration file you can pass most arguments via command line options.

s3cmd --disable-multipart --host=storage.googleapis.com --host-bucket="%(bucket).storage.googleapis.com" --access_key XXX1 --secret_key XXX2 ./large.txt put s3://addshore-bucket/large.txt
Code language: JavaScript (javascript)

At least one post that I read said that --disable-multipart was required, hence it is included in the examples as it worked fine for me. [1]

Using a proxy

The only way that I could figure out how to use a HTTP/HTTPS proxy with s3cmd was via the config file rather than only command line arguments.

You can use s3cmd --configure which will guide you through the creation of a rather extensive ~/.s3cfg. You can find a small walkthrough of the --configure user experience in this post by Nicky Mathew. Or you can use the minimum config I have provided below.

[default] access_key = XXX1 access_token = host_base = storage.googleapis.com host_bucket = %(bucket).storage.googleapis.com proxy_host = webproxy.eqiad.wmnet proxy_port = 8080 public_url_use_https = False secret_key = XXX2
Code language: PHP (php)

From this point, the command is quite simple, as most details are read from the config file.

s3cmd --disable-multipart put ./large.txt s3://addshore-bucket/large.txt
Code language: JavaScript (javascript)