Overview
In this lab, you will use gsutil to create a bucket and perform operations on objects. gsutil is a Python application that lets you access Cloud Storage from the command line. The gsutil tool has commands such as mb and cp to perform operations. Each command has a set of options that are used to customize settings further.
What you'll learn to do
Create a bucket
Copy files from a local folder to a bucket
Synchronize the contents of the local folder with the contents of the bucket
Change access control permissions on objects
Setup and requirements
Labs are timed and cannot be paused. The timer starts when you click Start Lab.
The included cloud terminal is preconfigured with the gcloud SDK.
Use the terminal to execute commands and then click Check my progress to verify your work.
Get the sample code and set variables
- In the cloud terminal session, execute the following command to download sample data for this lab from a git repository:
git clone https://github.com/GoogleCloudPlatform/training-data-analyst
- Change to the blogs directory:
cd training-data-analyst/blogs
- Set some environment variables:
PROJECT_ID=`gcloud config get-value project`
BUCKET=${PROJECT_ID}-bucket
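- (Optional) Before continuing, you can confirm both variables are set; this small sketch just prints their values:
# Both values should be non-empty; re-run the commands above if not
echo ${PROJECT_ID} ${BUCKET}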
Task 1. Create a bucket
- Run the following command to create a bucket with multi-regional storage class:
gsutil mb -c multi_regional gs://${BUCKET}
The gsutil mb command is used to create a new Google Cloud Storage bucket. The -c flag allows you to specify the storage class for the bucket, and the multi_regional storage class is designed for data that needs to be available in multiple regions (ideal for high availability and low-latency access).
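If you needed a bucket in a single region instead, mb also accepts a -l flag to specify the location. A minimal sketch, assuming a hypothetical bucket name and the us-central1 region (you don't need to run this for the lab):
# Create a Nearline-class bucket in a single region (bucket name is illustrative)
gsutil mb -c nearline -l us-central1 gs://${PROJECT_ID}-archive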
Click Check my progress to verify the objective.
Create a bucket
Task 2. Upload objects to your bucket
- Run the following to copy the endpointslambda folder to your bucket:
gsutil -m cp -r endpointslambda gs://${BUCKET}
The gsutil -m cp -r command copies files or directories to a Google Cloud Storage bucket. When you have a large number of files to transfer, the -m option performs a parallel (multi-threaded/multi-processing) copy for faster performance, and the -r option makes gsutil recurse through directories.
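For a single file, neither flag is needed. As a small example using a file that exists in the cloned repository:
# Copy one file; no -m or -r required for a single object
gsutil cp endpointslambda/README.md gs://${BUCKET}/endpointslambda/README.md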
Click Check my progress to verify the objective.
Upload objects to your bucket
Task 3. List objects
- To list objects in your bucket, execute the following command:
gsutil ls gs://${BUCKET}/*
This command lists the objects in your bucket; the trailing /* wildcard expands one level into each top-level folder, so the files you just copied under endpointslambda appear in the listing.
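To also see object sizes and timestamps, you can add the -l (long listing) option, for example:
# Long listing: one line per object with size and creation time
gsutil ls -l gs://${BUCKET}/endpointslambda/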
Task 4. Sync changes with bucket
- Use the mv command to rename a file and the rm command to delete a file:
mv endpointslambda/Apache2_0License.txt endpointslambda/old.txt
rm endpointslambda/aeflex-endpoints/app.yaml
- Now synchronize the local changes with the bucket:
gsutil -m rsync -d -r endpointslambda gs://${BUCKET}/endpointslambda
In this command, the -d option deletes files from the target if they're missing in the source (in this case, it deletes app.yaml from the bucket). The -r option runs the command recursively on directories.
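Because -d deletes objects from the target, it can be worth previewing a sync first. gsutil rsync supports a -n dry-run flag that reports what would be copied or deleted without changing anything; a minimal sketch:
# Dry run: report planned operations only, make no changes
gsutil -m rsync -n -d -r endpointslambda gs://${BUCKET}/endpointslambda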
- To verify that the bucket is now in sync with your local changes, list the files in the bucket again:
gsutil ls gs://${BUCKET}/*
Click Check my progress to verify the objective.
Sync changes with bucket
Task 5. Make objects public
- To allow public access to all files in your bucket, including those under the endpointslambda folder, execute the following command:
gsutil -m acl set -R -a public-read gs://${BUCKET}
The above command is used to set access control lists (ACLs) on Cloud Storage buckets or objects. This makes all the objects in a bucket publicly readable.
The -m flag enables parallel processing, which means multiple operations (like setting ACLs on many files) will be executed simultaneously, speeding up the process. The -R flag applies the ACL recursively to all objects inside the bucket. Without this, it would only apply to the bucket itself, not its contents.
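If you later need to revoke this access, one option is to reset the objects to gsutil's private canned ACL; a sketch (don't run it during the lab if you still need the objects public):
# Remove public access by reapplying the private canned ACL to all objects
gsutil -m acl set -R private gs://${BUCKET}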
- To confirm files are viewable by the public, open the following link in a new incognito or private browser window, replacing <your-bucket-name> with the full name of your bucket (not the environment variable):
http://storage.googleapis.com/<your-bucket-name>/endpointslambda/old.txt
This URL uses the Cloud Storage API link to view the object without authentication. Learn more about accessing public data from the Accessing public data documentation.
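You can also verify from the terminal itself. Assuming the BUCKET variable is still set in your session, this fetches the object anonymously over the public endpoint:
# Anonymous download of a publicly readable object
curl http://storage.googleapis.com/${BUCKET}/endpointslambda/old.txt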
Task 6. Copy with different storage class
- Next, copy a file with Nearline storage class instead of the bucket's default Multi-regional storage class:
gsutil cp -s nearline ghcn/ghcn_on_bq.ipynb gs://${BUCKET}
The gsutil cp command copies files from one location to another, either within Cloud Storage or from a local file system to Cloud Storage. The -s flag specifies the storage class for the file being uploaded.
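For objects already in a bucket, gsutil also has a rewrite command that can change the storage class in place. A sketch for reference only, since changing this object's class would affect the check in Task 7:
# Rewrite an existing object to a different storage class (no re-upload needed)
gsutil rewrite -s coldline gs://${BUCKET}/ghcn_on_bq.ipynb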
Click Check my progress to verify the objective.
Copy with different storage class
Task 7. Check storage classes
- Run the following to check the storage classes and view other detailed information about the objects in your bucket:
gsutil ls -Lr gs://${BUCKET} | more
- Press the space key to continue viewing the rest of the command's output.
The output shows that the ghcn_on_bq.ipynb object has NEARLINE storage class while the other objects have MULTI_REGIONAL storage class.
Output:
gs://qwiklabs-xxx-xxxxxxxxxxxxxxxx-bucket/ghcn_on_bq.ipynb:
Creation time: Tue, 13 Aug 2019 20:19:27 GMT
Update time: Tue, 13 Aug 2019 20:19:27 GMT
Storage class: NEARLINE
Content-Length: 980176
Content-Type: application/octet-stream
...
gs://qwiklabs-xxx-xxxxxxxxxxxxxxxx-bucket/endpointslambda/:
gs://qwiklabs-xxx-xxxxxxxxxxxxxxxx-bucket/endpointslambda/README.md:
Creation time: Tue, 13 Aug 2019 20:03:29 GMT
Update time: Tue, 13 Aug 2019 20:15:43 GMT
Storage class: MULTI_REGIONAL
Content-Length: 452
Content-Type: text/markdown
...
- You can press Ctrl + C to exit the pager and return to the command line.
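To inspect a single object rather than paging through the whole bucket, pass its full path to ls -L, for example:
# Detailed metadata for one object, including its storage class
gsutil ls -L gs://${BUCKET}/ghcn_on_bq.ipynb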
Solution of Lab
To complete all of the lab's steps at once, download and source the solution script:
curl -LO https://raw.githubusercontent.com/ePlus-DEV/storage/refs/heads/main/labs/manage-storage-configuration-using-gsutil/lab.sh
source lab.sh
