2 changes: 1 addition & 1 deletion benchmarks/benchmarkUtil.py
@@ -57,7 +57,7 @@ def upload_config_to_gs(self, scenario):
experiment_name,
scenario.name,
self.clusterName)
command = "gsutil cp {} {} ".format(scenario.config_file_name,
command = "gcloud storage cp {} {} ".format(scenario.config_file_name,
scenario_destination_bucket_path)
Comment on lines +60 to 61
security-high

The command string is constructed using .format() with variables scenario.config_file_name and scenario_destination_bucket_path, which are derived from YAML configuration files. Since this command is executed via execute_shell (which uses subprocess.Popen(cmd, shell=True)), it is vulnerable to command injection if the YAML content or scenario names contain shell metacharacters. It is recommended to avoid shell=True and pass arguments as a list.
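A minimal sketch of the suggested remediation (helper names are hypothetical; the real fix would change `execute_shell` itself): build the argv as a list and run it with `subprocess.run`, so no shell ever parses the YAML-derived values.

```python
import subprocess
from typing import List

def build_copy_argv(src: str, dst: str) -> List[str]:
    # Each value becomes a single argv element. With no shell involved,
    # a value like "x; rm -rf /" stays a literal (invalid) object name
    # instead of becoming a second command.
    return ["gcloud", "storage", "cp", src, dst]

def upload_config_to_gs_safe(config_file_name: str, bucket_path: str) -> None:
    # shell=False is the default when argv is a list.
    subprocess.run(build_copy_argv(config_file_name, bucket_path), check=True)
```
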

execute_shell(command)

2 changes: 1 addition & 1 deletion benchmarks/trigger_bigbench_benchmark.py
@@ -48,7 +48,7 @@ def upload_results():
for file in os.listdir('/Big-Data-Benchmark-for-Big-Bench/logs/'):
if file.__contains__(".csv") or file.__contains__(".zip"):
output_path = "{}/{}/{}/".format(sys.argv[1], sys.argv[2], cluster_name)
command = "gsutil cp /Big-Data-Benchmark-for-Big-Bench/logs/{} {}{}" \
command = "gcloud storage cp /Big-Data-Benchmark-for-Big-Bench/logs/{} {}{}" \
.format(file, output_path, file)
Comment on lines +51 to 52
security-high

The command string is constructed using .format() with file and output_path. output_path is derived from sys.argv[1] and sys.argv[2], which are untrusted command-line arguments. Since the command is executed via execute_shell (which uses subprocess.Popen(cmd, shell=True)), an attacker can inject arbitrary shell commands by providing malicious command-line arguments. It is recommended to avoid shell=True and pass arguments as a list.
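One possible shape of the fix (function and check are illustrative, not from the PR): validate the untrusted destination up front and pass the command as a list, so `sys.argv`-derived values cannot grow into extra shell commands.

```python
import subprocess
from typing import List

def build_upload_argv(file_name: str, output_path: str) -> List[str]:
    # Hypothetical sanity check on the attacker-controllable destination.
    if not output_path.startswith("gs://"):
        raise ValueError(f"expected a gs:// destination, got {output_path!r}")
    src = "/Big-Data-Benchmark-for-Big-Bench/logs/" + file_name
    # List argv + shell=False: no metacharacter is ever interpreted.
    return ["gcloud", "storage", "cp", src, output_path + file_name]

def upload_result_file(file_name: str, output_path: str) -> None:
    subprocess.run(build_upload_argv(file_name, output_path), check=True)
```
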

execute_shell(command)

2 changes: 1 addition & 1 deletion benchmarks/trigger_hibench_benchmark.py
@@ -47,7 +47,7 @@ def upload_results():
attributes/dataproc-cluster-name") \
.read()
output_path = "{}/{}/{}/hibench.report".format(sys.argv[1], sys.argv[2], cluster_name)
cmd = "gsutil cp /HiBench/report/hibench.report {}".format(output_path)
cmd = "gcloud storage cp /HiBench/report/hibench.report {}".format(output_path)
security-high

The command string is constructed using .format() with output_path, which is derived from sys.argv[1] and sys.argv[2]. Since the command is executed via execute_shell (which uses subprocess.Popen(cmd, shell=True)), an attacker can inject arbitrary shell commands by providing malicious command-line arguments. It is recommended to avoid shell=True and pass arguments as a list.
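If `execute_shell`'s `shell=True` signature cannot be changed immediately, a narrower stopgap (sketched here with an illustrative helper) is to neutralize the untrusted path with `shlex.quote` before interpolating it into the command string; the list-argv approach above remains the stronger fix.

```python
import shlex

def build_hibench_upload_cmd(output_path: str) -> str:
    # shlex.quote wraps values containing shell metacharacters in single
    # quotes, so "; rm -rf /" is passed to gcloud as literal data.
    return "gcloud storage cp /HiBench/report/hibench.report " + shlex.quote(output_path)
```
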

execute_shell(cmd)


4 changes: 2 additions & 2 deletions gcloud/bin/audit-dpgce-create
@@ -108,8 +108,8 @@ check_exists "Main Subnet ${SUBNET}" "gcloud compute networks subnets describe '
check_exists "Service Account ${GSA}" "gcloud iam service-accounts describe '${GSA}' --project='${PROJECT_ID}'"

# 4. GCS Buckets
check_exists "GCS Staging Bucket gs://${BUCKET}" "gsutil ls -b 'gs://${BUCKET}'"
check_exists "GCS Temp Bucket gs://${TEMP_BUCKET}" "gsutil ls -b 'gs://${TEMP_BUCKET}'"
check_exists "GCS Staging Bucket gs://${BUCKET}" "gcloud storage ls --buckets 'gs://${BUCKET}'"
check_exists "GCS Temp Bucket gs://${TEMP_BUCKET}" "gcloud storage ls --buckets 'gs://${TEMP_BUCKET}'"
Comment on lines +111 to +112
security-high

The script calls check_exists with a command string that includes ${BUCKET} and ${TEMP_BUCKET}. These variables are sourced from env.json and are not sanitized. The check_exists function (defined in lib/script-utils.sh) uses eval to execute the command string. An attacker who can control the values in env.json can inject arbitrary shell commands by escaping the single quotes in the command string. It is recommended to avoid using eval and to ensure variables are properly quoted and validated.


# 5. Cloud Router
check_exists "Cloud Router ${ROUTER_NAME}" "gcloud compute routers describe '${ROUTER_NAME}' --region='${REGION}' --project='${PROJECT_ID}'"
4 changes: 2 additions & 2 deletions gcloud/bin/audit-dpgce-destroy
@@ -63,8 +63,8 @@ check_resource "Main Subnet" \
check_resource_exact "VPC Network ${NETWORK}" "gcloud compute networks describe \"${NETWORK}\" --project=\"${PROJECT_ID}\""

# 8. GCS Buckets (Optional without --force)
check_resource_exact "GCS Staging Bucket gs://${BUCKET}" "gsutil ls -b 'gs://${BUCKET}'" true
check_resource_exact "GCS Temp Bucket gs://${TEMP_BUCKET}" "gsutil ls -b 'gs://${TEMP_BUCKET}'" true
check_resource_exact "GCS Staging Bucket gs://${BUCKET}" "gcloud storage ls --buckets 'gs://${BUCKET}'" true
check_resource_exact "GCS Temp Bucket gs://${TEMP_BUCKET}" "gcloud storage ls --buckets 'gs://${TEMP_BUCKET}'" true
Comment on lines +66 to +67
security-high

The script calls check_resource_exact with command strings containing ${BUCKET} and ${TEMP_BUCKET}. check_resource_exact uses eval (defined in lib/script-utils.sh), leading to potential command injection if the variables in env.json contain shell metacharacters. It is recommended to avoid using eval and to ensure variables are properly quoted and validated.


echo -e "\nAudit complete."
echo -e "[${YELLOW}Pass*${NC}] indicates the resource was not found (which is expected after destroy)."
4 changes: 2 additions & 2 deletions gcloud/bin/audit-private-create
@@ -38,8 +38,8 @@ check_exists "SWP Subnet ${SWP_SUBNET}" "gcloud compute networks subnets describ
check_exists "Service Account ${GSA}" "gcloud iam service-accounts describe '${GSA}' --project='${PROJECT_ID}'"

# 4. GCS Buckets
check_exists "GCS Staging Bucket gs://${BUCKET}" "gsutil ls -b 'gs://${BUCKET}'"
check_exists "GCS Temp Bucket gs://${TEMP_BUCKET}" "gsutil ls -b 'gs://${TEMP_BUCKET}'"
check_exists "GCS Staging Bucket gs://${BUCKET}" "gcloud storage ls --buckets 'gs://${BUCKET}'"
check_exists "GCS Temp Bucket gs://${TEMP_BUCKET}" "gcloud storage ls --buckets 'gs://${TEMP_BUCKET}'"
Comment on lines +41 to +42
security-high

The script calls check_exists with command strings containing ${BUCKET} and ${TEMP_BUCKET}. check_exists uses eval (defined in lib/script-utils.sh), leading to potential command injection if the variables in env.json contain shell metacharacters. It is recommended to avoid using eval and to ensure variables are properly quoted and validated.


# 5. SWP Certificate Components
SUFFIX=${RESOURCE_SUFFIX}
4 changes: 2 additions & 2 deletions gcloud/bin/audit-private-destroy
@@ -131,8 +131,8 @@ check_resource "Main Subnet" \
check_resource_exact "VPC Network ${NETWORK}" "gcloud compute networks describe \"${NETWORK}\" --project=\"${PROJECT_ID}\""

# 14. GCS Buckets (Optional without --force)
check_resource_exact "GCS Staging Bucket gs://${BUCKET}" "gsutil ls -b 'gs://${BUCKET}'" true
check_resource_exact "GCS Temp Bucket gs://${TEMP_BUCKET}" "gsutil ls -b 'gs://${TEMP_BUCKET}'" true
check_resource_exact "GCS Staging Bucket gs://${BUCKET}" "gcloud storage ls --buckets 'gs://${BUCKET}'" true
check_resource_exact "GCS Temp Bucket gs://${TEMP_BUCKET}" "gcloud storage ls --buckets 'gs://${TEMP_BUCKET}'" true
Comment on lines +134 to +135
security-high

The script calls check_resource_exact with command strings containing ${BUCKET} and ${TEMP_BUCKET}. check_resource_exact uses eval (defined in lib/script-utils.sh), leading to potential command injection if the variables in env.json contain shell metacharacters. It is recommended to avoid using eval and to ensure variables are properly quoted and validated.


echo -e "\nAudit complete."
echo -e "[${YELLOW}Pass*${NC}] indicates the resource was not found (which is expected after destroy)."
7 changes: 3 additions & 4 deletions gcloud/bin/recreate-dpgce
@@ -21,9 +21,9 @@ source lib/gcp/iam.sh

# Copy latest initialization action scripts
#echo -n "copying actions to gcs bucket..."
#gsutil -m cp \
# -L action-update.log \
# -r init/* gs://${BUCKET}/dataproc-initialization-actions
#gcloud storage cp \
# --manifest-path action-update.log \
# --recursive init/* gs://${BUCKET}/dataproc-initialization-actions
#if [[ $? == 0 ]]; then
# echo "done"
#else
@@ -39,4 +39,3 @@ create_dpgce_cluster
echo "========================================"
echo "General Purpose DPGCE Cluster re-created"
echo "========================================"

3 changes: 1 addition & 2 deletions gcloud/bin/recreate-dpgke
@@ -19,7 +19,7 @@ source lib/env.sh
source lib/gke.sh # Assuming gke functions are here

# Copy latest initialization action scripts
gsutil -m cp -r init/* gs://${BUCKET}/dataproc-initialization-actions
gcloud storage cp --recursive init/* gs://${BUCKET}/dataproc-initialization-actions
security-high

The variable ${BUCKET} is unquoted in the command execution. If ${BUCKET} contains shell metacharacters (e.g., ;), it can lead to command injection. It is recommended to always quote variables in shell scripts, e.g., gs://"${BUCKET}"/.


# re-create normal dataproc cluster
delete_gke_cluster
@@ -28,4 +28,3 @@ create_gke_cluster
echo "========================"
echo "DPGKE Cluster re-created"
echo "========================"

2 changes: 1 addition & 1 deletion gcloud/lib/dataproc/jobs.sh
@@ -29,7 +29,7 @@ function diagnose () {
DIAG_URL=$(cat "${REPRO_TMPDIR}/${log_file}" | perl -ne 'print if m{^gs://.*/diagnostic.tar.gz\s*$}')
if [[ -n "${DIAG_URL}" ]]; then
print_status " Downloading ${DIAG_URL}..."
if run_gcloud "download_diagnose.log" gsutil cp -q "${DIAG_URL}" "${REPRO_TMPDIR}/"; then
if run_gcloud "download_diagnose.log" gcloud storage cp "${DIAG_URL}" "${REPRO_TMPDIR}/"; then
report_result "Pass"
local diag_file="${REPRO_TMPDIR}/$(basename ${DIAG_URL})"
print_status " Running drproc on ${diag_file}..."
20 changes: 10 additions & 10 deletions gcloud/lib/gcp/gcs.sh
@@ -12,8 +12,8 @@ function create_bucket () {

print_status "Creating GCS Staging Bucket gs://${BUCKET}..."
local log_file="create_bucket_${BUCKET}.log"
if ! gsutil ls -b "gs://${BUCKET}" > /dev/null 2>&1 ; then
if run_gcloud "${log_file}" gsutil mb -l ${REGION} gs://${BUCKET}; then
if ! gcloud storage ls --buckets "gs://${BUCKET}" > /dev/null 2>&1 ; then
if run_gcloud "${log_file}" gcloud storage buckets create --location ${REGION} gs://${BUCKET}; then
report_result "Created"
else
report_result "Fail"
@@ -24,16 +24,16 @@
fi
# Grant SA permissions on BUCKET
print_status " Granting Storage Admin on gs://${BUCKET}..."
if run_gcloud "${log_file}" gsutil iam ch "serviceAccount:${GSA}:roles/storage.admin" "gs://${BUCKET}"; then
if run_gcloud "${log_file}" gcloud storage buckets add-iam-policy-binding "gs://${BUCKET}" --member="serviceAccount:${GSA}" --role="roles/storage.admin"; then
report_result "Pass"
else
report_result "Fail"
fi

print_status "Creating GCS Temp Bucket gs://${TEMP_BUCKET}..."
local temp_log_file="create_bucket_${TEMP_BUCKET}.log"
if ! gsutil ls -b "gs://${TEMP_BUCKET}" > /dev/null 2>&1 ; then
if run_gcloud "${temp_log_file}" gsutil mb -l ${REGION} gs://${TEMP_BUCKET}; then
if ! gcloud storage ls --buckets "gs://${TEMP_BUCKET}" > /dev/null 2>&1 ; then
if run_gcloud "${temp_log_file}" gcloud storage buckets create --location ${REGION} gs://${TEMP_BUCKET}; then
report_result "Created"
else
report_result "Fail"
@@ -44,7 +44,7 @@
fi
# Grant SA permissions on TEMP_BUCKET
print_status " Granting Storage Admin on gs://${TEMP_BUCKET}..."
if run_gcloud "${temp_log_file}" gsutil iam ch "serviceAccount:${GSA}:roles/storage.admin" "gs://${TEMP_BUCKET}"; then
if run_gcloud "${temp_log_file}" gcloud storage buckets add-iam-policy-binding "gs://${TEMP_BUCKET}" --member="serviceAccount:${GSA}" --role="roles/storage.admin"; then
report_result "Pass"
else
report_result "Fail"
@@ -54,7 +54,7 @@
if [[ -d init ]] ; then
print_status "Copying init scripts to ${INIT_ACTIONS_ROOT}..."
local cp_log="copy_init_scripts.log"
if run_gcloud "${cp_log}" gsutil -m cp -r "init/*" "${INIT_ACTIONS_ROOT}/"; then
if run_gcloud "${cp_log}" gcloud storage cp --recursive "init/*" "${INIT_ACTIONS_ROOT}/"; then
report_result "Pass"
else
report_result "Fail"
@@ -66,8 +66,8 @@
function delete_bucket () {
print_status "Deleting GCS Bucket gs://${BUCKET}..."
local log_file="delete_bucket_${BUCKET}.log"
if gsutil ls -b "gs://${BUCKET}" > /dev/null 2>&1; then
if run_gcloud "${log_file}" gsutil -m rm -r "gs://${BUCKET}"; then
if gcloud storage ls --buckets "gs://${BUCKET}" > /dev/null 2>&1; then
if run_gcloud "${log_file}" gcloud storage rm --recursive "gs://${BUCKET}"; then
report_result "Deleted"
remove_sentinel "create_bucket" "done"
else
@@ -76,5 +76,5 @@ function delete_bucket () {
else
report_result "Not Found"
fi
# gsutil -m rm -r "gs://${TEMP_BUCKET}" > /dev/null 2>&1 || true # huge cache here, not so great to lose it
# gcloud storage rm --recursive "gs://${TEMP_BUCKET}" > /dev/null 2>&1 || true # huge cache here, not so great to lose it
}
10 changes: 5 additions & 5 deletions gcloud/lib/shared-functions.sh
@@ -1029,12 +1029,12 @@ source lib/database-functions.sh
source lib/net-functions.sh

function create_bucket () {
if gsutil ls -b "gs://${BUCKET}" ; then
if gcloud storage ls --buckets "gs://${BUCKET}" ; then
echo "bucket already exists, skipping creation."
return
fi
set -x
gsutil mb -l ${REGION} gs://${BUCKET}
gcloud storage buckets create --location=${REGION} gs://${BUCKET}
set +x

echo "==================="
@@ -1045,7 +1045,7 @@ function create_bucket () {
if [ -d init ]
then
set -x
gsutil -m cp -r init/* gs://${BUCKET}/dataproc-initialization-actions
gcloud storage cp --recursive init/* gs://${BUCKET}/dataproc-initialization-actions
set +x
fi

@@ -1057,7 +1057,7 @@

function delete_bucket () {
set -x
gsutil -m rm -r gs://${BUCKET}
gcloud storage rm --recursive gs://${BUCKET}
set +x

echo "bucket removed"
@@ -1144,7 +1144,7 @@ function diagnose {

DIAG_URL=$(echo $DIAG_OUT | perl -ne 'print if m{^gs://.*/diagnostic.tar.gz\s*$}')
mkdir -p tmp
gsutil cp -q ${DIAG_URL} tmp/
gcloud storage cp ${DIAG_URL} tmp/

if [[ ! -f venv/${CLUSTER_NAME}/pyvenv.cfg ]]; then
mkdir -p venv/
2 changes: 1 addition & 1 deletion spark-tensorflow/gcloud-tests/test-tf-tsv.sh
@@ -78,7 +78,7 @@ fi
MODEL_NAME=test_model
MODEL_VERSION=v$TEST_TIME

ORIGIN=$(gsutil ls "$JOB_DIR/**/saved_model.pb" | sed 's/\(.\)saved_model.pb/\1/g')
ORIGIN=$(gcloud storage ls "$JOB_DIR/**/saved_model.pb" | sed 's/\(.\)saved_model.pb/\1/g')

echo "Training succeeded. Creating model from saved model at $ORIGIN ..."

6 changes: 3 additions & 3 deletions spark-translate/README.md
@@ -37,12 +37,12 @@ Google's Translation API and running on Cloud Dataproc.

4. Create a bucket:
```
gsutil mb gs://$PROJECT-bucket
gcloud storage buckets create gs://$PROJECT-bucket
```

5. Upload `words.txt` to the bucket:
```
gsutil cp ../words.txt gs://$PROJECT-bucket
gcloud storage cp ../words.txt gs://$PROJECT-bucket
```
The file `words.txt` contains the following:
```
@@ -69,7 +69,7 @@

8. Verify that the words have been translated:
```
gsutil cat gs://$PROJECT-bucket/translated-fr/part-*
gcloud storage cat gs://$PROJECT-bucket/translated-fr/part-*
```
The output is:
```