zhenwei-dev-api is the AWS serverless API layer for zhenwei.dev.
The main website repository, zhenwei-dev-site, owns the React/Vite frontend and the static hosting infrastructure such as S3, CloudFront, Route53, and ACM. This repository owns API Gateway, Lambda services, API-specific IAM, observability, and CI/CD workflows for reusable backend capabilities that support the site and future automation workflows.
API endpoint targets:
- Dev: execute-api endpoint by default (custom domain optional: https://api-dev.zhenwei.dev)
- Prod: https://api.zhenwei.dev
- Keep API application code separate from Terraform infrastructure code.
- Treat every Lambda service as an independently testable application unit.
- Package Lambda artifacts in CI before Terraform deployment.
- Deploy immutable Lambda artifacts through Terraform.
- Support separate `dev` and `prod` environments.
- Use least-privilege IAM for every service.
- Avoid committing private files, tokens, API keys, or secrets.
- Provide reusable modules and scripts for future APIs.
`get-presigned-url` generates a short-lived S3 presigned URL for approved private files.
Initial use case: the portfolio site's CV download flow should call this API instead of serving the PDF directly from the frontend repository. The CV PDF remains in the private S3 bucket that is provisioned outside this repo, currently planned as zhenwei-private-bucket. The endpoint stays reusable for future private files by using an approved namespace such as private-downloads/ rather than a resume-specific bucket path.
Expected flow:
- Browser calls `POST /get-presigned-url` with an approved `objectKey` and optional `versionId`.
- Lambda validates the request and policy rules.
- Lambda generates a short-lived S3 presigned URL.
- Lambda returns `200` JSON with the URL, file name, and expiry.
- Browser starts the download using the returned URL.
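The flow above could be sketched as a small handler. This is a minimal illustration, not the repository's implementation: the response field names (`url`, `fileName`, `expiresIn`), the `handle` function, and the injected `sign` callable are all assumptions made for the example, and the event is assumed to carry a plain (not base64-encoded) JSON body.

```python
import json
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical signer: (object_key, version_id) -> (presigned_url, ttl_seconds).
# It is expected to raise PermissionError for keys that are not approved.
Signer = Callable[[str, Optional[str]], Tuple[str, int]]


def _response(status: int, body: Dict[str, Any]) -> Dict[str, Any]:
    """Build an API Gateway proxy-style response with a JSON body."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


def handle(event: Dict[str, Any], sign: Signer) -> Dict[str, Any]:
    """Validate the request, sign the object, and return 200 JSON."""
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "invalid JSON body"})
    if not isinstance(payload, dict):
        return _response(400, {"error": "body must be a JSON object"})

    object_key = payload.get("objectKey")
    version_id = payload.get("versionId")
    if not isinstance(object_key, str) or not object_key:
        return _response(400, {"error": "objectKey is required"})
    if version_id is not None and not isinstance(version_id, str):
        return _response(400, {"error": "versionId must be a string"})

    try:
        url, ttl = sign(object_key, version_id)
    except PermissionError:
        return _response(403, {"error": "objectKey is not approved"})

    # Field names below are assumed for illustration.
    file_name = object_key.rsplit("/", 1)[-1]
    return _response(200, {"url": url, "fileName": file_name, "expiresIn": ttl})
```

Passing the signer in as an argument keeps the handler testable without AWS credentials, which fits the goal of treating each Lambda service as an independently testable unit.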
Security model:
- Do not accept arbitrary bucket names from public callers.
- Accept caller-provided `objectKey` only when it matches exact-key or prefix allow-lists.
- Support optional `versionId` for S3 object versioning; omit it to sign the latest object version.
- Enforce a maximum TTL, for example 300 to 900 seconds.
- Scope the Lambda role to `s3:GetObject` (plus `s3:GetObjectVersion` if `versionId` is supported) only for approved object ARNs or narrow prefixes.
- Use strict CORS for the portfolio site origins.
- Add API Gateway throttling and CloudWatch alarms for abuse and error spikes.
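The allow-list and TTL rules above might look like the following sketch. The constant values, the example key `private-downloads/cv.pdf`, and the helper names are assumptions for illustration. The only AWS call used is boto3's `generate_presigned_url("get_object", Params=..., ExpiresIn=...)`, and the S3 client is passed in so the logic can be exercised with a stub.

```python
from typing import Any, Dict, Optional

# Assumed policy values; real allow-lists would come from configuration.
ALLOWED_KEYS = frozenset({"private-downloads/cv.pdf"})
ALLOWED_PREFIXES = ("private-downloads/",)
DEFAULT_TTL_SECONDS = 300
MAX_TTL_SECONDS = 900


def is_key_allowed(object_key: str) -> bool:
    """Accept only exact-key matches or keys under an approved prefix."""
    # S3 treats keys literally, so rejecting ".." segments is defensive only.
    if not object_key or object_key.startswith("/") or ".." in object_key.split("/"):
        return False
    if object_key in ALLOWED_KEYS:
        return True
    # A bare prefix such as "private-downloads/" is not a downloadable object.
    return any(
        object_key.startswith(prefix) and len(object_key) > len(prefix)
        for prefix in ALLOWED_PREFIXES
    )


def clamp_ttl(requested: Optional[int]) -> int:
    """Fall back to the default TTL and never exceed the maximum."""
    if requested is None:
        return DEFAULT_TTL_SECONDS
    return max(1, min(requested, MAX_TTL_SECONDS))


def presign(
    s3_client: Any,
    bucket: str,
    object_key: str,
    version_id: Optional[str] = None,
    ttl: Optional[int] = None,
) -> str:
    """Return a presigned GET URL; the bucket is server-side config, never caller input."""
    if not is_key_allowed(object_key):
        raise PermissionError(f"object key not approved: {object_key!r}")
    params: Dict[str, str] = {"Bucket": bucket, "Key": object_key}
    if version_id:
        params["VersionId"] = version_id  # omit to sign the latest version
    return s3_client.generate_presigned_url(
        "get_object", Params=params, ExpiresIn=clamp_ttl(ttl)
    )
```

In a deployed function, `s3_client` would typically be `boto3.client("s3")` and `bucket` would be read from configuration rather than from the request.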
`send-notification` sends operational push notifications to the owner's device, likely through Pushover.
Initial use cases:
- Notify when GitHub Actions deployments succeed or fail.
- Notify when CloudFront invalidation completes.
- Notify when future automation workflows need owner attention.
Security model:
- This endpoint should not be publicly callable from browsers.
- Prefer machine-to-machine access from GitHub Actions using GitHub OIDC to assume an AWS IAM role, then invoke Lambda directly or call an IAM-protected API Gateway route.
- Store Pushover credentials in AWS Secrets Manager or encrypted SSM Parameter Store.
- Never store notification provider tokens in GitHub, Terraform variables, or Lambda environment variables as plaintext.
- Validate source, event type, message length, and allowed priority values.
- Apply low throttling limits, structured logging, and alarms for unexpected call volume.
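The validation rule above (source, event type, message length, priority) could be sketched as a single function. Every allow-listed value, the length limit, the payload field names, and the `Notification` type are assumptions for the example; the real limits should be checked against the notification provider.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Assumed policy values for illustration only.
ALLOWED_SOURCES = frozenset({"github-actions", "update-invalidation-status"})
ALLOWED_EVENT_TYPES = frozenset(
    {"deploy.succeeded", "deploy.failed", "invalidation.completed"}
)
ALLOWED_PRIORITIES = frozenset({-1, 0, 1})
MAX_MESSAGE_LENGTH = 1024


@dataclass(frozen=True)
class Notification:
    source: str
    event_type: str
    message: str
    priority: int


def validate_notification(payload: Dict[str, Any]) -> Notification:
    """Return a validated Notification or raise ValueError."""
    source = payload.get("source")
    event_type = payload.get("eventType")
    message = payload.get("message")
    priority = payload.get("priority", 0)

    if not isinstance(source, str) or source not in ALLOWED_SOURCES:
        raise ValueError("source is not allowed")
    if not isinstance(event_type, str) or event_type not in ALLOWED_EVENT_TYPES:
        raise ValueError("eventType is not allowed")
    if not isinstance(message, str) or not message.strip():
        raise ValueError("message is required")
    if len(message) > MAX_MESSAGE_LENGTH:
        raise ValueError("message is too long")
    # type() check rejects booleans, which would otherwise pass as 0 or 1.
    if type(priority) is not int or priority not in ALLOWED_PRIORITIES:
        raise ValueError("priority is not allowed")

    return Notification(source, event_type, message, priority)
```

Validation here is a second line of defense: the IAM-based access control described above decides who may call the endpoint, while this check limits what an authorized caller can send.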
`update-invalidation-status` tracks a CloudFront invalidation until it completes, then sends a notification.
Initial behavior:
- Accept `distributionId` and `invalidationId`.
- Check CloudFront invalidation status every 30 seconds.
- Stop after a bounded timeout, currently planned as up to 5 minutes.
- Send a notification when the invalidation reaches `Completed`.
Implementation phases:
- Phase 1: Lambda with a bounded polling loop.
- Phase 2: Step Functions with Wait and Check states.
Step Functions is preferred long-term because it avoids paying for Lambda idle wait time and gives clearer retries, history, and observability.
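As a rough sketch of the Phase 1 bounded polling loop, assuming the 30-second interval and 5-minute cap mentioned above: the function name and parameters are hypothetical, the CloudFront client is injected, and the only AWS call used is boto3's `get_invalidation(DistributionId=..., Id=...)`, which returns the status under `Invalidation.Status`.

```python
import time
from typing import Any, Callable

POLL_INTERVAL_SECONDS = 30
MAX_WAIT_SECONDS = 300  # planned upper bound of 5 minutes


def wait_for_invalidation(
    cloudfront: Any,
    distribution_id: str,
    invalidation_id: str,
    sleep: Callable[[float], None] = time.sleep,
    interval: int = POLL_INTERVAL_SECONDS,
    max_wait: int = MAX_WAIT_SECONDS,
) -> bool:
    """Return True once the invalidation is Completed, False on timeout."""
    waited = 0
    while True:
        result = cloudfront.get_invalidation(
            DistributionId=distribution_id, Id=invalidation_id
        )
        if result["Invalidation"]["Status"] == "Completed":
            return True
        # Stop before sleeping past the budget so total wait stays bounded.
        if waited + interval > max_wait:
            return False
        sleep(interval)
        waited += interval
```

With this approach the Lambda timeout would need to be set above the polling budget, and the function is billed for the time spent sleeping, which is the cost that the Step Functions phase is meant to remove.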
Security model:
- This endpoint should be treated as an internal automation endpoint, not a public browser API.
- Prefer GitHub OIDC plus IAM authorization for workflow-triggered calls.
- Validate that the requested distribution ID is in an allow-list for this account and environment.
- Scope the Lambda role to `cloudfront:GetInvalidation` only for approved distributions where practical.
- Reuse the notification service internally rather than exposing notification details to callers.
Because this repository is public, endpoint safety must not depend on hiding route names, payload shapes, or implementation details. The design should assume public users can read the code and discover URLs. Protection should come from authentication, authorization, validation, throttling, and least privilege.
Recommended baseline:
| Endpoint | Caller | Recommended access control | Notes |
|---|---|---|---|
| `get-presigned-url` | Portfolio site browser | Public route with strict validation, CORS allow-list, API Gateway throttling, and CloudWatch alarms | Safe only if caller-provided `objectKey` values are constrained to approved exact keys or prefixes. |
| `send-notification` | GitHub Actions and internal services | Private machine-to-machine route using IAM auth through GitHub OIDC, or direct Lambda invoke by an assumed role | Do not rely on an API key alone for this endpoint. |
| `update-invalidation-status` | GitHub Actions deployment workflow | Private machine-to-machine route using IAM auth through GitHub OIDC, or direct Step Functions execution by an assumed role | Validate distribution IDs and cap runtime. |
Optional controls:
- WAF rate-based rules for public routes if baseline throttling is insufficient.
- WAF geo rules if a regional access restriction is intentionally desired.
- JWT or Lambda authorizer for future authenticated user-facing APIs.
- API keys only as a secondary control, not as the main protection for sensitive automation endpoints.
Planned structure:
    zhenwei-dev-api/
      terraform/
        modules/
          http_api/
          lambda_service/
          notification_channel/
          step_function/
        envs/
          shared/
            main.tf
            variables.tf
            outputs.tf
            terraform.tfvars.example
          dev/
            main.tf
            variables.tf
            outputs.tf
            terraform.tfvars.example
          prod/
            main.tf
            variables.tf
            outputs.tf
            terraform.tfvars.example
      services/
        get-presigned-url/
          src/
            handler.py
          tests/
          requirements.txt
          README.md
        send-notification/
          src/
            handler.py
          tests/
          requirements.txt
          README.md
        update-invalidation-status/
          src/
            handler.py
          tests/
          requirements.txt
          README.md
      shared/
        python/
          utils/
            aws_clients.py
            logging.py
            response.py
            validation.py
      scripts/
        package_lambda.sh
        run_tests.sh
      .github/
        workflows/
          ci.yml
          deploy-dev.yml
          deploy-prod.yml
      docs/
      README.md
      AGENTS.md
Layout principles:
- `services/` contains Lambda application code and tests.
- `shared/` contains reusable Python utilities shared by services.
- `terraform/` contains Terraform only.
- `terraform/modules/` contains reusable infrastructure modules.
- `terraform/envs/shared` owns shared resources consumed by env stacks.
- `terraform/envs/dev` and `terraform/envs/prod` contain environment-specific composition and variables.
- `.github/workflows/` contains active GitHub Actions workflow definitions. GitHub workflow files cannot be activated from service subdirectories.
Terraform should manage API infrastructure, not build Lambda packages.
Expected Terraform responsibilities:
- API Gateway HTTP APIs, routes, integrations, stages, and custom domains.
- Lambda functions and versions or aliases.
- Lambda execution roles and least-privilege policies.
- Lambda permissions for API Gateway invocation.
- CloudWatch log groups, metrics, and alarms.
- Route53 records for `api-dev.zhenwei.dev` and `api.zhenwei.dev`.
- Regional ACM certificates for API Gateway custom domains if needed.
- Optional WAF associations if needed later.
- Optional Step Functions for invalidation tracking.
- Optional artifact bucket for Lambda zip packages.
State should be separated by stack. Example backend keys:
- `zhenwei-dev-api/shared/terraform.tfstate`
- `zhenwei-dev-api/dev/terraform.tfstate`
- `zhenwei-dev-api/prod/terraform.tfstate`
Ownership model:
- `shared` owns shared resources such as the artifact bucket and the optional shared GitHub deploy role.
- `dev` and `prod` own environment resources and consume shared outputs via remote state.
- `dev` and `prod` each manage their own service artifact SSM parameter paths.
Environment-specific values should live in Terraform variables or external secret/config stores, not hard-coded in modules.
Each service should be developed and tested independently:
- Write service code under `services/<service-name>/src`.
- Write unit tests under `services/<service-name>/tests`.
- Use local mocks or stubs for AWS SDK calls.
- Run formatting, linting, and tests in CI.
- Package the service into a zip artifact in CI.
- Upload the artifact to an S3 artifact bucket using a commit SHA or versioned path.
- Update the service SSM artifact parameter with artifact metadata (`bucket`, `key`, `source_code_hash`, build metadata).
- Let Terraform read the SSM artifact parameter and deploy the already-built artifact.
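The artifact metadata written to SSM might be assembled as in the sketch below. The key layout (`<service>/<commit-sha>/<service>.zip`), the `commit_sha` field, and the function names are assumptions for the example. The hash is the base64-encoded SHA-256 digest of the zip, which is the format Terraform's `source_code_hash` argument on `aws_lambda_function` expects.

```python
import base64
import hashlib
import json
from typing import Dict


def source_code_hash(zip_bytes: bytes) -> str:
    """Base64-encoded SHA-256 digest of the zip (not the hex digest)."""
    return base64.b64encode(hashlib.sha256(zip_bytes).digest()).decode("ascii")


def artifact_metadata(
    bucket: str, service: str, commit_sha: str, zip_bytes: bytes
) -> Dict[str, str]:
    """Describe an uploaded artifact; the key layout here is hypothetical."""
    return {
        "bucket": bucket,
        "key": f"{service}/{commit_sha}/{service}.zip",
        "source_code_hash": source_code_hash(zip_bytes),
        "commit_sha": commit_sha,  # stands in for "build metadata"
    }


def to_parameter_value(metadata: Dict[str, str]) -> str:
    """Serialize deterministically so unchanged artifacts produce no diff."""
    return json.dumps(metadata, sort_keys=True)
```

The resulting string could then be stored with SSM's `put_parameter` call from CI, and Terraform would decode it to locate the zip. Because the key includes the commit SHA, each upload stays immutable and a rollback only needs the parameter to point back at an earlier key.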
To avoid a chicken-and-egg cycle, bootstrap the three stacks (`shared`, then `dev` and `prod`) in this order:
1. Apply `terraform/envs/shared` first to create shared primitives (artifact bucket and optional shared deploy role).
2. Apply `terraform/envs/dev` and `terraform/envs/prod` to create environment-specific SSM artifact parameters and API resources.
3. Implement service code and package scripts.
4. Build and upload Lambda zip artifacts to the shared artifact bucket.
5. Update each environment service artifact SSM parameter with uploaded artifact metadata.
6. Re-apply `terraform/envs/dev` or `terraform/envs/prod` to deploy the referenced artifact.
7. Run smoke tests from curl/Postman and site integration tests.
8. Add GitHub workflows to automate steps 4 through 7.
Recommended workflows:
`ci.yml` runs on pull requests.
Responsibilities:
- Detect changed services.
- Run Python formatting, linting, and unit tests.
- Run Terraform `fmt` and `validate`.
- Optionally run Terraform plan for changed environments.
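Detecting changed services could be done by mapping changed file paths (for example, the output of `git diff --name-only`) onto the `services/` layout. The sketch below is one possible approach; the function name is hypothetical, and the rule that any change under `shared/` marks every service as changed is an assumption, chosen because all services may import the shared utilities.

```python
from typing import Iterable, List, Set


def changed_services(
    changed_paths: Iterable[str], all_services: Iterable[str]
) -> List[str]:
    """Map changed repository paths to the services that need testing."""
    services = set(all_services)
    touched: Set[str] = set()
    for path in changed_paths:
        parts = path.split("/")
        if parts[0] == "shared":
            # Assumed rule: shared code can affect every service.
            return sorted(services)
        # Match files under services/<service-name>/...
        if len(parts) >= 3 and parts[0] == "services" and parts[1] in services:
            touched.add(parts[1])
    return sorted(touched)
```

For example, a pull request that touches only `services/send-notification/src/handler.py` would yield `["send-notification"]`, so CI could skip packaging and testing the other two services.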
`deploy-dev.yml` runs on pushes to the `dev` branch and by manual dispatch.
Responsibilities:
- Package changed Lambda services.
- Upload artifacts to the artifact bucket.
- Run Terraform apply for `terraform/envs/dev`.
- Run smoke tests against the dev execute-api endpoint (or `api-dev.zhenwei.dev` if the custom domain is enabled).
- Send deployment notifications through the internal notification path.
`deploy-prod.yml` runs manually from `main` with a protected GitHub Environment approval.
Responsibilities:
- Deploy the approved artifact versions to prod.
- Run Terraform apply for `terraform/envs/prod`.
- Run smoke tests against `api.zhenwei.dev`.
- Send success or failure notifications.
| Environment | Domain | Deployment trigger | Purpose |
|---|---|---|---|
| `dev` | execute-api endpoint by default (`api-dev.zhenwei.dev` optional) | Push to `dev` or manual dispatch | Integration testing and site development. |
| `prod` | `api.zhenwei.dev` | Manual workflow from `main` with approval | Production portfolio and automation APIs. |
The portfolio site can point to different API URLs with environment variables, for example:
VITE_APP_ENV=dev
VITE_CV_MODE=api
VITE_CV_API_URL=https://<dev-execute-api-id>.execute-api.ap-southeast-1.amazonaws.com/get-presigned-url

- Scaffold the repository structure.
- Implement `get-presigned-url` for dev.
- Add shared-stack Terraform for artifact bucket and optional shared IAM roles.
- Add env-stack Terraform for environment SSM parameter names and API resources.
- Add CI tests and Lambda artifact packaging.
- Deploy the dev endpoint and integrate the site CV download flow.
- Implement `send-notification` with secure secret retrieval and internal access control.
- Implement `update-invalidation-status` with bounded polling.
- Migrate invalidation tracking to Step Functions if the workflow becomes long-running or needs richer orchestration.