From aed04fb14d419054bf6c9cb2603ac70daaa86239 Mon Sep 17 00:00:00 2001 From: Hannah Kennedy Date: Thu, 20 Oct 2022 18:48:28 -0400 Subject: [PATCH 1/5] First pass for organizing, clarifying. --- .github/CONTRIBUTING.md | 37 +-- README.md | 218 ++++++++------ labs/advanced-labs/canary/README.md | 4 +- labs/advanced-labs/monitoring/README.md | 2 +- labs/advanced-labs/voe/README.md | 4 +- labs/azure-codespaces-setup.md | 101 ++++--- ...{inner-loop..drawio.png => inner-loop.png} | Bin ...{outer-loop..drawio.png => outer-loop.png} | Bin labs/inner-loop-flux.md | 67 ++--- labs/inner-loop.md | 265 +++++++++--------- labs/outer-loop-aks-azure.md | 168 ++++++----- labs/outer-loop-arc-gitops.md | 128 ++++----- labs/outer-loop-multi-cluster.md | 198 ++++++------- labs/outer-loop-ring-deployment.md | 101 +++---- labs/outer-loop.md | 220 +++++++-------- templates/README.md | 6 +- vm/README.md | 21 +- 17 files changed, 741 insertions(+), 799 deletions(-) rename labs/images/{inner-loop..drawio.png => inner-loop.png} (100%) rename labs/images/{outer-loop..drawio.png => outer-loop.png} (100%) diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index 17563903..a5c00a02 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -12,28 +12,33 @@ This project has adopted the [Microsoft Open Source Code of Conduct](https://ope For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. -- [Code of Conduct](#coc) -- [Issues and Bugs](#issue) -- [Feature Requests](#feature) -- [Submission Guidelines](#submit) +## Table of Contents + +- [Code of Conduct](#code-of-conduct) +- [Issues and Bugs](#issue-and-bugs) +- [Feature Requests](#feature-requests) +- [Submission Guidelines](#submission-guidelines) ## Code of Conduct Help us keep this project open and inclusive. 
Please read and follow our [Code of Conduct](https://opensource.microsoft.com/codeofconduct/). +Visit our [Code of Conduct](CODE_OF_CONDUCT.md) doc as well. -## Found an Issue +## Issue and Bugs If you find a bug in the source code or a mistake in the documentation, you can help us by -[submitting an issue](#submit-issue) to the GitHub Repository. Even better, you can -[submit a Pull Request](#submit-pr) with a fix. +[submitting an issue](#submitting-an-issue) to the GitHub Repository. Even better, you can +[submit a Pull Request](#submitting-a-pull-request-pr) with a fix. + +## Feature Requests -## Want a Feature +You can *request* a new feature by [submitting an issue](#submitting-an-issue) to the GitHub +Repository. -You can *request* a new feature by [submitting an issue](#submit-issue) to the GitHub -Repository. If you would like to *implement* a new feature, please submit an issue with +- **Bigger Features** - If you would like to *implement* a new feature, please submit an issue with a proposal for your work first, to be sure that we can use it. -- **Small Features** can be crafted and directly [submitted as a Pull Request](#submit-pr). +- **Small Features** can be crafted and directly [submitted as a Pull Request](#submitting-a-pull-request-pr). ## Submission Guidelines @@ -63,12 +68,10 @@ Before you submit your Pull Request (PR) consider the following guidelines: - Search the repository `https://github.com/[organization-name]/[repository-name]/pulls` for an open or closed PR that relates to your submission. You don't want to duplicate effort. - -- Make your changes in a new git fork: - -- Commit your changes using a descriptive commit message -- Push your fork to GitHub: -- In GitHub, create a pull request +- Make your changes in a new git fork. +- Commit your changes using a descriptive commit message. +- Push your fork to GitHub. +- In GitHub, create a pull request against `main` from your fork. - If we suggest changes then: - Make the required updates. 
- Rebase your fork and force push to your GitHub repository (this will update your Pull Request): diff --git a/README.md b/README.md index 6a837bed..c2c1ae38 100644 --- a/README.md +++ b/README.md @@ -2,27 +2,45 @@ ## Breaking Changes in 0.11.0 -- Version 0.11.0 has breaking changes from 0.10.0 -- The CLIs will automatically use the latest when you create a new Codespace -- Using the 0.11.0 or higher CLIs on an 0.10.0 branch will break -- To check the version of your CLI - - `kic -v` - - `flt -v` -- If you have an existing branch with work you want to save and create a new Codespace you can downgrade the CLI in that CS - - Make sure you're in the root of your branch repo - - `.devcontainer/cli-update.sh 0.10.0` -- To go back to the current CLI - - `.devcontainer/cli-update.sh` +>> **Warning!** Version `0.11.0` has breaking changes from `0.10.0` + +The CLIs will automatically use the latest when you create a new Codespace, so using the `0.11.0` or +higher CLIs on a `0.10.0` branch will break. To check the version of your CLI, use the `-v` flag: + +- `kic -v` +- `flt -v` + +If you have an existing branch with work you want to save and create a new Codespace, you can downgrade +the CLI in that codespace. + +> **NOTE**: Make sure you're in the root of your branch repo when you run the following. + +```bash +.devcontainer/cli-update.sh 0.10.0 +``` + +To go back to the current CLI: `.devcontainer/cli-update.sh`. ## Introduction -Kubernetes is hard. Getting started and set up for the first time can take weeks to get right. Managing deployments on a fleet of Kubernetes clusters on the edge brings even more challenges. +Kubernetes is hard. Getting started and set up for the first time can take weeks to get right. +Managing deployments on a fleet of Kubernetes clusters on the edge brings even more challenges. -Pilot-in-a-Box (PiB) is a `game-changer` for the end-to-end Kubernetes app development cycle from a local cluster to deployments on the edge.
It reduces the initial friction and empowers the developer to get started and deployed to a dev/test environment within *minutes*. The pre-configured Codespaces environment includes a `Kubernetes` cluster and custom CLI's (`kic` and `flt`) that help streamline the initial learning curve to Kubernetes development commands. +Pilot-in-a-Box (PiB) is a `game-changer` for the end-to-end Kubernetes app development cycle from a +local cluster to deployments on the edge. It reduces the initial friction and empowers the developer +to get started and deployed to a dev/test environment within *minutes*. The pre-configured Codespaces +environment includes a `Kubernetes` cluster and custom CLI's (`kic` and `flt`) that help streamline +the initial learning curve to Kubernetes development commands. -This repo walks through the rich end-to-end developer experience in a series of labs. The labs start by walking you through creating, building, testing, and deploying an application on a local cluster ([inner-loop](./README.md#inner-loop)) with a complete CNCF observability stack. Then, the labs move on to the next step of deploying the application to a test cluster in the Cloud ([outer-loop](./README.md#outer-loop)). There are also several [advanced labs](./README.md#advanced-labs) that cover centralized monitoring, canary deployments, and targeting different devices. +This repo walks through the rich end-to-end developer experience in a series of labs. The labs start +by walking you through creating, building, testing, and deploying an application on a local cluster +([inner-loop](./README.md#inner-loop)) with a complete CNCF observability stack. Then, the labs move +on to the next step of deploying the application to a test cluster in the Cloud ([outer-loop](./README.md#outer-loop)). +There are also several [advanced labs](./README.md#advanced-labs) that cover centralized monitoring, +canary deployments, and targeting different devices. 
-> Note: PiB is not intended as-is for production deployments. However, some concepts covered (GitOps and Observability) are production-ready. +> Note: PiB is not intended as-is for production deployments. However, some concepts covered (GitOps +> and Observability) are production-ready. ## Prerequisites @@ -33,128 +51,144 @@ This repo walks through the rich end-to-end developer experience in a series of ## GitHub Codespaces -> Codespaces allows you to develop in a secure, configurable, and dedicated development environment in the cloud that works how and where you want it to +> Codespaces allows you to develop in a secure, configurable, and dedicated development environment +> in the cloud that works how and where you want it to. + +Check out the [GitHub Codespaces Overview](https://docs.github.com/en/codespaces). GitHub Codespaces +is available for organizations using GitHub Team or GitHub Enterprise Cloud. Codespaces are also available +as a limited beta release for individual users on GitHub Pro plans. However, the waiting list +is normally > 3 months. -- [GitHub Codespaces Overview](https://docs.github.com/en/codespaces) - - GitHub Codespaces is available for organizations using GitHub Team or GitHub Enterprise Cloud. GitHub Codespaces is also available as a limited beta release for individual users on GitHub Pro plans. - - For more information, see ["GitHub's products"](https://docs.github.com/en/get-started/learning-about-github/githubs-products) +For more information, see ["GitHub's products"](https://docs.github.com/en/get-started/learning-about-github/githubs-products). -We use GitHub Codespaces for our `inner-loop` and `outer-loop` Developer Experiences. While other DevX are available, currently, we only support GitHub Codespaces. +We use GitHub Codespaces for our `inner-loop` and `outer-loop` Developer Experiences. While other DevX +are available, currently, we only support GitHub Codespaces. 
-The easiest way to get GitHub Codespaces access is to setup a [GitHub Team](https://docs.github.com/en/codespaces) +The easiest way to get GitHub Codespaces access is to setup a [GitHub Team](https://docs.github.com/en/codespaces). -GitHub Codespaces is also available in beta on a limited basis for GitHub Pro users. The waiting list is normally > 3 months. +> Best Practice: as you begin projects, ensure that you have Codespaces and Azure subscriptions with +> proper permissions. -> Best Practice: as you begin projects, ensure that you have Codespaces and Azure subscriptions with proper permissions +## `inner-loop` -## inner-loop +The `inner-loop` refers to the tasks that developers do every day as part of their development process. +Generally, `inner-loop` happens on the individual developer workstation. For PiB, the inner-loop and +developer workstation is Codespaces. When a developer creates a Codespace, that is their "personal +development workstation in the cloud." As part of PiB, we have automated the creation of the developer +workstation using a repeatable, consistent, Infrastructure as Code (IaC) approach. -- `inner-loop` refers to the tasks that developers do every day as part of their development process - - Generally, `inner-loop` happens on the individual developer workstation - - For PiB, the inner-loop and developer workstation is Codespaces - - When a developer creates a Codespace, that is their "personal development workstation in the cloud" -- As part of PiB, we have automated the creation of the developer workstation using a repeatable, consistent, Infrastructure as Code approach - - We have an advanced workshop planned for customizing the Codespaces experience for your project -- With the power of Codespaces, a developer can create a consistent workstation with a few clicks in less than a minute +We have an advanced workshop planned for customizing the Codespaces experience for your project. 
With +the power of Codespaces, a developer can create a consistent workstation with a few clicks in less than +a minute. -## outer-loop +## `outer-loop` -- `outer-loop` refers to the tasks that developers and DevOps do as they move from dev to test to pre-prod to production - - Generally `outer-loop` happens on shared compute outside of the developer workstation - - For PiB, outer-loop uses a combination of Codespaces and `dev/test clusters` in Azure -- As part of PiB, we have automated the creation of dev/test clusters using a repeatable, consistent, Infrastructure as Code approach +The `outer-loop` refers to the tasks that developers and DevOps do as they move from dev to test to +pre-prod to production stages. Generally `outer-loop` happens on shared compute outside of the +developer workstation. For PiB, `outer-loop` uses a combination of Codespaces and `dev/test clusters` +in Azure. As part of PiB, we have automated the creation of dev/test clusters using a repeatable, +consistent, IaC approach. ## Create a Codespace -> You can use the same Codespace for any of the labs +> **NOTE**: You can use the same Codespace for any of the labs. -- From this repo - - Click the `<> Code` button - - Make sure the Codespaces tab is active - - Click `Create Codespace on main` -- After about 1 minute, you will have a GitHub Codespace running with a complete Kubernetes Developer Experience! +From this repo, click the `<> Code` button. Make sure the Codespaces tab is active. Choose +`Create Codespace on main`. After about 1 minute, you will have a GitHub Codespace running with a +complete Kubernetes developer experience! ## Note on environment variables -- Many of these tutorials make use of environment variables, using the export functionality. If you wish, you can also edit the Z shell preferences file to persist exported environment variables across terminal sessions. Just add the same "export FOO=BAR" lines to your .zshrc file. 
+Many of these tutorials make use of environment variables, using the export functionality. If you +wish, you can also edit the Z shell preferences file to persist exported environment variables +across terminal sessions. Just add the same "export FOO=BAR" lines to your `.zshrc` file. ```bash - nano ~/.zshrc - ``` ## Create a working branch -- Because the main branch has a branch protection rule, you need to create a working branch - - You can use the same branch for any of the labs or create a new branch per lab (add 1, 2, 3 ... to the branch name) +Because the main branch has a branch protection rule, you need to create a working branch. You can +use the same branch for any of the labs or create a new branch per lab (add 1, 2, 3 ... to the branch +name). - > 🛑 Many commands will fail in following labs if `MY_BRANCH` is not set or branch is not pushed upstream +> **NOTE**: 🛑 Many commands will fail in following labs if `MY_BRANCH` is not set or that branch is +> not pushed upstream. - ```bash - - # by default, MY_BRANCH is set to your lower case GitHub User Name - # the value can be overwritten if needed - echo $MY_BRANCH - - # create a branch - git checkout -b $MY_BRANCH +```bash +# by default, MY_BRANCH is set to your lower case GitHub User Name +# the value can be overwritten if needed +echo $MY_BRANCH - # push the branch and set the remote - git push -u origin $MY_BRANCH +# create a branch +git checkout -b $MY_BRANCH - ``` +# push the branch and set the remote +git push -u origin $MY_BRANCH +``` -- Your prompt should end like this - - /workspaces/Pilot-in-a-Box (mybranch) $ +Your prompt should end like this `/workspaces/Pilot-in-a-Box (mybranch) $ ...`. 
## inner-loop Labs -- [Lab 1](./labs/inner-loop.md#pib-inner-loop): Create, build, deploy, and test a new dotnet application and observability stack on your local cluster -- [Lab 2](./labs/inner-loop-flux.md#create-a-new-cluster): Configure flux to automate the deployment process from Lab 1 +- [Lab 1](./labs/inner-loop.md#pib-inner-loop): Create, build, deploy, and test a new dotnet application + and observability stack on your local cluster. +- [Lab 2](./labs/inner-loop-flux.md#create-a-new-cluster): Configure flux to automate the deployment + process from Lab 1. ## outer-loop Labs -- [Lab 1](./labs/outer-loop.md#pib-outer-loop): Create a dev/test cluster and manage application deployments on the cluster -- [Lab 2](./labs/outer-loop-multi-cluster.md#pib-outer-loop-multi-cluster): Manage application deployments on a fleet of multiple clusters -- [Lab 3](./labs/outer-loop-ring-deployment.md#pib-outer-loop-with-ring-based-deployment): Configure ring based deployments - -- [Lab 4](./labs/azure-codespaces-setup.md#azure-subscription-and-codespaces-setup): Set up Azure subscription and Codespaces for advanced configuration +- [Lab 1](./labs/outer-loop.md#pib-outer-loop): Create a dev/test cluster and manage application + deployments on the cluster. +- [Lab 2](./labs/outer-loop-multi-cluster.md#pib-outer-loop-multi-cluster): Manage application deployments + on a fleet of multiple clusters. +- [Lab 3](./labs/outer-loop-ring-deployment.md#pib-outer-loop-with-ring-based-deployment): Configure + ring-based deployments. +- [Lab 4](./labs/azure-codespaces-setup.md#azure-subscription-and-codespaces-setup): Set up Azure + subscription and Codespaces for advanced configuration. 
- This is a prerequisite for the Advanced Labs ## Advanced Labs -- [Arc enabled GitOps Lab](./labs/outer-loop-arc-gitops.md#pib-outer-loop-with-arc-enabled-gitops): Deploy to dev cluster running on an Azure VM with Arc enabled GitOps -- [Canary Deployment Lab](./labs/advanced-labs/canary/README.md#pib-automated-canary-deployment-using-flagger): Use Flagger to experiment with canary deployments -- [Vision on Edge (VoE) Lab](./labs/advanced-labs/voe/README.md#pib-outer-loop-vision-on-edge-voe): Deploy a more complex app (VoE) to a fleet -- [Centralized Observability Lab](./labs/advanced-labs/monitoring/README.md#pib-centralized-monitoring): Deploy a centralized observability system with Fluent Bit, Prometheus, and Grafana to monitor fleet application deployments -- [outer-loop with AKS-IoT](./labs/advanced-labs/aks-iot/README.md#pib-outer-loop-to-aks-iot): Deploy to an AKS-IoT cluster running on an Azure VM with Arc enabled GitOps -- [outer-loop with AKS Lab](./labs/outer-loop-aks-azure.md#pib-outer-loop-to-aks-on-azure): Deploy to an AKS cluster with Arc enabled GitOps +- [Arc enabled GitOps Lab](./labs/outer-loop-arc-gitops.md#pib-outer-loop-with-arc-enabled-gitops): + Deploy to dev cluster running on an Azure VM with Arc enabled GitOps. +- [Canary Deployment Lab](./labs/advanced-labs/canary/README.md#pib-automated-canary-deployment-using-flagger): + Use Flagger to experiment with canary deployments. +- [Vision on Edge (VoE) Lab](./labs/advanced-labs/voe/README.md#pib-outer-loop-vision-on-edge-voe): + Deploy a more complex app (VoE) to a fleet. +- [Centralized Observability Lab](./labs/advanced-labs/monitoring/README.md#pib-centralized-monitoring): + Deploy a centralized observability system with Fluent Bit, Prometheus, and Grafana to monitor fleet + application deployments. +- [outer-loop with AKS-IoT](./labs/advanced-labs/aks-iot/README.md#pib-outer-loop-to-aks-iot): Deploy + to an AKS-IoT cluster running on an Azure VM with Arc enabled GitOps. 
+- [outer-loop with AKS Lab](./labs/outer-loop-aks-azure.md#pib-outer-loop-to-aks-on-azure): Deploy to + an AKS cluster with Arc enabled GitOps. ## Cleanup -- Once you are finished with all of the labs and experimenting, please delete your branch - - ```bash +- Once you are finished with all of the labs and experimenting, please delete your branch. - # change to the root of the repo - cd $PIB_BASE - - git pull - git add . - git commit -am "deleting branch" - git push - - # checkout main branch, delete remote, delete local - git checkout main - git push origin $MY_BRANCH --delete - git branch -D $MY_BRANCH - - ``` +```bash +# change to the root of the repo +cd $PIB_BASE + +git pull +git add . +git commit -am "deleting branch" +git push + +# checkout main branch, delete remote, delete local +git checkout main +git push origin $MY_BRANCH --delete +git branch -D $MY_BRANCH +``` ## Support -This project uses GitHub Issues to track bugs and feature requests. Please search the existing issues before filing new issues to avoid duplicates. For new issues, file your bug or feature request as a new issue. +This project uses GitHub Issues to track bugs and feature requests. Please search the existing issues +before filing new issues to avoid duplicates. For new issues, file your bug or feature request as a +new issue. See the [Contributing Guidance](.github/CONTRIBUTING.md) for more details. 
## Contributing diff --git a/labs/advanced-labs/canary/README.md b/labs/advanced-labs/canary/README.md index 5dd37640..6ea60fec 100644 --- a/labs/advanced-labs/canary/README.md +++ b/labs/advanced-labs/canary/README.md @@ -157,11 +157,11 @@ flt check app prometheus ``` -- Observe canary promotion in k9s: +- Observe canary promotion in K9s: ```bash - # start k9s for the cluster + # start K9s for the cluster flt ssh $MY_CLUSTER k9s diff --git a/labs/advanced-labs/monitoring/README.md b/labs/advanced-labs/monitoring/README.md index f85d808f..1a8c338e 100644 --- a/labs/advanced-labs/monitoring/README.md +++ b/labs/advanced-labs/monitoring/README.md @@ -25,7 +25,7 @@ ## Key Vault Secrets - A PAT is required to forward logs and metrics to Grafana Cloud. -- The PAT is stored as a k8s secret on the fleet clusters. +- The PAT is stored as a K8s secret on the fleet clusters. - Before creating the secrets, a Key Vault and MI (with access to the Key Vault) must be configured. See [setup docs](/labs/azure-codespaces-setup.md) for instructions. ### Fluent Bit Secret diff --git a/labs/advanced-labs/voe/README.md b/labs/advanced-labs/voe/README.md index 59b0ef49..8e247871 100644 --- a/labs/advanced-labs/voe/README.md +++ b/labs/advanced-labs/voe/README.md @@ -69,7 +69,7 @@ az cognitiveservices account create --kind CognitiveServices --name $VOE_AZ_COG_ ### Update fleet creation script - Add the following lines to vm/setup/pre-flux.sh and replace the values in [] with the names of the resources created above. - - This will run on the fleet vm/s during setup and will create the voe namespace and required k8s secret. + - This will run on the fleet vm/s during setup and will create the `voe` namespace and required K8s secret. - The VM uses Managed Identity to retrieve the connection string values from the IoT Hub. - Do NOT run this fence! 
@@ -111,7 +111,7 @@ git push ```bash # before creating the cluster, make sure PIB_MI is set -# MI is required for the voe k8s secrets to be created properly +# MI is required for the voe K8s secrets to be created properly flt env PIB_MI # set MY_CLUSTER diff --git a/labs/azure-codespaces-setup.md b/labs/azure-codespaces-setup.md index ad1e959e..f28887b6 100644 --- a/labs/azure-codespaces-setup.md +++ b/labs/azure-codespaces-setup.md @@ -1,65 +1,65 @@ # Azure Subscription and Codespaces Setup -- We use Azure Managed Identity and Codespaces Secrets for credentials +We use Azure Managed Identity and Codespaces Secrets for credentials. -> Work in Progress +> ⚠️ Work in Progress ⚠️ -## Login to Azure +## Before You Begin -- Login to Azure using `az login --use-device-code` - - If you have more than one Azure subscription, select the correct subscription +The following need to be installed: - ```bash +* [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli) +* [GitHub CLI](https://cli.github.com/manual/installation) - # verify your account - az account show +## Login to Azure - # list your Azure accounts - az account list -o table +Log in to Azure using `az login --use-device-code`. If you have more than one Azure subscription, select +the correct subscription: - # set your Azure subscription - az account set -s mySubNameOrId +```bash +# verify your account +az account show - # verify your account - az account show +# list your Azure accounts +az account list -o table - ``` +# set your Azure subscription +az account set -s mySubNameOrId + +# verify your account +az account show +``` ## Setup -- In order to use Azure Arc, HTTPS, or DNS, you must configure your Azure subscription and Codespaces Secrets +In order to use Azure Arc, HTTPS, or DNS, you must configure your Azure subscription and Codespaces +Secrets. Read more about Codespaces Secrets [here](https://docs.github.com/en/rest/codespaces/secrets#about-the-codespaces-user-secrets-api).
## Shared Personal Access Token -> Codespaces PATs expire after 8 hours -> -> Create a long-lived PAT +> **NOTE**: Codespaces PATs expire after 8 hours. -- Create a shared GitHub Personal Access Token - - Grant Repos and Packages permission - - Grant SSO permission as needed - - You can use an existing PAT with proper permissions -- Create a Codespaces Secret for the GitHub PAT +* Create a long-lived, shared GitHub Personal Access Token. +* Grant Repos and Packages permission. +* Grant SSO permission as needed. +* You can use an existing PAT with proper permissions. +* Create a Codespaces Secret for the GitHub PAT. ```bash - gh secret set PIB_PAT --body "YourSharedPAT" # list secrets gh secret list - ``` ## Create Resource Group -- We use `tld` for our resource group - - The RG may contain - - Managed Identity - - Platform Key Vault - - DNS Service +* We use `tld` for our resource group. The RG may contain: + * Managed Identity + * Platform Key Vault + * DNS Service ```bash - # change if desired export rg=tld az group create -g $rg -l westus3 @@ -74,10 +74,9 @@ gh secret list ### Create Managed identity -- Required for Azure access from the dev/test clusters +This is required for Azure access from the dev/test clusters. ```bash - # Managed Identity name export mi=pib_mi @@ -86,14 +85,14 @@ gh secret set PIB_MI --body $(az identity create --name $mi --resource-group $rg # list secrets gh secret list - ``` ## Create Shared SSH Key -- This will allow multiple users to access the clusters from the same branch - - The flt CLI uses SSH to connect to the dev/test clusters -- `.devcontainer/post-create.sh` will decrypt and save the SSH from Codespaces Secrets when a new Codespace is created +* This will allow multiple users to access the clusters from the same branch. + * The `flt` CLI uses SSH to connect to the dev/test clusters. +* `.devcontainer/post-create.sh` will decrypt and save the SSH from Codespaces Secrets when a new + Codespace is created. 
```bash @@ -113,11 +112,10 @@ gh secret list ## Create Azure Key Vault -- Create Azure Key Vault from the Azure Portal -- Grant Managed Identity permissions to the Key Vault +* Create Azure Key Vault from the Azure Portal. +* Grant Managed Identity permissions to the Key Vault. ```bash - # change to your key vault name export kv=pib_kv @@ -126,18 +124,17 @@ gh secret set PIB_KEYVAULT --body $kv # list secrets gh secret list - ``` ## Create DNS Zone -- required for HTTPS -- Purchase a domain from the Azure Portal (or bring your own) -- Create a DNS Zone using PIB_DNS_RG from above -- Grant the Managed Identity access to the DNS Zone +> This is required for HTTPS! -```bash +* Purchase a domain from the Azure Portal (or bring your own). +* Create a DNS Zone using `PIB_DNS_RG` from above. +* Grant the Managed Identity access to the DNS Zone. +```bash # change to your domain export ssl=cseretail.com @@ -146,17 +143,16 @@ gh secret set PIB_SSL --body $ssl # list secrets gh secret list - ``` ## Create Service Principal -- optional -- allows login with `flt az login` using the SP credentials -- Grant SP access to Key Vault if setup +> **NOTE**: This is _optional_. -```bash +This allows login with `flt az login` using the SP credentials. Grant the SP access to Key Vault if +one was set up.
+```bash # create SP id=$(az ad sp create-for-rbac \ --name pib_sp \ @@ -179,5 +175,4 @@ gh secret set AZ_SP_KEY --body $key # list secrets gh secret list - ``` diff --git a/labs/images/inner-loop..drawio.png b/labs/images/inner-loop.png similarity index 100% rename from labs/images/inner-loop..drawio.png rename to labs/images/inner-loop.png diff --git a/labs/images/outer-loop..drawio.png b/labs/images/outer-loop.png similarity index 100% rename from labs/images/outer-loop..drawio.png rename to labs/images/outer-loop.png diff --git a/labs/inner-loop-flux.md b/labs/inner-loop-flux.md index 819771af..3680f8b3 100644 --- a/labs/inner-loop-flux.md +++ b/labs/inner-loop-flux.md @@ -2,63 +2,59 @@ ## Create a New Cluster -> The k3d cluster will run `in` your Codespace - no need for an external cluster +> **NOTE**: The k3d cluster will run `in` your Codespace - no need for an external cluster. -- Use `kic` to create and verify a new k3d cluster +Use `kic` to create and verify a new k3d cluster. ```bash - # delete and create a new cluster kic cluster create # wait for pods to get to Running kic pods --watch - ``` ## Create a new .NET WebAPI -> You can skip this step if you already created MyApp +> **NOTE**: You can skip this step if you already created MyApp. -- PiB includes templates for new applications that encapsulate K8s best practices - - For the Workshop, use `MyApp` for the application name - - You can use any app name that conforms to a dotnet namespace - - PascalCase - - alpha only - - <= 20 chars -- Once created, you can browse the code in the Explorer window +PiB includes templates for new applications that encapsulate K8s best practices. -```bash +- For the Workshop, use `MyApp` for the application name. +- You can use any app name that conforms to a dotnet namespace: + - PascalCase + - alpha only + - <= 20 chars +- Once created, you can browse the code in the Explorer window. 
+```bash # create a new app from the dotnet web api template cd apps kic new dotnet-webapi MyApp # this is important as the CLI is "context aware" cd myapp - ``` ## Build MyApp -> Make sure you are in the apps/myapp directory +> **NOTE**: Make sure you are in the `apps/myapp` directory. -- Now that we've created a new application, the next logical step is to `build` the app -- PiB encapsulates many best practices, so, as an App Dev, you don't have to figure out how to build a secure, multi-stage Dockerfile +Now that we've created a new application, the next logical step is to `build` the app. PiB encapsulates +many best practices, so, as an App Dev, you don't have to figure out how to build a secure, multi-stage +Dockerfile. ```bash - # build the app kic build all # view docker images docker images - ``` ## Deploy MyApp and Observability with GitOps (Flux v2) -> Notice you don't have to create or edit K8s YAML files! +Notice you don't have to create or edit K8s YAML files! - This deploys - Flux @@ -69,7 +65,6 @@ docker images - WebValidate (more on this later) ```bash - # deploy flux to the cluster kic cluster flux-install @@ -78,35 +73,31 @@ kic pods --watch # check flux kic check flux - ``` ## Force Flux to Sync -- After making changes, you can force Flux to sync (reconcile) +After making changes, you can force Flux to sync (reconcile). 
```bash - kic sync - ``` ## Flux Setup Files -- The Flux setup yaml is located in `apps/myapp/kic-deploy/flux` - - A `Flux source` is a git repo / branch combination - - A `Flux kustomization` is a directory within the source - - We have 3 kustomizations - - kustomization-flux watches the flux directory - - kustomization-app watches the app directory - - kustomization-monitoring watches the monitoring directory - - You want to have multiple kustomizations (or helm) - - When a kustomization fails, the entire process is aborted - - This lets "your app" break "my app" - - We generally create a kustomization per namespace for production - - Our GitOps Automation (outer-loop) automatically creates a kustomization per namespace +- The Flux setup yaml is located in `apps/myapp/kic-deploy/flux`. + - A `Flux source` is a git repo / branch combination. + - A `Flux kustomization` is a directory within the source. + - We have 3 kustomizations: + - `kustomization-flux` watches the flux directory. + - `kustomization-app` watches the app directory. + - `kustomization-monitoring` watches the monitoring directory. + - You want to have multiple kustomizations (or helm charts). + - When a kustomization fails, the entire process is aborted. This lets "your app" break "my app." + - We generally create a kustomization per namespace for production. + - Our GitOps Automation (outer-loop) automatically creates a kustomization per namespace. - View the flux setup script by running `kic cluster flux-install --show` ## Testing -- Use the [inner-loop](./inner-loop.md#check-the-k8s-pods) instructions to further test the deployment +- Use the [inner-loop](./inner-loop.md#check-the-k8s-pods) instructions to further test the deployment. 
diff --git a/labs/inner-loop.md b/labs/inner-loop.md index 93756e27..6c332d3d 100644 --- a/labs/inner-loop.md +++ b/labs/inner-loop.md @@ -2,58 +2,65 @@ ## Codespace Contents -- The Codespace you just created contains - - Docker and Kubernetes development tools - - A Kubernetes cluster running k3d - - A full monitoring stack - Fluent Bit, Prometheus, and Grafana - - Sample dashboards - - A custom CLI (kic) to lower the barrier of entry for new K8s developers +The Codespace you just created contains: -![images](./images/inner-loop..drawio.png) +- Docker and Kubernetes development tools. +- A Kubernetes cluster running k3d. +- A full monitoring stack - Fluent Bit, Prometheus, and Grafana. +- Sample dashboards. +- A custom CLI (kic) to lower the barrier of entry for new K8s developers. + +Inner Loop Diagram: + +![An image diagrams the simplified architecture. At the top is a purple box with "GitHub Codespaces +(inner-loop)" inside. Beneath the purple box is a green box with "Tools (Codespaces, IAC, K3S, K9s, +CLI)." Beneath the green box, there are two separated boxes: "Application" in grey-blue, "Monitoring" +in bright blue. The "Application" grouping contains three smaller boxes: "Reference App (dotnet +WebAPI)", "Synthetic Load (WebValidate)" and "GitOps (Flux)." Within the Monitoring box, there are +three smaller boxes: "Log Forwarding (Fluent Bit)", "Metrics (Prometheus)", and "Dashboards +(Grafana)". 
Beneath these two boxes is an orange box with "Exposed NodePorts (reference app, +Grafana, Prometheus)."](./images/inner-loop.png) ## Verify Your Working Branch -- Your prompt should end like this - - /workspaces/Pilot-in-a-Box (mybranch) $ -- If your prompt ends in `(main)` create a working branch per the instructions in the [readme](/README.md#create-a-working-branch) +Your prompt should end like this: `/workspaces/Pilot-in-a-Box (mybranch) $` + +If your prompt ends in `(main)`, create a working branch per the instructions in the +[readme](/README.md#create-a-working-branch). ## Verify k3d cluster -> The K8s cluster is running `in` your Codespace - no need for an external cluster +> **Note**: The K8s cluster is running "in" your Codespace - no need for an external cluster. -- Use `kic` to verify the k3d cluster was created successfully +Use `kic` to verify the k3d cluster was created successfully: ```bash - # check pods kic pods - ``` -- You can delete and recreate your cluster at any time - - `kic cluster create` - -## kic CLI +You can delete and recreate your cluster at any time with `kic cluster create`. -- `kic` encapsulates many of the hard concepts of K8s for the application developer -- A design requirement is that `kic` can't have any "magic" - - Anything you can do with `kic`, you can do using standard K8s tools -- `kic` is `context aware` -- To see what most commands do, simply run the command with `--show` +## `kic` CLI - ```bash +`kic` encapsulates many of the hard concepts of K8s for the application developer. A design +requirement is that `kic` can't have any "magic". That is, anything you can do with `kic`, you can +do using standard K8s tools. - kic pods --show +`kic` is "context aware"; the commands that are available depend on where you are in your directory +structure and what resources have been made/are available. To see what most commands do, simply +run the command with `--show` flag. 
- ``` +```bash +kic pods --show +``` -- The `kic CLI` is customizable and extensible - - We have a workshop under development as an advanced scenario +The `kic` CLI is customizable and extensible! We have a workshop under development as an advanced +scenario. Keep an eye out! -## Experiment with the `kic CLI` +## Experiment with the `kic` CLI ```bash - # run kic # notice there is not a "build" or "deploy" command # there will be later - kic is "context aware" @@ -70,160 +77,144 @@ kic events kic ns kic pods kic svc - ``` ## Create a new .NET WebAPI -> For the workshop, use `MyApp` for the application name +For the workshop, use `MyApp` for the application name. -- PiB includes templates for new applications that encapsulate K8s best practices - - You can use any app name that conforms to a dotnet namespace - - PascalCase - - alpha only - - <= 20 chars -- For the workshop, use `MyApp` for the application name -- Once created, you can browse the code in the Explorer window +PiB includes templates for new applications that encapsulate K8s best practices. You can use any app +name that conforms to a dotnet namespace convention: -```bash +- PascalCase +- alpha only +- 20 characters or fewer + +Once created, you can browse the code in the Explorer window. +```bash # create a new app from the dotnet web api template cd apps kic new dotnet-webapi MyApp # this is important as the CLI is "context aware" cd myapp - ``` ## Build MyApp -- Now that we've created a new application, the next logical step is to `build` the app -- PiB encapsulates many best practices, so, as an App Dev, you don't have to figure out how to build a secure, multi-stage Dockerfile +Now that we've created a new application, the next logical step is to `build` the app. PiB encapsulates many best practices, so, as an App Dev, you don't have to figure out how to build a secure, multi-stage Dockerfile.
```bash - # build the app kic build all # view docker images docker images - ``` ## Deploy MyApp with Observability -> Notice you don't have to create or edit yaml files! +Notice you don't have to create or edit `yaml` files? So easy! -- This deploys - - MyApp - - Fluent Bit - - Grafana - - Prometheus - - WebValidate (more on this later) +The following command deploys: -```bash +- MyApp +- Fluent Bit +- Grafana +- Prometheus +- WebValidate (more on this later) +```bash # optional - see what's going to happen kic deploy all --show kic deploy all - ``` ## Check the K8s pods ```bash - kic pods # "watch" for the pods to get to Running # ctl-c to exit kic pods --watch - ``` ## Check each application -> Make sure that all pods are `Running` +Make sure that all pods are `Running`. You evaluate their statuses with `kic check [...]`. ```bash - kic check myapp kic check webv kic check prometheus kic check grafana - ``` ## Test MyApp -- A core part of the DevX is automated integration and load testing -- We use a customized version of [WebValidate](https://github.com/microsoft/webvalidate) - - The custom test files and Dockerfile are in the `apps/myapp/webv` directory - - `kic build webv` builds a custom image: `k3d-registry.localhost:5500/webv-myapp:local` +A core part of the DevX is automated integration and load testing. We use a customized version of +[WebValidate](https://github.com/microsoft/webvalidate). The custom test files and Dockerfile are in the `apps/myapp/webv` directory. + +`kic build webv` builds a custom image: `k3d-registry.localhost:5500/webv-myapp:local`. ## Integration Test -- The integration test checks valid and invalid URLs, so the 400 and 404 errors are part of the test design - - By default, results < 400 are not logged - - Add `--verbose` to the integration test to see 2xx and 3xx results +The integration test checks valid and invalid URLs, so the 400 and 404 errors are part of the test +design. By default, results < 400 are not logged. 
- ```bash +**Note**: Add `--verbose` to the integration test to see 2xx and 3xx results. - # run an integration test - kic test integration --verbose +```bash +# run an integration test +kic test integration --verbose +``` - ``` +The last line of `kic test integration` shows the `test errors` (should be zero): -- The last line of `kic test integration` shows the `test errors` (should be zero) - - `Test Completed Errors 0 ValidationErrorCount 0` +```bash +Test Completed Errors 0 ValidationErrorCount 0 +``` ## Load Test -- Run a 5 second load test - - Default `--duration` is 30 sec -- Note that `kic test load` does not display a summary as it's designed to run headless - - We will see the results in our `Grafana Dashboard` later in the workshop +You can run a 5 second load test. When no `--duration` flag is indicated, the default is 30 seconds. +Note that `kic test load` does not display a summary as it's designed to run **headless**. We will +see the results in our `Grafana Dashboard` later in the workshop. ```bash - kic test load --duration 5 --verbose # you can also run load.json one time kic test integration -f load.json - ``` ## `kic test` WebV Configuration -- Run the `kic test` commands with --show to view the default parameters passed to the custom webv image +Run the `kic test` commands with `--show` to view the default parameters passed to the custom webv image. 
```bash - kic test integration --show # the --max-errors value should be updated kic test integration --max-errors 5 --show - ``` -- See the list of requests in the custom test files in the `apps/myapp/webv` directory +See the list of requests in the custom test files in the `apps/myapp/webv` directory using VSCode: ```bash - code $PIB_BASE/apps/myapp/webv/integration.json - ``` ## Generate Requests for Observability -- PiB includes a full observability stack "in" the Codespace - - Fluent Bit, Prometheus, Grafana -- Generate some traffic for the dashboards +PiB includes a full observability stack "in" the Codespace: Fluent Bit, Prometheus, Grafana. -```bash +You can generate some traffic for the dashboards: +```bash # copy and paste this fence into your terminal # run a load test in the background @@ -231,83 +222,81 @@ kic test load & # run several integration tests for i in {1..10}; do kic test integration; done - ``` ## Codespaces + NodePorts -- One of the K8s networking types is [NodePort](https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport) - - This exposes a K8s service on `localhost:nodePort` -- Codespaces is able to expose `ports` to your local browser via an `HTTPS tunnel` -- For `inner-loop` we use NodePorts to take advantage of this integration - - For `outer-loop` we use Contour as an ingress controller +One of the K8s networking types is [NodePort](https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport). +This exposes a K8s service on `localhost:nodePort`. Codespaces is able to expose `ports` to your +local browser via an `HTTPS tunnel`. For the `inner-loop` lab, we use NodePorts to take advantage +of this integration. For `outer-loop`, we use Contour as an ingress controller.
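
As a rough illustration of the NodePort idea above, the sketch below probes the ports this lab mentions (30080 for the app, 30000 for Prometheus, 32000 for Grafana) from inside the Codespace. The `port_open` helper is hypothetical and not part of `kic`; it assumes bash (for `/dev/tcp`) and the GNU `timeout` command are available.

```bash
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection can be opened.
# Hypothetical helper for illustration only; not part of kic.
port_open() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device opens a TCP socket; timeout bounds the wait
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# NodePorts named in this lab: app, Prometheus, Grafana
for port in 30080 30000 32000; do
  if port_open localhost "$port"; then
    echo "port ${port}: open"
  else
    echo "port ${port}: closed"
  fi
done
```

A port reported as closed usually means the corresponding pod is not `Running` yet, so `kic pods` is a reasonable next check.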
## Open MyApp in Your Browser -- Click on the `PORTS` tab at the top of your Codespaces terminal window - - Open `App (30080)` in your browser - - Right click or click the `open in browser` icon - - This will display the Swagger UI for MyApp in your local browser - - NodePorts + Codespaces handles all the port forwarding automatically! +- Click on the `PORTS` tab at the top of your Codespaces terminal window. +- Open `App (30080)` in your browser. +- Right click or click the `open in browser` icon. +- This will display the Swagger UI for MyApp in your local browser. -> There are 5 unused Ports that are forwarded (30088, 31080, 31088, 32080, and 32088). These are set up for the advanced scenario of running multiple apps simultaneously on the local cluster. +NodePorts + Codespaces handles all the port forwarding automatically! + +> **NOTE**: There are 5 unused Ports that are forwarded (30088, 31080, 31088, 32080, and 32088) +> in this PiB template. These are set up for the **advanced** scenario of running multiple apps +> simultaneously on the local cluster. ## Observability: Grafana -- Grafana is a de-facto standard for K8s dashboards -- PiB deploys a Grafana instance with custom dashboards "in" your Codespace -- This is a powerful inner-loop feature as you don't have external dependencies -- Explore the [Grafana documentation](https://grafana.com/docs/) to learn about more data sources, visualizations, and capabilities +Grafana is a de facto standard for K8s dashboards. PiB deploys a Grafana instance with custom +dashboards "in" your Codespace. This is a powerful `inner-loop` feature as you don't have external +dependencies. + +Explore the [Grafana documentation](https://grafana.com/docs/) to learn about more data sources, +visualizations, and capabilities. 
### Open Grafana in Your Browser -- From the `PORTS` tab, open `Grafana (32000)` - - Username: admin - - Password: cse-labs -- Click on "General / Home" at the top of the screen and select "Application Dashboard" +From the "Ports" tab, open `Grafana (32000)`: + +- Username: `admin` +- Password: `cse-labs` + +Click on "General / Home" at the top of the screen and select "Application Dashboard." ## Observability: Prometheus -- Prometheus is a de-facto standard for K8s metrics -- PiB deploys a Prometheus instance with `custom metrics` "in" your Codespace -- This is a powerful inner-loop feature as you don't have external dependencies -- See the [Prometheus documentation](https://prometheus.io/docs/introduction/overview/) for more information +Prometheus is a de facto standard for K8s metrics. PiB deploys a Prometheus instance with custom +metrics "in" your Codespace. This is a powerful `inner-loop` feature as you don't have external dependencies. + +See the [Prometheus documentation](https://prometheus.io/docs/introduction/overview/) for more information. ### Open Prometheus in Your Browser -- From the `PORTS` tab, open `Prometheus (30000)` -- From the query window, enter `myapp` - - This will filter to your custom app metrics -- From the query window, enter `webv` - - This will filter to the WebValidate metrics +- From the "Ports" tab, open `Prometheus (30000)` +- From the query window, enter `myapp` - this will filter to your custom app metrics. +- From the query window, enter `webv` - this will filter to the WebValidate metrics. 
## Observability: Fluent Bit -- Fluent Bit is a de-facto standard for K8s log forwarding - - PiB deploys a Fluent Bit instance with "in" your Codespace -- K9s is a commonly used UI that reduces the complexity of `kubectl` - - PiB deploys k9s "in" your Codespace -- This is a powerful inner-loop feature as you don't have external dependencies -- See the [Fluent Bit documentation](https://docs.fluentbit.io/manual/) for more information +Fluent Bit is a de facto standard for K8s log forwarding. PiB deploys a Fluent Bit instance within +your Codespace. K9s is a commonly used UI that reduces the complexity of `kubectl`. PiB deploys K9s +in your Codespace. This is a powerful inner-loop feature as you don't have external dependencies -### View Fluent Bit Logs in K9s +See the [Fluent Bit documentation](https://docs.fluentbit.io/manual/) for more information. -- Fluent Bit is set to forward logs to stdout for debugging -- Fluent Bit can be configured to forward to different services including Grafana Cloud or Azure Log Analytics - -- Start `k9s` from the Codespace terminal +### View Fluent Bit Logs in K9s - ```bash +Fluent Bit is set to forward logs to `stdout` for debugging. Fluent Bit can be configured to forward +to different services including Grafana Cloud or Azure Log Analytics. - k9s +Start K9s from the Codespace terminal with the `k9s` command on its own. - ``` +While in the K9s view: -- Press `0` to show all `namespaces` -- Select `fluentbit` pod and press `enter` -- Press `enter` again to see the logs -- Press `s` to Toggle AutoScroll -- Press `w` to Toggle Wrap -- Review logs that will be sent to Grafana when configured +- Press `0` to show all `namespaces`. +- Select `fluentbit` pod and press enter. +- Press enter again to see the logs. +- Press `s` to Toggle AutoScroll. +- Press `w` to Toggle Wrap. +- Review logs that will be sent to Grafana when configured. 
-> To exit K9s - `:q ` +> **NOTE**: To exit K9s - `:q ` diff --git a/labs/outer-loop-aks-azure.md b/labs/outer-loop-aks-azure.md index 7aa46750..42d157e3 100644 --- a/labs/outer-loop-aks-azure.md +++ b/labs/outer-loop-aks-azure.md @@ -3,7 +3,6 @@ ## Validate cluster identifier and working branch ```bash - # by default, MY_BRANCH is set to your lower case GitHub User Name # the variable is used to uniquely name your clusters # the value can be overwritten if needed @@ -12,120 +11,115 @@ echo $MY_BRANCH # make sure your branch is set and pushed remotely # commands will fail if you are in main branch git branch --show-current - ``` ## Login to Azure -- Login to Azure using `az login --use-device-code` - > Use `az login --use-device-code --tenant ` to specify a different tenant - - If you have more than one Azure subscription, select the correct subscription - - ```bash - - # verify your account - az account show +Login to Azure using `az login --use-device-code`. - # list your Azure accounts - az account list -o table +Use `az login --use-device-code --tenant ` to specify a different tenant if you have access +to more than one tenant. - # set your Azure subscription - az account set -s mySubNameOrId +If you have more than one Azure subscription, select the correct subscription: - # verify your account - az account show +```bash +# verify your account +az account show - ``` +# list your Azure accounts +az account list -o table -- Validate user role on subscription - > Make sure your RoleDefinitionName is `Contributor` or `Owner` to create resources in this lab succssfully +# set your Azure subscription +az account set -s mySubNameOrId - ```bash +# verify your account +az account show +``` - # get az user name and validate your role assignment - principal_name=$(az account show --query "user.name" --output tsv | sed -r 's/[@]+/_/g') - az role assignment list --query "[].{principalName:principalName, roleDefinitionName:roleDefinitionName, scope:scope} | [? 
contains(principalName,'$principal_name')]" -o table +Validate user role on the subscription. Make sure your RoleDefinitionName is `Contributor` or `Owner` +to create resources in this lab successfully. - ``` +```bash +# get az user name and validate your role assignment +principal_name=$(az account show --query "user.name" --output tsv | sed -r 's/[@]+/_/g') +az role assignment list --query "[].{principalName:principalName, roleDefinitionName:roleDefinitionName, scope:scope} | [? contains(principalName,'$principal_name')]" -o table +``` -## Create Arc enabled AKS Cluster +## Create Arc-enabled AKS Cluster ### Create AKS Cluster -> This AKS setup is insecure and intended for learning, dev and test only. For secure or production clusters, please refer [AKS Secure Baseline](https://github.com/mspnp/aks-baseline) - - ```bash - - # set MY_AKS_CLUSTER - export MY_AKS_CLUSTER=central-tx-atx-$MY_BRANCH-aks - - # set MY_RG - export MY_RG=$MY_BRANCH-rg +> ⛔️ **NOTE**: This AKS setup is **insecure** and intended for learning, dev and test only. For secure or production clusters, please refer [AKS Secure Baseline](https://github.com/mspnp/aks-baseline). 
- ``` - - ```bash +```bash +# set MY_AKS_CLUSTER +export MY_AKS_CLUSTER=central-tx-atx-$MY_BRANCH-aks - # create resource group - az group create --name $MY_RG --location eastus +# set MY_RG +export MY_RG=$MY_BRANCH-rg - # create AKS cluster - # this may take 3-5 mins - az aks create -g $MY_RG -n $MY_AKS_CLUSTER --enable-managed-identity --node-count 1 --generate-ssh-keys +# create resource group +az group create --name $MY_RG --location eastus - ``` +# create AKS cluster +# this may take 3-5 mins +az aks create -g $MY_RG -n $MY_AKS_CLUSTER \ +--enable-managed-identity --node-count 1 \ +--generate-ssh-keys +``` ### Arc enable the AKS Cluster - ```bash - - # install the connectedk8s Azure CLI extension - az extension add --name connectedk8s - - # get aks creds - az aks get-credentials --resource-group $MY_RG --name $MY_AKS_CLUSTER +```bash +# install the connectedk8s Azure CLI extension +az extension add --name connectedk8s - # arc enable - # this may take 2-3 mins - az connectedk8s connect --name $MY_AKS_CLUSTER --resource-group $MY_RG --location eastus +# get aks creds +az aks get-credentials --resource-group $MY_RG --name $MY_AKS_CLUSTER - ``` +# arc enable +# this may take 2-3 mins +az connectedk8s connect --name $MY_AKS_CLUSTER \ +--resource-group $MY_RG \ +--location eastus +``` ## Set up for GitOps -### Flt setup +### `flt` setup - ```bash - - # create the cluster metadata - flt create --gitops-only -c $MY_AKS_CLUSTER +```bash +# create the cluster metadata +flt create --gitops-only -c $MY_AKS_CLUSTER +``` - ``` +This will create the cluster metadata for GitOps. You will have to wait for the CI/CD steps to +complete. -- This will create the cluster metadata for GitOps -- Wait for ci-cd to complete -- run `git pull` - - You should see the yaml files created in the `clusters` directory +Run `git pull` and you should see the yaml files created in the `clusters` directory! 
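
One way to confirm that CI/CD has produced the manifests before moving on is to look for the namespace file that the next step applies. This is a minimal sketch, assuming the `clusters/<cluster>/flux-system/namespace.yaml` layout used by the flux secret step; the `manifests_ready` helper name is hypothetical and not part of `flt`.

```bash
# manifests_ready REPO_DIR CLUSTER: succeed once the generated
# flux-system namespace manifest exists for the given cluster.
# Hypothetical helper for illustration only; not part of flt.
manifests_ready() {
  local repo_dir="$1" cluster="$2"
  [ -f "${repo_dir}/clusters/${cluster}/flux-system/namespace.yaml" ]
}

# run from the root of the repo after `git pull`
if manifests_ready "." "${MY_AKS_CLUSTER:-}"; then
  echo "cluster manifests found"
else
  echo "manifests not found yet - wait for CI/CD, then run git pull again"
fi
```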
### Create flux secret - ```bash - - # create secret for GitOps - kubectl apply -f "clusters/$MY_AKS_CLUSTER/flux-system/namespace.yaml" - flux create secret git flux-system -n flux-system --url "$PIB_FULL_REPO" -u gitops -p "$PIB_PAT" +```bash +# create secret for GitOps +kubectl apply -f "clusters/$MY_AKS_CLUSTER/flux-system/namespace.yaml" - ``` +flux create secret git flux-system \ +-n flux-system \ +--url "$PIB_FULL_REPO" \ +-u gitops -p "$PIB_PAT" +``` ## Create GitOps configuration with Arc -- All of the values you need for Arc setup are displayed via `flt env` and starts with PIB_* +All of the values you need for Arc setup are displayed via `flt env` and start with `PIB_*`. -- From the Azure Arc Portal - - Select `Kubernetes Clusters` in left nav - - Select the cluster - - Select `GitOps` in the left nav - - Click `Create` +- Go to the Azure Arc Portal. +- Select `Kubernetes Clusters` in left navigation. +- Select the cluster. +- Select `GitOps` in the left navigation. +- Click `Create`. 
+- The recommended settings are: - Basics: - Configuration Name: `gitops` - Namespace: `flux-system` @@ -134,7 +128,7 @@ git branch --show-current - Source kind: `Git Repository` - Repository URL: `https://github.com/kubernetes101/pib-dev` - Reference type: `Branch` - - Branch: value of MY_BRANCH + - Branch: value of `MY_BRANCH` - Repository Type:`Private` - Authentication source: `Provide Authentication information here` - HTTPS User: `gitops` @@ -143,28 +137,25 @@ git branch --show-current - Sync interval: 1 minute - Sync timeout: 3 minutes - Kustomization - - Instance name: flux-system - - Path: ./clusters/{{cluster-name}}/flux-system + - Instance name: `flux-system` + - Path: `./clusters/{{cluster-name}}/flux-system` - Sync interval: 1 minute - Sync timeout: 3 minutes - Retry interval: 1 minute - Prune: checked - Force: checked - Depends on: leave blank - - Wait for gitops configuration to be created - - Select `Configuration objects` from the left nav of `GitOps` config - - Wait for `gitops-flux-system` to be created - - Wait for it to be `Compliant` +- Wait for GitOps configuration to be created. +- Select `Configuration objects` from the left navigation of the `GitOps` configuration. +- Wait for `gitops-flux-system` to be created, and then for it to be "Compliant." ## Check in Arc Portal -- Use the Arc Portal to check GitOps setup - - Check cluster namespaces and workloads +Use the Arc Portal to check GitOps setup. Check on the cluster namespaces and workloads. ## Deploy sample app ```bash - # make sure you're in the apps/imdb directory cd apps/imdb @@ -181,15 +172,13 @@ flt targets deploy git pull # check the clusters in Arc for the imdb workload - ``` ## Cleanup -- Once you're finished with the workshop and experimenting, delete your cluster and Azure resources +Once you're finished with the workshop and experimenting, delete your cluster and Azure resources: ```bash - # delete rg and AKS cluster az group delete -n $MY_RG @@ -208,5 +197,4 @@ cd ../.. 
# update the repo git commit -am "deleted fleet" git push - ``` diff --git a/labs/outer-loop-arc-gitops.md b/labs/outer-loop-arc-gitops.md index a211cb49..877cb9c0 100644 --- a/labs/outer-loop-arc-gitops.md +++ b/labs/outer-loop-arc-gitops.md @@ -17,25 +17,26 @@ git branch --show-current ## Login to Azure -- Login to Azure using `az login --use-device-code` - > Use `az login --use-device-code --tenant ` to specify a different tenant - - If you have more than one Azure subscription, select the correct subscription +Login to Azure using `az login --use-device-code`. - ```bash +Use `az login --use-device-code --tenant ` to specify a different tenant if you have access +to more than one tenant. - # verify your account - az account show +If you have more than one Azure subscription, select the correct subscription: - # list your Azure accounts - az account list -o table +```bash +# verify your account +az account show - # set your Azure subscription - az account set -s mySubNameOrId +# list your Azure accounts +az account list -o table - # verify your account - az account show +# set your Azure subscription +az account set -s mySubNameOrId - ``` +# verify your account +az account show +``` - Validate user role on subscription > Make sure your RoleDefinitionName is `Contributor` or `Owner` to create resources in this lab succssfully @@ -50,29 +51,30 @@ git branch --show-current ## Create/Set Managed Identity -- If you don't already have managed identity set in your subscription, follow [these steps](./azure-codespaces-setup.md#create-managed-identity) to create RG and MI -- Run `flt env` and make sure `PIB_MI` is set +If you don't already have managed identity set in your subscription, follow [these steps](./azure-codespaces-setup.md#create-managed-identity) +to create a resource group and managed identity. + +Run `flt env` and make sure `PIB_MI` is set. ## Create an Arc enabled Dev Cluster -```bash +> **NOTE**: The cluster name here is just an example. 
You can follow a different pattern. +```bash # set MY_CLUSTER export MY_CLUSTER=central-tx-atx-$MY_BRANCH # create an arc enabled cluster # it will take about 2 minutes to create the VM flt create cluster -c $MY_CLUSTER --arc - ``` ## Update Git Repo -- [CI-CD](https://github.com/kubernetes101/pib-dev/actions) generates the deployment manifests - - Wait for CI-CD to complete (usually about 30 seconds) +[CI-CD](https://github.com/kubernetes101/pib-dev/actions) generates the deployment manifests. You +will have to wait for CI-CD to complete, which usually takes about 30 seconds. ```bash - # update the git repo after ci-cd completes git pull @@ -80,13 +82,11 @@ git pull git add ips git commit -am "added ips" git push - ``` ## Verify the Cluster Setup ```bash - # check the setup for "complete" # rerun as necessary flt check setup @@ -94,69 +94,65 @@ flt check setup # optional - use the Linux watch command # press ctl-c after "complete" watch flt check setup - ``` ## Deploy IMDb App -- Deploy IMDb app to Arc enabled K3d cluster - - ```bash +Deploy IMDb app to Arc-enabled K3d cluster: - # start in the apps/imdb directory - cd $PIB_BASE/apps/imdb - - # deploy to central and west regions - flt targets add all - flt targets deploy - - ``` +```bash +# start in the apps/imdb directory +cd $PIB_BASE/apps/imdb -- Wait for [ci-cd](https://github.com/kubernetes101/pib-dev/actions) to finish -- Force cluster to sync +# deploy to central and west regions +flt targets add all +flt targets deploy +``` - ```bash +Then wait for [ci-cd](https://github.com/kubernetes101/pib-dev/actions) to finish. 
- # should see imdb added - git pull +You can force cluster to sync: - # force flux to reconcile - flt sync +```bash +# should see imdb added +git pull - ``` +# force flux to reconcile +flt sync +``` ## Validate IMDb app on Azure Arc -- Get Azure Arc bearer token by running - `flt az arc-token` -- Login to Azure Portal and navigate to `Azure Arc` service -- Click on `Kubernetes clusters` from the left nav and select your cluster -- Click on `Workloads` from the left nav and place bearer token retrieved earlier -- Validate the IMDb app running on the cluster +You can get Azure Arc bearer token by running `flt az arc-token`. -## Delete Your Cluster +To check the app: -- Once you're finished with the workshop and experimenting, delete your cluster +- Login to Azure Portal and navigate to `Azure Arc` service. +- Click on `Kubernetes clusters` from the left navigation and select your cluster. +- Click on `Workloads` from the left navigation and place bearer token retrieved earlier. +- Validate the IMDb app running on the cluster. - ```bash +## Delete Your Cluster - # start in the root of your repo - cd $PIB_BASE - git pull +Once you're finished with the workshop and experimenting, delete your cluster. - # delete azure resource - flt delete $MY_CLUSTER +```bash +# start in the root of your repo +cd $PIB_BASE +git pull - # remove ips file - rm ips +# delete azure resource +flt delete $MY_CLUSTER - # reset the targets - cd apps/imdb - flt targets clear - cd ../.. +# remove ips file +rm ips - # update the repo - git commit -am "deleted cluster" - git push +# reset the targets +cd apps/imdb +flt targets clear +cd ../.. 
- ``` +# update the repo +git commit -am "deleted cluster" +git push +``` diff --git a/labs/outer-loop-multi-cluster.md b/labs/outer-loop-multi-cluster.md index 50f70567..e7e613d8 100644 --- a/labs/outer-loop-multi-cluster.md +++ b/labs/outer-loop-multi-cluster.md @@ -3,170 +3,145 @@ ## Validate cluster identifier and working branch ```bash - # by default, MY_BRANCH is set to your lower case GitHub User Name # the variable is used to uniquely name your clusters # the value can be overwritten if needed echo $MY_BRANCH - -# make sure your branch is set and pushed remotely -# commands will fail if you are in main branch -git branch --show-current - ``` ## Login to Azure -- Login to Azure using `az login --use-device-code` - > Use `az login --use-device-code --tenant ` to specify a different tenant - - If you have more than one Azure subscription, select the correct subscription +Login to Azure using `az login --use-device-code`. - ```bash +Use `az login --use-device-code --tenant ` to specify a different tenant if you have access +to more than one tenant. - # verify your account - az account show +If you have more than one Azure subscription, select the correct subscription: - # list your Azure accounts - az account list -o table +```bash +# verify your account +az account show - # set your Azure subscription - az account set -s mySubNameOrId +# list your Azure accounts +az account list -o table - # verify your account - az account show +# set your Azure subscription +az account set -s mySubNameOrId - ``` +# verify your account +az account show +``` -- Validate user role on subscription - > Make sure your RoleDefinitionName is `Contributor` or `Owner` to create resources in this lab succssfully +Validate the user role on subscription. Make sure your RoleDefinitionName is `Contributor` or `Owner` +to create resources in this lab successfully. 
```bash - # get az user name and validate your role assignment principal_name=$(az account show --query "user.name" --output tsv | sed -r 's/[@]+/_/g') az role assignment list --query "[].{principalName:principalName, roleDefinitionName:roleDefinitionName, scope:scope} | [? contains(principalName,'$principal_name')]" -o table - ``` ## Create 3 Clusters -- Use one Azure Resource Group -- Create a cluster in each region - - You can use different names as long as they are unique - - Standard Naming Format - - region (central, east, west) - - state - - city - - store_number - - ```bash +In one Azure Resource Group, create a cluster in three different regions. You can use different +names as long as they are unique. - # start in the base of the repo - cd $PIB_BASE +In our example we're using the following name format: +region (central, east, west), state, the branch name, and a "store number." - flt create \ - -g $MY_BRANCH-fleet \ - -c central-tx-$MY_BRANCH-1001 \ - -c east-ga-$MY_BRANCH-1001 \ - -c west-wa-$MY_BRANCH-1001 +```bash +# start in the base of the repo +cd $PIB_BASE - ``` +flt create \ + -g $MY_BRANCH-fleet \ + -c central-tx-$MY_BRANCH-1001 \ + -c east-ga-$MY_BRANCH-1001 \ + -c west-wa-$MY_BRANCH-1001 +``` ## Verifying the Clusters -- Update Git Repo after [CI-CD](https://github.com/kubernetes101/pib-dev/actions) is complete (usually about 30 seconds) - - ```bash - - # update the git repo after ci-cd completes - git pull +Update Git Repo after [CI-CD](https://github.com/kubernetes101/pib-dev/actions) is complete. +This usually takes about 30 seconds. 
- # add ips to repo - git add ips - git commit -am "added ips" - git push - - ``` - -- Verify clusters setup - - ```bash - - # check the setup for "complete" - # rerun as necessary - flt check setup - - ``` +```bash +# update the git repo after ci-cd completes +git pull -- Check Heartbeat +# add ips to repo +git add ips +git commit -am "added ips" +git push +``` - ```bash +Then verify the cluster's setup with `flt check setup`. Check the setup status for "complete", +and rerun as necessary. - # check that heartbeat is running on your cluster - flt check heartbeat +You can check the heartbeat. - # check heartbeat on clusters in specific region - flt check heartbeat --filter central +```bash +# check that heartbeat is running on your cluster +flt check heartbeat - ``` +# check heartbeat on clusters in specific region +flt check heartbeat --filter central +``` ## IMDb Deployment -- By default, the IMDb app is not deployed to any clusters -- Experiment with different deployments - - ```bash - - # start in the apps/imdb directory - cd $PIB_BASE/apps/imdb +By default, the IMDb app is not deployed to any clusters. So now we can experiment with +different deployments! 
- # deploy to central and west regions - flt targets add region:central region:west - flt targets deploy +```bash +# start in the apps/imdb directory +cd $PIB_BASE/apps/imdb - # wait for ci-cd to complete and update the cluster - git pull - flt sync +# deploy to central and west regions +flt targets add region:central region:west +flt targets deploy - # check the cluster for imdb - flt check app imdb +# wait for ci-cd to complete and update the cluster +git pull +flt sync - # deploy to just the east region - flt targets clear - flt targets add region:east - flt targets deploy +# check the cluster for imdb +flt check app imdb - # wait for ci-cd to complete and update the repo - git pull - flt sync +# deploy to just the east region +flt targets clear +flt targets add region:east +flt targets deploy - # check the cluster for imdb - flt check app imdb +# wait for ci-cd to complete and update the repo +git pull +flt sync - # deploy to all clusters - flt targets clear - flt targets add all - flt targets deploy +# check the cluster for imdb +flt check app imdb - # wait for ci-cd to complete and update the repo - git pull - flt sync +# deploy to all clusters +flt targets clear +flt targets add all +flt targets deploy - # check the cluster for imdb - flt check app imdb +# wait for ci-cd to complete and update the repo +git pull +flt sync - ``` +# check the cluster for imdb +flt check app imdb +``` ## Deploy Dogs-Cats App -- Dogs and cats app is a simple "voting" app for demo purposes - - Note that dogs-cats and IMDb cannot be deployed to the same cluster due to ingress conflicts - - In a production environment, you would add ingress rules for host, url, or port based routing +The "Dogs and Cats" app is a simple "voting" app for demo purposes. -> Start in the apps/imdb directory +> Note that dogs-cats and IMDb cannot be deployed to the same cluster due to ingress conflicts. 
-```bash +In a production environment, you would add ingress rules for host, url, or port-based routing. +```bash # start in the apps/imdb directory cd $PIB_BASE/apps/imdb @@ -189,15 +164,13 @@ flt sync flt check app imdb flt check app dogs flt curl /version - ``` ## Clean Up -- Once you are finished with the workshop, you can delete your Azure resources +Once you are finished with the workshop, you can delete your Azure resources. ```bash - # start in the base of the repo cd $PIB_BASE git pull @@ -221,5 +194,4 @@ cd ../.. # update the repo git commit -am "deleted fleet" git push - ``` diff --git a/labs/outer-loop-ring-deployment.md b/labs/outer-loop-ring-deployment.md index 121cec96..419d1218 100644 --- a/labs/outer-loop-ring-deployment.md +++ b/labs/outer-loop-ring-deployment.md @@ -1,10 +1,12 @@ # PiB outer-loop with Ring Based Deployment -- PiB includes `GitOps Automation` that uses `cluster metadata` for targeted deployments -- In this lab we will - - Create the GitOps structure for 15 clusters - - Add `ring` metadata to each cluster - - Add targets based on cluster metadata +PiB includes GitOps Automation that uses cluster metadata for targeted deployments. + +In this lab we will: + +- Create the GitOps structure for 15 clusters. +- Add `ring` metadata to each cluster. +- Add targets based on cluster metadata. 
## Validate cluster identifier and working branch @@ -23,61 +25,57 @@ git branch --show-current ## Create 15 Clusters -> Note: we don't actually create the clusters, just the GitOps folders - - ```bash - - # start in the base of the repo - cd $PIB_BASE - - flt create \ - --gitops-only \ - -g $MY_BRANCH-fleet \ - -c central-tx-atx-101 \ - -c central-tx-dal-101 \ - -c central-tx-hou-101 \ - -c central-tx-ftw-101 \ - -c central-tx-san-101 \ - -c east-ga-atl-101 \ - -c east-fl-mia-101 \ - -c east-al-bham-101 \ - -c east-ms-bil-101 \ - -c east-nc-clt-101 \ - -c west-wa-sea-101 \ - -c west-nv-lv-101 \ - -c west-ca-sd-101 \ - -c west-or-pdx-101 \ - -c west-mt-bose-101 - - ``` +> **NOTE**: we don't actually create the clusters, just the GitOps folders. -## Cluster Metadata Files +```bash +# start in the base of the repo +cd $PIB_BASE - ```bash +flt create \ + --gitops-only \ + -g $MY_BRANCH-fleet \ + -c central-tx-atx-101 \ + -c central-tx-dal-101 \ + -c central-tx-hou-101 \ + -c central-tx-ftw-101 \ + -c central-tx-san-101 \ + -c east-ga-atl-101 \ + -c east-fl-mia-101 \ + -c east-al-bham-101 \ + -c east-ms-bil-101 \ + -c east-nc-clt-101 \ + -c west-wa-sea-101 \ + -c west-nv-lv-101 \ + -c west-ca-sd-101 \ + -c west-or-pdx-101 \ + -c west-mt-bose-101 +``` - ls -alF clusters/*.yaml +## Cluster Metadata Files - cat clusters/central-tx-atx-101.yaml +We can see the different metadata files and sample the contents inside one of them. - ``` +```bash +ls -alF clusters/*.yaml -## Update Git Repo +cat clusters/central-tx-atx-101.yaml +``` -- `flt create` generates GitOps files for the cluster -- [CI-CD](https://github.com/kubernetes101/pib-dev/actions) generates the deployment manifests - - Wait for CI-CD to complete (usually about 30 seconds) +## Update Git Repo - ```bash +`flt create` generates GitOps files for the cluster. [CI-CD](https://github.com/kubernetes101/pib-dev/actions) +generates the deployment manifests. 
- # update the git repo after ci-cd completes - git pull +Wait for CI-CD to complete, which usually takes about 30 seconds. - ``` +```bash +# update the git repo after ci-cd completes +git pull +``` ## Add Metadata to Clusters ```bash - echo "ring: 0" >> clusters/central-tx-atx-101.yaml echo "ring: 1" >> clusters/central-tx-dal-101.yaml echo "ring: 2" >> clusters/central-tx-ftw-101.yaml @@ -95,13 +93,11 @@ echo "ring: 3" >> clusters/west-or-pdx-101.yaml echo "ring: 4" >> clusters/west-wa-sea-101.yaml git add clusters - ``` ## Deploy IMDb to ring:0 ```bash - cd $PIB_BASE/apps/imdb flt targets clear flt targets add ring:0 @@ -109,41 +105,35 @@ flt targets deploy # wait for ci-cd to finish git pull - ``` ## Add ring:1 ```bash - cd $PIB_BASE/apps/imdb flt targets add ring:1 flt targets deploy # wait for ci-cd to finish git pull - ``` ## Add Central Region ```bash - cd $PIB_BASE/apps/imdb flt targets add region:central flt targets deploy # wait for ci-cd to finish git pull - ``` ## Clean Up -- Once you are finished with the workshop, you can delete your GitOps resources +Once you are finished with the workshop, you can delete your GitOps resources. ```bash - # start in the base of the repo cd $PIB_BASE git pull @@ -164,5 +154,4 @@ flt targets clear flt targets deploy cd ../.. 
- ``` diff --git a/labs/outer-loop.md b/labs/outer-loop.md index 4ca5b505..088745de 100644 --- a/labs/outer-loop.md +++ b/labs/outer-loop.md @@ -2,30 +2,34 @@ ## Introduction -- As part of PiB, we have automated the creation of `dev/test clusters` using a repeatable, consistent, Infrastructure as Code approach -- PiB ships a separate CLI (`flt`) for outer-loop dev/test clusters -- The dev/test clusters run `k3d` in an `Azure VM` -- `flt` connects to the dev/test VMs via SSH -- Access to the dev/test fleet can be shared between Codespaces and users - - We have an advanced workshop under development for fleet sharing +As part of PiB, we have automated the creation of dev/test clusters using a repeatable, +consistent, Infrastructure as Code approach. PiB ships a separate CLI (`flt`) for `outer-loop` dev/test clusters. -![images](./images/outer-loop..drawio.png) +The clusters run `k3d` in an Azure Virtual Machine (VM). `flt` connects to the dev/test VMs via SSH. +Access to the dev/test fleet can be shared between Codespaces and users. -## flt CLI +We have an advanced workshop under development for fleet sharing! Keep an eye out. -- `flt` encapsulates many of the hard concepts of K8s for the application developer, tester, and ops team -- A design requirement is that `flt` can't have any "magic" - - Anything you can do with `flt`, you can do using standard K8s tools -- `flt` has rich tab completion -- `flt` is `context aware` - - Running `flt list` will return Error: unknown command "list" for "flt" -- The `flt CLI` is customizable and extensible - - We have a workshop under development as an advanced scenario +![A simplified fleet diagram. TODO: This diagram could be clearer.](./images/outer-loop.png) -## Experiment with the `flt CLI` +## `flt` CLI -```bash +`flt` encapsulates many of the hard concepts of Kubernetes (K8s) for the application developer, +tester, and ops team. A design requirement is that `flt` can't have any "magic."
Anything you can +do with `flt`, you can do using standard K8s tools. + +`flt` has rich tab completion and is "context aware"; that is, depending on your location in the +directory structure and what you have already run, different commands will be available to you. + +Running `flt list` will return `Error: unknown command "list" for "flt"` when you first start, since no +fleets exist yet. TODO: CONFIRM THIS + +The `flt` CLI is customizable and extensible - we have a workshop under development as an advanced +scenario! Keep an eye out! +## Experiment with the `flt` CLI + +```bash # run flt flt @@ -35,7 +39,6 @@ flt env # Error: unknown command "list" for "flt" # context aware flt list - ``` ## Validate cluster identifier and working branch @@ -55,25 +58,26 @@ git branch --show-current ## Login to Azure -- Login to Azure using `az login --use-device-code` - > Use `az login --use-device-code --tenant ` to specify a different tenant - - If you have more than one Azure subscription, select the correct subscription +Login to Azure using `az login --use-device-code`. - ```bash +Use `az login --use-device-code --tenant ` to specify a different tenant if you have access +to more than one tenant. - # verify your account - az account show +If you have more than one Azure subscription, select the correct subscription: - # list your Azure accounts - az account list -o table +```bash +# verify your account +az account show - # set your Azure subscription - az account set -s mySubNameOrId +# list your Azure accounts +az account list -o table - # verify your account - az account show +# set your Azure subscription +az account set -s mySubNameOrId - ``` +# verify your account +az account show +``` - Validate user role on subscription > Make sure your RoleDefinitionName is `Contributor` or `Owner` to create resources in this lab succssfully @@ -89,24 +93,22 @@ git branch --show-current ## Create a Dev Cluster ```bash - # set MY_CLUSTER. 
export MY_CLUSTER=central-tx-atx-$MY_BRANCH # create cluster # it will take about 2 minutes to create the VM flt create cluster -c $MY_CLUSTER - ``` ## Update Git Repo -- `flt create` generates GitOps files for the cluster -- [CI-CD](https://github.com/kubernetes101/pib-dev/actions) generates the deployment manifests - - Wait for CI-CD to complete (usually about 30 seconds) +`flt create` generates GitOps files for the cluster. -```bash +A GitHub action [CI-CD](https://github.com/kubernetes101/pib-dev/actions) generates the deployment +manifests. You will need to wait for CI-CD to complete, which usually takes about 30 seconds. +```bash # update the git repo after ci-cd completes git pull @@ -114,20 +116,18 @@ git pull git add ips git commit -am "added ips" git push - ``` ## Verify the Cluster Setup -> If you get an SSH error, just retry every few seconds for the SSHD server to configure and start +> **NOTE**: If you get an SSH error, just retry every few seconds for the SSHD server to configure +> and start. -- `flt create` creates the Azure VM that hosts the k3d cluster -- Additional setup is done via the `cloud-init` script -- We have to wait for the k3d setup to complete - - This usually takes 3-4 minutes after VM setup completes +`flt create` creates the Azure VM that hosts the `k3d` cluster. Any additional setup is done via the +`cloud-init` script. We have to wait for the `k3d` setup to complete, which usually takes 3-4 minutes +after the VM setup completes. ```bash - # check the setup for "complete" # rerun as necessary flt check setup @@ -135,57 +135,50 @@ flt check setup # optional - use the Linux watch command # press ctl-c after "complete" watch flt check setup - ``` ## Force GitOps to Sync -- We use Flux v2 for GitOps -- Flux synchs on a schedule (1 minute) -- We can force Flux to sync (reconcile) immediately at any time +We use Flux v2 for GitOps. Flux syncs on a schedule of 1 minute. We can force Flux to sync (reconcile) immediately at any time. 
```bash - # force flux to sync flt sync # check flux status flt check flux - ``` ## Check Heartbeat -- We deploy the `heartbeat` app to the cluster for observability -- Check the heartbeat status - - Note that heartbeat uses Let's Encrypt for ingress - - It can take up to 60 seconds for the Let's Encrypt handshake to complete - - It's normal to get a `no healthy upstream` error until the handshake is completed +We deploy the `heartbeat` app to the cluster for observability. Check the heartbeat status. Note +that heartbeat uses "Let's Encrypt" for ingress. It can take up to 60 seconds for the "Let's +Encrypt" handshake to complete. It's normal to get a `no healthy upstream` error until the handshake +is completed. ```bash - # check that heartbeat is running on your cluster flt check heartbeat # another way to check flt curl /heartbeat/17 - ``` ## Deploy Reference App -- PiB provides a reference application - IMDb - - dotnet WebAPI that exposes actors, genres, and movies from a subset of the IMDb data set -- `flt` provides `GitOps Automation` for the dev/test fleet -- The `flt targets` commands control which clusters an app is deployed to - - `flt targets` is context aware, so the current directory matters -- `flt targets` can use any KV pair defined in the cluster metadata - - `flt targets add all` is a special case that deploys to all clusters in the fleet (which is 1 for this lab) - - Note that the `all` target must be the only target specified - - An advanced workshop demonstrating `ring based deployment` is under development +PiB provides a reference application, IMDb, which is a dotnet WebAPI that exposes actors, genres, +and movies from a subset of the IMDb data set. -```bash +`flt` provides GitOps Automation for the dev/test fleet. The `flt targets` commands control which +clusters an app is deployed to. `flt targets` is context aware, so the current directory matters! +The command can use any KV pair defined in the cluster metadata too.
+ +`flt targets add all` is a special case that deploys to all clusters in the fleet (which is one cluster +for this lab). Note that the `all` target must be the only target specified. +An advanced workshop demonstrating `ring based deployment` can be found [here](outer-loop-ring-deployment.md). + +```bash # start in the apps/imdb directory cd apps/imdb @@ -208,95 +201,82 @@ flt targets add region:central # deploy the app via ci-cd and GitOps Automation flt targets deploy - ``` -## Wait for ci-cd to finish +## Wait for `ci-cd` to finish -- Check [ci-cd status](https://github.com/kubernetes101/pib-dev/actions) +Check on the [ci-cd status](https://github.com/kubernetes101/pib-dev/actions). ## Update Cluster -- Force the cluster to sync -- Update local git repo +Force the cluster to sync and then update the local git repo. ```bash - # should see imdb added git pull # force flux to reconcile flt sync - ``` ## Verify IMDb was Deployed -- You may have to retry the command a few times as the pods start +> **NOTE**: You may have to retry the command a few times as the pods start. ```bash - # check the cluster for imdb flt check app imdb - ``` ## Curl an IMDb URL -- Note that IMDb uses Let's Encrypt for ingress - - It can take up to 60 seconds for the Let's Encrypt handshake to complete - - It's normal to get a `no healthy upstream` error until the handshake is completed +Note that IMDb uses "Let's Encrypt" for ingress. It can take up to 60 seconds for the "Let's Encrypt" +handshake to complete. It's normal to get a `no healthy upstream` error until the handshake is +completed.
- ```bash - - flt curl /version - - # curl additional URLs - flt curl /healthz - flt curl /readyz - flt curl /api/genres - flt curl /metrics +```bash +flt curl /version - ``` +# curl additional URLs +flt curl /healthz +flt curl /readyz +flt curl /api/genres +flt curl /metrics +``` ## Test DNS Integration -- Use http to test your external DNS name - - ```bash - - # export MY_IP - cd $PIB_BASE - export MY_IP=$(cat ips | cut -f2) +Use `http` to test your external DNS name. - http http://$MY_IP/version +```bash +# export MY_IP +cd $PIB_BASE +export MY_IP=$(cat ips | cut -f2) - http http://$MY_IP/api/genres +http http://$MY_IP/version - ``` +http http://$MY_IP/api/genres +``` ## Delete Your Cluster -- Once you're finished with the workshop and experimenting, delete your cluster - - ```bash - - # start in the root of your repo - cd $PIB_BASE - git pull - flt delete $MY_CLUSTER - rm ips - git commit -am "deleted cluster" - git push - - ``` +Once you're finished with the workshop and experimenting, delete your cluster. -- You can recreate your cluster at any time - - Note that to reuse the same name, you have to wait for the Azure RG to delete +```bash +# start in the root of your repo +cd $PIB_BASE +git pull +flt delete $MY_CLUSTER +rm ips +git commit -am "deleted cluster" +git push +``` - ```bash +You can recreate your cluster at any time. - # you should see your RG in the "deleting" state - flt az groups +> **NOTE**: To reuse the same name, you have to wait for the Azure resource group to delete. - ``` +```bash +# you should see your RG in the "deleting" state +flt az groups +``` diff --git a/templates/README.md b/templates/README.md index 092e2bb7..f3643bcf 100644 --- a/templates/README.md +++ b/templates/README.md @@ -1,7 +1,9 @@ # PiB Templates -This directory contains `Kubernetes deployment yaml` templates. These templates are used by the PiB apps so that Application Teams don't have to create custom deployment yaml for each application. 
+This directory contains Kubernetes deployment `yaml` templates. These templates are used by the PiB +apps so that Application Teams don't have to create custom deployment `yaml` for each application. You can add additional templates and reference from your application `app.yaml` file. -`GitOps Automation` automatically substutes the `{{gitops.*}}` values from the application and cluster metadata. GitOps Automation uses `Kustomize` to deploy the templates. +GitOps Automation automatically substitutes the `{{gitops.*}}` values from the application and +cluster metadata. GitOps Automation uses `kustomize` to deploy the templates. diff --git a/vm/README.md b/vm/README.md index 32426926..65976bf7 100644 --- a/vm/README.md +++ b/vm/README.md @@ -1,11 +1,14 @@ # PiB - VM Setup and Operations -This directory contains the setup and operations scripts used by `flt` to create and operate `dev/test clusters` - -- scripts - - These scripts get executed via SSH on the cluster - - For example: `flt check flux` executes the `vm/scripts/check-flux` bash script on each cluster in the fleet -- setup - - The `vm/setup/setup.sh` script is executed as part of `cloudinit` - - The scripts are broken into `stages` and can be customized - - The scripts run in the `pib` user context - `sudo` is available +This directory contains the setup and operations scripts used by `flt` to create and operate +dev/test clusters. + +## `/scripts` + +These scripts get executed via SSH on the cluster. For example: `flt check flux` executes the +`vm/scripts/check-flux` bash script on each cluster in the fleet. + +## `/setup` + +The `vm/setup/setup.sh` script is executed as part of `cloud-init`. The scripts are broken into +`stages` and can be customized. The scripts run in the `pib` user context, but `sudo` is available. From 779c3cba3136665a67045c4b3bb34407c5695208 Mon Sep 17 00:00:00 2001 From: Hannah Kennedy Date: Mon, 24 Oct 2022 18:36:40 -0400 Subject: [PATCH 2/5] More formatting, readability changes. 
--- README.md | 2 +- labs/advanced-labs/aks-iot/README.md | 212 ++++++++-------- labs/advanced-labs/canary/README.md | 234 +++++++++--------- labs/advanced-labs/monitoring/README.md | 3 +- .../monitoring/fluent-bit/README.md | 87 ++++--- .../monitoring/prometheus/README.md | 3 +- labs/advanced-labs/voe/README.md | 3 +- labs/inner-loop.md | 5 +- 8 files changed, 265 insertions(+), 284 deletions(-) diff --git a/README.md b/README.md index c2c1ae38..57236321 100644 --- a/README.md +++ b/README.md @@ -39,7 +39,7 @@ on to the next step of deploying the application to a test cluster in the Cloud There are also several [advanced labs](./README.md#advanced-labs) that cover centralized monitoring, canary deployments, and targeting different devices. -> Note: PiB is not intended as-is for production deployments. However, some concepts covered (GitOps +> **NOTE**: PiB is not intended as-is for production deployments. However, some concepts covered (GitOps > and Observability) are production-ready. ## Prerequisites diff --git a/labs/advanced-labs/aks-iot/README.md b/labs/advanced-labs/aks-iot/README.md index 7415cba8..ded08bd8 100644 --- a/labs/advanced-labs/aks-iot/README.md +++ b/labs/advanced-labs/aks-iot/README.md @@ -2,22 +2,24 @@ ## AKS-IoT is in Preview -- AKS-IoT is in preview so there's a chance these instructions will change over time -- Reach out to the soldevx team for access to AKS-IoT preview +> **NOTE**: AKS-IoT is in preview so there's a chance these instructions will +change over time! -## This document is a work in progress +Reach out to the soldevx team for access to AKS-IoT preview. TODO: How? 
+ +## PLEASE NOTE: This document is a work in progress ## AKS-IoT Setup - Create a new Azure VM with Windows 10 - - You can also use your Windows 10 or Windows 11 computer + - You can also use your local Windows 10 or Windows 11 computer - Run Windows Update -- Install Hyper-V - - Requires reboot -- Install git CLI -- Install gh CLI -- Install az CLI -- Install VS Code +- Install Hyper-V - Note: This requires a reboot! +- Install, if not already on your machine: + - [git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) + - [GitHub CLI](https://cli.github.com/manual/installation) + - [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli) + - [VS Code](https://code.visualstudio.com/Download) - Install Chocolatey - `Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))` @@ -25,90 +27,88 @@ - Install Helm - `choco install kubernetes-helm` -## Update bootstrap/aksiot-userconfig.json - -- Azure - - ResourceGroupName - - ClusterName - -## Set permanent Env Vars +## Update bootstrap/aksiot-userconfig.json TODO: Does this exist?
-- Update with your values +From the Azure portal, add these values to the file: - ```powershell +- ResourceGroupName +- ClusterName - setx AZ_TENANT yourTenant - setx AZ_SP_ID yourServicePrincipal - setx AZ_SP_KEY yourSPKey - setx PAT yourPAT - setx PIB_PAT %PAT% - setx GITHUB_TOKEN %PAT% +## Set permanent Env Vars - setx PIB_CLUSTER your-cluster-name-101 - setx PIB_RESOURCE_GROUP yourRG - setx PIB_FULL_REPO https://github.com/yourOrg/yourRepo - setx PIB_BRANCH yourBranch +Update with your environment to include your values: - ``` +```powershell +setx AZ_TENANT yourTenant +setx AZ_SP_ID yourServicePrincipal +setx AZ_SP_KEY yourSPKey +setx PAT yourPAT +setx PIB_PAT %PAT% +setx GITHUB_TOKEN %PAT% + +setx PIB_CLUSTER your-cluster-name-101 +setx PIB_RESOURCE_GROUP yourRG +setx PIB_FULL_REPO https://github.com/yourOrg/yourRepo +setx PIB_BRANCH yourBranch +``` -- You will need to exit and start a new shell after running the setx commands +> **NOTE**: You will need to exit and start a new shell after running the `setx` commands. ## Install AKS-IoT -- todo - update with download instructions - - ```powershell +- TODO: update with download instructions - # when prompted for git credentials, use your PAT to avoid 2FA setup / issues +```powershell +# when prompted for git credentials, use your PAT to avoid 2FA setup / issues - # start in the directory you copied the file share to - git clone https://github.com/kubernetes101/pib-dev +# start in the directory you copied the file share to +git clone https://github.com/kubernetes101/pib-dev - # install the msi - cd bin - AksIot-K3s.msi - cd .. +# install the msi +cd bin +AksIot-K3s.msi +cd .. 
- ``` +``` - Set PiB Base to current directory - ```powershell +```powershell - cd pilot-in-a-box +cd pilot-in-a-box - cd +cd - setx PIB_BASE +setx PIB_BASE - ``` +``` - From AKS-IoT/bootstrap directory - - These commands must be run from the AKS-IoT Powershell +- These commands must be run from the AKS-IoT Powershell - ```powershell +```powershell - # start elevated shell - # (optional) create a shortcut on your desktop - LaunchPrompt.cmd +# start elevated shell +# (optional) create a shortcut on your desktop +LaunchPrompt.cmd - # add az cli extensions - az extension add --upgrade --name connectedk8s - az extension add --upgrade --name k8s-configuration - az extension add --upgrade --name k8s-extension - az extension add --upgrade --name k8s-configuration +# add az cli extensions +az extension add --upgrade --name connectedk8s +az extension add --upgrade --name k8s-configuration +az extension add --upgrade --name k8s-extension +az extension add --upgrade --name k8s-configuration - az provider register --namespace Microsoft.Kubernetes - az provider register --namespace Microsoft.KubernetesConfiguration - az provider register --namespace Microsoft.ExtendedLocation +az provider register --namespace Microsoft.Kubernetes +az provider register --namespace Microsoft.KubernetesConfiguration +az provider register --namespace Microsoft.ExtendedLocation - # Initialize Arc - Initialize-ArcIot +# Initialize Arc +Initialize-ArcIot - # update help (this is required later in the process) - update-help +# update help (this is required later in the process) +update-help - ``` +``` ## Create a K3s Cluster @@ -138,76 +138,64 @@ del token.txt ## Arc Enable the Cluster -> Make sure you set your Env Vars above and started a new shell +> **NOTE**: Make sure you set your Env Vars above and have started a new shell. 
```powershell - # (optional) login to Azure with SP az login --service-principal --tenant $Env:AZ_TENANT --username $Env:AZ_SP_ID --password $Env:AZ_SP_KEY # connect the cluster to Arc az connectedk8s connect --name $Env:PIB_CLUSTER --resource-group $Env:PIB_RESOURCE_GROUP - ``` ## Arc Enabled GitOps -- Create GitOps config - - Copy `pilot-in-a-box/labs/advanced-labs/aks-iot/sample-cluster.txt` - - To `pilot-in-a-box/clusters/your-cluster.yaml` - - Git add, commit, push - - Wait for ci-cd to complete - -- Arc enable GitOps - - ```powershell - - az k8s-configuration flux create ` - --cluster-type connectedClusters ` - --interval 1m ` - --kind git ` - --name gitops ` - --namespace flux-system ` - --scope cluster ` - --timeout 3m ` - --https-user gitops ` - --cluster-name $Env:PIB_CLUSTER ` - --resource-group $Env:PIB_RESOURCE_GROUP ` - --url $Env:PIB_FULL_REPO ` - --branch $Env:PIB_BRANCH ` - --https-key $Env:PIB_PAT ` - --kustomization ` - name=flux-system ` - path=./clusters/$Env:PIB_CLUSTER/flux-system/listeners ` - timeout=3m ` - sync_interval=1m ` - retry_interval=1m ` - prune=true ` - force=true - - ``` +Create the GitOps config by copying `pilot-in-a-box/labs/advanced-labs/aks-iot/sample-cluster.txt` +to `pilot-in-a-box/clusters/your-cluster.yaml`. Then git add, commit, and push. Once the +`ci-cd` GitHub action is complete, you can Arc-enable GitOps.
+ +```powershell + +az k8s-configuration flux create ` + --cluster-type connectedClusters ` + --interval 1m ` + --kind git ` + --name gitops ` + --namespace flux-system ` + --scope cluster ` + --timeout 3m ` + --https-user gitops ` + --cluster-name $Env:PIB_CLUSTER ` + --resource-group $Env:PIB_RESOURCE_GROUP ` + --url $Env:PIB_FULL_REPO ` + --branch $Env:PIB_BRANCH ` + --https-key $Env:PIB_PAT ` + --kustomization ` + name=flux-system ` + path=./clusters/$Env:PIB_CLUSTER/flux-system/listeners ` + timeout=3m ` + sync_interval=1m ` + retry_interval=1m ` + prune=true ` + force=true + +``` ## Test Arc and Arc GitOps -- Open the Azure Portal -- Open Arc Blade -- Select your Cluster -- Get your Service Token - - From `AKS-IoT\bootstrap` directory - - `type servicetoken.txt` - - Copy and paste token +Open the Azure Portal and open the Arc Blade. Locate your cluster and get a Service Token. +TODO: Is this accurate?--> From `AKS-IoT\bootstrap` directory, copy the Service Token into a file +`servicetoken.txt`. ## Delete Cluster -- From the aks-iot/bootstrap dir +From within the `aks-iot/bootstrap` directory: ```powershell - LaunchPrompt.cmd # delete the cluster Remove-AksIotNode exit - ``` diff --git a/labs/advanced-labs/canary/README.md b/labs/advanced-labs/canary/README.md index 6ea60fec..2e7e3ce9 100644 --- a/labs/advanced-labs/canary/README.md +++ b/labs/advanced-labs/canary/README.md @@ -1,12 +1,16 @@ -# PiB Automated Canary deployment using Flagger +# PiB Automated Canary Deployment Using Flagger ## Introduction -[Flagger](https://flagger.app/) is a progressive delivery tool that automates the release process for applications running on Kubernetes. It takes a Kubernetes deployment and creates a series of objects (Kubernetes deployments, ClusterIP services and Contour HTTPProxy) for an application. These objects expose the application in the cluster and drive the canary analysis and promotion.
+[Flagger](https://flagger.app/) is a progressive delivery tool that automates the release process +for applications running on Kubernetes. It takes a Kubernetes deployment and creates a series of objects +(Kubernetes deployments, ClusterIP services and Contour HTTPProxy) for an application. These objects +expose the application in the cluster and drive the canary analysis and promotion. ## Lab Prerequisites -- Complete outer-loop [Lab 1](../../outer-loop.md) and skip the [Delete Your Cluster](../../outer-loop.md#delete-your-cluster) section +Complete outer-loop [Lab 1](../../outer-loop.md) **BUT** skip the [Delete Your Cluster](../../outer-loop.md#delete-your-cluster) section. The subsequent sections rely on the environment variables set in that +lab. ## Validate cluster identifier and working branch @@ -24,7 +28,6 @@ git branch --show-current ## Install Flagger ```bash - # make sure you are in canary directory cd $PIB_BASE/labs/advanced-labs/canary @@ -42,28 +45,25 @@ cd apps/flagger # check deploy targets (should be []) flt targets list -# clear the targets if not [] -flt targets clear +# if not [] then clear the targets +# flt targets clear # add all clusters as a target flt targets add all # deploy the changes flt targets deploy - ``` -### Check that your GitHub Action is running +### Check GitHub Action Status -- - - your action should be queued or in-progress +Check that your GitHub Action is running, either queued or in-progress. Check the action here: TODO: Is this for all builds or will it be for the presumably forked repo?
-### Check deployment +### Check Deployment -- Once the action completes successfully +Once the action completes successfully, check the deployment: ```bash - # you should see flagger added to your cluster git pull @@ -75,147 +75,139 @@ flt sync # NOTE: We also deploy prometheus to scrape metrics to monitor Canary deployment flt check app flagger flt check app prometheus - ``` -## Update reference app to use Canary deployment Strategy - -- To update IMDb reference app to use canary deployment template: - - Update `apps/imdb/app.yaml` with template value
- `template: pib-service-canary` - - ```bash - - # deploy imdb with canary template - cd ../imdb - flt targets deploy - - ``` - - - Once the [github action](https://github.com/kubernetes101/pib-dev/actions) is completed, force flux to sync: - - ```bash - - # force flux to sync - # flux will sync on a schedule - this command forces it to sync now for debugging - git pull - flt sync - - ``` - - The reference app should be updated with Canary Deployment objects listed: - - ```bash +## Update Reference App to Use Canary Deployment Strategy - deployment.apps/imdb - deployment.apps/imdb-primary - deployment.apps/webv-imdb - service/imdb - service/imdb-canary - service/imdb-primary - service/webv-imdb - httpproxy.projectcontour.io/imdb +In order to update IMDb reference app to use canary deployment template, you'll need to update a few +values. In `apps/imdb/app.yaml`, change the `template` line to be `template: pib-service-canary` instead +of `template: pib-service` - ``` - - - Validate primary and canary objects in the cluster: - - ```bash - - flt ssh $MY_CLUSTER - kic pods - kic svc - kubectl get canary -n imdb - - # exit from cluster - exit +```bash +# deploy imdb with canary template +cd ../imdb +flt targets deploy +``` - ``` +Once the [github action](https://github.com/kubernetes101/pib-dev/actions) is completed, force flux +to sync: -## Observe automated canary promotion +```bash +# force flux to sync +# flux will sync on a schedule - this command forces it to sync now for debugging +git pull +flt sync +``` -- Trigger a canary deployment by updating the container image for IMDb: - - Update `apps/imdb/app.yaml` with image tag from `latest` to `beta`
- `image: ghcr.io/cse-labs/pib-imdb:beta` +The reference app should be updated with Canary Deployment objects listed: - ```bash +```bash +deployment.apps/imdb +deployment.apps/imdb-primary +deployment.apps/webv-imdb +service/imdb +service/imdb-canary +service/imdb-primary +service/webv-imdb +httpproxy.projectcontour.io/imdb +``` - # deploy imdb with updated version - cd ../imdb - flt targets deploy +Then validate primary and canary objects in the cluster: - ``` +```bash +flt ssh $MY_CLUSTER +kic pods +kic svc +kubectl get canary -n imdb - - Once the [github action](https://github.com/kubernetes101/pib-dev/actions) is completed, force flux to sync: +# exit from cluster +exit +``` - ```bash +## Observe Automated Canary Promotion - # force flux to sync - # flux will sync on a schedule - this command forces it to sync now for debugging - git pull - flt sync +Next, trigger a canary deployment by updating the container image for IMDb. In `apps/imdb/app.yaml`, +change the image tag from `latest` to `beta`: `image: ghcr.io/cse-labs/pib-imdb:beta`. - ``` +```bash +# deploy imdb with updated version +cd ../imdb +flt targets deploy +``` -- Observe canary promotion in K9s: +Once the [github action](https://github.com/kubernetes101/pib-dev/actions) is completed, force flux +to sync again: - ```bash +```bash +# force flux to sync +# flux will sync on a schedule - this command forces it to sync now for debugging +git pull +flt sync +``` - # start K9s for the cluster - flt ssh $MY_CLUSTER - k9s +You can then observe the canary promotion in K9s: - ``` +```bash +# start K9s for the cluster +flt ssh $MY_CLUSTER +k9s +``` - - Type `:canaries ` to view canary object - - Observe `status` and `weight` for canary promotion +In K9s, you can type a few commands to view and observe the objects. 
- > - Flagger detects the deployment version change and starts a new rollout with 20% traffic progression - > - Once canary `status` is updated to `Succeeded`, 100% of the traffic should be routed to new version +- Type `:canaries ` to view canary object. +- Observe `status` and `weight` for canary promotion. - - Press `enter` again and scroll to bottom to see events - - Press `escape` to go back - - Exit K9s: `:q ` - - Exit from cluster: `exit ` +> **NOTE**: Flagger detects the deployment version change and starts a new rollout with 20% traffic +> progression. Once a canary `status` is updated to `Succeeded`, 100% of the traffic should be routed +> to the new version. -## Monitoring Canary deployments using Grafana +To go back: -Flagger comes with a Grafana dashboard made for canary analysis. Install Grafana +- Press `enter` again and scroll to bottom to see events. +- Press `escape` to go back. +- Exit K9s: `:q `. +- Exit from cluster: `exit `. - ```bash +## Monitoring Canary Deployments Using Grafana - # cd to canary directory - cd $PIB_BASE/labs/advanced-labs/canary +Flagger comes with a Grafana dashboard made for canary analysis. Install Grafana TODO: Is this handled +in one of the apps? - # copy flagger to apps directory - cp -R ./flagger-grafana ../../../apps +```bash +# cd to canary directory +cd $PIB_BASE/labs/advanced-labs/canary - # add and commit the flagger-grafana app - cd $PIB_BASE - git add . - git commit -am "added flagger-grafana app" - git push +# copy flagger to apps directory +cp -R ./flagger-grafana ../../../apps - cd apps/flagger-grafana +# add and commit the flagger-grafana app +cd $PIB_BASE +git add . 
+git commit -am "added flagger-grafana app" +git push - # check deploy targets (should be []) - flt targets list +cd apps/flagger-grafana - # clear the targets if not [] - flt targets clear +# check deploy targets (should be []) +flt targets list - # add all clusters as a target - flt targets add all +# if not [], clear the targets +# flt targets clear - # deploy the changes - flt targets deploy +# add all clusters as a target +flt targets add all - ``` +# deploy the changes +flt targets deploy +``` -Once the [github action](https://github.com/kubernetes101/pib-dev/actions) is completed and flux sync is performed, navigate to grafana dashboard by appending `/grafana` to the host url in the browser tab. +Once the [github action](https://github.com/kubernetes101/pib-dev/actions) is completed and flux sync +is performed, navigate to the Grafana dashboard by appending `/grafana` to the host URL in the browser tab. -- Grafana login info - - admin - - change-me +The default Grafana login info is a user name of `admin` with a password of `change-me`. -![Canary Dashboard](../../images/envoyCanaryDashboard.png) +![An image of an example Canary Dashboard on Grafana. There are a few graphs indicating traffic metrics. +On the left hand side of the image there are the metrics for the "Primary" Deployment, where previous +activity is visible. On the right hand side of the image, there are the graphs and metrics from the +canary deployment, which should show no current activity.](../../images/envoyCanaryDashboard.png) diff --git a/labs/advanced-labs/monitoring/README.md b/labs/advanced-labs/monitoring/README.md index 1a8c338e..960a92c7 100644 --- a/labs/advanced-labs/monitoring/README.md +++ b/labs/advanced-labs/monitoring/README.md @@ -2,7 +2,8 @@ ## Introduction -- To monitor a multi-cluster fleet, we deploy a central monitoring cluster with Fluent Bit and Prometheus configured to send logs and metrics to Grafana Cloud.
+To monitor a multi-cluster fleet, we deploy a central monitoring cluster with Fluent Bit and Prometheus configured to send logs and metrics to Grafana Cloud. + - The monitoring cluster runs WebValidate (WebV) to send requests to apps running on the other clusters in the fleet. - The current design has one deployment of WebV for each app. - The webv-heartbeat deployment sends requests to all of the heartbeat apps running on the fleet clusters. - Fluent Bit is configured to forward WebV logs to Grafana Loki diff --git a/labs/advanced-labs/monitoring/fluent-bit/README.md b/labs/advanced-labs/monitoring/fluent-bit/README.md index bf0298bf..352c0dec 100644 --- a/labs/advanced-labs/monitoring/fluent-bit/README.md +++ b/labs/advanced-labs/monitoring/fluent-bit/README.md @@ -4,81 +4,78 @@ ### Create Fluent Bit Secret -- The Fluent Bit deployment expects to retrieve the value for the Grafana Cloud API Key from a kubernetes secret. -- To acheive this, we store the value as a secret in Key Vault. -- Each member of the fleet retrieves the value from Key Vault during setup and creates the needed secret on the cluster. - -- Go to and log in -- Click on `My Account` - - You will get redirected to this URL -- In the left nav bar, click on `API Keys` (under Security) -- Click on `+ Add API Key` - - Name your API Key (i.e. yourName-publisher) - - Select `MetricsPublisher` as the role - - Click on `Create API Key` - - Click on `Copy to Clipboard` and save the value wherever you save your PATs - - WARNING - you will not be able to get back to this value +The Fluent Bit deployment expects to retrieve the value for the Grafana Cloud API Key from a kubernetes +(k8s) secret. In order to achieve this, we store the value as a secret in Key Vault. Each member of +the fleet retrieves the value from Key Vault during setup and creates the needed secret on the cluster. -```bash +To start you need to go to [grafana.com](https://grafana.com) and log in TODO: do you need to sign up first? 
+Once you log in, go to the settings in `My Account`. You will get redirected to a URL with a similar +format to `https://grafana.com/orgs/yourUserName`. In the left nav bar, click on `API Keys` (under +Security), and click `+ Add API Key`. -GC_PAT="" +From here you'll need to name your API key (e.g. something like `yourName-publisher`). Choose +`MetricsPublisher` as the role and then click `Create API Key`. + +Copy the generated Personal Access Token (PAT) and save the value somewhere you can retrieve it. Warning! +You will not be able to get back to this value later from Grafana! Below we'll save it as an environment +variable. +```bash +GC_PAT="" ``` -Save as Key Vault secret +Then save it to Key Vault using the Azure CLI and the newly created `GC_PAT` environment variable. ```bash - az keyvault secret set --vault-name $PIB_KEYVAULT --name fluent-bit-secret --value ${GC_PAT} - ``` ### Update Fluent Bit Config -- Before running Fluent Bit on your monitoring cluster, you need to update the values in `/apps/fluent-bit/app.yaml` to match your fleet and Grafana Cloud instance. -- The following values need to be set: - - jobSuffix - - lokiUrl - - lokiUser +Before running Fluent Bit on your monitoring cluster, you need to update the values in `/apps/fluent-bit/app.yaml` +to match your fleet and Grafana Cloud instance. The following values need to be set: `jobSuffix`, `lokiUrl`, +`lokiUser`. #### jobSuffix -- This value is the name of your fleet and will be used to uniquely identify the logs from this instance in Loki queries. -- For example, if your fleet name is atx-fleet, jobSuffix should be "atx". +This value is the name of your fleet and will be used to uniquely identify the logs from this instance +in Loki queries. For example, if your fleet name is `atx-fleet`, jobSuffix should be `atx`. #### lokiUser and lokiHost -- These values are located in the Grafana Cloud Portal.
- - Go to the `Grafana Cloud Portal`: - - Click `Details` in the `Loki` section - - Under Grafana Data Source Settings: - - Set lokiHost to the `URL` value (remove the leading "https://" protocol) - - Set lokiUser to the `User` value +These values are located in the Grafana Cloud Portal, at a URL similar to `https://grafana.com/orgs/yourUserName`. +Click `Details` in the `Loki` section, then under the Grafana Data Source Settings, set the values. +Set `lokiHost` to the `URL` value (remove the leading "https://" protocol), and `lokiUser` to the `User` value ## Fluent Bit Configuration -- The configuration yaml file: [fluent-bit.yaml](./.gitops/dev/fluent-bit.yaml) -- For more information on Fluent Bit, see their [documentation](https://docs.fluentbit.io/manual/concepts/data-pipeline). +You can see the configuration yaml file: [fluent-bit.yaml](./.gitops/dev/fluent-bit.yaml). For more +information on Fluent Bit, see their [documentation](https://docs.fluentbit.io/manual/concepts/data-pipeline). ### Inputs -- By default, the inputs are logs from containers named webv*. -- To use logs from other apps, you will need to create a new input block and update the Path parameter to match the new app container name. +By default, the inputs are logs from containers named `webv*`. To use logs from other apps, you will +need to create a new input block and update the Path parameter to match the new app container name. ### Parsers -- This configuration uses built-in parsers (cri, docker) to parse the container logs for forwarding. +This configuration uses built-in parsers (cri, docker) to parse the container logs for forwarding. ### Filters -- This configuration uses a few filters to enrich and control the logs. - - The kubernetes filter is used to add kubernetes metatdata to the logs. - - The nest filters apply the lift operation to the logs to lift nested labels up to simplify querying. - - The type_converter and grep filters are used to ensure only error logs are forwarded. 
- - This is not required, but a safety net to not unintentionally forward all logs if a verbose flag is accidentally set on a deployment. +This configuration uses a few filters to enrich and control the logs. The kubernetes filter is used +to add kubernetes metatdata to the logs. The nest filters apply the lift operation to the logs to lift +nested labels up to simplify querying. The `type_converter` and `grep` filters are used to ensure only +error logs are forwarded. + +This is not required, but a safety net to not unintentionally forward all logs if a verbose flag is +accidentally set on a deployment. ### Outputs -- With the default configuration provided here, the Fluent Bit instance will forward the processed logs from webv to Grafana Loki. -- To forward logs from other apps, you will need to create a new output block and update the Match, Labels, label-keys, and remove-keys to reflect the naming and log structure of the new app. -- Leverage the Fluent Bit output plugin [documentation](https://docs.fluentbit.io/manual/pipeline/outputs) to explore different output options. +With the default configuration provided here, the Fluent Bit instance will forward the processed logs +from `webv` to Grafana Loki. To forward logs from other apps, you will need to create a new output +block and update the Match, Labels, label-keys, and remove-keys to reflect the naming and log structure +of the new app. + +Leverage the Fluent Bit output plugin [documentation](https://docs.fluentbit.io/manual/pipeline/outputs) to explore different output options. diff --git a/labs/advanced-labs/monitoring/prometheus/README.md b/labs/advanced-labs/monitoring/prometheus/README.md index afe5b4e9..b7841a18 100644 --- a/labs/advanced-labs/monitoring/prometheus/README.md +++ b/labs/advanced-labs/monitoring/prometheus/README.md @@ -2,7 +2,8 @@ ## Grafana Cloud Configuration -- The Prometheus deployment expects to retrieve the value for the Grafana Cloud API Key from a kubernetes secret. 
+The Prometheus deployment expects to retrieve the value for the Grafana Cloud API Key from a kubernetes secret. + - To acheive this, we store the value as a secret in Key Vault. - Each member of the fleet retrieves the value from Key Vault during setup and creates the needed secret on the cluster. diff --git a/labs/advanced-labs/voe/README.md b/labs/advanced-labs/voe/README.md index 8e247871..38e7665d 100644 --- a/labs/advanced-labs/voe/README.md +++ b/labs/advanced-labs/voe/README.md @@ -2,7 +2,8 @@ ## Introduction -- VoE is an open-source tool that builds vision-based intelligent edge solutions using Machine Learning. Visit this [repo](https://github.com/Azure-Samples/azure-intelligent-edge-patterns/tree/master/factory-ai-vision) for more information. +VoE is an open-source tool that builds vision-based intelligent edge solutions using Machine Learning. Visit this [repo](https://github.com/Azure-Samples/azure-intelligent-edge-patterns/tree/master/factory-ai-vision) for more information. + - Deploying VoE is very similar process to deploying other applications in the outer-loop labs, but there are a few dependencies that must be configured first. > Note: The VoE app is being deprecated and will be replaced by a new version. It is used here for demonstrative purposes to show how to deploy a more complex app with PiB. diff --git a/labs/inner-loop.md b/labs/inner-loop.md index 6c332d3d..0277713f 100644 --- a/labs/inner-loop.md +++ b/labs/inner-loop.md @@ -156,7 +156,8 @@ kic check grafana ## Test MyApp A core part of the DevX is automated integration and load testing. We use a customized version of -[WebValidate](https://github.com/microsoft/webvalidate). The custom test files and Dockerfile are in the `apps/myapp/webv` directory. +[WebValidate](https://github.com/microsoft/webvalidate). The custom test files and Dockerfile are in +the `apps/myapp/webv` directory. `kic build webv` builds a custom image: `k3d-registry.localhost:5500/webv-myapp:local`. 
@@ -165,7 +166,7 @@ A core part of the DevX is automated integration and load testing. We use a cust The integration test checks valid and invalid URLs, so the 400 and 404 errors are part of the test design. By default, results < 400 are not logged. -**Note**: Add `--verbose` to the integration test to see 2xx and 3xx results. +> **NOTE**: Add `--verbose` to the integration test to see 2xx and 3xx results. ```bash # run an integration test From e8d8a7cc3db63f23363efdddd911df7df0fd778f Mon Sep 17 00:00:00 2001 From: Hannah Kennedy Date: Tue, 25 Oct 2022 18:33:35 -0400 Subject: [PATCH 3/5] Advanced labs updates. --- labs/advanced-labs/README.md | 8 + labs/advanced-labs/monitoring/README.md | 179 +++++++++--------- .../monitoring/fluent-bit/README.md | 6 +- .../monitoring/prometheus/README.md | 84 ++++---- labs/advanced-labs/voe/README.md | 134 ++++++------- 5 files changed, 202 insertions(+), 209 deletions(-) create mode 100644 labs/advanced-labs/README.md diff --git a/labs/advanced-labs/README.md b/labs/advanced-labs/README.md new file mode 100644 index 00000000..0a3d92c9 --- /dev/null +++ b/labs/advanced-labs/README.md @@ -0,0 +1,8 @@ +# Advanced Lab Table of Contents + +- [AKS-IoT](./aks-iot/README.md) +- [Canary Releases](./canary/README.md) +- [Monitoring](./monitoring/README.md) + - [Fluent Bit](./monitoring/fluent-bit/README.md) + - [Prometheus](./monitoring/prometheus/README.md) +- [More Complex App Example: Vision on Edge (VoE)](./voe/README.md) diff --git a/labs/advanced-labs/monitoring/README.md b/labs/advanced-labs/monitoring/README.md index 960a92c7..268a4adc 100644 --- a/labs/advanced-labs/monitoring/README.md +++ b/labs/advanced-labs/monitoring/README.md @@ -2,45 +2,58 @@ ## Introduction -To monitor a multi-cluster fleet, we deploy a central monitoring cluster with Fluent Bit and Prometheus configured to send logs and metrics to Grafana Cloud. 
+To monitor a multi-cluster fleet, we deploy a central monitoring cluster with Fluent Bit and Prometheus +configured to send logs and metrics to Grafana Cloud. -- The monitoring cluster runs WebValidate (WebV) to send requests to apps running on the other clusters in the fleet. - The current design has one deployment of WebV for each app. - - The webv-heartbeat deployment sends requests to all of the heartbeat apps running on the fleet clusters. -- Fluent Bit is configured to forward WebV logs to Grafana Loki -- Prometheus is configured to scrape WebV metrics. -- These logs and metrics are used to power a Grafana Cloud dashboard and provide insight into cluster and app availability and latency. +The monitoring cluster runs WebValidate (WebV) to send requests to apps running on the other clusters +in the fleet. The current design has one deployment of WebV for each app. The `webv-heartbeat` deployment +sends requests to all of the heartbeat apps running on the fleet clusters. + +Fluent Bit is configured to forward WebV logs to Grafana Loki. Prometheus is configured to scrape WebV +metrics. These logs and metrics are used to power a Grafana Cloud dashboard and provide insight into +cluster and app availability and latency. ## Lab Prerequisites -- Complete the 4 [outer-loop labs](/README.md#outer-loop-labs) before this one. +Complete the 4 [outer-loop labs](/README.md#outer-loop-labs) before this one **BUT** do not follow +the delete step. These subsequent sections rely on the environment variables to be set in this +lab. ## Fleet Configuration Prerequisites +You should have set up: + - Grafana Cloud Account - - You can set up a free trial Grafana Cloud Account [here](https://grafana.com/). 
+ - a free trial Grafana Cloud Account is available [here](https://grafana.com/) - Azure subscription - Managed Identity (MI) for the fleet -- Key Vault +- Key Vault (KV) - Grant the MI access to the Key Vault ## Key Vault Secrets -- A PAT is required to forward logs and metrics to Grafana Cloud. -- The PAT is stored as a K8s secret on the fleet clusters. -- Before creating the secrets, a Key Vault and MI (with access to the Key Vault) must be configured. See [setup docs](/labs/azure-codespaces-setup.md) for instructions. +A personal access token (PAT) is required to forward logs and metrics to Grafana Cloud. The PAT is +stored as a K8s secret on the fleet clusters, and later as a KV secret. + +Before creating the secrets, a Key Vault and MI (with access to the KV) must be configured. +See the [setup docs](/labs/azure-codespaces-setup.md) for instructions. ### Fluent Bit Secret -Follow instructions [here](./fluent-bit/README.md#create-fluent-bit-secret) to create the required Fluent Bit secret in the Key Vault. +Follow instructions [here](./fluent-bit/README.md#create-fluent-bit-secret) to create the required +Fluent Bit secret in the KV. ### Prometheus Secret -Follow instructions [here](./prometheus/README.md#create-prometheus-secret) to create the required Prometheus secret in the Key Vault. +Follow instructions [here](./prometheus/README.md#create-prometheus-secret) to create the required +Prometheus secret in the KV. ### Execution -- The Key Vault secret values are retrieved (via MI) during fleet creation and stored as kubernetes secrets on each cluster in the fleet (in [azure.sh](/vm/setup/azure.sh#L36) and [pre-flux.sh](/vm/setup/pre-flux.sh#L29)). -- The logging (fluent-bit) and metrics (prometheus) namespaces are bootstrapped on each of the clusters, prior to secret creation. 
+The KV secret values are retrieved (via MI) during fleet creation and stored as kubernetes secrets on +each cluster in the fleet (in [azure.sh](/vm/setup/azure.sh#L36) and [pre-flux.sh](/vm/setup/pre-flux.sh#L29)). +The logging (`fluent-bit`) and metrics (`prometheus`) namespaces are bootstrapped on each of the clusters, +prior to secret creation. ## Validate working branch @@ -53,11 +66,10 @@ git branch --show-current ``` ## Deploy a Central Monitoring Cluster -> This assumes you have an existing [multi-cluster fleet](/labs/outer-loop-multi-cluster.md). -> If you do not have MI and Key Vault configured, see the setup [lab](/labs/azure-codespaces-setup.md). +> **NOTE**: This assumes you have an existing [multi-cluster fleet](/labs/outer-loop-multi-cluster.md). +> If you do not have MI and KV configured, see the setup [lab](/labs/azure-codespaces-setup.md). ```bash - # set to the name of your fleet # these commands assume the resource group for your fleet is named $FLT_NAME-fleet export FLT_NAME=yourfleetname @@ -67,18 +79,16 @@ export FLT_NAME=yourfleetname flt env flt create -g $FLT_NAME-fleet -c monitoring-$FLT_NAME - ``` ## WebV -### Create apps/webv Directory +### Create `apps/webv` Directory -- Add webv to the apps/ directory -- By default, this provides you with two deployments of webv: webv-heartbeat and webv-imdb. +Add webv to the `apps/` directory. By default, this provides you with two deployments of `webv`: +`webv-heartbeat` and `webv-imdb`. ```bash - # make sure you are in the monitoring directory cd $PIB_BASE/labs/advanced-labs/monitoring @@ -88,18 +98,19 @@ cp -aR ./webv ../../../apps git add ../../../ git commit -m "Adding webv to apps dir" git push - ``` ### Configure WebV -- Before deploying, you need to update the arguments for the webv-heartbeat and webv-imdb deployments to target the clusters in your fleet. -- Replace the server arguments (placeholders are ) with the fqdn for the clusters in your fleet. 
You can find these values in the cluster yaml metadata files in the clusters/ directory. - - If you are not using dns, use the cluster IP's - - +Before deploying, you need to update the arguments for the `webv-heartbeat` and `webv-imdb` deployments +to target the clusters in your fleet. To do so, replace the server arguments (placeholders are +`https://yourclustername.yourdomain.com`) with the fully qualified domain name for the clusters in +your fleet. You can find these values in the cluster `yaml` metadata files in the `clusters/` directory. -```yaml +If you are not using DNS, use the cluster's IP: `http://yourclusterIP`. +```yaml +... args: - --sleep - "5000" @@ -116,13 +127,12 @@ git push - {{gitops.cluster.region}} - --log-format - Json - +... ``` ### Deploy WebV to the Central Monitoring Cluster ```bash - # make sure you are in the apps/webv directory cd $PIB_BASE/apps/webv @@ -136,13 +146,11 @@ flt targets clear flt targets add region:monitoring flt targets deploy - ``` -### Verify WebV was Deployed +### Verify WebV Was Deployed ```bash - # should see webv added git pull @@ -154,17 +162,15 @@ flt check app webv # should see webv pods running flt exec kic pods -f monitoring - ``` ## Fluent Bit ### Create apps/fluent-bit Directory -- Add fluent-bit to the apps/ directory +Add `fluent-bit` to the `apps/` directory: ```bash - cd $PIB_BASE/labs/advanced-labs/monitoring cp -aR ./fluent-bit ../../../apps @@ -173,17 +179,16 @@ cp -aR ./fluent-bit ../../../apps git add ../../../ git commit -m "Adding fluent-bit to apps dir" git push - ``` ### Configure Fluent Bit -Follow the instructions [here](./fluent-bit/README.md#update-fluent-bit-config) to configure the Fluent Bit deployment. +Follow the instructions [here](./fluent-bit/README.md#update-fluent-bit-config) to configure the Fluent +Bit deployment.
### Deploy Fluent Bit to the Central Monitoring Cluster ```bash - # make sure you are in the fluent-bit directory cd $PIB_BASE/apps/fluent-bit @@ -197,13 +202,11 @@ flt targets clear flt targets add region:monitoring flt targets deploy - ``` ### Verify Fluent Bit was Deployed ```bash - # should see fluent-bit added git pull @@ -215,17 +218,15 @@ flt check app fluent-bit # should see fluent-bit pod running flt exec kic pods -f monitoring - ``` ## Prometheus ### Create apps/prometheus directory -- Add prometheus to the apps/ directory +- Add `prometheus` to the `apps/` directory: ```bash - cd $PIB_BASE/labs/advanced-labs/monitoring cp -aR ./prometheus ../../../apps @@ -234,17 +235,16 @@ cp -aR ./prometheus ../../../apps git add ../../../ git commit -m "Adding prometheus to apps dir" git push - ``` ### Configure Prometheus -Follow the instructions [here](./prometheus/README.md#update-prometheus-config) to configure the Prometheus deployment. +Follow the instructions [here](./prometheus/README.md#update-prometheus-config) to configure the Prometheus +deployment. 
-### Update targets and deploy Prometheus to the central monitoring cluster +### Update Targets and Deploy Prometheus to the Central Monitoring Cluster ```bash - # make sure you are in the prometheus directory cd $PIB_BASE/apps/prometheus @@ -258,13 +258,11 @@ flt targets clear flt targets add region:monitoring flt targets deploy - ``` -### Verify Prometheus was Deployed +### Verify Prometheus Was Deployed ```bash - # should see prometheus added git pull @@ -276,13 +274,11 @@ flt check app prometheus # should see prometheus pod running flt exec kic pods -f monitoring - ``` ## Create Grafana Cloud Dashboard ```bash - # set to dns or no-dns depending on your fleet configuration export DASHBOARD_TYPE=dns @@ -301,54 +297,55 @@ cp dashboard-template-$DASHBOARD_TYPE.json dashboard.json sed -i "s/%%FLEET_NAME%%/${FLT_NAME}/g" dashboard.json sed -i "s/%%DOMAIN_NAME%%/${PIB_SSL}/g" dashboard.json sed -i "s/%%GRAFANA_NAME%%/${GRAFANA_NAME}/g" dashboard.json - ``` -- Copy the content in dashboard.json and import as a new dashboard in Grafana Cloud. +Copy the content in `dashboard.json` and import as a new dashboard in Grafana Cloud. ## Create Grafana Alert -- Go to Grafana Cloud > Alerting > Alert Rules -- Create a new alert (+ New Alert Rule) - - Rule type: Grafana managed alert -- Query A: - - Select grafanacloud.yourgrafananame.prom as the source from the drop down list - - Replace [your $FLT_NAME] with your fleet name and copy the query below to the query field +Go to Grafana Cloud, to `Alerting`, then `Alert Rules`. Create a new Grafana managed alert under +`+ New Alert Rule`. - ```sql +For Query A, Select `grafanacloud.yourgrafananame.prom` as the source from the drop down list. Then replace +`[your $FLT_NAME]` with your fleet name and copy the query below to the query field. 
+ ```sql sum(rate(WebVDuration_count{status!="OK",server!="",origin_prometheus="monitoring-[your $FLT_NAME]"}[10s])) by (server,job) / sum(rate(WebVDuration_count{server!="",origin_prometheus="monitoring-[your $FLT_NAME]"}[10s])) by (server,job) * 100 - ``` -- Query B: - - Set Operation to Reduce - - Set Function to Last - - Set Input to A - - Leave Mode as Strict -- Add another Expression (+ Add expression) - - Name the Expression "More than 5% errors" - - Set Operation to Math - - Type in the Expression: $B > 5 -- Set alert condition to "More than 5% errors" -- Set alert evaluation behavior - - Set evaluate every to 30s - - Set for to 1m -- Add details for your alert - - Replace [your $FLT_NAME] with your fleet name - - Rule name: [your $FLT_NAME] App Issue - - Folder: Pick any folder - - Group: [your $FLT_NAME] - App Issue -- Click 'Save and exit' +For Query B, TODO: is this also a managed alert? +Set the following: + +- `Operation` to `Reduce` +- `Function` to `Last` +- `Input` to `A` (from above/before) TODO: +- Leave `Mode` as `Strict` + +Then add another Expression (`+ Add expression`) and name it "More than 5% errors". The settings you'll +need are: + +- `Operation` to `Math` +- Type in the Expression: `$B > 5` +- Alert condition to "More than 5% errors" +- Alert evaluation behavior to evaluate every 30 seconds +- Set for to 1m TODO: This is a conflict to the step above + +You can also add these details to your alert: + +- Replace [your $FLT_NAME] with your fleet name + - Rule name: [your $FLT_NAME] App Issue + - Folder: Pick any folder + - Group: [your $FLT_NAME] - App Issue + +Save and exit! ## "Break" an Application Deployment -- To watch the dashboard and alerts "in action", we will temporarily take down an instance of imdb +To watch the dashboard and alerts "in action", we will temporarily take down an instance of `imdb`! 
-### Reduce imdb Targets +### Reduce `imdb` Targets ```bash - cd $PIB_BASE/apps/imdb # show current targets @@ -362,21 +359,17 @@ flt targets clear flt target add region:central flt targets deploy - ``` ### Update Cluster ```bash - # should see imdb removed from some clusters git pull # force flux to reconcile flt sync - ``` -- Navigate to your dashboard in Grafana to see some clusters "turn red" - - You may need to hit refresh a few times - - The alert will take ~1 minute to show up +Navigate to your dashboard in Grafana to see some clusters "turn red" - You may need to hit refresh +a few times. The alert will take ~1 minute to show up. diff --git a/labs/advanced-labs/monitoring/fluent-bit/README.md b/labs/advanced-labs/monitoring/fluent-bit/README.md index 352c0dec..79e58de6 100644 --- a/labs/advanced-labs/monitoring/fluent-bit/README.md +++ b/labs/advanced-labs/monitoring/fluent-bit/README.md @@ -45,7 +45,8 @@ in Loki queries. For example, if your fleet name is `atx-fleet`, jobSuffix shoul These values are located in the Grafana Cloud Portal, at a URL similar to `https://grafana.com/orgs/yourUserName`. Click `Details` in the `Loki` section, then under the Grafana Data Source Settings, set the values. -Set `lokiHost` to the `URL` value (remove the leading "https://" protocol), and `lokiUser` to the `User` value +Set `lokiHost` to the `URL` value (remove the leading "https://" protocol), and `lokiUser` to the `User` +value. ## Fluent Bit Configuration @@ -78,4 +79,5 @@ from `webv` to Grafana Loki. To forward logs from other apps, you will need to c block and update the Match, Labels, label-keys, and remove-keys to reflect the naming and log structure of the new app. -Leverage the Fluent Bit output plugin [documentation](https://docs.fluentbit.io/manual/pipeline/outputs) to explore different output options. +Leverage the Fluent Bit output plugin [documentation](https://docs.fluentbit.io/manual/pipeline/outputs) +to explore different output options. 
diff --git a/labs/advanced-labs/monitoring/prometheus/README.md b/labs/advanced-labs/monitoring/prometheus/README.md index b7841a18..6e4c8ed9 100644 --- a/labs/advanced-labs/monitoring/prometheus/README.md +++ b/labs/advanced-labs/monitoring/prometheus/README.md @@ -2,78 +2,80 @@ ## Grafana Cloud Configuration -The Prometheus deployment expects to retrieve the value for the Grafana Cloud API Key from a kubernetes secret. +The Prometheus deployment expects to retrieve the value for the Grafana Cloud API Key from a kubernetes +(k8s) secret. -- To acheive this, we store the value as a secret in Key Vault. -- Each member of the fleet retrieves the value from Key Vault during setup and creates the needed secret on the cluster. +To acheive this, we store the value as a secret in Key Vault. Each member of the fleet retrieves the +value from Key Vault during setup and creates the needed secret on the cluster. ### Create Prometheus Secret -- Go to and log in -- Click on `My Account` - - You will get redirected to this URL -- Click on `Details` in the `Prometheus` section -- Click on `Generate now` under the `Password / API Key` section to generate a new password - - Name your API Key (i.e. yourName-publisher) - - Select `MetricsPublisher` as the role - - Click on `Create API Key` - - Click on `Copy to Clipboard` and save the value wherever you save your PATs - - WARNING - you will not be able to get back to this value -- In the Section `Prometheus remote_write Configuration` - - Copy the value of the password in the config +First go to `My Account` after signing into [Grafana's site](https://grafana.com); you'll be redirected +to a URL similar to `https://grafana.com/orgs/yourUserName`. In the `Prometheus` Section, go to `Details`, +and generate a new password in the `Password / API Key` subsection. -```bash +> **NOTE**: You'll need to set the following for your API key: +> +> - Name your API Key (i.e. 
yourName-publisher) +> - Role is `MetricsPublisher` -GC_PROM_PASSWORD="" +Once it's created, copy it and save the value wherever you save your PATs. **REMEMBER**, you will not +be able to get back to this value once you navigate away. In the Section `Prometheus remote_write +Configuration`, copy the value of the password in the config. +```bash +GC_PROM_PASSWORD="" ``` -Save as Key Vault secret +You can also save the secret as Key Vault secret: ```bash - az keyvault secret set --vault-name $PIB_KEYVAULT --name prometheus-secret --value $GC_PROM_PASSWORD - ``` ### Update Prometheus Config -- Before deploying Prometheus to your monitoring cluster, you need to update the values in `/apps/prometheus/app.yaml` to match your Grafana Cloud instance. -- The following values need to be set: - - prometheusURL - - prometheusUser +Before deploying Prometheus to your monitoring cluster, you need to update the values in `/apps/prometheus/app.yaml` +to match your Grafana Cloud instance. The following values need to be set: -#### prometheusURL and prometheusUser +- `prometheusURL` +- `prometheusUser` -These values are located in the Grafana Cloud Portal. +#### `prometheusURL` and `prometheusUser` in `app.yaml` -- Go to the `Grafana Cloud Portal`: -- Click `Details` in the `Prometheus` section -- Under Grafana Data Source Settings: - - Set prometheusURL to the `Remote Write Endpoint` value - - Set prometheusUser to the `Username / Instance ID` value +These values are located in the Grafana Cloud Portal, at the URL that should resemble +`https://grafana.com/orgs/yourUserName`. Again, go to the `Details` subsection of the `Prometheus` +section. Using the information under Grafana Data Source Settings, set `prometheusURL` to the +`Remote Write Endpoint` value and set `prometheusUser` to the `Username / Instance ID` value. +Both of these values are found in the `app.yaml` file in this same directory. 
## Prometheus Configuration -- The "origin_prometheus" value in the [Prometheus configuration](./.gitops/dev/prometheus.yaml) is important as it serves as a way to uniquely identify the source of the metrics when querying in Grafana Cloud. -- By default, this will be set to the name of the store that Prometheus is deployed to. +> **NOTE**: The code snippets here are indented to the level they appear in their respective yaml +definitions. -```yaml +The `origin_prometheus` value in the [Prometheus configuration](./.gitops/dev/prometheus.yaml) is +important as it serves as a way to uniquely identify the source of the metrics when querying in Grafana +Cloud. By default, this will be set to the name of the store that Prometheus is deployed to. +```yaml +... global: scrape_interval: 5s evaluation_interval: 5s external_labels: origin_prometheus: {{gitops.cluster.name}} - +... ``` -- The scrape configs specify what targets to scrape metrics from and which metrics to keep or drop. -- There is a scrape job named "webv-heartbeat" that scrapes metrics from the webv-heartbeat app running on the cluster. -- The metrics WebVDuration_count, WebVSummary, and WebVSummary_count are the only metrics configured to be kept by default since they are the only ones being used by our dashboard queries. +The scrape configs specify what targets to scrape metrics from and which metrics to keep or drop. There +is a scrape job named `webv-heartbeat` that scrapes metrics from the `webv-heartbeat` app running on +the cluster. The metrics `WebVDuration_count`, `WebVSummary`, and `WebVSummary_count` are the only +metrics configured to be kept by default since they are the only ones being used by our dashboard +queries. ```yaml - +... scrape_configs: - job_name: 'webv-heartbeat' static_configs: @@ -82,7 +84,7 @@ These values are located in the Grafana Cloud Portal. - source_labels: [ __name__ ] regex: "WebVDuration_count|WebVSummary|WebVSummary_count" action: keep - +... 
``` -- For more information on Prometheus configuration, see their [documentation](https://prometheus.io/docs/prometheus/latest/configuration/configuration/). +For more information on Prometheus configuration, see their [documentation](https://prometheus.io/docs/prometheus/latest/configuration/configuration/). diff --git a/labs/advanced-labs/voe/README.md b/labs/advanced-labs/voe/README.md index 38e7665d..a0193c1c 100644 --- a/labs/advanced-labs/voe/README.md +++ b/labs/advanced-labs/voe/README.md @@ -2,81 +2,81 @@ ## Introduction -VoE is an open-source tool that builds vision-based intelligent edge solutions using Machine Learning. Visit this [repo](https://github.com/Azure-Samples/azure-intelligent-edge-patterns/tree/master/factory-ai-vision) for more information. +Vision On Edge (VoE) is an open-source tool that builds vision-based intelligent edge solutions using +Machine Learning. Visit this [repo](https://github.com/Azure-Samples/azure-intelligent-edge-patterns/tree/master/factory-ai-vision) +for more information. -- Deploying VoE is very similar process to deploying other applications in the outer-loop labs, but there are a few dependencies that must be configured first. +Deploying VoE is a very similar process to deploying other applications in the outer-loop labs, but there +are a few dependencies that must be configured first. -> Note: The VoE app is being deprecated and will be replaced by a new version. It is used here for demonstrative purposes to show how to deploy a more complex app with PiB. +> **NOTE**: ⚠️ The VoE app is being deprecated and will be replaced by a new version. It is used here +> for demonstrative purposes to show how to deploy a more complex app with PiB. ## Lab Prerequisites -- Complete the 4 [outer-loop labs](/README.md#outer-loop-labs) before this one. +Complete the 4 [outer-loop labs](/README.md#outer-loop-labs) before this one.
## Fleet Configuration Prerequisites -- There are a few dependencies and prerequisites that must be configured before deploying the VoE app. -- Azure Resources - - Azure IoT Hub - - Azure Cognitive Services -- Fleet VM Configuration - - At least 8 cores - - Kubernetes secrets +There are a few dependencies and prerequisites that must be configured before deploying the VoE app: + +We'll walk through creating the necessary Azure resources and Fleet VM configurations via CLI below. +We'll need Azure IoT Hub and Azure Cognitive Services. On your Fleet VM configuration, you should +have at least 8 cores available and familiarity with Kubernetes secrets. ### Create Azure Resources #### Define resource name variables ```bash - export VOE_HUB_NAME=voe-hub-$MY_BRANCH export VOE_RG=voe-rg-$MY_BRANCH export VOE_AZ_COG_SVC_NAME=voe-acs-$MY_BRANCH - ``` #### Login to Azure CLI ```bash - az login --use-device-code # option if your Codespace is configured with SP credentials flt az login - ``` #### Create Azure IoT Hub ```bash - # add azure-iot extension az extension add -n azure-iot az iot hub create --resource-group $VOE_RG --name $VOE_HUB_NAME - ``` #### Create Azure Cognitive Services ```bash - -# you may have to create a cognitive services multi-service account in the azure portal to fulfill the requirement to agree to the responsible AI terms for the resource -az cognitiveservices account create --kind CognitiveServices --name $VOE_AZ_COG_SVC_NAME --resource-group $VOE_RG --sku S0 --location yourlocation - +# you may have to create a cognitive services multi-service account in the azure portal to fulfill +# the requirement to agree to the responsible AI terms for the resource +# update "yourlocation" to be an available Azure region +az cognitiveservices account create --kind CognitiveServices \ + --name $VOE_AZ_COG_SVC_NAME \ + --resource-group $VOE_RG \ + --sku S0 \ + --location yourlocation ``` -### Update fleet creation script +### Update Fleet Creation Script -- Add the 
following lines to vm/setup/pre-flux.sh and replace the values in [] with the names of the resources created above. - - This will run on the fleet vm/s during setup and will create the `voe` namespace and required K8s secret. - - The VM uses Managed Identity to retrieve the connection string values from the IoT Hub. +Add the following lines to `vm/setup/pre-flux.sh` and replace the values in [] with the names of the +resources you created above. This will run on the fleet VMs during setup and will create the `voe` +namespace and required K8s secret. The VM uses Managed Identity to retrieve the connection string +values from the IoT Hub. -- Do NOT run this fence! +> **NOTE**: ⛔️ WARNING! ⛔️ Do NOT run this fence! Copy it only! ```bash - # add the iot extension az extension add -n azure-iot @@ -90,27 +90,27 @@ echo "TRAINING_KEY=$(az cognitiveservices account keys list -n [your-voe-acs-nam kubectl create ns voe -kubectl create secret generic azure-env --from-env-file "$HOME/.ssh/iot.env" --from-env-file="$HOME/.ssh/acs.env" -n voe - +kubectl create secret generic azure-env \ + --from-env-file "$HOME/.ssh/iot.env" \ + --from-env-file="$HOME/.ssh/acs.env" \ + -n voe ``` -- After updating the script, push your changes +After updating the script, push your changes. ```bash - git add . git commit -m "updated pre-flux.sh" git push - ``` -## Create a fleet +## Create a Fleet -- The VoE application requires at least 8 cores, this must be specified at fleet creation with the --cores flag. -- Before creating the fleet, a Managed Identity (MI) must be configured. See [setup docs](/labs/azure-codespaces-setup.md) for instructions. +The VoE application requires at least 8 cores, this must be specified at fleet creation with the `--cores` +flag. Before creating the fleet, a Managed Identity (MI) must be configured. See [setup docs](/labs/azure-codespaces-setup.md) +for instructions. 
```bash - # before creating the cluster, make sure PIB_MI is set # MI is required for the voe K8s secrets to be created properly flt env PIB_MI @@ -120,25 +120,21 @@ export MY_CLUSTER=central-tx-voe-$MY_BRANCH # create your cluster flt create -c $MY_CLUSTER --cores 8 - ``` -### Add iot devices (clusters in fleet) to IoT Hub +### Add IoT Devices (Clusters in Fleet) to IoT Hub -You must run this command for each cluster in the fleet +You must run this command for each cluster in the fleet: ```bash - az iot hub device-identity create --ee -n $VOE_HUB_NAME -d $MY_CLUSTER - ``` -## Create apps/voe directory +## Create `apps/voe` directory -- Add voe to the apps/ directory +Add `voe` to the `apps/` directory: ```bash - cd $PIB_BASE/labs/advanced-labs cp -aR voe ../../apps @@ -147,13 +143,11 @@ cp -aR voe ../../apps git add ../../ git commit -m "Adding voe to apps/" git push - ``` -## Deploy the VoE app to your fleet +## Deploy the VoE App to Your Fleet ```bash - # start in the apps/voe directory cd $PIB_BASE/apps/voe @@ -170,13 +164,11 @@ flt targets add all # deploy voe via ci-cd and GitOps Automation flt targets deploy - ``` ## Verify VoE deployment ```bash - # should see voe added git pull @@ -193,42 +185,39 @@ flt exec kic pods # curl healthz and readyz endpoints flt curl /healthz flt curl /readyz - ``` -## Navigate to VoE in the browser +## Navigate to VoE in the Browser -- Get the FQDN of your cluster -- Copy and paste the FQDN into your browser - - You should get the VoE home page +Get the fully qualified domain name of your cluster, and copy it into the browser. You should get the +VoE home page! 
- ```bash - - # display the FQDN - echo $MY_CLUSTER.$PIB_SSL - - # if dns/ssl is not configured - # use the cluster IP - cat $PIB_BASE/clusters/$MY_CLUSTER.yaml | grep domain +```bash +# display the FQDN +echo $MY_CLUSTER.$PIB_SSL - ``` +# if dns/ssl is not configured +# use the cluster IP +cat $PIB_BASE/clusters/$MY_CLUSTER.yaml | grep domain +``` ## VoE Application -- As mentioned, the VoE application is a complex application and is made up of 6 services (CVCapture, Inference, Predict, RTSPSim, Upload, and Web). - - See the [source repo](https://github.com/Azure-Samples/azure-intelligent-edge-patterns/tree/master/factory-ai-vision) for more details. -- Because of the complexity, there is additional ingress configuration required to ensure the different services can communicate as needed. - - See [ingress yaml](/labs/advanced-labs/voe/.gitops/dev/ingressHttp.yaml) -- The Inference service is dependent on 4 of the other services, we use initContainers to enforce the ordering. - - See [inference yaml](/labs/advanced-labs/voe/.gitops/dev/inference.yaml) -- The Web, Upload, and RTSPSim services require a persistent volume claim (pvc). +As mentioned, the VoE application is a complex application and is made up of 6 services (CVCapture, +Inference, Predict, RTSPSim, Upload, and Web). See the [source repo](https://github.com/Azure-Samples/azure-intelligent-edge-patterns/tree/master/factory-ai-vision) +for more details. Because of the complexity, there is additional ingress configuration required to ensure +the different services can communicate as needed. + +See [ingress yaml](/labs/advanced-labs/voe/.gitops/dev/ingressHttp.yaml) for more details. The Inference +service is dependent on 4 of the other services, we use `initContainers` to enforce the ordering (See +[inference yaml](/labs/advanced-labs/voe/.gitops/dev/inference.yaml)). The Web, Upload, and RTSPSim +services require a persistent volume claim (pvc). 
## Delete Your Cluster -- Once you're finished with the workshop and experimenting, delete your cluster +Once you're finished with the workshop and experimenting, delete your cluster. ```bash - # start in the root of your repo cd $PIB_BASE git pull @@ -236,5 +225,4 @@ flt delete $MY_CLUSTER rm ips git commit -am "deleted cluster" git push - ``` From 36f6898c19a85506edcb70caed3375f3c7e887bb Mon Sep 17 00:00:00 2001 From: Hannah Kennedy Date: Tue, 25 Oct 2022 18:34:22 -0400 Subject: [PATCH 4/5] Minor line changes. --- labs/azure-codespaces-setup.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/labs/azure-codespaces-setup.md b/labs/azure-codespaces-setup.md index f28887b6..405f8fb8 100644 --- a/labs/azure-codespaces-setup.md +++ b/labs/azure-codespaces-setup.md @@ -69,7 +69,6 @@ gh secret set PIB_DNS_RG --body $rg # list secrets gh secret list - ``` ### Create Managed identity @@ -107,7 +106,6 @@ gh secret set ID_RSA_PUB --body $(cat $HOME/.ssh/id_rsa.pub | base64 | tr -d '\n # list GitHub Secrets gh secret list - ``` ## Create Azure Key Vault From 2febe01469eed5354365d1de0aa970a1f238f61f Mon Sep 17 00:00:00 2001 From: Hannah Kennedy Date: Wed, 26 Oct 2022 18:11:21 -0400 Subject: [PATCH 5/5] More tweaks after rebasing. 
--- labs/advanced-labs/aks-iot/README.md | 11 ++- labs/advanced-labs/canary/README.md | 8 +-- labs/advanced-labs/monitoring/README.md | 71 +++++++------------ .../monitoring/fluent-bit/README.md | 11 +-- .../monitoring/prometheus/README.md | 2 +- labs/azure-codespaces-setup.md | 2 +- labs/inner-loop-flux.md | 17 ++--- labs/inner-loop.md | 2 +- labs/outer-loop-multi-cluster.md | 2 +- labs/outer-loop.md | 18 +++-- 10 files changed, 66 insertions(+), 78 deletions(-) diff --git a/labs/advanced-labs/aks-iot/README.md b/labs/advanced-labs/aks-iot/README.md index ded08bd8..04a64ca7 100644 --- a/labs/advanced-labs/aks-iot/README.md +++ b/labs/advanced-labs/aks-iot/README.md @@ -5,16 +5,19 @@ > **NOTE**: AKS-IoT is in preview so there's a chance these instructions will change over time! -Reach out to the soldevx team for access to AKS-IoT preview. TODO: How? +Reach out to the [soldevx](mailto:soldevx@microsoft.com) team for access to AKS-IoT preview. +TODO: Is this how they would want to be contacted? ## PLEASE NOTE: This document is a work in progress +The current instructions are run in PowerShell. + ## AKS-IoT Setup - Create a new Azure VM with Windows 10 - You can also use your local Windows 10 or Windows 11 computer - Run Windows Update -- Install Hyper-V - Note: This requires a reboot! +- Install Hyper-V - **NOTE**: This requires a reboot! - Install, if not already on your machine: - [git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) - [Github CLI](https://cli.github.com/manual/installation) @@ -27,7 +30,9 @@ Reach out to the soldevx team for access to AKS-IoT preview. TODO: How? - Install Helm - `choco install kubernetes-helm` -## Update bootstrap/aksiot-userconfig.json TODO: Does this exist? +## Update `bootstrap/aksiot-userconfig.json` + +TODO: Does this file exist yet? or is it created in a different process? 
From the Azure portal, add these values to the file: diff --git a/labs/advanced-labs/canary/README.md b/labs/advanced-labs/canary/README.md index 2e7e3ce9..eeb8b42e 100644 --- a/labs/advanced-labs/canary/README.md +++ b/labs/advanced-labs/canary/README.md @@ -57,7 +57,7 @@ flt targets deploy ### Check Github Action Status -Check that your GitHub Action is running, either queued or in-progress. Check the action here: TODO: Is this for all builds or will it be for the presumably forked repo? +Check that your GitHub Action is running, either queued or in-progress. Check the action here: TODO: Is this URL for all builds or will it be for a presumably forked repo? ### Check Deployment @@ -171,8 +171,8 @@ To go back: ## Monitoring Canary Deployments Using Grafana -Flagger comes with a Grafana dashboard made for canary analysis. Install Grafana TODO: Is this handled -in one of the apps? +Flagger comes with a Grafana dashboard made for canary analysis. Install Grafana following the instructions +below. ```bash # cd to canary directory @@ -193,7 +193,7 @@ cd apps/flagger-grafana flt targets list # if not [], clear the targets -#flt targets clear +# flt targets clear # add all clusters as a target flt targets add all diff --git a/labs/advanced-labs/monitoring/README.md b/labs/advanced-labs/monitoring/README.md index 268a4adc..3c012ba1 100644 --- a/labs/advanced-labs/monitoring/README.md +++ b/labs/advanced-labs/monitoring/README.md @@ -2,72 +2,57 @@ ## Introduction -To monitor a multi-cluster fleet, we deploy a central monitoring cluster with Fluent Bit and Prometheus -configured to send logs and metrics to Grafana Cloud. - -The monitoring cluster runs WebValidate (WebV) to send requests to apps running on the other clusters -in the fleet. The current design has one deployment of WebV for each app. The `webv-heartbeat` deployment -sends requests to all of the heartbeat apps running on the fleet clusters. - -Fluent Bit is configured to forward WebV logs to Grafana Loki. 
Prometheus is configured to scrape WebV -metrics. These logs and metrics are used to power a Grafana Cloud dashboard and provide insight into -cluster and app availability and latency. +- To monitor a multi-cluster fleet, we deploy a central monitoring cluster with Fluent Bit and Prometheus configured to send logs and metrics to Grafana Cloud. +- The monitoring cluster runs WebValidate (WebV) to send requests to apps running on the other clusters in the fleet. - The current design has one deployment of WebV for each app. + - The webv-heartbeat deployment sends requests to all of the heartbeat apps running on the fleet clusters. +- Fluent Bit is configured to forward WebV logs to Grafana Loki +- Prometheus is configured to scrape WebV metrics. +- These logs and metrics are used to power a Grafana Cloud dashboard and provide insight into cluster and app availability and latency. ## Lab Prerequisites -Complete the 4 [outer-loop labs](/README.md#outer-loop-labs) before this one **BUT** do not follow -the delete step. These subsequent sections rely on the environment variables to be set in this -lab. +- Complete the 4 [outer-loop labs](/README.md#outer-loop-labs) before this one. ## Fleet Configuration Prerequisites -You should have set up: - - Grafana Cloud Account - - a free trial Grafana Cloud Account is available [here](https://grafana.com/) + - You can set up a free trial Grafana Cloud Account [here](https://grafana.com/). - Azure subscription - Managed Identity (MI) for the fleet -- Key Vault (KV) +- Key Vault - Grant the MI access to the Key Vault ## Key Vault Secrets -A personal access token (PAT) is required to forward logs and metrics to Grafana Cloud. The PAT is -stored as a K8s secret on the fleet clusters, and later as a KV secret. - -Before creating the secrets, a Key Vault and MI (with access to the KV) must be configured. -See the [setup docs](/labs/azure-codespaces-setup.md) for instructions. 
+- A PAT is required to forward logs and metrics to Grafana Cloud. +- The PAT is stored as a K8s secret on the fleet clusters. +- Before creating the secrets, a Key Vault and MI (with access to the Key Vault) must be configured. See [setup docs](/labs/azure-codespaces-setup.md) for instructions. ### Fluent Bit Secret -Follow instructions [here](./fluent-bit/README.md#create-fluent-bit-secret) to create the required -Fluent Bit secret in the KV. +Follow instructions [here](./fluent-bit/README.md#create-fluent-bit-secret) to create the required Fluent Bit secret in the Key Vault. ### Prometheus Secret -Follow instructions [here](./prometheus/README.md#create-prometheus-secret) to create the required -Prometheus secret in the KV. +Follow instructions [here](./prometheus/README.md#create-prometheus-secret) to create the required Prometheus secret in the Key Vault. ### Execution -The KV secret values are retrieved (via MI) during fleet creation and stored as kubernetes secrets on -each cluster in the fleet (in [azure.sh](/vm/setup/azure.sh#L36) and [pre-flux.sh](/vm/setup/pre-flux.sh#L29)). -The logging (`fluent-bit`) and metrics (`prometheus`) namespaces are bootstrapped on each of the clusters, -prior to secret creation. +- The Key Vault secret values are retrieved (via MI) during fleet creation and stored as kubernetes secrets on each cluster in the fleet (in [azure.sh](/vm/setup/azure.sh#L36) and [pre-flux.sh](/vm/setup/pre-flux.sh#L29)). +- The logging (fluent-bit) and metrics (prometheus) namespaces are bootstrapped on each of the clusters, prior to secret creation. ## Validate working branch ```bash - # make sure your branch is set and pushed remotely # commands will fail if you are in main branch git branch --show-current - ``` + ## Deploy a Central Monitoring Cluster -> **NOTE**: This assumes you have an existing [multi-cluster fleet](/labs/outer-loop-multi-cluster.md). 
-> If you do not have MI and KV configured, see the setup [lab](/labs/azure-codespaces-setup.md). +> This assumes you have an existing [multi-cluster fleet](/labs/outer-loop-multi-cluster.md). +> If you do not have MI and Key Vault configured, see the setup [lab](/labs/azure-codespaces-setup.md). ```bash # set to the name of your fleet @@ -83,10 +68,10 @@ flt create -g $FLT_NAME-fleet -c monitoring-$FLT_NAME ## WebV -### Create `apps/webv` Directory +### Create apps/webv Directory -Add webv to the `apps/` directory. By default, this provides you with two deployments of `webv`: -`webv-heartbeat` and `webv-imdb`. +Now we can add `webv` to the `apps/` directory. By default, this provides you with two deployments +of `webv`: `webv-heartbeat` and `webv-imdb`. ```bash # make sure you are in the monitoring directory @@ -103,11 +88,10 @@ git push ### Configure WebV Before deploying, you need to update the arguments for the `webv-heartbeat` and `webv-imdb` deployments -to target the clusters in your fleet. To do so, replace the server arguments (placeholders are -`https://yourclustername.yourdomain.com`>`) with the fully qualified domanin name for the clusters in -your fleet. You can find these values in the cluster`yaml` metadata files in the clusters/ directory. +to target the clusters in your fleet. You'll need to replace the server placeholders like `https://yourclustername.yourdomain.com` with the fully qualified domain name for the clusters in your fleet. You can find these +values in the cluster `yaml` metadata files in the `clusters/` directory. -If you are not using dns, use the cluster's IP: `http://yourclusterIP`. +If you are not using DNS, use the cluster's IP like `http://yourclusterIP`. ```yaml ... @@ -127,7 +111,6 @@ If you are not using dns, use the cluster's IP: `http://yourclusterIP`. - {{gitops.cluster.region}} - --log-format - Json -...
``` ### Deploy WebV to the Central Monitoring Cluster @@ -148,7 +131,7 @@ flt targets add region:monitoring flt targets deploy ``` -### Verify WebV Was Deployed +### Verify WebV was Deployed ```bash # should see webv added @@ -224,7 +207,7 @@ flt exec kic pods -f monitoring ### Create apps/prometheus directory -- Add `prometheus` to the `apps/` directory: +Add `prometheus` to the `apps/` directory: ```bash cd $PIB_BASE/labs/advanced-labs/monitoring diff --git a/labs/advanced-labs/monitoring/fluent-bit/README.md b/labs/advanced-labs/monitoring/fluent-bit/README.md index 79e58de6..8dc15f06 100644 --- a/labs/advanced-labs/monitoring/fluent-bit/README.md +++ b/labs/advanced-labs/monitoring/fluent-bit/README.md @@ -8,7 +8,8 @@ The Fluent Bit deployment expects to retrieve the value for the Grafana Cloud AP (k8s) secret. In order to achieve this, we store the value as a secret in Key Vault. Each member of the fleet retrieves the value from Key Vault during setup and creates the needed secret on the cluster. -To start you need to go to [grafana.com](https://grafana.com) and log in TODO: do you need to sign up first? +To start you need to go to [grafana.com](https://grafana.com) and log in. Instructions on how to sign +up are found [here](/labs/advanced-labs/monitoring/README.md#fleet-configuration-prerequisites). Once you login, go to the settings in `My Account`. You will get redirected to a URL with a similar format to `https://grafana.com/orgs/yourUserName`. In the left nav bar, click on `API Keys` (under Security), and click `+ Add API Key`. @@ -24,7 +25,7 @@ variable. GC_PAT="" ``` -You'll then save to keyvault using the AzCLI and the newly created PAT environment variable. +You'll then save to Key Vault using the AzCLI and the newly created PAT environment variable. 
```bash az keyvault secret set --vault-name $PIB_KEYVAULT --name fluent-bit-secret --value ${GC_PAT} @@ -36,12 +37,12 @@ Before running Fluent Bit on your monitoring cluster, you need to update the val to match your fleet and Grafana Cloud instance. The following values need to be set: `jobSuffix`, `lokiUrl`, `lokiUser`. -#### jobSuffix +#### `jobSuffix` This value is the name of your fleet and will be used to uniquely identify the logs from this instance in Loki queries. For example, if your fleet name is `atx-fleet`, jobSuffix should be `atx`. -#### lokiUser and lokiHost +#### `lokiUser` and `lokiHost` These values are located in the Grafana Cloud Portal, at a URL similar to `https://grafana.com/orgs/yourUserName`. Click `Details` in the `Loki` section, then under the Grafana Data Source Settings, set the values. @@ -65,7 +66,7 @@ This configuration uses built-in parsers (cri, docker) to parse the container lo ### Filters This configuration uses a few filters to enrich and control the logs. The kubernetes filter is used -to add kubernetes metatdata to the logs. The nest filters apply the lift operation to the logs to lift +to add kubernetes metadata to the logs. The nest filters apply the lift operation to the logs to lift nested labels up to simplify querying. The `type_converter` and `grep` filters are used to ensure only error logs are forwarded. diff --git a/labs/advanced-labs/monitoring/prometheus/README.md b/labs/advanced-labs/monitoring/prometheus/README.md index 6e4c8ed9..e0319944 100644 --- a/labs/advanced-labs/monitoring/prometheus/README.md +++ b/labs/advanced-labs/monitoring/prometheus/README.md @@ -5,7 +5,7 @@ The Prometheus deployment expects to retrieve the value for the Grafana Cloud API Key from a kubernetes (k8s) secret. -To acheive this, we store the value as a secret in Key Vault. Each member of the fleet retrieves the +To achieve this, we store the value as a secret in Key Vault. 
Each member of the fleet retrieves the value from Key Vault during setup and creates the needed secret on the cluster. ### Create Prometheus Secret diff --git a/labs/azure-codespaces-setup.md b/labs/azure-codespaces-setup.md index 405f8fb8..38a78510 100644 --- a/labs/azure-codespaces-setup.md +++ b/labs/azure-codespaces-setup.md @@ -126,7 +126,7 @@ gh secret list ## Create DNS Zone -> This is required for HTTPS! +> **NOTE**: This is required for HTTPS! * Purchase a domain from the Azure Portal (or bring your own). * Create a DNS Zone using `PIB_DNS_RG` from above. diff --git a/labs/inner-loop-flux.md b/labs/inner-loop-flux.md index 3680f8b3..d57dc9f6 100644 --- a/labs/inner-loop-flux.md +++ b/labs/inner-loop-flux.md @@ -24,7 +24,7 @@ PiB includes templates for new applications that encapsulate K8s best practices. - You can use any app name that conforms to a dotnet namespace: - PascalCase - alpha only - - <= 20 chars + - <= 20 characters - Once created, you can browse the code in the Explorer window. ```bash @@ -56,13 +56,14 @@ docker images Notice you don't have to create or edit K8s YAML files! -- This deploys - - Flux - - MyApp - - Fluent Bit - - Grafana - - Prometheus - - WebValidate (more on this later) +This deploys: + +- Flux +- MyApp +- Fluent Bit +- Grafana +- Prometheus +- WebValidate (more on this later) ```bash # deploy flux to the cluster diff --git a/labs/inner-loop.md b/labs/inner-loop.md index 0277713f..df730174 100644 --- a/labs/inner-loop.md +++ b/labs/inner-loop.md @@ -30,7 +30,7 @@ If your prompt ends in `(main)`, create a working branch per the instructions in ## Verify k3d cluster -> **Note**: The K8s cluster is running "in" your Codespace - no need for an external cluster. +> **NOTE**: The K8s cluster is running "in" your Codespace - no need for an external cluster. 
Use `kic` to verify the k3d cluster was created successfully: diff --git a/labs/outer-loop-multi-cluster.md b/labs/outer-loop-multi-cluster.md index e7e613d8..a5733951 100644 --- a/labs/outer-loop-multi-cluster.md +++ b/labs/outer-loop-multi-cluster.md @@ -137,7 +137,7 @@ flt check app imdb The "Dogs and Cats" app is a simple "voting" app for demo purposes. -> Note that dogs-cats and IMDb cannot be deployed to the same cluster due to ingress conflicts. +> Note here that dogs-cats and IMDb cannot be deployed to the same cluster due to ingress conflicts. In a production environment, you would add ingress rules for host, url, or port-based routing. diff --git a/labs/outer-loop.md b/labs/outer-loop.md index 088745de..d750dd13 100644 --- a/labs/outer-loop.md +++ b/labs/outer-loop.md @@ -10,7 +10,7 @@ Access to the dev/test fleet can be shared between Codespaces and users. We have an advanced workshop under development for fleet sharing! Keep your eyes out. -![A diagram with a simplied fleet diagram. TODO: This diagram could be clearer.](./images/outer-loop.png) +![A diagram with a simplified fleet diagram. TODO: This diagram could be clearer.](./images/outer-loop.png) ## `flt` CLI @@ -79,16 +79,14 @@ az account set -s mySubNameOrId az account show ``` -- Validate user role on subscription - > Make sure your RoleDefinitionName is `Contributor` or `Owner` to create resources in this lab succssfully +Validate the user role on the subscription, and make sure your RoleDefinitionName is `Contributor` +or `Owner` to create resources in this lab successfully. - ```bash - - # get az user name and validate your role assignment - principal_name=$(az account show --query "user.name" --output tsv | sed -r 's/[@]+/_/g') - az role assignment list --query "[].{principalName:principalName, roleDefinitionName:roleDefinitionName, scope:scope} | [? 
contains(principalName,'$principal_name')]" -o table - - ``` +```bash +# get az user name and validate your role assignment +principal_name=$(az account show --query "user.name" --output tsv | sed -r 's/[@]+/_/g') +az role assignment list --query "[].{principalName:principalName, roleDefinitionName:roleDefinitionName, scope:scope} | [? contains(principalName,'$principal_name')]" -o table +``` ## Create a Dev Cluster