diff --git a/docs/kb/semgrep-ci/azure-self-hosted-ubuntu.md b/docs/kb/semgrep-ci/azure-self-hosted-ubuntu.md new file mode 100644 index 000000000..7e009568a --- /dev/null +++ b/docs/kb/semgrep-ci/azure-self-hosted-ubuntu.md @@ -0,0 +1,134 @@ +--- +tags: + - Azure Pipelines +description: Run Semgrep on self-hosted Ubuntu runners in Azure DevOps. +--- +import AzureVariables from "/src/components/procedure/_set-env-vars-azure.mdx" + +# Semgrep with self-hosted Ubuntu runners in Azure Pipelines + +Semgrep provides a [sample configuration for Azure-hosted runners](/docs/semgrep-ci/sample-ci-configs#azure-pipelines). If you use self-hosted Ubuntu Linux runners, you have significantly more control over their configuration, but as a result, they require additional preparation and configuration to run Semgrep. + +This guide describes two approaches to configuring self-hosted runners that use Ubuntu (the default self-hosted option for Azure DevOps Linux runners): + +* [Using pipx](#using-pipx) +* [Using pip with a virtual environment](#using-pip-with-a-virtual-environment) + +## Using pipx + +While the sample configuration uses `pip`, this approach uses `pipx`, which avoids conflicts between system-managed Python and user-installed Python. + +### Prepare your runner + +Access the runner and execute the following commands: + +```bash +$ sudo apt update +$ sudo apt install pipx +$ pipx ensurepath +``` + +After completing the commands: + +1. Start a new shell session, so that the changes from `pipx ensurepath` are available. +2. Ensure the [Azure DevOps agent](https://learn.microsoft.com/en-us/azure/devops/pipelines/agents/linux-agent?view=azure-devops) is set up and running. + +### Create your configuration + +1. Follow the steps provided in the [sample configuration for Azure-hosted runners](/docs/semgrep-ci/sample-ci-configs#azure-pipelines). +2. Add the following snippet to the `azure-pipelines.yml` for the repository. 
+ +```yaml +variables: +- group: Semgrep_Variables + +pool: + name: Default + +steps: +- checkout: self + clean: true + fetchDepth: 20 + persistCredentials: true +- script: | + pipx install semgrep + if [ $(Build.SourceBranchName) = "master" ]; then + echo "Semgrep full scan" + semgrep ci + elif [ $(System.PullRequest.PullRequestId) -ge 0 ]; then + echo "Semgrep diff scan" + git fetch origin master:origin/master + export SEMGREP_PR_ID=$(System.PullRequest.PullRequestId) + export SEMGREP_BASELINE_REF='origin/master' + semgrep ci + fi + env: + SEMGREP_APP_TOKEN: $(SEMGREP_APP_TOKEN) +``` + +:::info Customizing the configuration +* If your self-hosted runner [agent pool](https://learn.microsoft.com/en-us/azure/devops/pipelines/agents/pools-queues?view=azure-devops&tabs=yaml%2Cbrowser) has a different name, update the `name` key under `pool` to match the desired agent pool. +* If your default branch is not called `master`, update the references to `master` to match the name of your default branch. +::: + + + +## Using pip with a virtual environment + +### Prepare your runner + +This approach uses built-in Azure DevOps tasks, including `UsePythonVersion` and `Bash`, and installs Semgrep with `pip` inside a virtual environment, which also avoids conflicts between system-managed Python and user-installed Python. + +1. Ensure you have a pre-installed and configured compatible version of Python 3, following [the instructions for UsePythonVersion for self-hosted runners](https://learn.microsoft.com/en-us/azure/devops/pipelines/tasks/reference/use-python-version-v0?view=azure-pipelines#how-can-i-configure-a-self-hosted-agent-to-use-this-task). +2. Ensure the [Azure DevOps agent](https://learn.microsoft.com/en-us/azure/devops/pipelines/agents/linux-agent?view=azure-devops) is set up and running. + +### Create your configuration + +Add the following snippet to the `azure-pipelines.yml` for the repository. 
+ + +```yaml +variables: +- group: Semgrep_Variables + +pool: + name: Default + +steps: + - checkout: self + clean: true + persistCredentials: true + - task: UsePythonVersion@0 + displayName: 'Use Python 3.12' + inputs: + versionSpec: 3.12 + - task: Bash@3 + env: + SEMGREP_APP_TOKEN: $(SEMGREP_APP_TOKEN) + inputs: + targetType: 'inline' + script: | + python3 -m venv .venv + source .venv/bin/activate + python3 -m pip install --upgrade pip + pip install semgrep + + if [ $(Build.SourceBranchName) = "master" ]; then + export SEMGREP_BRANCH=$(Build.SourceBranchName) + echo "Semgrep full scan of master" + semgrep ci + elif [ $(System.PullRequest.PullRequestId) -ge 0 ]; then + echo "Semgrep diff scan" + git fetch origin master:origin/master + export SEMGREP_PR_ID=$(System.PullRequest.PullRequestId) + export SEMGREP_BASELINE_REF='origin/master' + semgrep ci + fi +``` + +:::info Customizing the configuration +* If your self-hosted runner [agent pool](https://learn.microsoft.com/en-us/azure/devops/pipelines/agents/pools-queues?view=azure-devops&tabs=yaml%2Cbrowser) has a different name, update the `name` key under `pool` to match the desired agent pool. +* If your default branch is not called `master`, update the references to `master` to match the name of your default branch. 
+::: + + diff --git a/docs/semgrep-ci/sample-ci-configs.md b/docs/semgrep-ci/sample-ci-configs.md index 3e6067fc0..6e4494abc 100644 --- a/docs/semgrep-ci/sample-ci-configs.md +++ b/docs/semgrep-ci/sample-ci-configs.md @@ -46,6 +46,8 @@ import CircleCiSemgrepOssSast from "/src/components/code_snippets/_circleci-semg import AzureSemgrepAppSast from "/src/components/code_snippets/_azure-semgrep-app-sast.mdx" import AzureSemgrepOssSast from "/src/components/code_snippets/_azure-semgrep-oss-sast.mdx" +import AzureVariables from "/src/components/procedure/_set-env-vars-azure.mdx" + import ScmFeatureReference from "/src/components/reference/_scm-feature-reference.md" @@ -88,7 +90,7 @@ If you are self-hosting your repository, you must [use a self-hosted runner](htt -The following configuration creates a CI job that runs scans depending on what products you have enabled in Semgrep AppSec Platform. +The following configuration creates a CI job that runs scans using the products and options you have enabled in Semgrep AppSec Platform. @@ -152,7 +154,7 @@ To add a Semgrep configuration snippet in your GitLab CI/CD pipeline: -The following configuration creates a CI job that runs scans depending on what products you have enabled in Semgrep AppSec Platform. +The following configuration creates a CI job that runs scans using the products and options you have enabled in Semgrep AppSec Platform. @@ -213,7 +215,7 @@ To add a Semgrep configuration snippet in your Jenkins pipeline: For SCA scans (Semgrep Supply Chain): users of Jenkins UI with the Git plugin must also set up their branch information. See [Setting up Semgrep Supply Chain with Jenkins UI](/semgrep-supply-chain/setup-jenkins-ui) for more information. ::: -The following configuration creates a CI job that runs scans depending on what products you have enabled in Semgrep AppSec Platform. +The following configuration creates a CI job that runs scans using the products and options you have enabled in Semgrep AppSec Platform. 
@@ -271,7 +273,7 @@ These steps can also be performed through Bitbucket's UI wizard. This UI wizard -The following configuration creates a CI job that runs scans depending on what products you have enabled in Semgrep AppSec Platform. +The following configuration creates a CI job that runs scans using the products and options you have enabled in Semgrep AppSec Platform. @@ -404,7 +406,7 @@ For the default branch and tags, CircleCI always runs the Semgrep CI job on all -The following configuration creates a CI job that runs scans depending on what products you have enabled in Semgrep AppSec Platform. +The following configuration creates a CI job that runs scans using the products and options you have enabled in Semgrep AppSec Platform. @@ -432,15 +434,13 @@ Scanning a project with the `semgrep ci` command requires the project to be vers To add Semgrep into Azure Pipelines: 1. Access the YAML pipeline editor within Azure Pipelines by following the [YAML pipeline editor](https://learn.microsoft.com/en-us/azure/devops/pipelines/get-started/yaml-pipeline-editor?view=azure-devops#edit-a-yaml-pipeline) guide. -2. Copy the relevant code snippet provided in [Sample Azure Pipelines configuration snippet](#sample-azure-pipelines-configuration-snippet) into the Azure Pipelines YAML editor. +2. Copy the code snippet provided in [Sample Azure Pipelines configuration snippet](#sample-azure-pipelines-configuration-snippet) into the Azure Pipelines YAML editor. 3. Save the code snippet. -4. Set [environment variables](https://learn.microsoft.com/en-us/azure/devops/pipelines/process/variables?view=azure-devops&tabs=yaml%2Cbatch#secret-variables). -5. Group the environment variables as a [variable group](https://learn.microsoft.com/en-us/azure/devops/pipelines/library/variable-groups?view=azure-devops&tabs=classic). -6. 
Optional: Create a separate CI job for diff-aware scanning, which scans only changed files in PRs or MRs, by repeating steps 1-4 and adding `SEMGREP_BASELINE_REF` as an environment variable. +4. Follow any additional instructions provided with the snippet. ### Sample Azure Pipelines configuration snippet -This configuration snippet is tested with hosted Azure runners. If you are using self-hosted runners, you may need to make adjustments to ensure that the necessary software is available. +This configuration snippet is tested with **hosted** Azure runners. If you are using self-hosted runners, you may need to make adjustments to ensure that the necessary software is available. Consult [Semgrep with self-hosted Ubuntu runners in Azure Pipelines](/docs/kb/semgrep-ci/azure-self-hosted-ubuntu) for two recommended options. -The following configuration creates a CI job that runs scans depending on what products you have enabled in Semgrep AppSec Platform. +The following configuration creates a CI job that runs scans using the products and options you have enabled in Semgrep AppSec Platform. You can **run specific product scans** by passing an argument, such as `--supply-chain`. View the [list of arguments](/getting-started/cli/#scan-using-specific-semgrep-products). + + @@ -475,7 +477,7 @@ You can customize the scan by entering custom rules or other rulesets to scan wi To run Semgrep CI on any other provider, use the `semgrep/semgrep` image, and run the `semgrep ci` command with `SEMGREP_BASELINE_REF` set for diff-aware scanning. -**Note**: If you need to use a different image than docker, install Semgrep CI by `pip install semgrep`. +**Note**: If you need to use a different Docker image or are not running in Docker, install Semgrep CI by running `pip install semgrep`. 
By setting various [CI environment variables](/semgrep-ci/ci-environment-variables), you can run Semgrep in the following CI providers: diff --git a/src/components/code_snippets/_azure-semgrep-app-sast.mdx b/src/components/code_snippets/_azure-semgrep-app-sast.mdx index 05adf35a5..2bfe189bf 100644 --- a/src/components/code_snippets/_azure-semgrep-app-sast.mdx +++ b/src/components/code_snippets/_azure-semgrep-app-sast.mdx @@ -18,24 +18,8 @@ steps: export SEMGREP_PR_ID=$(System.PullRequest.PullRequestId) export SEMGREP_BASELINE_REF='origin/master' git fetch origin master:origin/master - semgrep ci + semgrep ci fi + env: + SEMGREP_APP_TOKEN: $(SEMGREP_APP_TOKEN) ``` - -### Setting environment variables in Azure Pipelines - -Set these variables within Azure Pipelines UI following the steps in [Environment variables](https://learn.microsoft.com/en-us/azure/devops/pipelines/process/variables?view=azure-devops&tabs=yaml%2Cbatch#secret-variables): - -* `SEMGREP_APP_TOKEN` - -Set these environment variables to troubleshoot the links to the code that generated a finding or if you are not receiving PR or MR comments: - -* `SEMGREP_JOB_URL` -* `SEMGREP_COMMIT` -* `SEMGREP_BRANCH` -* `SEMGREP_REPO_URL` -* `SEMGREP_REPO_NAME` - -Set this environment variable for diff-aware scanning: - -* `SEMGREP_BASELINE_REF`. Its value is typically your trunkline branch, such as `main` or `master`. 
diff --git a/src/components/code_snippets/_azure-semgrep-oss-sast.mdx b/src/components/code_snippets/_azure-semgrep-oss-sast.mdx index 150ef5c1e..648cc32e7 100644 --- a/src/components/code_snippets/_azure-semgrep-oss-sast.mdx +++ b/src/components/code_snippets/_azure-semgrep-oss-sast.mdx @@ -1,6 +1,4 @@ ```yaml -variables: -- group: Semgrep_Variables steps: - checkout: self diff --git a/src/components/procedure/_set-env-vars-azure.mdx b/src/components/procedure/_set-env-vars-azure.mdx new file mode 100644 index 000000000..b9d1d3044 --- /dev/null +++ b/src/components/procedure/_set-env-vars-azure.mdx @@ -0,0 +1,14 @@ +### Set environment variables in Azure Pipelines + +At minimum, Semgrep requires the `SEMGREP_APP_TOKEN` variable in order to report results to the platform; other variables may be helpful as well. To set these variables in Azure Pipelines: + +1. Set up a [variable group](https://learn.microsoft.com/en-us/azure/devops/pipelines/library/variable-groups?view=azure-devops&tabs=classic) called `Semgrep_Variables`. +2. Set `SEMGREP_APP_TOKEN` in the variable group, following the steps for [secret variables](https://learn.microsoft.com/en-us/azure/devops/pipelines/process/set-secret-variables?view=azure-devops&tabs=yaml%2Cbash#set-a-secret-variable-in-a-variable-group). The variable is mapped into the `env` in the provided config. +3. Optional: Add the following environment variables to the group if you aren't seeing hyperlinks to the code that generated a finding, or if you are not receiving PR or MR comments. Review the use of these variables at [Environment variables for creating hyperlinks in Semgrep AppSec Platform](https://semgrep.dev/docs/semgrep-ci/ci-environment-variables#environment-variables-for-creating-hyperlinks-in-semgrep-appsec-platform). These variables are not sensitive and do not need to be secret variables. + * `SEMGREP_REPO_NAME` + * `SEMGREP_REPO_URL` + * `SEMGREP_BRANCH` + * `SEMGREP_COMMIT` + * `SEMGREP_JOB_URL` +4. 
Set variables for diff-aware scanning. The provided config sets `SEMGREP_PR_ID` to the system variable `System.PullRequest.PullRequestId` and `SEMGREP_BASELINE_REF` to `origin/master` within the `script` section of the config. The value of `SEMGREP_BASELINE_REF` is typically your trunk or default branch, such as `main` or `master`, so if your default branch is not `master`, update the name accordingly. + * If you prefer not to implement diff-aware scanning, you can skip setting these variables and remove the `elif` section of the `script` step.
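
The branch logic that the provided config applies in its `script` step can be sketched in isolation as a small shell function. This is a minimal illustration under stated assumptions, not part of the Semgrep configuration: the function name `semgrep_scan_mode`, its positional arguments, and the `skip` result are hypothetical and exist only to show when a full scan or a diff-aware scan would be chosen.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Semgrep or Azure Pipelines).
# Mirrors the if/elif in the provided config:
#   $1 = source branch name
#   $2 = pull request ID (may be empty outside of PR builds)
#   $3 = default branch name (assumed to be "master" when omitted)
semgrep_scan_mode() {
  local branch="$1" pr_id="$2" default_branch="${3:-master}"
  if [ "$branch" = "$default_branch" ]; then
    # Builds of the default branch get a full scan.
    echo "full"
  elif [ -n "$pr_id" ] && [ "$pr_id" -ge 0 ] 2>/dev/null; then
    # Pull request builds get a diff-aware scan against the default branch.
    echo "diff"
  else
    # Neither the default branch nor a pull request: no scan is run.
    echo "skip"
  fi
}
```

For example, `semgrep_scan_mode feature 42` selects the diff-aware path, while `semgrep_scan_mode main 7 main` selects a full scan because the branch matches the default branch passed as the third argument. Removing the `elif` branch corresponds to skipping diff-aware scanning, as described above.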