[Fabric E2E Sample] Readme update #1044

Open
wants to merge 1 commit into
base: kitsune/ci_pipelines_
47 changes: 47 additions & 0 deletions e2e_samples/fabric_dataops_sample/README.md
@@ -43,6 +43,53 @@ The diagram below illustrates the complete end-to-end CI/CD process:

![Fabric CI/CD diagram](./images/fabric-cicd-option1.png)

1. Developers work in Fabric workspaces that serve as their own sandbox environments and commit changes to their own short-lived Git branches (e.g. `<developer_name>/<branch_name>`).

@camaderal camaderal Jan 17, 2025

These are just quick comments:

  1. Instead of this outline:

    • Description of all pipelines
    • Testing
    • Cleanup.

    A good outline would be:

    • CI (PR Validation)
    • CI (Build Artifacts)
    • CD (Release and deploy)
  2. Some parts are more specific than they need to be, and some parts are not specific enough. I think a good way to go about it is:

  • a brief description of what the pipeline does
  • a brief explanation of the broad steps. Specific details are already in the code.
  • an explanation of why we chose to do it this way.

I will follow up on examples of this.

2. When changes are complete, developers raise a PR to main for review. This automatically kicks off the [PR validation pipeline](./devops/azure-pipelines-ci-qa.yml) which:
- Runs the unit tests and lint checks for [Python custom libraries](./libraries/).
- Sets up and tests an ephemeral build workspace in Fabric, requiring **interactive Azure CLI login**. It creates necessary resources like a feature workspace, custom work pool, ADLS Gen2 storage container, ADLS Gen2 cloud connections and an ADLS shortcut. The workspace is synced to the feature Git branch. These resources are created per PR and reused if they already exist. The pipeline publishes compute settings and libraries as needed, re-publishing the environment if there are changes for the environment config files or custom libraries in the PR. It also creates config files for the solution and uploads them to the ADLS Gen2 container, finally running a notebook to verify the setup.
Contributor

@maye-msft maye-msft Jan 17, 2025

  • 'requiring interactive Azure CLI login. '
    I wonder if we can run it automatically, since in the pipeline we may not be able to run the CLI manually.
  • 'It creates necessary resources like'
    It would be good to have a list of which resources are created.
  • 're-publishing the environment if there are changes for the environment config files or custom libraries in the PR.'
    Will this be checked by the script?
  • 'finally running a notebook to verify the setup.'
    Please specify the name of the notebook. And is it meant to validate the setup or the pipeline?

3. On PR completion, the commit to main triggers a [Build pipeline](./devops/azure-pipelines-ci-artifacts.yml), which:
- Runs the same unit tests and lint checks for [Python custom libraries](./libraries/) as PR validation. If the tests are successful, it publishes the files as `fabric_env` artifacts along with the Fabric environment configuration YAML file.
- Creates configuration files for the solution. Alongside these config files, the pipeline also includes seed data files for reference and publishes them as `ADLS` artifacts.
- The artifacts of the pipeline are organized as follows:

```plaintext
adls
├── config
│   ├── application.cfg
│   └── lakehouse_ddls.yaml
└── reference
    ├── dim_date.csv
    └── dim_time.csv
fabric_env
├── environment.yaml
└── custom_libraries
    ├── ddo_transform_standardize.py
    ├── ddo_transform_transform.py
    └── otel_monitor_invoker.py
```
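
As a rough illustration of the layout above, the following Python sketch checks that a downloaded artifact directory contains the expected files. The function name `missing_artifact_files` and the idea of running such a check are assumptions made for illustration and are not part of the sample's pipelines. The file paths come from the tree above, assuming the three `.py` files sit under `fabric_env/custom_libraries`.

```python
from pathlib import Path

# Expected files, copied from the artifact tree in this README.
# Assumption: the three .py files live under fabric_env/custom_libraries.
EXPECTED_ARTIFACT_FILES = (
    "adls/config/application.cfg",
    "adls/config/lakehouse_ddls.yaml",
    "adls/reference/dim_date.csv",
    "adls/reference/dim_time.csv",
    "fabric_env/environment.yaml",
    "fabric_env/custom_libraries/ddo_transform_standardize.py",
    "fabric_env/custom_libraries/ddo_transform_transform.py",
    "fabric_env/custom_libraries/otel_monitor_invoker.py",
)


def missing_artifact_files(root):
    """Return the expected relative paths that do not exist under root.

    Hypothetical helper: it only checks for file presence, not file content.
    """
    root = Path(root)
    return [rel for rel in EXPECTED_ARTIFACT_FILES if not (root / rel).is_file()]
```

A check like this could run in a release stage before deployment, so that a missing config or seed file fails early instead of surfacing later as a runtime error.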

### Testing

- **Data Transformation package** - Unit tests exercise small pieces of functionality within your code. Data transformation code should have unit tests, which is easiest to achieve by abstracting the data transformation logic into packages. Unit tests and linting run automatically when a PR to `main` is created and on every commit to `main`.

- See here for [unit tests](./libraries/test/ddo_transform/) of the Data Transformation package within the solution. The corresponding [QA Pipeline](./devops/azure-pipelines-ci-qa.yml) executes the unit tests on every PR, and the [Artifacts Pipeline](./devops/azure-pipelines-ci-artifacts.yml) executes them on every commit to `main`.

- **Ephemeral Fabric Environment** - This is a simple "Run a notebook" test. After setting up the ephemeral Fabric workspace, the QA pipeline attempts to run a notebook to confirm that the setup completed as expected. Unit tests and linting also run automatically when a PR to `main` is created.

- See here for [unit tests](./fabric/test/) within the solution. The corresponding [QA Pipeline](./devops/azure-pipelines-ci-qa.yml) executes the unit tests on every PR.
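
The "Run a notebook" check amounts to starting a run and waiting for a terminal state. The sketch below shows one possible shape for that wait loop. It is not the sample's implementation: `get_status` is a hypothetical callable supplied by the caller, and the status strings (`"Succeeded"`, `"Failed"`, `"Cancelled"`) are placeholders that would need to match whatever the actual job API returns.

```python
import time


def wait_for_notebook_run(get_status, timeout_s=600.0, poll_interval_s=10.0, sleep=time.sleep):
    """Poll get_status() until it reports a terminal state or the timeout elapses.

    get_status is a hypothetical callable returning a status string; the state
    names used here are placeholders, not values confirmed by the sample.
    Returns True on success, False on failure or cancellation.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status == "Succeeded":
            return True
        if status in ("Failed", "Cancelled"):
            return False
        if waited >= timeout_s:
            raise TimeoutError(f"notebook run still '{status}' after {timeout_s} seconds")
        # Wait before asking again; sleep is injectable so tests can skip the delay.
        sleep(poll_interval_s)
        waited += poll_interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays, and raising on timeout lets the pipeline step fail loudly instead of hanging.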

More resources:

- [pytest](https://docs.pytest.org/) - Testing framework for Python
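
For readers new to this style of testing, here is a minimal pytest-style sketch of a unit test for a small piece of transformation logic. The function `standardize_column_names` is hypothetical and is not part of the `ddo_transform` package; it only stands in for the kind of logic that becomes easy to test once it lives in a package.

```python
def standardize_column_names(columns):
    """Hypothetical transformation: trim, lower-case, and snake_case column names."""
    return [c.strip().lower().replace(" ", "_") for c in columns]


# pytest collects functions whose names start with "test_" and reports
# any failing assert as a test failure.
def test_standardize_column_names_normalizes_case_and_spaces():
    assert standardize_column_names([" Sale Date ", "AMOUNT"]) == ["sale_date", "amount"]


def test_standardize_column_names_handles_empty_input():
    assert standardize_column_names([]) == []
```

Saved in a file whose name starts with `test_`, these functions would be picked up by running `pytest` in that directory.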

### Clean-up

- **CleanupWorkspace** - Ephemeral artifacts should be cleaned up when a PR against the main branch is completed or abandoned. The [QA Cleanup Pipeline](./devops/azure-pipelines-ci-qa-cleanup.yaml) removes the resources created during the [QA Pipeline](./devops/azure-pipelines-ci-qa.yml).

_**Note:** Configure the trigger so that the pipeline runs when the PR is completed or abandoned._
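
The trigger condition in the note can be expressed as a small predicate. This is only a sketch: it assumes the PR status is available as a plain string such as `active`, `completed`, or `abandoned`, and how that status reaches the pipeline (for example through a webhook payload) is outside its scope. The name `should_run_cleanup` is hypothetical.

```python
# PR states after which the ephemeral resources are no longer needed.
# Assumption: the status arrives as a plain string such as "active",
# "completed", or "abandoned".
TERMINAL_PR_STATUSES = frozenset({"completed", "abandoned"})


def should_run_cleanup(pr_status):
    """Return True when the PR has reached a state that warrants cleanup."""
    return pr_status.strip().lower() in TERMINAL_PR_STATUSES
```

Keeping the decision in one place makes it harder to end up cleaning up on completion but forgetting the abandoned case.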

## How to use the sample

### High-level deployment sequence