Module 4: CI Pipelines (Continuous Integration)
Structure of a YAML pipeline
```yaml
# azure-pipelines.yml
trigger:
  branches:
    include:
      - main
      - develop
  paths:
    include:
      - src/*
      - pipelines/*

pool:
  vmImage: 'ubuntu-latest'

variables:
  pythonVersion: '3.10'

stages:
  - stage: Build
    jobs:
      - job: BuildJob
        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '$(pythonVersion)'
          - script: |
              pip install -r requirements.txt
            displayName: 'Install dependencies'
          - script: |
              pytest tests/ --junitxml=test-results.xml
            displayName: 'Run tests'
          - task: PublishTestResults@2
            inputs:
              testResultsFiles: 'test-results.xml'
              testRunTitle: 'Python Tests'
```
Triggers
Trigger types:
- CI trigger: fires on a push
- PR trigger: fires on a pull request
- Scheduled: runs on a cron schedule
- Pipeline: fired by the completion of another pipeline
```yaml
# Trigger on specific branches
trigger:
  branches:
    include:
      - main
      - develop
      - release/*
    exclude:
      - feature/experimental/*

# PR trigger
pr:
  branches:
    include:
      - main
      - develop

# Scheduled trigger (cron times are UTC; "0 6 * * *" = every day at 06:00)
schedules:
  - cron: "0 6 * * *"
    displayName: Daily build
    branches:
      include:
        - main
    always: true
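```

The fourth trigger type, a pipeline-completion trigger, is declared as a pipeline resource rather than under `trigger:`. A minimal sketch, where the alias `upstream` and the pipeline name `Upstream-CI` are placeholders:

```yaml
# Run this pipeline when another pipeline completes
resources:
  pipelines:
    - pipeline: upstream      # alias used within this file (placeholder)
      source: 'Upstream-CI'   # name of the triggering pipeline (placeholder)
      trigger:
        branches:
          include:
            - main
```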
Variables and secrets
```yaml
# Inline variables
variables:
  environment: 'dev'
  resourceGroup: 'rg-folab-dev'
```

```yaml
# Variable group (can be linked to Key Vault)
variables:
  - group: folab-secrets
```

```yaml
# Pipeline variables (name/value form)
variables:
  - name: buildConfiguration
    value: 'Release'
```

```yaml
# Usage
steps:
  - script: echo $(environment)
  - script: echo $(sql-connection-string)  # secret from the variable group
```
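Variables can also be set at runtime from inside a step with the `task.setvariable` logging command; later steps then read them with the usual macro syntax. A short sketch (the variable name and value are illustrative):

```yaml
steps:
  - script: echo "##vso[task.setvariable variable=releaseTag]v1.2.3"
    displayName: 'Set a variable at runtime'
  - script: echo $(releaseTag)   # available in subsequent steps only
    displayName: 'Read it in a later step'
```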
CI pipeline for Synapse
```yaml
# CI pipeline for Synapse notebooks
trigger:
  branches:
    include:
      - main
      - develop
  paths:
    include:
      - notebooks/**
      - pipelines/**

pool:
  vmImage: 'ubuntu-latest'

stages:
  - stage: Validate
    jobs:
      - job: ValidateNotebooks
        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '3.10'
          - script: |
              pip install nbformat jsonschema
              python scripts/validate_notebooks.py
            displayName: 'Validate notebook syntax'
          - script: |
              python scripts/validate_pipeline_json.py
            displayName: 'Validate pipeline definitions'

  - stage: Package
    dependsOn: Validate
    jobs:
      - job: CreateArtifact
        steps:
          - task: CopyFiles@2
            inputs:
              sourceFolder: '$(Build.SourcesDirectory)'
              contents: |
                notebooks/**
                pipelines/**
                linkedServices/**
              targetFolder: '$(Build.ArtifactStagingDirectory)'
          - task: PublishBuildArtifacts@1
            inputs:
              pathToPublish: '$(Build.ArtifactStagingDirectory)'
              artifactName: 'synapse-artifacts'
```
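The Validate stage calls `scripts/validate_notebooks.py`, which is not shown here. A minimal stdlib-only sketch of what such a script could check — the real step installs `nbformat` and `jsonschema` for stricter validation, while this sketch only verifies that each `.ipynb` file is valid JSON with the expected top-level keys:

```python
# Hypothetical sketch of scripts/validate_notebooks.py (stdlib only).
import glob
import json


def validate_notebooks(pattern="notebooks/**/*.ipynb"):
    """Return a list of 'path: problem' strings; an empty list means all valid."""
    errors = []
    for path in glob.glob(pattern, recursive=True):
        try:
            with open(path, encoding="utf-8") as f:
                nb = json.load(f)
        except (OSError, json.JSONDecodeError) as exc:
            errors.append(f"{path}: {exc}")
            continue
        # A .ipynb file is JSON with at least these top-level keys.
        for key in ("cells", "nbformat"):
            if key not in nb:
                errors.append(f"{path}: missing top-level key '{key}'")
    return errors
```

In CI, the script would print each error and exit non-zero when the list is non-empty, which fails the step.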
Automated tests
```python
# tests/test_transformations.py
import pytest

from transformations import calculate_esg_score


def test_esg_score_calculation():
    data = {'environmental': 80, 'social': 70, 'governance': 90}
    result = calculate_esg_score(data)
    assert result == 80.0


def test_esg_score_missing_data():
    with pytest.raises(ValueError):
        calculate_esg_score({})
```
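The module under test, `transformations.calculate_esg_score`, is not shown. A sketch consistent with the two tests above — the equal weighting of the three pillars is an assumption, not something the course states:

```python
# Hypothetical sketch of transformations.calculate_esg_score.
# Assumption: the ESG score is the unweighted mean of the three pillars.
def calculate_esg_score(data):
    required = ("environmental", "social", "governance")
    missing = [k for k in required if k not in data]
    if missing:
        raise ValueError(f"missing ESG components: {missing}")
    return sum(data[k] for k in required) / len(required)
```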
```yaml
# Pipeline step
- script: |
    pip install pytest pytest-cov
    pytest tests/ --cov=src --cov-report=xml --junitxml=results.xml
  displayName: 'Run tests with coverage'
- task: PublishCodeCoverageResults@1
  inputs:
    codeCoverageTool: Cobertura
    summaryFileLocation: 'coverage.xml'
```
Agents and pools

| Agent | Use case | Cost |
|---|---|---|
| Microsoft-hosted | Standard builds | Free minutes, then paid |
| Self-hosted | Access to internal networks | Infrastructure you manage |

```yaml
# Use a self-hosted agent pool
pool:
  name: 'DataLab-Agents'
  demands:
    - Agent.OS -equals Linux
    - python3
```

```yaml
# Or a specific Microsoft-hosted image
pool:
  vmImage: 'windows-latest'   # for PowerShell, .NET
  # vmImage: 'ubuntu-latest'  # for Linux, Python
  # vmImage: 'macOS-latest'   # for iOS, macOS
```
CI best practices:
- Keep builds fast (under 10 minutes)
- Make automated tests mandatory
- Fail fast: stop at the first failure
- Cache dependencies
- Run independent jobs in parallel
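
The "cache dependencies" practice maps to the built-in `Cache@2` task. A sketch for pip — the key segments and cache path are illustrative:

```yaml
steps:
  - task: Cache@2
    inputs:
      key: 'python | "$(Agent.OS)" | requirements.txt'
      restoreKeys: |
        python | "$(Agent.OS)"
      path: '$(Pipeline.Workspace)/.pip'
    displayName: 'Cache pip downloads'
  - script: pip install --cache-dir $(Pipeline.Workspace)/.pip -r requirements.txt
    displayName: 'Install dependencies'
```

The cache key changes whenever `requirements.txt` changes, so a modified dependency list forces a fresh install instead of restoring a stale cache.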