185 changes: 185 additions & 0 deletions docs/minio-to-local-storage-migration.md
@@ -0,0 +1,185 @@
# Airbyte MinIO to Local Storage Migration Runbook

## Overview

This runbook provides step-by-step instructions for migrating Airbyte data from MinIO object storage to local filesystem storage while preserving all job logs, workload data, and connection states. Complete these steps before running an `abctl local install` that uses a local persistent volume.

## Prerequisites

- `kubectl` configured with access to the Airbyte cluster
- `mc` (MinIO client) installed locally
- Airbyte cluster running with MinIO service accessible
- Port forwarding capability to MinIO service
- Consider taking a backup of your `~/.airbyte/abctl/data/airbyte-minio-pv/` folder.
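
The backup suggested in the last prerequisite could be taken with a small helper like the one sketched here. The function name `backup_minio_pv` and the default destination under `$HOME` are assumptions for illustration; neither is part of abctl.

```bash
# Hypothetical helper (not part of abctl): copy the MinIO PV directory to a
# timestamped backup location and print the backup path.
backup_minio_pv() {
  local src="${1:-$HOME/.airbyte/abctl/data/airbyte-minio-pv}"
  local dest="${2:-$HOME/airbyte-minio-pv-backup-$(date +%Y%m%d-%H%M%S)}"

  if [ ! -d "$src" ]; then
    echo "nothing to back up: $src is not a directory" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    echo "refusing to overwrite existing path: $dest" >&2
    return 1
  fi

  # -R copies recursively; -p preserves mode, ownership, and timestamps.
  cp -Rp "$src" "$dest" && echo "$dest"
}
```

Calling `backup_minio_pv` with no arguments uses the defaults. If some files in the volume are owned by another user, the copy may need elevated permissions.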

## Migration Steps

### Step 1: Verify Cluster Status

```bash
# Check that Airbyte is running and MinIO service is available
kubectl get svc -n airbyte-abctl | grep minio
```

The output should include the `airbyte-minio-svc` service.

### Step 2: Set Up Port Forward to MinIO

```bash
# Forward MinIO service port to localhost
kubectl port-forward -n airbyte-abctl svc/airbyte-minio-svc 9000:9000
```

The command blocks while it forwards traffic, so run it in a separate terminal window and leave it running until Step 10.

### Step 3: Configure MinIO Client

```bash
# Configure mc with Airbyte MinIO credentials (credentials are always the same)
mc alias set airbyte-minio http://localhost:9000 minio minio123
```

### Step 4: Verify MinIO Connection

```bash
# Test connection and list available buckets
mc ls airbyte-minio
```

Expected buckets:

- `airbyte-dev-logs/` - Contains job execution logs
- `airbyte-storage/` - Contains workload and sync data
- `state-storage/` - Contains connection state information

### Step 5: Prepare Local Storage Directory

```bash
# Ensure local storage directory exists and has correct permissions
mkdir -p ~/.airbyte/abctl/data/airbyte-local-pv
chmod 777 ~/.airbyte/abctl/data/airbyte-local-pv
```

### Step 6: Migrate Job Logs

```bash
# Mirror job logs from MinIO to local filesystem
mc mirror airbyte-minio/airbyte-dev-logs ~/.airbyte/abctl/data/airbyte-local-pv/job-logging
```

### Step 7: Migrate Workload Data

```bash
# Mirror workload data from MinIO to local filesystem
mc mirror airbyte-minio/airbyte-storage ~/.airbyte/abctl/data/airbyte-local-pv/workload
```

### Step 8: Migrate State Storage (if it exists)

```bash
# Mirror state storage from MinIO to local filesystem
mc mirror airbyte-minio/state-storage ~/.airbyte/abctl/data/airbyte-local-pv/state
```
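
Steps 6 through 8 can also be run as one loop that skips any bucket that does not exist. This is a minimal sketch: the function name `mirror_buckets` is hypothetical, it relies on the `airbyte-minio` alias from Step 3, and it assumes `mc ls` exits non-zero when a bucket is missing.

```bash
# Hypothetical wrapper around Steps 6-8. Assumes the `airbyte-minio` alias from
# Step 3 and that `mc ls` exits non-zero when a bucket is missing.
mirror_buckets() {
  local dest_root="${1:-$HOME/.airbyte/abctl/data/airbyte-local-pv}"
  local pair bucket dir

  # bucket:directory pairs taken from Steps 6-8
  for pair in airbyte-dev-logs:job-logging airbyte-storage:workload state-storage:state; do
    bucket="${pair%%:*}"
    dir="${pair##*:}"

    if ! mc ls "airbyte-minio/$bucket" >/dev/null 2>&1; then
      echo "skipping $bucket (bucket not found)"
      continue
    fi

    # Stop at the first failed mirror so the error is not overlooked.
    mc mirror "airbyte-minio/$bucket" "$dest_root/$dir" || return 1
  done
}
```

Because `mc mirror` only copies new or changed files, the loop can be re-run after a partial failure.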

### Step 9: Verify Migration

```bash
# Check that data was successfully migrated
ls -la ~/.airbyte/abctl/data/airbyte-local-pv/

# Count migrated files
find ~/.airbyte/abctl/data/airbyte-local-pv/ -type f | wc -l

# Verify specific data types
find ~/.airbyte/abctl/data/airbyte-local-pv/ -name "*.json" | head -5
find ~/.airbyte/abctl/data/airbyte-local-pv/ -name "*sync*" | head -5
```

### Step 10: Clean Up Port Forward

```bash
# Stop the port forward process (Ctrl+C in the port forward terminal)
```

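### Step 11: Move `~/.airbyte/abctl/data/airbyte-minio-pv` to another folder

`abctl local install` checks whether this directory exists and, if it does, keeps provisioning the MinIO persistent volume. Moving the directory out of `~/.airbyte/abctl/data/` lets the next install use local storage.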
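
One way to carry out this step is sketched here. The function name `archive_minio_pv` and the default destination under `$HOME` are assumptions for illustration, not abctl conventions.

```bash
# Hypothetical helper for Step 11: move the MinIO PV directory out of the abctl
# data directory so it is no longer found at its original path.
archive_minio_pv() {
  local src="${1:-$HOME/.airbyte/abctl/data/airbyte-minio-pv}"
  local dest="${2:-$HOME/airbyte-minio-pv-archived-$(date +%Y%m%d-%H%M%S)}"

  [ -d "$src" ] || { echo "not found: $src" >&2; return 1; }
  [ ! -e "$dest" ] || { echo "destination already exists: $dest" >&2; return 1; }

  mv "$src" "$dest" && echo "moved $src to $dest"
}
```

Keeping the moved directory, rather than deleting it, leaves a copy available for the rollback plan.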

## Expected Results

After successful migration, you should have:

### Directory Structure

```text
~/.airbyte/abctl/data/airbyte-local-pv/
├── job-logging/          # Migrated from airbyte-dev-logs
│   └── workspace/
│       └── [job-logs]
├── workload/             # Migrated from airbyte-storage
│   └── output/
│       └── [sync-data]
└── state/                # Migrated from state-storage (if exists)
    └── [state-files]
```

### File Types Migrated

- **Job Logs**: `.json` files containing Airbyte job execution logs
- **Workload Data**: Sync outputs and intermediate processing files
- **State Files**: Connection state information for incremental syncs

## Troubleshooting

### Port Forward Issues

```bash
# If port 9000 is busy, use a different port
kubectl port-forward -n airbyte-abctl svc/airbyte-minio-svc 9001:9000

# Then update mc alias
mc alias set airbyte-minio http://localhost:9001 minio minio123
```

### Permission Issues

```bash
# Fix directory permissions if needed
sudo chown -R $(whoami) ~/.airbyte/abctl/data/airbyte-local-pv
chmod -R 755 ~/.airbyte/abctl/data/airbyte-local-pv
```

### Verification Commands

```bash
# Compare file counts between MinIO and local
# (`mc ls --recursive` prints one line per object; `mc find` has no `--type` flag)
mc ls --recursive airbyte-minio/airbyte-dev-logs | wc -l
find ~/.airbyte/abctl/data/airbyte-local-pv/job-logging -type f | wc -l

# Check for specific workspace data
mc ls airbyte-minio/airbyte-storage/output/
ls ~/.airbyte/abctl/data/airbyte-local-pv/workload/output/
```
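
The count comparison can be turned into a pass/fail check. This is a minimal sketch: `compare_counts` is a hypothetical name, it relies on the `airbyte-minio` alias from Step 3, and it assumes `mc ls --recursive` prints exactly one line per object.

```bash
# Hypothetical check: compare the object count in a MinIO bucket with the file
# count in the local directory it was mirrored to.
compare_counts() {
  local bucket="$1" dir="$2"
  local remote local_count

  # Assumes one output line per object from `mc ls --recursive`.
  remote=$(mc ls --recursive "airbyte-minio/$bucket" | wc -l | tr -d ' ')
  local_count=$(find "$dir" -type f | wc -l | tr -d ' ')

  if [ "$remote" -eq "$local_count" ]; then
    echo "OK $bucket: $remote files"
  else
    echo "MISMATCH $bucket: $remote in MinIO, $local_count local" >&2
    return 1
  fi
}
```

For example, `compare_counts airbyte-dev-logs ~/.airbyte/abctl/data/airbyte-local-pv/job-logging` checks the job logs. Matching counts do not prove the contents are identical, so spot-check a few files as well.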

## Rollback Plan

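Copy the backup to the original `~/.airbyte/abctl/data/airbyte-minio-pv` location. When that directory exists, the next `abctl local install` detects it and provisions the MinIO persistent volume instead of the local one.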
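
The restore could be scripted along these lines. This is a sketch under assumptions: `restore_minio_pv` is a hypothetical name, and the backup path is whatever location was chosen when the backup was taken.

```bash
# Hypothetical helper: copy a backup of the MinIO PV back to its original path.
restore_minio_pv() {
  local backup="$1"
  local target="${2:-$HOME/.airbyte/abctl/data/airbyte-minio-pv}"

  [ -d "$backup" ] || { echo "backup not found: $backup" >&2; return 1; }
  [ ! -e "$target" ] || { echo "target already exists: $target" >&2; return 1; }

  # Recreate the parent directory in case it was removed.
  mkdir -p "$(dirname "$target")"
  cp -Rp "$backup" "$target" && echo "restored $target"
}
```

The helper refuses to overwrite an existing target, so a partially migrated directory has to be moved aside first.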

## Notes

- **Credentials**: MinIO username/password are always `minio`/`minio123`
- **Namespace**: Airbyte components are always in `airbyte-abctl` namespace
- **Service Names**: MinIO service is always `airbyte-minio-svc`
- **Data Integrity**: `mc mirror` preserves file structure and content
- **Incremental**: `mc mirror` can be run multiple times safely (only copies new/changed files)

## Validation Checklist

- [ ] MinIO service is accessible
- [ ] Port forward is working
- [ ] mc client connects successfully
- [ ] All three buckets are accessible
- [ ] Local storage directory exists with correct permissions
- [ ] Job logs migrated successfully
- [ ] Workload data migrated successfully
- [ ] State storage migrated (if applicable)
- [ ] File counts match expectations
- [ ] Sample files are readable and contain expected data
6 changes: 6 additions & 0 deletions internal/cmd/local/helm/airbyte_values.go
@@ -16,6 +16,7 @@ type ValuesOpts struct {
 	TelemetryUser   string
 	ImagePullSecret string
 	DisableAuth     bool
+	LocalStorage    bool
 }

 func BuildAirbyteValues(ctx context.Context, opts ValuesOpts) (string, error) {
@@ -28,10 +29,15 @@ func BuildAirbyteValues(ctx context.Context, opts ValuesOpts) (string, error) {
 		"airbyte-bootloader.env_vars.PLATFORM_LOG_FORMAT=json",
 	}

+	if opts.LocalStorage {
+		vals = append(vals, "global.storage.type=local")
+	}
+
 	span.SetAttributes(
 		attribute.Bool("low-resource-mode", opts.LowResourceMode),
 		attribute.Bool("insecure-cookies", opts.InsecureCookies),
 		attribute.Bool("image-pull-secret", opts.ImagePullSecret != ""),
+		attribute.Bool("local-storage", opts.LocalStorage),
 	)

 	if !opts.DisableAuth {
19 changes: 19 additions & 0 deletions internal/cmd/local/local/cmd.go
@@ -3,6 +3,8 @@ package local
 import (
 	"fmt"
 	"net/http"
+	"os"
+	"path/filepath"
 	"time"

 	"github.com/airbytehq/abctl/internal/cmd/local/docker"
@@ -192,3 +194,20 @@ func DefaultK8s(kubecfg, kubectx string) (k8s.Client, error) {

 	return &k8s.DefaultK8sClient{ClientSet: k8sClient}, nil
 }
+
+// SupportMinio checks if a MinIO persistent volume directory exists on the
+// local filesystem. It returns true if the MinIO data directory exists.
+// Otherwise it returns false.
+func SupportMinio() (bool, error) {
+	minioPath := filepath.Join(paths.Data, pvMinio)
+	f, err := os.Stat(minioPath)
+	if err != nil {
+		if os.IsNotExist(err) {
+			return false, nil
+		}
+
+		return false, fmt.Errorf("failed to determine if minio physical volume dir exists: %w", err)
+	}
+
+	return f.IsDir(), nil
+}
25 changes: 21 additions & 4 deletions internal/cmd/local/local/install.go
@@ -38,10 +38,12 @@ import (
 const (
 	// persistent volume constants, these are named to match the values given in the helm chart
 	pvMinio = "airbyte-minio-pv"
+	pvLocal = "airbyte-local-pv"
 	pvPsql  = "airbyte-volume-db"

 	// persistent volume claim constants, these are named to match the values given in the helm chart
 	pvcMinio = "airbyte-minio-pv-claim-airbyte-minio-0"
+	pvcLocal = "airbyte-storage-pvc"
 	pvcPsql  = "airbyte-volume-db-airbyte-db-0"
 )

@@ -53,6 +55,7 @@ type InstallOpts struct {
 	Migrate           bool
 	Hosts             []string
 	ExtraVolumeMounts []k8s.ExtraVolumeMount
+	LocalStorage      bool

 	DockerServer string
 	DockerUser   string
@@ -189,9 +192,16 @@ func (c *Command) Install(ctx context.Context, opts *InstallOpts) error {
 		pterm.Info.Printfln("Namespace '%s' already exists", common.AirbyteNamespace)
 	}

-	if err := c.persistentVolume(ctx, common.AirbyteNamespace, pvMinio); err != nil {
-		return err
+	if opts.LocalStorage {
+		if err := c.persistentVolume(ctx, common.AirbyteNamespace, pvLocal); err != nil {
+			return err
+		}
+	} else {
+		if err := c.persistentVolume(ctx, common.AirbyteNamespace, pvMinio); err != nil {
+			return err
+		}
 	}
+
 	if err := c.persistentVolume(ctx, common.AirbyteNamespace, pvPsql); err != nil {
 		return err
 	}
@@ -205,9 +215,16 @@ func (c *Command) Install(ctx context.Context, opts *InstallOpts) error {
 		}
 	}

-	if err := c.persistentVolumeClaim(ctx, common.AirbyteNamespace, pvcMinio, pvMinio); err != nil {
-		return err
+	if opts.LocalStorage {
+		if err := c.persistentVolumeClaim(ctx, common.AirbyteNamespace, pvcLocal, pvLocal); err != nil {
+			return err
+		}
+	} else {
+		if err := c.persistentVolumeClaim(ctx, common.AirbyteNamespace, pvcMinio, pvMinio); err != nil {
+			return err
+		}
 	}
+
 	if err := c.persistentVolumeClaim(ctx, common.AirbyteNamespace, pvcPsql, pvPsql); err != nil {
 		return err
 	}
@@ -11,3 +11,5 @@ global:
limits:
cpu: "3"
memory: 4Gi
storage:
type: local
11 changes: 11 additions & 0 deletions internal/cmd/local/local_install.go
@@ -51,12 +51,22 @@ func (i *InstallCmd) InstallOpts(ctx context.Context, user string) (*local.Insta
 		}
 	}

+	supportMinio, err := local.SupportMinio()
+	if err != nil {
+		return nil, err
+	}
+
+	if supportMinio {
+		pterm.Warning.Println("Found MinIO physical volume. Consider migrating it to local storage (see project docs)")
+	}
+
 	opts := &local.InstallOpts{
 		HelmChartVersion:  i.ChartVersion,
 		AirbyteChartLoc:   helm.LocateLatestAirbyteChart(i.ChartVersion, i.Chart),
 		Secrets:           i.Secret,
 		Migrate:           i.Migrate,
 		Hosts:             i.Host,
+		LocalStorage:      !supportMinio,
 		ExtraVolumeMounts: extraVolumeMounts,
 		DockerServer:      i.DockerServer,
 		DockerUser:        i.DockerUsername,
@@ -70,6 +80,7 @@ func (i *InstallCmd) InstallOpts(ctx context.Context, user string) (*local.Insta
 		InsecureCookies: i.InsecureCookies,
 		LowResourceMode: i.LowResourceMode,
 		DisableAuth:     i.DisableAuth,
+		LocalStorage:    !supportMinio,
 	}

 	if opts.DockerAuth() {
1 change: 1 addition & 0 deletions internal/cmd/local/local_test.go
@@ -157,6 +157,7 @@ func TestInstallOpts(t *testing.T) {
 	expect := &local.InstallOpts{
 		HelmValuesYaml:  string(b),
 		AirbyteChartLoc: "/test/path/to/chart",
+		LocalStorage:    true,
 	}
 	opts, err := cmd.InstallOpts(context.Background(), "test-user")
 	if err != nil {