1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -9,3 +9,4 @@ draft/
.terraform.plan
.github/workflows/.artifacts/
.vercel
indexer/indexer
170 changes: 17 additions & 153 deletions README.md
@@ -1,176 +1,40 @@
# Kadena Indexer
# Kadindexer - Kadena Indexer

This project is a monorepo that contains the following packages:

- `@kadena-indexer/indexer`: The indexer package, which is responsible for scanning and storing blocks for the Kadena blockchain.
- `@kadena-indexer/terraform`: The Terraform configuration for provisioning the infrastructure required to run the indexer and the node.
- [`@kadena-indexer/indexer`](indexer/README.md): The indexer package, which is responsible for scanning and storing blocks for the Kadena blockchain.
- [`@kadena-indexer/terraform`](terraform/README.md): The Terraform configuration for provisioning the infrastructure required to run the indexer and the node.
- [`@kadena-indexer/backfill`](backfill/README.md): The backfill package, which is responsible for backfilling the indexer data.

## Prerequisites
## Requirements

- [Terraform](https://www.terraform.io/downloads.html)
- [AWS CLI](https://aws.amazon.com/cli/)
- [AWS Account](https://aws.amazon.com/)
- [AWS Access Key](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys)

### Dev Container

This project is configured to run in a dev container. You can use the `Dev Containers: Open Folder in Container` command in VSCode to open the project in a dev container. This will automatically install the required dependencies and set up the environment. To use the dev container, you need to have Docker installed on your machine.

If you don't have Dev Containers installed, you can install it from the [VSCode Marketplace](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers).

### Configure Environment Variables

Under the `/terraform` directory, create an `.env` file using the `.env.template` as a reference and set the environment variables accordingly.

```bash
cp terraform/.env.template terraform/.env
```

`TF_VAR_AWS_ACCESS_KEY_ID` is your AWS access key ID.
`TF_VAR_AWS_SECRET_ACCESS_KEY` is your AWS secret access key.
`TF_VAR_AWS_ACCOUNT_ID` is your AWS account ID.
`TF_VAR_AWS_USER_NAME` is the name of the user you created in AWS.
`TF_VAR_AWS_DB_USERNAME` is the username for the PostgreSQL database.
`TF_VAR_AWS_DB_PASSWORD` is the password for the PostgreSQL database.

Under the `/indexer` directory, create an `.env` file using the `.env.template` as a reference and set the environment variables accordingly.

```bash
cp indexer/.env.template indexer/.env
```

`AWS_S3_REGION` is the region where the S3 bucket is located.
`AWS_S3_BUCKET_NAME` is the name of the S3 bucket where the data will be stored.
`AWS_ACCESS_KEY_ID` is the access key ID for the S3 bucket.
`AWS_SECRET_ACCESS_KEY` is the secret access key for the S3 bucket.

`SYNC_BASE_URL` is the base URL for the Kadena node.
`SYNC_MIN_HEIGHT` is the minimum height to start syncing from.
`SYNC_FETCH_INTERVAL_IN_BLOCKS` is the interval in blocks to fetch.
`SYNC_TIME_BETWEEN_REQUESTS_IN_MS` is the time between requests in milliseconds.
`SYNC_ATTEMPTS_MAX_RETRY` is the maximum number of attempts to retry.
`SYNC_ATTEMPTS_INTERVAL_IN_MS` is the interval in milliseconds between attempts.
`SYNC_NETWORK` is the network to sync.

`DB_USERNAME` is the username for the PostgreSQL database.
`DB_PASSWORD` is the password for the PostgreSQL database.
`DB_NAME` is the name of the PostgreSQL database.
`DB_HOST` is the host for the PostgreSQL database. The host is only known after the resources are created; look it up in the AWS console or in the Terraform output (`postgres_db_host`).

### Initialize Terraform

Initialize your Terraform workspace, which will download the provider and initialize it with the values provided in the `terraform.tfvars` file.

```bash
terraform init
```

### Deploy Infrastructure

Plan and apply the Terraform configuration to provision your AWS resources:

```bash
yarn terraform plan
yarn terraform apply
```

### Destroy Infrastructure

If you want to destroy the infrastructure created, you can use the following command:

```bash
yarn terraform destroy
```
- Install dependencies
- See individual package READMEs for specific prerequisites

## Installation

Set up the indexer with the following commands:
Install dependencies with the following command:

```bash
yarn && yarn indexer build
yarn install
```

## Features
## Quick Start

### Run processing
This is the quickest way to get the indexer running.

Continuously processes streaming, headers, payloads, and missing blocks from the node to the S3 bucket and from the S3 bucket to the database.
Install [Docker](https://www.docker.com/).

```bash
yarn indexer dev:run
```

## Additional Commands

### Running with Docker
Fill in the `.env` file in the `indexer` folder. See the [Environment Variables Reference](indexer/README.md#32-environment-variables-reference).

```bash
sudo docker build -t kadena-indexer:latest .
sudo docker run --env-file ./indexer/.env -p 3000:3000 kadena-indexer:latest
```

### Backfilling Blocks

Scan for and store historical blocks.

```bash
yarn indexer dev:backfill
```

### Streaming Blocks

Listen for new blocks and store them in real-time.

```bash
yarn indexer dev:streaming
```

### Identifying Missing Blocks

Scan for and store any blocks that were missed.

```bash
yarn indexer dev:missing
```

### Processing Headers

Start the header processing from S3 to the database.

```bash
yarn indexer dev:headers
```

### Processing Payloads

Start the payload processing from S3 to the database.

```bash
yarn indexer dev:payloads
```

## Advanced Usage

### Local Workflow Testing

For testing workflows locally, act is required. Install it using Homebrew:

```bash
brew install act
cp indexer/.env.template indexer/.env
```

### Run Terraform Workflow Manually

If you want to run the terraform workflow manually, you can use the following command:

To start all services:
```bash
yarn run-terraform-workflow
yarn indexer dev
```

### Run Indexer Workflow Manually

If you want to run the indexer workflow manually, you can use the following command:
**NOTE:** Using the image with the composer requires the database `DB_USERNAME` to default to `postgres`.

```bash
yarn run-indexer-workflow
```
15 changes: 15 additions & 0 deletions backfill/.env.template
@@ -0,0 +1,15 @@
CERT_PATH=./global-bundle.pem
SYNC_BASE_URL=https://api.chainweb.com/chainweb/0.0

CHAIN_ID=0
NETWORK=mainnet01
SYNC_MIN_HEIGHT=5370495
SYNC_FETCH_INTERVAL_IN_BLOCKS=100
SYNC_ATTEMPTS_MAX_RETRY=5
SYNC_ATTEMPTS_INTERVAL_IN_MS=500

DB_USERNAME=postgres
DB_PASSWORD=password
DB_NAME=indexer
DB_HOST=localhost
DB_PORT=5432
99 changes: 99 additions & 0 deletions backfill/README.md
@@ -0,0 +1,99 @@
# Kadena Indexer Backfill

## 1. Introduction

The Kadindexer Backfill is a utility tool designed to synchronize historical blockchain data from the Kadena network into your local database. It allows you to fetch and index past blocks and transactions, ensuring your database has a complete history of the chain. The backfill process can be configured to sync data from any specified block height, making it useful for both initial data population and recovery scenarios where data needs to be resynced from a particular point.
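
As an illustration of what syncing from a specified height in fixed-size batches could look like, here is a minimal Go sketch. It assumes the backfill walks upward from `SYNC_MIN_HEIGHT` in windows of `SYNC_FETCH_INTERVAL_IN_BLOCKS` blocks; the actual traversal order is not stated in this README, and the names `heightRange` and `splitHeights` are hypothetical rather than taken from the backfill source.

```go
package main

import "fmt"

// heightRange is an inclusive range of block heights fetched in one batch.
// Hypothetical type: it does not come from the backfill package.
type heightRange struct {
	Min, Max int
}

// splitHeights cuts [minHeight, maxHeight] into consecutive windows of at
// most `interval` blocks. It returns nil for an empty or invalid range.
func splitHeights(minHeight, maxHeight, interval int) []heightRange {
	if interval <= 0 || minHeight > maxHeight {
		return nil
	}
	var ranges []heightRange
	for lo := minHeight; lo <= maxHeight; lo += interval {
		hi := lo + interval - 1
		if hi > maxHeight {
			hi = maxHeight // the last window may be shorter
		}
		ranges = append(ranges, heightRange{Min: lo, Max: hi})
	}
	return ranges
}

func main() {
	// 250 blocks starting at the example SYNC_MIN_HEIGHT, 100 blocks per batch.
	for _, r := range splitHeights(5370495, 5370744, 100) {
		fmt.Printf("fetch heights %d..%d\n", r.Min, r.Max)
	}
}
```

Keeping the window computation separate from the network calls makes it easy to resume from a given height after a failure, which matches the recovery scenario described above.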

## 2. Prerequisites

- [Docker](https://www.docker.com/)
- Kadena Indexer PostgreSQL database running
- Network access to the Kadena network
- Running your own Kadena node

## 3. Setup

### 3.1. Starting Docker
Start Docker Desktop from the command line or via the macOS application.

```bash
# MacOS - Start Docker Desktop from command line
open -a Docker

# Linux - Start Docker daemon
sudo systemctl start docker
```

### 3.2. Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `CERT_PATH` | Path to SSL certificate bundle | `./global-bundle.pem` |
| `SYNC_BASE_URL` | Base URL for the Chainweb API | `https://api.chainweb.com/chainweb/0.0` |
| `CHAIN_ID` | ID of the chain to backfill | `0` |
| `NETWORK` | Kadena network to sync from | `mainnet01` |
| `SYNC_MIN_HEIGHT` | Starting block height for backfill | `5370495` |
| `SYNC_FETCH_INTERVAL_IN_BLOCKS` | Number of blocks to fetch in each interval | `100` |
| `SYNC_ATTEMPTS_MAX_RETRY` | Maximum number of retry attempts | `5` |
| `SYNC_ATTEMPTS_INTERVAL_IN_MS` | Interval between retry attempts in milliseconds | `500` |
| `DB_USERNAME` | PostgreSQL database username | `postgres` |
| `DB_PASSWORD` | PostgreSQL database password | `password` |
| `DB_NAME` | Name of the database | `indexer` |
| `DB_HOST` | Database host address | `localhost` |
| `DB_PORT` | Database port number | `5432` |

**NOTE:** The example Chainweb API URL will not work for indexing purposes. You need to run your own Kadena node and set `NODE_API_URL` to your node's API URL.
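
To make the table above concrete, the following Go sketch reads a subset of these variables, fails on the first missing or malformed one, and renders a libpq-style connection string. It is an illustration only: `backfillConfig`, `loadConfig`, and `dsn` are hypothetical names, and the backfill package's real loader (`InitEnv` in `backfill/config/env.go`) may validate and assemble its configuration differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// backfillConfig mirrors a subset of the variables in the table.
// Hypothetical struct, not the backfill package's actual Config type.
type backfillConfig struct {
	SyncBaseURL string
	ChainID     int
	MinHeight   int
	DbUser      string
	DbPassword  string
	DbName      string
	DbHost      string
	DbPort      int
}

// loadConfig reads variables through lookup (os.Getenv in normal use) and
// reports the first missing or non-integer value it encounters.
func loadConfig(lookup func(string) string) (backfillConfig, error) {
	var cfg backfillConfig
	var err error
	required := func(key string) string {
		v := lookup(key)
		if v == "" && err == nil {
			err = fmt.Errorf("missing required environment variable %s", key)
		}
		return v
	}
	integer := func(key string) int {
		raw := required(key)
		if raw == "" {
			return 0
		}
		n, convErr := strconv.Atoi(raw)
		if convErr != nil && err == nil {
			err = fmt.Errorf("%s must be an integer: %w", key, convErr)
		}
		return n
	}
	cfg.SyncBaseURL = required("SYNC_BASE_URL")
	cfg.ChainID = integer("CHAIN_ID")
	cfg.MinHeight = integer("SYNC_MIN_HEIGHT")
	cfg.DbUser = required("DB_USERNAME")
	cfg.DbPassword = required("DB_PASSWORD")
	cfg.DbName = required("DB_NAME")
	cfg.DbHost = required("DB_HOST")
	cfg.DbPort = integer("DB_PORT")
	return cfg, err
}

// dsn renders a keyword/value PostgreSQL connection string.
// Values containing spaces or quotes would need escaping, omitted here.
func (c backfillConfig) dsn() string {
	return fmt.Sprintf("host=%s port=%d user=%s password=%s dbname=%s",
		c.DbHost, c.DbPort, c.DbUser, c.DbPassword, c.DbName)
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("chain %d from height %d via %s (db %s on %s:%d)\n",
		cfg.ChainID, cfg.MinHeight, cfg.SyncBaseURL, cfg.DbName, cfg.DbHost, cfg.DbPort)
}
```

Passing the lookup function in as a parameter keeps the loader testable without modifying the process environment.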

## 4. Usage

### 4.1. Start the Kadindexer services

Please refer to the [Kadena Indexer README](../indexer/README.md) for instructions on how to start the Kadindexer services.

### 4.2. Build the backfill image

Build the image:
```bash
docker build -t chainbychain -f Dockerfile .
```

### 4.3. Run the container

#### Dockerfile (Chain by Chain)
This Dockerfile is designed to run the backfill process for a single chain at a time. It's useful when you need to:
- Sync data for a specific chain ID
- Have more granular control over the backfill process
- Debug issues with a particular chain
- Manage resources more efficiently

#### Dockerfile.indexes
This Dockerfile is specifically for recreating database indexes. Use this when you need to:
- Rebuild corrupted indexes
- Optimize existing indexes
- Add new indexes to improve query performance
- Perform database maintenance

#### Dockerfile.middle-backfill
This Dockerfile orchestrates the backfill process across all chains simultaneously. It's beneficial when you want to:
- Perform a complete system backfill
- Sync data for all chains in parallel
- Save time by running multiple chain syncs concurrently
- Ensure consistency across all chains

For single chain backfill:
```bash
docker build -t chainbychain -f Dockerfile .
docker run --rm --name chainbychain --env-file .env chainbychain
```

For rebuilding indexes:
```bash
docker build -t rebuild-indexes -f Dockerfile.indexes .
docker run --rm --name rebuild-indexes --env-file .env rebuild-indexes
```

For all chains backfill:
```bash
docker build -t all-chains -f Dockerfile.middle-backfill .
docker run --rm --name all-chains --env-file .env all-chains
```
2 changes: 1 addition & 1 deletion backfill/config/env.go
@@ -35,7 +35,7 @@ func InitEnv(envFilePath string) {
}

config = &Config{
DbUser: getEnv("DB_USER"),
DbUser: getEnv("DB_USERNAME"),
DbPassword: getEnv("DB_PASSWORD"),
DbName: getEnv("DB_NAME"),
DbHost: getEnv("DB_HOST"),
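
The hunk above renames the variable read into `DbUser` from `DB_USER` to `DB_USERNAME`, so an existing `.env` file that still defines only `DB_USER` would no longer supply a value. One possible way to ease that transition is sketched below. It is not part of this PR: `firstEnv` is a hypothetical helper, and the body of the real `getEnv` is not shown in the diff.

```go
package main

import (
	"fmt"
	"os"
)

// firstEnv returns the value of the first key that is set to a non-empty
// value, together with the key that matched. Hypothetical helper.
func firstEnv(lookup func(string) (string, bool), keys ...string) (value, key string, ok bool) {
	for _, k := range keys {
		if v, found := lookup(k); found && v != "" {
			return v, k, true
		}
	}
	return "", "", false
}

func main() {
	// Prefer the new name, fall back to the old one.
	user, key, ok := firstEnv(os.LookupEnv, "DB_USERNAME", "DB_USER")
	if !ok {
		fmt.Println("neither DB_USERNAME nor DB_USER is set")
		return
	}
	if key == "DB_USER" {
		fmt.Println("warning: DB_USER was renamed to DB_USERNAME")
	}
	fmt.Println("database user:", user)
}
```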
10 changes: 5 additions & 5 deletions indexer/.env.template
@@ -7,8 +7,8 @@ SYNC_NETWORK="mainnet01"
KADENA_GRAPHQL_API_URL=localhost
KADENA_GRAPHQL_API_PORT=3001

DB_USERNAME="postgres"
DB_PASSWORD="YOUR_DB_PASSWORD"
DB_NAME="indexer"
DB_SSL_ENABLED=false
DB_HOST="YOUR_DB_HOST"
DB_USERNAME=postgres
DB_PASSWORD=password
DB_NAME=indexer
DB_HOST="YOUR_DB_HOST"
DB_SSL_ENABLED=false