An async, Postgres-compatible solution for loading data from the NEAR blockchain. Based on the NEAR Lake Framework.
Indexer For Explorer has some disadvantages that we wanted to fix. That's why we've created smaller, independent mini-indexers:

- `indexer-base` works with basic information about transactions and receipts;
- `indexer-accounts` works with accounts and access keys;
- `indexer-balances` collects the info about native NEAR token balance changes (all the changes are validated);
- `indexer-events` works with events produced by NEPs: FT, NFT (the events need to be validated separately).
- The data model changed a bit, and the naming changed;
- We moved from `diesel` to `sqlx`: we prefer having a lightweight ORM and writing raw SQL queries;
- Separate projects are easier to maintain;
- The main difference is in the future: we are thinking about where to go next if we decide to get rid of Postgres.
Sure! Feel free to use this project as the example.
Your .env file should be in the project root (near-microindexers folder) with the contents:
AWS_SECRET_ACCESS_KEY=...
AWS_ACCESS_KEY_ID=...
DATABASE_URL=...
RPC_URL=...
CHAIN_ID=mainnet
INDEXER_ID=indexer-events-tip
INDEXER_TYPE=indexer-events
START_BLOCK_HEIGHT=30181671
- You need to have your own AWS credentials and pay for the S3 requests (around $20 per month to follow the tip of the network);
- You need to create the DB where the data will be stored; you also need to apply the migrations manually;
- The RPC URL could be found here;
- CHAIN_ID could be `mainnet` or `testnet`;
- INDEXER_ID could be anything; when you restart your app, it will find the last tracked `block_height` by this id;
- INDEXER_TYPE could also be anything, but I suggest choosing one of the folders listed in this repo;
- START_BLOCK_HEIGHT could be any non-negative integer; if the process is restarted, this field is ignored.
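For illustration, here is a minimal sketch of how such configuration could be read at startup using only the Rust standard library. The struct, function names, and error handling are assumptions, not the projects' actual code; the AWS variables are left out because the S3 client typically picks them up from the environment itself:

```rust
/// Hypothetical config struct mirroring the .env variables listed above.
#[derive(Debug)]
struct IndexerConfig {
    database_url: String,
    rpc_url: String,
    chain_id: String,
    indexer_id: String,
    indexer_type: String,
    start_block_height: u64,
}

impl IndexerConfig {
    /// Build the config from any string lookup (e.g. `std::env::var`),
    /// which keeps the parsing testable without touching the process env.
    fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        let req = |key: &str| get(key).ok_or_else(|| format!("{key} is not set"));
        Ok(Self {
            database_url: req("DATABASE_URL")?,
            rpc_url: req("RPC_URL")?,
            chain_id: req("CHAIN_ID")?,
            indexer_id: req("INDEXER_ID")?,
            indexer_type: req("INDEXER_TYPE")?,
            // Any non-negative integer; ignored when the process restarts.
            start_block_height: req("START_BLOCK_HEIGHT")?
                .parse()
                .map_err(|e| format!("START_BLOCK_HEIGHT: {e}"))?,
        })
    }

    fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Failing fast on a missing or malformed variable at startup is cheaper than discovering it mid-run.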
sudo apt install git build-essential pkg-config libssl-dev tmux postgresql-client libpq-dev -y
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
cargo install --version=0.5.13 sqlx-cli --features postgres
ulimit -n 30000
cargo build --release
cargo run --release

Unfortunately, sqlx migrations do not work if you have several projects writing to the same DB. We still use the migrations folder in each project, but we have to apply the changes manually.
We highly recommend using a separate read-only user to access the data. It helps you to avoid unexpected corruption of the indexed data.
We use public schema for all tables.
By default, new users have the possibility to create new tables/views/etc there.
If you want to restrict that, you have to revoke these rights:
REVOKE CREATE ON SCHEMA PUBLIC FROM PUBLIC;
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA PUBLIC FROM PUBLIC;
ALTER DEFAULT PRIVILEGES IN SCHEMA PUBLIC GRANT SELECT ON TABLES TO PUBLIC;

After that, you can create a read-only user in PostgreSQL:
CREATE ROLE readonly;
GRANT USAGE ON SCHEMA public TO readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public to readonly;
-- Set your own limit here, or skip this command
ALTER ROLE readonly SET statement_timeout = '30s';
CREATE USER explorer with login password 'password';
GRANT readonly TO explorer;

$ PGPASSWORD="password" psql -h 127.0.0.1 -U explorer databasename

Please refer to this guide before submitting PRs to this repo.
- `account_changes` has only the absolute value of the balance, while we want to see the delta;
- `account_changes` does not have the involved account_id;
- We'll stop supporting `account_changes` soon.
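To make the delta point concrete: given the absolute balances stored for one account (in order), the deltas are pairwise differences. This is illustrative only, not the indexer's actual code:

```rust
// Derive balance deltas from consecutive absolute balances for a single
// account, ordered by block. i128 is used here so deltas can be negative
// (illustrative; real NEAR balances are 128-bit yoctoNEAR values).
fn deltas(absolute_balances: &[i128]) -> Vec<i128> {
    absolute_balances
        .windows(2) // each adjacent pair (previous, current)
        .map(|pair| pair[1] - pair[0])
        .collect()
}
```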
indexer-balances implementation does non-trivial work with extracting the balance-changing events and storing them in the correct order.
The ordering is taken from the nearcore implementation:
- validators account update
- process transactions
- process receipts
Using Indexer For Explorer terminology, we merge `account_changes` and `action_receipt_actions` by `receipt_id`. These 2 arrays have a natural order.
- If `receipt_id` is stored in both arrays -> merge them into one line in the resulting table.
- If `receipt_id` from `action_receipt_actions` has no pair in `account_changes` -> collect all the possible info from `action_receipt_actions` and put the line in the resulting table.
- If the line in `account_changes` has no `receipt_id`, we need to check whether it changed someone's balance. If the balance was changed -> collect all the possible info from `account_changes` and put the line in the resulting table.
While merging, we can meet situations #2 and #3 at the same point in time, so we need to find the right order for storing such cases. I feel these 2 situations never affect each other, so any order will work fine; I decided to put the `account_changes` data first (just to be consistent).
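The three rules above can be sketched in code. Everything here is simplified and hypothetical: the record and row types are stand-ins (the real tables carry many more columns), and the actual ordering logic lives in indexer-balances:

```rust
// Hypothetical, simplified records standing in for the real tables.
#[derive(Debug, Clone)]
struct ActionReceiptAction {
    receipt_id: String,
}

#[derive(Debug, Clone)]
struct AccountChange {
    receipt_id: Option<String>,
    delta: i128,
}

#[derive(Debug, PartialEq)]
struct BalanceRow {
    receipt_id: Option<String>,
    from_actions: bool,
    from_changes: bool,
}

// Merge the two naturally ordered arrays into one resulting table:
// 1. receipt_id present in both arrays -> one merged row;
// 2. action without a matching account_change -> row from actions only;
// 3. account_change without a receipt_id -> row only if it changed a balance.
// account_changes data is emitted first when cases 2 and 3 meet.
fn merge(actions: &[ActionReceiptAction], changes: &[AccountChange]) -> Vec<BalanceRow> {
    let mut rows = Vec::new();
    // Case 3 first (account_changes data first, to be consistent):
    for c in changes {
        if c.receipt_id.is_none() && c.delta != 0 {
            rows.push(BalanceRow {
                receipt_id: None,
                from_actions: false,
                from_changes: true,
            });
        }
    }
    // Cases 1 and 2, preserving the natural order of action_receipt_actions:
    for a in actions {
        let matched = changes
            .iter()
            .any(|c| c.receipt_id.as_deref() == Some(a.receipt_id.as_str()));
        rows.push(BalanceRow {
            receipt_id: Some(a.receipt_id.clone()),
            from_actions: true,
            from_changes: matched,
        });
    }
    rows
}
```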
`assets__non_fungible_token_events` and `assets__fungible_token_events` do not have a sorting column. In the current solution, we've added an artificial `event_index` column. Moreover, we'll stop supporting `assets__fungible_token_events` soon.
The new `fungible_token_events` table stores the data in the affected/involved account_id format, which simplifies filtering by affected account_id. `fungible_token_events` still does not have an `absolute_value` column, so you have to collect it from the RPC if needed.
Please go and update your contract with our new SDK.
If it's important for you to collect all the previous history as well, you need to contribute and implement your own legacy handler. You can use the existing handlers as an example; `wrap_near` may be a good starting point.
Indexer Accounts is the only indexer that modifies the existing data.
While other indexers are append-only, Indexer Accounts updates the existing records with the deletion info.
accounts table in Indexer For Explorer stored only the first creation and last deletion of the account.
This solution stores all the creations/deletions, so accounts may appear in the table more than once.
We use indexer-balances in production; we use FT part of indexer-events in production as well.
The other pieces are frozen for now, they need to be upgraded and reviewed before any production usage.
indexer-balances requires the NEAR balance prior to the current block in order to calculate and store the delta. Previously, this value was fetched directly from the JSON RPC, but as transaction volume increased, this method became a bottleneck within the application. A more performant approach is to fetch the previously stored balance from the DB, but this also comes with drawbacks. In summary:
- DB: performant, but potentially more error prone, as an incorrect delta propagates to the deltas that follow;
- RPC: less performant, but also less error prone, as the deltas rely on actual on-chain balances.
Additionally, since the DB mode relies on existing data, it cannot be started from an arbitrary block: it requires that all prior blocks have already been indexed. This limitation does not exist for the RPC mode.
The `--balance-mode` flag allows switching between the described methods so that the trade-offs can be managed more effectively.
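As an illustrative sketch only (the enum, the accepted flag values, and the helper function below are assumptions, not the actual implementation), the trade-off could be modeled as:

```rust
// Illustrative model of the two balance sources behind --balance-mode.
#[derive(Debug, Clone, Copy, PartialEq)]
enum BalanceMode {
    // Read the previous balance from the DB: fast, but an incorrect
    // delta propagates to every delta that follows.
    Db,
    // Ask the JSON RPC for the on-chain balance: slower, but each delta
    // is anchored to an actual on-chain value.
    Rpc,
}

impl std::str::FromStr for BalanceMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "db" => Ok(Self::Db),
            "rpc" => Ok(Self::Rpc),
            other => Err(format!("unknown balance mode: {other}")),
        }
    }
}

// DB mode cannot start from an arbitrary block: it needs all prior
// blocks to be indexed already. RPC mode has no such limitation.
fn can_start_from_arbitrary_block(mode: BalanceMode) -> bool {
    mode == BalanceMode::Rpc
}
```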