Technical Deep Dive: What Are Blockchain Indexers? How To Build One with Envio
Ever notice how some dApps feel instant - showing your transaction history, filtering by token, displaying trading volume - while others hang for seconds or fail entirely?
The difference is indexing.
If you're building onchain, you'll eventually hit this wall: your smart contract works perfectly, events are firing, but your frontend can't display the data users actually need. You try querying the blockchain directly and quickly realize it's painfully slow, hits rate limits, and can't handle basic operations like filtering or sorting.
That's because blockchains aren't databases. They're not built for queries. They're append-only ledgers optimized for consensus and immutability, not for answering questions like "show me all active loans" or "what's the total trading volume."
To build a real dApp with a responsive frontend, you need something in between: an indexer.
But what actually is indexing? Why can't we just query the blockchain directly via an RPC? What role do WebSockets play in keeping data real-time? And when you're ready to build one, how do you choose between options like Envio and The Graph?
Table of Contents
- The Use Case: Real-Time Intent Display for a P2P Lending Protocol
- The Problem: Blockchains Aren't Databases
- What Is a Blockchain Indexer?
- The Graph vs Envio
- Building the Indexer
- The Full Stack
- Refactoring: When EventHandlers.ts Hits 2000+ Lines
- Real-Time Updates: WebSockets
- The Gotcha: No Aggregates on Hosted Envio
- Why Compute Status Early?
- When You Need an Indexer
- Takeaways
The Use Case: Real-Time Intent Display for a P2P Lending Protocol
To understand the indexing problem better, let's look at a real use case. At Floe Labs, we're building an AI-native P2P intent-based lending protocol.
Here's how it works:
Lenders post lending intents with their terms: "I'll lend 1000 USDC at 5% APY for 30 days, max 80% LTV." Borrowers post borrowing intents with their needs: "I need 800 USDC, offering ETH as collateral, willing to pay 6% APY for 30 days." A matcher algorithm (could be anyone) pairs compatible intents and creates loans onchain.
The Players
The protocol has three types of actors that all need real-time data:
- Web app - Users browse intents, post new ones, monitor their loans
- Solver bot - Watches for new intents, finds matches, executes them onchain for a fee
- Liquidation bot - Monitors active loans, liquidates any that become undercollateralized
All three need the same thing: fast, real-time access to protocol state. That's what the indexer provides.
When a lender posts an intent, they call the registerLendIntent() function in our smart contract:
// In LendingIntentMatcherUpgradeable.sol
function registerLendIntent(LendIntent calldata intent) external {
// Validates intent parameters (amount, rate, duration, expiry)
// Stores intent onchain
// Emits event for tracking
}
The contract emits an event when an intent is posted:
event LogLenderOfferPosted(
address indexed lender,
bytes32 indexed marketId,
bytes32 offerHash
);
The Challenge: Real-Time Display
Here's the problem: How do you display all active lending intents in the frontend in real-time?
You need to show users:
- All open lending intents for a specific market
- Filter by interest rate, amount, duration
- Sort by best rates
- Update instantly when new intents are posted
- Show which intents are matched, cancelled, or expired
The Problem: Blockchains Aren't Databases
The naive approach? Query the RPC for events, then fetch each intent's full details:
// Step 1: Fetch all LogLenderOfferPosted events
const lenderOfferEvents = await client.getLogs({
address: LENDING_CONTRACT_ADDRESS,
event: parseAbiItem('event LogLenderOfferPosted(address indexed lender, bytes32 indexed marketId, bytes32 offerHash)'),
fromBlock: 40499040n,
toBlock: 'latest'
});
// Step 2: For each event, fetch the full intent details
const lendIntents = await Promise.all(
lenderOfferEvents.map(async (event) => {
// Each intent requires another RPC call to get full data
const intentDetails = await client.readContract({
address: LENDING_CONTRACT_ADDRESS,
abi: contractABI,
functionName: 'getOnChainLendIntent',
args: [event.args.offerHash]
});
return {
offerHash: event.args.offerHash,
lender: event.args.lender,
marketId: event.args.marketId,
amount: intentDetails.amount,
minInterestRateBps: intentDetails.minInterestRateBps,
duration: intentDetails.duration,
expiry: intentDetails.expiry,
// ... more fields
};
})
);
// Step 3: Filter and sort client-side
const activeIntents = lendIntents
.filter(intent => intent.expiry > Date.now() / 1000)
.sort((a, b) => Number(a.minInterestRateBps - b.minInterestRateBps));
This works for a few hundred events. But the problems stack up fast:
- Slow - Scanning thousands of blocks takes minutes, not seconds
- Expensive - Each intent requires an additional RPC call. 1000 intents = 1000+ RPC calls. Rate limits kill you.
- No real-time updates - You'd need to poll constantly to show new intents
- Client-side filtering - Want to filter by market or sort by rate? You're loading and processing everything in the browser.
- Can't aggregate - "How many open intents are there?" requires loading all data to count.
Blockchains aren't databases. They're append-only ledgers optimized for consensus, not for answering queries like "show me all active USDC lending intents sorted by interest rate."
There's a better solution.
What Is a Blockchain Indexer?
An indexer is a service that:
- Listens to blockchain events in real-time
- Stores them in a queryable database (usually PostgreSQL)
- Exposes an API (usually GraphQL) for your frontend
Quick Glossary
Before we dive deeper, let's clarify these terms:
PostgreSQL is a relational database. Think of it like a giant Excel spreadsheet that can hold millions of rows and lets you ask questions like "show me all rows where status = 'active' sorted by date." It's where your indexed data actually lives.
GraphQL is a query language for APIs. Instead of hitting different URLs for different data (REST-style), you send a single query describing exactly what you want. The server returns just that data, nothing more. It's how your frontend talks to the indexer.
API (Application Programming Interface) is just a way for two programs to talk to each other. When I say "fast API," I mean a way to request data that responds quickly, typically in milliseconds instead of seconds.
The Google Analogy
Think of it like building a search index. Google doesn't re-crawl the entire internet every time you search. It maintains an index of web pages, organized for fast lookup. Blockchain indexers do the same thing for on-chain data.
Instead of querying the chain directly, your frontend queries the indexer. Fast, filterable, and reliable.
The Graph vs Envio
I've used The Graph before. It's battle-tested, decentralized, and supports 70+ networks including Base (both mainnet and Sepolia). You write subgraphs in AssemblyScript.
I learned about Envio at an Encode Club hackathon. What caught my attention: it's TypeScript, not AssemblyScript. Auto-generates boilerplate from your contract. And it's fast - they benchmarked indexing Uniswap V2 in 1 minute vs 15 minutes for alternatives.
| Feature | The Graph | Envio |
|---|---|---|
| Networks | 70+ (Base, Base Sepolia, Ethereum, Polygon, etc.) | 200+ EVM chains (Base, Base Sepolia, etc.) + Fuel + Solana* |
| Language | AssemblyScript | TypeScript |
| Multi-chain | Separate subgraphs per chain | Native multi-chain in one indexer |
| Decentralization | Decentralized network available | Hosted service only |
| Setup | Manual schema + mappings | Auto-generates from contract ABI |
| Speed | Standard RPC-based indexing | HyperSync (10-100x faster for EVM) |
*Solana support is experimental and RPC-only (no HyperSync yet)
I went with Envio for Floe. TypeScript instead of AssemblyScript meant I could move faster. The auto-generation from the contract ABI saved setup time. And the HyperSync speed made a difference when I needed to re-index during development.
Building the Indexer
For Floe, I needed to index:
- Intents - lender offers, borrower requests
- Loans - when intents get matched, repaid, liquidated
- Markets - token pairs and their config
Step 1: Init
Envio's CLI generates everything from your contract address:
npx envio init
# Pick your network, paste contract address
# It pulls the ABI and generates the boilerplate
You get three files:
- config.yaml - events to listen for
- schema.graphql - your database schema
- src/EventHandlers.ts - where your code goes
Step 2: Pick Your Events
In config.yaml, tell Envio which events to track:
name: floe-indexer
networks:
  - id: 8453 # Base Mainnet
    start_block: 40499040
    contracts:
      - name: LendingIntentMatcher
        address: 0xYourContractAddress
        handler: src/EventHandlers.ts
        events:
          - event: LogLenderOfferPosted(address indexed lender, bytes32 indexed marketId, bytes32 offerHash)
          - event: LogBorrowerOfferPosted(address indexed borrower, bytes32 indexed marketId, bytes32 offerHash)
          - event: LogIntentsMatched(address indexed lender, address indexed borrower, address indexed matcher, bytes32 marketId, uint256 loanId)
          - event: LogLoanRepayment(uint256 indexed loanId, uint256 totalRepaid, uint256 protocolFee, uint256 collateralReturned)
          # ... more events
Step 3: Schema
Events don't give you everything. LogLenderOfferPosted only has the lender address, market ID, and offer hash. You still need amount, interest rate, duration, expiry.
The schema defines what you'll store:
type LenderIntent {
id: ID!
offerHash: String!
lender: String!
marketId: String!
# Full intent details (fetched via RPC)
amount: BigInt!
minInterestRateBps: BigInt!
maxLtvBps: BigInt!
duration: BigInt!
expiry: BigInt!
# Computed at index time
status: String! # 'open' | 'partial' | 'matched' | 'cancelled' | 'expired'
# Metadata
blockTimestamp: BigInt!
transactionHash: String!
}
Step 4: Write Handlers
When LogLenderOfferPosted fires:
- Fetch the full intent from the contract (you only have the hash)
- Figure out the status (open, matched, cancelled, expired)
- Save it
LendingIntentMatcher.LogLenderOfferPosted.handler(async ({ event, context }) => {
const offerHash = event.params.offerHash;
// Fetch FULL intent details using Envio's Effect API
const [intentData, isUsed] = await Promise.all([
context.effect(getLendIntentDetails, offerHash),
context.effect(checkIntentUsed, offerHash),
]);
// Compute status at index time
const status = computeIntentStatus(
true, // isRegistered
isUsed,
intentData.expiry,
BigInt(event.block.timestamp)
);
// Store the complete entity
context.LenderIntent.set({
id: `${event.chainId}_${offerHash}`,
offerHash: offerHash,
lender: intentData.lender,
marketId: event.params.marketId,
amount: intentData.amount,
minInterestRateBps: intentData.minInterestRateBps,
maxLtvBps: intentData.maxLtvBps,
duration: intentData.duration,
expiry: intentData.expiry,
status: status,
blockTimestamp: BigInt(event.block.timestamp),
transactionHash: event.transaction.hash,
});
});
Fetching Full Data: The Effect API
The event only has the hash. To get the full intent (amount, rate, duration, etc.), you need to call the contract.
Envio's Effect API handles this. It caches results, rate limits your RPC calls, and retries on failures. Without it, you'd hammer your RPC and probably get rate limited.
export const getLendIntentDetails = createEffect(
{
name: "getLendIntentDetails",
input: S.string, // offerHash
output: { /* full intent schema */ },
cache: true, // Persist across indexer runs
rateLimit: { calls: 3, per: "second" }
},
async ({ input, context }) => {
// Fetch from contract with retry + fallback RPC
return withRetryAndFallback(
() => primaryClient.read.getOnChainLendIntent([input]),
() => fallbackClient?.read.getOnChainLendIntent([input]),
"getLendIntent"
);
}
);
The cache: true flag is important - it means if your indexer restarts, you don't re-fetch everything.
The Full Stack
Here's what's running:
- Base Mainnet - where the contract lives
- Envio HyperIndex - listens to events, fetches full data, writes to PostgreSQL
- PostgreSQL - stores everything
- Hasura - generates a GraphQL API on top of PostgreSQL
- Frontend - queries via GraphQL
The frontend never talks to the blockchain for reads. Just GraphQL:
query GetOpenIntents($marketId: String!) {
LenderIntent(
where: { marketId: { _eq: $marketId }, status: { _eq: "open" } }
order_by: { minInterestRateBps: asc }
) {
id
lender
amount
minInterestRateBps
duration
expiry
}
}
Fast. Filtered. Sorted. No RPC calls.
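As a concrete example, here's roughly how the web app could run that query with a plain fetch against the local Hasura endpoint. This is a sketch, not Floe's actual client code; the URL and admin secret are the local docker-compose defaults shown later.
// Sketch: run GetOpenIntents against the local Hasura GraphQL endpoint.
// Endpoint and admin secret match the local docker-compose defaults; adjust for your deployment.
const GET_OPEN_INTENTS = `
  query GetOpenIntents($marketId: String!) {
    LenderIntent(
      where: { marketId: { _eq: $marketId }, status: { _eq: "open" } }
      order_by: { minInterestRateBps: asc }
    ) { id lender amount minInterestRateBps duration expiry }
  }
`;

async function fetchOpenIntents(marketId: string) {
  const res = await fetch("http://localhost:8090/v1/graphql", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-hasura-admin-secret": "testing",
    },
    body: JSON.stringify({ query: GET_OPEN_INTENTS, variables: { marketId } }),
  });
  const { data, errors } = await res.json();
  if (errors) throw new Error(JSON.stringify(errors));
  return data.LenderIntent;
}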
Refactoring: When EventHandlers.ts Hits 2000+ Lines
Envio generates a single EventHandlers.ts file for all your event handlers. Works great at first. Then you add more events. More handlers. More logic.
Mine hit 2000+ lines. Scrolling through a massive file to find the right handler got old fast, and every new feature meant hunting through hundreds of lines for the right place to add code.
I broke it down:
src/
├── handlers/
│ ├── intentHandlers.ts # Lender/borrower intent events
│ ├── loanHandlers.ts # Loan creation, repayment, liquidation
│ ├── marketHandlers.ts # Market config updates
│ ├── matchHandlers.ts # Intent matching logic
│ └── partialFillHandlers.ts # Partial fill tracking
├── effects/
│ ├── intentEffects.ts # RPC calls for intent data
│ ├── loanEffects.ts # RPC calls for loan data
│ └── marketEffects.ts # RPC calls for market data
├── utils/
│ ├── statusHelpers.ts # Status computation logic
│ ├── statsHelpers.ts # Stats update helpers
│ └── rpc.ts # RPC retry logic
├── contracts/
│ ├── abi.ts # Contract ABIs
│ └── clients.ts # Viem clients
└── EventHandlers.ts # Main file imports and registers handlers
The pattern:
- Handlers by domain (intents, loans, markets)
- Effects grouped by what they fetch
- Utils for shared logic (status computation, stats updates)
- Contracts for ABIs and RPC clients
- EventHandlers.ts becomes a thin import/export layer
Now when I need to update loan repayment logic, I open loanHandlers.ts. When I need to change how intent status is computed, I open statusHelpers.ts. Each file is 100-300 lines. Readable. Testable. Maintainable.
Worth doing once your main file crosses ~500 lines.
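For reference, the slimmed-down EventHandlers.ts ends up as little more than imports. A sketch, assuming each handler module registers its handlers as a side effect of being imported (which is what the LendingIntentMatcher.<Event>.handler(...) calls do):
// src/EventHandlers.ts - thin entry point (illustrative sketch based on the layout above)
// Each module calls LendingIntentMatcher.<Event>.handler(...) at import time,
// so importing it is enough to register the handlers.
import "./handlers/intentHandlers";
import "./handlers/loanHandlers";
import "./handlers/marketHandlers";
import "./handlers/matchHandlers";
import "./handlers/partialFillHandlers";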
Real-Time Updates: WebSockets
The indexer stores data in PostgreSQL. But how do you show real-time updates in the frontend? How do bots know when new intents are posted?
WebSockets. Specifically, Hasura's built-in WebSocket support for GraphQL subscriptions.
How It Works
The WebSocket server is part of the indexer infrastructure, not something you build separately. When you set up Envio, you get three services running together:
- Envio HyperIndex (processes blockchain events)
- PostgreSQL (stores indexed data)
- Hasura GraphQL Engine (provides WebSocket server automatically)
All three run via the docker-compose.yaml that Envio generates in your indexer directory.
The flow:
- Smart contract emits event
- Envio processes and writes to PostgreSQL
- Hasura detects the database change
- Hasura broadcasts update via WebSocket to all subscribers
- Clients receive the update (web app, solver bot, liquidation bot)
No polling. No delay. Real-time.
You don't write WebSocket server code. Hasura does it for you.
The Setup
The indexer's docker-compose.yaml (generated by Envio) runs three services:
services:
  envio-postgres:
    image: postgres:17.5
    ports:
      - "5433:5432"
  graphql-engine:
    image: hasura/graphql-engine:v2.43.0
    ports:
      - "8090:8080"
    depends_on:
      - envio-postgres
    environment:
      HASURA_GRAPHQL_DATABASE_URL: postgres://postgres:testing@envio-postgres:5432/envio-dev
      HASURA_GRAPHQL_ENABLED_LOG_TYPES: startup, http-log, websocket-log, query-log
      HASURA_GRAPHQL_ADMIN_SECRET: testing
      HASURA_GRAPHQL_UNAUTHORIZED_ROLE: public
When you run envio dev, all three services start:
- Envio indexes events and writes to PostgreSQL
- PostgreSQL stores the indexed data (port 5433)
- Hasura exposes GraphQL + WebSocket server (port 8090)
Port 8090 serves both HTTP (for queries/mutations) and WebSocket (for subscriptions). Same endpoint, different protocol (ws:// instead of http://).
The WebSocket server is built into Hasura. You don't have to write any server code.
Web App Subscriptions
The frontend uses graphql-ws to subscribe to intent and loan updates.
Intent subscription:
subscription OnNewLenderIntent($marketId: String) {
LenderIntent(
where: {
marketId: { _eq: $marketId }
status: { _in: ["open", "partial"] }
}
order_by: { createdAt: desc }
limit: 10
) {
offerHash
lender
amount
status
filledAmount
remainingAmount
}
}
When a new lender intent is posted, the frontend receives it instantly. No refresh needed.
Loan subscription:
subscription OnLoanStatusChange($loanId: numeric!) {
Loan(where: { loanId: { _eq: $loanId } }) {
loanId
status
currentPrincipal
repaid
liquidated
}
}
When a loan is repaid or liquidated, the UI updates immediately.
Client setup (apps/web/src/graphql/client.ts):
import { createClient } from 'graphql-ws';
import { buildAuthHeaders } from '@floe/shared';
// Configuration via env vars
const ENVIO_HTTP_URL = process.env.NEXT_PUBLIC_ENVIO_ENDPOINT
|| 'http://localhost:8090/v1/graphql';
const ENVIO_WS_URL = process.env.NEXT_PUBLIC_ENVIO_WS_URL
|| ENVIO_HTTP_URL.replace(/^http/, 'ws');
const HASURA_ADMIN_SECRET = process.env.NEXT_PUBLIC_HASURA_ADMIN_SECRET
|| 'testing';
const wsClient = createClient({
url: ENVIO_WS_URL,
connectionParams: () => ({
headers: buildAuthHeaders({
endpoint: ENVIO_HTTP_URL,
adminSecret: HASURA_ADMIN_SECRET,
})
}),
retryAttempts: 5,
keepAlive: 30000 // 30s heartbeat
});
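With the client in place, subscribing is a matter of passing the operation and a set of callbacks to wsClient.subscribe. A sketch, where onIntents and marketId are illustrative placeholders for the real UI wiring:
// Sketch: consume the OnNewLenderIntent subscription with the wsClient defined above.
// onIntents and marketId are placeholders, not Floe's actual component code.
declare function onIntents(intents: unknown): void;
const marketId = "0x..."; // the market the user is currently viewing

const dispose = wsClient.subscribe(
  {
    query: `subscription OnNewLenderIntent($marketId: String) {
      LenderIntent(
        where: { marketId: { _eq: $marketId }, status: { _in: ["open", "partial"] } }
        order_by: { createdAt: desc }
        limit: 10
      ) { offerHash lender amount status filledAmount remainingAmount }
    }`,
    variables: { marketId },
  },
  {
    // graphql-ws delivers each result to next(); calling dispose() closes the subscription
    next: (result) => onIntents(result.data?.LenderIntent ?? []),
    error: (err) => console.error("subscription error", err),
    complete: () => {},
  }
);

// Call dispose() when the component unmounts.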
For production, set these env vars:
# Hosted Envio on indexer.dev.hyperindex.xyz
NEXT_PUBLIC_ENVIO_ENDPOINT=https://indexer.dev.hyperindex.xyz/your-project/v1/graphql
NEXT_PUBLIC_ENVIO_WS_URL=wss://indexer.dev.hyperindex.xyz/your-project/v1/graphql
NEXT_PUBLIC_HASURA_ADMIN_SECRET=your-secret
If you don't set NEXT_PUBLIC_ENVIO_WS_URL, the client derives the WebSocket URL from the HTTP one (http → ws, https → wss).
Bot Subscriptions
The solver bot and liquidation bot use the same WebSocket pattern and env var configuration.
Solver config (apps/solver/.env):
# Local
ENVIO_HTTP_ENDPOINT=http://localhost:8090/v1/graphql
ENVIO_WS_ENDPOINT=ws://localhost:8090/v1/graphql
# Production (Hosted Envio)
ENVIO_HTTP_ENDPOINT=https://indexer.dev.hyperindex.xyz/floe-base-mainnet/v1/graphql
ENVIO_WS_ENDPOINT=wss://indexer.dev.hyperindex.xyz/floe-base-mainnet/v1/graphql
Solver (apps/solver/src/services/ws-subscription-manager.ts) subscribes to all intent changes:
subscription OnLenderIntents {
LenderIntent(order_by: { updatedAt: desc }) {
offerHash
status
isRegistered
isUsed
amount
filledAmount
}
}
When it receives an update (see the sketch below):
- Check if the status changed (track previous state in a Map)
- Skip if not open or partial
- Skip if already used or not registered
- Add to matching queue
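A minimal sketch of that filter, assuming the subscription delivers rows shaped like the OnLenderIntents query above. The Map and queue names are illustrative, not the solver's actual internals:
// Sketch of the solver's update filter. Field names follow the OnLenderIntents subscription;
// previousStatus and matchingQueue are simplified stand-ins for the real state tracking.
type LenderIntentRow = {
  offerHash: string;
  status: string;
  isRegistered: boolean;
  isUsed: boolean;
  amount: string;
  filledAmount: string;
};

const previousStatus = new Map<string, string>(); // offerHash -> last seen status
const matchingQueue: LenderIntentRow[] = [];

function onLenderIntentUpdate(intent: LenderIntentRow) {
  const last = previousStatus.get(intent.offerHash);
  previousStatus.set(intent.offerHash, intent.status);

  if (last === intent.status) return;                               // status unchanged, nothing to do
  if (intent.status !== "open" && intent.status !== "partial") return;
  if (intent.isUsed || !intent.isRegistered) return;                // already matched or cancelled
  matchingQueue.push(intent);                                       // hand off to the matcher
}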
Liquidation bot (apps/liquidation-bot/src/services/ws-loan-subscription-manager.ts) subscribes to loan changes:
subscription OnLoans {
Loan(
where: {
status: { _in: ["active", "overdue"] }
repaid: { _eq: false }
liquidated: { _eq: false }
}
order_by: { updatedAt: desc }
) {
loanId
status
currentPrincipal
collateralAmount
endTime
}
}
When a loan becomes overdue or nears liquidation threshold, the bot receives the update immediately and checks if it's profitable to liquidate.
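The bot's reaction to each update looks roughly like this. The profitability check and the liquidation call are protocol-specific, so isProfitable and liquidate here are placeholders, not Floe's actual logic:
// Sketch of the liquidation bot's reaction to a loan update from the subscription above.
type LoanRow = {
  loanId: string;
  status: string;
  currentPrincipal: string;
  collateralAmount: string;
  endTime: string;
};

declare function isProfitable(loan: LoanRow): Promise<boolean>; // gas cost vs. liquidation bonus (placeholder)
declare function liquidate(loanId: string): Promise<void>;      // sends the onchain liquidation tx (placeholder)

async function onLoanUpdate(loan: LoanRow) {
  const now = Math.floor(Date.now() / 1000);
  const pastEndTime = Number(loan.endTime) < now;

  // A real bot would also check collateralization against the current oracle price here
  if (loan.status !== "overdue" && !pastEndTime) return;
  if (!(await isProfitable(loan))) return;
  await liquidate(loan.loanId);
}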
Bootstrap Pattern
Bots follow a consistent pattern:
- HTTP fetch all existing data on startup (bootstrap)
- WebSocket subscribe for real-time updates
- Track seen items to avoid reprocessing
This ensures they don't miss anything between startup and subscription.
Why it matters: if you only subscribed, you'd miss anything that happened before the bot started. If you only polled, you'd have delays and waste RPC calls. Bootstrap + subscribe = complete and real-time.
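A rough sketch of the pattern, using plain fetch for the bootstrap and the same graphql-ws client style as above. The endpoint constant, wsClient, and handleIntent are illustrative placeholders:
// Bootstrap-then-subscribe sketch. The declared names stand in for the bot's real config and logic.
declare const ENVIO_HTTP_ENDPOINT: string;            // e.g. http://localhost:8090/v1/graphql
declare const wsClient: import("graphql-ws").Client;  // configured like the web client above
declare function handleIntent(intent: { offerHash: string; status: string }): void;

const seen = new Set<string>(); // offerHashes already processed

async function start() {
  // 1. Bootstrap: fetch everything that already exists over HTTP
  const res = await fetch(ENVIO_HTTP_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: "{ LenderIntent { offerHash status } }" }),
  });
  const { data } = await res.json();
  for (const intent of data.LenderIntent) {
    seen.add(intent.offerHash);
    handleIntent(intent);
  }

  // 2. Subscribe: from here on, changes arrive over the WebSocket
  wsClient.subscribe(
    { query: "subscription { LenderIntent(order_by: { updatedAt: desc }) { offerHash status } }" },
    {
      next: (result) => {
        const rows = (result.data?.LenderIntent ?? []) as { offerHash: string; status: string }[];
        for (const intent of rows) {
          // 3. Track seen items so the bootstrap and the subscription don't double-process
          if (seen.has(intent.offerHash)) continue; // (the solver also re-checks status changes here)
          seen.add(intent.offerHash);
          handleIntent(intent);
        }
      },
      error: (err) => console.error("subscription error", err),
      complete: () => {},
    }
  );
}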
The Gotcha: No Aggregates on Hosted Envio
Worked great locally. Deployed to Envio's hosted service. Failed.
Aggregate Queries
An aggregate query is when you ask the database to count or sum across rows:
- COUNT - "How many open intents?" → 12
- SUM - "Total value of active loans?" → 1,500,000
- AVG - "Average interest rate?" → 5.2
Perfect for dashboards. My frontend needed to show counts like "12 open lender intents."
Hasura supports this locally:
query GetIntentCounts($marketId: String!) {
LenderIntent_aggregate(where: { marketId: { _eq: $marketId }, status: { _eq: "open" } }) {
aggregate {
count
}
}
}
On the hosted service? Those _aggregate queries aren't exposed.
Why
From Envio's docs:
"Aggregations over large datasets can be very slow and unpredictable in production. The recommended approach is to compute and store aggregates at indexing time, not at query time."
Makes sense. Running COUNT(*) on millions of rows every time someone loads the page would be slow.
The Fix: Pre-compute
Instead of counting when someone asks, count as events happen.
Think of a librarian. Bad way: every time someone asks "how many mystery books?", count every book. Good way: keep a running count, +1 when a new mystery arrives, -1 when one leaves.
You need a stats entity:
type IntentStats {
id: ID! # "global" or marketId
marketId: String
openLenderCount: Int!
openBorrowerCount: Int!
matchedLenderCount: Int!
# ... more counts
updatedAt: BigInt!
}
Every time an intent is created, matched, or cancelled, update the stats:
// In intentHandlers.ts
if (status === "open") {
await updateIntentStats(
context,
marketId,
{ openLenderDelta: 1 }, // Increment open count
BigInt(event.block.timestamp)
);
}
// The helper function
async function updateIntentStats(
context: any,
marketId: string | null,
deltas: { openLenderDelta?: number; matchedLenderDelta?: number; /* ... */ },
timestamp: bigint
) {
const id = marketId || "global";
const existing = await context.IntentStats.get(id);
context.IntentStats.set({
id,
marketId,
openLenderCount: (existing?.openLenderCount || 0) + (deltas.openLenderDelta || 0),
matchedLenderCount: (existing?.matchedLenderCount || 0) + (deltas.matchedLenderDelta || 0),
// ... update all counts
updatedAt: timestamp,
});
}
Frontend just reads the number:
query GetStats($marketId: String!) {
IntentStats(where: { marketId: { _eq: $marketId } }) {
openLenderCount
openBorrowerCount
}
}
Instant.
Why Compute Status Early?
Why not just store the expiry timestamp and let the frontend check if it's expired?
You could. But then you can't filter by status in your queries. You'd have to load all intents and filter client-side.
Do the work once (when indexing) instead of every time someone queries.
function computeIntentStatus(
isRegistered: boolean,
isUsed: boolean,
expiry: bigint,
currentTimestamp: bigint,
filledAmount?: bigint,
totalAmount?: bigint
): string {
if (!isRegistered) return 'cancelled';
if (isUsed) return 'matched';
if (currentTimestamp > expiry) return 'expired';
if (filledAmount && totalAmount && filledAmount > 0n && filledAmount < totalAmount) {
return 'partial';
}
return 'open';
}
When You Need an Indexer
You'll know you need one when you try to build a real frontend.
If you're just reading current state from a contract (like "what's the owner address?"), you can query the contract directly. No indexer needed.
But the moment you need historical data, filtering, sorting, or aggregation, you're stuck. Blockchains can't do "show me all loans created last week" or "filter by interest rate" or "count how many active positions I have." That requires an indexer.
The other sign: you're hitting RPC rate limits. If you're making hundreds of getLogs or readContract calls to display a single page, you need an indexer. Your RPC provider will throttle you, and your users will wait 10+ seconds for data.
Takeaways
Blockchains aren't databases. You can't query them like one. You need something in between that listens to events, stores structured data, and lets you run normal queries. That's what indexers do.
The Graph vs Envio isn't just about speed. The Graph is decentralized and battle-tested. Envio is faster to develop with (TypeScript, not AssemblyScript) and has better DX. Pick based on what you value more: decentralization or velocity.
Hosted Envio has no aggregate queries. If you need counts, sums, or averages, pre-compute them when indexing. Don't try to use Hasura's _aggregate queries in production. They'll fail. Compute once, store it, query it.
Compute derived fields early. Status, counts, boolean flags - calculate these when you're processing events, not when the frontend queries. It's faster and avoids complex logic in your queries.
Use the Effect API. It wraps the contract reads your handlers need with caching, rate limiting, and retries, and it keeps your handler code clean. Don't skip it.