32 changes: 16 additions & 16 deletions pages/en/indexing/operating-graph-node.mdx
The main store for the Graph Node, this is where subgraph data is stored, as well as metadata about subgraphs, and subgraph-agnostic network data such as the block cache and the eth_call cache.

In order to index a network, Graph Node needs access to a network client via an Ethereum-compliant JSON-RPC. This RPC may connect to a single Ethereum client, or it could be a more complex setup that load balances across multiple clients.

While some subgraphs may just require a full Ethereum node, some have indexing features which require additional RPC functionality. Specifically, subgraphs which make `eth_calls` as part of indexing will require an archive node which supports [EIP-1898](https://eips.ethereum.org/EIPS/eip-1898), and subgraphs with `callHandlers`, or `blockHandlers` with a `call` filter, require `trace_filter` support ([see trace module documentation here](https://openethereum.github.io/JSONRPC-trace-module)).
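One way to check whether an RPC endpoint exposes the trace module is to send it a `trace_filter` request directly. The sketch below only builds the JSON-RPC payload; the `curl` call is left commented out, and `$RPC_URL` is a placeholder for your node's endpoint:

```shell
# JSON-RPC payload probing for trace_filter support (needed by callHandlers)
PAYLOAD='{"jsonrpc":"2.0","method":"trace_filter","params":[{"fromBlock":"0x1","toBlock":"0x1"}],"id":1}'
echo "$PAYLOAD"
# Send it to your node; a "method not found" error means traces are unsupported:
# curl -s -X POST -H 'Content-Type: application/json' --data "$PAYLOAD" "$RPC_URL"
```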

**Upcoming: Network Firehoses** - a Firehose is a gRPC service providing an ordered, yet fork-aware, stream of blocks, developed by The Graph's core developers to better support performant indexing at scale. This is not currently an indexer requirement, but Indexers are encouraged to familiarise themselves with the technology, ahead of full network support. Learn more about the Firehose [here](https://firehose.streamingfast.io/).

#### IPFS Nodes

Subgraph deployment metadata is stored on the IPFS network. The Graph Node primarily accesses the IPFS node during subgraph deployment to fetch the subgraph manifest and all linked files. Network indexers do not need to host their own IPFS node. An IPFS node for the network is hosted at https://ipfs.network.thegraph.com.

#### Prometheus metrics server


#### Prerequisites

- **Ethereum node** - By default, the docker compose setup will use mainnet: [http://host.docker.internal:8545](http://host.docker.internal:8545) to connect to the Ethereum node on your host machine. You can replace this network name and URL by updating `docker-compose.yml`.
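For reference, the relevant setting in `docker-compose.yml` takes the form `network-name:rpc-url`. A sketch of how that section might look (the network name and URL here are assumptions, to be replaced with your own):

```yml
environment:
  ethereum: 'mainnet:http://host.docker.internal:8545'
```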

#### Setup

1. Clone Graph Node and enter the `docker` directory:

```sh
git clone http://github.com/graphprotocol/graph-node
cd graph-node/docker
```

2. For Linux users only - use the host IP address instead of `host.docker.internal` in `docker-compose.yml` using the included script:

```sh
./setup.sh
```

A [TOML](https://toml.io/en/) configuration file can be used to set more complex configurations than those exposed in the CLI.

A minimal `config.toml` file can be provided; the following file is equivalent to using the `--postgres-url` command line option:

```toml
[store]
[store.primary]
connection="<.. postgres-url argument ..>"
```

Graph Node indexing can scale horizontally, running multiple instances of Graph Node to split indexing and querying across different sets of nodes.
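Filled in with a concrete (hypothetical) connection string, the minimal file might look like this; the database name, user, and password are assumptions:

```toml
[store]
[store.primary]
connection = "postgresql://graph:password@localhost:5432/graph-node"
```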
Given multiple Graph Nodes, it is necessary to manage deployment of new subgraphs so that the same subgraph isn't being indexed by two different nodes, which would lead to collisions. This can be done by using deployment rules, which can also specify which `shard` a subgraph's data should be stored in, if database sharding is being used. Deployment rules can match on the subgraph name and the network that the deployment is indexing in order to make a decision.

Example deployment rule configuration:
```toml
[deployment]
[[deployment.rule]]
match = { name = "(vip|important)/.*" }
```

Read more about deployment rules in the Graph Node repository's configuration documentation.
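A fuller rule set might combine a name match with a shard assignment and a catch-all default, along these lines (a sketch; the shard and node names are assumptions):

```toml
[deployment]
[[deployment.rule]]
match = { name = "(vip|important)/.*" }
shard = "vip"
indexers = [ "index_node_vip_0", "index_node_vip_1" ]
[[deployment.rule]]
# Catch-all rule: anything not matched above lands on the default indexers
indexers = [ "index_node_community_0", "index_node_community_1" ]
```

Rules are evaluated in order, so the catch-all should come last.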

Nodes can be configured to explicitly be query nodes by including the following in the configuration file:

```toml
[general]
query = "<regular expression>"
```
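For example, a sketch that treats every node whose `node_id` matches a `query_node_*` naming scheme as a query node (the naming scheme itself is an assumption):

```toml
[general]
query = "query_node_.*"
```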
Read more about store configuration in the Graph Node repository's configuration documentation.

If there are multiple nodes configured, it will be necessary to specify one node which is responsible for ingestion of new blocks, so that all configured index nodes aren't polling the chain head. This is done as part of the `chains` namespace, specifying the `node_id` to be used for block ingestion:

```toml
[chains]
ingestor = "block_ingestor_node"
```
The Graph Protocol is increasing the number of networks supported for indexing rewards.

The `[chains]` section controls the Ethereum providers that graph-node connects to, and where blocks and other metadata for each chain are stored. The following example configures two chains, mainnet and kovan, where blocks for mainnet are stored in the vip shard and blocks for kovan are stored in the primary shard. The mainnet chain can use two different providers, whereas kovan only has one provider.

```toml
[chains]
ingestor = "block_ingestor_node"
[chains.mainnet]
shard = "vip"
provider = [
  { label = "mainnet1", url = "http://..", features = [], headers = { Authorization = "Bearer foo" } },
  { label = "mainnet2", url = "http://..", features = [ "archive", "traces" ] }
]
[chains.kovan]
shard = "primary"
provider = [ { label = "kovan", url = "http://..", features = [] } ]
```

Given a running Graph Node (or Graph Nodes!), the challenge is then to manage deployed subgraphs across those nodes.

### Logging

Graph Node's logs can provide useful information for debugging and optimisation of Graph Node and specific subgraphs. Graph Node supports different log levels via the `GRAPH_LOG` environment variable, with the following levels: error, warn, info, debug, or trace.

In addition, setting `GRAPH_LOG_QUERY_TIMING` to `gql` provides more details about how GraphQL queries are running (though this will generate a large volume of logs).
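For instance, both settings can be supplied as environment variables before launching Graph Node (a sketch):

```shell
# Verbose logging plus GraphQL query timing details (expect a large log volume)
export GRAPH_LOG=debug
export GRAPH_LOG_QUERY_TIMING=gql
```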

Graph Node exposes metrics via a Prometheus endpoint on port 8040 by default.

The indexer repository provides an [example Grafana configuration](https://github.com/graphprotocol/indexer/blob/main/k8s/base/grafana.yaml).

### Graphman

`graphman` is a maintenance tool for Graph Node, helping with diagnosis and resolution of different day-to-day and exceptional tasks.

The `graphman` command is included in the official containers, and you can `docker exec` into your graph-node container to run it. It requires a `config.toml` file.

Full documentation of `graphman` commands is available in the Graph Node repository: see [`/docs/graphman.md`](https://github.com/graphprotocol/graph-node/blob/master/docs/graphman.md).
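As a convenience, the `docker exec` invocation can be captured once and reused (a sketch; the container name and config path are assumptions):

```shell
# Reusable prefix for running graphman inside the graph-node container
GRAPHMAN="docker exec -it graph-node graphman --config /etc/graph-node/config.toml"
echo "$GRAPHMAN"
# Example usage (not run here): list the chains known to the store
# $GRAPHMAN chain list
```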

## Working with subgraphs

However in some instances, if an Ethereum node has provided incorrect data for some period, that can make its way into the cache, leading to incorrect data or failed subgraphs.
If a block cache inconsistency is suspected, such as a tx receipt missing an event:

1. `graphman chain list` to find the chain name.
2. `graphman chain check-blocks <CHAIN> by-number <NUMBER>` will check whether the cached block matches the provider, and delete the block from the cache if it doesn't.
1. If there is a difference, it may be safer to truncate the whole cache with `graphman chain truncate <CHAIN>`.
2. If the block matches the provider, then the issue can be debugged directly against the provider.

### Querying issues and errors

The command `graphman stats show <sgdNNN>` shows, for each entity type/table in the deployment, how many distinct entities and how many entity versions it contains.

In general, tables where the number of distinct entities is less than 1% of the total number of rows/entity versions are good candidates for the account-like optimization. When the output of `graphman stats show` indicates that a table might benefit from this optimization, running `graphman stats show <sgdNNN> <table>` will perform a full count of the table - that can be slow, but gives a precise measure of the ratio of distinct entities to overall entity versions.

Once a table has been determined to be account-like, running `graphman stats account-like <sgdNNN>.<table>` will turn on the account-like optimization for queries against that table. The optimization can be turned off again with `graphman stats account-like --clear <sgdNNN>.<table>`. It takes up to 5 minutes for query nodes to notice that the optimization has been turned on or off. After turning the optimization on, it is necessary to verify that the change does not in fact make queries slower for that table. If you have configured Grafana to monitor Postgres, slow queries will show up in `pg_stat_activity` in large numbers, taking several seconds. In that case, the optimization needs to be turned off again.

For Uniswap-like subgraphs, the `pair` and `token` tables are prime candidates for this optimization, and can have a dramatic effect on database load.

### Removing subgraphs

> This is new functionality, which will be available in Graph Node 0.29.x

At some point an indexer might want to remove a given subgraph. This can be easily done via `graphman drop`, which deletes a deployment and all its indexed data. The deployment can be specified as either a subgraph name, an IPFS hash `Qm..`, or the database namespace `sgdNNN`. Further documentation is available [here](https://github.com/graphprotocol/graph-node/blob/master/docs/graphman.md#-drop).