
PeerStore Persistence #591

@vasco-santos

PeerStore Persistence

As part of the PeerStore improvements epic, we intend to back the PeerStore with a datastore.

Sub-milestones

  • Add persistence backend to the books
  • Configurable persistence

Overview

Centralizing all the information a peer has about its environment enables us to easily persist this data. This is particularly useful when we need to restart a node, as we will be able to start establishing connections with peers that we already know (other than the bootstrap nodes).

This should have a big impact on browser nodes, since they have fewer ways of discovering nodes and these discovery services sometimes take longer. Moreover, the nodes will have a bigger set of connected nodes considerably faster.

Other than faster connectivity, a persisted PeerStore enables us to rely less on the bootstrap nodes (via configuration), so that we reduce the load on them.

Implementation Design

What to persist

The PeerStore is composed of four components: addressBook, keyBook, metadataBook and protoBook. While some of the content of these books is very relevant to persist, the value of persisting other content is less clear.

The addressBook contains a list of multiaddrs for each peer, as well as some relevant data for each multiaddr, including its validity and degree of confidence. While the multiaddrs are very valuable, the remaining two are debatable. Considering that we will need to dial the peers when the node restarts, the validity will be updated, so it is not that relevant to store. On the other hand, the degree of confidence can have a good impact if we have multiple multiaddrs for the peer, since we will dial multiple ones in parallel.
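As a rough illustration of the trade-off above, the sketch below drops the validity before persisting and uses the stored confidence to order addresses for dialing after a restart. The `AddressEntry` shape, the field names and the "highest confidence first" ordering are assumptions made for this example, not the addressBook's actual data model.

```typescript
// Hypothetical in-memory shape of one addressBook entry.
interface AddressEntry {
  multiaddr: string
  validity: number // assumed: an expiry timestamp, refreshed by the next dial
  confidence: number // assumed: a higher value means a more trusted address
}

// Hypothetical persisted shape: the validity is intentionally left out.
interface PersistedAddress {
  multiaddr: string
  confidence: number
}

// Keep only the fields considered worth storing.
function toPersisted (entries: AddressEntry[]): PersistedAddress[] {
  return entries.map(({ multiaddr, confidence }) => ({ multiaddr, confidence }))
}

// After a restart, order the addresses by descending confidence.
function dialOrder (addrs: PersistedAddress[]): string[] {
  return addrs
    .slice()
    .sort((a, b) => b.confidence - a.confidence)
    .map((a) => a.multiaddr)
}
```

Ordering by confidence is only one possible use of the stored value; it could equally be used to cap how many addresses are dialed in parallel.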

The keyBook content must be stored, as it is crucial for the correct operation of the system.

The metadataBook may contain previously set metadata about each peer. As a result, this information should still exist when the peer is restarted.

The protoBook contains the list of protocols supported by each peer. As we will need to establish connections with the peers, we can run the identify service to get the updated list of protocols they support and the multiaddrs they are listening on. Therefore, this information does not seem crucial to store. However, it could potentially be important if we had a large number of peers. In this context, we could choose the best peers to connect to based on the protocols they run (although this information can also be outdated).

How to persist

A datastore stores the data in a key-value fashion. As a result, we need coherent keys so that we do not overwrite data.

A datastore also allows us to query it by key prefix. This way, if we define a consistent namespace, we can find all the stored content without any prior information about it.

The namespaces are defined as follows:
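To make the prefix idea concrete, here is a minimal sketch that uses an in-memory map as a stand-in for a datastore. `MemoryStore` and its synchronous `put`/`query` methods are hypothetical and much simpler than a real interface-datastore implementation; the keys and values are placeholders.

```typescript
// Hypothetical, simplified stand-in for a key-value datastore.
class MemoryStore {
  private readonly data = new Map<string, string>()

  put (key: string, value: string): void {
    this.data.set(key, value)
  }

  // Return every [key, value] pair whose key starts with the prefix, sorted by key.
  query (prefix: string): Array<[string, string]> {
    const matches: Array<[string, string]> = []
    this.data.forEach((value, key) => {
      if (key.startsWith(prefix)) {
        matches.push([key, value])
      }
    })
    return matches.sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0))
  }
}

// On startup we know no peer ids, yet the prefix alone recovers the stored addresses.
const store = new MemoryStore()
store.put('/peers/addrs/peer-a', 'addresses of peer a')
store.put('/peers/addrs/peer-b', 'addresses of peer b')
store.put('/peers/protos/peer-a', 'protocols of peer a')
const storedAddrs = store.query('/peers/addrs/')
```

Here `storedAddrs` holds the two address records and skips the protocol record, which is the property the namespace relies on.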

AddressBook

All the known peer addresses are stored with a key pattern as follows:

/peers/addrs/<b32 peer id no padding>

ProtoBook

All the known peer protocols are stored with a key pattern as follows:

/peers/protos/<b32 peer id no padding>

KeyBook

All public keys are stored under the following pattern:

/peers/keys/<b32 peer id no padding>

MetadataBook

Metadata is stored under the following key pattern:

/peers/metadata/<b32 peer id no padding>/<key>
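The four key patterns above could be produced by a small helper like the one sketched below. The function names are hypothetical, and the sketch assumes the peer id is available as raw bytes and encoded with the lowercase RFC 4648 base32 alphabet without padding; the issue does not state the letter case or whether a multibase prefix is included.

```typescript
// RFC 4648 base32 alphabet, lowercased (the case is an assumption).
const BASE32_ALPHABET = 'abcdefghijklmnopqrstuvwxyz234567'

// Encode bytes as base32 without '=' padding.
function base32NoPad (bytes: Uint8Array): string {
  let bits = 0
  let value = 0
  let out = ''
  for (let i = 0; i < bytes.length; i++) {
    // Only the low bits are ever needed, so keep the accumulator small.
    value = ((value << 8) | bytes[i]) & 0xffff
    bits += 8
    while (bits >= 5) {
      out += BASE32_ALPHABET.charAt((value >>> (bits - 5)) & 31)
      bits -= 5
    }
  }
  if (bits > 0) {
    // Left-align the remaining bits in a final 5-bit group.
    out += BASE32_ALPHABET.charAt((value << (5 - bits)) & 31)
  }
  return out
}

type BookNamespace = 'addrs' | 'protos' | 'keys' | 'metadata'

// Key for the addressBook, protoBook and keyBook records of a peer.
function bookKey (book: BookNamespace, peerIdBytes: Uint8Array): string {
  return `/peers/${book}/${base32NoPad(peerIdBytes)}`
}

// Key for a single metadata entry of a peer.
function metadataKey (peerIdBytes: Uint8Array, key: string): string {
  return `${bookKey('metadata', peerIdBytes)}/${key}`
}
```

Because every key of a book shares the `/peers/<book>/` prefix, a single prefix query per book is enough to reload it.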

Configuration

A user should be able to choose a datastore compatible with interface-datastore to store the data.

Even though we can set a default for the information that we recommend persisting, users should be able to persist the data that they want to. To this end, libp2p should allow a custom persistence module.
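One possible shape for such a configuration is sketched below: a per-book flag with defaults that the user can override. The option names and the default values (everything except the protoBook) are assumptions derived from the discussion in "What to persist", not a settled libp2p API.

```typescript
type BookName = 'addrs' | 'keys' | 'metadata' | 'protos'

// Hypothetical configuration: one flag per book.
type PersistenceConfig = Record<BookName, boolean>

// Assumed defaults: persist everything except the protocols,
// which can be refreshed through the identify service.
const DEFAULT_PERSISTENCE: PersistenceConfig = {
  addrs: true,
  keys: true,
  metadata: true,
  protos: false
}

// Merge user overrides on top of the defaults.
function resolvePersistence (overrides: Partial<PersistenceConfig> = {}): PersistenceConfig {
  return { ...DEFAULT_PERSISTENCE, ...overrides }
}

// List the books that should be written to the datastore.
function booksToPersist (config: PersistenceConfig): BookName[] {
  return (Object.keys(config) as BookName[]).filter((book) => config[book])
}
```

A user who wants the protocols stored as well would pass `{ protos: true }` to `resolvePersistence`, while the other books keep their defaults.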

With the persisted datastore in place, libp2p should provide a way to configure how the libp2p-bootstrap nodes are dialed: only when needed, or only a subset of them chosen according to a metric or percentage.
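The sketch below shows one way such a policy could be expressed. `BootstrapPolicy`, its fields and `selectBootstrapPeers` are hypothetical names; "needed" is interpreted here as "fewer known peers than a threshold", and the subset is simply the first share of the list, both of which are assumptions.

```typescript
// Hypothetical policy options.
interface BootstrapPolicy {
  minKnownPeers: number // dial bootstrap nodes only when fewer peers than this are known
  fraction: number // share of the bootstrap nodes to dial, between 0 and 1
}

// Decide which bootstrap nodes to dial given how many peers were restored from the datastore.
function selectBootstrapPeers (
  bootstrap: string[],
  knownPeerCount: number,
  policy: BootstrapPolicy
): string[] {
  // Enough persisted peers: leave the bootstrap nodes alone.
  if (knownPeerCount >= policy.minKnownPeers) {
    return []
  }
  // Clamp the fraction and dial only that share of the list.
  const fraction = Math.min(1, Math.max(0, policy.fraction))
  const count = Math.ceil(bootstrap.length * fraction)
  return bootstrap.slice(0, count)
}
```

With four bootstrap nodes, a threshold of five known peers and a fraction of 0.5, a node that restored two peers would dial two bootstrap nodes, and a node that restored ten peers would dial none.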


Labels: exp/expert, kind/enhancement, status/ready
