Closed
As seen in the dump https://gist.github.com/zelig/003203cd282e191a3476, this happens when 2 addBlock calls cause an invalid PoW.
@fjl comments:
- Peer.run has already received some error or disconnect and is waiting for Peer.readLoop to exit.
- The read loop is waiting for the protocol to read a message.
- The protocol is waiting for Peer.run to accept the disconnect.
- But the second Disconnect call won't return, because Peer.run is already past the point where it waits for a message on Peer.disc.
The general issue is that it's not safe to wait for the protocol without a timeout. The code in peer.go currently assumes that the protocol will always accept messages rather quickly.
It will be less of a problem later when we have concurrent message dispatch (RLPx chunked messages).

    default:
        // it's a subprotocol message
        proto, err := p.getProto(msg.Code)
        if err != nil {
            return fmt.Errorf("msg code out of range: %v", msg.Code)
        }
        // ======= this should be a select and exit the loop if the protocol doesn't
        // accept after 5 seconds.
        proto.in <- msg
    }

Disconnecting by returning from the protocol loop instead of calling Disconnect will fix the specific issue of stalling the blockpool. @zelig
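
The select suggested in the code comment above could look roughly like the following. This is only a minimal standalone sketch, not the actual peer.go change: `Msg`, `deliver` and `errProtoStalled` are hypothetical names made up for illustration, and the timeout is passed in as a parameter rather than hard-coded to 5 seconds.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Msg is a hypothetical stand-in for the p2p message type.
type Msg struct {
	Code uint64
}

// errProtoStalled is a hypothetical error for a protocol that does not
// accept a message within the timeout.
var errProtoStalled = errors.New("protocol did not accept message in time")

// deliver tries to hand msg to the protocol's input channel. Instead of
// blocking forever on the send, it gives up once timeout has elapsed so
// that the caller (the read loop) can exit.
func deliver(in chan<- Msg, msg Msg, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case in <- msg:
		return nil
	case <-timer.C:
		return errProtoStalled
	}
}

func main() {
	// A protocol that can accept the message: the send goes through.
	ready := make(chan Msg, 1)
	fmt.Println(deliver(ready, Msg{Code: 0x10}, 5*time.Second))

	// A protocol that is stuck: nobody reads from the unbuffered channel,
	// so deliver returns the timeout error instead of hanging.
	stalled := make(chan Msg)
	fmt.Println(deliver(stalled, Msg{Code: 0x10}, 50*time.Millisecond))
}
```

In the read loop, a non-nil error from such a helper would be returned in place of the bare `proto.in <- msg`, which lets the loop exit even when the protocol is itself blocked in Disconnect.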