forky: improve shard offset acquisition #2137
Conversation
storage/fcds/fcds.go
Outdated
	Put(ch chunk.Chunk) (shard uint8, err error)
	Delete(addr chunk.Address) (err error)
	ShardSize() (slots []ShardInfo, err error)
Storer now exposes the concept of a shard. The intention was that the shard is a concept related only to the MetaStore, to allow alternative Storer implementations that do not rely on explicit sharding.
But if it is required to make fcds work correctly, it is fine.
correct, but if we want to be able to test the package at the edges then we have to leak this information
Yes, I am wondering if that could be done with another interface that would be used only for this case. The ShardSize method is easy to pull out, but the Put method is the tricky one. I suppose it would be too much complexity to emit an event reporting which shard the chunk was put on and to add a subscription on the shard-related interface.
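For illustration, a minimal sketch of how such a split could look, assuming the Storer methods shown in the diff above plus an assumed Get; the ShardedStorer name is hypothetical, and Put, which still returns the shard, shows why it is the harder method to factor out:

```go
package fcds

import "github.com/ethersphere/swarm/chunk"

// Storer keeps the chunk operations that every implementation needs
// (signatures follow the diff above; Get is assumed).
type Storer interface {
	Get(addr chunk.Address) (ch chunk.Chunk, err error)
	Put(ch chunk.Chunk) (shard uint8, err error)
	Delete(addr chunk.Address) (err error)
}

// ShardedStorer (hypothetical name) adds shard introspection only for
// implementations that shard explicitly, and for tests at the edges.
// ShardInfo is the struct defined in meta.go.
type ShardedStorer interface {
	Storer
	ShardSize() (slots []ShardInfo, err error)
}
```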
swarm.go
Outdated
 	Capacity:     config.DbCapacity,
 	Tags:         self.tags,
-	PutToGCCheck: to.IsWithinDepth,
+	PutToGCCheck: func(_ []byte) bool { return true },
Do we want to do this on master?
actually, i left this here in order to get this question. short answer is yes but i'm reverting this from this PR.
Thanks. Should we also test this change on a cluster?
storage/fcds/mock/mock.go
Outdated
	return nil
}

func (s *Store) ShardSize() (slots []fcds.ShardInfo, err error) {
The comment on ShardInfo:

// ShardInfo contains data about free number of slots
// in a shard.

suggests that it holds the number of free slots, but Count returns the number of stored chunks.
storage/fcds/fcds.go
Outdated
func (s *Store) ShardSize() (slots []ShardInfo, err error) {
	slots = make([]ShardInfo, len(s.shards))
	for i, sh := range s.shards {
		sh.mu.Lock()
		fs, err := sh.f.Stat()
		sh.mu.Unlock()
		if err != nil {
			return nil, err
		}
		slots[i] = ShardInfo{Shard: uint8(i), Val: fs.Size()}
	}

	return slots, nil
}
The comment on ShardInfo:

// ShardInfo contains data about free number of slots
// in a shard.

suggests that it holds the number of free slots, but ShardSize returns the total number of bytes. This also contradicts the mock store ShardSize implementation.
storage/fcds/meta.go
Outdated
// ShardInfo contains data about an arbitrary shard
// in that Val could potentially represent any scalar
// size pertaining to a shard (number of free slots,
// size in bytes, number of occupied slots, etc).
type ShardInfo struct {
	Shard uint8
	Val   int64
}
Would it not be better to be more precise about Val? I find it quite strange that the meaning of the value is so vague. Could we provide only the number of chunks as the value? That would be quite trivial to have in the leveldb MetaStore implementation, rather than the number of bytes.
whoops. let me elaborate. originally we were looking at both the number of free slots in a shard and the size of the shard, and we needed to sort both in the same way - i renamed this to be general in order not to keep two struct definitions (and two duplicate sorting primitives as a result). i see your point now that this is only used by ShardSize. i can rename it to conform 👍
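A minimal sketch of that rationale, assuming the ShardInfo struct from the diff above; sortShardsByVal is a hypothetical helper showing how one generic Val field lets a single sorting primitive serve both orderings:

```go
package fcds

import "sort"

// ShardInfo mirrors the struct from the diff above.
type ShardInfo struct {
	Shard uint8
	Val   int64 // free slots, byte size, or chunk count, depending on the caller
}

// sortShardsByVal orders shards ascending by Val, whatever Val represents,
// so one helper serves every ordering without duplicating the struct.
func sortShardsByVal(slots []ShardInfo) {
	sort.Slice(slots, func(i, j int) bool {
		return slots[i].Val < slots[j].Val
	})
}
```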
-	sh := s.shards[getShard(addr)]
+	sh := s.shards[m.Shard]
 	sh.mu.Lock()
-	defer sh.mu.Unlock()
Why is this defer removed?
the idea was not to hold the lock beyond the specific time that the shard is used (i.e. to not hold it for the metric increment or the chunk constructor)
i will put it back as a defer statement
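For context, a minimal sketch contrasting the two locking styles discussed in this thread; the shard type only loosely mirrors the fcds one, and readRecord and buildChunk are hypothetical placeholders for the shard read and the chunk constructor:

```go
package fcds

import "sync"

type shard struct {
	mu sync.Mutex
	// file handle, offsets, etc.
}

// narrowLock holds the mutex only for the shard access itself, so the
// metric increment and the chunk constructor run outside the lock.
func narrowLock(sh *shard) []byte {
	sh.mu.Lock()
	data := readRecord(sh)
	sh.mu.Unlock()

	return buildChunk(data)
}

// deferredLock is simpler and unlocks even on panic or early return, but
// the lock is then held for the whole function body.
func deferredLock(sh *shard) []byte {
	sh.mu.Lock()
	defer sh.mu.Unlock()

	data := readRecord(sh)
	return buildChunk(data)
}

func readRecord(sh *shard) []byte { return nil } // placeholder
func buildChunk(b []byte) []byte  { return b }   // placeholder
```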
sure @janos. thanks!

i'm closing this since the performance toll of this pr is huge. will keep it for reference. maybe one day :)
This PR tries to improve forky shard offset acquisition for writing. It mitigates a possible data race (in the interim implementation, not in the current fcds branch) where several writers trying to get a shard to write to could be given the same shard due to open offsets, because the number of times a shard was handed out to waiting writers was not taken into account. This could result in more writes than needed ending up on one shard, causing an append at the end of the shard.

This PR instead deletes a given open offset once it is given to a writer. If the write fails, the open offset is rolled back and put back into the free-slots map. Once a complete transaction is done (localstore -> forky) and the data is written to the shard and the metastore correctly, the free offset is deleted from the free-offset in-memory map. When setting the chunk in the metastore, the correlated reclaimed offset is deleted as part of the batch. The free offsets are loaded into the in-memory map in the constructor, and changes to the in-memory map are continuously shadowed into leveldb when Put/Delete/Set operations are done on forky.

https://snapshot.raintank.io/dashboard/snapshot/Nf6hV9cowms9qfxXaF6VsgnkN0bmkiBQ
https://snapshot.raintank.io/dashboard/snapshot/iThV8spSTIpI4KCiKlQK8KvNVZD5uuEb
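A minimal sketch of the free-offset flow described above, under the assumption of an in-memory map guarded by a mutex; the offsetCache type and the reclaimOffset/releaseOffset names are hypothetical and only mirror the described behaviour, not the actual fcds code:

```go
package fcds

import "sync"

type offsetKey struct {
	shard  uint8
	offset int64
}

type offsetCache struct {
	mu   sync.Mutex
	free map[offsetKey]struct{} // loaded from leveldb in the constructor
}

// reclaimOffset hands a free offset to exactly one writer by removing it
// from the map, so two concurrent writers can never be given the same slot.
func (c *offsetCache) reclaimOffset(shard uint8) (offsetKey, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for k := range c.free {
		if k.shard == shard {
			delete(c.free, k)
			return k, true
		}
	}
	return offsetKey{}, false
}

// releaseOffset rolls a reclaimed offset back into the map when the write
// that claimed it fails.
func (c *offsetCache) releaseOffset(k offsetKey) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.free[k] = struct{}{}
}
```

On a successful write, the reclaimed offset would then also be removed from the persisted free-offset records as part of the leveldb batch that sets the chunk's metadata, keeping the in-memory map and leveldb in sync, as described above.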