Skip to content
This repository was archived by the owner on Aug 2, 2021. It is now read-only.

Conversation

@nolash
Copy link
Contributor

@nolash nolash commented Jan 19, 2018

This pr adds ResourceHandler API in the swarm HTTP layer, with a placeholder implementation of manifests. It introduces two new schemes: bzz-db and bzz-db-raw, which are for manifest view and raw data view of resource updates respectively.

GET methods map as follows:

* <scheme>:/<id>/ -  ResourceHandler.LookupLatest(<id>, true)
* <scheme>:/<id>/n - ResourceHandler.LookupHistorical(<id>, n, true)
* <scheme>:/<id>/n/m - ResourceHandler.LookupVersion(<id>, n, m, true)

POST path is always <scheme>:/<id>/ and the message body (current only raw data) is the data of the update.

<id> can be either ENS name or hash.


A new method ethclient.Client.BlockNumber() is added for convenience to enable passing the ensClient from swarm/swarm.go.NewSwarm(..) (otherwise we use an extra ?rpc.Client so we can access the eth_blockNumber RPC directly).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is becoming unwieldy, NewSwarm should just read all those bzzconfig.XXX variables directly rather than passing them as arguments.

Copy link

@lmars lmars Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a big.Int like the backend function? Same for the return value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I don't like the numbers when you add another, no, wait, two other conversions with big.Int. So maybe I should keep the rpcclient then:

$ go test -v -cpu 4 . -bench .
goos: linux
goarch: amd64
BenchmarkLoInt-4           	50000000	        20.4 ns/op
BenchmarkHiInt-4           	30000000	        38.2 ns/op
BenchmarkLoBig-4           	10000000	       219 ns/op
BenchmarkHiBig-4           	 5000000	       299 ns/op
BenchmarkLoBigViaInt64-4   	10000000	       188 ns/op
BenchmarkHiBigViaInt64-4   	10000000	       268 ns/op

https://gist.github.com/nolash/67e5fc146b911712af44528819dd49c8

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the numbers when you add another, no, wait, two other conversions with big.Int

I'm not sure what you mean, I'm saying pass big.Int into the rpc client so that it decodes the response into a big.Int (which is that the backend function actually returns) rather than here manually decoding into a uint64?

So maybe I should keep the rpcclient then

Why do you need to keep the rpcclient? This seems like a useful addition to ethclient no?

As for the benchmark, I'm not really sure what you are benchmarking here, are you concerned that decoding into a big.Int is slower than into a uint64? Why does it matter, surely we are not calling BlockNumber that often, it only changes every ~10s?

I'm more concerned with the assumption that BlockNumber is a uint64 when it isn't (it's a big.Int).

Copy link
Contributor Author

@nolash nolash Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetBlock() is called for every Resource creation, update and lookup for latest update. In the end I'm not how the use cases will be, and how often these calls will be performed.

I see your point that the middleware should reflect the data type of the backend, but I find it a bit bureaucratic to waste 10x extra cycles when it's not necessary just keep the form: I mean, do you think we will cross the max uint64 blocknumber anytime soon?

Actually, maybe the backend itself should not be big.Int for the blocknumber...

Copy link
Contributor Author

@nolash nolash Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that rpc.LatestBlockNumber which the backend function encapulates is type rpc.BlockNumber which is even just a int64 ...

I'll look to see if we can make this hook better. It seems to me to be a real waste of resources.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To not get too out of context for this PR I'll implement as big.Int for nowbut will investigate how to slim it later.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetBlock() is called for every Resource creation, update and lookup for latest update.

So then we should cache the return value and invalidate it when a new block is mined (which happens every ~10s), there is no need to continue to call an RPC method that is most likely to return the same value.

Actually, maybe the backend itself should not be big.Int for the blocknumber...

I guess this is the point I was trying to make, if an RPC method returns big.Int, then the client should also return the same thing, it's supposed to look transparent. If we think big.Int is wrong, it should be changed consistently everywhere.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nolash doesn't look like this will be accepted anyway: ethereum/go-ethereum#15907

swarm/api/api.go Outdated
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this method have a key argument?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why is it prefixed Db? Would LookupResource not be more appropriate?

Copy link
Contributor Author

@nolash nolash Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that bzz-resource-* scheme was a bit on the verbose side so I chose db instead, and felt the name in the API layer should reflect that name.

I have no strong feeling about this one way or the other.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think db should be reserved for when Swarm actually has database functionality (e.g. indexing and querying).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this commented out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an apparently but unconfirmed obsolete parameter in HiveParams. I think the idea was that someone go through the changes of all the parameters that were weeded out to see if some of them were actually needed. But I guess maybe enough time has passed now that we can just drop it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should call s.BadRequest, explaining why the request was bad (e.g. "error parsing frequency from the request path").

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a little awkward to have to call teardownTest rather than the typical t.Fatal, typically tests just teardown in a defer, something like:

teardownTest := setupTest(t, ...)
defer teardownTest()
...
if err != nil {
	t.Fatal(err)
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really necessary to start an RPC server for this? Should the resource handler not just accept an interface containing the methods it needs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point, sir.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleanup needs to happen before calling t.Fatal, otherwise the temporary directories will leak.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is privatekey for?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this cleanup function being called?

@nolash
Copy link
Contributor Author

nolash commented Jan 20, 2018

@lmars thanks for the thorough review. A bit premature; it still has the "in progress" tag (even needs rebase) ;) but I'll take a pass and amend your comments.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't understand why we are using a string? Can the response not be decoded directly into big.Int?

swarm/api/api.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not call it Db anything please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 221b084

swarm/api/uri.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's find a different name for the scheme and functions reflecting it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this layout will change according to what we discussed right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah added in 3a90aee

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much nicer this way...

@zelig
Copy link
Member

zelig commented Jan 21, 2018

very nice PR @nolash , cannot wait to have this working

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the go-ethereum team are against adding this, should we take a different approach as suggested?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, annoying. Thanks for catching this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going with a different approach?

swarm/api/api.go Outdated
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return self.resource.GetContent(name)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ResourceUpdate should return a specific error to trigger a 401 response, otherwise this will just hide other errors.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If len(params) == 1 then isn't this going to panic?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for fmt.Sprintf here (nothing is being formatted).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should comment why we are using context.Context in a non-typical way (and warn users to change their expectations).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than using a cursor, you could just decode the chunk directly into a struct using binary.Read

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really have to get myself a hobby letting my write some C ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the data is not fixed size, and the https://golang.org/pkg/encoding/binary/#Read state:

func Read(r io.Reader, order ByteOrder, data interface{}) error
Read reads structured binary data from r into data. Data must be a
    pointer to a fixed-size value or a slice of fixed-size values.

So I guess I get to keep my nerdy cursor? :)

Nice to know about this, though!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return idna.ToASCII(name)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using sync.Pool to make this more concurrent.

Copy link
Contributor Author

@nolash nolash Jan 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean have several hashers in the pool? Never used that construct.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this being used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's provides the fake block number retriever, same as in swarm/storage/resource_test.go

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem to be instantiated anywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is in line 76, when creating the ResourceHandler

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fakeBackend not FakeRPC

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right you are, thanks

swarm/api/api.go Outdated
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check err first, rsrc can be nil if there is an error, this can panic

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a loglevel command line flag, and then it is not disturbing at all, like here: https://github.com/ethersphere/go-ethereum/blob/swarm-network-rewrite-syncer/swarm/storage/common_test.go#L37

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for else after t.Fatal(err)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for else after t.Fatal(err)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmars but the case branch continues and tries s.api.ResourceLookup with the version which it failed to parse

@nolash nolash force-pushed the swarm-mutableresources-apinew branch from 930ad73 to cf14993 Compare January 23, 2018 14:22
@nolash nolash force-pushed the swarm-mutableresources-apinew branch from cf14993 to 690522b Compare January 23, 2018 14:23
@nolash
Copy link
Contributor Author

nolash commented Jan 23, 2018

@lmars can we merge?

@lmars
Copy link

lmars commented Jan 23, 2018

@nolash form a cursory glance, there are still a few comments which haven't been addressed, is there a rush to get this merged? If there is, feel free to merge but the comments will need to be addressed before merging upstream.

@nolash
Copy link
Contributor Author

nolash commented Jan 23, 2018

I thought I did them all. Which ones are you thinking of? The comments still open on this page are addressed, I believe.

@lmars
Copy link

lmars commented Jan 23, 2018

I'll do another review...

@nolash
Copy link
Contributor Author

nolash commented Jan 23, 2018

Great @lmars

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going with a different approach?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we keeping this then?

keybytes := make([]byte, common.HashLength)
copy(keybytes, []byte{42})
srv.Hasher.Reset()
srv.Hasher.Write([]byte(fmt.Sprintf("%x", keybytes)))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it correct to be writing the hex representation of the key to the hasher, rather than the bytes themselves?

For example, I'd expect this to be:

srv.Hasher.Write(keybytes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's correct because we're getting it in hex from the HTTP server.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow. So you have keybytes which is the "key" of the resource. Why would I want to create a cryptographic hash of the hex representation of keybytes rather than just the hash of keybytes? It feels like a waste of time (the hex representation is twice as many bytes)?

Copy link
Contributor Author

@nolash nolash Jan 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think I understand the confusion. These first "keybytes" are merely the "name" of the resource. The resource name is always hashed, and the hash is the key of the root chunk. In the backend, as you see, there are no double hashes. But in the test we need to to make the data match.

If you think it should be clearer, I guess we could just use a predictable string instead as "name".

if !bytes.Equal(b, []byte(keybyteshash)) {
t.Fatalf("resource update hash mismatch, expected '%s' got '%s'", keybyteshash, b)
}
resp.Body.Close()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be deferred right after getting the response so that it is called even if there is an error.

log.Root().SetHandler(log.CallerFileHandler(log.LvlFilterHandler(log.LvlTrace, log.StreamHandler(os.Stderr, log.TerminalFormat(true)))))
}

// \TODO if create -> get -> update -> get, the last get with return 1.1 because 1.2 retrieve is still pending
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been addressed?

if !bytes.Equal(data, b) {
t.Fatalf("Expected body '%x', got '%x'", data, b)
}
resp.Body.Close()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be test coverage of the "version" part of the URL.

err := self.rpcClient.Call(&currentblock, "eth_blockNumber")
func (self *ResourceHandler) GetBlock() (uint64, error) {
ctx, cancel := context.WithCancel(self.ctx)
defer cancel()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't really make sense to create a context with a deferred cancel which will only be called when the API method being called with the context returns.

Clear redundant addition of BlockNumber in ethclient
@nolash
Copy link
Contributor Author

nolash commented Jan 23, 2018

Removing the cancel call results in go vet complaining:

swarm/storage/resource.go:628: the cancel function returned by context.WithCancel should be called, not discarded, to avoid a context leak

(#204 (comment))

@lmars
Copy link

lmars commented Jan 23, 2018

@nolash

Removing the cancel call results in go vet complaining:

Did you remove the context creation? The idea of a context is that it is passed in to API boundaries.

In this case it would seem like the ResourceXXX methods should take a context argument, and that should be passed through to the rpc client. This gives the calling user the ability to cancel everything if it so wishes (e.g. http.Request.Context for HTTP requests, which is cancelled if the client drops the TCP connection).

@nolash
Copy link
Contributor Author

nolash commented Jan 24, 2018

@lmars done

swarm/swarm.go Outdated

// set up high level api
// SET UP high level api
fmt.Fprintf(os.Stderr, "privkey %v\nswap %v\nswapendabled %v\n", self.privateKey, config.Swap, config.SwapEnabled)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not print the private key 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol let's not

Copy link

@lmars lmars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (please remove the private key debug log though)

@nolash nolash merged commit 85e2318 into swarm-network-rewrite Jan 25, 2018
@gbalint gbalint deleted the swarm-mutableresources-apinew branch May 25, 2018 14:48
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants