client-side mru signatures -- step 3/5 #734

jpeletier · 2018-06-19T22:26:54Z

Step 1: (PR client-side mru signatures -- step 1/5 #732) Minimum set of changes for client-side MRU signatures to work + system-time based MRUs. This was the status roughly 8 days ago.
Step 2: (PR client-side mru signatures -- step 2/5 #733) Documentation, refactors and new, more concrete unit tests for the new structures
Step 3: (PR client-side mru signatures -- step 3/5 #734) Refactor LookupParams so that the different combinations are documented. Moved LookupParams to its own file. Reorder code and create separate files for a few structures
Step 4: (PR client-side mru signatures -- step 4/5 #748) Final refactors. Refactor serialization for extensibility. Refactor Timestamp to a "Timestamp" struct rather than a lone uint64. More comments throughout.
Step 5: (PR Client-side MRU signing -- step 5/5 #704) Rewrite CLI and simplify client and internal object structure according to comments to decouple resource creation from updating.

zelig

I still have issues with the fields of meta and update, and with smarter serialisation, plus using a provable chunk hash on metadata.

zelig · 2018-06-22T01:13:42Z

cmd/swarm/main.go

raw is not the right word to use here. If multihash is the default, maybe inline is the better word?

As per @holisticode's suggestion directly in #704, I renamed it there to rawmru. I am fine with whatever name.

zelig · 2018-06-22T01:18:21Z

cmd/swarm/mru.go

unrecognised resource subcommand: %v

This has been refactored in #704 to make full use of the cli parser, so these messages are built-in.

zelig · 2018-06-22T01:23:02Z

swarm/api/api.go

comment needed.

please dont call the request mru since that shadows the package. req ?

Fixed. Renamed to request, since it is the variable name used in other parts.

note: fixes are made directly in PR #704.
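For readers unfamiliar with the shadowing problem raised above, here is a minimal sketch. It uses the standard `strings` package as a stand-in for the `mru` package, and both function names are hypothetical, chosen only for illustration.

```go
package main

import "strings"

// countShadowed takes a parameter named like the imported package.
// Inside this function the identifier "strings" refers to the slice,
// so a call such as strings.Join(...) would not compile here.
func countShadowed(strings []string) int {
	return len(strings)
}

// joinParts uses a neutral parameter name, so the package stays reachable.
func joinParts(parts []string) string {
	return strings.Join(parts, ",")
}
```

Renaming the argument (to `request` or `req`, as suggested) keeps the package identifier usable inside the function body.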

zelig · 2018-06-22T01:37:07Z

swarm/api/api.go

comment does not match function name

Thanks. Fixed in #704.

zelig · 2018-06-22T01:37:44Z

swarm/api/api.go

arg name should not be mru

Fixed in #704.

zelig · 2018-06-22T02:37:23Z

swarm/storage/mru/doc.go

this should just be a normal swarm chunk with content address validation.
no need for initial byte differentiator. also dont see the use of length

It is content-addressed, but not Swarm-type. Remember the discussion about verifying resource ownership and the concept of metaHash. rootAddr = H(ownerAddr, H(metadata))

Again, including a header length field is always a good practice when encoding binary data, for extensibility and error checking.

The zero-byte differentiator helps discriminate earlier at the Validator function, but can be removed.

Let's take the encoding discussion to #704, where it can be clearly seen in the refactored code.
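A rough sketch of the rootAddr = H(ownerAddr, H(metadata)) relation mentioned above. Assumptions: SHA-256 stands in for H only to keep the example standard-library-only, the inputs are concatenated in the order written in the formula, and the function names are hypothetical. The real code may differ on all three points.

```go
package main

import (
	"bytes"
	"crypto/sha256"
)

// hashOf is a stand-in for H. SHA-256 is an assumption made for this
// sketch; it is not claimed to be the hash the package actually uses.
func hashOf(parts ...[]byte) []byte {
	h := sha256.New()
	for _, p := range parts {
		h.Write(p)
	}
	return h.Sum(nil)
}

// rootAddr computes H(ownerAddr, H(metadata)).
func rootAddr(ownerAddr, metadata []byte) []byte {
	metaHash := hashOf(metadata)
	return hashOf(ownerAddr, metaHash)
}

// verifyOwnership recomputes the root address and compares it with the
// address under which the metadata chunk was found.
func verifyOwnership(addr, ownerAddr, metadata []byte) bool {
	return bytes.Equal(addr, rootAddr(ownerAddr, metadata))
}
```

Because the owner address is mixed into the hash, the same metadata published by a different owner lands on a different address, which is why the chunk is content-addressed without being a plain Swarm chunk.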

zelig · 2018-06-22T02:46:52Z

swarm/storage/mru/request.go

Why use accessors when you can have fields?

Direct fields give the wrong message to the package user: that they can change the value any time.

A Request object can be signed, among other things, and altering its contents once it has been signed would make the signature incorrect and therefore the object would be inconsistent. This is why I am preventing direct write access to these fields.
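To illustrate the argument, here is a simplified, hypothetical Request that is not the real struct. The fields are unexported, reads go through an accessor that returns a copy, and the only write path clears the signature.

```go
package main

// Request is a simplified stand-in for a signable object.
// All names here are assumptions made for illustration.
type Request struct {
	data      []byte
	signature []byte
}

// NewRequest copies the payload so the caller's slice is not shared.
func NewRequest(data []byte) *Request {
	return &Request{data: append([]byte(nil), data...)}
}

// Data returns a copy, so callers cannot alter signed content in place.
func (r *Request) Data() []byte {
	return append([]byte(nil), r.data...)
}

// SetData replaces the payload and drops any existing signature,
// keeping the object consistent.
func (r *Request) SetData(data []byte) {
	r.data = append([]byte(nil), data...)
	r.signature = nil
}

// Sign stores whatever the supplied signing function returns for the payload.
func (r *Request) Sign(sign func([]byte) []byte) {
	r.signature = sign(r.data)
}

// IsSigned reports whether a signature is currently attached.
func (r *Request) IsSigned() bool {
	return r.signature != nil
}
```

With exported fields, nothing would stop a caller from changing the data after Sign and leaving a stale signature behind.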

zelig · 2018-06-22T02:48:47Z

swarm/storage/mru/request.go

len(j.Data) > 0

why is this better?

zelig · 2018-06-22T02:49:13Z

swarm/storage/mru/request.go

len(j.RootAddr) > 0

zelig · 2018-06-22T02:55:55Z

swarm/storage/mru/update.go

Signer address does not match owner address

Thanks. Fixed in #704

i reviewed the wrong PR first

nolash

The big question for me here is, again, whether the code blocks that have been moved around also contain actual changes? There is no way of easily determining if they do.

nolash · 2018-06-27T09:24:20Z

swarm/storage/mru/request_test.go

This seems pretty complex. Is there an advantage to this complexity?

This is so the update header serializer is clearly separated from the data, so adding new information to the header is very clear.

It is true that the initialization of low-level tests looks more complicated, but at least you know from reading the code what each thing is.

It is true that the initialization of low-level tests looks more complicated,

Right, I missed that this is in the test file. It's fine, then.

nolash · 2018-06-27T09:30:10Z

swarm/storage/mru/resource_test.go

not blocks anymore no?

Yes, this is scrubbed off later in #704

nolash · 2018-06-27T09:31:25Z

swarm/storage/mru/resource_test.go

Yes, this is scrubbed off later in #704

nolash · 2018-06-27T09:35:36Z

swarm/storage/mru/resource_test.go

It seems strange to me to have this as a method of SignedResourceUpdate. The description of which being:

// SignedResourceUpdate contains signature information about a resource update

I am changing the description of SignedResourceUpdate to "SignedResourceUpdate represents a resource update with all the necessary information to prove ownership of the resource"

nolash · 2018-06-27T09:36:45Z

swarm/storage/mru/resource_test.go

Same here. Generally the regime used in our codebase goes more towards creating param structs passed to methods. Here it's the other way around. What is the advantage of breaking with our convention?

This is not what I have seen in Marshallers and Unmarshallers, where you call .Marshall() on a struct and get a byte array, or instantiate a struct and call .Unmarshall(byte array) to populate it. This pattern appears, for example, in resource.go for the resource struct (MarshallBinary, UnmarshallBinary), and also in general Go libraries, for example big.Int.

newChunk is a serializer that takes the metadata and gives you a serialized chunk, so my reasoning is it is a serializer, so it would make sense to construct it this way.

Another reason is that it lets me logically group the newChunk() method with the resourceMetadata struct. Otherwise it would be kind of an orphan.

In any case, please note that in the tests we always hack around the internals of a package to test certain things, so I understand the construct looks verbose or out of place. But look at how clean metadata.newChunk() looks where it is used in real code. :-)

You're right that it's only for the more complex objects that we use the params setup.

If these are indeed (binary) Marshal and Unmarshal operations, they should implement those interfaces. In any case I think we should avoid arbitrary naming of serializers; it easily gets confusing. I've had this discussion with @janos before, and I think even we had a roundtable partially dedicated to the topic.

I'm sure there are examples in the code (maybe even especially in mine) that suggest otherwise, but let's not use that as an excuse to perpetuate wrong-doings :)

In the last review, you will see that all serializers have homogenized names. It will be easy to change them to whatever naming convention.

In the case of newChunk, the exception would be that it does not return a byte array but a ready-to-use storage.Chunk.
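As a concrete reference for "they should implement those interfaces", here is a sketch of a type satisfying the standard `encoding.BinaryMarshaler` and `encoding.BinaryUnmarshaler` interfaces. The `updateHeader` type, its two fields, and the little-endian layout are assumptions for illustration, not the package's actual wire format.

```go
package main

import (
	"encoding"
	"encoding/binary"
	"errors"
)

// updateHeader is a hypothetical struct used only for this example.
type updateHeader struct {
	Period  uint32
	Version uint32
}

// Compile-time checks that the standard interfaces are implemented.
var (
	_ encoding.BinaryMarshaler   = (*updateHeader)(nil)
	_ encoding.BinaryUnmarshaler = (*updateHeader)(nil)
)

// MarshalBinary encodes the header as two little-endian uint32 values.
func (h *updateHeader) MarshalBinary() ([]byte, error) {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint32(buf[0:4], h.Period)
	binary.LittleEndian.PutUint32(buf[4:8], h.Version)
	return buf, nil
}

// UnmarshalBinary populates the header from exactly 8 bytes.
func (h *updateHeader) UnmarshalBinary(data []byte) error {
	if len(data) != 8 {
		return errors.New("updateHeader: expected exactly 8 bytes")
	}
	h.Period = binary.LittleEndian.Uint32(data[0:4])
	h.Version = binary.LittleEndian.Uint32(data[4:8])
	return nil
}
```

A method such as newChunk, which returns a ready-to-use chunk rather than a byte slice, would not fit these interfaces and would keep its own name.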

nolash · 2018-06-28T05:49:13Z

@jpeletier did you see my leading comment?

The big question for me here is, again, whether the code blocks that have been moved around also contain actual changes? There is no way of easily determining if they do.

jpeletier · 2018-06-28T20:13:56Z

swarm/storage/mru/error.go

This block does not contain changes. It was moved off resource.go to collect all error-related code in one file

jpeletier · 2018-06-28T20:16:06Z

swarm/storage/mru/lookup.go

This block is new code, with the exception of the LookupParams structure definition which is moved off resource.go.

jpeletier · 2018-06-28T20:16:41Z

swarm/storage/mru/metadata.go

No new code, just moved here, so it is now a method of resourceMetadata

jpeletier · 2018-06-28T20:17:33Z

swarm/storage/mru/request.go

Moved to update.go. In #704 it is finally moved to resource_sign.go, as per @nolash's comments in the prior PR.

jpeletier · 2018-06-28T20:18:34Z

swarm/storage/mru/request.go

SignedResourceUpdate is moved to update.go. No new code.

jpeletier · 2018-06-28T20:19:28Z

swarm/storage/mru/request.go

Just moved this method higher in the file.

jpeletier · 2018-06-28T20:19:54Z

swarm/storage/mru/request.go

All SignedResourceUpdate methods just moved to update.go

jpeletier · 2018-06-28T20:27:08Z

swarm/storage/mru/request.go

This block moved to update.go. No changes.

jpeletier · 2018-06-28T20:28:03Z

swarm/storage/mru/resource.go

LookupParams struct definition moved to lookup.go, without changes.

jpeletier · 2018-06-28T20:38:39Z

swarm/storage/mru/resource.go

All error-related stuff moved to error.go.

jpeletier · 2018-06-28T21:30:35Z

swarm/storage/mru/resource.go

This is where the Handler moves to handler.go.
I made the mistake of not making a commit just for this, or maybe squashed at some point prior to one of the many rebases. Please accept my apologies.
The diff can be reviewed in detail here: http://www.mergely.com/mI2Z3XZA/

Summary of changes:

parseUpdate is now a method of SignedResourceUpdate, therefore refactor calls to it in Handler.

UpdateRequest is renamed to a simpler name Request, so update references to it in Handler.

Handler.newMetaChunk is refactored to a method of resourceMetadata, so update references to it in Handler and move that code to metadata.go

resourceData is renamed back to resourceUpdate, and resourceUpdate is divided between header and data, so update references to it in Handler.

Instantiations of LookupParams variables are refactored into LookupParams factories whose names indicate what a particular combination of params actually does. For example, Version=0, or Period=0, or a combination of the two, have special meanings, and the purpose of a particular initialization looked cryptic to me.

Handler.lookup is refactored to admit a LookupParams struct rather than the components of LookupParams separately.

resourceUpdateChunkAddr() is refactored to UpdateLookup.GetUpdateAddr(). The new UpdateLookup structure represents the three components of an update search key: period, version and rootAddr. Thus, UpdateLookup.GetUpdateAddr() hashes them together to produce the lookup key.

newUpdateChunk is refactored to SignedResourceUpdate.newUpdateChunk(), so move code to update.go and update references to it in Handler code.
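The LookupParams factory idea from the summary above could look roughly like this. The struct layout, the factory names, and the meaning given to zero values (0 read as "latest") are assumptions for illustration. The thread only says that zero values carry special meanings.

```go
package main

// LookupParams is a simplified, hypothetical version of the struct.
type LookupParams struct {
	RootAddr []byte
	Period   uint32
	Version  uint32
}

// LookupLatest asks for the most recent update overall.
// Assumption: Period=0 and Version=0 together mean "latest".
func LookupLatest(rootAddr []byte) *LookupParams {
	return &LookupParams{RootAddr: rootAddr}
}

// LookupLatestVersionInPeriod asks for the newest version within one period.
// Assumption: Version=0 means "latest version".
func LookupLatestVersionInPeriod(rootAddr []byte, period uint32) *LookupParams {
	return &LookupParams{RootAddr: rootAddr, Period: period}
}

// LookupVersion asks for one exact period/version pair.
func LookupVersion(rootAddr []byte, period, version uint32) *LookupParams {
	return &LookupParams{RootAddr: rootAddr, Period: period, Version: version}
}
```

A call site then reads as `LookupLatest(rootAddr)` instead of a struct literal with unexplained zeros.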

I've gone through this and it looks ok, but I must admit it's rather hard to keep the overview even when breaking up in the series of PRs. I guess we won't get around an extra pass of thorough scrutiny on the last one.

That's fine. In the end, it is the final status that matters, and that it is a net improvement to the codebase. Sorry for all the fuss. The next steps are much easier.

jpeletier · 2018-06-28T21:33:49Z

swarm/storage/mru/update.go

Not new code. Already commented on at the place the code was removed from.

jpeletier · 2018-06-28T21:35:20Z

swarm/storage/mru/resource.go

See note at the end of this large hunk removal to see what happened to it!

nolash · 2018-06-29T14:32:16Z

swarm/storage/mru/lookup.go

Should be New... since you're actually hashing something? It's not a "getter."

Well, it is a getter in my opinion since the result is idempotent, that is you will always get the same result for the same state of that struct.

Ok fair enough. The godoc comment is not correct though.

@jpeletier fix the godoc comment please, alternately confirm you fixed it in last PR.

Fixed in #704. Thanks.

I would just call it Addr to be honest. These redundant attributes in names which are semantically scoped are very non-golang.
here you clearly work on an update

ok. Renamed to Addr() in #704. Much shorter. Thanks.

nolash · 2018-06-29T14:33:41Z

swarm/storage/mru/lookup.go

Why again please are we double hashing this?

The NewResourceHash function name is misleading, and it was renamed later.
NewResourceHash does not hash anything. It is actually a serializer of (period, version, rootAddr). You can see the code right below.

... it inherited the name it had before, when it was part of resource.go. ;-)

NewResourceHash is later refactored as a serializer of the UpdateLookup structure.

Ah right, there's no sum.
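To make the serialize-then-hash split explicit, here is a sketch under stated assumptions. SHA-256 stands in for the real hash, the field widths and byte order are invented, and `serialize` is a hypothetical name. `Addr` is the name agreed on earlier in this thread.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
)

// UpdateLookup holds the three components of an update search key.
type UpdateLookup struct {
	period   uint32
	version  uint32
	rootAddr []byte
}

// serialize lays out period, version and rootAddr back to back.
// It performs no hashing.
func (u *UpdateLookup) serialize() []byte {
	buf := make([]byte, 8, 8+len(u.rootAddr))
	binary.LittleEndian.PutUint32(buf[0:4], u.period)
	binary.LittleEndian.PutUint32(buf[4:8], u.version)
	return append(buf, u.rootAddr...)
}

// Addr hashes the serialized key exactly once. The result depends only
// on the struct's state, so repeated calls return the same value.
func (u *UpdateLookup) Addr() []byte {
	sum := sha256.Sum256(u.serialize())
	return sum[:]
}
```

Seen this way there is a single hash over the serialized key, not a double hash.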

nolash · 2018-06-29T14:47:30Z

swarm/storage/mru/handler.go

This thing makes me uncomfortable. But let's save the discussion for the last commit.

Yes, I know it looks a bit ugly, but you will see why it makes sense to break it down and the reusability and clarity it provides when it is time to make a change.

In any case, this I believe was finally rewritten as:

rsrc := new(resource)
rsrc.rootAddr = request.rootAddr
rsrc.resourceMetadata = request.metadata
rsrc.updated = time.Now()

This said, I think we could potentially remove one level, but I also think we should revisit the resource struct, because after spending quite a bit of time on this code I think it might not make sense to cache anything other than the resource metadata.

zelig · 2018-07-02T22:21:20Z

cmd/swarm/main.go

can we call this something else please? this is not 'raw', call it inline or maybe best to use a BoolTFlag called multihash which defaults to true. and you can specify -multihash=false if you inline the content

Ok, I went for this syntax:

swarm resource create <frequency> [--name <name>] [--data <0x Hexdata> [--multihash=false]]

the renamed --multihash flag defaults to true.
This is fixed directly on #704
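A small sketch of a boolean flag that defaults to true. It uses the standard library `flag` package as a stand-in for the BoolTFlag mentioned above, so it is not the actual CLI code, and `parseMultihash` is a hypothetical helper name.

```go
package main

import (
	"flag"
	"io"
)

// parseMultihash parses the arguments of a hypothetical "create"
// subcommand and returns the value of --multihash, which defaults to true.
func parseMultihash(args []string) (bool, error) {
	fs := flag.NewFlagSet("create", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch
	multihash := fs.Bool("multihash", true,
		"treat --data as a multihash; pass --multihash=false for inline content")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *multihash, nil
}
```

With the standard `flag` package, a boolean flag must be turned off with the `--multihash=false` form. A space-separated `--multihash false` is not read as a value for the flag.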

zelig · 2018-07-02T22:22:37Z

swarm/api/api.go

can we avoid using variable names which shadow packages?

Yes. This was already fixed in #704.

zelig · 2018-07-02T22:24:43Z

swarm/api/client/client.go

too long variable names can be annoying too in a context where there is a clear semantic type and you just pass an argument to it

Fixed. Renamed updateRequest to request in #704.

zelig · 2018-07-02T22:25:05Z

swarm/api/client/client.go

here the log variable name is very nice

zelig · 2018-07-02T22:29:51Z

swarm/storage/mru/handler.go

why do we need to create these in the pool? it creates them on the fly as needed no?

It gives markedly better performance. We talked about this a few times already?

no i mean ok to use the pool but why do you need to prepopulate here i dont see that

Ah. The way I read it is that you instantiate the number of workers you expect to need. When this number is exceeded, the pool makes temporary new ones, and gets rid of them when not needed. But maybe I've understood it wrong. Maybe I could post a question to stackoverflow unless you guys know differently for certain?
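On the prepopulation question, and assuming the pool in question is a `sync.Pool`: Get calls the pool's New function when nothing is available, and the documentation states that stored items may be removed at any time without notification. A minimal sketch, with SHA-256 standing in for the real hasher and hypothetical function names:

```go
package main

import (
	"crypto/sha256"
	"hash"
	"sync"
)

// newHasherPool returns a pool that creates hashers on demand.
// No Put calls are made up front.
func newHasherPool() *sync.Pool {
	return &sync.Pool{
		New: func() interface{} { return sha256.New() },
	}
}

// sum borrows a hasher, uses it, and hands it back for reuse.
func sum(pool *sync.Pool, data []byte) []byte {
	h := pool.Get().(hash.Hash)
	defer pool.Put(h)
	h.Reset()
	h.Write(data)
	return h.Sum(nil)
}
```

Under that assumption the pool works without any initial Put loop. Whether prepopulating gives a measurable benefit in the handler would need a benchmark.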

zelig · 2018-07-02T22:53:23Z

swarm/storage/mru/request.go

just Owner?

This struct is serialized to JSON and we lose type information, so I'd like the JSON user to know it is a 0x address they have to put in that string.

I can rename OwnerAddr to just Owner, but in JSON it will be ownerAddr.
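The rename described above, with the Go field called Owner and the JSON key kept as ownerAddr, can be expressed with a struct tag. The `requestJSON` type and `encodeOwner` helper are hypothetical and exist only to show the tag.

```go
package main

import "encoding/json"

// requestJSON is a hypothetical struct showing only the owner field.
type requestJSON struct {
	// The Go name is short; the JSON key still tells the user that a
	// 0x-prefixed address string is expected.
	Owner string `json:"ownerAddr"`
}

// encodeOwner marshals a request carrying only the owner address.
func encodeOwner(owner string) (string, error) {
	b, err := json.Marshal(requestJSON{Owner: owner})
	return string(b), err
}
```

For example, `encodeOwner("0xabc")` yields `{"ownerAddr":"0xabc"}`.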

zelig · 2018-07-02T22:57:39Z

swarm/storage/mru/update.go

1.- helps quickly distinguish between metadata chunks / update chunks. Those two bytes are zero in metadata chunks.
2.- it is a good practice to include length of what you are sending, for integrity and extensibility. If this length does not match the content, the signature won't even be checked and we drop the packet earlier.

Let's take the serialization debate to the final PR, where you will see a separate serializer in each struct, making it easier to visualize and review
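The two points above (a leading length that is zero in metadata chunks, and an early drop when the length does not match) can be sketched as follows. The two-byte little-endian header, the function names, and the error values are assumptions for illustration, not the actual chunk layout.

```go
package main

import (
	"encoding/binary"
	"errors"
)

const headerLen = 2 // assumed size of the length prefix

var (
	errMetadataChunk  = errors.New("zero length prefix: metadata chunk, not an update")
	errLengthMismatch = errors.New("declared length does not match content")
)

// encodeUpdate prefixes the payload with its length.
func encodeUpdate(payload []byte) ([]byte, error) {
	if len(payload) == 0 || len(payload) > 0xFFFF {
		return nil, errors.New("payload must be 1..65535 bytes")
	}
	buf := make([]byte, headerLen+len(payload))
	binary.LittleEndian.PutUint16(buf, uint16(len(payload)))
	copy(buf[headerLen:], payload)
	return buf, nil
}

// decodeUpdate rejects malformed chunks before any signature check
// would need to run.
func decodeUpdate(chunk []byte) ([]byte, error) {
	if len(chunk) < headerLen {
		return nil, errLengthMismatch
	}
	n := int(binary.LittleEndian.Uint16(chunk))
	if n == 0 {
		return nil, errMetadataChunk
	}
	if n != len(chunk)-headerLen {
		return nil, errLengthMismatch
	}
	return chunk[headerLen:], nil
}
```

The length check is cheap, so a truncated or padded chunk is rejected without spending time on signature recovery.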

zelig · 2018-07-02T22:59:55Z

swarm/storage/mru/update.go

ToChunk or just Chunk

ok. Renamed in #704

zelig · 2018-07-03T05:35:14Z

swarm/storage/mru/update.go

fromChunk

ok. Renamed in #704

zelig · 2018-07-03T05:38:11Z

swarm/storage/mru/update.go

getOwner
verifyOwner

ok. Renamed in #704

This was referenced Jun 19, 2018

client-side mru signatures -- step 2/5 #733

Merged

client-side mru signatures -- step 1/5 #732

Merged

jpeletier force-pushed the client-side-mru-step-3 branch 2 times, most recently from 43db192 to a12ea61 Compare June 20, 2018 23:38

zelig added in progress pending-other-merge feeds labels Jun 22, 2018

zelig assigned jpeletier Jun 22, 2018

zelig previously requested changes Jun 22, 2018

jpeletier changed the base branch from mru-publickey-in-key to mru-tmp-1 June 22, 2018 13:51

nolash force-pushed the mru-tmp-1 branch from cca2242 to d0e18ba Compare June 25, 2018 10:38

jpeletier force-pushed the client-side-mru-step-3 branch from a12ea61 to 0da7b2d Compare June 25, 2018 10:42

jpeletier requested review from lmars and nolash as code owners June 25, 2018 10:42

jpeletier mentioned this pull request Jun 26, 2018

client-side mru signatures -- step 4/5 #748

Merged

jpeletier changed the title ~~client-side mru signatures -- step 3/4~~ client-side mru signatures -- step 3/5 Jun 26, 2018

jpeletier mentioned this pull request Jun 26, 2018

Client-side MRU signing -- step 5/5 #704

Merged

jpeletier changed the base branch from mru-tmp-1 to mru-publickey-in-key June 27, 2018 08:32

jpeletier changed the base branch from mru-publickey-in-key to mru-tmp-1 June 27, 2018 08:32

nolash suggested changes Jun 27, 2018