Skip to content
This repository was archived by the owner on Aug 2, 2021. It is now read-only.

Conversation

@jpeletier
Copy link
Contributor

@jpeletier jpeletier commented Jun 19, 2018

zelig
zelig previously requested changes Jun 22, 2018
Copy link
Member

@zelig zelig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still got issues with the fields of meta and update, smarter serialisation, plus use provable chunk hash on metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raw is not the right word to use here. If multihash is the default, maybe inline is the better word?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per @holisticode's suggestion directly in #704, I renamed it there to rawmru. I am fine with whatever name.

cmd/swarm/mru.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrecognised resource subcommand: %v

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been refactored in #704 to make full use of the cli parser, so these messages are built-in.

swarm/api/api.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please dont call the request mru since that shadows the package. req ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Renamed to request, since it is the variable name used in other parts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: fixes are made directly in PR #704.

swarm/api/api.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment does not match function name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed in #704.

swarm/api/api.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arg name should not be mru

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #704.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should just be a normal swarm chunk with content address validation.
no need for initial byte differentiator. also dont see the use of length

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is content-addressed, but not Swarm-type. Remember the discussion about verifying resource ownership and the concept of metaHash. rootAddr = H(ownerAddr, H(metadata))

Again, including a header length field is always a good practice when encoding binary data, for extensibility and error checking.

The zero-byte differentiatior helps discriminate earlier at the Validator function, but can be removed.

Let's take the encoding discussion to #704, where it can be clearly seen in the refactored code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use accessors when you can have fields?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Direct fields give the wrong message to the package user: that they can change the value any time.

A Request object can be signed, among other things, and altering its contents once it has been signed would make the signature incorrect and therefore the object would be inconsistent. This is why I am preventing direct write access to these fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

len(j.Data) > 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

len(j.RootAddr) > 0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signer address does not match owner address

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed in #704

@zelig zelig dismissed their stale review June 22, 2018 03:03

i reviewed the wrong PR first

@jpeletier jpeletier changed the base branch from mru-publickey-in-key to mru-tmp-1 June 22, 2018 13:51
@jpeletier jpeletier force-pushed the client-side-mru-step-3 branch from a12ea61 to 0da7b2d Compare June 25, 2018 10:42
@jpeletier jpeletier requested review from lmars and nolash as code owners June 25, 2018 10:42
@jpeletier jpeletier changed the title client-side mru signatures -- step 3/4 client-side mru signatures -- step 3/5 Jun 26, 2018
@jpeletier jpeletier changed the base branch from mru-tmp-1 to mru-publickey-in-key June 27, 2018 08:32
@jpeletier jpeletier changed the base branch from mru-publickey-in-key to mru-tmp-1 June 27, 2018 08:32
Copy link
Contributor

@nolash nolash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The big question for me here is, again, whether the code blocks that have been moved around also contain actual changes? There is no way of easily determining if they do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty complex. Is there an advantage to this complexity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so the update header serializer is clearly separated from the data, so adding new information to the header is very clear.

It is true that the initialization of low-level tests look more complicated, but at least you know from reading the code what is each thing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is true that the initialization of low-level tests look more complicated,

Right, I missed that this is in the test file. It's fine, then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not blocks anymore no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is scrubbed off later in #704

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is scrubbed off later in #704

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange to me to have this as a method of SignedResourceUpdate. The description of which being:

// SignedResourceUpdate contains signature information about a resource update

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am changing the description of SignedResourceUpdate to "SignedResourceUpdate represents a resource update with all the necessary information to prove ownership of the resource"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Generally the regime used in our codebase go more towards creating param structs passed to methods. Here it's the other way around. What is the advantage of breaking with our convention?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not what I have seen in Marshallers and Unmarshallers, whereupon you call .Marshall() of a struct and get a byte array, or instantiate a struct and call .Unmarshall(byte array) to populate it. This pattern is for example in resource.go for the resource struct (MarshallBinary, UnmarshallBinary). Also in general go libraries, for example big.Int.

newChunk is a serializer that takes the metadata and gives you a serialized chunk, so my reasoning is it is a serializer, so it would make sense to construct it this way.

Another reasoning is that it allows me to group logically the newChunk() method as part of the resourceMetadata struct. Otherwise it would be kind of orphan.

In any case, please note that in the tests, we always hack around the internals of a package to test certain things, so I understand the construct looks verbose or out of place. But look at how clean it looks metadata.newChunk() where it is used in real code. :-)

Copy link
Contributor

@nolash nolash Jun 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right in that it's only the more complex objects that we use the params setup.

If these are indeed (binary) Marshal and Unmarshal operations, they should implement those interfaces. In any case I think we should avoid arbitrary naming of serializers; it easily gets confusing. I've had this discussion with @janos before, and I think even we had a roundtable partially dedicated to the topic.

I'm sure there are examples in the code (maybe even especially in mine) that suggest otherwise, but let's not use that as an excuse to perpetuate wrong-doings :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the last review, you will see that all serializers have a homogenized names. It will be easy to change them to whatever naming convention.

In the case of newChunk the exception would be that it does not return a byte array, but a ready to use storage.Chunk

@nolash
Copy link
Contributor

nolash commented Jun 28, 2018

@jpeletier did you see my leading comment?

The big question for me here is, again, whether the code blocks that have been moved around also contain actual changes? There is no way of easily determining if they do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block does not contain changes. It was moved off resource.go to collect all error-related code in one file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block is new code, with the exception of the LookupParams structure definition which is moved off resource.go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new code, just moved here, so it is now a method of resourceMetadata

Copy link
Contributor Author

@jpeletier jpeletier Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to update.go. In #704 is finally moved to resource_sign.go, as per @nolash's comments in the prior PR.

Copy link
Contributor Author

@jpeletier jpeletier Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SignedResourceUpdate is moved to update.go, No new code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just moved this method higher in the file.

Copy link
Contributor Author

@jpeletier jpeletier Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All SignedResourceUpdate methods just moved to update.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block moved to update.go. No changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LookupParams struct definition moved to lookup.go, without changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All error-related stuff moved to error.go.

Copy link
Contributor Author

@jpeletier jpeletier Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where the Handler moves to handler.go.
I made the mistake of not making a commit just for this, or maybe squashed at some point prior to one of the many rebases. Please accept my apologies.
The diff can be reviewed in detail here: http://www.mergely.com/mI2Z3XZA/

Summary of changes:

  • parseUpdate is now a method of SignedResourceUpdate, therefore refactor calls to it in Handler.
  • UpdateRequest is renamed to a simpler name Request, so update references to it in Handler.
  • Handler.newMetaChunk is refactored to a method of resourceMetadata, so update references to it in Handler and move that code to metadata.go
  • resourceData is renamed back to resourceUpdate, and resourceUpdate is divided between header and data, so update references to it in Handler.
  • Instantiation of LookupParams variables are refactored to LookupParams factories whose name indicate what a particular combination of params actually does: For example, Version=0, or Period=0, or a combination, have special meanings and looked cryptic to me what the purpose of a particular initialization was.
  • Handler.lookup is refactored to admit a LookupParams struct rather than the components of LookupParams separately.
  • resourceUpdateChunkAddr() is refactored to UpdateLookup.GetUpdateAddr(). The new UpdateLookup structure represents the three components of an update search key: period, version and rootAddr. Thus, UpdateLookup.GetUpdateAddr() hashes them together to produce the lookup key.
    newUpdateChunk is refactored to SignedResourceUpdate.newUpdateChunk(), so move code to update.go and update references to it in Handler code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone through this and it looks ok, but I must admit it's rather hard to keep the overview even when breaking up in the series of PRs. I guess we won't get around an extra pass of thorough scrutiny on the last one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine. At the end, it is the final status what matters and that it is a net improvement to the codebase. Sorry for all the fuss. The next steps are much easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not new code. Already commented in the place where the code was removed out of.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See note at the end of this large hunk removal to see what happened to it!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be New... since you're actually hashing something? It's not a "getter."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it is a getter in my opinion since the result is idempotent, that is you will always get the same result for the same state of that struct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok fair enough. The godoc comment is not correct though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpeletier fix the godoc comment please, alternately confirm you fixed it in last PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #704. Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just call it Addr to be honest. These redundant atttibutes in names which are semantically scoped are very non-golang.
here you clearly work on an update
e

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. Renamed to Addr() in #704. Much shorter. Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why again please are we double hashing this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewResourceHash function name is misleading and its name was refactored later.
NewResourceHash does not hash anything. It is actually a serializer of (period,version, rootAddr). You can see the code right below.

... it inherited the name it had before, when it was part of resource.go. ;-)

NewResourceHash is later refactored as a serializer of the UpdateLookup structure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, there's no sum.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thing makes me uncomfortable. But let's save the discussion for the last commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know it looks a bit ugly, but you will see why it makes sense to break it down and the reusability and clarity it provides when it is time to make a change.

In any case, this I believe was finally rewritten as:

rsrc:= new(resource)
rsrc.rootAddr = request.rootAddr
rsrc.resourceMetadata = request.metadata
rsrc.updated = time.Now()

This said, I think we could potentially remove one level, but I also think we should revisit the resource struct, because after spending quite a bit of time on this code I think it might not make sense to cache anything other than the resource metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we call this something else please? this is not 'raw', call it inline or maybe best to use a BoolTFlag called multihash which defaults to true. and you can specify -multihash=false if you inline the content

Copy link
Contributor Author

@jpeletier jpeletier Jul 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I went for this syntax:

swarm resource create <frequency> [--name <name>] [--data <0x Hexdata> [--multihash=false]]

the renamed --multihash flag defaults to true.
This is fixed directly on #704

swarm/api/api.go Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we avoid using variable names which shadow packages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This was already fixed in #704.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too long variable names can be annoying too in a context where there is a clear semantic type and you just pass an agrument to it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. renamed updateRequest to request in #704.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here the log variable name is very nice

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to create these in the pool? it creates them on the fly as needed no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It gives markedly better performance. We talked about this a few times already?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no i mean ok to use the pool but why do you need to prepopulate here i dont see that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. The way I read it is that you instantiate the number of workers you expect to need. When this number is exceeded, the pool makes temporary new ones, and gets rid of them when not needed. But maybe I've understood it wrong. Maybe I could post a question to stackoverflow unless you guys know differently for certain?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just Owner?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This struct is serialized to JSON and we lose type information, so I'd like the JSON user to know it is a 0x address they have to put in that string.

I can rename OwnerAddr to just Owner, but in JSON it will be ownerAddr.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for what?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.- helps quickly distinguish between metadata chunks / update chunks. Those two bytes are zero in metadata chunks.
2.- it is a good practice to include length of what you are sending, for integrity and extensibility. If this length does not match the content, the signature won't even be checked and we drop the packet earlier.

Let's take the serialization debate to the final PR, where you will see a separate serializer in each struct, making it easier to visualize and review

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToChunk or just Chunk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. Renamed in #704

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fromChunk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. Renamed in #704

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOwner
verifyOwner

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. Renamed in #704

@nolash nolash merged this pull request into ethersphere:mru-tmp-1 Jul 4, 2018
@jpeletier jpeletier deleted the client-side-mru-step-3 branch July 22, 2018 08:00
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants