Skip to content

Conversation

runcom
Copy link
Member

@runcom runcom commented Jun 16, 2016

Signed-off-by: Antonio Murdaca [email protected]

@wking
Copy link
Contributor

wking commented Jun 16, 2016

On Thu, Jun 16, 2016 at 02:26:05PM -0700, Antonio Murdaca wrote:

  • descriptor: size int64

I'm meh about the change ;), but for consistency you should update 1
to $ref 2.

And maybe uint64? Why keep the sign?

@runcom
Copy link
Member Author

runcom commented Jun 16, 2016

And maybe uint64? Why keep the sign?

because docker/distribution uses int64 - and I didn't want to move that far there. Obviously uint64 would be better as long as sizes aren't negative :)

but for consistency you should update [1]
to $ref [2].

done

@wking
Copy link
Contributor

wking commented Jun 16, 2016

On Thu, Jun 16, 2016 at 02:40:29PM -0700, Antonio Murdaca wrote:

And maybe uint64? Why keep the sign?

because docker/distribution uses int64 - and I didn't want to move
that far there. Obviously uint64 would be better as long as sizes
aren't negative :)

I think that if we're going to get picky about how many bits we're
using, we should at least get the signedness right ;).

but for consistency you should update 1
to $ref [2].

done

int64 isn't a native type, you have to $ref it like 1.

@runcom
Copy link
Member Author

runcom commented Jun 16, 2016

I personally like being picky about types because it wasn't obvious to me if the image-spec defines an int but in reality it's at least an int64 and since it's a spec I expect this pickyness (and since I defined types in Go as well.). I also think we have an int64 defined for some reason.....

BTW I'll leave this decision to the maintainers and I'll either close or update this PR

@philips
Copy link
Contributor

philips commented Jun 16, 2016

Are int64's represented by json numbers? A reference to the docker distribution value would be helpful for tracking.

@runcom
Copy link
Member Author

runcom commented Jun 16, 2016

@wking
Copy link
Contributor

wking commented Jun 16, 2016

On Thu, Jun 16, 2016 at 02:52:36PM -0700, Antonio Murdaca wrote:

I personally like being picky about types because it wasn't obvious
to me if the image-spec defines an int…

I'd been interpreting this like JSON which does not specify a maximum
size. “At least 64 bits of integer” would be a reasonable requirement
(like C's short, int, long, …), but int64 sets a maximum 1. I
don't see much use in that, because if you have an object that is
bigger (unlikely, but still), you probably want the right to take that
risk and set:

"size": 9223372036854776001

Signedness, on the other hand, is something that we should be able to
form a stable agreement on. What would negative sizes mean?

@runcom
Copy link
Member Author

runcom commented Jun 17, 2016

Signed-off-by: Antonio Murdaca <[email protected]>
@stevvooe
Copy link
Contributor

stevvooe commented Jul 6, 2016

I think that if we're going to get picky about how many bits we're
using, we should at least get the signedness right ;).

signedness shouldn't be used for data validation. And a negative size is also a very clear way of saying the size is unknown, albeit inadvisable.

I'm not really sure if we need to be picky, other than that the parser should error out if the value overflows the field on whatever platform is consuming the data.

@wking
Copy link
Contributor

wking commented Jul 6, 2016

On Wed, Jul 06, 2016 at 12:06:35PM -0700, Stephen Day wrote:

signedness shouldn't be used for data validation. And a negative
size is also a very clear way of saying the size is unknown, albeit
inadvisable.

The spec REQUIRES size 1, so I think:

  • We can (and should) consider it during data validation.
  • We should not allow negative values. If we want “I don't know the
    size” to be a legal statement, we should instead make ‘size’
    optional (and use a *uint in Go).

I'm not really sure if we need to be picky, other than that the
parser should error out if the value overflows the field on whatever
platform is consuming the data.

Agreed, and that^ sounds like a JSON-parser issue to me, and not
something we need to specify in the spec. RFC 7159 has a paragraph on
this here [2](“This specification allows implementations to set
limits on the range and precision of numbers accepted…”), although it
doesn't explicitly require the JSON implementation to error out on
overflows.

@stevvooe
Copy link
Contributor

stevvooe commented Jul 6, 2016

we should instead make ‘size’
optional (and use a *uint in Go).

I'm sorry, but using unsigned integers for data validation is bad practice. It makes underflow detection challenging and complicates offset declaration when working with binary storage formats. int64 is used throughout Go's APIs and they made a good decision. We can validate that this integer is positive, but requiring an unsigned integer in the implementation is just silly.

In general, this is a good addition. Throughout "the stack", 64-bit integers are sufficient for declaring offsets and size for large binary objects. The only area where I see this being a problem is with javascript, where it doesn't have a 64-bit integer, so making a requirement to support at least 64 bits, is problematic.

@runcom Was this PR a matter of hygiene or did you encounter a specific problem?

@wking
Copy link
Contributor

wking commented Jul 6, 2016

On Wed, Jul 06, 2016 at 02:14:45PM -0700, Stephen Day wrote:

we should instead make ‘size’ optional (and use a *uint in Go).

I'm sorry, but using unsigned integers for data validation is bad
practice. It makes underflow detection challenging and complicates
offset declaration when working with binary storage formats.

Isn't underflow just a floating point thing 1? When I try to
json.Unmarshal a negative integer into a uint64 or *uint64 type, I
get:

json: cannot unmarshal number -1 into Go value of type uint64

The same is true of uint and *uint, although the error message ends
with ‘uint’.

I don't see how a signed size helps with offset declaration. Do you
use negative offsets? Can you point me to an example where using an
unsigned size would have complicated things?

int64 is used throughout Go's APIs and they made a good
decision. We can validate that this integer is positive, but
requiring an unsigned integer in the implementation is just silly.

I'm not sure why Go would choose to use a signed integer for a
parameter that can never be negative. That seems like it's just
asking for manual type checking that I'd rather leave to the compiler.

@stevvooe
Copy link
Contributor

stevvooe commented Jul 6, 2016

When I said "underflow", I mean, negative numbers. Let's say we have to ask the question "Can it fit?". space - size >= 0 answers that question when using signed integers.

The use case of negative binary offsets is to declare blocks relative to some known point. Let's say we add binary blob storage metadata to the end of a file. We can say that the block is -blocksize bytes from the end of the file and the end of the blob is at -n*blocksize, where n is the number of metadata blocks.

I'm not sure why Go would choose to use a signed integer for a
parameter that can never be negative. That seems like it's just
asking for manual type checking that I'd rather leave to the compiler.

The point of an unsigned integer is not to provide a number that can't go negative. The point is to define a field with specific overflow behavior. When working with sizes and offsets, we ignore the behavior of that field and operate as if they are integers, which is true when operating within the bounds of the field. When doing arithmetic operations on size, we want to believe it is an integer, not a field with overflow.

This is all beside the point: the question being asked here is, should we validate the size as positive (Yes, probably) and should we have a minimum range requirement on the field value for size?

@wking
Copy link
Contributor

wking commented Jul 6, 2016

On Wed, Jul 06, 2016 at 02:53:27PM -0700, Stephen Day wrote:

space - size >= 0 answers that question when using signed
integers.

And ‘space >= size’ answers it for all integers, signed or not.

The point of an unsigned integer is not to provide a number that
can't go negative. The point is to define a field with specific
overflow behavior.

And also to give you an extra bit's worth of positive-number space.
An int64 only gives you 63 bits of positive-number space.

This is all beside the point: the question being asked here is,
should we validate the size as positive (Yes, probably)…

I agree here.

… and should we have a minimum range requirement on the field value
for size?

And I'd rather punt to JSON here 1.

@philips
Copy link
Contributor

philips commented Jul 28, 2016

I am fine with int64 and saying it MUST be positive.

@wking
Copy link
Contributor

wking commented Jul 28, 2016

On Wed, Jul 27, 2016 at 05:05:13PM -0700, Brandon Philips wrote:

I am fine with int64 and saying it MUST be positive.

And putting a cap on the value in the JSON Schema, or would you leave
that off 1?

@philips
Copy link
Contributor

philips commented Aug 3, 2016

@wking If we can express greater than 0 and less than 9223372036854775807 (9223 Petabytes) then I think that would be great.

Hopefully this won't be my 640k is enough for anyone moment :-P But I think 9,223 Petabytes is enough for anyone. :)

@philips
Copy link
Contributor

philips commented Aug 3, 2016

LGTM

We can followup with more code if necessary. But, this is a good fix as-is.

Approved with PullApprove

@wking
Copy link
Contributor

wking commented Aug 3, 2016

On Wed, Aug 03, 2016 at 10:50:10AM -0700, Brandon Philips wrote:

Hopefully this won't be my 640k is enough for anyone moment :-P But
I think 9,223 Petabytes is enough for anyone. :)

I don't see a reason to specify a max cap, but I agree that I'm
unlikely to bump into this one myself ;).

@stevvooe
Copy link
Contributor

stevvooe commented Aug 3, 2016

9PB layer -> maybe break up your image?

LGTM

Approved with PullApprove

@vbatts
Copy link
Member

vbatts commented Aug 3, 2016

LGTM

Approved with PullApprove

@vbatts vbatts merged commit bda75f2 into opencontainers:master Aug 3, 2016
@runcom runcom deleted the size-int64 branch August 3, 2016 20:14
@cgwalters
Copy link

And a negative size is also a very clear way of saying the size is unknown, albeit inadvisable.

If we wanted to support unknown size wouldn't we just make the size field optional? Why express that via negative numbers?

Anyways, 9 years later, I would like to know: is -1 valid in the size field or not? The spec doesn't say. We're getting this from some registries apparently (xref bootc-dev/bootc#1567 )

I introduced this problem apparently with youki-dev/oci-spec-rs#203 (my bad for not digging through the history of this spec).

But I don't want to just change it back to int64 without having some guidance for why I'm doing that. (And if I were to do so...because in Rust we have Option<T> I would definitely make it Option<u64> and not int64)

@cyphar
Copy link
Member

cyphar commented Sep 4, 2025

In umoci we support -1 as a value for unknown sizes (I think I added this because of this discussion, but it was ~10 years ago...). I now think this is generally a misfeature -- the main reason why we have a size field along with a hash to avoid DoS attacks where an attacker drip-feeds you data forever and locks up a client because they don't know how long the stream should be. With a size field you can kill the connection once you've read enough bytes.

@sudo-bmitch
Copy link
Contributor

In my own code, a -1 would always fail the size check, and I wouldn't ever expect to see that in a descriptor. In what scenario would the digest be known, but not the size?

@cgwalters
Copy link

The point is to define a field with specific overflow behavior.

@stevvooe You are conflating the types with specific programming language semantics. For example in Rust, there's explicit saturating methods on integers.

(Actually this gets into a different issue with the spec that afaics "int64" is not defined anywhere - i.e. the mapping of that type to JSON though its definition is pretty obvious; anyways that sub-thread was mostly covered above)

And ‘space >= size’ answers it for all integers, signed or not.

But yes exactly (or often I think people would write the inverse i.e. if (size > free_space) error()) but anyways this is really the problem domain for storage systems, having the type be forced to be signed isn't helpful at all in that as negative numbers as input are nonsensical and would need to be rejected early on anyways.


In my own code, a -1 would always fail the size check, and I wouldn't ever expect to see that in a descriptor. In what scenario would the digest be known, but not the size?

Anyways...it turns out that this issue was not directly related to this spec. It happens in the https://github.com/containers/container-libs codebase when the remote registry doesn't return a Content-Length for GetBlob HTTP request at least. The codebase used -1 for this case in a returned size data, but...I would place a monetary bet that the person writing it was very influenced in choosing int64 for this because the OCI spec did.

Obviously we do know the size up front because we're following the manifest, and it turns out that all callers of this internal API basically just ignore it - except in my Rust bindings I try to pervasively use u64 because negative sizes are obviously malformed and nothing should emit them, and the deserializer was erroring in this case.

So again, not directly related to this spec issue, but I'm sure influenced by it.

cgwalters added a commit to cgwalters/containers-image-proxy-rs that referenced this pull request Sep 4, 2025
The internal API in containers/image returns a size...which
is an `int64` in the source probably because OCI Descriptor
type have it be an int64 too for bad historical reasons;
xref opencontainers/image-spec#153

Anyways, the `GetBlob` wrapper API returns `-1` when
the remote registry doesn't emit `Content-Length`:
https://github.com/containers/container-libs/blob/3af9abd27f3b875122b9fedd88953e8c840d6958/image/docker/docker_client.go#L1068

And this apparently happens with at least the Quay
mirror registry.

We also noticed this problem when adding `GetRawBlob`
in containers/skopeo#2601
which simply doesn't return a size to the client at all.

Signed-off-by: Colin Walters <[email protected]>
cgwalters added a commit to cgwalters/containers-image-proxy-rs that referenced this pull request Sep 4, 2025
The internal API in containers/image returns a size...which
is an `int64` in the source probably because OCI Descriptor
type have it be an int64 too for bad historical reasons;
xref opencontainers/image-spec#153

Anyways, the `GetBlob` wrapper API returns `-1` when
the remote registry doesn't emit `Content-Length`:
https://github.com/containers/container-libs/blob/3af9abd27f3b875122b9fedd88953e8c840d6958/image/docker/docker_client.go#L1068

And this apparently happens with at least the Quay
mirror registry.

We also noticed this problem when adding `GetRawBlob`
in containers/skopeo#2601
which simply doesn't return a size to the client at all.

Signed-off-by: Colin Walters <[email protected]>
@sudo-bmitch
Copy link
Contributor

Anyways...it turns out that this issue was not directly related to this spec. It happens in the https://github.com/containers/container-libs codebase when the remote registry doesn't return a Content-Length for GetBlob HTTP request at least. The codebase used -1 for this case in a returned size data

I'd treat the descriptor and the registry Content-Length header separately. Code will often have the descriptor when pulling the blob, and would therefore know the size in advance. If Content-Length is a valid value, then it could be validated against the descriptor. But otherwise I'd ignore the registry and depend on the descriptor.

There are scenarios in interacting with the registry where the size isn't known in advance (chunked or streaming blob push), but in those cases the digest isn't known yet and you don't have the descriptor yet, they become available at the end of the blob push.

@tianon
Copy link
Member

tianon commented Sep 5, 2025

saying it MUST be positive

So maybe we need a PR that adds this clarification in the markdown?
(I don't think we can reasonably change the Go types again, as nice as it might be to do so).

@sudo-bmitch
Copy link
Contributor

saying it MUST be positive

So maybe we need a PR that adds this clarification in the markdown?

I've raised #1285 for this.

cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
VerifiedReadCloser previously would allow for negative ExpectedSize to
disable the size checking features added in commit ad66299 ("pkg:
hardening: expand to verify descriptor length").

This was based on a somewhat overly-permissive reading of the discussion
in opencontainers/image-spec#153 which was finally clarified in
opencontainers/image-spec#1285. Unknown sizes are a classic DoS vector,
so allowing them seems like a bad idea in general.

Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
This was implicitly allowed by VerifiedReadCloser, and while we have
closed that hole it's probably best to provide a more helpful error
message.

Not blocking this earlier was mostly due to a somewhat overly-permissive
reading of the discussion in opencontainers/image-spec#153 which was
finally clarified in opencontainers/image-spec#1285. Unknown sizes are a
classic DoS vector, so allowing them (especially for descriptors where
it makes little sense to have an unknown size) seems like a bad idea in
general.

Ref: opencontainers/image-spec#1285
Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
VerifiedReadCloser previously would allow for negative ExpectedSize to
disable the size checking features added in commit ad66299 ("pkg:
hardening: expand to verify descriptor length").

This was added partially because a somewhat overly-permissive reading of
the discussion in opencontainers/image-spec#153 (which was finally
clarified in opencontainers/image-spec#1285), but was also necessary for
some users of VerifiedReadCloser that did not really know the proper
blob size. We have now adjusted all of those callers, so there is no
longer any reason to continue supporting this.

Unknown sizes are a classic DoS vector, so allowing them seems like a
bad idea in general. We might need to adjust this if/when umoci grows
OCI distribution-spec support, but for now it isn't needed.

Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
This was implicitly allowed by VerifiedReadCloser, and while we have
closed that hole it's probably best to provide a more helpful error
message.

Not blocking this earlier was mostly due to a somewhat overly-permissive
reading of the discussion in opencontainers/image-spec#153 which was
finally clarified in opencontainers/image-spec#1285. Unknown sizes are a
classic DoS vector, so allowing them (especially for descriptors where
it makes little sense to have an unknown size) seems like a bad idea in
general.

Ref: opencontainers/image-spec#1285
Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
VerifiedReadCloser previously would allow for negative ExpectedSize to
disable the size checking features added in commit ad66299 ("pkg:
hardening: expand to verify descriptor length").

This was added partially because a somewhat overly-permissive reading of
the discussion in opencontainers/image-spec#153 (which was finally
clarified in opencontainers/image-spec#1285), but was also necessary for
some users of VerifiedReadCloser that did not really know the proper
blob size. We have now adjusted all of those callers, so there is no
longer any reason to continue supporting this.

Unknown sizes are a classic DoS vector, so allowing them seems like a
bad idea in general. We might need to adjust this if/when umoci grows
OCI distribution-spec support, but for now it isn't needed.

Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
This was implicitly allowed by VerifiedReadCloser, and while we have
closed that hole it's probably best to provide a more helpful error
message.

Not blocking this earlier was mostly due to a somewhat overly-permissive
reading of the discussion in opencontainers/image-spec#153 which was
finally clarified in opencontainers/image-spec#1285. Unknown sizes are a
classic DoS vector, so allowing them (especially for descriptors where
it makes little sense to have an unknown size) seems like a bad idea in
general.

Ref: opencontainers/image-spec#1285
Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
VerifiedReadCloser previously would allow for negative ExpectedSize to
disable the size checking features added in commit ad66299 ("pkg:
hardening: expand to verify descriptor length").

This was added partially because a somewhat overly-permissive reading of
the discussion in opencontainers/image-spec#153 (which was finally
clarified in opencontainers/image-spec#1285), but was also necessary for
some users of VerifiedReadCloser that did not really know the proper
blob size. We have now adjusted all of those callers, so there is no
longer any reason to continue supporting this.

Unknown sizes are a classic DoS vector, so allowing them seems like a
bad idea in general. We might need to adjust this if/when umoci grows
OCI distribution-spec support, but for now it isn't needed.

Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
This was implicitly allowed by VerifiedReadCloser, and while we have
closed that hole it's probably best to provide a more helpful error
message.

Not blocking this earlier was mostly due to a somewhat overly-permissive
reading of the discussion in opencontainers/image-spec#153 which was
finally clarified in opencontainers/image-spec#1285. Unknown sizes are a
classic DoS vector, so allowing them (especially for descriptors where
it makes little sense to have an unknown size) seems like a bad idea in
general.

Ref: opencontainers/image-spec#1285
Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
VerifiedReadCloser previously would allow for negative ExpectedSize to
disable the size checking features added in commit ad66299 ("pkg:
hardening: expand to verify descriptor length").

This was added partially because a somewhat overly-permissive reading of
the discussion in opencontainers/image-spec#153 (which was finally
clarified in opencontainers/image-spec#1285), but was also necessary for
some users of VerifiedReadCloser that did not really know the proper
blob size. We have now adjusted all of those callers, so there is no
longer any reason to continue supporting this.

Unknown sizes are a classic DoS vector, so allowing them seems like a
bad idea in general. We might need to adjust this if/when umoci grows
OCI distribution-spec support, but for now it isn't needed.

Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
This was implicitly allowed by VerifiedReadCloser, and while we have
closed that hole it's probably best to provide a more helpful error
message.

Not blocking this earlier was mostly due to a somewhat overly-permissive
reading of the discussion in opencontainers/image-spec#153 which was
finally clarified in opencontainers/image-spec#1285. Unknown sizes are a
classic DoS vector, so allowing them (especially for descriptors where
it makes little sense to have an unknown size) seems like a bad idea in
general.

Ref: opencontainers/image-spec#1285
Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
VerifiedReadCloser previously would allow for negative ExpectedSize to
disable the size checking features added in commit ad66299 ("pkg:
hardening: expand to verify descriptor length").

This was added partially because a somewhat overly-permissive reading of
the discussion in opencontainers/image-spec#153 (which was finally
clarified in opencontainers/image-spec#1285), but was also necessary for
some users of VerifiedReadCloser that did not really know the proper
blob size. We have now adjusted all of those callers, so there is no
longer any reason to continue supporting this.

Unknown sizes are a classic DoS vector, so allowing them seems like a
bad idea in general. We might need to adjust this if/when umoci grows
OCI distribution-spec support, but for now it isn't needed.

Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
This was implicitly allowed by VerifiedReadCloser, and while we have
closed that hole it's probably best to provide a more helpful error
message.

Not blocking this earlier was mostly due to a somewhat overly-permissive
reading of the discussion in opencontainers/image-spec#153 which was
finally clarified in opencontainers/image-spec#1285. Unknown sizes are a
classic DoS vector, so allowing them (especially for descriptors where
it makes little sense to have an unknown size) seems like a bad idea in
general.

Ref: opencontainers/image-spec#1285
Signed-off-by: Aleksa Sarai <[email protected]>
cyphar added a commit to cyphar/umoci that referenced this pull request Sep 6, 2025
VerifiedReadCloser previously would allow for negative ExpectedSize to
disable the size checking features added in commit ad66299 ("pkg:
hardening: expand to verify descriptor length").

This was added partially because a somewhat overly-permissive reading of
the discussion in opencontainers/image-spec#153 (which was finally
clarified in opencontainers/image-spec#1285), but was also necessary for
some users of VerifiedReadCloser that did not really know the proper
blob size. We have now adjusted all of those callers, so there is no
longer any reason to continue supporting this.

Unknown sizes are a classic DoS vector, so allowing them seems like a
bad idea in general. We might need to adjust this if/when umoci grows
OCI distribution-spec support, but for now it isn't needed.

Signed-off-by: Aleksa Sarai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants