@@ -307,9 +307,9 @@ kinds of metadata RECOMMENDED for PyPI.
307307
308308__ https://github.com/theupdateframework/tuf/blob/v0.11.1/docs/METADATA.md
309309
310- In addition, all target files SHOULD be available on disk at least three times.
310+ In addition, all target files SHOULD be available on disk at least two times.
311311Once under their original filename, to provide backwards compatibility, and
312- twice with their SHA-256 and SHA-512 hash respectively included in their
312+ once with their SHA-512 hash included in their
313313filename. This is required to produce `Consistent Snapshots`_.
314314
315315Depending on the used file system different data deduplication mechanisms MAY
@@ -321,7 +321,7 @@ PyPI and TUF Metadata
321321
322322TUF metadata provides information that clients can use to make update
323323decisions. For example, a *targets* metadata lists the available target files
324- on PyPI and includes the required signatures, cryptographic hashes, and
324+ on PyPI and includes the required signatures, cryptographic hash, and
325325file sizes for each. Different metadata files provide different information and are
326326signed by separate roles. The *root* role indicates what metadata belongs to
327327each role. The concept of roles allows TUF to delegate responsibilities
@@ -345,20 +345,19 @@ roles used in TUF.
345345Figure 1: An overview of the TUF roles.
346346
347347Unless otherwise specified, this PEP RECOMMENDS that every metadata or
348- target file be hashed using both the SHA2-256 and SHA2-512 functions of
348+ target file be hashed using the SHA2-512 function of
349349the `SHA-2`__ family. SHA-2 has native and well-tested Python 2 and 3
350350support (allowing for verification of these hashes without additional,
351- non-Python dependencies), and using both functions should provide
352- sufficient protection against `collision attacks`__ for the foreseeable
353- future. However, this assumes that a collision attack for SHA2-256 does
354- not easily translate to SHA2-512. If stronger security guarantees are
355- required, then SHA2-256 and `SHA3-256`__ MAY be used instead, since they
356- are based on very different designs from each other. However, SHA-3
351+ non-Python dependencies). If stronger security guarantees are
352+ required, then both SHA2-256 and SHA2-512 or both SHA2-256 and `SHA3-256`__
353+ MAY be used instead. SHA2-256 and SHA3-256
354+ are based on very different designs from each other, providing extra protection
355+ against `collision attacks`__. However, SHA-3
357356requires installing additional, non-Python dependencies for `Python 2`__.
358357
359358__ https://en.wikipedia.org/wiki/SHA-2
360- __ https://en.wikipedia.org/wiki/Collision_attack
361359__ https://en.wikipedia.org/wiki/SHA-3
360+ __ https://en.wikipedia.org/wiki/Collision_attack
362361__ https://pip.pypa.io/en/latest/development/release-process/#python-2-support
363362
364363
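As an illustration of the hashing recommendation above, the following is a minimal sketch using only the Python standard library. The helper name ``digest_bytes`` and its parameters are hypothetical and are not part of the PEP or of TUF. It computes SHA2-512 by default and can compute additional digests (for example SHA2-256 and SHA3-256) when stronger guarantees are wanted. Note that ``sha3_256`` is only available in ``hashlib`` on Python 3.6 and later.

```python
import hashlib


def digest_bytes(data, algorithms=("sha512",)):
    """Return a dict mapping each algorithm name to the hex digest of `data`.

    Hypothetical helper for illustration only. The default follows the
    recommendation of hashing with SHA2-512 alone.
    """
    digests = {}
    for name in algorithms:
        hasher = hashlib.new(name)  # raises ValueError for unknown names
        hasher.update(data)
        digests[name] = hasher.hexdigest()
    return digests


# Default: a single SHA2-512 digest (128 hexadecimal characters).
default_digests = digest_bytes(b"example target file contents")

# Optional stronger setting: two hash functions with different designs.
strong_digests = digest_bytes(
    b"example target file contents", algorithms=("sha256", "sha3_256")
)
```

A SHA2-512 hexadecimal digest is 128 characters long, which matches the value of C1 in Table 1.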
@@ -509,13 +508,13 @@ __ https://github.com/theupdateframework/tuf/blob/v0.11.1/docs/TUTORIAL.md#deleg
509508Based on our findings as of the time this document was updated for
510509implementation (Nov 7 2019), summarized in Tables 1-2, PyPI SHOULD
511510split all targets in the *bins* role by delegating them to 16,384
512- *bin-n* roles (see C11 in Table 1). Each *bin-n* role would sign
513- for the PyPI targets whose SHA2-256 hashes fall into that bin
511+ *bin-n* roles (see C10 in Table 1). Each *bin-n* role would sign
512+ for the PyPI targets whose SHA2-512 hashes fall into that bin
514513(see Figure 2 and `Consistent Snapshots`_). It was found
515514that this number of bins would result in a 5-9% metadata overhead
516- (relative to the average size of downloaded distribution files; see V14 and
517- V16 in Table 2) for returning users, and a 70% overhead for new
518- users who are installing pip for the first time (see V18 in Table 2).
515+ (relative to the average size of downloaded distribution files; see V13 and
516+ V15 in Table 2) for returning users, and a 69% overhead for new
517+ users who are installing pip for the first time (see V17 in Table 2).
519518
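The assignment of targets to bins described above can be sketched as follows. This is an illustration under stated assumptions, not the PEP's or TUF's implementation: it assumes the hash is taken over the UTF-8 encoded target path, and that each bin covers a contiguous range of 4-digit hexadecimal hash prefixes. The function name ``bin_index`` is hypothetical.

```python
import hashlib

NUM_BINS = 16384                             # number of bins (C10 in Table 1)
PREFIX_LEN = 4                               # path hash prefix length (V1 in Table 2)
NUM_PREFIXES = 16 ** PREFIX_LEN              # 65,536 prefixes (V2 in Table 2)
PREFIXES_PER_BIN = NUM_PREFIXES // NUM_BINS  # 4 prefixes per bin


def bin_index(target_path):
    """Return the index of the bin responsible for `target_path`.

    Assumption: the SHA2-512 hash is computed over the UTF-8 encoded
    target path, and bins cover contiguous ranges of hash prefixes.
    """
    digest = hashlib.sha512(target_path.encode("utf-8")).hexdigest()
    prefix = int(digest[:PREFIX_LEN], 16)  # value in 0..65535
    return prefix // PREFIXES_PER_BIN      # value in 0..16383


# Hypothetical target path, used only to show the call.
example_bin = bin_index("packages/example/example-1.0.tar.gz")
```

With 16,384 bins and 65,536 possible prefixes, each bin is responsible for four prefixes, so a client only needs the one *bin-n* metadata file that covers the prefix of the target it wants.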
520519A few assumptions used in calculating these metadata overhead percentages:
521520
@@ -526,31 +525,29 @@ A few assumptions used in calculating these metadata overhead percentages:
526525+------+--------------------------------------------------+-----------+
527526| Name | Description | Value |
528527+------+--------------------------------------------------+-----------+
529- | C1 | # of bytes in a SHA2-256 hexadecimal digest | 64 |
528+ | C1 | # of bytes in a SHA2-512 hexadecimal digest | 128 |
530529+------+--------------------------------------------------+-----------+
531- | C2 | # of bytes in a SHA2-512 hexadecimal digest | 128 |
530+ | C2 | # of bytes for a SHA2-512 public key ID | 64 |
532531+------+--------------------------------------------------+-----------+
533- | C3 | # of bytes for a SHA2-256 public key ID | 64 |
532+ | C3 | # of bytes for an Ed25519 signature | 128 |
534533+------+--------------------------------------------------+-----------+
535- | C4 | # of bytes for an Ed25519 signature | 128 |
534+ | C4 | # of bytes for an Ed25519 public key | 64 |
536535+------+--------------------------------------------------+-----------+
537- | C5 | # of bytes for an Ed25519 public key | 64 |
536+ | C5 | # of bytes for a target relative file path | 256 |
538537+------+--------------------------------------------------+-----------+
539- | C6 | # of bytes for a target relative file path | 256 |
538+ | C6 | # of bytes to encode a target file size | 7 |
540539+------+--------------------------------------------------+-----------+
541- | C7 | # of bytes to encode a target file size | 7 |
540+ | C7 | # of bytes to encode a version number | 6 |
542541+------+--------------------------------------------------+-----------+
543- | C8 | # of bytes to encode a version number | 6 |
542+ | C8 | # of targets (simple indices and distributions) | 2,273,539 |
544543+------+--------------------------------------------------+-----------+
545- | C9 | # of targets (simple indices and distributions) | 2,273,539 |
544+ | C9 | Average # of bytes for a downloaded distribution | 2,184,393 |
546545+------+--------------------------------------------------+-----------+
547- | C10 | Average # of bytes for a downloaded distribution | 2,184,393 |
548- +------+--------------------------------------------------+-----------+
549- | C11 | # of bins | 16,384 |
546+ | C10 | # of bins | 16,384 |
550547+------+--------------------------------------------------+-----------+
551548
552- C9 by computed querying the number of release files.
553- C10 was derived by taking the average between a rough estimate of the average
549+ C8 was computed by querying the number of release files.
550+ C9 was derived by taking the average between a rough estimate of the average
554551size of release files *downloaded* over the past 31 days (1,628,321 bytes),
555552and the average size of release files on disk (2,740,465 bytes).
556553Ernest W. Durbin III helped to provide these numbers on November 7, 2019.
@@ -560,41 +557,39 @@ Table 1: A list of constants used to calculate metadata overhead.
560557+------+------------------------------------------------------------------------------------+------------------------------+-----------+
561558| Name | Description | Formula | Value |
562559+------+------------------------------------------------------------------------------------+------------------------------+-----------+
563- | V1 | Length of a path hash prefix | math.ceil(math.log(C11, 16)) | 4 |
560+ | V1 | Length of a path hash prefix | math.ceil(math.log(C10, 16)) | 4 |
564561+------+------------------------------------------------------------------------------------+------------------------------+-----------+
565562| V2 | Total # of path hash prefixes | 16**V1 | 65,536 |
566563+------+------------------------------------------------------------------------------------+------------------------------+-----------+
567- | V3 | Avg # of targets per bin | math.ceil(C9/C11) | 139 |
568- +------+------------------------------------------------------------------------------------+------------------------------+-----------+
569- | V4 | Avg size of SHA-256 hashes per bin | V3*C1 | 8,896 |
564+ | V3 | Avg # of targets per bin | math.ceil(C8/C10) | 139 |
570565+------+------------------------------------------------------------------------------------+------------------------------+-----------+
571- | V5 | Avg size of SHA-512 hashes per bin | V3*C2 | 17,792 |
566+ | V4 | Avg size of SHA-512 hashes per bin | V3*C1 | 17,792 |
572567+------+------------------------------------------------------------------------------------+------------------------------+-----------+
573- | V6 | Avg size of target paths per bin | V3*C6 | 35,584 |
568+ | V5 | Avg size of target paths per bin | V3*C5 | 35,584 |
574569+------+------------------------------------------------------------------------------------+------------------------------+-----------+
575- | V7 | Avg size of lengths per bin | V3*C7 | 973 |
570+ | V6 | Avg size of lengths per bin | V3*C6 | 973 |
576571+------+------------------------------------------------------------------------------------+------------------------------+-----------+
577- | V8 | Avg size of bin-n metadata (bytes) | V4+V5+V6+V7 | 63,245 |
572+ | V7 | Avg size of bin-n metadata (bytes) | V4+V5+V6 | 54,349 |
578573+------+------------------------------------------------------------------------------------+------------------------------+-----------+
579- | V9 | Total size of public key IDs in bins | C11*C3 | 1,048,576 |
574+ | V8 | Total size of public key IDs in bins | C10*C2 | 1,048,576 |
580575+------+------------------------------------------------------------------------------------+------------------------------+-----------+
581- | V10 | Total size of path hash prefixes in bins | V1*V2 | 262,144 |
576+ | V9 | Total size of path hash prefixes in bins | V1*V2 | 262,144 |
582577+------+------------------------------------------------------------------------------------+------------------------------+-----------+
583- | V11 | Est. size of bins metadata (bytes) | V9+V10 | 1,310,720 |
578+ | V10 | Est. size of bins metadata (bytes) | V8+V9 | 1,310,720 |
584579+------+------------------------------------------------------------------------------------+------------------------------+-----------+
585- | V12 | Est. size of snapshot metadata (bytes) | C11*C8 | 98,304 |
580+ | V11 | Est. size of snapshot metadata (bytes) | C10*C7 | 98,304 |
586581+------+------------------------------------------------------------------------------------+------------------------------+-----------+
587- | V13 | Est. size of metadata overhead per distribution per returning user (same snapshot) | 2*V8 | 126,490 |
582+ | V12 | Est. size of metadata overhead per distribution per returning user (same snapshot) | 2*V7 | 108,698 |
588583+------+------------------------------------------------------------------------------------+------------------------------+-----------+
589- | V14 | Est. metadata overhead per distribution per returning user (same snapshot) | round((V13/C10)*100) | 6% |
584+ | V13 | Est. metadata overhead per distribution per returning user (same snapshot) | round((V12/C9)*100) | 5% |
590585+------+------------------------------------------------------------------------------------+------------------------------+-----------+
591- | V15 | Est. size of metadata overhead per distribution per returning user (diff snapshot) | V13+V12 | 224,794 |
586+ | V14 | Est. size of metadata overhead per distribution per returning user (diff snapshot) | V12+V11 | 207,002 |
592587+------+------------------------------------------------------------------------------------+------------------------------+-----------+
593- | V16 | Est. metadata overhead per distribution per returning user (diff snapshot) | round((V15/C10)*100) | 10% |
588+ | V15 | Est. metadata overhead per distribution per returning user (diff snapshot) | round((V14/C9)*100) | 9% |
594589+------+------------------------------------------------------------------------------------+------------------------------+-----------+
595- | V17 | Est. size of metadata overhead per distribution per new user | V15+V11 | 1,535,514 |
590+ | V16 | Est. size of metadata overhead per distribution per new user | V14+V10 | 1,517,722 |
596591+------+------------------------------------------------------------------------------------+------------------------------+-----------+
597- | V18 | Est. metadata overhead per distribution per new user | round((V17/C10)*100) | 70% |
592+ | V17 | Est. metadata overhead per distribution per new user | round((V16/C9)*100) | 69% |
598593+------+------------------------------------------------------------------------------------+------------------------------+-----------+
599594
600595Table 2: Estimated metadata overheads for new and returning users.
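The arithmetic behind the updated (SHA2-512 only) version of Table 2 can be checked with a short script. This is a sketch that copies the constants from the new numbering in Table 1 and applies the formulas from the "Formula" column. The variable names mirror the table's row names and are not part of any library.

```python
import math

# Constants from Table 1 (new numbering, C1 to C10).
C1 = 128        # bytes in a SHA2-512 hexadecimal digest
C2 = 64         # bytes for a public key ID
C5 = 256        # bytes for a target relative file path
C6 = 7          # bytes to encode a target file size
C7 = 6          # bytes to encode a version number
C8 = 2_273_539  # number of targets (simple indices and distributions)
C9 = 2_184_393  # average bytes for a downloaded distribution
C10 = 16_384    # number of bins

# Variables from Table 2 (new numbering, V1 to V17).
V1 = math.ceil(math.log(C10, 16))  # length of a path hash prefix
V2 = 16 ** V1                      # total number of path hash prefixes
V3 = math.ceil(C8 / C10)           # average number of targets per bin
V4 = V3 * C1                       # SHA-512 hashes per bin (bytes)
V5 = V3 * C5                       # target paths per bin (bytes)
V6 = V3 * C6                       # lengths per bin (bytes)
V7 = V4 + V5 + V6                  # average bin-n metadata size
V8 = C10 * C2                      # public key IDs in bins
V9 = V1 * V2                       # path hash prefixes in bins
V10 = V8 + V9                      # estimated bins metadata size
V11 = C10 * C7                     # estimated snapshot metadata size
V12 = 2 * V7                       # returning user, same snapshot (bytes)
V13 = round((V12 / C9) * 100)      # returning user, same snapshot (%)
V14 = V12 + V11                    # returning user, different snapshot (bytes)
V15 = round((V14 / C9) * 100)      # returning user, different snapshot (%)
V16 = V14 + V10                    # new user (bytes)
V17 = round((V16 / C9) * 100)      # new user (%)

print(f"returning users: {V13}%-{V15}%, new users: {V17}%")
```

Running the script reproduces the "+" side of the table: 54,349 bytes of *bin-n* metadata on average, and overheads of 5%, 9%, and 69%.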
@@ -829,7 +824,7 @@ version of the *snapshot* metadata, which in turn lists the versions of the
829824snapshot.
830825
831826The *targets* or delegated targets metadata refer to the actual target
832- files, including all of their cryptographic hashes as specified above.
827+ files, including their cryptographic hashes as specified above.
833828Thus, to mark a target file as part of a consistent snapshot it MUST, when
834829written to disk, include its hash in its filename:
835830