sa: add authz reuse table #7715

@jsha

Description

Right now our authz2 table looks like this:

```sql
CREATE TABLE `authz2` (
  `id` bigint(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  `identifierType` tinyint(4) NOT NULL,
  `identifierValue` varchar(255) NOT NULL,
  `registrationID` bigint(20) NOT NULL,
  `status` tinyint(4) NOT NULL,
  `expires` datetime NOT NULL,
  `challenges` tinyint(4) NOT NULL,
  `attempted` tinyint(4) DEFAULT NULL,
  `attemptedAt` datetime DEFAULT NULL,
  `token` binary(32) NOT NULL,
  `validationError` mediumblob DEFAULT NULL,
  `validationRecord` mediumblob DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `regID_expires_idx` (`registrationID`,`status`,`expires`),
  KEY `regID_identifier_status_expires_idx` (`registrationID`,`identifierType`,`identifierValue`,`status`,`expires`),
  KEY `expires_idx` (`expires`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
 PARTITION BY RANGE(id)
(PARTITION p_start VALUES LESS THAN (MAXVALUE));
```

Those indexes are pretty big, particularly the `regID_identifier_status_expires_idx` one. Among other problems, it includes a `datetime` field, which has high cardinality.

One of the main things (only thing?) we use that index for is authz reuse. We can do that more efficiently with a separate table, and we can also design that table to better fit the assumptions of key-value storage: only one key, and it's the primary key.

Roughly speaking, this table would map:

(account, identifier) -> (authz ID, expiration)

Each time an authz is successfully validated, we would insert or update a row in this table. Compared to the current system, this defers some amount of work until successful validation, which is nice because so many validations fail.

We can skip encoding the identifier type, because the only two identifier types we ever plan to support have completely non-overlapping syntax (hostnames and IP addresses).
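As a sketch of what the above could look like (table and column names are hypothetical, not a final schema), the simple non-epoch version might be:

```sql
-- Hypothetical sketch of the proposed reuse table; names are illustrative.
CREATE TABLE `authzReuse` (
  `registrationID` bigint(20) NOT NULL,
  `identifierValue` varchar(255) NOT NULL,
  `authzID` bigint(20) UNSIGNED NOT NULL,
  `expires` datetime NOT NULL,
  -- Only one key, and it's the primary key, to fit key-value storage.
  PRIMARY KEY (`registrationID`, `identifierValue`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- On successful validation, insert or update the row for this
-- (account, identifier) pair (values illustrative):
INSERT INTO `authzReuse` (`registrationID`, `identifierValue`, `authzID`, `expires`)
VALUES (12345, 'example.com', 1001, '2024-05-01 00:00:00')
ON DUPLICATE KEY UPDATE
  `authzID` = VALUES(`authzID`),
  `expires` = VALUES(`expires`);
```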

If we deploy this table in a database that supports TTLs, we would set the TTL of this row to the expiration of the authz, and update it as new authzs are written to it. Old rows would automatically be removed by the database system. If we choose to deploy it in a database that does not support TTLs, we could prepend a rough granularity epoch (e.g. number of 90-day periods since Jan 2024), making the key (epoch, account, identifier). That would allow partitioning and dropping of old partitions.
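For the no-TTL variant, a hypothetical sketch of the epoch-prefixed key and partitioning (epoch granularity and start date taken from the example above; everything here is illustrative):

```sql
-- Hypothetical variant with an epoch prefix for partition-based expiry.
-- epoch = number of 90-day periods since Jan 1, 2024, e.g. computed as
--   FLOOR(DATEDIFF(NOW(), '2024-01-01') / 90)
CREATE TABLE `authzReuse` (
  `epoch` smallint(5) UNSIGNED NOT NULL,
  `registrationID` bigint(20) NOT NULL,
  `identifierValue` varchar(255) NOT NULL,
  `authzID` bigint(20) UNSIGNED NOT NULL,
  `expires` datetime NOT NULL,
  PRIMARY KEY (`epoch`, `registrationID`, `identifierValue`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
 PARTITION BY RANGE(epoch)
(PARTITION p0 VALUES LESS THAN (1),
 PARTITION p1 VALUES LESS THAN (2));

-- Old rows go away by dropping whole partitions:
-- ALTER TABLE `authzReuse` DROP PARTITION p0;
```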

If the key is (epoch, account, identifier), that means querying for authzs to reuse would have to query for multiple keys: one in the current epoch, one in the previous epoch, and potentially epochs further back. If we assume the authz lifetime is always less than the epoch (which would be true with our current 30-day authzs and a hypothetical epoch of 90 days), then we would only ever have to query for two epochs, current and previous.
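Under that assumption, the two-epoch lookup could be a single statement; a sketch against the hypothetical epoch-prefixed table, with illustrative values:

```sql
-- Look up the current epoch (4) and the previous one (3) in one query.
SELECT `authzID`, `expires`
FROM `authzReuse`
WHERE (`epoch`, `registrationID`, `identifierValue`) IN
      ((4, 12345, 'example.com'),
       (3, 12345, 'example.com'))
  AND `expires` > NOW();
```

(It could equally be two parallel point lookups, which may be friendlier to a key-value backend than a row-constructor IN.)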

To find authzs for reuse for a new order, we would query for the appropriate account and identifier, check the result's expiration, then fetch the corresponding authz (to check whether it has been deactivated). This will require one additional round trip compared to our current system, which queries the authz2 table directly and so gets status right away. This can be a batch query for several identifiers (using IN syntax) or it could be several parallel queries.
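A sketch of that two-step flow against the hypothetical non-epoch table, batching several identifiers in the first round trip (all values illustrative):

```sql
-- Round trip 1: batch lookup of reuse candidates for a new order.
SELECT `identifierValue`, `authzID`, `expires`
FROM `authzReuse`
WHERE `registrationID` = 12345
  AND `identifierValue` IN ('example.com', 'www.example.com')
  AND `expires` > NOW();

-- Round trip 2: confirm the candidates haven't been deactivated.
SELECT `id`, `status`
FROM `authz2`
WHERE `id` IN (1001, 1002);
```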

One refinement could be: when an authz is deactivated, we delete its row in the reuse table. That would allow us to directly incorporate reused authzs into a new order without a second query to check their status.
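That refinement, sketched against the same hypothetical table (values illustrative):

```sql
-- On authz deactivation, also remove its reuse-table row, so the
-- reuse lookup never returns a deactivated authz.
DELETE FROM `authzReuse`
WHERE `registrationID` = 12345
  AND `identifierValue` = 'example.com';
```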
