Skip to content

Conversation

@tombentley
Copy link
Member

@tombentley tombentley commented Oct 10, 2025

This PR is based on, and replaces, @k-wall's initial PR

This proposal is to add pluggable authorization support to Kroxylicious. It:

  • makes changes to the kroxylicious-api to support all the filters in the same virtual cluster having a common notion about a connect client,
  • adds an Authorizer abstraction for making decisions about access to resources,
  • implements that Authorizer abstraction in terms of flexible access control lists,
  • uses an Authorizer instance to provide access controls to Kafka resources that will be equivalent to the access controls of a Kafka broker.

@tombentley tombentley requested a review from a team as a code owner October 10, 2025 00:54
5. Add a new `kroxylicious-authorization` module implementing an `Authorization` protocol filter plugin which uses an `Authorizer` instance to provide Kafka-equivalent authorization.

`Principal P is [Allowed/Denied] Operation O On Resource R`.
The following subections will explain these pieces in detail.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The following subections will explain these pieces in detail.
The following subsections will explain these pieces in detail.

Copy link
Member

@k-wall k-wall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to make one more pass through this, but I think it is looking good.

On receipt of a response message from the upstream, the Authorizing filter will filter the resources so that the downstream receives only resources that they are authorized to `DESCRIBE`.
public record Anonymous(String name) implements Principal {
public static Anonymous newAnonymous() {
return new Anonymous(UUID.randomUUID().toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you give Anonymous a random name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question!

Philosophically speaking, we have two unauthenticated connections, should be consider them to have the same subject? Or should we treat each as having its own distinct subject, both of which we know nothing about? I don't think there's correct answer to that. It could be argued either way.

For the ACL-based authz proposed here the answer doesn't really make a difference because under a 'distinct subjects' model we can write rules that treat them the same based on the Anonymous class, i.e. allow Anonymous with name * ....

But we should consider possible uses for subjects beyond authZ.

Consider audit logging, where we want to record who's done what to what. The 'singleton anonymous' model doesn't feel very useful to me in this case, as you can't correlate the actions of a single bad anonymous actor without also correlating all the good anonymous actors in the system. That risks drowning out useful signals of bad behaviour.

Another example concerns bandwidth quotas/traffic shaping. In that case it does make a difference:

  • Under the 'singleton anonymous' model all unauthenticated connections would share the same quota.
  • Under the 'distinct subjects' model, each unauthenticated connection gets its own quota.

Again I don't think either of these is objectively correct; it depends what you're trying to achieve.

But the point is that 'subject distinctness' feels like a useful property to define, and one that some filters might well need to rely on.

Maybe we could use ClientAddress (we'd need the source port too) and/or ClientId principals to provide 'subject distinctness' for these use cases. In which case I suppose Anonymous could be a singleton without giving anything up. But then it makes me a bit uneasy if the 'subject distinctness' property isn't guaranteed: If the user doesn't configure the right subject builder then they don't have this property.

We could lean on the fact that we know we'll always have a client network address.

  • We could make clientAddress a first class property of Subject alongside the principals. But as I've mentioned before that means we can't address that particular identifier in the same was as other principals (e.g. in the ACL rules language, but not limited to just there).
  • Or we could just always add a ClientAddress principal, irrespective of the configured SubjectBuilder.

Sorry, if that's a bit of a stream-of-consciousness answer! I'm interested to hear reasoned opinions about how we should deal with this.

Copy link
Member

@robobario robobario Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then it makes me a bit uneasy if the 'subject distinctness' property isn't guaranteed: If the user doesn't configure the right subject builder then they don't have this property.

If the traffic is arriving via a load balancer, kroxy might see the same remote ip/port crop up repeatedly over time despite them being distinct subjects, so maybe it isn't enough to distinguish a session of bad activity.

Perhaps this would be covered by adding a public concept of Session with a random id per connection. We have:

/**
     * A description of this channel.
     * @return A description of this channel (typically used for logging).
     */
    String channelDescriptor();

which is a bit techy and I wouldn't know how to rely on this for Filter logic, but if we had a

/**
 * Gets the unique identifier assigned to this client's session.
 * <p>
 * This ID is generated upon connection and remains constant for the
 * lifetime of the session.
 *
 * @return the unique session ID for this connection.
 */
String sessionId();

You could use it for these audit logging purposes. (I imagine session id is quite ubiquitous)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me the Anonymous-with-random-name looks a bit like we are incorporating the session as a fact we know about the Subject, and following that logic all subjects would be distinct because their sessions are distinct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a unique session id to correlate actions on a channel together is essential. This is something we need regardless of whether the channel is authenticated or anonymous. So I don't think the "distinct subjects anonymous model" really buys us anything. If we are trying to understand a sequence of the actions on a channel, session id will be our first port of call.

With quotas, it depends what you are trying to achieve For some use-cases you might want to say I want no more than 25% of the systems resources going to anonymous connections. For other use-cases you might want to say each anonymous connection is quota'd separately. I say, let's solve the problems we need to solve today and leave thinking about quota until another day.


The table below sets out the authorization checks and filters will be implemented.
Eventually we expect to make the creation of the `Subject` instances exposed through `FilterContext.authenticatedSubject()` to be pluggable.
But initiall the `SubjectBuilder` interface will be internal to the `kroxylicious-runtime` module.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
But initiall the `SubjectBuilder` interface will be internal to the `kroxylicious-runtime` module.
But initially the `SubjectBuilder` interface will be internal to the `kroxylicious-runtime` module.

The default `SubjectBuilder` will return anonymous `Subjects`.
In addition to the default subject builder we will support two other `SubjectBuilders` within the `kroxylicious-runtime`:
1. A builder called `Tls`, for `Subjects` based entirely on TLS client certificate information
2. A builder called `Sasl`, for `Subjects` based entirely on the SASL authorized id.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking out loud. I suppose there is a possibility of a use-case where some connections are using Sasl and some using Tls. You might want a Subject providing either Principal type. I suppose the door is open to a CompositeSubjectBuilder than comprises Tls and Sasl.

### File based Authorizer implementation
The default `SubjectBuilder` will return anonymous `Subjects`.
In addition to the default subject builder we will support two other `SubjectBuilders` within the `kroxylicious-runtime`:
1. A builder called `Tls`, for `Subjects` based entirely on TLS client certificate information
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be an error to configure a TlsSubjectBuilder and have downstream ingress configured for plain connections? I think no, I think the behaviour should be the Subject is Anonymous in this case (which is how your POC behaves, if I've understood the code correctly).

@k-wall k-wall mentioned this pull request Oct 13, 2025
Copy link
Member

@robobario robobario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to merging and carry on the review in the other PR

@tombentley tombentley changed the base branch from authorizer to main October 13, 2025 20:45
@tombentley tombentley changed the title Authorizer changes based on PoC 010 - Authorizer Filter (redux) Oct 13, 2025
@tombentley
Copy link
Member Author

@robobario, @k-wall closed his original PR, so I've repointed this one to main and changed the title to be more descriptive.

## Motivation

We are identifying use-cases where making authorization decisions over Kafka entities like topics at the proxy is desirable.
Examples include where one wishes to restrict a virtual cluster to a sub-set of the resources (say topics) of the cluster.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we go a little deeper on the motivation for using a proxy vs using Kafka ACLs. I could imagine a couple:

  1. Defense in depth. The proxy can provide a more restrictive authorization layer. It can focus on hardening a particular access path, rather than having to configure that across Kafka ACLs.
  2. The configuration will be centralized, potentially easier to audit than Kafka ACLs which look quite dynamic.
  3. Simpler configuration. We have the potential to use different auth models like RBAC to deduplicate authorization policies.

k-wall and others added 11 commits October 16, 2025 14:55
Signed-off-by: Keith Wall <[email protected]>
Signed-off-by: Tom Bentley <[email protected]>
Signed-off-by: Keith Wall <[email protected]>
Signed-off-by: Tom Bentley <[email protected]>
Signed-off-by: Tom Bentley <[email protected]>
Signed-off-by: Tom Bentley <[email protected]>
Signed-off-by: Tom Bentley <[email protected]>
Signed-off-by: Tom Bentley <[email protected]>
This reverts commit 5492206.

Signed-off-by: Tom Bentley <[email protected]>
Signed-off-by: Tom Bentley <[email protected]>
Signed-off-by: Tom Bentley <[email protected]>
Copy link
Member

@robobario robobario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work on this @k-wall and @tombentley, LGTM

*/
public interface SubjectBuilder {

CompletionStage<Subject> buildSubject(Context context);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we might need to make the FilterContext.clientSaslAuthenticationSuccess() a bit more flexible. I'm currently looking at the SubjectBuilder I've proposed for authz. One of the things that tries to do is allow implementations of SubjectBuilder use things like token introspection endpoints. The problem I see these APIs could push us into needing to use token introspection twice in some cases: Once to figure out the authorized id in order to call clientSaslAuthenticationSuccess() (for example introspecting on an opaque token), and then a second time in the SubjectBuilder (which is invoked with the authorizedId obtained from the call to clientSaslAuthenticationSuccess()) , if the subject should be based on some other datum exposed through that same endpoint.

In other words, for cases like this, the signature of clientSaslAuthenticationSuccess() is a bit inconvenient. We really want to do subject building as part of authN. That would allow hitting the endpoint once, getting all the information needed to build the Subject (which could be populating multiple principals based on different bit of info obtained from the endpoint), and then publishing that Subject through clientSaslAuthenticationSuccess(), rather then merely an authorizedId.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'd had thoughts in the same direction. I was thinking about JWT, and the possibility that the token includes a roles or groups claim. In this situation I want the Sasl Inspection Filter to be capable of publishing Role or Group Principals in addition to causing a SaslPrincipal (or whatever) to come in existence. I figure we don't need this functionality right now, so maybe defer this for another day.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about JWT, and the possibility that the token includes a roles or groups claim.

Exactly. And the need to address this use case feels inevitable in the longer term.

I guess we'd change clientSaslAuthenticationSuccess() to accept a Subject. Problems with doing that include:

  • We'd still need to cater for TLS-based Subjects, so there's still a need for SubjectBuilder elsewhere in the runtime. I.e. we can't just make move the SubjectBuilder API out of the kroxylicious-api module and make it an API used by SASL Filters which choose to opt in to supporting it.
  • In this PR I'm proposing a VC-level SubjectBuilder, which looks odd when SASL authz happens at the Filter level. We could just have a method for getting the SubjectBuilder from the FilterContext. However, nothing forces a SASL Filter to actually use it (it could just new-up a Subject itself), so user's intent in the configuration might not be honoured. Better if the configuration schema of such a filter reflected it's non-participation in using the SubjectBuilder.
  • Or maybe this is a reason to have a TlsSubjectBuilder interface in the api, configured within the tls of the config, and whose Context doesn't have the ClientSaslContext. And a SaslSubjectBuilder which needn't be in the api and could be scoped to any SASL Filter which was actually going to use it.

That last bullet feels like a much better model. But it's a way away from what's in this proposal. Getting from here to there would be an upheaval.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants