-
Notifications
You must be signed in to change notification settings - Fork 206
netutilsv2 #328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
netutilsv2 #328
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danwinship The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
net/v2/ipfamily.go
Outdated
_, masklen := cidr.Mask.Size() | ||
if masklen == 128 || (family == IPv4 && masklen == 32) { | ||
return family | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's annoying to have to check this manually, but it just seemed wrong to have it claim the address was IPv4 or IPv6 when it stringifies to "<nil>"
...
net/v2/ipfamily.go
Outdated
// - all elements of ips are valid | ||
// - at least one IP from each family (v4 and v6) is present | ||
func IsDualStackIPs(ips []net.IP) (bool, error) { | ||
func IsDualStackIPs(ips []net.IP) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// IPFamily refers to a specific family if not empty, i.e. "4" or "6". | ||
// IPFamily refers to the IP family of an address or CIDR value. Its values are | ||
// intentionally identical to those of "k8s.io/api/core/v1".IPFamily and | ||
// "k8s.io/discovery/v1".AddressType, so you can cast values between these types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See danwinship/kubernetes@94fc8f2 for the initial changes mostly in places that were already using IPFamilyOf, but part of the idea is that there are other places that could be rewritten to use IPFamilyOf efficiently now.
|
||
type cidrOrString interface { | ||
*net.IPNet | string | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to avoid needing to also add IPFamilyOfAddr
, IPFamilyOfPrefix
, IsIPv4Addr
, IsIPv4Prefix
, IsIPv6Addr
, IsIPv6Prefix
, IsDualStackAddrs
, and IsDualStackPrefixes
later...
The implementation feels like I'm cheating with the generics, but it looks good in use: danwinship/kubernetes@9229e7b
FWIW I originally merged the IP and CIDR methods together, so you could just do IPFamilyOf("192.168.0.1")
and IPFamilyOf("192.168.0.0/16")
, rather than needing IPFamilyOfCIDR
, but I eventually decided I didn't like that, in part because it meant that things like IsDualStack("1.2.3.4", "::/128")
would compile...
// non-validated input strings. It should be used instead of net.ParseIP (which rejects | ||
// some strings we need to accept for backward compatibility) and the old | ||
// netutilsv1.ParseIPSloppy. | ||
func ParseIP(ipStr string) (net.IP, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition of the error return value is frequently annoying for callers (danwinship/kubernetes@8cb7e60, though in other places it's easy to ignore, danwinship/kubernetes@9adeb45). But this is more consistent (and the addition of MustParseIP
later helps for unit tests).
Also, making the return value consistent ((X, error)
) helps with the generic Must and List methods later...
// The return value is equivalent to the second return value from net.ParseCIDR. Note that | ||
// this means that if the CIDR string has bits set beyond the prefix length (e.g., the "5" | ||
// in "192.168.1.5/24"), those bits are simply discarded. | ||
func ParseIPNet(cidrStr string) (*net.IPNet, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most ParseCIDRSloppy callers ignore the first argument anyway (danwinship/kubernetes@dce0e89). Half of the ones that don't were trying to manually implement what IPFromInterfaceAddr
does, so they can use that now. (The remainder will use ParseIPAsIPNet, coming in a few commits.)
And again, the consistent return value helps with generics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, the renaming from ParseCIDR
to ParseIPNet
is to be consistent with ParseIP
, ParseAddr
, and ParsePrefix
; each function uses the basename of the type it returns.
// CIDR's prefix length. | ||
// | ||
// Compare ParseAddrAsPrefix, which returns a netip.Prefix, but is otherwise identical. | ||
func ParseIPAsIPNet(cidrStr string) (*net.IPNet, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the naming here... I occasionally feel like it should be the other way around? (ParseIPNetAsIP
... though the idea of the existing name is that you're parsing an IP, and getting the value as an IPNet). Maybe I should just use the "ifaddr" naming. (ParseIFAddrAsIPNet
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this "ParseIP..." and not "ParseIPNet..." or "ParseCIDR..." if the input is "cidrStr"?
ParseCIDRAsIPNet()
?
Or maybe lean on "Addr" and "Prefix" names as the norm, so ParsePrefixAsIPNet()
?
More generally Parse(Addr|Prefix)(As(IP|IPNet))?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this "ParseIP..." and not "ParseIPNet..." or "ParseCIDR..." if the input is "cidrStr"?
So as I said, I'm still not quite sure how to name these functions correctly, but the important point is that cidrStr
is an IP address written in CIDR form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I am out of touch with the formal language?
An IP address says to me: 1.2.3.4
or ::1
or similar.
CIDR notation says to me: 10.0.0.0/8
or ::1/128
As soon as you hit the "/"
it's not longer an "IP address". I know "CIDR" isn't really a noun, but we all tend to use it as one, right?
So if I see a function ParseCIDR...()
or a variable cidr string
I expect an IP, a slash, and an int. (I don't know if we allow a naked IP and perhaps infer /32 or /128 ?)
If I see a function ParseIP...()
or a variable ip string
, I expect a naked IP, and anything further makes it not an IP.
Am I misunderstanding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's syntactically a CIDR string, but it's semantically an IP address.
99% of CIDR strings in Kubernetes represent either a subnet (as in node.status.podCIDRs[0]
) or a mask (as in networkPolicy.spec.egress[0].to[0].ipBlock.cidr
), and that's what ParseIPNet
/ParsePrefix
are for.
This function is for CIDR strings that represent an IP address with associated information about its subnet attached (eg "192.168.1.5/24") as in resourceClaim.status.devices[0].networkData.ips[0]
. (And nowhere else in core k8s, though they're common in CNI plugins because that's how IPs on network interfaces are normally represented.)
Since this is mostly used for addresses on interfaces, we call it "ifaddr" format in some places, and I'd mentioned we could do that here (ParseIFAddrAsIPNet
/ ParseIFAddrAsPrefix
).
(We need separate functions for subnet/mask-like CIDRs and ifaddr-like CIDRs to preserve backward-compatible handling of pre-KEP-4858 CIDR fields: traditionally, if a user put a value like "192.168.1.5/24" into a NetworkPolicy ipBlock, code using ParseCIDRSloppy
would see it as though you had said "192.168.1.0/24", and so ParseIPNet
/ParsePrefix
work the same way. You need the new function for cases like the DRA field, where you don't want that backward-compatible ignoring of the lower bits.)
} | ||
|
||
// MustParseIP is like ParseIP, but it panics on error instead of returning an error value. | ||
var MustParseIP = must(ParseIP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So these get used A LOT in k/k in unit tests, eg danwinship/kubernetes@88ba5a5, danwinship/kubernetes@e47bc9b.
In my first version of this, I actually had an exported generic method rather than exported individual functions, so callers would do ip := netutils.MustParse[net.IP]("1.2.3.4")
or prefix := netutils.MustParse[netip.Prefix]("1.2.3.0/24")
. One problem with this though was that there was no way to use the macros with both ParseIPNet
and ParseIPAsIPNet
though, since netutils.MustParse[*net.IPNet]
can only mean one thing. Oh, another annoying thing is that you need to include the *
in *net.IPNet
, but of course not with any of the other types.
So then I thought about netutils.MustParse[netutils.ParseIP]("1.2.3.4")
or netutils.MustParse(netutils.ParseIP("1.2.3.4"))
(yeah, you can make that work, kubernetes/kubernetes#127882). But I didn't like the fact that you have to say netutils
twice... So anyway, that led to what you see here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other reason I'd originally had netutils.MustParse[net.IP]()
was because I was also originally trying to use netutils.Parse[net.IP]
and netutils.Parse[netip.Addr]
rather than having separate functions for the two types. But that was based on the idea that golang would do more type inference than it actually does and so you'd be able to omit the type constraints just like you can with the genericized IPFamilyOf
. But that doesn't actually work; golang never infers the type of a return value, so you end up always needing to include the constraint, so it ends up being ugly.
var MustParseIPAsIPNetList = mustlist(ParseIPAsIPNetList) | ||
|
||
// MustParseAddrAsPrefixList parses a list of strings with ParseAddrAsPrefix and returns a []netip.Prefix or else panics on error. | ||
var MustParseAddrAsPrefixList = mustlist(ParseAddrAsPrefixList) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also thought about having a canonicalization wrapper, that returns an error if the parsed result doesn't re-stringify back to what you passed in. But then, like, do we need to have MustParseIPListCanonical
and every other possible combination? So, that's a point in favor of exposing the generic functions rather than exposing the specific methods, so then someone could do netutils.MustParse(netutils.ParseList(netutils.Canonical(netutils.ParseIP)))
...
or not...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this overall. Some of the "when is it a pointer and when is it a value" is annoying, but that seems like vestiges of legacy net
APIs.
Should we mark all new APIs which use net
types as deprecated-at-launch and start the process of converting to newer netip
APIs?
Should we lean on the word "Legacy" or "Old" ?
Should we make the old (v1) functions forward to v2 functions when we can?
return func(str string) T { | ||
ret, err := parse(str) | ||
if err != nil { | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want to add a prefix like "must() failed: %s" or something?
} | ||
} | ||
|
||
// MustParseIP is like ParseIP, but it panics on error instead of returning an error value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these should probably explain "for test use only" -- Go doesn't give us a good way to enforce that AFAICT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could have net/v2/testing
...
(I feel like we ought to be able to use MustParseIP
when parsing already-validated fields from API objects. Certainly there is lots of existing code that uses net.ParseIP
and doesn't bother to check if there was an error and possibly eventually crashes if it got back nil
. But also, it just feels wrong to me to actually panic on bad data even if we allegedly guaranteed it can't be bad... But ALSO, I hate it when I see code handling parse errors on values that can't possibly fail to parse... I've been thinking we should make more use of pre-parsed field types in API objects. Like, change all the IP-valued fields from string
to "metav1.IP
", which is something like struct { Raw string; Canonical string; Parsed netip.Addr }
so that then people who need the parsed value already have it...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we ought to be able to use MustParseIP when parsing already-validated fields from API objects.
I guess that's OK, but is it better than just assigning the error to _
?
Like, change all the IP-valued fields from string to "metav1.IP", which is something like struct { Raw string; Canonical string; Parsed netip.Addr } so that then people who need the parsed value already have it...)
I love that, and I wish we had done that. The problem is that the type definitions we use for clients of the API (e.g. k,ube-proxy) are also used to emit protobuf and docs and everything else. Changing anything is a breaking change.
If I have my way (declarative validation leading the charge), we end up with an IDL which describes the on-the-wire schema (and generates openapi docs and proto, etc) and ALSO generates client code which would be wonderful for what you are describing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for a testing sub-pkg, I don't think we need to go that far, I think comments alone are useful.
func ParseIP(ipStr string) (net.IP, error) { | ||
// Note: if we want to get rid of forkednet, we should be able to use some | ||
// invocation of regexp.ReplaceAllString to get rid of leading 0s in ipStr. | ||
ip := forkednet.ParseIP(ipStr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we ever get to a place where we don't allow leading 0s? If we did leading-zero cleanup in the API (e.g. upon read) then this layer could be clean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KEP-4858 doesn't do anything about config files/CLI arguments. We currently don't even warn about leading 0s in that case.
In the larger ecosystem, there may also be aggregated APIs that didn't get updated along with KEP-4858, CRDs with pre-kubernetes#121912 IP/CIDR validation, and IPs/CIDRs in annotations.
We could explicitly distinguish legacy-vs-"correct" parsing in the API like we now do for utilvalidation.IsValidIP
/ utilvalidation.IsValidIPForLegacyField
, and encourage people to use the "correct" variants when appropriate, but we'd have to do it separately for the net
and netip
types, so that would be 4 new base functions, plus, potentially, corresponding Must
, List
, etc functions... (unless we can find a better way to deal with the API combinatoric explosion...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Declarative validation has a notion of automatic ratcheting. If, during an update, a field has not change from the "old" value, we don't even bother to validate it. This means that some of the "IsValidIPForLegacyField" stuff should be handled automatically. We have not done a full audit yet, and I know if a few places that won't be sufficient, and we will need more.
Let's clarify our goal.
WRT validation, I think we can get to a place where we do not allow new uses of leading-zero IPs in the API and config files. It will tolerate existing uses of them but not allow new uses, and when they are found it will simply pass the value on to clients without change.
The question then is: can we ever get to a place where we don't pass the value on to clients (so libraries like this don't need to handle them)? I forget if your KEP covered this?
E.g. we could "sanitize" them at the API layer, at the potential cost of apply-loop. Like, on read we convert "001.002.003.004" to "1.2.3.4" and on write we do the same, storing the sanitized form.
Or are we FOREVER going to have forkednet
and a caveat to kube API clients that you MIGHT hit one of these...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WRT validation, I think we can get to a place where we do not allow new uses of leading-zero IPs in the API and config files.
Right; we already basically have that for API. I forget what the last new legacy IP field was, but it was actually added several releases prior to the start of KEP-4858.
I kind of ended up punting on config files, and maybe I need to get back to that. Config validation generally doesn't use utilvalidation
, so they are not using strict validation yet, even conditionally. (Or even getting warnings). (I think the failure to use utilvalidation
is because the same code gets used to validate config file and CLI args, and it would be confusing to give errors in terms of field paths when the user had provided the value via the CLI?)
But config files are also impossible to ratchet, because they have no way of distinguishing config values the admin just added to the file, from config values that the admin added to the file 5 releases ago. So for config files, we'd need to warn about bad values for a while, and then eventually just start rejecting them unconditionally.
can we ever get to a place where we don't pass the value on to clients (so libraries like this don't need to handle them)? I forget if your KEP covered this?
We deferred any decisions about fixing existing data (kubernetes/enhancements#4899 (comment)). This has been brought up multiple times since we first started scrambling to deal with the original golang behavior change (eg, initial comment of kubernetes/kubernetes#108074) but I don't think we've ever deeply thought about the pros and cons.
Or are we FOREVER going to have
forkednet
We can get rid of forkednet
with some regex
if we really want to...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are config and flags really a problem? I do not feel they are, IIRC the problem was with the APIs because different actors can interpret different meanings. However, flags and configs are interpreted only by one component, so there can not be a problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, for config/flags the only thing we would really have to worry about is a component reading a "bad" value and then passing it to someone else.
But actually, since there's no "reconciliation loop" problem with config/cli, we could just do repair-on-read and always pass the canonicalized value to the component, right?
So I guess the fix for config/CLI is to add some helpers to log warnings and then canonicalize the values when reading in bad config values.
// CIDR's prefix length. | ||
// | ||
// Compare ParseAddrAsPrefix, which returns a netip.Prefix, but is otherwise identical. | ||
func ParseIPAsIPNet(cidrStr string) (*net.IPNet, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this "ParseIP..." and not "ParseIPNet..." or "ParseCIDR..." if the input is "cidrStr"?
ParseCIDRAsIPNet()
?
Or maybe lean on "Addr" and "Prefix" names as the norm, so ParsePrefixAsIPNet()
?
More generally Parse(Addr|Prefix)(As(IP|IPNet))?
// Note that "k8s.io/utils/net/v2".IPFamily intentionally has identical values to | ||
// "k8s.io/api/core/v1".IPFamily and "k8s.io/discovery/v1".AddressType, so you can cast | ||
// the return value of this function to those types. | ||
func IPFamilyOfCIDR[T cidrOrString](val T) IPFamily { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of a poor-man's function overload. I personally don't hate it, but it smells kind of un-go-like.
Are there other places in the ecosystem that do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yeah, it's totally un-go-like, as seen by the ugliness of the implementation. (But it feels right when you actually use the functions...) As commented elsewhere, I did it this way to avoid needing to add another set of methods for the netip
types, but given that we need separate parsing methods for net
and netip
types, maybe having separate categorization methods for them too isn't awful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more go-ish to just explode the number of functions and then fan-in on the impl. CF go doc regexp
, which says:
There are 16 methods of Regexp that match a regular expression and identify the
matched text. Their names are matched by this regular expression:
Find(All)?(String)?(Submatch)?(Index)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to find examples in other libraries of this usage but I couldn't find it ... generics provide the benefit of type safety at compile time, but since one of the types set is a string I wonder the benefits vs casting an interface
The go stdlib has some similar things, like ioutil => io, json/v2 .. |
Yes.
When we had first started talking about this back when |
Kubernetes is big in the Go ecosystem -- we should embody the change we want to see. This looks like a great move directionally. |
Right, hopefully our moving toward
The biggest porting problem is actually our usage of the But anyway, maybe it would make sense to port all of our internal usage to |
"strings" | ||
) | ||
|
||
// AddrFromIP converts a net.IP to a netip.Addr. Given valid input this will always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be also part of golang/net library IMHO , it is a very dangerous behavior and subtle
if bits == 128 && addr.Is4() && (bits-prefixLen <= 32) { | ||
// In the same way that net.IP allows an IPv4 IP to be either 4 or 16 | ||
// bytes (32 or 128 bits), *net.IPNet allows an IPv4 CIDR to have either a | ||
// 32-bit or a 128-bit mask. If the mask is 128 bits, we discard the | ||
// leftmost 96 bits. | ||
prefixLen -= 128 - 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😵 😜
// Make sure this can be called repeatedly without explosions. | ||
select { | ||
case <-ml.stopCh: | ||
return fmt.Errorf("use of closed network connection") | ||
default: | ||
} | ||
|
||
// Tell all sub-listeners to stop. | ||
close(ml.stopCh) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was my fault on the original PR , anyway you should update the code to fix this possible panic
@@ -0,0 +1,202 @@ | |||
/* | |||
Copyright 2024 The Kubernetes Authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(random nit: this is 2024, unsure if that really matters though)
It's no longer used in k/k and is considered a bad idea now.
These aren't currently used by k/k anyway, and could be replaced with set[netip.Addr] and set[netip.Prefix] in new code. Also, ParseIPSet() and ParseIPNets() had strange whitespace-ignoring semantics that we don't want anyway.
99% of callers are calling these functions on already-validated data, making the error return values just annoying.
A manually-constructed IPNet could have a Mask that doesn't match its IP (in which case .String() would return "<nil>" and .Contains() would always return false). So treat that as IPFamilyUnknown.
(so you can cast between them)
- Squash the String functions (eg, IsIPv4String, IsDualStackCIDRStrings) into their corresponding non-string versions (eg, IsIPv4, IsDualStackCIDRs). The new versions use generics to support both types. - Add IsDualStackPair and IsDualStackCIDRPair - Add OtherIPFamily
All the functions that previously took a net.IP or string now also take a netip.Addr. All the functions that previously took a *net.IPNet or string now also take a netip.Prefix.
We will need to deal with both net.IP/net.IPNet and netip.Addr/netip.Prefix for a while, so add functions to convert between the types.
net.InterfaceAddrs() has a weird and useless return value that various pieces of code parse in variously correct ways. Add functions to reliably convert its return values to net.IP or netip.Addr.
Replace ParseIPSloppy and ParseCIDRSloppy with ParseIP and ParseIPNet, where both have a single return value plus an error.
This is the latest version of what was originally kubernetes/kubernetes#122549.
Big ideas:
I'm not sure that "k8s.io/utils/net/v2" is necessarily the right way to do this. An alternative would be to add all the new stuff to "k8s.io/utils/ip" instead (and deprecate the IP-related methods in utils/net).
https://github.com/danwinship/kubernetes/commits/netutilsv2/ is an example of porting k/k to this, which helps to motivate some of the new stuff (particularly the "MustParse*" and "Parse*List" functions).
/kind feature
/kind cleanup
/cc @aojea