-
Notifications
You must be signed in to change notification settings - Fork 1.1k
V2 certificate format #1216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
V2 certificate format #1216
Conversation
9f75b80 to
a09f39f
Compare
--------- Co-authored-by: Jack Doan <[email protected]>
…didn't work on the first try (#1268)
|
TODO: |
|
This PR no longer exports the certificate structs; neither v1 or v2: Before: It is going to be a big breaking change and inconvenience for anyone who has used and uses the libraries. Would prefer to keep these exported. I see that it is a breaking change PR, but it should at least allow restoring the same functionality as was available before. |
|
I see now it’s switched to a factory approach, with TBSCertificate that is exported. 👍 A few thoughts to consider: A helper function for converting a Certificate back to a TBSCertificate would be useful to complete the factory loop and allow manipulation of existing signed certificates (as unsigned ones). There are also some Marshal functions like Marshal(), but without a corresponding Unmarshal(). I imagine most people are using pem encoded certs, but if Marshal() is exposed there should probably be a corresponding Unmarshal(). |
cert/cert_v2.go
Outdated
| networks []netip.Prefix | ||
| unsafeNetworks []netip.Prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mentioned this as a reply to a resolved comment already, but in case it's folded and not seen: I think the name networks and unsafeNetworks are confusing. The former one is an address + its on-link network prefix, while the latter is just a network prefix. For the former, only a single address is routable to the host, while all addresses in the network is routable to the host for the latter.
I would propose the name to be addresses (I think it's fine even though this expects an CIDR, as the input is same as what ip address add command expects) which clearly indicate that the address part of the network prefix is significant, and routable_networks (or maybe unsafe_routable_networks) which clearly indicates that these are CIDRs for routing only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I share concerns of confusion with the naming here. I'm also comfortable sticking with the old name which at least creates no new confusion (especially with respect to existing documentation and discussions): #1216 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also be open to the name addresses for the networks field. I think the name networks came up when we were considering it in the context of a CA certificate? Using that same field name for two very-similar-but-different purposes muddies the waters a little bit, but the intent was that it shows which networks (as V2 certs let you have more than one!) the host participates in, as well as the actual address assignment on non-CA certs. However, the ip address add argument is very persuasive and probably much more obvious to new users.
Naming unsafeNetworks is a little trickier. When the addresses field is named networks, I think it's a pretty logical name, but unsafeAddresses doesn't really work. I'd propose something like unsafeRouterFor: the prefix unsafe links it mentally to unsafe_routes in the config, and I think RouterFor shows that the field expresses a capability, rather than something that might affect how "regular" Nebula traffic is routed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think addresses places emphasis on the use of the ip; networks on the prefix use. But ultimately both are used and the content passed in is CIDRs? Perhaps better to reflect what it is and what is passed in by the user rather than the things it’s used for (similar to before where it was called Ips but CIDRs would probably be more accurate).
The bigger challenge before was having to manually change the ip when using ParseCIDR because it returned the base address instead of the provided IP, and the confusion of the certificate field ‘subnets’. Both of those are no longer an issue so bounds better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As Jack said, the issue is that the field is used for dual purposes. For a non-CA cert, addresses and unsafeRouterFor makes more sense, but for CA cert, networks and unsafeNetworks make more sense. I would say it makes more sense to cater for the non-CA cert case since there're more of them, or perhaps they can have different names for different nebula-cert commands?
* enforce certificate correctness in TBSCertificate.SignWith * check length, not nil * Address review comments * github hates me --------- Co-authored-by: Nate Brown <[email protected]> Co-authored-by: Jack Doan <[email protected]>
wadey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved to merge and fix forward. Left a few comments we can fix in the future
| **/coverage.out | ||
| **/cover.out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| **/coverage.out | |
| **/cover.out | |
| coverage.out | |
| cover.out |
You can just remove the leading / and it matches that path in any folder.
| cf := caFlags{set: flag.NewFlagSet("ca", flag.ContinueOnError)} | ||
| cf.set.Usage = func() {} | ||
| cf.name = cf.set.String("name", "", "Required: name of the certificate authority") | ||
| cf.version = cf.set.Uint("version", uint(cert.Version2), "Optional: version of the certificate format to use") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we default to version 1 for now?
| // Race loser renews and handshakes | ||
| // Does race winner repin the cert to old? | ||
| //TODO: add a test with many lies | ||
| func TestV2NonPrimaryWithLighthouse(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like all of the other e2e handshake tests are for V1 certs? Do we need more tests with V2?
| droppedLocalAddr: metrics.GetOrRegisterCounter("firewall.incoming.dropped.local_addr", nil), | ||
| droppedRemoteAddr: metrics.GetOrRegisterCounter("firewall.incoming.dropped.remote_addr", nil), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to indicate in the release notes that these metrics names are changing
This is a near complete implementation of an ipv6 enabled overlay network. There are a handful of pull requests targeting this branch in addition to a number of incomplete issues within this PR.
I have tried to highlight the main trouble spots with comments on the PR and in addition here are my internal notes:
nebula-certis defaulting to mint bothv1andv2certificates and nebula config is defaulting to transmitv1certificates if both are present. This will lead to a situation where net new adopters will be stuck using v1 certificates. Should this be the default behavior?Currently nebula wont allow you to remove a
v1certificate on reload but this means a restart is required to complete a migration tov2. A comment on this review highlights this, we should allow it.Currently there are a few spots (control, ssh) where we return the certificate in use but they only return 1 cert, should we leave it returning the default or force folks to consume both? How do we indicate which one is default?
Every time we encounter a certificate we need to sort the networks as well as only record the union of applicable addresses. See [cert-v2] punchy-respond on an address in common with the querying host #1261
We need to take extra care to ensure we never allow a ipv4 mapped ipv6 address. This is not currently addressed in this PR.
There may be an opportunity to improve the lighthouse protocol by directly using the
MarshalBinaryandUnmarshalBinaryfunctions fornetipobjects. This is more of a nit than anything but we should agree on this before merging.nebula-certsignhelp text needs to talk about primary vpn address (or the union of effective addresses) being the first one after being sorted and what that means for tunnels.The firewall config still uses naming like
ca-shashould we alias this toca-fingerprintto normalize the verbiage?The firewall config still uses naming like
ipshould we alias this tonetworkto normalize the verbiageNormalize all names for
Marshal*andUnmarshal*, we have someFrom*/To*and others withoutFrom/To.Should we use this moment to swap p256 keys to use their compressed form? Saves ~32 bytes.
Once this is merged the upgrade path should likely be:
pki.default_versionto1. Everything continues to usev1protocols.am_relay: truehostspki.default_versionto2. Relays can now initiatev2tunnels, but this is an uncommon flow.pki.default_versionto2. Lighthouses will reply tov2hosts usingv2protocols.pki.default_versionto2.v1certs from all hosts