Conversation

@jabr
Contributor

@jabr jabr commented Nov 3, 2025

Adds an alias to the connections config to use in --connect option.

It also stores the name with the connection info when adding a machine, updates the name when renaming the machine, and removes the connection entry by name when removing the machine.

These are CLI-only changes. The --connect usage works for me locally, but I have not tested the add/rename/remove part. (We could remove the auto-update parts for an initial version, too. Even just being able to add manual names/labels is pretty convenient on its own.)
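
As a rough illustration of the alias lookup described above, here is a minimal Go sketch. The `Connection` type, its `Name` field, and `resolveConnect` are hypothetical stand-ins, not the actual Uncloud config types: the value passed to --connect is first matched against named entries, and otherwise treated as a plain SSH destination.

```go
package main

import "fmt"

// Connection is a hypothetical, simplified stand-in for a connection entry
// in the Uncloud config; the real struct may have different fields.
type Connection struct {
	Name string // optional alias, as proposed in this PR
	SSH  string // SSH destination such as "user@host"
}

// resolveConnect returns the configured connection whose Name matches value.
// If no entry matches, value is treated as a plain SSH destination.
func resolveConnect(conns []Connection, value string) Connection {
	for _, c := range conns {
		if c.Name != "" && c.Name == value {
			return c
		}
	}
	return Connection{SSH: value}
}

func main() {
	conns := []Connection{{Name: "home", SSH: "admin@192.168.100.200"}}
	fmt.Println(resolveConnect(conns, "home").SSH)              // matched by alias
	fmt.Println(resolveConnect(conns, "root@203.0.113.10").SSH) // used as given
}
```

Entries without a name never match, so existing unnamed connections keep their current behaviour in this sketch.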

…name}` option

Also updates config entries when adding, renaming, or removing a machine.
Owner

@psviderski psviderski left a comment

Overall, I like the idea of being able to name/label the connections 👍
Please see the comment discussing the change in --connect behaviour. Let me know what you think

		conn = &config.MachineConnection{
			SSH: config.SSHDestination(dest),
		}
	} else {
Owner

I have mixed feelings about this change. Originally, --connect was added to connect using an explicitly specified connection string without requiring the config:

      --connect string          Connect to a remote cluster machine without using the Uncloud configuration file. [$UNCLOUD_CONNECT]

The separation was IMO clear, but now --connect may involve the config as well, so I'm trying to figure out whether this could confuse users 🤔 How would you explain --connect now?

Most commands also define the --context flag, which we should have made global a long time ago. When --context is specified, this branch, which uses the current context from the config, will be misleading.

I think we shouldn't change the current behaviour of --connect to keep things simple.

What scenarios do you have in mind where you would want to manually specify which connection to use for a specific command? Maybe we should describe some strong use cases first to work out what the best UX is.

From my personal experience so far, I have had to manually switch connections only when I try to access my home cluster while I'm not at home (rarely). I described the use case here: #119 (comment). But in this case, I need to change the default once and then use it for the entire session until I'm back home. For my use case, having an interactive command for changing the default connection or a non-interactive one like uc ctx connection use <name>/<index> would be enough.

If for some reason one needs to run a one-off command connected to a specific machine, I think specifying its SSH target should be fine. Moreover, @luislavena introduced a new ssh+cli way to connect to machines (#152), which we plan to make the default. It will support any configuration and shortcuts defined in .ssh/config.

So if one defines something like this in .ssh/config:

Host myserver
    HostName 192.168.100.200
    User myuser

they can ssh to it using just ssh myserver. This will automatically become supported by uncloud:

uc machine init myserver
uc machine add myserver
uc --connect myserver COMMAND

I think this path of supporting .ssh/config looks more versatile as it will work with both regular ssh and uncloud the same way.

Please let me know what you think @jabr @luislavena

Contributor Author

@jabr jabr Nov 6, 2025

The inspiration to add it came up when working on the wireguard command: for that tool, you really need to be connecting directly to the machine you want to inspect, as proxying the gRPC call is likely not going to work when you actually need to use that command (i.e. the machines aren't talking properly so they can't proxy the call).

So I wanted to use the --connect option for that, but I couldn't because my ssh connection strings are complicated:

  1. I have a separate keypair for my uncloud machines, so just doing --connect user@host wasn't enough -- I would need to specify the key somehow, too. (Maybe the -i option works with --connect? I didn't think to try that at the time.)

  2. The user names for the different machines are a bit odd (cloud host provisioned "system" accounts which are different on some machines and have random numbers for some reason).

  3. I'd need to configure /etc/hosts aliases for the IPs.

So I was just editing the config.yaml to switch the default whenever I actually wanted to use the --connect option.

However, the ssh+cli option will now make that a lot better (I do have them configured properly in my ~/.ssh/config), so it'll just be --connect ssh+cli://{machine-name} for me.

For what it's worth, I'd tried that --connect {machine-name} approach early on when starting with uncloud, expecting it to work like this. I might be the only one, but it was just a bit of UX/DX polish, imo -- it felt like that "should work" to me. 🤷

Maybe it should just be logic in the wireguard command, like uc wg show <machine>, since that was the most pertinent use case I had? Or another "protocol prefix" like tcp:, ssh+cli:, etc in the connect string? Another option flag to not overload connect and confuse the interaction with context?

Anyway, I'm not hard set on anything here. Just going through my thought process on why I ended up with it like this. 😄

Contributor Author

@jabr jabr Nov 6, 2025

Another thought I had last night:

The one thing in this branch that I think we're likely all agreed is useful is removing the connection entry for a machine when the machine itself is removed from the cluster. (There was a TODO note in the code about doing that.)

To do that, we'll need something to connect the entry to the machine itself, but perhaps that should just be the machine ID. And then this branch (or another branch/PR might be better for a clean start) just does the add/remove bits of this but with the ID not the name (and machine rename no longer being relevant).

I'm not sure the "named connection" is then even necessary with Luis's ssh+cli option now, but if we decide to do something like it, we can do that as a separate issue, which should be able to build on the machine ID association we add here.

What do you think?
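
The machine ID association suggested above could look roughly like the following Go sketch. The `Connection` struct, its `MachineID` field, and `removeByMachineID` are illustrative assumptions rather than the project's real types: each entry records the ID of the machine it points to, and removing a machine drops the entries carrying that ID.

```go
package main

import "fmt"

// Connection is a hypothetical config entry that remembers which machine
// it belongs to. Field names are assumptions made for this sketch.
type Connection struct {
	SSH       string
	MachineID string // empty for entries added manually
}

// removeByMachineID returns the connections that are not tied to machineID.
// Entries with an empty MachineID are always kept, and an empty machineID
// argument removes nothing.
func removeByMachineID(conns []Connection, machineID string) []Connection {
	kept := make([]Connection, 0, len(conns))
	for _, c := range conns {
		if machineID != "" && c.MachineID == machineID {
			continue
		}
		kept = append(kept, c)
	}
	return kept
}

func main() {
	conns := []Connection{
		{SSH: "root@10.0.0.1", MachineID: "m-1"},
		{SSH: "root@10.0.0.2", MachineID: "m-2"},
		{SSH: "manual@host"},
	}
	for _, c := range removeByMachineID(conns, "m-1") {
		fmt.Println(c.SSH)
	}
}
```

Because the ID does not change when a machine is renamed, no rename handling is needed in this variant.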

Owner

@psviderski psviderski Nov 10, 2025

I really appreciate you sharing your thought process! This all makes a lot of sense.

For what it's worth, I'd tried that --connect {machine-name} approach early on when starting with uncloud, expecting it to work like this. I might be the only one, but it was just a bit of UX/DX polish, imo -- it felt like that "should work" to me. 🤷

I'm pretty sure you're not the only one and I'd personally love to be able to make this work like that reliably.
We could easily do that if we had a central authority we could talk to (e.g. a cloud service each machine connects to). But we're in a slightly weird position where there is no single place that maintains up-to-date information about the machines in the cluster and the connections to them.

Slightly tangential to this PR, but I'm planning to create a managed public service that machines could optionally connect to, to simplify access and management. In this case, the connection may look like this (no ssh):

contexts:
  managed-cluster-access:
    connections:
      - hub: hub.uncloud.run
        token: <secret>

But this will only complement the current ssh options. So the problem with managing ssh connections still exists.

The one thing in this branch that I think we're likely all agreed is useful is removing the connection entry for a machine when the machine itself is removed from the cluster. (There was a TODO note in the code about doing that.)

Agreed

To do that, we'll need something to connect the entry to the machine itself, but perhaps that should just be the machine ID. And then this branch (or another branch/PR might be better for a clean start) just does the add/remove bits of this but with the ID not the name (and machine rename no longer being relevant).

Using the machine ID sounds like the most "correct" (most stable) approach we've considered so far, although not the most user-friendly IMO. Note also that this won't save us from a stale config when there are multiple cluster users and one of them removes a machine. We can make a best effort, but unfortunately this still won't be a complete solution. I also don't expect machines to be added or removed very often, so it may not be much of a problem.

I'm not sure the "named connection" is then even necessary with Luis's ssh+cli option now, but if we decide to do something like it, we can do that as a separate issue, which should be able to build on the machine ID association we add here.

I would love to make @luislavena's work on ssh+cli the default first (assume the ssh+cli:// prefix if none is specified) and see how it works. The ssh config provides so many different ways of accessing machines, and it would be amazing to piggyback on this. Maybe it will cover all the needs in practice, so we won't need to build anything custom ourselves.
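
The "assume the prefix if none is specified" idea could be sketched in Go as below. `withDefaultScheme` and `defaultScheme` are hypothetical names, and detecting an existing scheme by looking for `://` is an assumption made for this sketch; the real CLI may parse connection strings differently.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultScheme is the prefix assumed when the user gives none.
// Making ssh+cli the default is the proposal discussed in this thread.
const defaultScheme = "ssh+cli://"

// withDefaultScheme leaves dest unchanged if it already carries a scheme
// (detected by "://") and prepends defaultScheme otherwise.
func withDefaultScheme(dest string) string {
	if strings.Contains(dest, "://") {
		return dest
	}
	return defaultScheme + dest
}

func main() {
	fmt.Println(withDefaultScheme("myserver"))           // bare alias from .ssh/config
	fmt.Println(withDefaultScheme("ssh+cli://myserver")) // explicit scheme is kept
}
```

With this rule, `uc --connect myserver COMMAND` would behave the same as passing the full ssh+cli://myserver form.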

Contributor Author

I would love to make the @luislavena's work on ssh+cli the default first (assume ssh+cli:// prefix if none is specified) and see how it works.

Yeah, I like that approach. It would also have "just worked" for me with my initial expectations that way, too, since I have ~/.ssh/config entries with the same aliases as my machine names.

I'd like to close this PR now, but I'd be happy to make new ones for either of these if you want to pursue them now:

  1. making ssh+cli the default over ssh
  2. adding machine id to the connection config entries on machine add and removing corresponding ones on machine rm

Owner

  1. making ssh+cli the default over ssh

We can't make it the default until we address the issue described here: #173 (comment). It would also be good to test it for some time in practical scenarios before switching the default.

  2. adding machine id to the connection config entries on machine add and removing corresponding ones on machine rm

Let's do this one. This approach seems to be the most pragmatic of what we've discussed

Contributor Author

PR for storing Machine ID in corresponding connection (and removing with machine): #182

Closing this one.


	// Remove the connection to the machine from the uncloud config if it exists.
	if uncli.Config != nil {
		context := uncli.Config.Contexts[uncli.Config.CurrentContext]
Owner

As I mentioned in another comment, we should use the manually specified context (opts.context) if specified and fall back to the current one.
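
The fallback requested here might be sketched in Go like this. The `Config` and `Context` types only mirror the shape implied by the reviewed snippet (a `Contexts` map keyed by name plus `CurrentContext`), and `selectContext` is a hypothetical helper; its `override` parameter stands in for `opts.context`.

```go
package main

import "fmt"

// Context and Config are simplified, hypothetical versions of the config
// types referenced in the reviewed code.
type Context struct {
	Connections []string
}

type Config struct {
	CurrentContext string
	Contexts       map[string]*Context
}

// selectContext prefers the explicitly requested context (override) and
// falls back to the config's current context when override is empty.
func selectContext(cfg *Config, override string) (string, *Context, error) {
	if cfg == nil {
		return "", nil, fmt.Errorf("no config loaded")
	}
	name := cfg.CurrentContext
	if override != "" {
		name = override
	}
	ctx, ok := cfg.Contexts[name]
	if !ok {
		return "", nil, fmt.Errorf("context %q not found in config", name)
	}
	return name, ctx, nil
}

func main() {
	cfg := &Config{
		CurrentContext: "default",
		Contexts: map[string]*Context{
			"default": {Connections: []string{"root@10.0.0.1"}},
			"home":    {Connections: []string{"admin@192.168.100.200"}},
		},
	}
	name, _, _ := selectContext(cfg, "")
	fmt.Println(name) // falls back to the current context
	name, _, _ = selectContext(cfg, "home")
	fmt.Println(name) // explicit context wins
}
```

Returning an error for an unknown name avoids silently editing the wrong context's connection list.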

@jabr jabr changed the title feat: Add name property to config connections for use in --connect {name} option feat: [draft] Add name property to config connections for use in --connect {name} option Nov 9, 2025
@jabr jabr closed this Nov 19, 2025