Experiment with managing external machines with lima #2000

Draft
wants to merge 15 commits into base: master

afbjorklund
Member

@afbjorklund afbjorklund commented Nov 15, 2023

Similar to the docker-machine "generic" driver, bring your own virtual machine (or physical server)

Not so useful in itself, but not so bad when made into a real driver or wrapped with helper scripts...

NAME          STATUS     SSH                   VMTYPE    ARCH      CPUS    MEMORY    DISK      DIR
beaglebone    Running    192.168.7.2:22        ext       armv7l    1       512MiB    4GiB      ~/.lima/beaglebone
core          Stopped    127.0.0.1:0           qemu      x86_64    1       1GiB      100GiB    ~/.lima/core

vmType: ext

arch: "armv7l"
cpus: 1
memory: 512MiB
disk: 4GiB

# We do not have arm-v7 binaries of containerd
containerd:
  system: false
  user: false

ssh:
  address: 192.168.7.2

Requires:


Installed lima-guestagent, and nerdctl from tarballs/binaries.

$ _output/bin/limactl shell beaglebone nerdctl version
Client:
 Version:	v1.7.0
 OS/Arch:	linux/arm
 Git commit:	e674fe7ba6e49f12e88cd9c6c442e7ea5232502c
 buildctl:
  Version:	v0.12.3
  GitCommit:	438f47256f0decd64cc96084e22d3357da494c27

Server:
 containerd:
  Version:	v1.7.6
  GitCommit:	091922f03c2762540fd057fba91260237ff86acb
 runc:
  Version:	1.1.9
  GitCommit:	v1.1.9-0-gccaecfc

limactl guest-install beaglebone


Hardware

https://www.beagleboard.org/boards/beaglebone-black

You could also use a Raspberry Pi Zero*, or a cloud droplet.

* This needs the Zero 2 for arm-v7 (the previous model was arm-v6)

https://www.raspberrypi.com/products/raspberry-pi-zero-2-w/

Discussion

@afbjorklund
Member Author

afbjorklund commented Nov 18, 2023

The probes are somewhat annoying when not using cidata, but that was the same story on FreeBSD and others.

Maybe there should be some fallback implementation, at least for the ssh-ready, guestagent install, and boot-done probes?

sudo diff -q /run/lima-ssh-ready /mnt/lima-cidata/meta-data

install -m 755 /mnt/lima-cidata/lima-guestagent /usr/local/bin/lima-guestagent
sudo /usr/local/bin/lima-guestagent install-systemd

sudo diff -q /run/lima-boot-done /mnt/lima-cidata/meta-data

For example, by copying the lima-guestagent over the ssh connection (my workaround),

and by keeping the instance meta-data (id) somewhere else, such as in /etc?
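The two marker probes above boil down to "does the marker file have the same contents as the instance meta-data?". A minimal Go sketch of that check follows, assuming the meta-data copy has been moved out of the cidata mount. The function names and the /etc/lima/meta-data path are hypothetical and are not taken from this PR.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// markerMatches reports whether a marker file's contents equal the
// instance meta-data, which is what `diff -q marker meta-data` tests.
func markerMatches(marker, metaData []byte) bool {
	return bytes.Equal(marker, metaData)
}

// probeDone reads both files and compares them. It returns an error
// if either file cannot be read (for example, the marker is not yet written).
func probeDone(markerPath, metaDataPath string) (bool, error) {
	marker, err := os.ReadFile(markerPath)
	if err != nil {
		return false, err
	}
	metaData, err := os.ReadFile(metaDataPath)
	if err != nil {
		return false, err
	}
	return markerMatches(marker, metaData), nil
}

func main() {
	// /etc/lima/meta-data is a hypothetical location, used only for illustration.
	done, err := probeDone("/run/lima-boot-done", "/etc/lima/meta-data")
	if err != nil {
		fmt.Println("probe not satisfied:", err)
		return
	}
	fmt.Println("boot done:", done)
}
```

Two empty files compare as equal, so a marker and a meta-data file that were both created with touch would satisfy this check.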

@afbjorklund
Member Author

afbjorklund commented Nov 22, 2023

Added support for hostnames, so that you can use nice features like avahi-daemon

NAME           STATUS     SSH                     VMTYPE    ARCH       CPUS    MEMORY    DISK      DIR
raspberrypi    Running    raspberrypi.local:22    ext       aarch64    4       512MiB    32GiB     ~/.lima/raspberrypi

anders@raspberrypi:~ $ sudo mkdir /mnt/lima-cidata
anders@raspberrypi:~ $ sudo touch /mnt/lima-cidata/meta-data
anders@raspberrypi:~ $ sudo touch /run/lima-ssh-ready
anders@raspberrypi:~ $ sudo touch /run/lima-boot-done

Previously I was assuming an IP address. Next, to find a nice way to install the lima-guestagent.

@afbjorklund
Member Author

afbjorklund commented Nov 29, 2023

Added host key checking:

The authenticity of host 'raspberrypi.local (192.168.0.113)' can't be established.
ECDSA key fingerprint is SHA256:tKIRfOeWP1HeCFLpM0UT30CUWSDXpC7gxPsKHUnS+h4.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
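For reference, the fingerprint in that prompt is the SHA-256 digest of the host's public key blob, printed as unpadded base64 after a "SHA256:" prefix. The Go sketch below only illustrates that format using the standard library. It is not the code in this PR, the function names are hypothetical, and the key bytes in main are a placeholder rather than a real key.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// fingerprintSHA256 formats the digest of a wire-format public key the
// way the ssh prompt shows it: "SHA256:" plus unpadded standard base64.
func fingerprintSHA256(keyBlob []byte) string {
	sum := sha256.Sum256(keyBlob)
	return "SHA256:" + base64.RawStdEncoding.EncodeToString(sum[:])
}

// matches reports whether the presented key has the fingerprint that
// was accepted earlier (for example, on the first connection).
func matches(keyBlob []byte, accepted string) bool {
	return fingerprintSHA256(keyBlob) == accepted
}

func main() {
	blob := []byte("placeholder key bytes, not a real host key")
	fp := fingerprintSHA256(blob)
	fmt.Println(fp)
	fmt.Println("still trusted:", matches(blob, fp))
}
```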

Also shortened the lookup timeout, for when the device is not connected. The name resolves in about 10ms when cached (or ~250ms otherwise).

errors="[field `ssh.address` must be IP: lookup raspberrypi.local: i/o timeout]"
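A bounded lookup of this kind can be sketched in Go with a context deadline on the standard resolver. This is an illustration under assumptions, not the PR's implementation: the function name and the 500ms value are hypothetical, and whether a .local name resolves at all depends on the host's resolver setup (for example mDNS support).

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// resolveWithTimeout resolves host to its addresses, giving up once
// timeout has elapsed. IP literals are returned as-is without a lookup.
func resolveWithTimeout(host string, timeout time.Duration) ([]string, error) {
	if ip := net.ParseIP(host); ip != nil {
		return []string{ip.String()}, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return net.DefaultResolver.LookupHost(ctx, host)
}

func main() {
	// 500ms is an arbitrary example value, not the one chosen in the PR.
	addrs, err := resolveWithTimeout("raspberrypi.local", 500*time.Millisecond)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(addrs)
}
```

The trade-off mentioned later in the thread applies directly to the timeout argument: a large value stalls every command while the device is offline, and a small value can fail on a slow first lookup.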

@afbjorklund
Member Author

Added provision scripts.

Using sudo for system.

pkg/sshutil/sshutil.go (outdated, resolved)
@@ -66,6 +66,7 @@ const (
 	QEMU VMType = "qemu"
 	VZ   VMType = "vz"
 	WSL2 VMType = "wsl2"
+	EXT  VMType = "ext"
Member

Maybe this should be called "external" to avoid confusion with "extended"

Member Author

In minikube we ended up calling the driver "ssh", but I don't think it's great

Member

ping @lima-vm/maintainers @lima-vm/reviewers RFC

I still feel "ext" is confusing.
Sounds like some sort of ext2/ext3/ext4 stuff.

Member

I think external is fine. unmanaged or raw could be alternatives, but I prefer external.

Member

ping @afbjorklund WDYT?

@@ -117,6 +117,9 @@ additionalDisks:
 # fsType: "ext4"

 ssh:
+  # Address for the host.
+  # 🟢 Builtin default: "127.0.0.1" (localhost)
+  address: null
Member

What does localPort mean now for non-local address?

Member Author

It is used as "port", but the old name is kept for compatibility.

If the port is zero, then for non-localhost it will be set to 22.
For localhost (127.0.0.1), it is still assigned to a random port.

Normally it is not set, and only the IP address is used for SSH.
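The port rules described in this reply can be written down as a small Go sketch. The function name, its signature, and the pickFree callback are hypothetical and only illustrate the described behavior; they are not the code in this PR.

```go
package main

import "fmt"

// sshPort picks the SSH port following the rules described above.
// pickFree is only called when a random local port is needed.
func sshPort(address string, localPort int, pickFree func() int) int {
	if localPort != 0 {
		return localPort // explicitly set: used as the port as-is
	}
	if address == "" || address == "127.0.0.1" {
		// Unset address falls back to the builtin default (localhost),
		// which still gets a randomly assigned port.
		return pickFree()
	}
	return 22 // non-localhost address: standard SSH port
}

func main() {
	// Stand-in for real free-port discovery; 60022 is an arbitrary value.
	free := func() int { return 60022 }
	fmt.Println(sshPort("192.168.7.2", 0, free))
	fmt.Println(sshPort("127.0.0.1", 0, free))
	fmt.Println(sshPort("raspberrypi.local", 2222, free))
}
```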

@afbjorklund
Member Author

afbjorklund commented Mar 18, 2024

Rebased to lima v0.21.0 [RPi runs Debian GNU/Linux 12 (bookworm)]

$ _output/bin/limactl shell raspberrypi nerdctl version
Client:
 Version:	v1.7.5
 OS/Arch:	linux/arm64
 Git commit:	cffed372371dcbea3dc9a646ce5a913fc1c09513
 buildctl:
  Version:	v0.12.5
  GitCommit:	bac3f2b673f3f9d33e79046008e7a38e856b3dc6

Server:
 containerd:
  Version:	v1.7.14
  GitCommit:	dcf2847247e18caba8dce86522029642f60fe96b
 runc:
  Version:	1.1.12
  GitCommit:	v1.1.12-0-g51d5e946

Quite svelte, without the cidata etc: 44K /home/anders/.lima/raspberrypi

NAME           STATUS     SSH                     VMTYPE    ARCH       CPUS    MEMORY    DISK     DIR
raspberrypi    Running    raspberrypi.local:22    ext       aarch64    4       512MiB    32GiB    ~/.lima/raspberrypi

@AkihiroSuda
Member

Would it be possible to test this on CI?

@AkihiroSuda AkihiroSuda added this to the v0.21.1 milestone Apr 22, 2024
@afbjorklund
Member Author

afbjorklund commented Apr 22, 2024

> Would it be possible to test this on CI?

As long as it is possible to supply a VM, with access through host keys and authorized keys, that should be possible.

I should detail the required steps (with example log), especially now with the addition of the cloud-config generation.

@afbjorklund afbjorklund removed this from the v0.22.0 milestone Apr 26, 2024
@afbjorklund afbjorklund marked this pull request as draft April 26, 2024 06:01
@afbjorklund
Member Author

afbjorklund commented Jul 9, 2024

It is probably not a great idea to have the validation fail when the host is offline (or not available):

field `ssh.address` must be IP: lookup raspberrypi.local: i/o timeout

The timeout was tricky to get right anyway...

Too long is annoying, too short might fail.

@afbjorklund afbjorklund force-pushed the external branch 2 times, most recently from e27900d to d2f506c Compare July 9, 2024 16:52
@afbjorklund
Member Author

afbjorklund commented Jul 10, 2024

LXD timed out in 30 minutes, not sure if it would work to run the "VM" with it...

https://documentation.ubuntu.com/lxd/en/latest/tutorial/first_steps/

EDIT: Might be some secret clues here:

https://github.com/canonical/setup-lxd

@afbjorklund
Member Author

Maybe this is a better way to do it, i.e. using GitHub Actions service containers (Docker) instead of Canonical LXD:

https://docs.github.com/en/actions/using-containerized-services/about-service-containers

Still needs a custom image, with sshd?

And setting up keys, possibly cloud-init

@AkihiroSuda
Member

Maybe we can just use limactl create --plain with QEMU and re-register the instance with the generic driver

@afbjorklund
Member Author

afbjorklund commented Jul 11, 2024

> Maybe we can just use limactl create --plain with QEMU and re-register the instance with the generic driver

I removed all the old LXD code, and there are still concerns about testing containerd and reverse-sshfs with this...

That is, running a fake machine in a container leads to problems that do not happen on "real" machines,

whether those are actual (hardware) machines or virtual machines emulating real hardware.


EDIT: "plain" removes most of the lima features:

When the "plain" mode is enabled:

  • the YAML properties for mounts, port forwarding, containerd, etc. will be ignored
  • guest agent will not be running
  • dependency packages like sshfs will not be installed into the VM

Currently the guestagent, with its mounts and port forwards, is the main feature.

The guestagent, sshfs, and containerd are currently supposed to be installed by the user:

The provision scripts are run separately (not from cloud-init), through ssh...

@afbjorklund afbjorklund marked this pull request as ready for review August 31, 2024 18:02
@AkihiroSuda
Member

Thanks for working on this, but I still think "ext" is confusing and should be called something like "external".
This also needs an integration test.

@afbjorklund afbjorklund marked this pull request as draft September 1, 2024 09:15
@afbjorklund
Member Author

afbjorklund commented Sep 1, 2024

I will give it some more thought, what the new name of the "generic" driver should be (naming things is hard...)

Integration tests with (non-hosted) GitHub Actions seem to be very limited? I tried faking it with LXD, but it failed.


It does not have to be in 1.0, even if it makes for a nice demo.

There is still more documentation left to do, mostly external.

Signed-off-by: Anders F Björklund <[email protected]>
Verify ssh host keys, when connecting to a remote server.

The first connection will prompt, if not in known_hosts.

It is not using cloud-init anyway, and does not need
another copy of lima-guestagent and nerdctl-full.tgz

3 participants