-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds current mkosi PID to qemu JSON state file name #2945
base: main
Are you sure you want to change the base?
Conversation
c017719
to
aa2e2c3
Compare
When mkosi crashes, Python does not always clean qemu state file. This patch appends current mkosi PID to the name, solving this problem and allowing other instances of mkosi to use the same machine name without being confused.
aa2e2c3
to
9af5fdf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this makes sense. It completely sidesteps the locking, since every filename is different (disregarding PID reuse), so the locking seems a bit pointless?
What kind of crashes sidestep the finally
?
with flock(INVOKING_USER.runtime_dir() / "machine"): | ||
if (p := INVOKING_USER.runtime_dir() / "machine" / f"{config.machine_or_name()}.json").exists(): | ||
with flock(machine_folder): | ||
if (p := machine_folder / f"{config.machine_or_name()}-{mkosi_pid:d}.json").exists(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The :d
is not necessary, numbers are by default formatted as decimal numbers.
if (p := machine_folder / f"{config.machine_or_name()}-{mkosi_pid:d}.json").exists(): | |
if (p := machine_folder / f"{config.machine_or_name()}-{mkosi_pid}.json").exists(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Locking is done on the machine folder (typically /run/user/$uid/mkosi/machine
). This PR does not change that.
Is there any reason no two mkosi
instances should not use the same machine name?
If the only reason is that it's later used by mkosi
to ssh into the VM, then the current approach is fine.
What kind of crashes sidestep the
finally
?
I'm not sure if it's just ctrl+C
at the wrong time or if it's just when I try booting an image that crashes or if I quit the terminal without doing ctrl+C
but I can't seem to just be able to mkosi qemu
for a full day without having to manually clean the machine state directory. It's genuinely happening a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gdonval Are you on the latest commit on main? We recently added logic to handle SIGHUP
, that could maybe fix your issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm on 24.3.
At any rate, I'm a bit confused... Is the if ...: die()
here a simple way to prevent spurious overwrites/deletions or is there any value in making sure machine names are unique across the machine?
If it's the first one, I think my approach is valid. If it's the second one, maybe let's just not use hostname as default machine name value?
I insist on "default" because I don't really want to start seeing doc saying mkosi --machine=$(uuidgen) qemu
, just to make sure things work consistently, whether you happen to choose the same hostname in two mkosi
calls or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At any rate, I'm a bit confused... Is the if ...: die() here a simple way to prevent spurious overwrites/deletions or is there any value in making sure machine names are unique across the machine?
The SIGHUP
handling we have on git main will likely reduce the number of times you run into this as the files in /run/ will be properly cleaned up when mkosi qemu
crashes.
If it's the first one, I think my approach is valid. If it's the second one, maybe let's just not use hostname as default machine name value?
@gdonval The machine name is used to identify the target for mkosi ssh
so making it random would make that harder. We can probably pick a better name than default
though and allow disabling "registering" the machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for SIGHUP
.
@DaanDeMeyer, so
- The machine name is not a lock, therefore its existence should not make new instances die.
- That's a
mjosi ssh
concern, therefore it should not affect newmkosi qemu
calls.
I suggest:
- either don't write a state file, unless
--machine
is specified; - or write machine named with a counter appended (similar to this PR) so that
mkosi qemu
doesn't crash.
In the latter case, qemu ssh
can look for a precise machine name (with counter) or could look at the highest counter for some specific machine name stem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gdonval I prefer mkosi ssh
to work out of the box without having to specify --machine
. What about using the working directory name as the machine name instead of default
? That should drastically reduce the number of conflicts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DaanDeMeyer, you prefer
mkosi ssh
to work without having to specify machine
and suggest using the folder name.
Am I right to assume you would call mkosi ssh
only from the current mkosi
folder then?
In that case, why don't we write a softlink reference somewhere appropriate in the current mkosi
project folder? State would still be written centrall (albeit with a convenient distinguisher) and mkosi ssh
would follow the local link to that folder's default session.
When mkosi crashes, Python does not always clean qemu state file. This patch appends current mkosi's PID to the name, solving this problem and allowing other instances of mkosi to use the same machine name without being confused.