Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bring feature/perf up-to-date with master #5582

Merged
merged 80 commits into from
Apr 23, 2024

Conversation

edwintorok
Copy link
Contributor

Fixed some conflicts in the Makefile, C bindings gen, and a build issue with pgpu_dom0_access_to_string

psafont and others added 30 commits March 26, 2024 12:28
This is about the same order of magnitude as the number of VMs supported in a
host

Signed-off-by: Pau Ruiz Safont <[email protected]>
This helps when running it to observe behaviour of the library

Signed-off-by: Pau Ruiz Safont <[email protected]>
Add standard http attributes to from opentelemetry such as:

- `http.request.method`
- `http.request.header.content-type`;
- `http.request.body.size`;
- `http.request.header.user-agent`;
- `http.request.header.(key)`.

This improves debuggability associated with client requests.

Signed-off-by: Gabriel Buica <[email protected]>
This reuses the re-key process to gather current contents of the cache and
depending on the amount of elements pending it
1. Bypasses the cache completely if it's empty
2. Uses the queue if all the elements fit in the queue
3. Starts as before if there are more elements

Signed-off-by: Pau Ruiz Safont <[email protected]>
Unknown directories detected in the cache will get logged, and at the
appropriate times deleted, along with all the files contained within.

Signed-off-by: Pau Ruiz Safont <[email protected]>
Previously a logline was created per push attempt, now up to two per failure
period are done: one on the first failure, and one on recovery.

Signed-off-by: Pau Ruiz Safont <[email protected]>
Now temporary files are an internal detail of the persistence function. The
rest of code needs not be aware of it: if some of temp files linger in the
filesystem means that the process did not acknowledge a requested write, so the
domain could not have possibly acted on the write. Delete these as this does
not incur in any data corruption in the domain side.

Signed-off-by: Pau Ruiz Safont <[email protected]>
With *hybrid source format in export, the following cases are supported:
1. nbdhybrid: QCOW2 -> NBD device in dom0 -> exported file
2. nbdhybrid: VHD -> NBD device in dom0 -> exported file
3. hybrid: VHD -> blktap device in dom0 -> exported file

The case 2 above can't support an optional parameter "base".
This parameter holds the ID of another VHD VDI. When it is passed, the
export will only write the differences between "source" and "base" to
the destination file.

As a short-term solution, in case 2 above, when the "base" is passed,
the source format is changed to "hybrid" in this commit.
This can work because:
1. the comparsion on blocks required by "base" is supported by "hybrid";
2. the raw data from NBD device and blktap (Frankentap) device are same;
3. the sparseness information of the source required in case 2 can be
   get by either NBD interface (nbdhybrid) or VHD parsing (hybrid).

Signed-off-by: Ming Lu <[email protected]>
waitpid_nohang closes the socket by calling waitpid and then calls
clear_nonblock on that socket, which fails.

waitpid_nohang's policy around closing the socket is not obvious enough.
Make it more explicit by not calling waitpid but inlining code and
looking at the different cases. The behaviour is still to close the
socket when the child process has terminated and not otherwise.

Logging can be re-enabled if we have to come back to this.

Signed-off-by: Christian Lindig <[email protected]>
…384483

CA-384483: Can't export VDI to VHD file with base VDI
Slightly different from missing file, add missing path.

Signed-off-by: Frediano Ziglio <[email protected]>
Clean even if user press Ctrl-C, not only sending TERM signal.
Remove possible stale unix socket to allow to run multiple times.

Signed-off-by: Frediano Ziglio <[email protected]>
Make sure we get the correct output from program launched.

Signed-off-by: Frediano Ziglio <[email protected]>
Make sure program receives the string we are sending.

Signed-off-by: Frediano Ziglio <[email protected]>
Requires libxml2-devel installed.

Signed-off-by: Edwin Török <[email protected]>
(cherry picked from commit 0008395)
The C SDK build was failing with a recent GCC on Fedora39 like this:
```
src/xen_common.c: In function ‘xen_session_logout’:
src/xen_common.c:298:5: error: ‘xen_call_’ accessing 16 bytes in a region of size 0 [-Werror=stringop-overflow=]
  298 |     xen_call_(session, "session.logout", params, 0, NULL, NULL);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```

Use `NULL` instead of a VLA of size 0.

Signed-off-by: Edwin Török <[email protected]>
(cherry picked from commit bfe78bd)
…dora-fix

Port of "Fix C SDK build on Fedora39" to `master`
Domains using TPMs need to have SWTPM always available. Encode this information
in the service file so the services are shut down at the correct time while
powering off the hosts.

Signed-off-by: Pau Ruiz Safont <[email protected]>
I am not sure where this restriction came from. It was introduced in the
same commit that introduced the metadata-export feature itself, without
explanation. Full exports of snapshots are already allowed.

Signed-off-by: Rob Hoes <[email protected]>
This allows clients to know which snapshots they knowingly just created and act
on them instead of looking again into the db and try to divine which snapshots
the call created.

Signed-off-by: Pau Ruiz Safont <[email protected]>
This allows clients to detect that the newer device type exclusions for export
are available.

Signed-off-by: Rob Hoes <[email protected]>
This allows to orchestrators to copy devices like VBDs from the VM, export the
metadata without the disk, import the VM to any pool, and finally add the
saved VBDs to the new VM before booting it.

Signed-off-by: Pau Ruiz Safont <[email protected]>
This allows to easily test the http endpoint. Also advertises the metadata
parameter on these calls

Signed-off-by: Pau Ruiz Safont <[email protected]>
Use the helper functions to process parameters, use atomics instead of ad-hoc
references for counting in possibly-parallel case

Signed-off-by: Pau Ruiz Safont <[email protected]>
Signed-off-by: Pau Ruiz Safont <[email protected]>
robhoes and others added 12 commits April 19, 2024 13:15
…dk-c

CA-387885 and templatization of the C SDK
Attempts to split the tracing library into components and exporter
parts.

While trying to instrument `forkhelpers.ml` with tracing, I found a
cycle dependency: `Tracing(Exporter)` -> `Open_uri` -> `Stunnel` ->
`Forkhelpers`. This splits the tracing library between exporting
functionality and components.

Signed-off-by: Gabriel Buica <[email protected]>
Small improvements to `tracing_export.ml` and `tracing_exoprt.mli`.

Issues found while splitting the library such as: missing documentation
and small code refactoring for better readability.

Signed-off-by: Gabriel Buica <[email protected]>
Adds `with_tracing` helper function to `forkhelpers.ml`.

This is to show that there are no more cycle dependecies between
`tracing` library and `forexecd` and it will be used in a follow-up PR
to instrument `forkhelpers` with tracing.

Signed-off-by: Gabriel Buica <[email protected]>
As discussed here: xapi-project/xapi-project.github.io#286

New methods:
- `PCI.disable_dom0_access`: Hide a PCI from dom0 kernel
- `PCI.enable_dom0_access`: Unhide a PCI from dom0 kernel
- `PCI.get_dom0_access_status`: Return a PCI device dom0 access status
The answer is no longer based on a DB field but by comparing the `/proc/cmdline`
and the xen cmdline.

Deprecated methods:
- `PGPU.disable_dom0_acces`
- `PGPU.enable_dom0_access`
The methods should be called on the PCI belonging to the PGPU instead

Deprecated field:
- `PGPU.dom0_access`

The deprecated methods still work since they now call the PCI methods.
The PGPU field is kept up to date with the statuss returned by the PCI methods.

Corresponding xe CLI calls have been implemented.

Signed-off-by: Benjamin Reis <[email protected]>
It was allocating a string for each character, only to then immediately add it to a buffer.
Instead add the character to the buffer directly.

The temporary string was probably created to factor out 'Buffer.add',
but I'd rather copy Buffer.add 4x times than have the perf hit.

Further optimizations might be possible here as the comments says,
but this is a low-risk, obvious optimization.

Signed-off-by: Edwin Török <[email protected]>
Optimistically check for the presence of escape characters before escaping.
This avoids rebuilding the string when escapable characters are not present (the common case).

Signed-off-by: Edwin Török <[email protected]>
Check for the presence of the escape character before rebuilding the string

Signed-off-by: Edwin Török <[email protected]>
…expr

IH-553: Optimize Sexpr.{escape,unescape}
…/CP-48195-v2

CP-48195: Split `tracing` library to avoid future cyclic dependencies
scripts/xe-xentrace Fixed Show fixed Hide fixed
scripts/xe-xentrace Fixed Show fixed Hide fixed
scripts/xe-xentrace Fixed Show fixed Hide fixed
scripts/xe-xentrace Fixed Show fixed Hide fixed
scripts/xe-xentrace Fixed Show fixed Hide fixed
@edwintorok
Copy link
Contributor Author

xe-xentrace was changed recently, will look into fixing it so it passes the CI.

@edwintorok edwintorok marked this pull request as draft April 22, 2024 10:52
@edwintorok
Copy link
Contributor Author

Once I fixed some of the warnings shellcheck did actually find this one that points to an actual bug:

echo ...
if [ $? -ne 0 -o -z "${vdi_uuid}" ]; then
     ^-- SC2320 (warning): This $? refers to echo/printf, not a previous command. Assign to variable to avoid it being overwritten.

scripts/xe-xentrace Fixed Show fixed Hide fixed
Fix shellcheck warnings and bugs.

Signed-off-by: Edwin Török <[email protected]>
@edwintorok edwintorok marked this pull request as ready for review April 22, 2024 15:11
Copy link
Member

@psafont psafont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is merging perf to master (!)

@edwintorok edwintorok changed the base branch from master to feature/perf April 22, 2024 16:02
@edwintorok
Copy link
Contributor Author

This is merging perf to master (!)

Good spot, fixed

Copy link
Contributor

@Vincent-lau Vincent-lau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, are you now merging perf in your private fork to the perf in upstream, is this right? I thought it would be something like into xapi-project:master from xapi-project:feature/perf?

@edwintorok
Copy link
Contributor Author

edwintorok commented Apr 23, 2024

"edwintorok wants to merge 80 commits into xapi-project:feature/perf from edwintorok:feature/perf"

I think it merged into the correct place into xapi-project

What we do here is merge upstream master into upstream feature/perf, but there are conflicts that I need to fix, so that can only be done from my private fork (I can't directly push to feature/perf on upstream)

@robhoes robhoes merged commit 7c23f8e into xapi-project:feature/perf Apr 23, 2024
14 checks passed
Copy link

pytype_reporter extracted 47 problem reports from pytype output

.

You can check the results of the job here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.