Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: route discards via the I/O queue #2386

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

pwrobelse
Copy link
Contributor

This set of changes is an another attempt to route discard requests
via the I/O queue to ensure, that they are issued to the drive in a
controlled way.

This PR:

  • adds discard type to internal::io_request
  • renames io_desc_read_write to queued_io_request_completion -- discards will use it too
  • adds discard_to_write_ratio to io-properties.yaml
  • adds discard direction to io_direction_and_length
  • routes discard requests via the I/O queue
  • introduces remove_file_via_blocks_discarding() function that can be used to remove files block by block


// Caller synchronizes using the future returned from `desc->get_future()`.
// The attached continuation resolves that future when `punch_hole()` finishes.
(void) engine().punch_hole(discard_req.fd, discard_req.offset, discard_req.length).then_wrapped([desc, len = discard_req.length] (future<> f) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: There's neat future::forward_to(promise<>) method that can be used to link punch_hole's future with desc's promise directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to use it, but the following two issues occurred.

Firstly, the type of the future returned from punch_hole() is different than the type of future produced by the promise from desc.

/repo/scylladb/seastar/seastar/include/seastar/core/future.hh:1610:34: note:   no known conversion for argument 1 from ‘seastar::promise<long unsigned int>’ to ‘seastar::promise<void>&&’

Secondly, desc->set_exception() and desc->complete() are part of reasoning about the lifetime of queued_io_request_completion. They delete desc object -- the memory is freed. Moreover, they update priority_class_data and io_queue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, the type of the future returned from punch_hole() is different than the type of future produced by the promise from desc.

There's future::ignore_result() that converts future into future<>

size_t length() const noexcept { return _directed_length >> direction_bits_count; }
int rwd_idx() const noexcept { return _directed_length & direction_bits_mask; }

static constexpr int discard_idx = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that discard request should carry write_idx direction and whether or not to submit it via sink or via punch_hole() should be decided based on request::op()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can revert the change and use io_direction_and_length::write for discards. The new direction does not look right in the case of duplex mode -- I would not expect three directions in duplex mode.

However, if there is no discard direction, I see the following three issues that are described below.

Firstly, if the write direction is used, then the calculation in internal::request_tokens() will need to contain a new if statement. It is going to distinguish between true write requests and discard requests (based on operation type), because the coefficients are different for write/discard operations.

Secondly, I am not sure about the usage of _group->_max_request_length[]. If I understand correctly, the maximum request length is different for discards and for writes. So, simple usage of _group->_max_request_length[direction] will be invalid. Somehow, I feel that it would end up with creation of a new private function that would look similar to the function below:

size_t get_max_request_length(io_direction_and_length dnl, const internal::io_request& req) {
    if (req.opcode() != internal::io_request::operation::discard) {
    	return _group->_max_request_length[dnl.rw_idx()];
    }
    
    return _group->_max_discard_request_length;
}

Lastly, the way how statistics are accumulated in io_queue::priority_class_data::on_dispatch() will mix discards and writes. The code looks as follows: _rwstat[dnl.rwd_idx()].add(dnl.length());.

If the three mentioned points do not cause a problem, then I will introduce the change.

@pwrobelse pwrobelse force-pushed the implement_discard_request_attempt_2 branch from 5c47859 to e515a36 Compare August 12, 2024 11:11
@pwrobelse
Copy link
Contributor Author

The following things have been updated between revisions:

  • the branch has been rebased to the latest master
  • remove_file_via_blocks_discarding() has been moved from seastar.hh to utils/file.hh
  • io_queue::submit_io_discard() has been adjusted to use queue_request() instead of queue_one_request() to allow splitting to big requests

@pwrobelse pwrobelse force-pushed the implement_discard_request_attempt_2 branch from e515a36 to 37cb60b Compare August 13, 2024 12:26
This patch is a part of the preparation to route discard requests
via I/O scheduler. In spite of the fact that 'fallocate()' is
not supported by Linux AIO and thus reactor backend will not
support it, the existing infrastructure around io_queue is
going to use internal::io_request with the fields related to
discard operation.

This change extends internal::io_request with 'discard' operation type.

Signed-off-by: Patryk Wrobel <[email protected]>
This patch is a part of the preparation to route discard requests
via I/O scheduler.

The mentioned 'io_desc_read_write' is used to track the completion
of read and write requests submitted to 'io_sink'. It implements
'io_completion' interface - its functions are invoked when the
processing of a request via 'reactor_backend' is finished.

In spite of the fact that 'fallocate()' is not supported by Linux AIO
and therefore, discards will not be submitted to 'io_sink' and processed
by 'reactor_backend', discard requests will need to notify 'io_queue'
about completion in the same way that reads and writes do. Thus, to avoid
code duplication 'io_desc_read_write' is renamed to 'queued_io_request_completion'
as it will be used by discard requests.

Signed-off-by: Patryk Wrobel <[email protected]>
This patch is a part of the preparation to route discard requests
via I/O scheduler.

Punching holes was implemented in 'file.cc' and realized via '_thread_pool'
object that belongs to 'reactor' class. This member object is private.

Discard requests will be routed via 'io_queue'. However, they cannot be
submitted via 'io_sink' like other request types. In order for enabling
'io_queue' to submit discard requests via 'thread_pool', the functionality
responsible for issuing 'fallocate(PUNCH_HOLE)' has been moved from 'file.cc'
to 'reactor' class and exposed via the public function called 'punch_hole()'.

Signed-off-by: Patryk Wrobel <[email protected]>
In order to allow the I/O queue to process discard requests,
io-properties file format has been extended to support the
following two new fields:
 - 'req_discard_to_write_ratio'
 - 'blocks_discard_to_write_ratio'
They describe the ratio between the cost of a single discard
operation and a write operation.

Signed-off-by: Patryk Wrobel <[email protected]>
In order to support discard requests in the I/O queue,
io_direction_and_length has been extended with 'discard'
direction.

The client code in the I/O queue has also been adjusted
to support the new direction.

Signed-off-by: Patryk Wrobel <[email protected]>
This change introduces a new functionality in 'io_queue'
that enables its users to submit discard requests. To achieve
that 'io_queue' was taught that the new type of a request
shall not be submitted via io_sink -- instead, reactor shall
be used.

Signed-off-by: Patryk Wrobel <[email protected]>
In the past the discard requests were invoked directly
from 'posix_file_impl::discard()' function.

In order to route the requests via the I/O queue, instead
of directly calling 'punch_hole()' function, the requests
are queued and later executed by the I/O queue.

Signed-off-by: Patryk Wrobel <[email protected]>
In order to allow splitting the removal of a big file into
smaller steps a new function has been implemented. It removes
a regular file via blocks discarding -- step by step.

The discard requests are processed by the I/O queue. Therefore,
they are issued to the disk according to the given configuration.
The removal may be delayed when the bandwidth is not available in
the queue.

Fundamentally, the new function performs four steps:
 1. Open the file.
 2. Unlink it.
 3. Punch holes until nothing is left.
 4. Close the file.

Signed-off-by: Patryk Wrobel <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants