-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework the PIO API #143
Rework the PIO API #143
Conversation
…ely. One PIO block often implements multiple functions that are used in different parts of the codebase. Previously, that would be impossible, as PIO contained all StateMachine instances. Now, StateMachine instances use atomic operations whenever accessing shared registers, so they can be used concurrently.
Multiple state machines may want to execute the same program (e.g., two state machines are used to implement two I2C buses), in which code sharing saves space.
Some operations must only be performed in a specific state. For example, pin directions must not be changed while the state machine is running, as the operation modifies PINCTRL. The new API makes wrong usage a lot harder. Also, the code now supports uninitializing state machines to free instruction space or to select a different function.
The new example in this PR currently depends on rp-rs/pio-rs#9, so the two PRs should be merged together. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like these improvements to the usability of the API. There are a couple of bugs I spotted, and probably some more I missed.
I didn't know that atomic access to peripherals was even possible. My only concern is that I don't exactly know whether using pointers like that is sound, and I also don't like manual pointer arithmetic in HAL crate. In general I think the PAC should provide us with atomic access. A quick search through svd2rust
found that it is probably possible as apparently MSP430 devices have similar functionality: https://github.com/rust-embedded/svd2rust/blob/master/src/generate/generic_msp430_atomic.rs
rp2040-hal/src/pio.rs
Outdated
unsafe { | ||
*(*sm_set[0].sm.block) | ||
.ctrl | ||
.as_ptr() | ||
.add(ATOMIC_SET_OFFSET / 4) = sm_mask; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use write_volatile
. I'm also not sure whether this is UB or not as the pointer is increased past the end of the peripheral's regular address space.
In principle I think the atomic translation should be handled by the PAC and we should be able to use that directly here, though I'm not sure whether this can be added to PAC or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added write_volatile in all three places. This is probably still undefined behavior - the documentation talks about the bounds of allocated memory, so I am not sure. Even if it is, I do not see a problem on this hardware platform.
rp2040-hal/src/pio.rs
Outdated
}); | ||
self.sm.sm().sm_instr.write(|w| { | ||
unsafe { | ||
const SET_PINDIRS: u16 = 0xe080; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to use instruction encoder here. See
Line 168 in 833b698
instr.encode(side_set) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code got slightly longer - is this what you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is what I meant.
rp2040-hal/src/pio.rs
Outdated
#[derive(Debug)] | ||
pub struct StateMachine<P: Instance> { | ||
pub struct UninitStateMachine<P: PIOExt> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see this be StateMachine<P, Uninit>
, but when I tried to draft an example of that I realized we probably need GATs for that because InstalledProgram
is generic over P
. Another option we could have today is
StateMachine<Uninit<PIO0>>;
StateMachine<StoppedPIO0>>;
StateMachine<Starged<PIO0>>;
but I don't like that either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried the latter, but did not like that it made it more difficult to implement functions that can operate on Stopped
and Running
, but not on Uninit
. There are quite some such functions that just do not make sense on an UninitStateMachine.
I do not see this as a substantial usability concern, but I can look further into it if you insist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I guess one could still have something like
trait Initialized {...}
impl Initialized for Stopped {...}
impl Initialized for Running {...}
impl<State: Initialized> StateMachine<State> {...}
if one wanted to implement something only for Stopped
and Running
. But I don't insist on changing this, I'm just trying to brainstorm some ideas on how to make the API more self-consistent.
Apparently there are already issues for |
Thanks a lot for the review. I will fix those bugs this evening. After submitting the PR, I started to think about integrating DMA. While doing so, I noticed that the API needs some further restructuring. In particular, TX and RX FIFOs need to be managed by different objects than the state machine itself, so that the user is able to use TX and RX DMA in parallel. For example, an API such as the following may be possible:
However, in the last line, the API should ensure that only the right rx and tx objects can be used. This could, for example, be done by adding another generic parameter, where there would be the following types and
etc. That, however, means that any function/type that receives a state machine object needs to be parametrized for the specific state machine index. I do not think that this would be a large problem, as we already have similar behavior for objects/functions operating on GPIO pins, where they need to be written as generics which take a specific GPIO pin type. What do you think? |
Oh, and with regards to MSP430 atomics: The scenario is somewhat different as MSP430, if I understand it correctly, uses atomic instructions to operate at the same addresses, which is probably much easier to implement in svd2rust. This kind of atomic register access (register aliases) would probably have to be implemented in rp2040-pac, but I have no idea how that would be done. |
I think having everything be generic over the state machine won't be a problem. As you said, similar API is used elsewhere. Maybe one way to make it little easier to use would be to have types fn foo<SM: ValidStateMachine>(_: StateMachine<SM, Running>) {} instead of fn foo<PIO: ValidPIO, SM: ValidStateMachine>(_: StateMachine<PIO, SM, Running>) {} Regarding the atomics, I think I managed to hack something together already, though it is currently still a hack. Basically I just copied https://github.com/rust-embedded/svd2rust/blob/master/src/generate/generic_msp430_atomic.rs, changed the writes to something like self.register
.as_ptr()
.add(0x2000 / core::mem::size_of::<REG::Ux>())
.write_volatile(bits); and force-added the file to every compiled crate. This has a couple of problems I haven't resolved yet: The most obvious one is that not every PAC crate should have this added, so a new flag or target needs to be added. More subtle problem is that the RP2040 documentation states that SIO device doesn't support atomic access (as it already is atomic by design), so this API shouldn't be exposed for SIO device. |
Eventually, the read and write FIFOs need to be split into separate objects for DMA. To be able to safely rejoin them only when they belong to the same state machine, the state machine index needs to be encoded into the type.
I added parametrization for the state machines. I have to admit that my type design skills are not really great, so someone might want to check whether that's really the most elegant way to specify state machine indices. Currently, the state machines still carry references around to the registers:
These fields have basically become unnecessary now, I will remove them in the next commit. |
If you need atomics, what you'll want is the atomic- polyfill crate. It's not multi-core safe, but we are planning on making a fork that is. |
Definitely sounds like something that should be opt-in. |
We need separate types for any blocking or DMA operations - otherwise, it would not be possible to perform both RX and TX transfers at the same time.
Currently, the code only needs atomic access to the CTRL register. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking whether having a transceiver type would be helpful or not, the idea being that it would provide convenient methods read
and write
, and split
to create separate Rx
and Tx
channels.
Exactly. My implementation was just a quick POC to just check whether adding atomic methods to PAC is possible or not.
I don't think copying the peripheral should be the first solution. Especially as then the Instead I suggest that the SVD file should contain an attribute for every register telling whether the register has atomic counterparts or not. This would then instruct the codegen to implement |
@mgottschlag Could you add an entry to the changelog for this? |
I added a changelog entry. |
I will have a look at the failures this evening. |
I fixed all clippy warnings, except for two warnings about very complex types that I somehow can't see in the CI output. I also marked most doc comment code examples as I told clippy to ignore the return types of split() and build(), as those tuples are too complex for clippy. We could also package those into structs, but I do not think that it would help make the code more readable. GPIOs are placed into a struct instead of a tuple, for example, but there is no free() method for GPIOs. Here, there is, and that method simply expects the variables as returned by split() at the moment, and I think this symmetry is good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Rationale
The current PIO API has a number of limitations:
PIO
, they cannot be moved around individually. Sometimes, a single PIO block implements multiple functions that are used in completely different parts of the code base.StateMachine
is not Send, although it could be.Also, during my work, I found a bug where used instruction space was not marked properly.
This PR also fixes #141 by providing a function to set pin directions.
Design
This PR uses different types to represent state machines in different states -
UninitStateMachine<P>
is not associated with any program, whereasStateMachine<P, Stopped>
andStateMachine<P, Running>
are. Program installation in instruction memory is separated from state machine initialization viaInstalledProgram
andPIO::install()
. Access from state machines to shared registers is performed via atomic operations to enableSend
.Old usage example:
New usage example:
Testing
I tested that the blink examples work, and I tested with my own code that, however, does not use much more PIO functionality yet (except for sideset pins). Those programs work fine.