Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework the PIO API #143

Merged
merged 12 commits into from
Oct 2, 2021
Merged

Rework the PIO API #143

merged 12 commits into from
Oct 2, 2021

Conversation

mgottschlag
Copy link
Contributor

Rationale

The current PIO API has a number of limitations:

  • As the state machines are contained in PIO, they cannot be moved around individually. Sometimes, a single PIO block implements multiple functions that are used in completely different parts of the code base.
  • StateMachine is not Send, although it could be.
  • Multiple state machines should be able to share code to reduce instruction space usage, yet currently they are not because PIOBuilder always allocates space. Sometimes, multiple state machines implement the same functionality (e.g., multiple SPI or I2C buses).
  • Some functions must only be called in specific configurations. For example, pin directions have to be set while the state machine is stopped, as PINCTRL is modified. The current API does not reflect such restrictions, which makes it harder to use than necessary.

Also, during my work, I found a bug where used instruction space was not marked properly.

This PR also fixes #141 by providing a function to set pin directions.

Design

This PR uses different types to represent state machines in different states - UninitStateMachine<P> is not associated with any program, whereas StateMachine<P, Stopped> and StateMachine<P, Running> are. Program installation in instruction memory is separated from state machine initialization via InstalledProgram and PIO::install(). Access from state machines to shared registers is performed via atomic operations to enable Send.

Old usage example:

let pio = rp2040_hal::pio::PIO::new(pac.PIO0, &mut pac.RESETS);
let sm = &pio.state_machines()[0];
let div = 0f32; // as slow as possible (0 is interpreted as 65536)
rp2040_hal::pio::PIOBuilder::default()
    .with_program(&program)
    .set_pins(led_pin_id, 1)
    .clock_divisor(div)
    .build(&pio, sm)
    .unwrap();

New usage example:

let (mut pio, sm0, _, _, _) = pac.PIO0.split(&mut pac.RESETS);
let installed = pio.install(&program).unwrap();
let div = 0f32; // as slow as possible (0 is interpreted as 65536)
let sm = rp2040_hal::pio::PIOBuilder::from_program(installed)
    .set_pins(led_pin_id, 1)
    .clock_divisor(div)
    .build(sm0);
sm.start();

Testing

I tested that the blink examples work, and I tested with my own code that, however, does not use much more PIO functionality yet (except for sideset pins). Those programs work fine.

…ely.

One PIO block often implements multiple functions that are used in
different parts of the codebase. Previously, that would be impossible, as
PIO contained all StateMachine instances.

Now, StateMachine instances use atomic operations whenever accessing shared
registers, so they can be used concurrently.
Multiple state machines may want to execute the same program (e.g., two
state machines are used to implement two I2C buses), in which code sharing
saves space.
Some operations must only be performed in a specific state. For example,
pin directions must not be changed while the state machine is running, as
the operation modifies PINCTRL. The new API makes wrong usage a lot harder.

Also, the code now supports uninitializing state machines to free
instruction space or to select a different function.
@mgottschlag
Copy link
Contributor Author

The new example in this PR currently depends on rp-rs/pio-rs#9, so the two PRs should be merged together.

Copy link
Contributor

@henkkuli henkkuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like these improvements to the usability of the API. There are a couple of bugs I spotted, and probably some more I missed.

I didn't know that atomic access to peripherals was even possible. My only concern is that I don't exactly know whether using pointers like that is sound, and I also don't like manual pointer arithmetic in HAL crate. In general I think the PAC should provide us with atomic access. A quick search through svd2rust found that it is probably possible as apparently MSP430 devices have similar functionality: https://github.com/rust-embedded/svd2rust/blob/master/src/generate/generic_msp430_atomic.rs

rp2040-hal/src/pio.rs Outdated Show resolved Hide resolved
rp2040-hal/src/pio.rs Outdated Show resolved Hide resolved
Comment on lines 229 to 234
unsafe {
*(*sm_set[0].sm.block)
.ctrl
.as_ptr()
.add(ATOMIC_SET_OFFSET / 4) = sm_mask;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use write_volatile. I'm also not sure whether this is UB or not as the pointer is increased past the end of the peripheral's regular address space.

In principle I think the atomic translation should be handled by the PAC and we should be able to use that directly here, though I'm not sure whether this can be added to PAC or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added write_volatile in all three places. This is probably still undefined behavior - the documentation talks about the bounds of allocated memory, so I am not sure. Even if it is, I do not see a problem on this hardware platform.

});
self.sm.sm().sm_instr.write(|w| {
unsafe {
const SET_PINDIRS: u16 = 0xe080;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to use instruction encoder here. See

instr.encode(side_set)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code got slightly longer - is this what you mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is what I meant.

#[derive(Debug)]
pub struct StateMachine<P: Instance> {
pub struct UninitStateMachine<P: PIOExt> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see this be StateMachine<P, Uninit>, but when I tried to draft an example of that I realized we probably need GATs for that because InstalledProgram is generic over P. Another option we could have today is

StateMachine<Uninit<PIO0>>;
StateMachine<StoppedPIO0>>;
StateMachine<Starged<PIO0>>;

but I don't like that either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried the latter, but did not like that it made it more difficult to implement functions that can operate on Stopped and Running, but not on Uninit. There are quite some such functions that just do not make sense on an UninitStateMachine.

I do not see this as a substantial usability concern, but I can look further into it if you insist.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I guess one could still have something like

trait Initialized {...}
impl Initialized for Stopped {...}
impl Initialized for Running {...}
impl<State: Initialized> StateMachine<State> {...}

if one wanted to implement something only for Stopped and Running. But I don't insist on changing this, I'm just trying to brainstorm some ideas on how to make the API more self-consistent.

@henkkuli
Copy link
Contributor

Apparently there are already issues for rp2040-pac and svd2rust.

@mgottschlag
Copy link
Contributor Author

Thanks a lot for the review. I will fix those bugs this evening.

After submitting the PR, I started to think about integrating DMA. While doing so, I noticed that the API needs some further restructuring. In particular, TX and RX FIFOs need to be managed by different objects than the state machine itself, so that the user is able to use TX and RX DMA in parallel. For example, an API such as the following may be possible:

let (sm, rx, tx) = rp2040_hal::pio::PIOBuilder::from_program(installed)
    .set_pins(led_pin_id, 1)
    .clock_divisor(div)
    .build(sm0);
let rx_dma = rx.read_with_dma(buffer);
sm.start();
// ...
let rx = rx_dma.wait().
let sm0 = sm.uninit(rx, tx);

However, in the last line, the API should ensure that only the right rx and tx objects can be used. This could, for example, be done by adding another generic parameter, where there would be the following types and sm.id would be removed:

UninitStateMachine<PIO, SM0>
StateMachine<PIO0, SM0, Running>
PIOTx<PIO0, SM0>

etc.

That, however, means that any function/type that receives a state machine object needs to be parametrized for the specific state machine index. I do not think that this would be a large problem, as we already have similar behavior for objects/functions operating on GPIO pins, where they need to be written as generics which take a specific GPIO pin type.

What do you think?

@mgottschlag
Copy link
Contributor Author

Oh, and with regards to MSP430 atomics: The scenario is somewhat different as MSP430, if I understand it correctly, uses atomic instructions to operate at the same addresses, which is probably much easier to implement in svd2rust. This kind of atomic register access (register aliases) would probably have to be implemented in rp2040-pac, but I have no idea how that would be done.

@henkkuli
Copy link
Contributor

I think having everything be generic over the state machine won't be a problem. As you said, similar API is used elsewhere. Maybe one way to make it little easier to use would be to have types PIO0SM0, PIO0SM1, etc. and be generic over those instead. This way a function taking a state machine could have signature

fn foo<SM: ValidStateMachine>(_: StateMachine<SM, Running>) {}

instead of

fn foo<PIO: ValidPIO, SM: ValidStateMachine>(_: StateMachine<PIO, SM, Running>) {}

Regarding the atomics, I think I managed to hack something together already, though it is currently still a hack. Basically I just copied https://github.com/rust-embedded/svd2rust/blob/master/src/generate/generic_msp430_atomic.rs, changed the writes to something like

self.register
    .as_ptr()
    .add(0x2000 / core::mem::size_of::<REG::Ux>())
    .write_volatile(bits);

and force-added the file to every compiled crate. This has a couple of problems I haven't resolved yet: The most obvious one is that not every PAC crate should have this added, so a new flag or target needs to be added. More subtle problem is that the RP2040 documentation states that SIO device doesn't support atomic access (as it already is atomic by design), so this API shouldn't be exposed for SIO device.

Eventually, the read and write FIFOs need to be split into separate
objects for DMA. To be able to safely rejoin them only when they belong to
the same state machine, the state machine index needs to be encoded into
the type.
@mgottschlag
Copy link
Contributor Author

I added parametrization for the state machines. I have to admit that my type design skills are not really great, so someone might want to check whether that's really the most elegant way to specify state machine indices.

Currently, the state machines still carry references around to the registers:

block: *const rp2040_pac::pio0::RegisterBlock,
sm: *const rp2040_pac::pio0::SM,

These fields have basically become unnecessary now, I will remove them in the next commit.

@9names
Copy link
Member

9names commented Sep 29, 2021

If you need atomics, what you'll want is the atomic- polyfill crate. It's not multi-core safe, but we are planning on making a fork that is.
[edit]
Oh, you want access to the atomic_set register alias. Yeah, that's something that should go in the PAC.
How many registers do you need to add this for? Adding a few isn't a big deal.

@9names
Copy link
Member

9names commented Sep 30, 2021

and force-added the file to every compiled crate. This has a couple of problems I haven't resolved yet: The most obvious one is that not every PAC crate should have this added, so a new flag or target needs to be added. More subtle problem is that the RP2040 documentation states that SIO device doesn't support atomic access (as it already is atomic by design), so this API shouldn't be exposed for SIO device.

Definitely sounds like something that should be opt-in.
I was thinking more along the lines of using svdtools to copy the peripheral, add a suffix _or, and change the peripheral offset. That would work with current svd2rust, won't break any existing uses but might make the interface a little less nice to deal with.

We need separate types for any blocking or DMA operations - otherwise, it
would not be possible to perform both RX and TX transfers at the same time.
@mgottschlag
Copy link
Contributor Author

How many registers do you need to add this for? Adding a few isn't a big deal.

Currently, the code only needs atomic access to the CTRL register.

Copy link
Contributor

@henkkuli henkkuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking whether having a transceiver type would be helpful or not, the idea being that it would provide convenient methods read and write, and split to create separate Rx and Tx channels.

rp2040-hal/src/pio.rs Outdated Show resolved Hide resolved
@henkkuli
Copy link
Contributor

Definitely sounds like something that should be opt-in.

Exactly. My implementation was just a quick POC to just check whether adding atomic methods to PAC is possible or not.

I was thinking more along the lines of using svdtools to copy the peripheral, add a suffix _or, and change the peripheral offset. That would work with current svd2rust, won't break any existing uses but might make the interface a little less nice to deal with.

I don't think copying the peripheral should be the first solution. Especially as then the Peripherals struct would have both PIO0 and PIO0_or fields, and both would need to be given to HAL, degrading the UX of the HAL greatly. Otherwise the user could use PIO0_or and PIO0_nand to modify the registers even if they had given access to PIO0 away.

Instead I suggest that the SVD file should contain an attribute for every register telling whether the register has atomic counterparts or not. This would then instruct the codegen to implement AtomicAccess trait on the register, allowing us to gate the access to the atomic methods by this trait. I'm just not familiar enough with SVD to know whether this is possible, or how difficult that would be to implement.

@9names
Copy link
Member

9names commented Oct 1, 2021

@mgottschlag Could you add an entry to the changelog for this?
I want CI to run with the latest pio-rs now that rp-rs/pio-rs#9 is merged, after that I think this PR is good merge.

@mgottschlag
Copy link
Contributor Author

I added a changelog entry.

@mgottschlag
Copy link
Contributor Author

I will have a look at the failures this evening.

@mgottschlag
Copy link
Contributor Author

mgottschlag commented Oct 1, 2021

I fixed all clippy warnings, except for two warnings about very complex types that I somehow can't see in the CI output. I also marked most doc comment code examples as ignore, except for one which I extended with use statements and a simple PIO program and marked as no-run.

I told clippy to ignore the return types of split() and build(), as those tuples are too complex for clippy. We could also package those into structs, but I do not think that it would help make the code more readable. GPIOs are placed into a struct instead of a tuple, for example, but there is no free() method for GPIOs. Here, there is, and that method simply expects the variables as returned by split() at the moment, and I think this symmetry is good.

Copy link
Member

@9names 9names left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@9names 9names merged commit ede25a4 into rp-rs:main Oct 2, 2021
@mgottschlag mgottschlag deleted the pio-rework branch October 2, 2021 11:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PIO sideset does not work (wrong pindir)
3 participants