⚡️[OPTIMIZATION] Do NRZI encoding in TX PIO #187

@raphlinus

Hi! I'm studying this library carefully, considering doing an independent re-implementation in Rust (see https://wiki.thejpster.org.uk/index.php?title=USB). I'm very impressed overall, but am looking at possibilities for doing things more efficiently.

One such optimization is doing NRZI encoding in the TX PIO instead of in software. There is an encoding step, expanding each bit of the input to two bits sent to the PIO (thus leaving 00 available to represent EOP), but that's lightweight and could easily be done with a byte or nibble lookup table. This could also be done on the fly, possibly interleaved with CRC calculation, replacing DMA with pushing into the TX FIFO from the CPU, much as is already done on RX.
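As a rough illustration of the nibble lookup table idea, here is a minimal Rust sketch. It assumes the state machine shifts right, so the first symbol to send sits in the low two bits of each word, and that USB's LSB-first bit order is preserved. The names `NIBBLE_TABLE`, `expand_byte`, and `encode_packet` are hypothetical and not part of the existing library; packing the results into FIFO words is left out.

```rust
// 2-bit symbols consumed by `out pc, 2`, taken from the encoding comment
// in the PIO program: 00 = stop (EOP), 01 = bit 1, 11 = bit 0.
const SYM_EOP: u16 = 0b00;
const SYM_ONE: u8 = 0b01;
const SYM_ZERO: u8 = 0b11;

/// Build a 16-entry table mapping a nibble to four 2-bit symbols.
/// Assumption: bit 0 of the nibble is sent first and lands in the low bits.
const fn build_nibble_table() -> [u8; 16] {
    let mut table = [0u8; 16];
    let mut n = 0usize;
    while n < 16 {
        let mut out = 0u8;
        let mut i = 0usize;
        while i < 4 {
            let sym = if (n >> i) & 1 == 1 { SYM_ONE } else { SYM_ZERO };
            out |= sym << (2 * i);
            i += 1;
        }
        table[n] = out;
        n += 1;
    }
    table
}

static NIBBLE_TABLE: [u8; 16] = build_nibble_table();

/// Expand one data byte into sixteen bits (eight 2-bit symbols).
fn expand_byte(byte: u8) -> u16 {
    let lo = NIBBLE_TABLE[(byte & 0x0F) as usize] as u16;
    let hi = NIBBLE_TABLE[(byte >> 4) as usize] as u16;
    lo | (hi << 8)
}

/// Expand a whole packet and append a final word whose first symbol is EOP.
fn encode_packet(data: &[u8]) -> Vec<u16> {
    let mut words: Vec<u16> = data.iter().map(|&b| expand_byte(b)).collect();
    words.push(SYM_EOP);
    words
}
```

With this layout the SYNC byte 0x80 expands to seven "bit 0" symbols followed by one "bit 1" symbol. The table costs 16 bytes; a byte-wide table would trade 512 bytes for one lookup per byte instead of two.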

Below is the PIO program I came up with. The x scratch register represents the current line state, and y is the bit stuffing counter. I haven't run this yet, so it's quite possible I made a mistake, but I believe the concept is sound.

// encoding: 00 = stop, 01 = bit 1, 11 = bit 0

eop: 
    jmp eop_impl [1]
bit_one: 
    jmp y-- nostuff [2]
    nop
bit_zero: 
    set y, 5
    jmp !x j_zero
start:
    set x, 0 side J
    out pc, 2
j_zero:
    set x, 1 side K
nostuff:
    out pc, 2
eop_impl:
    nop side SE0 [7]
    nop side J [3]
.wrap_target
    set pindirs, 0
.wrap
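For checking the output of a program like this on a logic analyzer or in a simulator, a software reference model of the intended behavior may help. The sketch below follows the standard USB rule (a 0 bit toggles the line, a 1 bit holds it, and a 0 is stuffed after six consecutive 1s). It is not a cycle-accurate model of the PIO program above: `line` plays the role of the x register and `ones` loosely corresponds to the y counter, and all names are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Line {
    J,
    K,
}

impl Line {
    fn toggled(self) -> Line {
        match self {
            Line::J => Line::K,
            Line::K => Line::J,
        }
    }
}

/// Split a byte into bits in USB transmission order (LSB first).
fn bits_lsb_first(byte: u8) -> [bool; 8] {
    let mut bits = [false; 8];
    for (i, bit) in bits.iter_mut().enumerate() {
        *bit = (byte >> i) & 1 == 1;
    }
    bits
}

/// NRZI-encode a bit stream with bit stuffing, starting from idle (J).
/// Returns one line state per transmitted bit time, stuffed bits included.
fn nrzi_encode(bits: &[bool]) -> Vec<Line> {
    let mut line = Line::J; // idle state before the packet starts
    let mut ones = 0u8; // consecutive 1 bits seen so far
    let mut out = Vec::with_capacity(bits.len() + bits.len() / 6);
    for &bit in bits {
        if bit {
            // A 1 bit leaves the line unchanged.
            out.push(line);
            ones += 1;
            if ones == 6 {
                // Stuff a 0 after six 1s: force a transition.
                line = line.toggled();
                out.push(line);
                ones = 0;
            }
        } else {
            // A 0 bit toggles the line and resets the run counter.
            line = line.toggled();
            out.push(line);
            ones = 0;
        }
    }
    out
}
```

Feeding the SYNC byte 0x80 through `bits_lsb_first` and `nrzi_encode` should yield the familiar KJKJKJKK pattern, which makes a quick sanity check before comparing against captured line states.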

As a side note, I believe the description of the K/J polarity in the code is wrong. The PIO_USB_TX_ENCODED_DATA enum seems to have K and J swapped. I also think FJ_LK should just be J, and correspondingly FK_LJ should just be K, since as far as I can tell the selection of lines for these states is done in configure_tx_program rather than in encoding; the latter would be almost OK except for the fallthrough from the SE0 state to J. Of course, it's also possible I'm missing something.

Let me know if these kinds of musings are welcome. I'm also looking at trying to get RX to work at 48 MHz, which I think may be possible if more work is transferred to the CPU. My motivation here is largely other constraints on sysclk for correct DVI video timing.
