Skip to content

Conversation

@ArcaneNibble
Copy link
Collaborator

This code expects Timer0.TIM34 to count from [0, 256*256 - 1] (inclusive) and controls LED pulse widths based on TIM34[15:8], which is exactly how the existing firmware worked.

Requesting a review of only the code for now. More documentation on how it works and how to set up a build environment will be written later.

@ArcaneNibble ArcaneNibble requested a review from dlech July 14, 2025 23:11
#define LED1 10
#define LED2 13
#define LED3 11

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
register uint32_t __R30 asm ("r30");

main.c Outdated
static inline void update_pwm(uint8_t val, uint8_t time_now, uint32_t gpio_bit) {
// We want to force generation of the optimized set/clr opcodes
if (time_now < val) {
asm volatile("set r30, r30, %0"::"r"(gpio_bit));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
asm volatile("set r30, r30, %0"::"r"(gpio_bit));
__R30 &= ~(1 << gpio_bit);

Copy link
Member

@dlech dlech Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler may or may not emit the same code depending on if there is a better optimization. (I've been playing around with it locally just now - been a while since I did PRU asm 😄 )

In this case, it finds a better optimization in terms of code size and saves a few instructions without using set/clr instructions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No matter what I try I cannot get GCC to emit a clr opcode (even though I can get it to emit set)

However, I did realize I can optimize the code even further as the set/clr opcodes can take immediates. I've changed the code to use the "I" constraint in order to get it to do this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the changes I suggested still ended up with the same .text size when compiled even though it didn't generate the set/clr instructions.

main.o:     file format elf32-pru


Disassembly of section .text:

00000000 <update_pwm>:
   0:	240001e0 	ldi	r0, 1
   4:	084ee0e0 	lsl	r0, r0, r14.b2
   8:	580e2e00 	qble	8 <update_pwm+0x8>, r14.b1, r14.b0
   c:	12fee0e0 	or	r0, r0, r30
  10:	1300e0fe 	mov	r30, r0
  14:	20c30000 	ret
  18:	1700e0e0 	not	r0, r0
  1c:	10fee0e0 	and	r0, r0, r30
  20:	21000000 	jmp	0 <update_pwm>

Disassembly of section .text.startup:

00000000 <main>:
   0:	240000e5 	ldi	r5, 0
   4:	91142180 	lbco	r0.b0, 1, 20, 4
   8:	0b08e0e6 	lsr	r6, r0, 8
   c:	69002000 	qbne	c <main+0xc>, r0.b1, 0
  10:	91003e85 	lbco	r5.b0, 30, 0, 4
  14:	1300062e 	mov	r14.b1, r6.b0
  18:	1300050e 	mov	r14.b0, r5.b0
  1c:	24000c4e 	ldi	r14.b2, 12
  20:	230000c3 	call	0 <main>
  24:	1300062e 	mov	r14.b1, r6.b0
  28:	1300250e 	mov	r14.b0, r5.b1
  2c:	24000a4e 	ldi	r14.b2, 10
  30:	230000c3 	call	0 <main>
  34:	1300062e 	mov	r14.b1, r6.b0
  38:	1300450e 	mov	r14.b0, r5.b2
  3c:	24000d4e 	ldi	r14.b2, 13
  40:	230000c3 	call	0 <main>
  44:	24000b4e 	ldi	r14.b2, 11
  48:	1300062e 	mov	r14.b1, r6.b0
  4c:	1300650e 	mov	r14.b0, r5.b3
  50:	230000c3 	call	0 <main>
  54:	21000000 	jmp	0 <main>

And interestingly, if you only change the 2nd one to C code and leave the 1st one assembly, the second one will generate the expected instruction.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I did notice that. GCC would generate one or the other optimized opcode, but somehow not both.

uint32_t pwms;
} shared_ram;
// XXX: The real address in use here is 0x80010000
// There appears to be a compiler bug where ctable entries with the MSB set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a link for the bug we could add here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working on filing a bug right this moment, but unfortunately GCC's bug tracker seems to want manual approval for new account creation.

I have produced a minimal testcase though:

#pragma ctable_entry 30 0x00beef00
void good() {
    *(volatile unsigned int *)0x00beef00 = 0xdead;
}

#pragma ctable_entry 31 0x80beef00
void bad() {
    *(volatile unsigned int *)0x80beef00 = 0xf00d;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121124 to track this issue.

#define LED2 13
#define LED3 11

static inline void update_pwm(uint8_t val, uint8_t time_now, uint32_t gpio_bit) {
Copy link
Member

@dlech dlech Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static inline void update_pwm(uint8_t val, uint8_t time_now, uint32_t gpio_bit) {
static inline void update_pwm(uint8_t val, uint8_t time_now, uint8_t gpio_bit) {

@ArcaneNibble ArcaneNibble force-pushed the ledpwm branch 2 times, most recently from 8e87480 to 20aff3f Compare July 15, 2025 13:39
@ArcaneNibble
Copy link
Collaborator Author

Latest version swaps the order of the PWM channels to better match what software wants (R, G, R, G) rather than what the schematic is labeled as.

@dlech
Copy link
Member

dlech commented Jul 15, 2025

rather than what the schematic is labeled as.

Let's keeps the macro names matching the schematic though, otherwise someone will try to fix the "typo" in the future.

@ArcaneNibble
Copy link
Collaborator Author

Ok, I added explicit comments about what is going on to the code.

@dlech
Copy link
Member

dlech commented Jul 15, 2025

I think this is fine now.

I'm still not convinced that the inline assembly is necessary - we asked the compile to optimize for size and the generated code size is the same either way, so it is doing what we asked even if it isn't using the "obvious" instructions. So I would prefer the more readable regular C code. But you've taken the lead here, so I'll leave it up to you to do what you think is best.

@ArcaneNibble ArcaneNibble merged commit 4f87d35 into main Jul 15, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants