Skip to content

Conversation

@Yurlungur
Copy link

This is (currently) a very rough draft of an implementation of a batched serial QR decomposition. I basically took the team vector implementation and just swapped the relevant loops and single calls for serial for loops or undecorated code.

This is currently totally untested, I don't even know if it compiles. I wasn't quite sure how to thread it through the testing infrastructure. I also wanted to ask for feedback before moving further forward, as I worry a bit that this is a lot of code duplication and perhaps the implementations could be unified with some thought. Thanks in advance for taking a look.

Copy link
Contributor

@lucbv lucbv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow turn around... here are a few comments, should be easy to fix : )

KOKKOS_INLINE_FUNCTION static int invoke(const AViewType &A, const tViewType &t, const pViewType &p,
const wViewType &w,
/* */ int &matrix_rank) {
return SerialQR_WithColumnPivotingInternal::invoke(A.extent(0), A.extent(1), A.data(), A.stride(0), A.stride(1),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to guess that some assumptions are made here. For instance you are pulling extents and strides 0 and 1 from A so you expect a rank 2 view. There should be a static assertion on the rank of A then.
You might also want to check that p is storing some kind of integral type?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, will do.

SerialFindAmaxInternal::invoke(n_AR, norm_part1x2.AR, 1, pividx);

// apply pivot
SerialApplyPivotVectorForwardInternal::invoke(*pividx, norm_part1x2.AR, 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a use case where you would call the struct SerialApplyPivot declared in the apply pivot header? It looks like the struct is unused and only the specialization for Vector and Matrices are actually being called?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need it for the tests once those are threaded through, as the pivot will need to be applied if the QR decomposition is utilized to, e.g., solve a linear system. In the team vector batched test the team version is utilized, so I was copying that strategy.

@Yurlungur
Copy link
Author

Sorry for the slow turn around... here are a few comments, should be easy to fix : )

No problem, thanks for the review! I'll address your suggestions. Regarding tests---should I copy the test for the teamvector implementation? Or should I try to unify them somehow?

@cwpearson
Copy link
Contributor

cwpearson commented Sep 4, 2025

@Yurlungur thanks! You will need to sign off on each commit, and also run clang-format 16 (or you can wait for the format check to fail and look in the output and apply the diff it prints)

@Yurlungur
Copy link
Author

@Yurlungur thanks! You will need to sign off on each commit, and also run clang-format 16 (or you can wait for the format check to fail and look in the output and apply the diff it prints)

Gotcha. Will do. Sorry for the delay---I was traveling this week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants