Skip to content

Fix partial any/all reduction#959

Merged
cliffburdick merged 5 commits into
NVIDIA:mainfrom
simonbyrne:sbyrne/reduce-all
May 19, 2025
Merged

Fix partial any/all reduction#959
cliffburdick merged 5 commits into
NVIDIA:mainfrom
simonbyrne:sbyrne/reduce-all

Conversation

@simonbyrne
Copy link
Copy Markdown
Collaborator

Fixes #931.

@simonbyrne simonbyrne force-pushed the sbyrne/reduce-all branch from e777e9e to 749864b Compare May 8, 2025 18:25
@cliffburdick
Copy link
Copy Markdown
Collaborator

/build

Comment thread include/matx/transforms/reduce.h Outdated
@cliffburdick
Copy link
Copy Markdown
Collaborator

Looks good! Did you happen to reproduce the original bug?

@simonbyrne simonbyrne force-pushed the sbyrne/reduce-all branch from 5533afc to 749864b Compare May 8, 2025 18:53
@simonbyrne
Copy link
Copy Markdown
Collaborator Author

Yes, it occurs when you do a partial reduction of a tensor rank greater than 2 (basically when CUB isn't used).

@simonbyrne simonbyrne force-pushed the sbyrne/reduce-all branch from 25ab8bc to 59412bc Compare May 8, 2025 18:55
@cliffburdick
Copy link
Copy Markdown
Collaborator

Yes, it occurs when you do a partial reduction of a tensor rank greater than 2 (basically when CUB isn't used).

Since this code hasn't been touched in a while, is there a reason we can't use CUB in that case? We used to not use CUB in that situation before we had operators as input into CUB, now using ReduceInput we should support all ranks.

@simonbyrne
Copy link
Copy Markdown
Collaborator Author

ah, I thought CUB required contiguous segments. Let me see if I can figure it out.

@simonbyrne
Copy link
Copy Markdown
Collaborator Author

Okay, it took me a bit to figure out how it all worked, but my reading of it is that we can now use CUB for all reductions.

Comment thread include/matx/transforms/reduce.h Outdated
@simonbyrne
Copy link
Copy Markdown
Collaborator Author

Okay, I've deleted all the non-CUB reduction code.

@simonbyrne simonbyrne requested a review from cliffburdick May 15, 2025 21:29
@cliffburdick
Copy link
Copy Markdown
Collaborator

/build

@cliffburdick
Copy link
Copy Markdown
Collaborator

Looks like docs are failing:


/home/jenkins/workspace/unit-tests/docs_input/api/manipulation/selecting/reduce.rst:8: WARNING: doxygenfunction: Unable to resolve function "reduce" with arguments (OutType, TensorIndexType, const InType&, ReduceOp, cudaStream_t, bool) in doxygen xml output for project "MatX" from directory: /home/jenkins/workspace/unit-tests/build/docs_input/doxygen/xml.
Potential matches:
- template<typename InType, int D, typename ReduceOp> __MATX_INLINE__ auto reduce(const InType &in, const int (&dims)[D], ReduceOp op, bool init = true)
- template<typename InType, typename ReduceOp> __MATX_INLINE__ auto reduce(const InType &in, ReduceOp op, bool init = true)
- template<typename OutType, typename InType, typename ReduceOp> void __MATX_INLINE__ reduce(OutType dest, const InType &in, ReduceOp op, cudaStream_t stream = 0, bool init = true) [docutils]
- ```

@simonbyrne
Copy link
Copy Markdown
Collaborator Author

Ah, I see what happened. So it turns out there are two different reduce methods

  1. the MatX operators in operators/reduce.h

    • they don't appear in the current documentation
  2. the mutating ones in transforms/reduce.h where the output tensor is provided as an argument to the function

    • are not actually MatX operators: e.g. they accept a stream as an argument, instead of returning an operator and calling run().
    • the first of these is mentioned in the docs
    • are removed by this PR

I can either remove 2 completely, or simply make it fall back on calling the operator?

@cliffburdick
Copy link
Copy Markdown
Collaborator

Ah, I see what happened. So it turns out there are two different reduce methods

  1. the MatX operators in operators/reduce.h

    • they don't appear in the current documentation
  2. the mutating ones in transforms/reduce.h where the output tensor is provided as an argument to the function

    • are not actually MatX operators: e.g. they accept a stream as an argument, instead of returning an operator and calling run().
    • the first of these is mentioned in the docs
    • are removed by this PR

I can either remove 2 completely, or simply make it fall back on calling the operator?

Let's remove it from the docs and only have the operator

@cliffburdick
Copy link
Copy Markdown
Collaborator

/build

@cliffburdick cliffburdick merged commit 266b2a3 into NVIDIA:main May 19, 2025
1 check passed
@simonbyrne simonbyrne deleted the sbyrne/reduce-all branch May 20, 2025 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Function "all" and "any" can not work normally

2 participants