Description
🐛 Bug
I'm trying to install xFormers and a few other packages in an auxiliary notebook so I can use it as a utility script in a submission notebook I am preparing for the UBC-OCEAN competition. However, after installation, running my submission notebook produces the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[6], line 37
35 coords = coords.squeeze(0)
36 X = tiles.float().to(device=device, non_blocking=True)
---> 37 y_prob, pred, features = model(X, coords)
38 query_preds.append((image_id.item(), labels[pred.to(device='cpu').item()]))
39 query_features.append(features.view(-1).to(device='cpu'))
File /kaggle/usr/lib/ubc_ocean_packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
Cell In[4], line 227, in WSINet.forward(self, x, coords)
225 def forward(self, x, coords):
226 features = self.encoder(x).unsqueeze(0)
--> 227 features, mask = self.roformer(features, coords)
228 y_prob, y_hat, attention = self.attention(features)
230 return y_prob, y_hat, attention
File /kaggle/usr/lib/ubc_ocean_packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
Cell In[4], line 97, in RoFormerLayer.forward(self, x, coords)
95 q, k = apply_rotary_position_embeddings(self.rope(h, grid_h, grid_w), q, k)
96 q, k, v = q.reshape(bs, n, self.heads, self.head_dim), k.reshape(bs, n, self.heads, self.head_dim), v.reshape(bs, n, self.heads, self.head_dim)
---> 97 att = fmha.memory_efficient_attention(q, k, v, attn_bias=mask, p = self.dropout, op=(fmha.cutlass.FwOp, fmha.cutlass.BwOp))
98 o = self.norm2(h + att.reshape(bs, n, h.size(-1)))
99 ff = self.mlp(o)
File /kaggle/usr/lib/ubc_ocean_packages/xformers/ops/fmha/__init__.py:223, in memory_efficient_attention(query, key, value, attn_bias, p, scale, op)
116 def memory_efficient_attention(
117 query: torch.Tensor,
118 key: torch.Tensor,
(...)
124 op: Optional[AttentionOp] = None,
125 ) -> torch.Tensor:
126 """Implements the memory-efficient attention mechanism following
127 `"Self-Attention Does Not Need O(n^2) Memory" <[http://arxiv.org/abs/2112.05682>`_.](http://arxiv.org/abs/2112.05682%3E%60_.%3C/span%3E)
128
(...)
221 :return: multi-head attention Tensor with shape ``[B, Mq, H, Kv]``
222 """
--> 223 return _memory_efficient_attention(
224 Inputs(
225 query=query, key=key, value=value, p=p, attn_bias=attn_bias, scale=scale
226 ),
227 op=op,
228 )
File /kaggle/usr/lib/ubc_ocean_packages/xformers/ops/fmha/__init__.py:321, in _memory_efficient_attention(inp, op)
316 def _memory_efficient_attention(
317 inp: Inputs, op: Optional[AttentionOp] = None
318 ) -> torch.Tensor:
319 # fast-path that doesn't require computing the logsumexp for backward computation
320 if all(x.requires_grad is False for x in [inp.query, inp.key, inp.value]):
--> 321 return _memory_efficient_attention_forward(
322 inp, op=op[0] if op is not None else None
323 )
325 output_shape = inp.normalize_bmhk()
326 return _fMHA.apply(
327 op, inp.query, inp.key, inp.value, inp.attn_bias, inp.p, inp.scale
328 ).reshape(output_shape)
File /kaggle/usr/lib/ubc_ocean_packages/xformers/ops/fmha/__init__.py:339, in _memory_efficient_attention_forward(inp, op)
337 op = _dispatch_fw(inp, False)
338 else:
--> 339 _ensure_op_supports_or_raise(ValueError, "memory_efficient_attention", op, inp)
341 out, *_ = op.apply(inp, needs_gradient=False)
342 return out.reshape(output_shape)
File /kaggle/usr/lib/ubc_ocean_packages/xformers/ops/fmha/dispatch.py:39, in _ensure_op_supports_or_raise(exc_type, name, op, inp)
37 if not reasons:
38 return
---> 39 raise exc_type(
40 f"""Operator `{name}` does not support inputs:
41 {textwrap.indent(_format_inputs_description(inp), ' ')}
42 {_format_not_supported_reasons(op, reasons)}"""
43 )
ValueError: Operator `memory_efficient_attention` does not support inputs:
query : shape=(1, 7040, 8, 96) (torch.float32)
key : shape=(1, 7040, 8, 96) (torch.float32)
value : shape=(1, 7040, 8, 96) (torch.float32)
attn_bias : <class 'xformers.ops.fmha.attn_bias.BlockDiagonalMask'>
p : 0.25
`cutlassF` is not supported because:
xFormers wasn't build with CUDA support
operator wasn't built - see `python -m xformers.info` for more info
The 'python -m xformers.info' command returns the following (note that the memory_efficient_attention methods, which I need, are unavailable):
xFormers 0.0.22.post7+cu118
memory_efficient_attention.cutlassF: unavailable
memory_efficient_attention.cutlassB: unavailable
memory_efficient_attention.decoderF: unavailable
memory_efficient_attention.flshattF@0.0.0: unavailable
memory_efficient_attention.flshattB@0.0.0: unavailable
memory_efficient_attention.smallkF: unavailable
memory_efficient_attention.smallkB: unavailable
memory_efficient_attention.tritonflashattF: unavailable
memory_efficient_attention.tritonflashattB: unavailable
memory_efficient_attention.triton_splitKF: unavailable
indexing.scaled_index_addF: available
indexing.scaled_index_addB: available
indexing.index_select: available
swiglu.dual_gemm_silu: unavailable
swiglu.gemm_fused_operand_sum: unavailable
swiglu.fused.p.cpp: not built
is_triton_available: True
pytorch.version: 2.0.1+cu118
pytorch.cuda: available
gpu.compute_capability: 6.0
gpu.name: Tesla P100-PCIE-16GB
build.info: available
build.cuda_version: 1108
build.python_version: 3.10.13
build.torch_version: 2.1.0+cu118
build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0+PTX 9.0
build.env.XFORMERS_BUILD_TYPE: Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.22.post7
source.privacy: open source
Also, I have noticed the following warnings in the xFormers part of the install log for the auxiliary notebook, despite having successfully installed the required cupy-cuda11x:
cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
cuml 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
dask-cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
I believe the missing cupy-cuda11x is why CUDA is not available to this xFormers install.
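As a sanity check (a hypothetical snippet, not part of my original notebooks), the cupy install and the CUDA runtime it sees can be probed from within the submission notebook:
import cupy
print(cupy.__version__)                       # should report 12.x
print(cupy.cuda.runtime.runtimeGetVersion())  # e.g. 11080 for CUDA 11.8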
In short, I need one or both of the following:
1- the package cupy-cuda11x (the current version is 12.2.0 and works with CUDA 11.2-11.8) installed in the environment for GPUs
2- (even better) the package xFormers cu118 installed in the environment for GPUs (however, I am aware that the P100s only use CUDA 11.4, so an upgrade of the NVIDIA driver might be required; see the driver check below)
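For reference, the driver-side CUDA version reported on the P100 notebooks can be confirmed with the following command (a hypothetical check, not part of the request itself):
!nvidia-smi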
To Reproduce
On a GPU P100 notebook with Persistence set to "Files only" and configured as a utility script, run and save the following:
!pip install cupy-cuda11x --target=/kaggle/working/
!pip install torchstain --target=/kaggle/working
!pip install faiss-cpu --target=/kaggle/working/
!pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2 --target=/kaggle/working/ --index-url https://download.pytorch.org/whl/cu118
!pip3 install xformers --target=/kaggle/working/ --index-url https://download.pytorch.org/whl/cu118
Then, in another GPU P100 notebook, attach the utility script built above and run the command '!python -m xformers.info'.
Expected behavior
The notebook should be able to contain and execute any correctly written piece of code that calls the 'memory_efficient_attention' component of xFormers, at least for the base CUTLASS methods (the other methods rely on the Triton and Flash-Attention packages being installed; for the purpose of this request, I do not need them).
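For illustration, here is a minimal sketch of the kind of call that should succeed once the CUTLASS kernels are available; the shapes and the op tuple mirror the failing call in the traceback above (the snippet assumes only that a CUDA-enabled torch and xFormers are importable):
import torch
from xformers.ops import fmha, memory_efficient_attention

device = torch.device("cuda")
# BMHK layout (batch, sequence, heads, head_dim), float32, as in the traceback
q = torch.randn(1, 7040, 8, 96, device=device)
k = torch.randn(1, 7040, 8, 96, device=device)
v = torch.randn(1, 7040, 8, 96, device=device)

# Explicitly request the base CUTLASS forward/backward ops, as my model code does
out = memory_efficient_attention(q, k, v, op=(fmha.cutlass.FwOp, fmha.cutlass.BwOp))
print(out.shape)  # expected: torch.Size([1, 7040, 8, 96])
In the current environment this snippet would be expected to raise the same ValueError shown above instead of returning the attention output.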