Open
Description
🐛 Bug
I am attempting to develop custom Pallas kernels locally on CPU before running them on a TPU. I am following the official example here, with the minor modification that I run the script on CPU using interpret mode. After investigating, it appears that the latest custom-kernel code on the main branch should fix this error.
To Reproduce
Please use the colab here:
Steps to reproduce the behavior:
- Run the colab
- Observe errors in the last two cells
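For context, the kind of kernel involved can be sketched as a minimal standalone JAX Pallas call run in interpret mode (this is an illustrative sketch, not the colab's exact code — the kernel body, names, and shapes are assumptions; the torch_xla wrapper layer is omitted):

```python
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_kernel(x_ref, y_ref, o_ref):
    # Element-wise add; refs are read/written with array indexing.
    o_ref[...] = x_ref[...] + y_ref[...]

@jax.jit
def add(x, y):
    return pl.pallas_call(
        add_kernel,
        out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
        interpret=True,  # emulate the kernel on CPU, no TPU required
    )(x, y)

x = jnp.arange(8, dtype=jnp.float32)
y = jnp.arange(8, dtype=jnp.float32)
print(add(x, y))
```

With `interpret=True`, the same `pallas_call` that would be lowered for TPU is instead evaluated with the interpreter, which is what makes CPU-only local development possible.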
Expected behavior
The code should execute without errors.
Environment
- Reproducible on XLA backend [CPU/TPU/CUDA]: CPU
- torch_xla version: ~=2.3.0
Additional context
N/A