
Unnecessary slowness of PyRAM.profl? #6

@clapeyre

Hi,

Thanks for this great Python / numba implementation!

I was doing some profiling and noticed what looked like low-hanging fruit for a speed boost. Most of PyRAM.profl relies on vectorized numpy methods, yet for a reason I can't figure out, the final part falls back to a native Python loop:

        for i in range(self.nz + 2):
            self.ksqw[i] = (self.omega / self.cw[i])**2 - self.k0**2
            self.ksqb[i] = ((self.omega / self.cb[i]) *
                            (1 + 1j * self.eta * self.attn[i]))**2 - self.k0**2
            self.alpw[i] = numpy.sqrt(self.cw[i] / self._c0)
            self.alpb[i] = numpy.sqrt(self.rhob[i] * self.cb[i] / self._c0)

In my testing, this single loop is responsible for 50% of the overall compute time. Switching it to the equivalent vectorized numpy calls seems to yield the same outputs in my case, and is dramatically faster:

        self.ksqw = (self.omega / self.cw)**2 - self.k0**2
        self.ksqb = ((self.omega / self.cb) *
                     (1 + 1j * self.eta * self.attn))**2 - self.k0**2
        self.alpw = numpy.sqrt(self.cw / self._c0)
        self.alpb = numpy.sqrt(self.rhob * self.cb / self._c0)
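
For reference, here is a minimal standalone sketch of the comparison I have in mind. Everything below (the grid size, frequency, profiles, and the eta constant) is a made-up stand-in rather than PyRAM's actual state, so treat it as illustrative only:

    import timeit

    import numpy

    # Stand-in values, not PyRAM's actual state.
    nz = 10_000                      # hypothetical grid size
    omega = 2 * numpy.pi * 50.0      # hypothetical angular frequency (50 Hz)
    c0 = 1500.0                      # reference sound speed
    k0 = omega / c0                  # reference wavenumber
    eta = 1 / (40 * numpy.pi * numpy.log10(numpy.e))  # stand-in attenuation factor

    rng = numpy.random.default_rng(0)
    cw = 1500.0 + rng.uniform(-10.0, 10.0, nz + 2)    # water sound speed profile
    cb = 1700.0 + rng.uniform(-10.0, 10.0, nz + 2)    # bottom sound speed profile
    rhob = 1.5 + rng.uniform(-0.1, 0.1, nz + 2)       # bottom density profile
    attn = rng.uniform(0.0, 0.5, nz + 2)              # bottom attenuation profile

    def loop_version():
        # Element-by-element, mirroring the current PyRAM.profl loop.
        ksqw = numpy.empty(nz + 2)
        ksqb = numpy.empty(nz + 2, dtype=complex)
        alpw = numpy.empty(nz + 2)
        alpb = numpy.empty(nz + 2)
        for i in range(nz + 2):
            ksqw[i] = (omega / cw[i])**2 - k0**2
            ksqb[i] = ((omega / cb[i]) * (1 + 1j * eta * attn[i]))**2 - k0**2
            alpw[i] = numpy.sqrt(cw[i] / c0)
            alpb[i] = numpy.sqrt(rhob[i] * cb[i] / c0)
        return ksqw, ksqb, alpw, alpb

    def vectorized_version():
        # Whole-array numpy expressions, as proposed above.
        ksqw = (omega / cw)**2 - k0**2
        ksqb = ((omega / cb) * (1 + 1j * eta * attn))**2 - k0**2
        alpw = numpy.sqrt(cw / c0)
        alpb = numpy.sqrt(rhob * cb / c0)
        return ksqw, ksqb, alpw, alpb

    # Check that both paths agree to floating-point tolerance.
    for a, b in zip(loop_version(), vectorized_version()):
        assert numpy.allclose(a, b)

    print("loop:      ", timeit.timeit(loop_version, number=20))
    print("vectorized:", timeit.timeit(vectorized_version, number=20))

The timing difference comes from per-element interpreter and numpy-scalar overhead in the loop, which the vectorized version eliminates entirely.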

Is there a reason this loop was the only one kept Python-native instead of numpy-vectorized?
