Hi,
Thanks for this great Python / numba implementation!
I was doing some profiling and noticed what seemed like a low-hanging fruit to get a small speed boost. Most of PyRAM.profl relies on vectorized numpy methods. Yet, for some reason I can't figure out, the final bit doesn't and reverts to a native Python loop:
```python
for i in range(self.nz + 2):
    self.ksqw[i] = (self.omega / self.cw[i])**2 - self.k0**2
    self.ksqb[i] = ((self.omega / self.cb[i]) *
                    (1 + 1j * self.eta * self.attn[i]))**2 - self.k0**2
    self.alpw[i] = numpy.sqrt(self.cw[i] / self._c0)
    self.alpb[i] = numpy.sqrt(self.rhob[i] * self.cb[i] / self._c0)
```

In my testing, this single loop is responsible for about 50% of the overall compute time. Switching it to the natural numpy call yields the same outputs in my case and is dramatically faster:
```python
self.ksqw = (self.omega / self.cw)**2 - self.k0**2
self.ksqb = ((self.omega / self.cb) *
             (1 + 1j * self.eta * self.attn))**2 - self.k0**2
self.alpw = numpy.sqrt(self.cw / self._c0)
self.alpb = numpy.sqrt(self.rhob * self.cb / self._c0)
```

Is there a reason this loop was the only one kept Python-native instead of numpy-vectorized?
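For anyone wanting to sanity-check the equivalence outside of PyRAM, here is a minimal standalone sketch. The array shapes, constants, and the `eta` value are hypothetical stand-ins, not PyRAM's actual inputs; the point is just that the vectorized expressions match the element-wise loop to floating-point precision:

```python
import numpy as np

# Hypothetical stand-in values for the attributes used in the loop above.
rng = np.random.default_rng(0)
nz = 8
omega, k0, c0, eta = 2 * np.pi * 50.0, 0.2, 1500.0, 0.01
cw = 1500.0 + rng.uniform(-10, 10, nz + 2)     # water sound speed profile
cb = 1700.0 + rng.uniform(-10, 10, nz + 2)     # bottom sound speed profile
attn = rng.uniform(0.0, 0.5, nz + 2)           # bottom attenuation
rhob = 1.5 + rng.uniform(-0.1, 0.1, nz + 2)    # bottom density

# Original element-wise Python loop.
ksqw_loop = np.empty(nz + 2)
ksqb_loop = np.empty(nz + 2, dtype=complex)
alpw_loop = np.empty(nz + 2)
alpb_loop = np.empty(nz + 2)
for i in range(nz + 2):
    ksqw_loop[i] = (omega / cw[i])**2 - k0**2
    ksqb_loop[i] = ((omega / cb[i]) *
                    (1 + 1j * eta * attn[i]))**2 - k0**2
    alpw_loop[i] = np.sqrt(cw[i] / c0)
    alpb_loop[i] = np.sqrt(rhob[i] * cb[i] / c0)

# Vectorized replacement: identical arithmetic applied to whole arrays.
ksqw = (omega / cw)**2 - k0**2
ksqb = ((omega / cb) * (1 + 1j * eta * attn))**2 - k0**2
alpw = np.sqrt(cw / c0)
alpb = np.sqrt(rhob * cb / c0)

assert np.allclose(ksqw, ksqw_loop)
assert np.allclose(ksqb, ksqb_loop)
assert np.allclose(alpw, alpw_loop)
assert np.allclose(alpb, alpb_loop)
```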