Improve our memory allocator support:

- Use a memory pool for CPU memory.
- Use our own internal memory pool for GPU memory (not necessarily CUB, since we want more flexibility in bucket sizing, etc.).
- Support pinned memory.
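
To make the request more concrete, here is a minimal sketch of what a size-bucketed caching pool could look like. Everything in it is an assumption for illustration, not an existing API in this project: the `BucketPool` name, the injected `alloc`/`free` callbacks, the power-of-two bucket policy, and the 256-byte minimum bucket. The backing allocator is injected so that the same logic could, in principle, sit on top of `malloc`/`free` for CPU memory, or thin wrappers around `cudaMalloc`/`cudaFree` for GPU memory and `cudaMallocHost`/`cudaFreeHost` for pinned memory. The sketch is not thread-safe and ignores CUDA streams.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <limits>
#include <map>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical size-bucketed caching pool (illustrative only).
class BucketPool {
public:
    using AllocFn = std::function<void*(std::size_t)>;
    using FreeFn = std::function<void(void*)>;

    BucketPool(AllocFn alloc, FreeFn free, std::size_t min_bucket = 256)
        : alloc_(std::move(alloc)),
          free_(std::move(free)),
          min_bucket_(min_bucket == 0 ? 1 : min_bucket) {}

    // Only cached blocks are released here; blocks still checked out
    // remain the caller's responsibility in this sketch.
    ~BucketPool() { release_cached(); }

    BucketPool(const BucketPool&) = delete;
    BucketPool& operator=(const BucketPool&) = delete;

    // Round a request up to the next power-of-two multiple of min_bucket_.
    // This is the policy one would replace to get custom bucket sizing.
    std::size_t bucket_size(std::size_t bytes) const {
        std::size_t b = min_bucket_;
        while (b < bytes) {
            if (b > std::numeric_limits<std::size_t>::max() / 2) {
                return bytes;  // cannot double without overflow; use exact size
            }
            b <<= 1;
        }
        return b;
    }

    // Reuse a cached block from the matching bucket, or fall back to the
    // backing allocator when the bucket is empty.
    void* allocate(std::size_t bytes) {
        const std::size_t b = bucket_size(bytes);
        std::vector<void*>& cached = free_lists_[b];
        void* p = nullptr;
        if (!cached.empty()) {
            p = cached.back();
            cached.pop_back();
        } else {
            p = alloc_(b);
            if (p == nullptr) return nullptr;
            ++backing_allocs_;
        }
        live_[p] = b;
        return p;
    }

    // Return a block to its bucket instead of freeing it immediately.
    // Pointers not handed out by this pool are ignored.
    void deallocate(void* p) {
        auto it = live_.find(p);
        if (it == live_.end()) return;
        free_lists_[it->second].push_back(p);
        live_.erase(it);
    }

    // Hand all cached (idle) blocks back to the backing allocator.
    void release_cached() {
        for (auto& kv : free_lists_) {
            for (void* p : kv.second) free_(p);
            kv.second.clear();
        }
    }

    // Number of times the backing allocator was actually called.
    std::size_t backing_allocs() const { return backing_allocs_; }

private:
    AllocFn alloc_;
    FreeFn free_;
    std::size_t min_bucket_;
    std::size_t backing_allocs_ = 0;
    std::map<std::size_t, std::vector<void*>> free_lists_;  // bucket -> idle blocks
    std::unordered_map<void*, std::size_t> live_;           // block -> bucket
};
```

A CPU instance could be built as `BucketPool pool([](std::size_t n) { return std::malloc(n); }, [](void* p) { std::free(p); });`. Allocating 300 bytes, freeing, and then allocating 400 bytes would reuse the same 512-byte block without a second call to `malloc`. Keeping the bucket policy in one small function is what would give the flexibility in bucket sizing mentioned above.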