nerfstudio-project / gsplat
CUDA-accelerated rasterization of Gaussian splatting
Tile primitives for speedy kernels
CUDA Library Samples
A massively parallel, optimal functional runtime in Rust
Instant neural graphics primitives: lightning fast NeRF and more
This series covers GPU optimization topics and explains in detail how to optimize CUDA kernels. I introduce several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
NCCL Tests
Causal depthwise conv1d in CUDA, with a PyTorch interface
CUDA Kernel Benchmarking Library
cuGraph - RAPIDS Graph Analytics Library