Inspiration
My initial interest came from my school research, where I had used CUDA for benchmarking various NVIDIA GPUs. I was deeply intrigued by the possibility of porting this code to a wider range of platforms, especially after learning about Intel's oneAPI, the DPC++ SYCL compiler, and their new Data Center GPU Max Series.
What I Learnt
This hackathon was an incredible learning curve. I gained firsthand experience with oneAPI and the SYCL framework and saw the potential of cross-platform accelerator programming. The process taught me how to migrate code from CUDA to SYCL, giving me a broader understanding of vendor architectures beyond NVIDIA's.
How I Built It
Starting with my original CUDA code, which performed Hermitian matrix multiplication, I took on the task of migrating it to SYCL. By following the provided guidelines and drawing on Intel Developer forum discussions, I transitioned my codebase to Intel's ecosystem. My focus was not just on making the code work but on ensuring good performance on the new platforms.
Challenges I Faced
Migrating from CUDA to SYCL was not completely straightforward. Even after automatic porting with the c2s tool, some CUDA libraries had no direct SYCL equivalents, so I had to find alternatives or write custom subroutines. Performance tuning was another hurdle, given the many differences between NVIDIA and Intel hardware architectures. Insights from Intel VTune and Advisor were invaluable for pinpointing bottlenecks and areas for improvement.