Job Description

Senior Engineer

Essential Skills: We are seeking a highly skilled Senior C++/CUDA Engineer with strong experience in GPU-accelerated computation and deep learning inference. The role involves developing high-performance GPU kernels, optimizing large-scale matrix operations, and integrating with Python-based orchestration layers.

Responsibilities:

  • Develop and optimize GPU compute kernels using CUDA, Tensor Cores, and NVIDIA libraries
  • Implement and tune high-performance matrix operations using cuBLAS/cuBLASLt, CUTLASS, and cuSPARSE
  • Design batch/grouped execution flows for large-scale GEMM/SpMM workloads
  • Optimize memory layouts, data movement, and kernel launch efficiency
  • Conduct profiling and performance tuning using Nsight Compute and Nsight Systems
  • Build clean and stable C/C++ APIs and integrate with Python (PyBind11, PyTorch extensions)
  • Ensure numerical correctness, mixed-precision stability, and robust error handling

Responsibilities

  • Salary: As per industry standard
  • Industry: IT-Software / Software Services
  • Functional Area: IT Software - Application Programming, Maintenance
  • Role Category: Programming & Design
  • Role: Senior Engineer