  • fix CUBLAS guards (#1162) · 54b93919
    Burc Eryilmaz authored
    
    
    * support for fused dense layer with cublasLt, fusion in both fprop and bprop
    
    * fix typo causing syntax error
    
    * add fused GEMM+gelu+GEMM module
    
    * fix typo for workspace size
    
    * update cublas check for 11600
    
    * add tests for fused dense layer
    
    * fix CUDA 10.x path
    
    * safer guard around CUBLAS constants, remove unreferenced variable
    
    * more guard changes
    
    * guard against cublas version instead of cuda
    
    Co-authored-by: Sukru Eryilmaz <seryilmaz@computelab-dgx1v-32.nvidia.com>
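For readers unfamiliar with the "fused GEMM+gelu+GEMM module" mentioned in the commit message, the following is a minimal, unfused reference sketch of the math such a module would be expected to compute: a dense layer, a GELU activation, then a second dense layer. It is not the code from this commit. The function names (`gelu`, `linear`, `dense_gelu_dense`) are hypothetical, and the presence of bias terms, the `[out][in]` weight layout, and the erf-based GELU variant are all assumptions, since the commit message does not state them.

```python
import math


def gelu(x):
    # Erf-based GELU. Assumption: the commit does not say which GELU
    # variant (erf or tanh approximation) the fused kernel uses.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(x, weight, bias):
    # Plain dense layer: x is [batch][in], weight is [out][in], bias is [out].
    # Returns [batch][out]. The weight layout is an assumption for illustration.
    return [
        [sum(xi * wi for xi, wi in zip(row, w_row)) + b
         for w_row, b in zip(weight, bias)]
        for row in x
    ]


def dense_gelu_dense(x, w1, b1, w2, b2):
    # Unfused reference: GEMM + bias, then GELU, then GEMM + bias.
    # A fused implementation would aim to produce the same values
    # while avoiding separate passes over the intermediate tensor.
    hidden = linear(x, w1, b1)
    activated = [[gelu(h) for h in row] for row in hidden]
    return linear(activated, w2, b2)
```

A reference like this is mainly useful as a correctness oracle: the output of a fused kernel can be compared against it within a floating-point tolerance.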