minuet.nn.functional.convolution
Functions
| cuda_time_gather | Benchmark the gather operation |
| cuda_time_gemm | Benchmark the GEMM operation |
| cuda_time_scatter | Benchmark the scatter operation |
| set_gemm_parallel_level | Set the parallelization level (number of CUDA streams) for executing GEMM operations |
| sparse_convolution_forward | Executes the forward pass of a sparse convolution |
- cuda_time_gather(weights: Tensor, source_masks: Tensor, target_masks: Tensor, kernel_map_sizes: Tensor, tile_size: int, allow_shortcut_matmul: bool = False, threshold: float = 0) -> float
- Benchmark the gather operation.
- Parameters:
- weights – the weight tensors 
- source_masks – the source masks from the kernel map 
- target_masks – the target masks from the kernel map 
- kernel_map_sizes – the sizes of each weight in the kernel map 
- tile_size – the tile size for the gather operation 
- allow_shortcut_matmul – whether to allow the shortcut for computing the trivial weight 
- threshold – the threshold that controls the padding of the GEMM operands 
 
- Returns:
- the measured time of the gather operation 
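
Because `cuda_time_gather` returns a single measured time for a given `tile_size`, it can serve as the cost function of a small tuning loop. The following is a minimal sketch of that idea, not part of Minuet: `pick_best_tile_size` and its `time_fn` argument are hypothetical names, and the timing callable is passed in so the loop itself has no GPU dependency.

```python
from typing import Callable, Iterable, Optional


def pick_best_tile_size(time_fn: Callable[[int], float],
                        candidates: Iterable[int],
                        repeats: int = 3) -> int:
    """Return the candidate tile size with the lowest median measured time.

    time_fn is a hypothetical callable that maps a tile size to a measured
    time; it is not a Minuet API.
    """
    best_tile: Optional[int] = None
    best_time = float("inf")
    for tile in candidates:
        # Take the median of several runs to damp timing noise.
        samples = sorted(time_fn(tile) for _ in range(repeats))
        median = samples[len(samples) // 2]
        if median < best_time:
            best_tile, best_time = tile, median
    if best_tile is None:
        raise ValueError("candidates must not be empty")
    return best_tile


# With Minuet and a CUDA device available, time_fn could wrap the documented
# benchmark (assumed usage, not executed here):
#   time_fn = lambda t: cuda_time_gather(weights, source_masks, target_masks,
#                                        kernel_map_sizes, tile_size=t)
```

The same loop would apply to `cuda_time_scatter`, since it takes the same `tile_size` parameter.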
 
- cuda_time_gemm(weights: Tensor, source_masks: Tensor, target_masks: Tensor, kernel_map_sizes: Tensor, allow_shortcut_matmul: bool = False, parallel: int | None = None, threshold: float = 0)
- Benchmark the GEMM operation.
- Parameters:
- weights – the weight tensors 
- source_masks – the source masks from the kernel map 
- target_masks – the target masks from the kernel map 
- kernel_map_sizes – the sizes of each weight in the kernel map 
- allow_shortcut_matmul – whether to allow the shortcut for computing the trivial weight 
- parallel – the parallelization level of the GEMM operation 
- threshold – the threshold that controls the padding of the GEMM operands 
 
- Returns:
- the measured time of the GEMM operation 
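
The `threshold` parameter is described only as controlling the padding of the GEMM operands. One plausible reading, sketched below, is that weights with similar kernel map sizes are grouped and padded to a common size, and the threshold caps the fraction of padded rows. This grouping rule is an assumption for illustration. The helper `group_by_padding` is hypothetical and does not claim to reproduce Minuet's internal logic.

```python
from typing import List, Sequence


def group_by_padding(sizes: Sequence[int], threshold: float) -> List[List[int]]:
    """Greedily group weight indices so padding overhead stays within threshold.

    Every member of a group is padded to the group's largest size. Overhead
    is (padded rows - real rows) / real rows. Hypothetical rule, not Minuet's.
    """
    # Visit weights from largest to smallest kernel map size.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i], reverse=True)
    groups: List[List[int]] = []
    current: List[int] = []
    for i in order:
        trial = current + [i]
        largest = sizes[trial[0]]
        real = sum(sizes[j] for j in trial)
        padded = largest * len(trial)
        too_wasteful = real == 0 or (padded - real) / real > threshold
        if current and too_wasteful:
            groups.append(current)  # close the group and start a new one
            current = [i]
        else:
            current = trial
    if current:
        groups.append(current)
    return groups
```

Under this reading, a threshold of 0 groups only weights whose sizes are equal, so no padding is introduced, while larger values trade padded work for fewer, larger GEMMs.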
 
- cuda_time_scatter(weights: Tensor, source_masks: Tensor, target_masks: Tensor, kernel_map_sizes: Tensor, tile_size: int, allow_shortcut_matmul: bool = False, threshold: float = 0)
- Benchmark the scatter operation.
- Parameters:
- weights – the weight tensors 
- source_masks – the source masks from the kernel map 
- target_masks – the target masks from the kernel map 
- kernel_map_sizes – the sizes of each weight in the kernel map 
- tile_size – the tile size for the scatter operation 
- allow_shortcut_matmul – whether to allow the shortcut for computing the trivial weight 
- threshold – the threshold that controls the padding of the GEMM operands 
 
- Returns:
- the measured time of the scatter operation 
 
- set_gemm_parallel_level(level: int)
- Set the parallelization level (number of CUDA streams) for executing GEMM operations.
- Parameters:
- level – the parallelization level 
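
Since the level is the number of CUDA streams, one simple way to picture its effect is as a distribution of independent GEMMs over that many lanes. The sketch below uses a round-robin assignment purely as an illustration. The function `assign_round_robin` is hypothetical, and the actual scheduling policy inside Minuet is not documented here.

```python
from typing import List


def assign_round_robin(num_gemms: int, level: int) -> List[List[int]]:
    """Distribute GEMM indices over `level` lanes (stand-ins for CUDA streams).

    Hypothetical illustration; not Minuet's scheduling code.
    """
    if level < 1:
        raise ValueError("level must be at least 1")
    lanes: List[List[int]] = [[] for _ in range(level)]
    for gemm_index in range(num_gemms):
        lanes[gemm_index % level].append(gemm_index)
    return lanes
```

With a level of 1 every GEMM lands in the same lane, which corresponds to fully serialized execution.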
 
- sparse_convolution_forward(sources: Tensor, weights: Tensor, source_masks: Tensor, target_masks: Tensor, kernel_map_order: Tensor | None, kernel_map_sizes: Tensor, gather_tile_size: int, scatter_tile_size: int, allow_shortcut_matmul: bool = False, parallel: int | None = None, threshold: float | None = 0) -> Tensor
- Executes the forward pass of a sparse convolution.
- Parameters:
- sources – the feature tensor of the input SparseTensor
- weights – the weight tensor of the SparseConv
- source_masks – the source masks from the kernel map 
- target_masks – the target masks from the kernel map 
- kernel_map_order – the order of the weights when sorted by their kernel map sizes 
- kernel_map_sizes – the sizes of each weight in the kernel map 
- gather_tile_size – the tile size for the gather operation 
- scatter_tile_size – the tile size for the scatter operation 
- allow_shortcut_matmul – whether to allow the shortcut for computing the trivial weight 
- parallel – the parallelization level of the GEMM operation 
- threshold – the threshold that controls the padding of the GEMM operands 
 
- Returns:
- the feature tensor of the output SparseTensor
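
The gather, GEMM, and scatter stages named on this page can be illustrated with a small pure-Python reference. This is a sketch under stated assumptions, not Minuet's implementation: the kernel map is represented as a hypothetical list of `(source_row, target_row)` pairs per weight, standing in for `source_masks` and `target_masks`, whose actual layout is not described here. It runs on the CPU with plain lists.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Dense matrix product of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def sparse_conv_reference(sources: Matrix,
                          weights: Sequence[Matrix],
                          kernel_map: Sequence[Sequence[Tuple[int, int]]],
                          num_targets: int) -> Matrix:
    """Gather-GEMM-scatter reference for a sparse convolution forward pass.

    kernel_map[k] lists (source_row, target_row) pairs for weight k. This
    layout is a hypothetical stand-in for Minuet's mask tensors.
    """
    c_out = len(weights[0][0])
    targets = [[0.0] * c_out for _ in range(num_targets)]
    for weight, pairs in zip(weights, kernel_map):
        if not pairs:
            continue
        gathered = [sources[s] for s, _ in pairs]   # gather input rows
        partial = matmul(gathered, weight)          # GEMM with this weight
        for (_, t), row in zip(pairs, partial):     # scatter-add to outputs
            for c, value in enumerate(row):
                targets[t][c] += value
    return targets


sources = [[1.0, 2.0], [3.0, 4.0]]
weights = [[[1.0], [0.0]], [[0.0], [1.0]]]  # two weights, C_in=2, C_out=1
kernel_map = [[(0, 0), (1, 1)], [(1, 0)]]
output = sparse_conv_reference(sources, weights, kernel_map, num_targets=2)
```

In this toy input, target row 0 receives contributions from both weights and target row 1 from the first weight only.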