Conversion from strided to batched sparse compressed tensor with a non-constant number of zeros in batches fails #104193
Labels
module: sparse
Related to torch.sparse
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Issue description
As in the title.
This issue was created to discuss approaches to supporting strided-to-sparse-compressed conversion in cases where the number of zeros differs between batches.
Code example
Consider a batched tensor of 2-by-2 tensors that can be represented as a batched CSR tensor because both batches have an equal number of zeros: 2.
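The equal-count case can be illustrated with a minimal pure-Python sketch. The helper dense_to_csr and the values in x are hypothetical, chosen only so that each batch has exactly 2 zeros; this is not the torch implementation or the tensor from the original report.

```python
def dense_to_csr(matrix):
    """Convert one 2-D list to CSR, treating every non-zero as a specified element."""
    crow_indices, col_indices, values = [0], [], []
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:
                col_indices.append(col)
                values.append(value)
        # Running count of specified elements up to and including this row.
        crow_indices.append(len(values))
    return crow_indices, col_indices, values


# Hypothetical batch of two 2-by-2 matrices, each with exactly 2 zeros.
x = [[[1, 2], [0, 0]],
     [[0, 3], [4, 0]]]

x_csr = [dense_to_csr(m) for m in x]
nse_per_batch = [len(values) for _, _, values in x_csr]
```

Because nse_per_batch holds the same count for every batch, the per-batch index and value lists have equal lengths and can be stacked into rectangular batched arrays.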
Next, consider a batched tensor y with an unequal number of zeros in its batches. It currently cannot be represented as a batched CSR tensor because the number of zeros differs between batches: 1 and 2, respectively.
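The constraint behind this failure can be sketched in plain Python. The helper names, the values in y, and the error message are all assumptions made for illustration; they mirror the rule described here, not the actual check or message in PyTorch.

```python
def count_nonzeros(matrix):
    """Number of non-zero entries in one 2-D list."""
    return sum(1 for row in matrix for value in row if value != 0)


def check_uniform_nse(batches):
    """Return the shared specified-element count, or raise if batches disagree."""
    counts = [count_nonzeros(m) for m in batches]
    if len(set(counts)) > 1:
        raise ValueError(
            f"batches have different numbers of specified elements: {counts}"
        )
    return counts[0]


# Hypothetical batch: 1 zero in the first matrix, 2 zeros in the second.
y = [[[1, 2], [3, 0]],
     [[4, 0], [0, 5]]]

try:
    check_uniform_nse(y)
    error = None
except ValueError as exc:
    error = str(exc)
```

Here the per-batch counts are 3 and 2, so the check raises and error holds the message.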
Discussion
The following approaches exist to create a batched CSR tensor from batches having unequal numbers of zeros.
Approach 1: allow materialization of certain zeros
Notice that in the conversion of a strided tensor to a CSR tensor, non-zero elements and specified elements are treated as synonyms. If we relax this condition and allow certain zero elements to become specified elements in the CSR representation, the example tensor y defined above can be represented as a batched CSR tensor. In fact, there exist many such representations that differ in the choice of materialized zeros.
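One possible choice of materialized zeros can be sketched as follows. This is an illustration under stated assumptions, not a proposed implementation: the helpers are hypothetical, the values in y are made up, and the policy of materializing the first zeros in row-major order is just one of the many valid choices.

```python
def materialize_zeros(matrix, target_nse):
    """Mark non-zeros as specified, then add zeros (row-major) up to target_nse."""
    mask = [[value != 0 for value in row] for row in matrix]
    missing = target_nse - sum(map(sum, mask))
    for mask_row in mask:
        for col, keep in enumerate(mask_row):
            if missing > 0 and not keep:
                mask_row[col] = True  # this zero becomes a specified element
                missing -= 1
    return mask


def masked_to_csr(matrix, mask):
    """Build CSR from the entries selected by mask (zeros included if selected)."""
    crow_indices, col_indices, values = [0], [], []
    for row, mask_row in zip(matrix, mask):
        for col, (value, keep) in enumerate(zip(row, mask_row)):
            if keep:
                col_indices.append(col)
                values.append(value)
        crow_indices.append(len(values))
    return crow_indices, col_indices, values


# Hypothetical batch with 1 and 2 zeros, respectively.
y = [[[1, 2], [3, 0]],
     [[4, 0], [0, 5]]]

# Pad every batch up to the largest non-zero count found in any batch.
target = max(sum(v != 0 for row in m for v in row) for m in y)
y_csr = [masked_to_csr(m, materialize_zeros(m, target)) for m in y]
```

After padding, both batches carry 3 specified elements, and the second batch stores one explicit zero at position (0, 1). A different policy, such as materializing the zero at (1, 0), gives an equally valid but different representation.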
Pros:
Cons:
Approach 2: allow a variable number of specified elements in batches
A prototype of this approach is implemented at #84843
The example tensor y defined above can be represented uniquely as a batched CSR tensor, where each batch has the expected NSE count.
Pros:
to_sparse_csr() on CUDA tensors increased by 15%
Cons:
compressed_index[..., -1] == nnz becomes compressed_index[..., -1] <= nnz
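The relaxed invariant can be illustrated with a small pure-Python sketch. It assumes that index and value storage is sized by the largest per-batch NSE and that unused trailing slots hold a filler; the layout used by the prototype in #84843 may differ. The helper names and the values in y are hypothetical.

```python
def dense_to_csr(matrix):
    """Convert one 2-D list to CSR, treating every non-zero as a specified element."""
    crow_indices, col_indices, values = [0], [], []
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:
                col_indices.append(col)
                values.append(value)
        crow_indices.append(len(values))
    return crow_indices, col_indices, values


def batched_csr_variable_nse(batches, fill_index=0, fill_value=0):
    """Stack per-batch CSR data, padding storage to the largest NSE (assumed layout)."""
    per_batch = [dense_to_csr(m) for m in batches]
    nnz = max(len(values) for _, _, values in per_batch)  # shared storage width
    crow, col, vals = [], [], []
    for crow_b, col_b, vals_b in per_batch:
        pad = nnz - len(vals_b)
        crow.append(crow_b)
        col.append(col_b + [fill_index] * pad)  # padding slots are never read
        vals.append(vals_b + [fill_value] * pad)
    return crow, col, vals, nnz


# Hypothetical batch with 1 and 2 zeros, respectively.
y = [[[1, 2], [3, 0]],
     [[4, 0], [0, 5]]]

crow, col, vals, nnz = batched_csr_variable_nse(y)
nse_per_batch = [c[-1] for c in crow]            # actual NSE of each batch
invariant_holds = all(c[-1] <= nnz for c in crow)
```

In this layout the last entry of each batch's compressed indices gives that batch's true NSE, so consumers must read it rather than assume it equals the storage width nnz.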
System Info
cc @alexsamardzic @nikitaved @cpuhrsch @amjames @bhosmer