I want to initialize a tensor as a sparse tensor.
When the tensor has 2 dimensions, I can use torch.nn.init.sparse_(tensor, sparsity=0.1):
import torch

dim = torch.Size([3, 2])
w = torch.Tensor(dim)
torch.nn.init.sparse_(w, sparsity=0.1)
Result
tensor([[ 0.0000,  0.0147],
        [-0.0190,  0.0004],
        [-0.0004,  0.0000]])
But when the tensor has more than 2 dimensions, this function doesn't work:
v = torch.Tensor(torch.Size([5, 5, 30, 2]))
torch.nn.init.sparse_(v, sparsity=0.1)
Result
ValueError: Only tensors with 2 dimensions are supported
I need this because I want to use it to initialize the convolution weights.
The definition of the torch.nn.init.sparse_() function is below:
def sparse_(tensor, sparsity, std=0.01):
    r"""Fills the 2D input `Tensor` as a sparse matrix, where the
    non-zero elements will be drawn from the normal distribution
    :math:`\mathcal{N}(0, 0.01)`, as described in `Deep learning via
    Hessian-free optimization` - Martens, J. (2010).

    Args:
        tensor: an n-dimensional `torch.Tensor`
        sparsity: The fraction of elements in each column to be set to zero
        std: the standard deviation of the normal distribution used to generate
            the non-zero values

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.sparse_(w, sparsity=0.1)
    """
    if tensor.ndimension() != 2:
        raise ValueError("Only tensors with 2 dimensions are supported")

    rows, cols = tensor.shape
    num_zeros = int(math.ceil(sparsity * rows))

    with torch.no_grad():
        tensor.normal_(0, std)
        for col_idx in range(cols):
            row_indices = torch.randperm(rows)
            zero_indices = row_indices[:num_zeros]
            tensor[zero_indices, col_idx] = 0
    return tensor
How could I make an n-dimensional sparse tensor?
Is there a way in PyTorch to create this kind of tensor?
Or can I make it another way?
Answer
This function is an implementation of the following method:
The best random initialization scheme we found was one of our own design, “sparse initialization”. In this scheme we hard limit the number of non-zero incoming connection weights to each unit (we used 15 in our experiments) and set the biases to 0 (or 0.5 for tanh units).
- Deep learning via Hessian-free optimization – Martens, J. (2010).
The reason it is not supported for higher-order tensors is that it maintains the same proportion of zeros in each column, and it is not clear across which [subset of] dimensions this condition should be maintained for higher-order tensors.
You can implement this initialization strategy with dropout or an equivalent function, e.g.:
import torch
import torch.nn.functional as F

def sparse_(tensor, sparsity, std=0.01):
    with torch.no_grad():
        tensor.normal_(0, std)
        # F.dropout zeroes ~`sparsity` of the elements and returns a new
        # tensor, so use the return value (survivors are also rescaled).
        tensor = F.dropout(tensor, sparsity)
    return tensor
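For instance, a minimal usage sketch for the convolution case (assuming a Conv2d layer and the sparse_ helper above; the layer shape is only illustrative):

conv = torch.nn.Conv2d(in_channels=5, out_channels=5, kernel_size=3)
with torch.no_grad():
    # copy the sparsified random weights into the layer's parameters
    conv.weight.copy_(sparse_(torch.empty_like(conv.weight), sparsity=0.1))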
If you wish to enforce column-, channel-, etc.-wise proportions of zeros (as opposed to just a total proportion), you can implement logic similar to the original function.
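For reference, here is a minimal sketch of what that could look like. sparse_nd_ is a hypothetical helper (not part of torch.nn.init), and it assumes you want to flatten all dimensions after the first into "columns" and keep the same fraction of zeros in each, mirroring the 2D function above:

import math
import torch

def sparse_nd_(tensor, sparsity, std=0.01):
    # Hypothetical n-dimensional variant: flatten everything after dim 0
    # into "columns", then reuse the per-column zeroing of the 2D version.
    if tensor.ndimension() < 2:
        raise ValueError("Only tensors with 2 or more dimensions are supported")
    with torch.no_grad():
        flat = tensor.view(tensor.shape[0], -1)  # a view, so edits hit `tensor`
        rows, cols = flat.shape
        num_zeros = int(math.ceil(sparsity * rows))
        flat.normal_(0, std)
        for col_idx in range(cols):
            row_indices = torch.randperm(rows)
            flat[row_indices[:num_zeros], col_idx] = 0
    return tensor

# e.g. for a conv-weight-shaped tensor:
w = torch.empty(5, 5, 30, 2)
sparse_nd_(w, sparsity=0.1)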