I want to initialize a tensor as a sparse tensor.
When the tensor has 2 dimensions, I can use torch.nn.init.sparse_(tensor, sparsity=0.1):
import torch
dim = torch.Size([3,2])
w = torch.Tensor(dim)
torch.nn.init.sparse_(w, sparsity=0.1)
Result
tensor([[ 0.0000,  0.0147],
        [-0.0190,  0.0004],
        [-0.0004,  0.0000]])
But when the tensor has more than 2 dimensions, this function doesn't work:
v = torch.Tensor(torch.Size([5,5,30,2]))
torch.nn.init.sparse_(v, sparsity=0.1)
Result
ValueError: Only tensors with 2 dimensions are supported
I need this because I want to use it to initialize the convolution weights.
The definition of torch.nn.init.sparse_() is below:
def sparse_(tensor, sparsity, std=0.01):
    r"""Fills the 2D input `Tensor` as a sparse matrix, where the
    non-zero elements will be drawn from the normal distribution
    :math:`\mathcal{N}(0, 0.01)`, as described in `Deep learning via
    Hessian-free optimization` - Martens, J. (2010).

    Args:
        tensor: an n-dimensional `torch.Tensor`
        sparsity: The fraction of elements in each column to be set to zero
        std: the standard deviation of the normal distribution used to generate
            the non-zero values

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.sparse_(w, sparsity=0.1)
    """
    if tensor.ndimension() != 2:
        raise ValueError("Only tensors with 2 dimensions are supported")

    rows, cols = tensor.shape
    num_zeros = int(math.ceil(sparsity * rows))

    with torch.no_grad():
        tensor.normal_(0, std)
        for col_idx in range(cols):
            row_indices = torch.randperm(rows)
            zero_indices = row_indices[:num_zeros]
            tensor[zero_indices, col_idx] = 0
    return tensor
How could I make an n-dimensional sparse tensor?
Is there a way in PyTorch to create this kind of tensor?
Or can I make it another way?
Answer
This function is an implementation of the following method:
The best random initialization scheme we found was one of our own design, “sparse initialization”. In this scheme we hard limit the number of non-zero incoming connection weights to each unit (we used 15 in our experiments) and set the biases to 0 (or 0.5 for tanh units).
- Deep learning via Hessian-free optimization – Martens, J. (2010).
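For a plain 2-D weight matrix, that quoted scheme itself could be sketched roughly like this (the helper name martens_sparse_ and its defaults are my own, not part of PyTorch):

import torch

def martens_sparse_(weight, num_nonzero=15, std=0.01):
    # keep exactly `num_nonzero` randomly chosen non-zero incoming weights
    # per output unit (row); everything else stays zero
    out_features, in_features = weight.shape
    with torch.no_grad():
        weight.zero_()
        for row in range(out_features):
            keep = torch.randperm(in_features)[:num_nonzero]
            weight[row, keep] = torch.randn(keep.numel()) * std
    # the layer's bias should be set to 0 (or 0.5 for tanh units) separately
    return weight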
The reason it is not supported for higher-order tensors is that it maintains the same proportion of zeros in each column, and it is not clear which [subset of] dimensions this condition should be maintained across for higher-order tensors.
You can implement this initialization strategy with dropout or an equivalent function, e.g.:
import torch
import torch.nn.functional as F

def sparse_(tensor, sparsity, std=0.01):
    with torch.no_grad():
        tensor.normal_(0, std)
        # F.dropout zeroes each element with probability `sparsity` (and rescales
        # the survivors by 1/(1 - sparsity)); copy back so the init is in-place
        tensor.copy_(F.dropout(tensor, sparsity))
    return tensor
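This works for tensors of any shape, including a convolution layer's 4-D weight. For example (the layer sizes here are just illustrative):

conv = torch.nn.Conv2d(in_channels=5, out_channels=30, kernel_size=(5, 2))
sparse_(conv.weight, sparsity=0.1)  # conv.weight has shape (30, 5, 5, 2)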
If you wish to enforce column-, channel-, etc.-wise proportions of zeros (as opposed to just the total proportion), you can implement logic similar to the original function, as in the sketch below.
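As one rough sketch of that idea (the name sparse_nd_ is hypothetical, and treating the first dimension as rows while flattening the rest into columns is just one possible choice):

import torch

def sparse_nd_(tensor, sparsity, std=0.01):
    # view the n-D tensor as 2-D (dim 0 = rows, remaining dims flattened into
    # columns) and reuse the built-in 2-D init; the view shares storage, so the
    # original tensor is modified in place (assumes a contiguous tensor)
    flat = tensor.view(tensor.shape[0], -1)
    torch.nn.init.sparse_(flat, sparsity, std=std)
    return tensor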