numpy genfromtxt collapses recarray when data file has only one row

I am using the genfromtxt function to read data from a CSV file.

data = np.genfromtxt(file_name, dtype=np.dtype(input_vars), delimiter=",")

Then I can access the array columns with e.g.:

data["My column name"]

which then returns a 1-dimensional vector. However, when the source file has exactly one data row, the array is collapsed: its shape is (), so data["My column name"] returns a single value rather than a vector, and some subsequent functions fail because they expect a vector.

What I need is for the result to always be a vector. That is, genfromtxt should not collapse the dimensionality of the array even if the data file has only one row.

For example, if the source data file has two rows, data.shape is (2,). If it has only one row, data.shape is (), but I need it to be (1,). Then, if I am correct, data["My column name"] would return a vector (with a single element) and the subsequent functions would not fail.

How can I do this? data.reshape((1,)) and np.atleast_1d(data) do not work for me, and I am not sure why.

Update:

I made a simple example to illustrate my problem.

Suppose I have two files:

mydata1.csv which is one row:

1,2,3

and mydata2.csv which has two rows:

1,2,3
4,5,6

This is the code snippet (problem described in the comments):

import numpy as np
dt = [("A", "<i4"), ("B", "<i4"), ("C", "<i4")]
data2 = np.genfromtxt("mydata2.csv", dtype=dt, delimiter=",")
print(data2.shape)  # returns (2,)
data1 = np.genfromtxt("mydata1.csv", dtype=dt, delimiter=",")
print(data1.shape)  # returns () but I need it to return (1,)

data2["A"]  # returns a 1D vector with two values
data1["A"]  # returns a value (zero dimensional) but I need a 1D vector with one value

All the workarounds I can come up with are far too ugly and would require too much refactoring. Ideally, genfromtxt would always return a 1-D recarray.

Answer

When the CSV file has only one line, genfromtxt returns a zero-dimensional structured array (a single np.void record) instead of a 1-D array. You can force data to be at least one-dimensional with:

data = np.atleast_1d(data)
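
As a minimal sketch of how this fits the example from the question: the snippet below reads the same one-row and two-row contents from in-memory buffers (io.StringIO stands in for mydata1.csv and mydata2.csv) and passes the result through np.atleast_1d. The helper name load_rows is hypothetical and only used for illustration. Note that the result of np.atleast_1d must be assigned back, since it does not modify the array in place.

```python
import io
import numpy as np

dt = [("A", "<i4"), ("B", "<i4"), ("C", "<i4")]

def load_rows(source):
    """Hypothetical helper: read CSV data and always return a 1-D structured array."""
    data = np.genfromtxt(source, dtype=dt, delimiter=",")
    # For a single data row, genfromtxt yields shape (); promote it to (1,).
    return np.atleast_1d(data)

# In-memory stand-ins for mydata1.csv (one row) and mydata2.csv (two rows).
one_row = load_rows(io.StringIO("1,2,3\n"))
two_rows = load_rows(io.StringIO("1,2,3\n4,5,6\n"))

print(one_row.shape)   # (1,)
print(one_row["A"])    # 1-D array with one element
print(two_rows.shape)  # (2,)
```

With this wrapper, code that indexes by column name sees a 1-D vector in both cases. As far as I know, NumPy 1.23 and later also accept an ndmin argument in genfromtxt (for example ndmin=1), which should give the same result without the extra call; check the documentation for your installed version before relying on it.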