How do you remove corresponding x values of missing y-data from lists?

Question

When I plot this graph the y-axis looks very off. Realising I need to remove the &#8220;&#8221; spaces in the list, I tried this method: This showed me a dimensional error. I don&#8217;t know how to only extract the corresponding values of x to the y values. Answer You can use pandas&#8217; pd.to_numeric(&#82…

Accepted Answer

You can use pandas&#8217; pd.to_numeric(..., errors='coerce') to convert each of the strings in the lists to &#8216;nan&#8217;.  (Numpy&#8217;s np.genfromtxt(np.array(..., dtype=str)) does something similar, but also removes the empty strings).nan values will be skipped while plotting. Matplotlib puts its list of x-values next to the corresponding y-values, e.g. 2011.005, 1.4356 for the first pair and 2012.6543, np.nan for the second. Each pair that has one or two nan values will not be plotted.Here is some example code:import matplotlib.pyplot as pltimport pandas as pdcol1 = [2011.005, 2012.6543, 2013.3456, 2014.7821, 2015.3421, 2016.7891, 2017.0173, 2018.1974]col2 = [1.4356, "", 5.32245, 6.542, 7.567, .77558, "", ""]col3 = [1.3345, 2.345, "", 5.356, 3.124, 6.12, "", ""]col4 = [0.67, 4.235, "", 6.78, "", "", 9.56, ""]col1 = pd.to_numeric(col1, errors='coerce')col2 = pd.to_numeric(col2, errors='coerce')col3 = pd.to_numeric(col3, errors='coerce')col4 = pd.to_numeric(col4, errors='coerce')plt.figure()plt.plot(col1, col2, "b.", linewidth=1, label="Sample 1")plt.plot(col1, col3, "g.", linewidth=1, label="Sample 2")plt.plot(col1, col4, "r.", linewidth=1, label="Sample 3")plt.legend()plt.show()It is unclear how your csv file looks like. The following example supposes the file looks like csv_as_str.  (StringIO is a function that lets you mimic a file with a string, so it is easier to add to a post. Reading from a file would just be df = pd.read_csv('your_file.csv').)import pandas as pdfrom io import StringIOcsv_as_str ='''col1,col2,col3,col42011.005,1.4356,1.3345,0.672012.6543,,2.345,4.2352013.3456,5.32245,,2014.7821,6.542,5.356,6.782015.3421,7.567,3.124,2016.7891,0.77558,6.12,2017.0173,,,9.562018.1974,,,'''df = pd.read_csv(StringIO(csv_as_str))Then the dataframe already has nan for the empty spots:        col1     col2    col3   col40  2011.0050  1.43560  1.3345  0.6701  2012.6543      NaN  2.3450  4.2352  2013.3456  5.32245     NaN    NaN3  2014.7821  6.54200  5.3560  6.7804  2015.3421  7.56700  3.1240    NaN5  2016.7891  0.77558  6.1200    NaN6  2017.0173      NaN     NaN  9.5607  2018.1974      NaN     NaN    NaN

Advertisement

Answer