
Appending new value to the dataframe

print(stocksList.tail(1))
stocksList.loc[len(stocksList.index)] = ["NSEI"]
print(stocksList.tail(1))

The code above prints the same value twice, i.e.

         Symbol
1684  ZUARIGLOB

         Symbol
1684  ZUARIGLOB

Why is it not appending NSEI at the end of the stocksList dataframe?

Full code:

import pandas as pd

folPath = "D:\\MyDocs\\STKS\\YT\\"

nifty50 = pd.read_csv(folPath + "n50.csv")
stocksList = pd.read_csv(folPath + "stocksList.csv")
stocksList = stocksList[~stocksList['Symbol'].isin(nifty50['Symbol'])]
print(stocksList.tail(1))
stocksList.loc[len(stocksList), 'Symbol'] = "NSEI"
print(stocksList.tail(1))
print(stocksList)


Answer

how your code is flawed

Using the length of the index as the label for a new row is not reliable once rows have been filtered out, because the remaining labels no longer run from 0 to len(df) - 1. Here is a simple example demonstrating how it can fail.

input:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'Symbol': list('ABCD')},
...                   index=np.arange(4))
>>> df
  Symbol
0      A
1      B
2      C
3      D

Pre-processing:

>>> bad_symbols = ['A', 'B']
>>> df = df[~df['Symbol'].isin(bad_symbols)]
>>> df
  Symbol
2      C
3      D

Attempt to append a row at the end using index length:

>>> df.loc[len(df.index), 'Symbol'] = 'E'
>>> df
  Symbol
2      E
3      D

See what happened here? len(df.index) is 2, but the label 2 already exists, so the assignment overwrites that row instead of adding a new one.
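
A quick way to confirm this on your own data is to check whether the label you are about to write is already in the index. The following is a minimal sketch on the same toy frame; the variable names (new_label, label_exists, symbols_after) are illustrative only, and the .copy() is added here just to keep the example free of chained-assignment warnings.

```python
import pandas as pd

df = pd.DataFrame({"Symbol": list("ABCD")})
# Filtering keeps the original labels 2 and 3.
df = df[~df["Symbol"].isin(["A", "B"])].copy()

new_label = len(df.index)              # 2
label_exists = new_label in df.index   # True: label 2 is still in use

# Because the label exists, .loc overwrites that row instead of adding one.
df.loc[new_label, "Symbol"] = "E"
symbols_after = df["Symbol"].tolist()  # ['E', 'D']
```

If label_exists is True, the assignment is an update, not an append.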

how to fix it

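Use a method that does not depend on the length of the index. DataFrame.append was removed in pandas 2.0, so use pd.concat instead. Let’s start again from:

>>> df
  Symbol
2      C
3      D
>>> new_row = pd.DataFrame({'Symbol': ['E']}, index=[max(df.index) + 1])
>>> df = pd.concat([df, new_row])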
>>> df
  Symbol
2      C
3      D
4      E

Or, alternatively:

df.loc[max(df.index)+1, 'Symbol'] = 'E'

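but be careful of SettingWithCopyWarning: df is a filtered slice of another dataframe, so pandas may warn on this assignment. Taking an explicit copy when filtering, e.g. df = df[~df['Symbol'].isin(bad_symbols)].copy(), avoids it.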
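
Putting the pieces together, here is a minimal sketch of the workflow from the question, assuming small in-memory frames in place of the CSV files (the sample symbols AAA to DDD are made up for illustration). It takes an explicit copy after filtering, then shows two options: appending after the largest existing label, or resetting the index first so that len() is a free label again.

```python
import pandas as pd

# Hypothetical stand-ins for n50.csv and stocksList.csv.
nifty50 = pd.DataFrame({"Symbol": ["AAA", "BBB"]})
stocks = pd.DataFrame({"Symbol": ["AAA", "BBB", "CCC", "DDD"]})

# .copy() makes the filtered frame independent of the original.
filtered = stocks[~stocks["Symbol"].isin(nifty50["Symbol"])].copy()

# Option 1: keep the old labels and append after the largest one.
by_max = filtered.copy()
by_max.loc[by_max.index.max() + 1, "Symbol"] = "NSEI"

# Option 2: renumber the rows 0..n-1 first, so len() is an unused label.
by_reset = filtered.reset_index(drop=True)
by_reset.loc[len(by_reset), "Symbol"] = "NSEI"
```

Option 2 is probably the simpler choice when the original row labels carry no meaning, as seems to be the case for a plain list of symbols.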

User contributions licensed under: CC BY-SA