I’m reading from a sqlite3 db into a df:
id symbol name 0 1 QCLR Global X Funds Global X NASDAQ 100 Collar 95-1... 1 2 LCW Learn CW Investment Corporation 2 3 BUG Global X Funds Global X Cybersecurity ETF 3 4 LDOS Leidos Holdings, Inc. 4 5 LDP COHEN & STEERS LIMITED DURATION PREFERRED AND ... ... ... ... ... 10999 11000 ERIC Ericsson American Depositary Shares 11000 11001 EDI Virtus Stone Harbor Emerging Markets Total Inc... 11001 11002 EVX VanEck Environmental Services ETF 11002 11003 QCLN First Trust NASDAQ Clean Edge Green Energy Ind... 11003 11004 DTB DTE Energy Company 2020 Series G 4.375% Junior... [11004 rows x 3 columns]
Then I have a symbols.csv
file which I want to use to filter the above df:
AKAM AKRO
Here’s how I’ve tried to do it:
origin_symbols = pd.read_sql_query("SELECT id, symbol, name from stock", conn) mikey_symbols = pd.read_csv("symbols.csv") df = origin_symbols[origin_symbols['symbol'].isin(mikey_symbols)]
But for some reason I only get the first line returned from the csv:
id symbol name 6475 6476 AKAM Akamai Technologies, Inc. Common Stock 1 df
Where am I going wrong here?
Advertisement
Answer
You need convert csv file to Series
, here is added column name and for Series
select it (e.g. by position):
mikey_symbols = pd.read_csv("symbols.csv", names=['tmp']).iloc[:, 0] #or by column name #mikey_symbols = pd.read_csv("symbols.csv", names=['tmp'])['tmp']
And then remove possible traling spaces in both by Series.str.strip
:
df = origin_symbols[origin_symbols['symbol'].str.strip().isin(mikey_symbols.str.strip())]