
Tag: pyspark

How to filter multiple rows based on rows and columns condition in pyspark

I want to filter multiple rows based on the “value” column. For example, I want to keep velocity rows from the channel_name column where value >= 1 and value <= 5, and Temp rows from the channel_name column where value >= 0 and value <= 2. Below is my PySpark DataFrame:

start_timestamp      channel_name  value
2020-11-02 08:51:50  velocity      1
2020-11-02 09:14:29  Temp          0
2020-11-02 09:18:32  velocity      0
2020-11-02 09:32:42  velocity      4

pyspark – How to define MapType for when/otherwise

I have a pyspark DataFrame with a MapType column that either contains the map<string, int> format or is None. I need to perform some calculations using collect_list, but collect_list excludes None values, so I am trying to find a workaround by transforming None to a string, similar to “Include null values in collect_list in pyspark”. However, in my case I can’t

Splitting object data into new columns in dataframe

I have a dataframe with the columns business_id and attributes, with thousands of rows like this: How do I create a new column for each attribute, holding that attribute’s value for the business id? If an attribute is not applicable to a business id, the column should contain false. Example: Note also that some attributes have an object nested inside an object as their value

Using pyspark.sql.functions without sparkContext import problem

I have a situation that can be reduced to an example with two files: filters.py and main.py. It appears that an F.col object cannot be created without an active SparkSession/SparkContext, so the import fails. Is there any way to keep the filters separate from the other files, and how can I import them? My situation is a little more complicated: these filters are used in many
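
A common way around the import problem described above is to defer building the Column until a session exists, by wrapping each filter in a function instead of creating it at module level. The sketch below is an assumption about how filters.py could be laid out, not the original poster's code; the function name is_active and the column name "status" are hypothetical.

```python
# filters.py (sketch): nothing here touches Spark at import time.

def is_active():
    """Return a Column condition; call this only after a SparkSession exists.

    The column name "status" and the value "active" are hypothetical.
    """
    # Importing and calling F.col inside the function delays Column creation
    # until the function is called, by which time main.py has created the session.
    from pyspark.sql import functions as F

    return F.col("status") == "active"


# main.py (sketch) would then do something like:
#   from pyspark.sql import SparkSession
#   from filters import is_active
#   spark = SparkSession.builder.getOrCreate()
#   df.filter(is_active())   # df is whatever DataFrame main.py works with
```

Because the module only defines functions, importing filters.py succeeds whether or not a SparkSession or SparkContext is active.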

extract value from a list of json in pyspark

I have a dataframe where a column is in the form of a list of json. I want to extract a specific value (score) from the column and create independent columns. I want to explode my result dataframe as: Answer: Assuming your json looks like this, you can read it, flatten it, then pivot it like so:
