No FileSystem for scheme: s3 with pyspark

I’m trying to read a txt file from S3 with Spark, but I’m getting this error:

No FileSystem for scheme: s3

This is my code:

This is the full traceback:

How can I fix this?

Answer

If you are working on a local machine, you can read the file with boto3:

(Do not forget to set up your AWS S3 credentials.)
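
A minimal sketch of the boto3 route, assuming boto3 is installed and credentials are already configured. The helper names `split_s3_uri` and `read_s3_text`, and the bucket and key in the usage comment, are hypothetical and only there for illustration:

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/path/to/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("s3", "s3a", "s3n") or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def read_s3_text(uri, encoding="utf-8"):
    """Download one S3 object and return its contents as a string."""
    import boto3  # imported lazily so split_s3_uri works without boto3

    bucket, key = split_s3_uri(uri)
    s3 = boto3.client("s3")  # credentials come from the usual AWS config or env vars
    response = s3.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode(encoding)


# Usage (hypothetical bucket and key, replace with your own):
# text = read_s3_text("s3://your-bucket/your-file.txt")
# print(text.splitlines()[:5])
```

This bypasses Spark's Hadoop filesystem layer entirely, so it suits files small enough to fit in memory on one machine.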

Another clean solution, if you are running on an AWS virtual machine (EC2), is to grant the EC2 instance S3 permissions and launch pyspark with this command:

If you add other packages, make sure each one has the format ‘groupId:artifactId:version’ and that the packages are separated by commas, with no spaces in between.
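
To make the format rule concrete, here is a small sketch that validates the coordinates and builds the `--packages` value. The helper `packages_arg` is hypothetical, and the two coordinates and their versions are assumptions chosen for illustration; the `hadoop-aws` version should match the Hadoop version your Spark build ships with:

```python
import re

# groupId:artifactId:version, each part non-empty, no spaces, commas or extra colons
_COORD = re.compile(r"^[^\s:,]+:[^\s:,]+:[^\s:,]+$")


def packages_arg(coordinates):
    """Validate Maven coordinates and join them for the --packages flag."""
    for coord in coordinates:
        if not _COORD.match(coord):
            raise ValueError(f"expected groupId:artifactId:version, got {coord!r}")
    return ",".join(coordinates)


# Illustrative versions only (assumption): adjust them to your Spark/Hadoop build.
coords = [
    "org.apache.hadoop:hadoop-aws:2.7.3",
    "com.amazonaws:aws-java-sdk:1.7.4",
]
print("pyspark --packages " + packages_arg(coords))
# prints: pyspark --packages org.apache.hadoop:hadoop-aws:2.7.3,com.amazonaws:aws-java-sdk:1.7.4
```

Once the `hadoop-aws` connector is on the classpath, paths are usually written with the `s3a://` scheme rather than `s3://`.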

If you are using pyspark from a Jupyter notebook, this will work:

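
One common way to do this from a notebook, sketched under assumptions: set the `PYSPARK_SUBMIT_ARGS` environment variable before the first Spark context is created in the kernel. The helpers `submit_args` and `read_text`, the package version, and the bucket and key are hypothetical placeholders:

```python
import os


def submit_args(packages):
    """Build the PYSPARK_SUBMIT_ARGS value; it has to end with 'pyspark-shell'."""
    return "--packages " + ",".join(packages) + " pyspark-shell"


# Must run before any SparkContext or SparkSession exists in this kernel.
# The version is an assumption: match hadoop-aws to your Spark build's Hadoop version.
os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args(["org.apache.hadoop:hadoop-aws:2.7.3"])


def read_text(path):
    """Return an RDD with one element per line of the file at `path`."""
    from pyspark.sql import SparkSession  # lazy import: needs pyspark installed

    spark = SparkSession.builder.appName("s3-read").getOrCreate()
    return spark.sparkContext.textFile(path)


# Usage in a later cell (hypothetical bucket and key):
# lines = read_text("s3a://your-bucket/your-file.txt")
# print(lines.take(5))
```

If a Spark context is already running in the notebook, the variable has no effect until the kernel is restarted.
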
User contributions licensed under: CC BY-SA