The following is my PySpark startup snippet, which is pretty reliable (I’ve been using it a long time). Today I added the two Maven Coordinates shown in the spark.jars.packages option (effectively “plugging” in Kafka support). Now that normally triggers dependency downloads (performed by Spark automatically): However the plugins aren’t downloading and/or loading when I run the snippet (e.g. ./python -i
Tag: apache-kafka
Python: how to mock a kafka topic for unit tests?
We have a message scheduler that generates a hash-key from the message attributes before placing it on a Kafka topic queue with the key. This is done for de-duplication purposes. However, I am not sure how I could possibly test this deduplication without actually setting up a local cluster and checking that it is performing as expected. Searching online for