As the heading states, I’m looking for a nice visual way to track the progress of my ES client upload.
I can either use:
for i in tqdm(<my_docs>): es_client.create(...)
but I want to use the recommended (by ES) way:
helpers.bulk(...) <- how to add tqdm here?
Answer
Yes, but instead of using bulk, you need to use streaming_bulk. Unlike bulk, which only returns the final result at the end, streaming_bulk yields a result per action. With this, we can update tqdm after each action.
The code looks more or less like this:
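    import tqdm
    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import streaming_bulk

    # Set up the client
    client = Elasticsearch()

    # Set the total number of documents
    number_of_docs = 100

    progress = tqdm.tqdm(unit="docs", total=number_of_docs)
    successes = 0
    for ok, action in streaming_bulk(
        client=client,
        index="my-index",
        actions=<your_generator_here>,
    ):
        # One result is yielded per action, so advance the bar by one each time
        progress.update(1)
        successes += ok
    print(f"Indexed {successes}/{number_of_docs} documents")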
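The snippet above hard-codes the document count and leaves the action generator as a placeholder. The following is a minimal sketch of one way to fill in those gaps, not a definitive implementation. The helper names (generate_actions, count_results, index_with_progress) are hypothetical, and it assumes the documents fit in memory so that len() can supply the total for tqdm. It also passes raise_on_error=False so that failed actions are yielded as (False, item) instead of raising, which lets the bar run to completion.

```python
def generate_actions(docs):
    """Yield one bulk action per document (hypothetical helper)."""
    for i, doc in enumerate(docs):
        # "_id" and "_source" are standard bulk action keys;
        # using the position as the ID is only an assumption for this sketch.
        yield {"_id": i, "_source": doc}


def count_results(results, progress=None):
    """Consume (ok, item) pairs such as those yielded by streaming_bulk.

    Advances the progress bar once per result and returns the number of
    successes together with the list of failed items.
    """
    successes = 0
    failed = []
    for ok, item in results:
        if progress is not None:
            progress.update(1)
        if ok:
            successes += 1
        else:
            failed.append(item)
    return successes, failed


def index_with_progress(client, index, docs):
    """Index docs with streaming_bulk while showing a tqdm progress bar."""
    # Third-party imports are kept local so the helpers above
    # can be used and tested without a cluster or these packages.
    from elasticsearch.helpers import streaming_bulk
    from tqdm import tqdm

    docs = list(docs)  # assumption: the documents fit in memory
    with tqdm(total=len(docs), unit="docs") as progress:
        results = streaming_bulk(
            client=client,
            index=index,
            actions=generate_actions(docs),
            raise_on_error=False,  # yield failures instead of raising
        )
        return count_results(results, progress)
```

With a configured client, a call could look like successes, failed = index_with_progress(client, "my-index", my_docs). If the documents come from a generator whose length is unknown, the total can be left out and tqdm will show a running count instead of a percentage.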