I have a large collection with fields like:
{ 'class': 'apple' }, { 'class': 'appl' }, { 'class': 'orange', 'nested': [ {'classification': 'app'}, {'classification': 'A', {'classification': 'orang'} ] }, { 'nested': [ {'classification': 'O'}, {'classification': 'unknown'} ] }
I also have a Python dictionary mapping field values like:
{ 'class': { 'apple': 'a', 'appl': 'a', 'orange': 'o' }, 'nested.classification': { 'app': 'a', 'A': 'a', 'orang': 'o', 'O': 'o', 'unknown': 'u' } }
I’m trying to (in PyMongo) update my MongoDB collection so that a string field of mapped characters is accumulated, from both the top-level class
field and the nested nested.classification
fields.
In the above, this would produce the following updates:
{ 'class': 'apple' 'standard': 'a' }, { 'class': 'appl' 'standard': 'a' }, { 'class': 'orange', 'nested': [ {'classification': 'app'}, {'classification': 'A', {'classification': 'orang'} ] 'standard': 'oaao' }, { 'nested': [ {'classification': 'O'}, {'classification': 'unknown'} ] 'standard': 'ou' }
How can I effectively do this at scale? Within an aggregation framework?
Advertisement
Answer
You may get the desired result in 3 steps
Note: MongoDB can only iterate arrays, so we need to transform your dictionaries into {k:"key", v: "value"}
array (we can use $objectToArray, but it’s not worth it)
- We map
class
field by iterating Pythonclass
dictionary - We map
nested classification
values by iterating Pythonnested.classification
dictionary - We concat mapped values into a single value
- (Optional) If you need to persist it, run
$merge
stage
Disclamer: MongoDB >=4.2 + I am not sure if this solution scales good
db.collection.aggregate([ { "$addFields": { standard: { $reduce: { input: [ { k: "apple", v: "a" }, { k: "appl", v: "a" }, { k: "orange", v: "o" } ], initialValue: "", in: { $cond: [ { $eq: ["$$this.k", "$class"] }, "$$this.v", "$$value" ] } } } } }, { "$addFields": { standard: { $reduce: { input: { "$ifNull": [ "$nested", [] ] }, initialValue: [ { v: "$standard" } ], in: { $concatArrays: [ "$$value", { $filter: { input: [ { k: "app", v: "a" }, { k: "A", v: "a" }, { k: "orang", v: "o" }, { k: "O", v: "o" }, { k: "unknown", v: "u" } ], as: "nested", cond: { $eq: [ "$$this.classification", "$$nested.k" ] } } } ] } } } } }, { "$addFields": { "standard": { $reduce: { input: "$standard.v", initialValue: "", in: { "$concat": [ "$$value", "$$this" ] } } } } }, //Optional - If you need to persist it { $merge: { into: "collection", on: "_id", whenMatched: "replace" } } ])