I have a large collection with fields like:
    {
        'class': 'apple'
    },
    {
        'class': 'appl'
    },
    {
        'class': 'orange',
        'nested': [
            {'classification': 'app'},
            {'classification': 'A'},
            {'classification': 'orang'}
        ]
    },
    {
        'nested': [
            {'classification': 'O'},
            {'classification': 'unknown'}
        ]
    }
I also have a Python dictionary mapping field values like:
    {
        'class': {
            'apple': 'a',
            'appl': 'a',
            'orange': 'o'
        },
        'nested.classification': {
            'app': 'a',
            'A': 'a',
            'orang': 'o',
            'O': 'o',
            'unknown': 'u'
        }
    }
Using PyMongo, I'm trying to update my MongoDB collection so that each document gets a string field that accumulates the mapped characters from both the top-level class field and the nested nested.classification fields.
In the above, this would produce the following updates:
    {
        'class': 'apple',
        'standard': 'a'
    },
    {
        'class': 'appl',
        'standard': 'a'
    },
    {
        'class': 'orange',
        'nested': [
            {'classification': 'app'},
            {'classification': 'A'},
            {'classification': 'orang'}
        ],
        'standard': 'oaao'
    },
    {
        'nested': [
            {'classification': 'O'},
            {'classification': 'unknown'}
        ],
        'standard': 'ou'
    }
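To make the intended semantics concrete, here is a minimal plain-Python sketch of the mapping applied to the sample documents. The function name `standard_for` and the rule that unmapped or missing values are skipped are assumptions for illustration; the question does not define what should happen in those cases.

```python
def standard_for(doc, mapping):
    """Return the accumulated 'standard' string for one document."""
    parts = []
    # Top-level 'class' comes first, if it is present and has a mapping.
    cls = doc.get("class")
    if cls in mapping["class"]:
        parts.append(mapping["class"][cls])
    # Then every nested classification, in array order.
    for item in doc.get("nested", []):
        code = mapping["nested.classification"].get(item.get("classification"))
        if code is not None:  # assumption: unmapped values are skipped
            parts.append(code)
    return "".join(parts)


mapping = {
    "class": {"apple": "a", "appl": "a", "orange": "o"},
    "nested.classification": {
        "app": "a", "A": "a", "orang": "o", "O": "o", "unknown": "u",
    },
}

docs = [
    {"class": "apple"},
    {"class": "appl"},
    {"class": "orange", "nested": [
        {"classification": "app"},
        {"classification": "A"},
        {"classification": "orang"},
    ]},
    {"nested": [{"classification": "O"}, {"classification": "unknown"}]},
]

results = [standard_for(d, mapping) for d in docs]
print(results)
```

This runs client-side, one document at a time, which is exactly what I would like to avoid for a large collection.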
How can I do this efficiently at scale, ideally within the aggregation framework?
Answer
You can get the desired result in three steps.
Note: MongoDB can only iterate over arrays, so the Python dictionaries need to be transformed into arrays of {k: "key", v: "value"} entries (we could use $objectToArray, but it's not worth it).
- Map the class field by iterating over the Python class dictionary.
- Map the nested classification values by iterating over the Python nested.classification dictionary.
- Concatenate the mapped values into a single string.
- (Optional) If you need to persist the result, add a $merge stage.
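The dictionary-to-array transformation from the note can be done on the Python side before the pipeline is sent. A minimal sketch, where the helper name `to_kv_array` is hypothetical:

```python
def to_kv_array(d):
    """Turn {'key': 'value', ...} into [{'k': 'key', 'v': 'value'}, ...]."""
    return [{"k": k, "v": v} for k, v in d.items()]


class_map = {"apple": "a", "appl": "a", "orange": "o"}
class_kv = to_kv_array(class_map)
print(class_kv)
```

The resulting list has the same shape as the literal arrays used as `input` in the pipeline.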
Disclaimer: this requires MongoDB >= 4.2, and I am not sure how well this solution scales.
    db.collection.aggregate([
      {
        "$addFields": {
          standard: {
            $reduce: {
              input: [
                { k: "apple", v: "a" },
                { k: "appl", v: "a" },
                { k: "orange", v: "o" }
              ],
              initialValue: "",
              in: {
                $cond: [
                  {
                    $eq: ["$$this.k", "$class"]
                  },
                  "$$this.v",
                  "$$value"
                ]
              }
            }
          }
        }
      },
      {
        "$addFields": {
          standard: {
            $reduce: {
              input: {
                "$ifNull": [ "$nested", [] ]
              },
              initialValue: [ { v: "$standard" } ],
              in: {
                $concatArrays: [
                  "$$value",
                  {
                    $filter: {
                      input: [
                        { k: "app", v: "a" },
                        { k: "A", v: "a" },
                        { k: "orang", v: "o" },
                        { k: "O", v: "o" },
                        { k: "unknown", v: "u" }
                      ],
                      as: "nested",
                      cond: {
                        $eq: [ "$$this.classification", "$$nested.k" ]
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      },
      {
        "$addFields": {
          "standard": {
            $reduce: {
              input: "$standard.v",
              initialValue: "",
              in: {
                "$concat": [ "$$value", "$$this" ]
              }
            }
          }
        }
      },
      // Optional: only if you need to persist the result
      {
        $merge: {
          into: "collection",
          on: "_id",
          whenMatched: "replace"
        }
      }
    ])
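Since the question asks about PyMongo, the same pipeline can be generated from the Python dictionary instead of being written out by hand. This is a sketch only: the function name `build_pipeline` and its `merge_into` parameter are hypothetical, and it assumes none of the mapping keys or values start with `$` (such strings would be read as field paths or variables by the server).

```python
def build_pipeline(mapping, merge_into=None):
    """Build the three $addFields stages (plus an optional $merge) from the mapping."""
    class_kv = [{"k": k, "v": v} for k, v in mapping["class"].items()]
    nested_kv = [{"k": k, "v": v}
                 for k, v in mapping["nested.classification"].items()]
    pipeline = [
        # Step 1: map the top-level 'class' field.
        {"$addFields": {"standard": {"$reduce": {
            "input": class_kv,
            "initialValue": "",
            "in": {"$cond": [{"$eq": ["$$this.k", "$class"]},
                             "$$this.v", "$$value"]},
        }}}},
        # Step 2: collect the matching entries for each nested classification.
        {"$addFields": {"standard": {"$reduce": {
            "input": {"$ifNull": ["$nested", []]},
            "initialValue": [{"v": "$standard"}],
            "in": {"$concatArrays": ["$$value", {"$filter": {
                "input": nested_kv,
                "as": "nested",
                "cond": {"$eq": ["$$this.classification", "$$nested.k"]},
            }}]},
        }}}},
        # Step 3: concatenate the mapped values into one string.
        {"$addFields": {"standard": {"$reduce": {
            "input": "$standard.v",
            "initialValue": "",
            "in": {"$concat": ["$$value", "$$this"]},
        }}}},
    ]
    if merge_into is not None:
        # Optional: persist the result back into the named collection.
        pipeline.append({"$merge": {"into": merge_into, "on": "_id",
                                    "whenMatched": "replace"}})
    return pipeline


mapping = {
    "class": {"apple": "a", "appl": "a", "orange": "o"},
    "nested.classification": {
        "app": "a", "A": "a", "orang": "o", "O": "o", "unknown": "u",
    },
}
pipeline = build_pipeline(mapping, merge_into="collection")
```

The returned list can then be passed to PyMongo's `Collection.aggregate`, for example `db["collection"].aggregate(pipeline)`, where `db` is assumed to be an already connected database handle.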