I am working with Elasticsearch (version 7.16) and Python (version 3.6).
I have the following documents in Elasticsearch:
{"owner": "john", "database": "postgres", "table": "sales_tab"},
{"owner": "hannah", "database": "mongodb", "table": "dept_tab"},
{"owner": "peter", "database": "mysql", "table": "new_tab"},
{"owner": "jim", "database": "postgres", "table": "cust_tab"},
{"owner": "lima", "database": "postgres", "table": "sales_tab"},
{"owner": "tory", "database": "oracle", "table": "store_tab"},
{"owner": "kane", "database": "mysql", "table": "trasit_tab"},
{"owner": "roma", "database": "mongodb", "table": "common_tab"},
{"owner": "ashley", "database": "mongodb", "table": "common_tab"},
With the below query:
{
  "size": 0,
  "aggs": {
    "table_grouped": {
      "terms": {
        "field": "table",
        "size": 100000
      }
    }
  }
}
I get distinct table values, something like below:
{'aggregations': {'table_grouped': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0,
 'buckets': [{'key': 'sales_tab', 'doc_count': 2}, {'key': 'dept_tab', 'doc_count': 1},
 {'key': 'new_tab', 'doc_count': 1}, {'key': 'cust_tab', 'doc_count': 1},
 {'key': 'store_tab', 'doc_count': 1}, {'key': 'trasit_tab', 'doc_count': 1},
 {'key': 'common_tab', 'doc_count': 2}]}}}
But what I actually want is:
{'aggregations': {'table_grouped': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0,
 'buckets': [{'key': 'sales_tab', 'doc_count': 2, 'database': 'postgres'},
 {'key': 'dept_tab', 'doc_count': 1, 'database': 'mongodb'},
 {'key': 'new_tab', 'doc_count': 1, 'database': 'mysql'},
 {'key': 'cust_tab', 'doc_count': 1, 'database': 'postgres'},
 {'key': 'store_tab', 'doc_count': 1, 'database': 'oracle'},
 {'key': 'trasit_tab', 'doc_count': 1, 'database': 'mysql'},
 {'key': 'common_tab', 'doc_count': 2, 'database': 'mongodb'}]}}}
I want to know which database each table comes from. Instead of just {'key': 'sales_tab', 'doc_count': 2}, I would like each bucket to carry an extra database key, like {'key': 'sales_tab', 'doc_count': 2, 'database': 'postgres'}.
Any other solution that returns the distinct tables along with the database each one comes from would also work.
How do I achieve this?
Answer
You can use a sub-aggregation to get the database name for each table, as shown below:
{
  "size": 0,
  "aggs": {
    "table_grouped": {
      "terms": {
        "field": "table",
        "size": 10
      },
      "aggs": {
        "database": {
          "terms": {
            "field": "database",
            "size": 10
          }
        }
      }
    }
  }
}
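Since the question uses Python, the same request body can also be built as a plain dictionary. The sketch below is only an illustration: the helper name `build_grouped_query` and its parameters are hypothetical and not part of any Elasticsearch API. It assumes both fields are mapped in a way that allows a `terms` aggregation on them (for example as `keyword`).

```python
def build_grouped_query(outer_field, inner_field, size=10):
    """Return a search body that groups by outer_field, then by inner_field.

    Hypothetical helper for illustration: it only builds a dict and never
    contacts Elasticsearch.
    """
    return {
        "size": 0,  # skip the hits; only the aggregations are needed
        "aggs": {
            "table_grouped": {
                "terms": {"field": outer_field, "size": size},
                "aggs": {
                    # Sub-aggregation: evaluated inside every outer bucket.
                    "database": {
                        "terms": {"field": inner_field, "size": size}
                    }
                },
            }
        },
    }


query = build_grouped_query("table", "database")
```

With the 7.x Python client, a body like this is typically passed to the `search` method of an `Elasticsearch` client instance through its `body` argument, together with the name of the index that holds these documents. Neither the client setup nor the index name is given in the question, so both are left out here.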
The sub-aggregation query returns a response like the following:
"aggregations": {
  "table_grouped": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "common_tab",
        "doc_count": 2,
        "database": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "mongodb",
              "doc_count": 2
            }
          ]
        }
      },
      {
        "key": "sales_tab",
        "doc_count": 2,
        "database": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "postgres",
              "doc_count": 2
            }
          ]
        }
      },
      {
        "key": "cust_tab",
        "doc_count": 1,
        "database": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "postgres",
              "doc_count": 1
            }
          ]
        }
      },
      {
        "key": "dept_tab",
        "doc_count": 1,
        "database": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "mongodb",
              "doc_count": 1
            }
          ]
        }
      },
      {
        "key": "new_tab",
        "doc_count": 1,
        "database": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "mysql",
              "doc_count": 1
            }
          ]
        }
      },
      {
        "key": "store_tab",
        "doc_count": 1,
        "database": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "oracle",
              "doc_count": 1
            }
          ]
        }
      },
      {
        "key": "trasit_tab",
        "doc_count": 1,
        "database": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "mysql",
              "doc_count": 1
            }
          ]
        }
      }
    ]
  }
}
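To get the flat shape asked for in the question, the nested buckets can be post-processed in Python. The following is a minimal sketch, not a definitive implementation: `flatten_buckets` is a hypothetical helper, and `response` is a hand-written sample shaped like the response above (trimmed to two tables), not real client output. It emits one row per table/database pair, so a table that exists in several databases would yield several rows.

```python
def flatten_buckets(response):
    """Flatten nested table -> database buckets into one row per pair."""
    rows = []
    for table_bucket in response["aggregations"]["table_grouped"]["buckets"]:
        for db_bucket in table_bucket["database"]["buckets"]:
            rows.append({
                "key": table_bucket["key"],
                # Count of documents for this table within this database.
                "doc_count": db_bucket["doc_count"],
                "database": db_bucket["key"],
            })
    return rows


# Hand-written sample mirroring the response structure (assumption: the
# client returns the same nesting as a Python dict).
response = {
    "aggregations": {
        "table_grouped": {
            "buckets": [
                {
                    "key": "common_tab",
                    "doc_count": 2,
                    "database": {"buckets": [{"key": "mongodb", "doc_count": 2}]},
                },
                {
                    "key": "sales_tab",
                    "doc_count": 2,
                    "database": {"buckets": [{"key": "postgres", "doc_count": 2}]},
                },
            ]
        }
    }
}

rows = flatten_buckets(response)
for row in rows:
    print(row)
```

Taking `doc_count` from the inner bucket rather than the outer one keeps the count correct when the same table name appears under more than one database.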