It it possible to make $unwind fast in MongoDB by having an index?
Inside a collection I have objects that contain an array and I would like to look for certain objects within that array without looking at the whole array. The objects in my collection look like this:
{
"transactions": [
{"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
]
}
And I would like to get all the transactions that have the id 17 within the collection. So I created this index:
db.toto.createIndex({'transactions.id': 1})
But to look at only the transactions I want I have to do an $unwind and this unwind is still slow:
db.toto.aggregate(
[
{"$match": {"transactions.id": 17}},
{"$unwind": "$transactions"},
{"$match": {"transactions.id": 17}},
]
)
Gives me
[{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 920, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 446, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 822, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 830, 'id': 17}},
[...]
{'_id': ObjectId('5bf854f885699a394ce5ba89'),
'transactions': {'hello': 301, 'id': 17}},
{'_id': ObjectId('5bf854f985699a394ce5ba8b'),
'transactions': {'hello': 666, 'id': 17}}]
Adding the first $match makes the query slightly faster because it does use the index to find only the objects that contain the transaction I am looking for. But it will not use the index to make the $unwind faster. MongoDB still goes through the whole array that contains 100000 transactions to find the transactions I want.
The query takes 5 seconds to find about 100 objects. While a query like this db.toto.count({"transactions.id": 17})
that does use the index takes less than 0.1 second.
Here is the python file I used to study the issue. You can reproduce the issue by doing:
pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query
mongodb
add a comment |
Inside a collection I have objects that contain an array and I would like to look for certain objects within that array without looking at the whole array. The objects in my collection look like this:
{
"transactions": [
{"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
]
}
And I would like to get all the transactions that have the id 17 within the collection. So I created this index:
db.toto.createIndex({'transactions.id': 1})
But to look at only the transactions I want I have to do an $unwind and this unwind is still slow:
db.toto.aggregate(
[
{"$match": {"transactions.id": 17}},
{"$unwind": "$transactions"},
{"$match": {"transactions.id": 17}},
]
)
Gives me
[{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 920, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 446, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 822, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 830, 'id': 17}},
[...]
{'_id': ObjectId('5bf854f885699a394ce5ba89'),
'transactions': {'hello': 301, 'id': 17}},
{'_id': ObjectId('5bf854f985699a394ce5ba8b'),
'transactions': {'hello': 666, 'id': 17}}]
Adding the first $match makes the query slightly faster because it does use the index to find only the objects that contain the transaction I am looking for. But it will not use the index to make the $unwind faster. MongoDB still goes through the whole array that contains 100000 transactions to find the transactions I want.
The query takes 5 seconds to find about 100 objects. While a query like this db.toto.count({"transactions.id": 17})
that does use the index takes less than 0.1 second.
Here is the python file I used to study the issue. You can reproduce the issue by doing:
pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query
mongodb
1
Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match
– Alex Blex
Nov 26 '18 at 10:25
I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.
– nevare
Nov 26 '18 at 10:31
No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.
– Alex Blex
Nov 26 '18 at 11:42
$unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..
– Chai Halfon
Nov 26 '18 at 17:35
add a comment |
Inside a collection I have objects that contain an array and I would like to look for certain objects within that array without looking at the whole array. The objects in my collection look like this:
{
"transactions": [
{"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
]
}
And I would like to get all the transactions that have the id 17 within the collection. So I created this index:
db.toto.createIndex({'transactions.id': 1})
But to look at only the transactions I want I have to do an $unwind and this unwind is still slow:
db.toto.aggregate(
[
{"$match": {"transactions.id": 17}},
{"$unwind": "$transactions"},
{"$match": {"transactions.id": 17}},
]
)
Gives me
[{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 920, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 446, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 822, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 830, 'id': 17}},
[...]
{'_id': ObjectId('5bf854f885699a394ce5ba89'),
'transactions': {'hello': 301, 'id': 17}},
{'_id': ObjectId('5bf854f985699a394ce5ba8b'),
'transactions': {'hello': 666, 'id': 17}}]
Adding the first $match makes the query slightly faster because it does use the index to find only the objects that contain the transaction I am looking for. But it will not use the index to make the $unwind faster. MongoDB still goes through the whole array that contains 100000 transactions to find the transactions I want.
The query takes 5 seconds to find about 100 objects. While a query like this db.toto.count({"transactions.id": 17})
that does use the index takes less than 0.1 second.
Here is the python file I used to study the issue. You can reproduce the issue by doing:
pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query
mongodb
Inside a collection I have objects that contain an array and I would like to look for certain objects within that array without looking at the whole array. The objects in my collection look like this:
{
"transactions": [
{"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
]
}
And I would like to get all the transactions that have the id 17 within the collection. So I created this index:
db.toto.createIndex({'transactions.id': 1})
But to look at only the transactions I want I have to do an $unwind and this unwind is still slow:
db.toto.aggregate(
[
{"$match": {"transactions.id": 17}},
{"$unwind": "$transactions"},
{"$match": {"transactions.id": 17}},
]
)
Gives me
[{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 920, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 446, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 822, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 830, 'id': 17}},
[...]
{'_id': ObjectId('5bf854f885699a394ce5ba89'),
'transactions': {'hello': 301, 'id': 17}},
{'_id': ObjectId('5bf854f985699a394ce5ba8b'),
'transactions': {'hello': 666, 'id': 17}}]
Adding the first $match makes the query slightly faster because it does use the index to find only the objects that contain the transaction I am looking for. But it will not use the index to make the $unwind faster. MongoDB still goes through the whole array that contains 100000 transactions to find the transactions I want.
The query takes 5 seconds to find about 100 objects. While a query like this db.toto.count({"transactions.id": 17})
that does use the index takes less than 0.1 second.
Here is the python file I used to study the issue. You can reproduce the issue by doing:
pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query
mongodb
mongodb
asked Nov 26 '18 at 10:16
nevarenevare
614
614
1
Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match
– Alex Blex
Nov 26 '18 at 10:25
I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.
– nevare
Nov 26 '18 at 10:31
No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.
– Alex Blex
Nov 26 '18 at 11:42
$unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..
– Chai Halfon
Nov 26 '18 at 17:35
add a comment |
1
Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match
– Alex Blex
Nov 26 '18 at 10:25
I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.
– nevare
Nov 26 '18 at 10:31
No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.
– Alex Blex
Nov 26 '18 at 11:42
$unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..
– Chai Halfon
Nov 26 '18 at 17:35
1
1
Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match
– Alex Blex
Nov 26 '18 at 10:25
Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match
– Alex Blex
Nov 26 '18 at 10:25
I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.
– nevare
Nov 26 '18 at 10:31
I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.
– nevare
Nov 26 '18 at 10:31
No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.
– Alex Blex
Nov 26 '18 at 11:42
No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.
– Alex Blex
Nov 26 '18 at 11:42
$unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..
– Chai Halfon
Nov 26 '18 at 17:35
$unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..
– Chai Halfon
Nov 26 '18 at 17:35
add a comment |
0
active
oldest
votes
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53478941%2fit-it-possible-to-make-unwind-fast-in-mongodb-by-having-an-index%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
0
active
oldest
votes
0
active
oldest
votes
active
oldest
votes
active
oldest
votes
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53478941%2fit-it-possible-to-make-unwind-fast-in-mongodb-by-having-an-index%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match
– Alex Blex
Nov 26 '18 at 10:25
I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.
– nevare
Nov 26 '18 at 10:31
No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.
– Alex Blex
Nov 26 '18 at 11:42
$unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..
– Chai Halfon
Nov 26 '18 at 17:35