It it possible to make $unwind fast in MongoDB by having an index?












2















Inside a collection I have objects that contain an array and I would like to look for certain objects within that array without looking at the whole array. The objects in my collection look like this:



{
"transactions": [
{"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
]
}


And I would like to get all the transactions that have the id 17 within the collection. So I created this index:



db.toto.createIndex({'transactions.id': 1})


But to look at only the transactions I want I have to do an $unwind and this unwind is still slow:



db.toto.aggregate(
[
{"$match": {"transactions.id": 17}},
{"$unwind": "$transactions"},
{"$match": {"transactions.id": 17}},
]
)


Gives me



    [{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 920, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 446, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 822, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 830, 'id': 17}},
[...]
{'_id': ObjectId('5bf854f885699a394ce5ba89'),
'transactions': {'hello': 301, 'id': 17}},
{'_id': ObjectId('5bf854f985699a394ce5ba8b'),
'transactions': {'hello': 666, 'id': 17}}]


Adding the first $match makes the query slightly faster because it does use the index to find only the objects that contain the transaction I am looking for. But it will not use the index to make the $unwind faster. MongoDB still goes through the whole array that contains 100000 transactions to find the transactions I want.



The query takes 5 seconds to find about 100 objects. While a query like this db.toto.count({"transactions.id": 17}) that does use the index takes less than 0.1 second.



Here is the python file I used to study the issue. You can reproduce the issue by doing:



pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query









share|improve this question


















  • 1





    Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match

    – Alex Blex
    Nov 26 '18 at 10:25











  • I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.

    – nevare
    Nov 26 '18 at 10:31













  • No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.

    – Alex Blex
    Nov 26 '18 at 11:42











  • $unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..

    – Chai Halfon
    Nov 26 '18 at 17:35


















2















Inside a collection I have objects that contain an array and I would like to look for certain objects within that array without looking at the whole array. The objects in my collection look like this:



{
"transactions": [
{"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
]
}


And I would like to get all the transactions that have the id 17 within the collection. So I created this index:



db.toto.createIndex({'transactions.id': 1})


But to look at only the transactions I want I have to do an $unwind and this unwind is still slow:



db.toto.aggregate(
[
{"$match": {"transactions.id": 17}},
{"$unwind": "$transactions"},
{"$match": {"transactions.id": 17}},
]
)


Gives me



    [{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 920, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 446, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 822, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 830, 'id': 17}},
[...]
{'_id': ObjectId('5bf854f885699a394ce5ba89'),
'transactions': {'hello': 301, 'id': 17}},
{'_id': ObjectId('5bf854f985699a394ce5ba8b'),
'transactions': {'hello': 666, 'id': 17}}]


Adding the first $match makes the query slightly faster because it does use the index to find only the objects that contain the transaction I am looking for. But it will not use the index to make the $unwind faster. MongoDB still goes through the whole array that contains 100000 transactions to find the transactions I want.



The query takes 5 seconds to find about 100 objects. While a query like this db.toto.count({"transactions.id": 17}) that does use the index takes less than 0.1 second.



Here is the python file I used to study the issue. You can reproduce the issue by doing:



pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query









share|improve this question


















  • 1





    Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match

    – Alex Blex
    Nov 26 '18 at 10:25











  • I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.

    – nevare
    Nov 26 '18 at 10:31













  • No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.

    – Alex Blex
    Nov 26 '18 at 11:42











  • $unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..

    – Chai Halfon
    Nov 26 '18 at 17:35
















2












2








2


2






Inside a collection I have objects that contain an array and I would like to look for certain objects within that array without looking at the whole array. The objects in my collection look like this:



{
"transactions": [
{"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
]
}


And I would like to get all the transactions that have the id 17 within the collection. So I created this index:



db.toto.createIndex({'transactions.id': 1})


But to look at only the transactions I want I have to do an $unwind and this unwind is still slow:



db.toto.aggregate(
[
{"$match": {"transactions.id": 17}},
{"$unwind": "$transactions"},
{"$match": {"transactions.id": 17}},
]
)


Gives me



    [{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 920, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 446, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 822, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 830, 'id': 17}},
[...]
{'_id': ObjectId('5bf854f885699a394ce5ba89'),
'transactions': {'hello': 301, 'id': 17}},
{'_id': ObjectId('5bf854f985699a394ce5ba8b'),
'transactions': {'hello': 666, 'id': 17}}]


Adding the first $match makes the query slightly faster because it does use the index to find only the objects that contain the transaction I am looking for. But it will not use the index to make the $unwind faster. MongoDB still goes through the whole array that contains 100000 transactions to find the transactions I want.



The query takes 5 seconds to find about 100 objects. While a query like this db.toto.count({"transactions.id": 17}) that does use the index takes less than 0.1 second.



Here is the python file I used to study the issue. You can reproduce the issue by doing:



pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query









share|improve this question














Inside a collection I have objects that contain an array and I would like to look for certain objects within that array without looking at the whole array. The objects in my collection look like this:



{
"transactions": [
{"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
]
}


And I would like to get all the transactions that have the id 17 within the collection. So I created this index:



db.toto.createIndex({'transactions.id': 1})


But to look at only the transactions I want I have to do an $unwind and this unwind is still slow:



db.toto.aggregate(
[
{"$match": {"transactions.id": 17}},
{"$unwind": "$transactions"},
{"$match": {"transactions.id": 17}},
]
)


Gives me



    [{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 920, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba82'),
'transactions': {'hello': 446, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 822, 'id': 17}},
{'_id': ObjectId('5bf854f685699a394ce5ba84'),
'transactions': {'hello': 830, 'id': 17}},
[...]
{'_id': ObjectId('5bf854f885699a394ce5ba89'),
'transactions': {'hello': 301, 'id': 17}},
{'_id': ObjectId('5bf854f985699a394ce5ba8b'),
'transactions': {'hello': 666, 'id': 17}}]


Adding the first $match makes the query slightly faster because it does use the index to find only the objects that contain the transaction I am looking for. But it will not use the index to make the $unwind faster. MongoDB still goes through the whole array that contains 100000 transactions to find the transactions I want.



The query takes 5 seconds to find about 100 objects. While a query like this db.toto.count({"transactions.id": 17}) that does use the index takes less than 0.1 second.



Here is the python file I used to study the issue. You can reproduce the issue by doing:



pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query






mongodb






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 26 '18 at 10:16









nevarenevare

614




614








  • 1





    Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match

    – Alex Blex
    Nov 26 '18 at 10:25











  • I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.

    – nevare
    Nov 26 '18 at 10:31













  • No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.

    – Alex Blex
    Nov 26 '18 at 11:42











  • $unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..

    – Chai Halfon
    Nov 26 '18 at 17:35
















  • 1





    Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match

    – Alex Blex
    Nov 26 '18 at 10:25











  • I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.

    – nevare
    Nov 26 '18 at 10:31













  • No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.

    – Alex Blex
    Nov 26 '18 at 11:42











  • $unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..

    – Chai Halfon
    Nov 26 '18 at 17:35










1




1





Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match

– Alex Blex
Nov 26 '18 at 10:25





Indexes can be used only on the first stage and only for some stages. Try if docs.mongodb.com/manual/reference/operator/aggregation/filter is any faster than unwind + match

– Alex Blex
Nov 26 '18 at 10:25













I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.

– nevare
Nov 26 '18 at 10:31







I did use a filter too and it is slightly faster but it still takes more that a second to finish the query so it clearly does not use the index either. The query is in the python file.

– nevare
Nov 26 '18 at 10:31















No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.

– Alex Blex
Nov 26 '18 at 11:42





No it doesn't and won't do in foreseeable future: docs.mongodb.com/manual/core/aggregation-pipeline/… says "Only $match and $sort can benefit from indexes and only when used in first stage". It's db side, doesn't matter which client driver/language you use.

– Alex Blex
Nov 26 '18 at 11:42













$unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..

– Chai Halfon
Nov 26 '18 at 17:35







$unwind will just iterate on all the elements in an array for each document, and will expend thrm to to different documents. Index cannot help you here, the only thong you can do is (as mentioned above) filter the results as much as possible before the unwind stage, or consider storing your data in a different format (not in arrays)..

– Chai Halfon
Nov 26 '18 at 17:35














0






active

oldest

votes












Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53478941%2fit-it-possible-to-make-unwind-fast-in-mongodb-by-having-an-index%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























0






active

oldest

votes








0






active

oldest

votes









active

oldest

votes






active

oldest

votes
















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53478941%2fit-it-possible-to-make-unwind-fast-in-mongodb-by-having-an-index%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Wiesbaden

Marschland

Dieringhausen