Caching results from a web API using with open
I am using a lot of data returned from a web API. The function below calls the API 22 times, decodes the responses, and loads the JSON into Python objects. I then store the results in one big list of 22 pages, each holding 100 art objects.
    fourteen_list = return_14th_century_works_list()
To limit the number of API calls, I want to build a function that stores this list in a file if the file is not present, and loads it from disk when it is. I came up with the following:
    with open('fourteenth_century_list.txt', 'w') as fourteenth_century_file:
        print(fourteen_list, file=fourteenth_century_file)

    try:
        with open('fourteenth_century_list.txt', 'r') as fourteenth_century_file:
            fourteenth_list_cache = fourteenth_century_file.read()
        count_objects(fourteenth_list_cache)
    except FileNotFoundError:
        fourteen_list = return_14th_century_works_list()  # Calls API again
        count_objects(fourteen_list)
I use the count_objects function to check whether everything still works, but the data read in the try block does not seem to come back in the same form I saved it in: when I run this code, the count_objects call in the try block raises a TypeError. To me this indicates that the data loaded from disk is in a somewhat different format than the data loaded directly from the API.
When I call count_objects() with the non-cached version of my list (fourteen_list in this case), it works fine.
Does with open(filename, 'w') followed by with open(filename, 'r') mutate your data, and if not, what am I doing wrong here?
python api caching
correct the indentation first please :-)
– Nimish Bansal
Nov 22 '18 at 20:42
@NimishBansal mb
– Psychotechnopath
Nov 22 '18 at 20:43
using with to open the file ensures that file will be closed as soon as with block ends
– Nimish Bansal
Nov 22 '18 at 20:44
What are the types of the objects held by the list - are they dicts?
– Will Keeling
Nov 22 '18 at 21:43
@WillKeeling Yes, they are dicts. The API returns pages of JSON data; when I check it in my browser the data looks like a dictionary, and after I deserialize it (with data.decode('utf-8') and then json.loads on the decoded data), a type check on the pages returns 'dict'.
– Psychotechnopath
Nov 23 '18 at 19:02
asked Nov 22 '18 at 20:40 by Psychotechnopath
edited Nov 22 '18 at 20:43 by Psychotechnopath
1 Answer
The issue here is that when you print the list of dictionaries to a file, you create a string representation of the list. You then read that string back and pass it to count_objects(), but that falls over because it expects a list of dictionaries, not a big string.
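A quick way to see this for yourself is the sketch below. It is only an illustration: the sample data is made up, and io.StringIO stands in for the real text file so nothing is written to disk.

```python
import io

# Hypothetical sample data standing in for one page of API results
pages = [{"title": "Example work", "year": 1350}]

# print() writes str(pages) to the file-like object, just as it would to a real file
buffer = io.StringIO()
print(pages, file=buffer)

# Reading it back yields one long string, not a list of dicts
text = buffer.getvalue()
print(type(text).__name__)  # str
print(text[0])              # [  (a single character, not the first dict)
```

Any function that indexes into the result expecting dictionaries will therefore receive single characters instead.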
Rather than printing, a better approach would be to serialize the list back to JSON, which would preserve its structure. You also want to write the list to the cache in the except block, after you've retrieved the data from the API.
    import json

    try:
        with open('fourteenth_century_list.json', 'r') as fourteenth_century_file:
            fourteenth_list_cache = json.load(fourteenth_century_file)
        count_objects(fourteenth_list_cache)
    except FileNotFoundError:
        # Cache miss: call the API again
        fourteen_list = return_14th_century_works_list()
        count_objects(fourteen_list)
        # Cache the API data for the next run
        with open('fourteenth_century_list.json', 'w') as fourteenth_century_file:
            json.dump(fourteen_list, fourteenth_century_file)
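If the same pattern is needed in more than one place, it could be wrapped in a small helper. The following is a minimal sketch, not part of the original answer: load_or_fetch and fake_fetch are hypothetical names, fake_fetch stands in for return_14th_century_works_list(), and a temporary directory is used so the example leaves no files behind.

```python
import json
import tempfile
from pathlib import Path


def load_or_fetch(cache_path, fetch):
    """Return the cached JSON data if the file exists; otherwise call fetch() and cache the result."""
    path = Path(cache_path)
    try:
        with path.open("r", encoding="utf-8") as cache_file:
            return json.load(cache_file)
    except FileNotFoundError:
        data = fetch()
        with path.open("w", encoding="utf-8") as cache_file:
            json.dump(data, cache_file)
        return data


# Hypothetical stand-in for the real API call; records how often it runs
calls = []


def fake_fetch():
    calls.append(1)
    return [{"title": "Example work", "year": 1350}]


with tempfile.TemporaryDirectory() as tmp_dir:
    cache = Path(tmp_dir) / "fourteenth_century_list.json"
    first = load_or_fetch(cache, fake_fetch)   # file missing: fake_fetch is called
    second = load_or_fetch(cache, fake_fetch)  # file present: data comes from the cache
```

With the real function this would be called as load_or_fetch('fourteenth_century_list.json', return_14th_century_works_list). Note that this only works as long as everything the API returns is JSON-serializable, which should hold for data that was parsed from JSON in the first place.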
So when printing Python data structures to a file opened using with open, are they always converted to a string representation? Would it have been better to use something like shelve here? It is indeed much more logical to only cache after trying to open; that was a thinking mistake on my part (I am still learning). Thanks for this very comprehensive answer sir =)
– Psychotechnopath
Nov 23 '18 at 21:22
Yes, print will just output the string representation of whatever is passed to it. Using shelve (or pickle) would also work, although keeping the format as JSON means the file is human-readable should you ever need to edit it with a text editor.
– Will Keeling
Nov 23 '18 at 21:28
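For comparison, the pickle route mentioned in the comment above might look roughly like this. It is a sketch with made-up sample data and a hypothetical file name, again written to a temporary directory.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical sample data standing in for the API results
pages = [{"title": "Example work", "year": 1350}]

with tempfile.TemporaryDirectory() as tmp_dir:
    cache = Path(tmp_dir) / "fourteenth_century_list.pickle"

    # pickle reads and writes bytes, so the file must be opened in binary mode
    with cache.open("wb") as cache_file:
        pickle.dump(pages, cache_file)

    with cache.open("rb") as cache_file:
        restored = pickle.load(cache_file)
```

The restored object is a list of dicts again, but the file is not human-readable, and pickle files should only be loaded when they come from a source you trust.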
answered Nov 23 '18 at 20:24 by Will Keeling