How easy is it to cause an out-of-memory error in Python?
I'm doing a spelling bee program in Python using pygame, and it works fine, but I have only tested it with 7 words.
I'm worried that, if it is used with 300 words, it might fill up the memory.
Remember there are 2 arrays: one holds the default list of words, and the other holds the randomized words.
python memory out-of-memory
asked Nov 24 '18 at 17:14
Gabriel Mation
It would likely depend on the computer it's being run on. 300 strings alone isn't going to cause memory problems though unless you're running it on a potato.
– Carcigenicate
Nov 24 '18 at 17:17
It's easy to use all available memory: a = 'a' * 100000000000000000. But 300 words is not going to cause a problem on any modern system.
– juanpa.arrivillaga
Nov 24 '18 at 17:20
300 words will not use up all your memory, no.
– Martijn Pieters♦
Nov 24 '18 at 17:20
1) Remember there are 600, counting both the normal list and the randomized list.
– Gabriel Mation
Nov 24 '18 at 17:20
2) Let's say it has 8 GB of RAM.
– Gabriel Mation
Nov 24 '18 at 17:21
2 Answers
One good way to find out is to try it.
You can put a line midway through your program to print out how much memory it is using:
import os
import psutil
process = psutil.Process(os.getpid())
print(process.memory_info().rss)
Try running your program with different numbers of words and plotting the results.
Then you can predict how many words it would take to use up all your memory.
A few other points to keep in mind:
- If you are using 32-bit Python, your total memory is limited by the 32-bit address space to at most about 4 GB.
- Your computer likely uses the disk to extend virtual memory beyond the RAM size. So even if you only have 1 GB of RAM, you might find you can use 3 GB of memory in your program.
- For small lists of words like yours, you will almost never run out of memory unless your program has a bug. In my experience, running out of memory (which Python reports as a MemoryError) is almost always because I made a mistake.
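When memory really does run out, Python raises the built-in MemoryError exception. A minimal sketch of what that looks like, assuming CPython; the helper name try_allocate is made up for this illustration, and the oversized request mirrors the 'a' * 100000000000000000 example from the comments on the question:

```python
def try_allocate(n_chars):
    """Return True if a string of n_chars ASCII characters could be allocated."""
    try:
        big = 'a' * n_chars  # roughly one byte per character plus a small header
    except (MemoryError, OverflowError):
        # MemoryError: the allocator could not satisfy the request.
        # OverflowError: on a 32-bit build the length does not even fit an index.
        return False
    del big  # release the string again
    return True

# A request on the scale of 300 short words is trivial ...
print(try_allocate(300 * 10))
# ... while 10**17 characters (about 100 petabytes) cannot be allocated.
print(try_allocate(10 ** 17))
```

The first call should succeed and the second should fail, which shows how far a 300-word list is from the point where allocation breaks down.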
answered Nov 24 '18 at 17:24
Owen
OS memory allocation is not a good way to measure this, as that happens in chunks and Python uses a heap model (meaning it'll request larger blocks).
– Martijn Pieters♦
Nov 24 '18 at 17:25
Instead, use tracemalloc snapshots.
– Martijn Pieters♦
Nov 24 '18 at 17:26
So how can I measure it?
– Gabriel Mation
Nov 24 '18 at 17:26
OK, I'll see if it works.
– Gabriel Mation
Nov 24 '18 at 17:26
@MartijnPieters That's a good point. Of course, if you are close to using all of your RAM, OS memory usage will be a decent approximation.
– Owen
Nov 24 '18 at 17:26
You really do not need to worry. Python is not such a memory hog as to cause issues with a mere 600 words.
With a bit of care, you can measure memory requirements directly. The sys.getsizeof() function lets you measure the direct memory requirements of a given Python object (only direct memory, not anything that it references!). You could use this to measure individual strings:
>>> import sys
>>> sys.getsizeof("Hello!")
55
>>> sys.getsizeof("memoryfootprint")
64
Exact sizes depend on the Python version and your OS. A Python string object needs a base amount of memory for bookkeeping information, and then 1, 2 or 4 bytes per character, depending on the highest Unicode code point in the string. For ASCII, that's just one byte per letter. Python 3.7 on my Mac OS X system uses 49 bytes for the bookkeeping portion.
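The 1, 2 or 4 bytes per character can be checked by growing a string and watching how much sys.getsizeof() increases. This is only a sketch, assuming CPython 3.3 or later; the helper name bytes_per_char and the sample characters are chosen for this illustration:

```python
import sys

def bytes_per_char(ch, n=10):
    """Storage cost per character, measured by growing a string by n characters."""
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

samples = [
    ("ASCII", "a"),
    ("Latin-1", "\u00e9"),      # e with acute accent
    ("BMP", "\u20ac"),          # euro sign
    ("astral", "\U0001F600"),   # an emoji outside the Basic Multilingual Plane
]
for label, ch in samples:
    print(label, bytes_per_char(ch))

# The fixed bookkeeping cost of an ASCII string on this interpreter;
# the exact number varies between Python versions and platforms.
print(sys.getsizeof("Hello!") - len("Hello!"))
```

For a spelling bee with ordinary English words, every word falls in the one-byte-per-letter case.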
Getting the size of a Python list object means you get just the list object memory requirements, not anything that's stored 'in' the list. You can repeatedly add the same object to a list and you'd not get copies, because Python uses references for everything, including list contents. Take that into account.
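A short sketch of what "references, not copies" means for a shuffled second list. The generated words here are stand-ins for a real word list, and random.sample is just one possible way to build the randomized list:

```python
import random
import sys

# stand-in word list: 300 random 9-letter strings
words = ["".join(random.choices("abcdefghij", k=9)) for _ in range(300)]
shuffled = random.sample(words, len(words))  # a new list holding the same string objects

# Every element of the shuffled list is one of the original objects, not a copy.
original_ids = {id(w) for w in words}
print(all(id(w) in original_ids for w in shuffled))

# A list's own size depends on how many references it holds, not on what they point to.
print(sys.getsizeof([None] * 300) == sys.getsizeof(["x" * 1000] * 300))
```

Both checks should print True: the second list costs only the space for 300 extra references.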
So let's load 300 random words and create two lists, to see what the memory needs will be:
>>> import random
>>> words = list(map(str.strip, open('/usr/share/dict/words'))) # big file of words, present on many computers
>>> words = random.sample(words, 300) # just use 300
>>> words[:10]
['fourer', 'tampon', 'Minyadidae', 'digallic', 'euploid', 'Mograbi', 'sketchbook', 'annul', 'ambilogy', 'outtalent']
>>> import statistics
>>> statistics.mean(map(len, words))
9.346666666666666
>>> statistics.median(map(len, words))
9.0
>>> statistics.mode(map(len, words))
10
>>> sys.getsizeof(words)
2464
>>> sum(sys.getsizeof(word) for word in words)
17504
That's one list of 300 unique words with an average length of just over 9 characters; it required 2464 bytes for the list and 17504 bytes for the words themselves. That's just under 20 KB.
But, you say, you have 2 lists. That second list will not hold copies of your words, just more references to the existing words, so it will take only another 2464 bytes, or about 2.4 KB.
For 300 random English words in two lists, your total memory requirements are around 22 KB.
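The same arithmetic can be wrapped in a small helper that counts each list once and each distinct string once, however many lists refer to it. This is a sketch under assumptions: total_size is a hypothetical helper, and the generated words stand in for a real word list, so the byte counts will differ from the figures above:

```python
import random
import sys

def total_size(*lists):
    """Sum the lists' own sizes plus the size of each distinct element, counted once."""
    seen = {}
    for lst in lists:
        for item in lst:
            seen[id(item)] = item  # de-duplicate by identity, not by equality
    return (sum(sys.getsizeof(lst) for lst in lists)
            + sum(sys.getsizeof(item) for item in seen.values()))

words = ["word{:05d}".format(i) for i in range(300)]  # stand-in word list
shuffled = random.sample(words, len(words))           # second list, same strings

one = total_size(words)
both = total_size(words, shuffled)
print(one, both, both - one)
```

The difference between the two totals is exactly the size of the second list object, because the strings are shared.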
On an 8 GB machine, you will not have any problems. Note that I loaded the whole words file into memory in one go, and then cut that back to 300 random words. Here is how much memory that whole initial list requires:
>>> words = list(map(str.strip, open('/usr/share/dict/words')))
>>> len(words)
235886
>>> sum(sys.getsizeof(word) for word in words)
13815637
>>> sys.getsizeof(words)
2007112
That's about 15MB of memory, for close to 236 thousand words.
If you are worried about larger programs with more objects, you can also use the tracemalloc library to get statistics about memory use:
import tracemalloc

last = None

def display_memory_change(msg):
    """Print total traced memory and the change since the previous snapshot."""
    global last
    snap = tracemalloc.take_snapshot()
    # compare with the previous snapshot, grouped by filename, then remember this one
    statdiff, last = snap.compare_to(last, 'filename', True), snap
    tot = sum(s.size for s in statdiff)
    change = sum(s.size_diff for s in statdiff)
    print('{:>20} (Tot: {:6.1f} MiB, Inc: {:6.1f} MiB)'.format(
        msg, tot / 2 ** 20, change / 2 ** 20))

# at the start, get a baseline
tracemalloc.start()
last = tracemalloc.take_snapshot()
# create objects, run more code, etc.
display_memory_change("Some message as to what has been done")
# run some more code.
display_memory_change("Show some more statistics")
Using the above code to measure reading all those words:
tracemalloc.start()
last = tracemalloc.take_snapshot()
display_memory_change("Baseline")
words = list(map(str.strip, open('/usr/share/dict/words')))
display_memory_change("Loaded words list")
gives an output of
            Baseline (Tot:    0.0 MiB, Inc:    0.0 MiB)
   Loaded words list (Tot:   15.1 MiB, Inc:   15.1 MiB)
confirming my sys.getsizeof() measurements.
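For a quick check without snapshots, tracemalloc.get_traced_memory() returns the current and peak traced sizes in bytes. A minimal sketch, assuming a generated stand-in word list rather than a real one, so the numbers will not match those above:

```python
import tracemalloc

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

words = ["word{:05d}".format(i) for i in range(300)]  # stand-in word list
shuffled = list(reversed(words))                      # second list, same strings

after, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("grew by {:.1f} KiB (peak {:.1f} KiB)".format(
    (after - before) / 1024, peak / 1024))
```

On a typical machine the growth should be on the order of tens of KiB, in line with the estimate for 300 words in two lists.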
Very useful too.
– Gabriel Mation
Nov 24 '18 at 18:03
If I could choose both, I would.
– Gabriel Mation
Nov 24 '18 at 18:03
Sorry, you can only mark one as the accepted answer. The choice is entirely yours, and not picking one is also an option.
– Martijn Pieters♦
Nov 24 '18 at 18:07
add a comment |
You really do not need to worry. Python is not such a memory hog as to cause issues with a mere 600 words.
With a bit of care, you can measure memory requirements directly. The sys.getsizeof() function lets you measure the direct memory requirements of a given Python object (only direct memory, not anything that it references!). You could use this to measure individual strings:
>>> import sys
>>> sys.getsizeof("Hello!")
55
>>> sys.getsizeof("memoryfootprint")
64
Exact sizes depend on the Python version and your OS. A Python string object needs a base amount of memory for a lot of book-keeping information, and then 1, 2 or 4 bytes per character, depending on the highest Unicode code point. For ASCII, that's just one byte per letter. Python 3.7, on my Mac OS X system uses 49 bytes for the bookkeeping portion.
Getting the size of a Python list object means you get just the list object memory requirements, not anything that's stored 'in' the list. You can repeatedly add the same object to a list and you'd not get copies, because Python uses references for everything, including list contents. Take that into account.
So lets load 300 random words, and create two lists, to see what the memory needs will be:
>>> import random
>>> words = list(map(str.strip, open('/usr/share/dict/words'))) # big file of words, present on many computers
>>> words = random.sample(words, 300) # just use 300
>>> words[:10]
['fourer', 'tampon', 'Minyadidae', 'digallic', 'euploid', 'Mograbi', 'sketchbook', 'annul', 'ambilogy', 'outtalent']
>>> import statistics
>>> statistics.mean(map(len, words))
9.346666666666666
>>> statistics.median(map(len, words))
9.0
>>> statistics.mode(map(len, words))
10
>>> sys.getsizeof(words)
2464
>>> sum(sys.getsizeof(word) for word in words)
17504
That's one list, with 300 unique words with an average length of just over 9 characters, and that required 2464 bytes for the list, and 17504 bytes for the words themselves. That's less that not even 20KB.
But, you say, you have 2 lists. But that second list will not have copies of your words, that's just more references to the existing words, so that'll only take another 2464 bytes, so 2KB.
For 300 random English words, in two lists, your total memory requirements are around 20KB of memory.
On an 8GB machine, you will not have any problems. Note that I loaded the whole words file in one go into my computer, and then cut that back to 300 random words. Here is how much memory that whole initial list requires:
>>> words = list(map(str.strip, open('/usr/share/dict/words')))
>>> len(words)
235886
>>> sum(sys.getsizeof(word) for word in words)
13815637
>>> sys.getsizeof(words)
2007112
That's about 15MB of memory, for close to 236 thousand words.
If you are worried about larger programs with more objects, that you can also use the tracemalloc library to get statistics about memory use:
last = None
def display_memory_change(msg):
global last
snap = tracemalloc.take_snapshot()
statdiff, last = snap.compare_to(last, 'filename', True), snap
tot = sum(s.size for s in statdiff)
change = sum(s.size_diff for s in statdiff)
print('{:>20} (Tot: {:6.1f} MiB, Inc: {:6.1f} MiB)'.format(
msg, tot / 2 ** 20, change / 2 ** 20))
# at the start, get a baseline
tracemalloc.start()
last = tracemalloc.take_snapshot()
# create objects, run more code, etc.
display_memory_change("Some message as to what has been done")
# run some more code.
display_memory_change("Show some more statistics")
Using the above code to measure reading all those words:
tracemalloc.start()
last = tracemalloc.take_snapshot()
display_memory_change("Baseline")
words = list(map(str.strip, open('/usr/share/dict/words')))
display_memory_change("Loaded words list")
gives an output of
Baseline (Tot: 0.0 MiB, Inc: 0.0 MiB)
Loaded words list (Tot: 15.1 MiB, Inc: 15.1 MiB)
confirming my sys.getsizeof() measurements.
Very usefull too
– Gabriel Mation
Nov 24 '18 at 18:03
if i could choose both, i would
– Gabriel Mation
Nov 24 '18 at 18:03
Sorry, you can only mark one as the accepted answer. The choice is entirely yours, and not picking one is also an option.
– Martijn Pieters♦
Nov 24 '18 at 18:07
add a comment |
You really do not need to worry. Python is not such a memory hog as to cause issues with a mere 600 words.
With a bit of care, you can measure memory requirements directly. The sys.getsizeof() function lets you measure the direct memory requirements of a given Python object (only direct memory, not anything that it references!). You could use this to measure individual strings:
>>> import sys
>>> sys.getsizeof("Hello!")
55
>>> sys.getsizeof("memoryfootprint")
64
Exact sizes depend on the Python version and your OS. A Python string object needs a base amount of memory for a lot of book-keeping information, and then 1, 2 or 4 bytes per character, depending on the highest Unicode code point. For ASCII, that's just one byte per letter. Python 3.7, on my Mac OS X system uses 49 bytes for the bookkeeping portion.
Getting the size of a Python list object means you get just the list object memory requirements, not anything that's stored 'in' the list. You can repeatedly add the same object to a list and you'd not get copies, because Python uses references for everything, including list contents. Take that into account.
So lets load 300 random words, and create two lists, to see what the memory needs will be:
>>> import random
>>> words = list(map(str.strip, open('/usr/share/dict/words'))) # big file of words, present on many computers
>>> words = random.sample(words, 300) # just use 300
>>> words[:10]
['fourer', 'tampon', 'Minyadidae', 'digallic', 'euploid', 'Mograbi', 'sketchbook', 'annul', 'ambilogy', 'outtalent']
>>> import statistics
>>> statistics.mean(map(len, words))
9.346666666666666
>>> statistics.median(map(len, words))
9.0
>>> statistics.mode(map(len, words))
10
>>> sys.getsizeof(words)
2464
>>> sum(sys.getsizeof(word) for word in words)
17504
That's one list, with 300 unique words with an average length of just over 9 characters, and that required 2464 bytes for the list, and 17504 bytes for the words themselves. That's less that not even 20KB.
But, you say, you have 2 lists. But that second list will not have copies of your words, that's just more references to the existing words, so that'll only take another 2464 bytes, so 2KB.
For 300 random English words, in two lists, your total memory requirements are around 20KB of memory.
On an 8GB machine, you will not have any problems. Note that I loaded the whole words file in one go into my computer, and then cut that back to 300 random words. Here is how much memory that whole initial list requires:
>>> words = list(map(str.strip, open('/usr/share/dict/words')))
>>> len(words)
235886
>>> sum(sys.getsizeof(word) for word in words)
13815637
>>> sys.getsizeof(words)
2007112
That's about 15MB of memory, for close to 236 thousand words.
If you are worried about larger programs with more objects, that you can also use the tracemalloc library to get statistics about memory use:
last = None
def display_memory_change(msg):
global last
snap = tracemalloc.take_snapshot()
statdiff, last = snap.compare_to(last, 'filename', True), snap
tot = sum(s.size for s in statdiff)
change = sum(s.size_diff for s in statdiff)
print('{:>20} (Tot: {:6.1f} MiB, Inc: {:6.1f} MiB)'.format(
msg, tot / 2 ** 20, change / 2 ** 20))
# at the start, get a baseline
tracemalloc.start()
last = tracemalloc.take_snapshot()
# create objects, run more code, etc.
display_memory_change("Some message as to what has been done")
# run some more code.
display_memory_change("Show some more statistics")
Using the above code to measure reading all those words:
tracemalloc.start()
last = tracemalloc.take_snapshot()
display_memory_change("Baseline")
words = list(map(str.strip, open('/usr/share/dict/words')))
display_memory_change("Loaded words list")
gives an output of
Baseline (Tot: 0.0 MiB, Inc: 0.0 MiB)
Loaded words list (Tot: 15.1 MiB, Inc: 15.1 MiB)
confirming my sys.getsizeof() measurements.
You really do not need to worry. Python is not such a memory hog as to cause issues with a mere 600 words.
With a bit of care, you can measure memory requirements directly. The sys.getsizeof() function lets you measure the direct memory requirements of a given Python object (only direct memory, not anything that it references!). You could use this to measure individual strings:
>>> import sys
>>> sys.getsizeof("Hello!")
55
>>> sys.getsizeof("memoryfootprint")
64
Exact sizes depend on the Python version and your OS. A Python string object needs a base amount of memory for a lot of bookkeeping information, and then 1, 2 or 4 bytes per character, depending on the highest Unicode code point in the string. For ASCII, that's just one byte per letter. Python 3.7 on my Mac OS X system uses 49 bytes for the bookkeeping portion.
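To see the per-character cost for yourself, here is a minimal sketch. It assumes CPython 3.3 or newer, where strings use the flexible 1, 2 or 4 byte storage; the helper name per_char_bytes is made up for this illustration. It compares two strings that differ by exactly one character, so the fixed bookkeeping overhead cancels out:

```python
import sys

def per_char_bytes(ch, n=100):
    """Bytes used per character, measured by growing a string by one character.

    Hypothetical helper for illustration; the fixed per-object overhead
    cancels out in the subtraction.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

# ASCII letter, a character above U+00FF, and a character above U+FFFF
for ch in ("a", "€", "😀"):
    print(repr(ch), per_char_bytes(ch), "byte(s) per character")
```

A spelling bee word list is almost certainly ASCII or close to it, so you are paying roughly one byte per letter plus the fixed overhead per word.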
Getting the size of a Python list object means you get just the list object memory requirements, not anything that's stored 'in' the list. You can repeatedly add the same object to a list and you'd not get copies, because Python uses references for everything, including list contents. Take that into account.
So let's load 300 random words, and create two lists, to see what the memory needs will be:
>>> import random
>>> words = list(map(str.strip, open('/usr/share/dict/words'))) # big file of words, present on many computers
>>> words = random.sample(words, 300) # just use 300
>>> words[:10]
['fourer', 'tampon', 'Minyadidae', 'digallic', 'euploid', 'Mograbi', 'sketchbook', 'annul', 'ambilogy', 'outtalent']
>>> import statistics
>>> statistics.mean(map(len, words))
9.346666666666666
>>> statistics.median(map(len, words))
9.0
>>> statistics.mode(map(len, words))
10
>>> sys.getsizeof(words)
2464
>>> sum(sys.getsizeof(word) for word in words)
17504
That's one list of 300 unique words with an average length of just over 9 characters, and it required 2464 bytes for the list and 17504 bytes for the words themselves. That's less than 20KB.
But, you say, you have 2 lists. That second list will not hold copies of your words, just more references to the existing words, so it'll only take another 2464 bytes, or about 2.4KB.
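You can check the "references, not copies" point with a small sketch. The four-word list and the total_size helper are made up for this illustration; the helper adds up each list's own size and then counts every distinct string object only once:

```python
import random
import sys

def total_size(*lists):
    """List overhead plus the size of each distinct string object, counted once.

    Hypothetical helper for illustration only.
    """
    distinct = {}
    for lst in lists:
        for item in lst:
            distinct[id(item)] = item  # same object in two lists -> one entry
    list_bytes = sum(sys.getsizeof(lst) for lst in lists)
    string_bytes = sum(sys.getsizeof(s) for s in distinct.values())
    return list_bytes + string_bytes

words = ["apple", "banana", "cherry", "durian"]   # stand-in word list
shuffled = random.sample(words, len(words))       # randomized order, same objects

# Every entry in the shuffled list is the very same object as one in words
print(all(any(s is w for w in words) for s in shuffled))

# Adding the second list only costs the second list object itself
print(total_size(words, shuffled) - total_size(words), sys.getsizeof(shuffled))
```

The two numbers on the last line should match, because the shuffled list adds no new string objects.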
For 300 random English words in two lists, your total memory requirement is around 22KB.
On an 8GB machine, you will not have any problems. Note that I loaded the whole words file into memory in one go, and then cut that back to 300 random words. Here is how much memory that whole initial list requires:
>>> words = list(map(str.strip, open('/usr/share/dict/words')))
>>> len(words)
235886
>>> sum(sys.getsizeof(word) for word in words)
13815637
>>> sys.getsizeof(words)
2007112
That's about 15MB of memory, for close to 236 thousand words.
If you are worried about larger programs with more objects, you can also use the tracemalloc module to get statistics about memory use:
import tracemalloc

last = None  # previous snapshot, used as the comparison baseline

def display_memory_change(msg):
    global last
    snap = tracemalloc.take_snapshot()
    # compare against the previous snapshot, then remember the new one
    statdiff, last = snap.compare_to(last, 'filename', True), snap
    tot = sum(s.size for s in statdiff)
    change = sum(s.size_diff for s in statdiff)
    print('{:>20} (Tot: {:6.1f} MiB, Inc: {:6.1f} MiB)'.format(
        msg, tot / 2 ** 20, change / 2 ** 20))

# at the start, get a baseline
tracemalloc.start()
last = tracemalloc.take_snapshot()

# create objects, run more code, etc.
display_memory_change("Some message as to what has been done")

# run some more code.
display_memory_change("Show some more statistics")
Using the above code to measure reading all those words:
tracemalloc.start()
last = tracemalloc.take_snapshot()
display_memory_change("Baseline")
words = list(map(str.strip, open('/usr/share/dict/words')))
display_memory_change("Loaded words list")
gives an output of
            Baseline (Tot:    0.0 MiB, Inc:    0.0 MiB)
   Loaded words list (Tot:   15.1 MiB, Inc:   15.1 MiB)
confirming my sys.getsizeof() measurements.
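If you don't have /usr/share/dict/words on your machine, here is a rough, self-contained sketch of the same idea scaled down to your case. It is not the measurement I ran above: the words are randomly generated lowercase strings, and measure_word_lists is a name invented for this example. It uses tracemalloc.get_traced_memory() to see how much memory two lists of 300 words take:

```python
import random
import string
import tracemalloc

def measure_word_lists(n_words=300, seed=0):
    """Return bytes allocated while building a word list and a shuffled copy.

    Hypothetical helper; the 'words' are random lowercase strings of 4-14
    letters standing in for a real spelling bee list.
    """
    rng = random.Random(seed)
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    words = [
        "".join(rng.choices(string.ascii_lowercase, k=rng.randint(4, 14)))
        for _ in range(n_words)
    ]
    shuffled = rng.sample(words, len(words))  # second list, same string objects
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before, words, shuffled

used, words, shuffled = measure_word_lists()
print("{:.1f} KiB for two lists of {} words".format(used / 1024, len(words)))
```

The exact figure will vary with your Python version and platform, but it should land in the tens of kilobytes, nowhere near what an 8GB machine can handle.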
edited Nov 24 '18 at 18:11
answered Nov 24 '18 at 17:50
Martijn Pieters♦
Very useful too
– Gabriel Mation
Nov 24 '18 at 18:03
If I could choose both, I would
– Gabriel Mation
Nov 24 '18 at 18:03
Sorry, you can only mark one as the accepted answer. The choice is entirely yours, and not picking one is also an option.
– Martijn Pieters♦
Nov 24 '18 at 18:07