How easy is it to cause an OutOfMemoryException in Python?

I'm doing a spelling bee program in Python using pygame, and it works fine, but I have only been testing it with 7 words.

I'm worried that if it is used with 300 words, it might fill up the memory.
Remember there are 2 arrays: one holds the default list of words, and the other holds the randomized words.

asked Nov 24 '18 at 17:14

Gabriel Mation

It would likely depend on the computer it's being run on. 300 strings alone isn't going to cause memory problems though unless you're running it on a potato.

– Carcigenicate
Nov 24 '18 at 17:17

It's easy to use all available memory: a = 'a'*100000000000000000. But 300 words is not going to cause a problem on any modern system.

– juanpa.arrivillaga
Nov 24 '18 at 17:20

300 words will not use up all your memory, no.

– Martijn Pieters♦
Nov 24 '18 at 17:20

1) Remember there are 600, counting both the normal list and the randomized list.

– Gabriel Mation
Nov 24 '18 at 17:20

2) Let's say it's a machine with 8 GB of RAM.

– Gabriel Mation
Nov 24 '18 at 17:21

python memory out-of-memory

2 Answers

One good way to find out is to try it.

You can put a line midway through your program to print out how much memory it is using:

import os
import psutil

process = psutil.Process(os.getpid())
print(process.memory_info().rss)

Try running your program with different numbers of words and plotting the results:

[Figure: graph plotting total memory vs. number of words]

Then you can predict how many words it would take to use up all your memory.
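One way to run that experiment with only the standard library is sketched below. Note that it swaps the psutil RSS reading for tracemalloc, which counts only memory allocated through Python, so the numbers will be smaller than what the OS reports. The synthetic words and the function name measure_word_lists are assumptions made for illustration, not part of the original program.

```python
import random
import tracemalloc


def measure_word_lists(n_words):
    """Return the bytes allocated while building a word list and a shuffled copy."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    # Synthetic stand-in words; a real program would load its own list here.
    words = ["word{:06d}".format(i) for i in range(n_words)]
    # The second list holds references to the same string objects.
    shuffled = random.sample(words, len(words))
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


results = {n: measure_word_lists(n) for n in (7, 300, 3000, 30000)}
for n, used in results.items():
    print("{:>6} words: {:>10,} bytes".format(n, used))
```

The printed figures can be plotted or simply extrapolated by eye; growth should be roughly linear in the number of words.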

A few other points to keep in mind:

If you are using 32-bit Python, your total memory will be limited by the 32-bit address space to at most about 4 GB.

Your computer likely uses the disk to increase the virtual memory beyond the RAM size. So, even if you only have 1 GB RAM, you might find you can use 3 GB of memory in your program.

For small lists of words like the ones you are using, you will almost never run out of memory unless your program has a bug. In my experience, running out of memory (which Python reports as a MemoryError) is almost always because I made a mistake.
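To see what running out of memory looks like in practice, here is a minimal sketch that requests an absurdly large string and catches the failure. The helper name try_allocate is hypothetical; the sketch assumes CPython, where a request this large fails immediately rather than slowly filling RAM.

```python
def try_allocate(n_chars):
    """Try to build a string of n_chars characters; report whether it fit."""
    try:
        data = "a" * n_chars
    except (MemoryError, OverflowError):
        # MemoryError: the allocation itself failed.
        # OverflowError: the size does not fit a native index (e.g. 32-bit builds).
        return False
    return len(data) == n_chars


print(try_allocate(300))       # a tiny request, comparable to a short word list
print(try_allocate(10 ** 17))  # far more than any current machine can provide
```

Catching the exception like this is mostly useful for diagnosis; as noted above, hitting it in a small program usually points to a bug.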

answered Nov 24 '18 at 17:24

Owen

OS memory allocation is not a good way to measure this, as that happens in chunks and Python uses a heap model (meaning it'll request larger blocks).

– Martijn Pieters♦
Nov 24 '18 at 17:25

Instead, use tracemalloc snapshots.

– Martijn Pieters♦
Nov 24 '18 at 17:26

So how can I measure it?

– Gabriel Mation
Nov 24 '18 at 17:26

OK, I'll see if it works.

– Gabriel Mation
Nov 24 '18 at 17:26

@MartijnPieters That's a good point. Of course, if you are close to using your whole RAM, OS memory usage will be a decent approximation.

– Owen
Nov 24 '18 at 17:26

You really do not need to worry. Python is not such a memory hog as to cause issues with a mere 600 words.

With a bit of care, you can measure memory requirements directly. The sys.getsizeof() function lets you measure the direct memory requirements of a given Python object (only direct memory, not anything that it references!). You could use this to measure individual strings:

>>> import sys
>>> sys.getsizeof("Hello!")
55
>>> sys.getsizeof("memoryfootprint")
64

Exact sizes depend on the Python version and your OS. A Python string object needs a base amount of memory for a lot of bookkeeping information, and then 1, 2 or 4 bytes per character, depending on the highest Unicode code point. For ASCII, that's just one byte per letter. Python 3.7 on my Mac OS X system uses 49 bytes for the bookkeeping portion.
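The per-character cost can be checked by comparing two strings that differ in length by one character. This is a small sketch assuming CPython 3.3 or later; the helper name bytes_per_char and the sample characters are chosen here for illustration.

```python
import sys


def bytes_per_char(ch):
    """Measure how many extra bytes one more copy of ch adds to a string."""
    return sys.getsizeof(ch * 11) - sys.getsizeof(ch * 10)


ascii_cost = bytes_per_char("a")           # plain ASCII letter
bmp_cost = bytes_per_char("\u0394")        # Greek capital delta, needs 2 bytes
astral_cost = bytes_per_char("\U0001F600") # emoji outside the BMP, needs 4 bytes

print(ascii_cost, bmp_cost, astral_cost)
```

For an English spelling list, every word falls in the one-byte-per-letter case.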

Getting the size of a Python list object means you get just the list object memory requirements, not anything that's stored 'in' the list. You can repeatedly add the same object to a list and you'd not get copies, because Python uses references for everything, including list contents. Take that into account.
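A short sketch of that point: the shuffled list below holds the very same string objects as the original list, so only the list structure itself is duplicated. The generated words and variable names are illustrative assumptions.

```python
import random
import sys

# Build the strings at runtime so they are distinct objects, not shared literals.
words = ["spell{}".format(i) for i in range(300)]
shuffled = random.sample(words, len(words))

list_only = sys.getsizeof(words)                     # the list object alone
strings_only = sum(sys.getsizeof(w) for w in words)  # the words it refers to

# Both lists refer to exactly the same set of string objects.
same_objects = {id(w) for w in words} == {id(w) for w in shuffled}

# Count each string once, plus the two list objects.
total = list_only + sys.getsizeof(shuffled) + strings_only

print(list_only, strings_only, same_objects, total)
```

Summing sys.getsizeof() over both lists separately would count every word twice, which overstates the real footprint.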

So let's load 300 random words, and create two lists, to see what the memory needs will be:

>>> import random
>>> words = list(map(str.strip, open('/usr/share/dict/words')))  # big file of words, present on many computers
>>> words = random.sample(words, 300)  # just use 300
>>> words[:10]
['fourer', 'tampon', 'Minyadidae', 'digallic', 'euploid', 'Mograbi', 'sketchbook', 'annul', 'ambilogy', 'outtalent']
>>> import statistics
>>> statistics.mean(map(len, words))
9.346666666666666
>>> statistics.median(map(len, words))
9.0
>>> statistics.mode(map(len, words))
10
>>> sys.getsizeof(words)
2464
>>> sum(sys.getsizeof(word) for word in words)
17504

That's one list of 300 unique words with an average length of just over 9 characters; it required 2464 bytes for the list and 17504 bytes for the words themselves. That's just under 20KB.

But, you say, you have 2 lists. That second list will not hold copies of your words, just more references to the existing words, so it'll only take another 2464 bytes, about 2KB.

For 300 random English words in two lists, your total memory requirements are around 22KB.

On an 8GB machine, you will not have any problems. Note that I loaded the whole words file into memory in one go, and then cut that back to 300 random words. Here is how much memory that whole initial list requires:

>>> words = list(map(str.strip, open('/usr/share/dict/words')))
>>> len(words)
235886
>>> sum(sys.getsizeof(word) for word in words)
13815637
>>> sys.getsizeof(words)
2007112

That's about 15MB of memory, for close to 236 thousand words.

If you are worried about larger programs with more objects, you can also use the tracemalloc library to get statistics about memory use:

import tracemalloc

last = None

def display_memory_change(msg):
    global last
    # compare the current snapshot against the previous one
    snap = tracemalloc.take_snapshot()
    statdiff, last = snap.compare_to(last, 'filename', True), snap
    tot = sum(s.size for s in statdiff)
    change = sum(s.size_diff for s in statdiff)
    print('{:>20} (Tot: {:6.1f} MiB, Inc: {:6.1f} MiB)'.format(
        msg, tot / 2 ** 20, change / 2 ** 20))

# at the start, get a baseline
tracemalloc.start()
last = tracemalloc.take_snapshot()

# create objects, run more code, etc.

display_memory_change("Some message as to what has been done")

# run some more code.

display_memory_change("Show some more statistics")

Using the above code to measure reading all those words:

tracemalloc.start()
last = tracemalloc.take_snapshot()
display_memory_change("Baseline")

words = list(map(str.strip, open('/usr/share/dict/words')))

display_memory_change("Loaded words list")

gives an output of

            Baseline (Tot:    0.0 MiB, Inc:    0.0 MiB)
   Loaded words list (Tot:   15.1 MiB, Inc:   15.1 MiB)

confirming my sys.getsizeof() measurements.
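If the total turns out larger than expected, tracemalloc can also show which source lines allocated the most. The following is a minimal sketch using a synthetic word list (an assumption for illustration) and the snapshot's statistics() method grouped by line number.

```python
import tracemalloc

tracemalloc.start()

# Synthetic stand-in for loading a word list.
words = ["word{}".format(i) for i in range(10000)]

snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group allocations by source line; the largest entries come first.
top = snapshot.statistics("lineno")[:3]
for stat in top:
    print(stat)
```

Each printed entry names a file and line together with the size and number of the blocks allocated there.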

edited Nov 24 '18 at 18:11

answered Nov 24 '18 at 17:50

Martijn Pieters♦

Very useful too.

– Gabriel Mation
Nov 24 '18 at 18:03

If I could choose both, I would.

– Gabriel Mation
Nov 24 '18 at 18:03

Sorry, you can only mark one as the accepted answer. The choice is entirely yours, and not picking one is also an option.

– Martijn Pieters♦
Nov 24 '18 at 18:07
