Understanding Keras LSTM NN input & output for binary classification

I am trying to create a simple LSTM network that produces an output based on the last 16 time frames. Say I have a dataset with 112000 rows (measurements) and 7 columns (6 features + class). As I understand it, I have to "pack" the dataset into 16-element-long "batches" (sequences of 16 consecutive rows). With 112000 rows that means 112000/16 = 7000 batches, i.e. a 3D numpy array with shape (7000, 16, 7). After separating the features from the class and splitting this array into train and test data, I get these shapes:

xtrain.shape == (5000, 16, 6)

ytrain.shape == (5000, 16)

xtest.shape == (2000, 16, 6)

ytest.shape == (2000, 16)
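For reference, the packing described above could be sketched roughly as follows. This is only an illustration: the zero-filled `data` array is a placeholder for the real measurements, and it assumes the class is stored in the last column.

```python
import numpy as np

n_rows, window, n_cols = 112_000, 16, 7  # 6 feature columns + 1 class column

# Placeholder standing in for the real measurements (assumption).
data = np.zeros((n_rows, n_cols), dtype=np.float32)

# Cut the rows into consecutive, non-overlapping blocks of 16 rows.
packed = data.reshape(n_rows // window, window, n_cols)

# Separate the 6 feature columns from the class column (assumed to be last).
x = packed[:, :, :6]
y = packed[:, :, 6]

# Split along the first axis into train and test parts.
xtrain, xtest = x[:5000], x[5000:]
ytrain, ytest = y[:5000], y[5000:]

print(packed.shape)                # (7000, 16, 7)
print(xtrain.shape, ytrain.shape)  # (5000, 16, 6) (5000, 16)
print(xtest.shape, ytest.shape)    # (2000, 16, 6) (2000, 16)
```

Note that `ytrain` keeps one class value per time step, which is where the second dimension of 16 comes from.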

My model looks like this:

import keras

model = keras.models.Sequential()
model.add(keras.layers.LSTM(8, input_shape=(16, 6), stateful=True, batch_size=16, name="input"))
model.add(keras.layers.Dense(5, activation="relu", name="hidden1"))
model.add(keras.layers.Dense(1, activation="sigmoid", name="output"))
model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(xtrain, ytrain, batch_size=16, epochs=10)

However, after trying to fit the model I get this error:

ValueError: Error when checking target: expected output to have shape (1,) but got array with shape (16,)

My guess is that the model expects a single output per batch (sequence), so ytrain should have shape (5000,), instead of 16 outputs, one for every entry in a batch, with shape (5000, 16).

If that is the case, should I, instead of packing the data like this, create a separate 16-element-long batch for every output? That would give:

xtrain.shape == (80000, 16, 6)

ytrain.shape == (80000,)

xtest.shape == (32000, 16, 6)

ytest.shape == (32000,)
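One way to build such overlapping 16-row sequences with a single label each might look like the sketch below. `make_windows` is a hypothetical helper, not part of Keras or NumPy, and it assumes that the class sits in the last column and that each sequence is labelled with the class of its final row.

```python
import numpy as np

def make_windows(data, window=16):
    """Return (x, y): overlapping windows and one label per window.

    Assumptions: the class is the last column of `data`, and each
    window is labelled with the class of its final row.
    """
    n_windows = data.shape[0] - window + 1
    # idx[i] holds the row numbers i, i + 1, ..., i + window - 1.
    idx = np.arange(n_windows)[:, None] + np.arange(window)[None, :]
    x = data[idx, :-1]        # shape (n_windows, window, n_features)
    y = data[idx[:, -1], -1]  # shape (n_windows,)
    return x, y

# Small demo: 20 rows, 6 features + 1 class column.
demo = np.arange(20 * 7, dtype=np.float32).reshape(20, 7)
x, y = make_windows(demo)
print(x.shape, y.shape)  # (5, 16, 6) (5,)
```

Under these assumptions n rows give n - 16 + 1 sequences, so the first 15 rows get no prediction of their own because they lack a full 16-row history.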

asked Nov 22 '18 at 23:31

SEnergy

python tensorflow keras lstm

1 Answer

You are close in the last part of your question. Since it's a binary classification problem, you should have 1 output per input sequence, so you need to get rid of the 16 in your ys and replace it with a 1.

In addition, because the LSTM is stateful, the number of training samples must be divisible by your batch size, so you could use 5008 samples, for example.

In fact, using

ytrain.shape == (5000, 1)

gets past the error you mention, but raises a new one:

ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 5000 samples

This is addressed by ensuring that:

xtrain.shape == (5008, 16, 6)

ytrain.shape == (5008, 1)
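If no extra samples are available to reach 5008, one alternative (an assumption on my part, not something stated above) is to drop the trailing samples so that the count becomes a multiple of the batch size. `trim_to_batch_multiple` is a hypothetical helper, and the zero-filled arrays are placeholders for the real data:

```python
import numpy as np

def trim_to_batch_multiple(x, y, batch_size=16):
    """Drop trailing samples so that len(x) is a multiple of batch_size."""
    n_keep = (len(x) // batch_size) * batch_size
    return x[:n_keep], y[:n_keep]

# Placeholder arrays with the shapes discussed above.
xtrain = np.zeros((5000, 16, 6), dtype=np.float32)
ytrain = np.zeros((5000, 1), dtype=np.float32)

xtrain, ytrain = trim_to_batch_multiple(xtrain, ytrain, batch_size=16)
print(xtrain.shape, ytrain.shape)  # (4992, 16, 6) (4992, 1)
```

Here 5000 // 16 = 312 full batches remain, i.e. 4992 samples, at the cost of discarding the last 8.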

answered Nov 22 '18 at 23:56

Julian Peller


So, considering I have 112000 rows of data and want to train the LSTM on as many rows as possible, should I create 111984 "packs" of 16 rows each to feed into the LSTM, giving (111984, 16, 6) as input and (111984, 1) as output? I want the LSTM to predict the class for every row, but that class depends on the last 16 time frames: for the 16th row (the first one I can predict) I need rows 0-15, for the 17th I need rows 1-16, and so on. So 112000 - 16 = 111984 packs of 16 rows each?

– SEnergy
Nov 23 '18 at 16:32

I'm not sure I'm following you. You have n train samples. n should be divisible by the batch size, but you can pass all n samples to fit and Keras will take care of splitting them into batches. Each train sample, in turn, is a sequence. Each element of the sequence can be interpreted as a time step, and you have f features to describe that time step. Also, for each train sample (a sequence of feature vectors) you have exactly one y. This is the schema for binary classification of a sequence.

– Julian Peller
Nov 23 '18 at 20:04

Assume I have n train samples (rows), where every sample has 3 features f (columns). Features f_n0 and f_n1 are inputs and f_n2 is the output. This output f_n2, however, should depend only on the last 16 rows (time frames) and on nothing before them. Assume 1 time frame = 1 second: the NN tries to predict the output based on what happened in the last 16 seconds. Given 112000 train samples (n = 112000) with 7 features (f = 7), and given that the LSTM takes a 3D array, would the resulting array have shape (n-16, 16, f) or rather (n/16, 16, f)?

– SEnergy
Nov 23 '18 at 20:41
