Data augmentation in test/validation set?
Is it common practice to augment data (add samples programmatically, such as random crops in the case of an image dataset) on both the training and test sets, or just on the training set?
machine-learning deep-learning
edited Dec 30 '17 at 21:42
rodrigo-silveira
asked Dec 29 '17 at 23:31
rodrigo-silveira
Just training. Golden rule: never touch the test set. Reason: the test set represents the unseen data your model will face once you put it in production.
– saurabheights
Dec 30 '17 at 20:38
4 Answers
Only on training. Data augmentation is used to increase the size of the training set and to get more varied images.
Technically, you could use data augmentation on the test set to see how the model behaves on such images, but people usually don't.
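As a rough illustration of "only on training", here is a minimal sketch in plain Python. It assumes images are nested lists of pixel values, and it uses a pad-and-crop shift as the augmentation so that augmented images keep their original size. The helper names (`pad_and_random_crop`, `augment_training_set`) and the toy data are hypothetical, not taken from any library.

```python
import random

def pad_and_random_crop(image, pad, rng):
    """Zero-pad a 2D image by `pad` pixels, then crop back to the original size at a random offset."""
    height, width = len(image), len(image[0])
    zero_row = [0] * (width + 2 * pad)
    padded = ([zero_row[:] for _ in range(pad)]
              + [[0] * pad + list(row) + [0] * pad for row in image]
              + [zero_row[:] for _ in range(pad)])
    top = rng.randint(0, 2 * pad)    # randint is inclusive on both ends
    left = rng.randint(0, 2 * pad)
    return [row[left:left + width] for row in padded[top:top + height]]

def augment_training_set(samples, crops_per_image, pad=1, seed=0):
    """Keep each original (image, label) pair and add randomly shifted copies with the same label."""
    rng = random.Random(seed)
    augmented = []
    for image, label in samples:
        augmented.append((image, label))
        for _ in range(crops_per_image):
            augmented.append((pad_and_random_crop(image, pad, rng), label))
    return augmented

# Hypothetical toy data: 4x4 "images" with integer labels.
train = [([[r * 4 + c for c in range(4)] for r in range(4)], 0),
         ([[1] * 4 for _ in range(4)], 1)]
test = [([[2] * 4 for _ in range(4)], 1)]

# Only the training split goes through augmentation.
train_augmented = augment_training_set(train, crops_per_image=3)
# The test split is deliberately left untouched and evaluated as-is.
```

The training set grows from 2 to 8 samples here, while `test` is never passed to the augmentation function.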
answered Dec 30 '17 at 5:02
Andrey Lukyanenko
Any reason why the test set or validation set is not augmented?
– Anuj Gupta
Sep 10 '18 at 11:51
In fact, the situation has changed a bit. There is a newer method: test-time augmentation. It means that augmenting the test data can be used to improve predictions, for example when the object in the image is too small. Here is an article with an explanation: towardsdatascience.com/…
– Andrey Lukyanenko
Sep 11 '18 at 3:34
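One common way to do test-time augmentation is to predict on several transformed views of the same test image and average the predicted probabilities. The sketch below assumes that approach; `predict_with_tta` and `toy_predict_proba` are hypothetical names, and the "model" is a toy stand-in rather than a trained network. Note that the test labels and the number of test samples do not change, only the inputs seen at prediction time.

```python
def horizontal_flip(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]

def predict_with_tta(predict_proba, image, transforms):
    """Average class probabilities over the original image and its transformed views."""
    views = [image] + [transform(image) for transform in transforms]
    predictions = [predict_proba(view) for view in views]
    num_classes = len(predictions[0])
    return [sum(p[k] for p in predictions) / len(predictions)
            for k in range(num_classes)]

def toy_predict_proba(image):
    """Hypothetical stand-in for a model: class 1 is more likely when the left half is brighter."""
    half = len(image[0]) // 2
    left = sum(sum(row[:half]) for row in image)
    total = sum(sum(row) for row in image) or 1
    p1 = left / total
    return [1 - p1, p1]

image = [[3, 1], [3, 1]]
single = toy_predict_proba(image)                      # prediction on the raw image only
averaged = predict_with_tta(toy_predict_proba, image,  # prediction averaged over two views
                            [horizontal_flip])
```

With this toy model the single-view prediction depends on orientation, while the averaged prediction does not, which is the kind of smoothing test-time augmentation is meant to provide.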
Data augmentation is done only on the training set, as it helps the model generalize better and become more robust. So there's no point in augmenting the test set.
answered Dec 30 '17 at 22:06
Abhishek Patel
This answer on stats.SE makes the case for applying crops on the validation / test sets so as to make that input similar to the input in the training set that the network was trained on.
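A minimal sketch of that idea, assuming the network was trained on fixed-size random crops: at validation / test time a deterministic center crop of the same size is used instead, so the input shape matches without adding randomness to the evaluation. The function names and the toy image are hypothetical.

```python
import random

def random_crop(image, size, rng):
    """Training-time transform: a size x size crop at a random position."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]

def center_crop(image, size):
    """Evaluation-time transform: a size x size crop from the middle, always the same."""
    height, width = len(image), len(image[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]

# Hypothetical 4x4 image whose pixel value encodes its position (row * 4 + column).
image = [[r * 4 + c for c in range(4)] for r in range(4)]

train_view = random_crop(image, 2, random.Random(0))  # varies with the seed / epoch
eval_view = center_crop(image, 2)                     # identical on every call
```

Both views are 2x2, so the model sees inputs of the size it was trained on, while the evaluation input stays reproducible.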
answered Nov 26 '18 at 13:05
Tom Hale
Do it only on the training set.
The reason why we use a training and a test set in the first place is that we want to estimate the error our system will have in reality. So the data for the test set should be as close to real data as possible.
If you do it on the test set, you risk introducing errors. For example, say you want to recognize digits and you augment by rotating. Then a 6 might look like a 9.
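The digit example can be made concrete with a toy sketch. The 5x3 glyphs below are hypothetical hand-drawn bitmaps, chosen only to show that a 180-degree rotation can turn one class into another while the stored label stays the same.

```python
# Hypothetical 5x3 bitmaps: '#' is ink, '.' is background.
SIX = ["###",
       "#..",
       "###",
       "#.#",
       "###"]
NINE = ["###",
        "#.#",
        "###",
        "..#",
        "###"]

def rotate_180(glyph):
    """Rotate a glyph by 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in reversed(glyph)]

rotated_six = rotate_180(SIX)
label = 6  # the label attached to the sample is not changed by the rotation

print(rotated_six == NINE)  # True: the image now depicts a 9 but is still labelled 6
```

A test sample altered this way would count a correct "9" prediction as an error, which distorts the error estimate the test set is supposed to provide.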
answered Jan 5 '18 at 9:12
Martin Thoma