Finding the appropriate polynomial fit for a set of data
$begingroup$
Is there a function or library in Python that automatically computes the best polynomial fit for a set of data points? I am not really interested in the ML use case of generalizing to new data; I am focusing only on the data I have. I realize that the higher the degree, the better the fit. However, I want something that penalizes the degree or looks at where the error elbows. By elbowing, I mean something like this (although usually it is not so drastic or obvious): [image not preserved: fit error plotted against polynomial degree]
One idea I had was to use Numpy's polyfit (https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.polyfit.html) to compute polynomial regressions for a range of degrees. Polyfit requires the user to specify the degree of the polynomial, which poses a challenge because I don't have any assumptions or preconceived notions about it. The higher the degree of the fit, the lower the error, but eventually the error plateaus, as in the image above. I therefore want to automatically find the degree at which the error curve elbows: if E is my error and d is my degree, I want to maximize (E[d+1]-E[d]) - (E[d] - E[d-1]).
Is this even a valid approach? Are there other tools and approaches, perhaps in well-established Python libraries like Numpy or Scipy, that can help find the appropriate polynomial fit without my having to specify the degree? I would appreciate any thoughts or suggestions. Thanks!
polynomials regression regression-analysis python
$endgroup$
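The elbow idea in the question can be sketched roughly as follows. This is only an illustration under stated assumptions: the error E[d] is taken to be the sum of squared residuals of the degree-d least-squares fit, the helper names `sse_by_degree` and `elbow_degree` are hypothetical, and the data are synthetic. Only `numpy.polyfit`, `numpy.polyval` and `numpy.diff` are real library calls.

```python
import numpy as np


def sse_by_degree(x, y, max_degree):
    """Return E, where E[d] is the sum of squared residuals of the degree-d fit."""
    errors = []
    for d in range(max_degree + 1):
        coeffs = np.polyfit(x, y, d)           # least-squares fit of degree d
        residuals = y - np.polyval(coeffs, x)  # misfit on the data we have
        errors.append(float(np.sum(residuals ** 2)))
    return np.array(errors)


def elbow_degree(errors):
    """Return the interior degree d maximizing (E[d+1]-E[d]) - (E[d]-E[d-1])."""
    # np.diff(errors, n=2)[i] equals E[i+2] - 2*E[i+1] + E[i],
    # i.e. the quantity from the question evaluated at d = i + 1.
    second_diff = np.diff(errors, n=2)
    return int(np.argmax(second_diff)) + 1


# Synthetic example (an assumption, not data from the question):
# a cubic with a small amount of Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 - 2.0 * x + 3.0 * x**3 + rng.normal(scale=0.05, size=x.size)

E = sse_by_degree(x, y, max_degree=8)
best = elbow_degree(E)
print("errors by degree:", np.round(E, 4))
print("elbow at degree:", best)
```

Note that the second difference is only defined for interior degrees, so the lowest and highest degrees tried can never be selected, and the result depends on the scale of E (raw error versus, say, its logarithm).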
$begingroup$
maximize (E[d+1]-E[d]) - (E[d+1] - E[d]) ? Typo, I suppose
$endgroup$
– Claude Leibovici
Jan 7 at 6:26
$begingroup$
Yes, thanks for pointing that out. I meant (E[d+1]-E[d]) - (E[d] - E[d-1])
$endgroup$
– Jane Sully
Jan 7 at 7:21
asked Jan 6 at 23:09 by Jane Sully, edited Jan 7 at 7:21
1 Answer
$begingroup$
What you're trying to do is called model selection. You may find a popular library method for this, but it's important that you know how the model is selected, even if it means writing a little code yourself. The basic idea is to take a measure of fit quality, then penalise it by a measure of model complexity. There are many popular criteria for this, so think about which one is appropriate for your purposes.
$endgroup$
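As one possible illustration of "fit quality penalised by complexity", here is a minimal sketch that scores each degree with AIC or BIC. The answer does not name a criterion, so this choice is an assumption, as are the helper name `select_degree` and the synthetic data. The scores use the usual Gaussian least-squares forms, n ln(RSS/n) + 2k for AIC and n ln(RSS/n) + k ln(n) for BIC, with k = d + 1 fitted coefficients and additive constants dropped.

```python
import numpy as np


def select_degree(x, y, max_degree, criterion="bic"):
    """Return (best_degree, scores) using AIC or BIC on least-squares fits.

    Hypothetical helper for illustration; lower scores are better.
    """
    n = len(x)
    scores = []
    for d in range(max_degree + 1):
        coeffs = np.polyfit(x, y, d)
        rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
        k = d + 1  # number of fitted polynomial coefficients
        penalty = 2 * k if criterion == "aic" else k * np.log(n)
        scores.append(n * np.log(rss / n) + penalty)
    return int(np.argmin(scores)), scores


# Synthetic example (an assumption): a noisy cubic.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 - 2.0 * x + 3.0 * x**3 + rng.normal(scale=0.05, size=x.size)

for name in ("aic", "bic"):
    degree, _ = select_degree(x, y, max_degree=8, criterion=name)
    print(name, "selects degree", degree)
```

BIC penalises extra coefficients more heavily than AIC once n exceeds about 7, so it tends to pick lower degrees. Which penalty is appropriate depends on the purpose, as the answer says.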
answered Jan 7 at 7:31 by J.G.