Finding the appropriate polynomial fit for a set of data












$begingroup$


Is there a function or library in Python that automatically computes the best polynomial fit for a set of data points? I am not really interested in the ML use case of generalizing to new data; I am focusing only on the data I have. I realize that the higher the degree, the better the fit. However, I want something that penalizes higher degrees or looks at where the error elbows. By elbowing, I mean something like this (although usually it is not so drastic or obvious): [image: error curve with an elbow]



One idea I had was to use NumPy's polyfit (https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.polyfit.html) to compute a polynomial regression for a range of degrees. polyfit requires the user to specify the degree of the polynomial, which poses a challenge because I have no prior assumptions about it. The higher the degree of the fit, the lower the error will be, but eventually it plateaus as in the image above. So, to automatically find the degree at which the error curve elbows: if E is my error and d is my degree, I want to maximize (E[d+1] - E[d]) - (E[d] - E[d-1]).
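A minimal sketch of this idea, assuming the error E is the residual sum of squares of each fit (the choice of error measure is an assumption, as are the helper names `rss_by_degree` and `elbow_degree` and the synthetic cubic data):

```python
import numpy as np

def rss_by_degree(x, y, max_degree):
    """Residual sum of squares of the least-squares fit for degrees 0..max_degree."""
    errors = []
    for d in range(max_degree + 1):
        coeffs = np.polyfit(x, y, d)              # highest-order coefficient first
        residuals = y - np.polyval(coeffs, x)
        errors.append(float(np.sum(residuals ** 2)))
    return np.array(errors)

def elbow_degree(errors):
    """Degree d maximizing (E[d+1] - E[d]) - (E[d] - E[d-1]).

    Only interior degrees 1..len(errors)-2 can be scored, because the
    expression needs both neighbours of d.
    """
    errors = np.asarray(errors, dtype=float)
    second_diff = errors[2:] - 2.0 * errors[1:-1] + errors[:-2]
    return int(np.argmax(second_diff)) + 1        # shift index back to a degree

# Synthetic example: noisy cubic data (made up for illustration).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 - 2.0 * x + 3.0 * x ** 3 + rng.normal(scale=0.05, size=x.size)

E = rss_by_degree(x, y, max_degree=8)
print("RSS per degree:", np.round(E, 4))
print("elbow at degree:", elbow_degree(E))
```

Note that this criterion depends on the absolute scale of the error: a large drop from degree 0 to degree 1 can outweigh a later, more meaningful elbow, so it is worth inspecting the whole E curve rather than trusting the argmax alone.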



Is this even a valid approach? Are there other tools and approaches, perhaps using well-established Python libraries like NumPy or SciPy, that can help find an appropriate polynomial fit without my having to specify the degree? I would appreciate any thoughts or suggestions. Thanks!










$endgroup$












  • $begingroup$
    maximize (E[d+1]-E[d]) - (E[d+1] - E[d]) ? Typo, I suppose
    $endgroup$
    – Claude Leibovici
    Jan 7 at 6:26










  • $begingroup$
    Yes, thanks for pointing that out. I meant (E[d+1]-E[d]) - (E[d] - E[d-1])
    $endgroup$
    – Jane Sully
    Jan 7 at 7:21
















polynomials regression regression-analysis python






edited Jan 7 at 7:21







Jane Sully

















asked Jan 6 at 23:09









Jane Sully












1 Answer
$begingroup$

What you're trying to do is called model selection. You may find a popular library method for this, but it's important that you know how the model is selected, even if it means writing a little code yourself. The basic idea is to take a measure of fit quality, then penalise it by a measure of model complexity. There are many popular criteria for this (the Akaike and Bayesian information criteria are two common ones), so think about which is appropriate for your purposes.
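As one possible illustration of "fit quality penalised by complexity", here is a minimal sketch that scores each degree with the Bayesian information criterion in its Gaussian least-squares form, n·ln(RSS/n) + k·ln(n), where k is the number of fitted coefficients. The choice of BIC is an assumption made for this example, not a recommendation from the answer, and the helper names `bic_for_degree` and `select_degree` and the synthetic data are hypothetical.

```python
import numpy as np

def bic_for_degree(x, y, degree):
    """BIC (up to an additive constant) of a least-squares polynomial fit."""
    n = len(x)
    coeffs = np.polyfit(x, y, degree)
    rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
    k = degree + 1                                # number of fitted coefficients
    return n * np.log(rss / n) + k * np.log(n)    # fit term + complexity penalty

def select_degree(x, y, max_degree):
    """Return the degree with the lowest BIC, plus all the scores."""
    scores = [bic_for_degree(x, y, d) for d in range(max_degree + 1)]
    return int(np.argmin(scores)), scores

# Synthetic example: noisy cubic data (made up for illustration).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 - 2.0 * x + 3.0 * x ** 3 + rng.normal(scale=0.05, size=x.size)

best, scores = select_degree(x, y, max_degree=8)
print("selected degree:", best)
```

Swapping in another criterion only changes the penalty term; for example, the Akaike information criterion uses 2·k in place of k·ln(n). An exact interpolating fit gives RSS = 0 and makes the logarithm blow up, so keep `max_degree` well below the number of data points.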






$endgroup$














        answered Jan 7 at 7:31









        J.G.





























