Simple linear regression: If Y and X are both normal, what's the exact null distribution of the parameters?












Suppose $Y \sim N(a,b)$, $X \sim N(c,d)$, and $Y$ is independent of $X$. After sampling 25 observations from both $Y$ and $X$, I run the following regression model: $Y = \beta_0 + \beta_1 X + \epsilon$. I wish to test the hypothesis $H_0: \beta_0 = 0$ against the alternative $H_1: \beta_0 \neq 0$.



My question is, since the distributions of $Y$ and $X$ are known, is there an exact 'null distribution' for the parameter $\beta_0$? If so, what is the distribution? By null distribution, I mean the sampling distribution of $\beta_0$ under the null hypothesis.



If anyone knows the answer assuming the true correlation coefficient between $Y$ and $X$ is 0.1, rather than assuming independence, that would be a big help also. This is all for a simulation study I'm working on.










Tags: regression, inference

asked Dec 23 '18 at 4:56 by Anna Efron











  • I wonder whether you mean the distribution of $\hat{\beta}_0$ rather than of $\beta_0$? You have specified that you are 100% sure that $\beta_0 = 0$, so that is the rather degenerate distribution that it has! But it sounds to me that you might be rather more interested in the distribution of $\hat{\beta}_0$, which is the estimate of $\beta_0$ that you would make from your random sample - and since different random samples will produce slightly different estimates, your estimator has a non-degenerate probability distribution.
    – Silverfish, Dec 23 '18 at 9:18












  • This question would be more interesting if you drop the independence assumption on $X$ and $Y$, and add an assumption of joint normality.
    – kjetil b halvorsen, Dec 23 '18 at 9:59












  • Yes, I meant that if I were to test $\beta_0 = 0$ (for a simulation exercise I'm working on... I know the true value is $c$), I would have to generate the sampling distribution of $\hat{\beta}_0$ under the null that $\beta_0 = 0$. I know this distribution is asymptotically normal. But since $X$ and $Y$ are both normal and $n$ is relatively small, am I able to use the t-distribution (for example) to form an 'exact' null distribution of $\hat{\beta}_0$, rather than using the asymptotic approximation? The true value of the parameter is 0 (obviously), but this is not what I'm after!
    – Anna Efron, Dec 24 '18 at 6:16
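For a concrete starting point, the sampling distribution of $\hat{\beta}_0$ described above can be generated by Monte Carlo along these lines. This is only a minimal sketch: the values of $a, b, c, d$ and $\rho$ below are arbitrary placeholders (with $b$ and $d$ read as variances), and setting $\rho = 0.1$ covers the correlated case mentioned in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder values for a, b, c, d (means and variances) and the correlation.
a, b = 1.0, 2.0      # Y ~ N(a, b)
c, d = 0.5, 1.5      # X ~ N(c, d)
rho = 0.0            # set to 0.1 for the correlated case
n, n_sims = 25, 20000

mean = np.array([c, a])                          # (X, Y) means
cov = np.array([[d, rho * np.sqrt(b * d)],
                [rho * np.sqrt(b * d), b]])      # (X, Y) covariance

b0_hats = np.empty(n_sims)
for s in range(n_sims):
    x, y = rng.multivariate_normal(mean, cov, size=n).T
    b1_hat = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # OLS slope
    b0_hats[s] = y.mean() - b1_hat * x.mean()                  # OLS intercept

print("Monte Carlo mean and sd of beta0_hat:", b0_hats.mean(), b0_hats.std())
```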
















2 Answers



















Since you have specified that $X$ and $Y$ are independent, the conditional mean of $Y$ given $X$ is:



$$\mathbb{E}(Y|X) = \mathbb{E}(Y) = c,$$



which implies that:



$$\beta_0 = c \quad \quad \quad \beta_1 = 0 \quad \quad \quad \varepsilon \sim \text{N}(0, d).$$



In this case there is nothing to test --- your regression parameters are fully determined by the distributional assumptions you have made at the start of the question.



Remember that a regression model is a model designed to describe the conditional distribution of $Y$ given $X$. If you assume independence of these variables then this pre-empts the entire modelling exercise.






answered Dec 23 '18 at 7:05 by Ben
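As a quick numerical check of this claim: with independent $X$ and $Y$ and a very large sample, an OLS fit should recover a slope of 0, an intercept equal to $\mathbb{E}(Y)$, and a residual variance equal to $\text{Var}(Y)$. A sketch (the parameter values are arbitrary placeholders, not values from the post):

```python
import numpy as np

rng = np.random.default_rng(1)

# Placeholders: Y ~ N(mu_y, var_y), X ~ N(mu_x, var_x), independent.
mu_y, var_y = 1.0, 2.0
mu_x, var_x = 0.5, 1.5

# With a very large sample the OLS fit should approach the population values:
# slope -> Cov(X, Y)/Var(X) = 0, intercept -> E[Y], residual variance -> Var(Y).
N = 1_000_000
x = rng.normal(mu_x, np.sqrt(var_x), N)
y = rng.normal(mu_y, np.sqrt(var_y), N)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
resid_var = np.var(y - (b0 + b1 * x), ddof=2)

print(f"b1 ~ {b1:.4f} (expect 0), b0 ~ {b0:.4f} (expect {mu_y}), "
      f"residual var ~ {resid_var:.4f} (expect {var_y})")
```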






















  • Thank you. I meant that if I were to test $\beta_0 = 0$ (for a simulation exercise I'm working on... I know the true value is $c$) the usual way, I would have to generate the sampling distribution of $\hat{\beta}_0$ under the null that $\beta_0 = 0$. I know this sampling distribution is asymptotically normal. But since $X$ and $Y$ are both normal and $n$ is quite small, am I able to use the t-distribution (for example) to form an 'exact' null distribution of $\hat{\beta}_0$, s.t. the coverage probability is exactly $(1-\alpha)$? And what if $\rho_{XY}=0.1$ (say) instead of 0?
    – Anna Efron, Dec 24 '18 at 6:21








  • Once you remove the assumption that $X$ and $Y$ are independent, the regression model is your specification of their conditional relationship. Much of the information you have given in your comment unfortunately contradicts your original question. It is also unclear why you would test $H_0: \beta_0 = 0$ if you know from some other source (your simulation) that $\beta_0 = c$. I think at this point you will probably need to ask a new question where all this information is made clear.
    – Ben, Dec 24 '18 at 6:44




















In simple linear regression the estimate of $\beta_0$ is computed as:

$$\hat\beta_0 = \frac{1}{n} S_y - \frac{1}{n} S_x \frac{n S_{xy} - S_x S_y}{n S_{xx} - S_x S_x}$$

with $S_x = \sum x_i$, $S_y = \sum y_i$, $S_{xx} = \sum x_i x_i$, $S_{xy} = \sum x_i y_i$.

You could say it is a linear combination of the $y_i$:

$$\hat\beta_0 = \frac{1}{n} \sum c_i y_i$$

with

$$c_i = 1 - \frac{S_x \left( n x_i - S_x \right)}{n S_{xx} - S_x S_x}$$

This does not seem to follow an easy distribution (or at least not a typical, well-known one) when both the $x_i$ and the $y_i$ are random. You have:

$$\hat\beta_0 \sim N(\mu, \sigma^2)$$

where $\mu$ and $\sigma$ are themselves random variables that depend on the distribution of $X$ as well. (If every $y_i$ has the identical distribution $N(a,b)$, then $\mu = a$, independent of the distribution of $X$.)

However, if you condition on the $x_i$, then $\hat\beta_0$ follows a regular normal distribution (note that the $y_i$ do not need to be distributed according to identical normal distributions).

In testing you often do not know the variance of this normal distribution, so you estimate it from the residuals and then use the t-distribution.






answered Dec 25 '18 at 13:08 by Martijn Weterings, edited Dec 25 '18 at 13:34
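To illustrate the last two paragraphs, a small simulation (with arbitrary placeholder values for the fixed $x$, the slope, and the error standard deviation) can confirm that, conditional on $x$, the studentized intercept estimate follows an exact $t$ distribution with $n-2$ degrees of freedom:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

n = 25
x = rng.normal(0.5, 1.0, n)          # x is drawn once and then held fixed (we condition on it)
sigma = 1.5                          # true error s.d. (placeholder; unknown in practice)
beta0_true, beta1_true = 0.0, 0.3    # intercept at its null value, arbitrary slope

def t_stat_for_intercept(x, y):
    """OLS t statistic for H0: beta0 = 0 in y = b0 + b1*x + e."""
    n = len(x)
    b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)
    s2 = resid @ resid / (n - 2)                                   # residual variance estimate
    se_b0 = np.sqrt(s2 * (1 / n + x.mean()**2 / ((n - 1) * np.var(x, ddof=1))))
    return b0 / se_b0

t_stats = np.array([
    t_stat_for_intercept(x, rng.normal(beta0_true + beta1_true * x, sigma))
    for _ in range(20000)
])

# Conditional on x, these statistics should follow an exact t distribution with n-2 df.
print("empirical 97.5% quantile:", np.quantile(t_stats, 0.975))
print("t(n-2)   97.5% quantile:", stats.t.ppf(0.975, df=n - 2))
```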










