Simple linear regression: If Y and X are both normal, what's the exact null distribution of the parameters?
Suppose $Y \sim N(a,b)$, $X \sim N(c,d)$, and $Y$ is independent of $X$. After sampling 25 observations from both $Y$ and $X$, I run the following regression model: $Y=\beta_{0}+\beta_{1}X + \epsilon$. I wish to test the hypothesis $H_{0}: \beta_{0}=0$ against the alternative $H_{1}: \beta_{0}\neq 0$.
My question is, since the distributions of $Y$ and $X$ are known, is there an exact 'null distribution' for the parameter $\beta_{0}$? If so, what is the distribution? By null distribution, I mean the sampling distribution of $\beta_{0}$ under the null hypothesis.
If anyone knows the answer assuming the true correlation coefficient between $Y$ and $X$ is 0.1, rather than assuming independence, that would be a big help also. This is all for a simulation study I'm working on.
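For concreteness, here is a minimal Monte Carlo sketch of this setup (Python; the values $a=0$, $b=1$, $c=5$, $d=2$ and the replication count are placeholders, not part of the question), which approximates the sampling distribution of the OLS intercept estimate:

```python
# Sketch only: Monte Carlo approximation of the sampling distribution of the
# OLS intercept estimate when Y and X are independent normals.
# Placeholder parameters: a, b = mean/variance of Y; c, d = mean/variance of X.
import numpy as np

rng = np.random.default_rng(0)
a, b, c, d, n = 0.0, 1.0, 5.0, 2.0, 25      # hypothetical values
reps = 10_000

beta0_hat = np.empty(reps)
for r in range(reps):
    y = rng.normal(a, np.sqrt(b), n)        # Y ~ N(a, b), independent of X
    x = rng.normal(c, np.sqrt(d), n)        # X ~ N(c, d)
    slope, intercept = np.polyfit(x, y, 1)  # OLS fit of Y on X
    beta0_hat[r] = intercept

print(beta0_hat.mean(), beta0_hat.std())    # centred near E(Y)
```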
Tags: regression, inference
asked Dec 23 '18 at 4:56 by Anna Efron · edited Dec 23 '18 at 9:16 by Silverfish
I wonder whether you mean the distribution of $\hat{\beta}_0$ rather than of $\beta_0$? You have specified that you are 100% sure that $\beta_0 = 0$, so that is the rather degenerate distribution that it has! But it sounds to me that you might be rather more interested in the distribution of $\hat{\beta}_0$, which is the estimate of $\beta_0$ that you would make from your random sample, and since different random samples will produce slightly different estimates, your estimator has a non-degenerate probability distribution.
– Silverfish, Dec 23 '18 at 9:18
This question would be more interesting if you dropped the independence assumption on $X$ and $Y$ and added an assumption of joint normality.
– kjetil b halvorsen, Dec 23 '18 at 9:59
Yes, I meant that if I were to test $\beta_{0}=0$ (for a simulation exercise I'm working on... I know the true value is $a$), I would have to generate the sampling distribution of $\hat{\beta}_{0}$ under the null that $\beta_{0}=0$. I know that asymptotically this distribution is normal. But since $X$ and $Y$ are both normal and $n$ is relatively small, am I able to use the t-distribution (for example) to form an 'exact' null distribution of $\hat{\beta}_{0}$, rather than using the asymptotic approximation? The value of the parameter under the null is 0 (obviously), but that degenerate distribution is not what I'm after!
– Anna Efron, Dec 24 '18 at 6:16
2 Answers
Since you have specified that $X$ and $Y$ are independent, the conditional mean of $Y$ given $X$ is:
$$\mathbb{E}(Y|X) = \mathbb{E}(Y) = a,$$
which implies that:
$$\beta_0 = a, \qquad \beta_1 = 0, \qquad \varepsilon \sim \text{N}(0, b).$$
In this case there is nothing to test: your regression parameters are fully determined by the distributional assumptions you have made at the start of the question.
Remember that a regression model is designed to describe the conditional distribution of $Y$ given $X$. If you assume independence of these variables then this pre-empts the entire modelling exercise.
answered Dec 23 '18 at 7:05 by Ben
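A quick numerical check of this claim (a sketch only; the parameter values are arbitrary placeholders and the code is not from the answer):

```python
# Sketch: with Y ~ N(a, b) independent of X ~ N(c, d), the fitted intercept should
# approach a, the fitted slope 0, and the residual variance b as n grows.
import numpy as np

rng = np.random.default_rng(1)
a, b, c, d, n = 2.0, 4.0, 5.0, 9.0, 1_000_000   # hypothetical values; large n

y = rng.normal(a, np.sqrt(b), n)
x = rng.normal(c, np.sqrt(d), n)                # independent of y

slope, intercept = np.polyfit(x, y, 1)
resid_var = np.var(y - (intercept + slope * x), ddof=2)
print(intercept, slope, resid_var)              # approximately a, 0, b
```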
Thank you. I meant that if I were to test $\beta_{0}=0$ (for a simulation exercise I'm working on... I know the true value is $a$) the usual way, I would have to generate the sampling distribution of $\hat{\beta}_{0}$ under the null that $\beta_{0}=0$. I know that asymptotically this sampling distribution is normal. But since $X$ and $Y$ are both normal and $n$ is quite small, am I able to use the t-distribution (for example) to form an 'exact' null distribution of $\hat{\beta}_{0}$, such that the coverage probability is exactly $(1-\alpha)$? And what if $\rho_{XY}=0.1$ (say) instead of 0?
– Anna Efron, Dec 24 '18 at 6:21
Once you remove the assumption that $X$ and $Y$ are independent, the regression model is your specification of their conditional relationship. Much of the information you have given in your comment unfortunately contradicts your original question. It is also unclear why you would test $H_0: \beta_0 = 0$ if you know from some other source (your simulation) that $\beta_0 = a$. I think at this point you will probably need to ask a new question where all this information is made clear.
– Ben, Dec 24 '18 at 6:44
In simple linear regression the estimate of $\beta_0$ is computed as:
$$\hat\beta_0 = \frac{1}{n} S_y - \frac{1}{n} S_x \, \frac{n S_{xy} - S_x S_y}{n S_{xx} - S_x^2}$$
with $S_x = \sum x_i$, $S_y = \sum y_i$, $S_{xx} = \sum x_i^2$, $S_{xy} = \sum x_i y_i$.
You could say it is a linear combination of the $y_i$:
$$\hat\beta_0 = \frac{1}{n} \sum c_i y_i$$
with
$$c_i = 1 - \frac{S_x \, (n x_i - S_x)}{n S_{xx} - S_x^2}.$$
This does not seem to follow an easy distribution (or at least not a typical, well-known distribution). For random $x_i$ and $y_i$ you have:
$$\hat\beta_0 \sim N(\mu, \sigma^2),$$
where $\mu$ and $\sigma$ are themselves random variables that depend on the distribution of $X$ as well (if every $y_i$ has the identical distribution $N(a,b)$ then $\mu = a$, independent of the distribution of $X$).
However, if you condition on the $x_i$ then $\hat\beta_0$ follows a regular normal distribution (note that the $y_i$ do not need to be distributed according to identical normal distributions).
In testing you often do not know the variance of this normal distribution, so you estimate it from the residuals and use the t-distribution.
answered Dec 25 '18 at 13:08 by Martijn Weterings · edited Dec 25 '18 at 13:34
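A sketch of that last point (placeholder parameter values, not code from the answer): conditional on the observed $x_i$, the statistic $(\hat\beta_0 - \beta_0)/\widehat{\mathrm{se}}(\hat\beta_0)$ has an exact $t_{n-2}$ distribution under normal errors, and since this holds for every realisation of $X$ it also holds when $X$ is random.

```python
# Sketch: the intercept t-statistic, centred at the true intercept, follows t(n-2)
# even when the x_i are themselves random draws. Parameter values are placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a, b, c, d, n = 0.0, 1.0, 5.0, 2.0, 25
reps = 20_000

tstats = np.empty(reps)
for r in range(reps):
    y = rng.normal(a, np.sqrt(b), n)
    x = rng.normal(c, np.sqrt(d), n)
    X = np.column_stack([np.ones(n), x])              # design matrix [1, x]
    beta_hat, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
    s2 = rss[0] / (n - 2)                             # residual variance estimate
    se_b0 = np.sqrt(s2 * np.linalg.inv(X.T @ X)[0, 0])
    tstats[r] = (beta_hat[0] - a) / se_b0             # centre at the true intercept a

crit = stats.t.ppf(0.975, df=n - 2)                   # two-sided 5% critical value
print(np.mean(np.abs(tstats) > crit))                 # should be close to 0.05
```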