Predicting future values or modeling data
$begingroup$
Suppose the relationship $y=(xb-b)+c+d$ holds. If I had a table of $y$ values for different values of $x$, $b$, $c$ and $d$ but did not know this relationship, how would I go about finding it? Would regression analysis produce the equation, or would I have to plot the values, set some of the variables to zero, and model it manually?
mathematical-modeling
$endgroup$
$begingroup$
If you have all the values of $x,b,c,d$ there are no variables left to fit. You could however plot them and see how much error there is. Could you give an example of some of your data or more context for the problem?
$endgroup$
– tch
Dec 24 '18 at 20:54
$begingroup$
But the problem is that we are assuming we don't know the relationship $y=(xb-b)+c+d$, so even if we have the values of $x$, $b$, $c$ and $d$ we don't know how they relate to each other to produce a given $y$ value. How would I plot multiple independent variables against one dependent variable?
$endgroup$
– Tariro Manyika
Dec 26 '18 at 7:45
$begingroup$
table \begin{table} \begin{tabular}{lllll} & & & & \\ & & & & \\ & & & & \\ & & & & \end{tabular} \end{table}
$endgroup$
– Tariro Manyika
Dec 26 '18 at 7:47
$begingroup$
Your table appears to be misformatted. But to be clear, you have a set of data with points looking like $(x,b,c,d)$ and you want to determine the relationship between them?
$endgroup$
– tch
Dec 26 '18 at 15:03
$begingroup$
Yes, that's exactly it. A multiple linear regression model works to an extent, but it isn't good enough. I tried formatting that table; it's picked straight from my LaTeX document, so it's formatted correctly there, but this comment section won't render it correctly.
$endgroup$
– Tariro Manyika
Dec 26 '18 at 20:47
asked Dec 24 '18 at 14:39
Tariro Manyika
1 Answer
$begingroup$
One "data driven" approach to find the relationship between the variables is basically to do a linear regression to some set of prespecified functions. This requires that you have some idea about what types of functions relate the variables (for instance, that the relationship is a polynomial of degree $\leq 2$).
First, make vectors:
$$
X = \begin{bmatrix}x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, ~~
B = \begin{bmatrix}b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}, ~~
C = \begin{bmatrix}c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}, ~~
D = \begin{bmatrix}d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}
$$
Now, construct a "library" of possible relationships between your variables. The columns of this matrix should be functions of the variables $x,b,c,d$ applied to each data point. For example, if you expect a polynomial relationship of degree $\leq 2$ you could make the library:
$$
A =
\begin{bmatrix}
| & | & | & | & | & | & | & | & & | \\
1 & X & B & C & D & X^2 & XB & XC & \cdots & D^2 \\
| & | & | & | & | & | & | & | & & |
\end{bmatrix}
$$
where, for example,
$$
XB = \begin{bmatrix}x_1b_1 \\ x_2b_2 \\ \vdots \\ x_nb_n \end{bmatrix}
$$
More generally, any column could be $f(X,B,C,D)$, where the $i$-th entry of this column is simply $f(x_i,b_i,c_i,d_i)$.
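As an illustration of the library construction, here is a minimal Python sketch using NumPy. The helper name `polynomial_library` and the label format are my own choices, not part of any library; the sample values are the five data points from the example in this answer.

```python
from itertools import combinations_with_replacement

import numpy as np


def polynomial_library(columns, names, degree=2):
    """Stack all monomials of the given columns up to `degree` as matrix columns.

    `polynomial_library` is a hypothetical helper written for this sketch.
    """
    n = len(columns[0])
    lib_cols = [np.ones(n)]  # constant column "1"
    lib_names = ["1"]
    for deg in range(1, degree + 1):
        # Each index tuple picks the variables multiplied together, e.g. (0, 1) -> X*B.
        for idx in combinations_with_replacement(range(len(columns)), deg):
            col = np.ones(n)
            for i in idx:
                col = col * columns[i]  # elementwise product, one entry per data point
            lib_cols.append(col)
            lib_names.append("*".join(names[i] for i in idx))
    return np.column_stack(lib_cols), lib_names


X = np.array([6.0, 2.0, 1.0, 5.0, 1.0])
B = np.array([2.0, 7.0, 4.0, 7.0, 4.0])
C = np.array([2.0, 6.0, 1.0, 0.0, 9.0])
D = np.array([4.0, 9.0, 4.0, 5.0, 4.0])

A, labels = polynomial_library([X, B, C, D], ["X", "B", "C", "D"])
print(labels)
print(A.shape)
```

For four variables and degree $\leq 2$ this gives 15 columns (1 constant, 4 linear, 10 quadratic), so with only five data points the full library is underdetermined; in practice you would want many more rows than columns.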
Now note that the product $Ac$ gives a linear combination (weighted sum) of the columns of $A$, where $c$ here denotes the vector of coefficients (not the data variable $c$). So you can solve $Ac = Y$ to find the relationship between the columns of your library and $Y$. Of course, in practice you will have to solve the least squares problem $\min_c \Vert Y-Ac \Vert$.
If your data exactly satisfies $y=(xb-b)+c+d$ and the library columns are linearly independent, then the coefficient vector will have a coefficient of $1$ on the $XB$ column, $-1$ on the $B$ column, $1$ on the $C$ and $D$ columns, and $0$ everywhere else.
Example
Suppose we have data:
Y, X, B, C, D
16, 6, 2, 2, 4
22, 2, 7, 6, 9
5, 1, 4, 1, 4
33, 5, 7, 0, 5
13, 1, 4, 9, 4
If we know that $Y$ is a linear function of X, B, C, D, and XB, we can form our library A = [X,B,C,D,XB]:
[[ 6, 2, 2, 4, 12],
[ 2, 7, 6, 9, 14],
[ 1, 4, 1, 4, 4],
[ 5, 7, 0, 5, 35],
[ 1, 4, 9, 4, 4]]
Now, solving the least squares problem gives:
c = [0, -1, 1, 1, 1]
This tells us that $y = 0\cdot x + (-1)\cdot b + 1\cdot c + 1\cdot d + 1\cdot xb$, which is exactly what we expected.
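The worked example can be reproduced with a few lines of Python. This is a sketch assuming NumPy is available; the variable names are mine, and `numpy.linalg.lstsq` is just one of several ways to solve the least squares problem.

```python
import numpy as np

# Rows are the five data points; columns are Y, X, B, C, D as in the table above.
data = np.array([
    [16, 6, 2, 2, 4],
    [22, 2, 7, 6, 9],
    [ 5, 1, 4, 1, 4],
    [33, 5, 7, 0, 5],
    [13, 1, 4, 9, 4],
], dtype=float)
Y, X, B, C, D = data.T

# Library A = [X, B, C, D, XB]; X * B is the elementwise product.
A = np.column_stack([X, B, C, D, X * B])

# Solve min ||Y - A coef||_2 for the coefficient vector.
coef, residuals, rank, singular_values = np.linalg.lstsq(A, Y, rcond=None)
print(np.round(coef, 6))
print("rank of A:", rank)
```

Here $A$ is a square $5\times 5$ matrix of full rank, so the least squares solution is also the unique exact solution of $Ac = Y$.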
Now, if you didn't know that the only product term would be $xb$, you could have added more functions to the library and ideally the least squares would give coefficients of 0 for these functions (in our example we get a coefficient of 0 for $x$ since there is no $x$ term in the relationship). The results will vary depending on the amount of data you have, and how noisy it is. If you think the relationship is simple, you could promote sparsity through L1 regularization by instead solving:
$$
\min_c \Vert Y-Ac \Vert_2 + \lambda \Vert c \Vert_1
$$
where $lambda$ is a tune-able parameter.
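To give a feel for how the L1 penalty picks out a few library columns, here is a rough Python sketch. Several things in it are assumptions of mine rather than part of the answer: the data is synthetic (random points in $[-1,1]$ with a little noise), the objective is the common squared-loss variant $\frac{1}{2n}\Vert Y-Ac\Vert_2^2 + \lambda\Vert c\Vert_1$ rather than the unsquared norm written above, the solver is a plain proximal gradient (ISTA) loop, and the names `lasso_ista` and `soft_threshold` are hypothetical helpers.

```python
from itertools import combinations_with_replacement

import numpy as np

rng = np.random.default_rng(0)
n = 1000
V = rng.uniform(-1.0, 1.0, size=(n, 4))  # synthetic columns: x, b, c, d
x, b, c, d = V.T
y = (x * b - b) + c + d + 0.01 * rng.standard_normal(n)  # small assumed noise

# Degree <= 2 polynomial library: constant, linear and quadratic monomials.
names = ["x", "b", "c", "d"]
cols, labels = [np.ones(n)], ["1"]
for deg in (1, 2):
    for idx in combinations_with_replacement(range(4), deg):
        cols.append(np.prod(V[:, list(idx)], axis=1))
        labels.append("*".join(names[i] for i in idx))
A = np.column_stack(cols)


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink every entry towards zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(A, y, lam, n_iter=5000):
    """Minimise (1/(2n)) * ||y - A w||_2^2 + lam * ||w||_1 by proximal gradient."""
    n_rows = A.shape[0]
    step = n_rows / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - y) / n_rows
        w = soft_threshold(w - step * grad, step * lam)
    return w


w = lasso_ista(A, y, lam=0.01)
for label, value in zip(labels, w):
    if abs(value) > 1e-3:
        print(f"{label}: {value:.3f}")
```

The surviving coefficients are shrunk slightly towards zero by the penalty, so a common follow-up is to refit ordinary least squares using only the selected columns. A larger $\lambda$ gives a sparser but more biased fit.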
$endgroup$
$begingroup$
Mr. Chen, thank you so much, this is exactly what I was looking for. Simply amazing :)
$endgroup$
– Tariro Manyika
Dec 28 '18 at 19:15
edited Dec 26 '18 at 21:32
answered Dec 26 '18 at 21:16
tch