Derivative of matrix using index notation
In my stats textbook, they define the following function:
$\mathbf{f} = \frac{1}{2}(\mathbf{A}\mathbf{x} - \mathbf{b})^2$,
where $\mathbf{A}$ is a matrix and $\mathbf{x}, \mathbf{b}$ are vectors. They then say that:
$\frac{\partial \mathbf{f}}{\partial \mathbf{x}} = \mathbf{A}^{T}(\mathbf{A}\mathbf{x} - \mathbf{b})$
I tried to do this derivative using index notation. So, I defined $f$ as:
$f = \frac{1}{2} (A_{ij}x^{j} - b_{i})^2$,
Then I took the derivative with respect to $x^k$ (I use commas to denote partial derivatives):
$f_{,k} = \delta^{j}_{k} A_{ij} (A_{ij}x^{j} - b_{i})$
Applying the contraction, I get:
$f_{,k} = A_{i}^{k} (A_{ij}x^{j} - b_{i})$
But I do not know whether $A_{i}^{k}$ represents $\mathbf{A}^T$.
matrix-calculus tensors index-notation
edited Dec 27 '18 at 19:53
asked Dec 27 '18 at 19:23
Thomas Moore
It should be $b_i$ instead of $b^j$
– user251257 Dec 27 '18 at 19:44
Why use index notation? We have $D_xf(p)=\frac12\cdot2\langle Ap,Ax-b\rangle=\langle p,A^T(Ax-b)\rangle$, hence the gradient of $f$ is $A^T(Ax-b)$.
– Michael Hoppe Dec 27 '18 at 20:04
@MichaelHoppe Hi. Yes, I know about this version, as you have done it. But, I wanted to try to do it using index notation! :)
– Thomas Moore Dec 27 '18 at 20:29
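The directional-derivative identity in the comment above can be checked numerically. The following is only a sketch, assuming $f(x)=\frac12\lVert Ax-b\rVert^2$ and using randomly generated `A`, `b`, `x`, and direction `p` (all hypothetical test data, not from the question); it compares a central finite difference along `p` with $\langle p, A^T(Ax-b)\rangle$.

```python
import numpy as np

# Hypothetical test data; shapes and seed are arbitrary choices.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
b = rng.normal(size=5)
x = rng.normal(size=3)
p = rng.normal(size=3)  # direction in which we differentiate


def f(x):
    """Assumed reading of the textbook function: 0.5 * ||A x - b||^2."""
    r = A @ x - b
    return 0.5 * (r @ r)


# Central finite difference of f along the direction p.
t = 1e-5
directional_fd = (f(x + t * p) - f(x - t * p)) / (2 * t)

# Closed form from the comment: <p, A^T (A x - b)>.
directional_exact = p @ (A.T @ (A @ x - b))

print(directional_fd, directional_exact)
```

Because $f$ is quadratic, the central difference has no truncation error, so the two printed numbers should agree up to floating-point rounding.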
2 Answers
Some comments (I am not yet allowed to add them as a comment):
a) Your function $f=\frac12(Ax-b)^2$ is not defined if $A$ is a matrix, because $Ax-b$ is then a vector. My guess is that it should be $f(x)=\frac12(Ax-b)^T(Ax-b)$.
b) The key to differentiating a real-valued function $f$ with respect to a $K$-vector $x$ is:
b1) If $x$ is a column vector, then $\partial f/\partial x$ is a column vector with $\partial f/\partial x_i$ as its $i$-th element.
b2) $\partial x/\partial x^T = \partial x^T/\partial x = I_K$
With these conventions, differentiating $f$ with respect to the vector $x$ yields the same result as element-by-element partial differentiation.
edited Dec 29 '18 at 18:04
answered Dec 27 '18 at 21:26
Bertrand
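As a worked illustration of these conventions (a sketch, assuming the function is read as $f(x)=\frac12(Ax-b)^T(Ax-b)$ with $x$ a column vector), one can expand the product and differentiate term by term. The two rules used, $\partial(x^TMx)/\partial x = 2Mx$ for symmetric $M$ and $\partial(c^Tx)/\partial x = c$, are standard identities rather than anything stated in the answer above; here $M=A^TA$ and $c=A^Tb$.

$$
\begin{aligned}
f(x) &= \tfrac12 (Ax-b)^T(Ax-b) = \tfrac12\left(x^TA^TAx - 2\,b^TAx + b^Tb\right),\\
\frac{\partial f}{\partial x} &= \tfrac12\left(2A^TAx - 2A^Tb\right) = A^T(Ax-b).
\end{aligned}
$$

Without the factor $\frac12$ the same steps give $2A^T(Ax-b)$, so the factor is needed to match the textbook result.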
Your second equation can be rewritten by taking its $k$th component, viz. $$f_{,k}=(A^T)_{ki}(Ax-b)_i=(A^T)_{ki}(A_{ij}x_j-b_i).$$ Comparing this with your final equation, $A_i^k=(A^T)_{ki}=A_{ik}$.
answered Dec 27 '18 at 21:00
J.G.
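One way to see this identification concretely: write the residual as $r_i = A_{ij}x_j - b_i$, so that $f=\frac12 r_i r_i$ and $f_{,k} = r_i\,A_{ij}\delta_{jk} = A_{ik}r_i$, with the sum running over the first (row) index of $A$. The sketch below, which uses `numpy.einsum` on hypothetical random data and assumes all indices are treated as lower (Euclidean) indices, checks that this contraction agrees with $A^T(Ax-b)$ and with element-by-element finite differences.

```python
import numpy as np

# Hypothetical test data; shapes and seed are arbitrary choices.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))
b = rng.normal(size=4)
x = rng.normal(size=3)

# Residual in index form: r_i = A_ij x_j - b_i (sum over j).
r = np.einsum("ij,j->i", A, x) - b

# Gradient in index form: f_,k = A_ik r_i (sum over the ROW index i).
grad_index = np.einsum("ik,i->k", A, r)

# Gradient in matrix form: A^T (A x - b).
grad_matrix = A.T @ (A @ x - b)


def f(x):
    """Assumed reading of the function: 0.5 * r_i r_i."""
    res = A @ x - b
    return 0.5 * (res @ res)


# Element-by-element central differences, one unit vector e_k at a time.
h = 1e-5
grad_fd = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(3)])

print(grad_index)
print(grad_matrix)
print(grad_fd)
```

Contracting over the row index of $A$ is exactly what multiplying by $A^T$ does, which is why the `"ik,i->k"` contraction and `A.T @ r` coincide.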