Is there a mathematical operation to ensure that two vectors are perfectly equal?
$begingroup$
Let's said that I have a 3D matrix of integer of size 10x10x4, I want to know if along the 3rd dimension some vectors are perfectly similar (same number, same order).
So I could compare one by one each vectors (of size 4) with each others. But I'm using a computer and the time matters. So is there a mathematical operation to ensure that two vectors are perfectly equal. Or, same question others words, is there a mathematical operation that can link a unique value to a unique vector.
So typicaly for a vector of size 4 (base 10) the associated number should be between 1 and 10^4 (ideally).
vector-spaces numerical-methods
$endgroup$
|
show 8 more comments
$begingroup$
Let's said that I have a 3D matrix of integer of size 10x10x4, I want to know if along the 3rd dimension some vectors are perfectly similar (same number, same order).
So I could compare one by one each vectors (of size 4) with each others. But I'm using a computer and the time matters. So is there a mathematical operation to ensure that two vectors are perfectly equal. Or, same question others words, is there a mathematical operation that can link a unique value to a unique vector.
So typicaly for a vector of size 4 (base 10) the associated number should be between 1 and 10^4 (ideally).
vector-spaces numerical-methods
$endgroup$
$begingroup$
If you assign some unique value to each vector then you have to compare those values. For a computer comparing a vector (as in sequence of bytes) and comparing a number is pretty much the same thing. I don't see any gain in doing that (except for a given language limitations). If you don't know anything about the matrix then the only sensible thing is to compare row by row. This can be parallelized though (even on gpu).
$endgroup$
– freakish
Dec 12 '18 at 17:51
$begingroup$
However if you expect that those matrices are often not going to be perfectly similar (which, as I've mentioned is something you have to know a priori) and those vectors are big then I guess hashing is an option. Although it can give you false positives.
$endgroup$
– freakish
Dec 12 '18 at 17:54
$begingroup$
A matrix is two-dimensional. You're talking about a three-dimensional array. It's not any different from determining if two members of a one-dimensional array are identical. The fastest thing to do, so far as I know, is to sort the array and then compare adjacent elements. (This is on a single CPU.)
$endgroup$
– saulspatz
Dec 12 '18 at 17:54
$begingroup$
you are using a computer so you can write a function that converts a vector with four components into a string with separators: $"v1;v2;v3;v4"$ and make comparison based on the results of this function
$endgroup$
– Vasya
Dec 12 '18 at 17:59
$begingroup$
@Vasya that's a really bad idea. you are increasing the size of the data to parse, not reducing it
$endgroup$
– Federico
Dec 12 '18 at 18:04
|
show 8 more comments
$begingroup$
Let's said that I have a 3D matrix of integer of size 10x10x4, I want to know if along the 3rd dimension some vectors are perfectly similar (same number, same order).
So I could compare one by one each vectors (of size 4) with each others. But I'm using a computer and the time matters. So is there a mathematical operation to ensure that two vectors are perfectly equal. Or, same question others words, is there a mathematical operation that can link a unique value to a unique vector.
So typicaly for a vector of size 4 (base 10) the associated number should be between 1 and 10^4 (ideally).
vector-spaces numerical-methods
$endgroup$
Let's said that I have a 3D matrix of integer of size 10x10x4, I want to know if along the 3rd dimension some vectors are perfectly similar (same number, same order).
So I could compare one by one each vectors (of size 4) with each others. But I'm using a computer and the time matters. So is there a mathematical operation to ensure that two vectors are perfectly equal. Or, same question others words, is there a mathematical operation that can link a unique value to a unique vector.
So typicaly for a vector of size 4 (base 10) the associated number should be between 1 and 10^4 (ideally).
vector-spaces numerical-methods
vector-spaces numerical-methods
asked Dec 12 '18 at 17:43
obchardonobchardon
1134
1134
$begingroup$
If you assign some unique value to each vector then you have to compare those values. For a computer comparing a vector (as in sequence of bytes) and comparing a number is pretty much the same thing. I don't see any gain in doing that (except for a given language limitations). If you don't know anything about the matrix then the only sensible thing is to compare row by row. This can be parallelized though (even on gpu).
$endgroup$
– freakish
Dec 12 '18 at 17:51
$begingroup$
However if you expect that those matrices are often not going to be perfectly similar (which, as I've mentioned is something you have to know a priori) and those vectors are big then I guess hashing is an option. Although it can give you false positives.
$endgroup$
– freakish
Dec 12 '18 at 17:54
$begingroup$
A matrix is two-dimensional. You're talking about a three-dimensional array. It's not any different from determining if two members of a one-dimensional array are identical. The fastest thing to do, so far as I know, is to sort the array and then compare adjacent elements. (This is on a single CPU.)
$endgroup$
– saulspatz
Dec 12 '18 at 17:54
$begingroup$
you are using a computer so you can write a function that converts a vector with four components into a string with separators: $"v1;v2;v3;v4"$ and make comparison based on the results of this function
$endgroup$
– Vasya
Dec 12 '18 at 17:59
$begingroup$
@Vasya that's a really bad idea. you are increasing the size of the data to parse, not reducing it
$endgroup$
– Federico
Dec 12 '18 at 18:04
|
show 8 more comments
$begingroup$
If you assign some unique value to each vector then you have to compare those values. For a computer comparing a vector (as in sequence of bytes) and comparing a number is pretty much the same thing. I don't see any gain in doing that (except for a given language limitations). If you don't know anything about the matrix then the only sensible thing is to compare row by row. This can be parallelized though (even on gpu).
$endgroup$
– freakish
Dec 12 '18 at 17:51
$begingroup$
However if you expect that those matrices are often not going to be perfectly similar (which, as I've mentioned is something you have to know a priori) and those vectors are big then I guess hashing is an option. Although it can give you false positives.
$endgroup$
– freakish
Dec 12 '18 at 17:54
$begingroup$
A matrix is two-dimensional. You're talking about a three-dimensional array. It's not any different from determining if two members of a one-dimensional array are identical. The fastest thing to do, so far as I know, is to sort the array and then compare adjacent elements. (This is on a single CPU.)
$endgroup$
– saulspatz
Dec 12 '18 at 17:54
$begingroup$
you are using a computer so you can write a function that converts a vector with four components into a string with separators: $"v1;v2;v3;v4"$ and make comparison based on the results of this function
$endgroup$
– Vasya
Dec 12 '18 at 17:59
$begingroup$
@Vasya that's a really bad idea. you are increasing the size of the data to parse, not reducing it
$endgroup$
– Federico
Dec 12 '18 at 18:04
$begingroup$
If you assign some unique value to each vector then you have to compare those values. For a computer comparing a vector (as in sequence of bytes) and comparing a number is pretty much the same thing. I don't see any gain in doing that (except for a given language limitations). If you don't know anything about the matrix then the only sensible thing is to compare row by row. This can be parallelized though (even on gpu).
$endgroup$
– freakish
Dec 12 '18 at 17:51
$begingroup$
If you assign some unique value to each vector then you have to compare those values. For a computer comparing a vector (as in sequence of bytes) and comparing a number is pretty much the same thing. I don't see any gain in doing that (except for a given language limitations). If you don't know anything about the matrix then the only sensible thing is to compare row by row. This can be parallelized though (even on gpu).
$endgroup$
– freakish
Dec 12 '18 at 17:51
$begingroup$
However if you expect that those matrices are often not going to be perfectly similar (which, as I've mentioned is something you have to know a priori) and those vectors are big then I guess hashing is an option. Although it can give you false positives.
$endgroup$
– freakish
Dec 12 '18 at 17:54
$begingroup$
However if you expect that those matrices are often not going to be perfectly similar (which, as I've mentioned is something you have to know a priori) and those vectors are big then I guess hashing is an option. Although it can give you false positives.
$endgroup$
– freakish
Dec 12 '18 at 17:54
$begingroup$
A matrix is two-dimensional. You're talking about a three-dimensional array. It's not any different from determining if two members of a one-dimensional array are identical. The fastest thing to do, so far as I know, is to sort the array and then compare adjacent elements. (This is on a single CPU.)
$endgroup$
– saulspatz
Dec 12 '18 at 17:54
$begingroup$
A matrix is two-dimensional. You're talking about a three-dimensional array. It's not any different from determining if two members of a one-dimensional array are identical. The fastest thing to do, so far as I know, is to sort the array and then compare adjacent elements. (This is on a single CPU.)
$endgroup$
– saulspatz
Dec 12 '18 at 17:54
$begingroup$
you are using a computer so you can write a function that converts a vector with four components into a string with separators: $"v1;v2;v3;v4"$ and make comparison based on the results of this function
$endgroup$
– Vasya
Dec 12 '18 at 17:59
$begingroup$
you are using a computer so you can write a function that converts a vector with four components into a string with separators: $"v1;v2;v3;v4"$ and make comparison based on the results of this function
$endgroup$
– Vasya
Dec 12 '18 at 17:59
$begingroup$
@Vasya that's a really bad idea. you are increasing the size of the data to parse, not reducing it
$endgroup$
– Federico
Dec 12 '18 at 18:04
$begingroup$
@Vasya that's a really bad idea. you are increasing the size of the data to parse, not reducing it
$endgroup$
– Federico
Dec 12 '18 at 18:04
|
show 8 more comments
2 Answers
2
active
oldest
votes
$begingroup$
If by "number" you allow quaternions then of course to the vector $v=(v_1,v_2,v_3,v_4)$ you can associate $v_1+iv_2+jv_3+kv_4$, but that is not helpful at all...
If you know for instance that the entries are integers, then you can send $v$ to $v_1+sqrt2 v_2+sqrt3 v_3 + sqrt5 v_4$ and this would be a bijection.
You have to consider if you are interested in $v=w$ or rather $|v-w|<10^{-8}$, because the two "equalities" are quite different to test. For instance, you cannot hash the vectors in order to test the second one.
Otherwise, if you think about it, the bit representation of the floating points for the components of $v$ are already a number associated to the vector. You just associate to $v$ the binary number obtained from the concatenation of the binary representations of $v_1,dots,v_4$.
Given that the vectors are very short (only 4 entries), I don't see any real benefit of attempting anything different from pairwise comparison.
You can use something like int memcmp(const void *s1, const void *s2, size_t n);
(manpage) for that or trivially code your own.
If you know something more about the spacial distribution of your vectors, maybe you can try something like hextrees in order to reduce the number of necessary comparisons.
See also here for ideas on how to partition your vectors.
For instance, given $winmathbb R^4$, you can partition your vector in the two families $W_-={v:vcdot w<0}$ and $W_+={v:vcdot w>0}$. Now you only need to check for identity among the two families and not across them. So if you are able to find a good $w$ that splits your vectors in two families approximately equinumerous, you pass from $n^2$ tests to $n^2/2$.
Edit Taking further the previous idea of computing $wcdot v$,
you can in fact compute 10x10
matrix of these scalar product and then have to worry only about entries which are the same. This can reduce drastically the number of comparisons.
$endgroup$
add a comment |
$begingroup$
What you are looking for is a hash function in the general sense. Let a vector be $(a_0,a_1,a_2,a_3)$. If you know all the components are positive and less than some number $b$, you can just compute $a_0+ba_1+b^2a_2+b^3a_3$ and compare these. That gets you what you want, a unique number for each vector as long as they are what you expect. Whether it is worth computing this for each vector is up to you. I would be surprised if this is the slow point for your code. Of course, if $b$ is the size of the largest number that fits in an int you haven't saved any storage at all.
If you don't know the size of your vectors but all the components are positive whole numbers you can use the Cantor Pairing function. You can find many mentions of it on the site. You can pair $a_0 $ and $a_1$, $a_2$ and $a_3$ and finally the two pairs. This is more computation but handles numbers of any size. If you have negative numbers you need to make them positive some what or adapt the function.
$endgroup$
$begingroup$
That's wrong. Hashing does not give a unique number for a row. No such function can exist if the size of vector is greater then the size of the hash.
$endgroup$
– freakish
Dec 13 '18 at 7:55
$begingroup$
@freakish: hashing as a general term, does apply in the sense of taking many numbers and making one. The base representation and the pairing function are included. They will give unique results when given the proper input.
$endgroup$
– Ross Millikan
Dec 13 '18 at 14:58
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "69"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3037000%2fis-there-a-mathematical-operation-to-ensure-that-two-vectors-are-perfectly-equal%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
If by "number" you allow quaternions then of course to the vector $v=(v_1,v_2,v_3,v_4)$ you can associate $v_1+iv_2+jv_3+kv_4$, but that is not helpful at all...
If you know for instance that the entries are integers, then you can send $v$ to $v_1+sqrt2 v_2+sqrt3 v_3 + sqrt5 v_4$ and this would be a bijection.
You have to consider if you are interested in $v=w$ or rather $|v-w|<10^{-8}$, because the two "equalities" are quite different to test. For instance, you cannot hash the vectors in order to test the second one.
Otherwise, if you think about it, the bit representation of the floating points for the components of $v$ are already a number associated to the vector. You just associate to $v$ the binary number obtained from the concatenation of the binary representations of $v_1,dots,v_4$.
Given that the vectors are very short (only 4 entries), I don't see any real benefit of attempting anything different from pairwise comparison.
You can use something like int memcmp(const void *s1, const void *s2, size_t n);
(manpage) for that or trivially code your own.
If you know something more about the spacial distribution of your vectors, maybe you can try something like hextrees in order to reduce the number of necessary comparisons.
See also here for ideas on how to partition your vectors.
For instance, given $winmathbb R^4$, you can partition your vector in the two families $W_-={v:vcdot w<0}$ and $W_+={v:vcdot w>0}$. Now you only need to check for identity among the two families and not across them. So if you are able to find a good $w$ that splits your vectors in two families approximately equinumerous, you pass from $n^2$ tests to $n^2/2$.
Edit Taking further the previous idea of computing $wcdot v$,
you can in fact compute 10x10
matrix of these scalar product and then have to worry only about entries which are the same. This can reduce drastically the number of comparisons.
$endgroup$
add a comment |
$begingroup$
If by "number" you allow quaternions then of course to the vector $v=(v_1,v_2,v_3,v_4)$ you can associate $v_1+iv_2+jv_3+kv_4$, but that is not helpful at all...
If you know for instance that the entries are integers, then you can send $v$ to $v_1+sqrt2 v_2+sqrt3 v_3 + sqrt5 v_4$ and this would be a bijection.
You have to consider if you are interested in $v=w$ or rather $|v-w|<10^{-8}$, because the two "equalities" are quite different to test. For instance, you cannot hash the vectors in order to test the second one.
Otherwise, if you think about it, the bit representation of the floating points for the components of $v$ are already a number associated to the vector. You just associate to $v$ the binary number obtained from the concatenation of the binary representations of $v_1,dots,v_4$.
Given that the vectors are very short (only 4 entries), I don't see any real benefit of attempting anything different from pairwise comparison.
You can use something like int memcmp(const void *s1, const void *s2, size_t n);
(manpage) for that or trivially code your own.
If you know something more about the spacial distribution of your vectors, maybe you can try something like hextrees in order to reduce the number of necessary comparisons.
See also here for ideas on how to partition your vectors.
For instance, given $winmathbb R^4$, you can partition your vector in the two families $W_-={v:vcdot w<0}$ and $W_+={v:vcdot w>0}$. Now you only need to check for identity among the two families and not across them. So if you are able to find a good $w$ that splits your vectors in two families approximately equinumerous, you pass from $n^2$ tests to $n^2/2$.
Edit Taking further the previous idea of computing $wcdot v$,
you can in fact compute 10x10
matrix of these scalar product and then have to worry only about entries which are the same. This can reduce drastically the number of comparisons.
$endgroup$
add a comment |
$begingroup$
If by "number" you allow quaternions then of course to the vector $v=(v_1,v_2,v_3,v_4)$ you can associate $v_1+iv_2+jv_3+kv_4$, but that is not helpful at all...
If you know for instance that the entries are integers, then you can send $v$ to $v_1+sqrt2 v_2+sqrt3 v_3 + sqrt5 v_4$ and this would be a bijection.
You have to consider if you are interested in $v=w$ or rather $|v-w|<10^{-8}$, because the two "equalities" are quite different to test. For instance, you cannot hash the vectors in order to test the second one.
Otherwise, if you think about it, the bit representation of the floating points for the components of $v$ are already a number associated to the vector. You just associate to $v$ the binary number obtained from the concatenation of the binary representations of $v_1,dots,v_4$.
Given that the vectors are very short (only 4 entries), I don't see any real benefit of attempting anything different from pairwise comparison.
You can use something like int memcmp(const void *s1, const void *s2, size_t n);
(manpage) for that or trivially code your own.
If you know something more about the spacial distribution of your vectors, maybe you can try something like hextrees in order to reduce the number of necessary comparisons.
See also here for ideas on how to partition your vectors.
For instance, given $winmathbb R^4$, you can partition your vector in the two families $W_-={v:vcdot w<0}$ and $W_+={v:vcdot w>0}$. Now you only need to check for identity among the two families and not across them. So if you are able to find a good $w$ that splits your vectors in two families approximately equinumerous, you pass from $n^2$ tests to $n^2/2$.
Edit Taking further the previous idea of computing $wcdot v$,
you can in fact compute 10x10
matrix of these scalar product and then have to worry only about entries which are the same. This can reduce drastically the number of comparisons.
$endgroup$
If by "number" you allow quaternions then of course to the vector $v=(v_1,v_2,v_3,v_4)$ you can associate $v_1+iv_2+jv_3+kv_4$, but that is not helpful at all...
If you know for instance that the entries are integers, then you can send $v$ to $v_1+sqrt2 v_2+sqrt3 v_3 + sqrt5 v_4$ and this would be a bijection.
You have to consider if you are interested in $v=w$ or rather $|v-w|<10^{-8}$, because the two "equalities" are quite different to test. For instance, you cannot hash the vectors in order to test the second one.
Otherwise, if you think about it, the bit representation of the floating points for the components of $v$ are already a number associated to the vector. You just associate to $v$ the binary number obtained from the concatenation of the binary representations of $v_1,dots,v_4$.
Given that the vectors are very short (only 4 entries), I don't see any real benefit of attempting anything different from pairwise comparison.
You can use something like int memcmp(const void *s1, const void *s2, size_t n);
(manpage) for that or trivially code your own.
If you know something more about the spacial distribution of your vectors, maybe you can try something like hextrees in order to reduce the number of necessary comparisons.
See also here for ideas on how to partition your vectors.
For instance, given $winmathbb R^4$, you can partition your vector in the two families $W_-={v:vcdot w<0}$ and $W_+={v:vcdot w>0}$. Now you only need to check for identity among the two families and not across them. So if you are able to find a good $w$ that splits your vectors in two families approximately equinumerous, you pass from $n^2$ tests to $n^2/2$.
Edit Taking further the previous idea of computing $wcdot v$,
you can in fact compute 10x10
matrix of these scalar product and then have to worry only about entries which are the same. This can reduce drastically the number of comparisons.
edited Dec 12 '18 at 18:24
answered Dec 12 '18 at 17:53
FedericoFederico
5,034514
5,034514
add a comment |
add a comment |
$begingroup$
What you are looking for is a hash function in the general sense. Let a vector be $(a_0,a_1,a_2,a_3)$. If you know all the components are positive and less than some number $b$, you can just compute $a_0+ba_1+b^2a_2+b^3a_3$ and compare these. That gets you what you want, a unique number for each vector as long as they are what you expect. Whether it is worth computing this for each vector is up to you. I would be surprised if this is the slow point for your code. Of course, if $b$ is the size of the largest number that fits in an int you haven't saved any storage at all.
If you don't know the size of your vectors but all the components are positive whole numbers you can use the Cantor Pairing function. You can find many mentions of it on the site. You can pair $a_0 $ and $a_1$, $a_2$ and $a_3$ and finally the two pairs. This is more computation but handles numbers of any size. If you have negative numbers you need to make them positive some what or adapt the function.
$endgroup$
$begingroup$
That's wrong. Hashing does not give a unique number for a row. No such function can exist if the size of vector is greater then the size of the hash.
$endgroup$
– freakish
Dec 13 '18 at 7:55
$begingroup$
@freakish: hashing as a general term, does apply in the sense of taking many numbers and making one. The base representation and the pairing function are included. They will give unique results when given the proper input.
$endgroup$
– Ross Millikan
Dec 13 '18 at 14:58
add a comment |
$begingroup$
What you are looking for is a hash function in the general sense. Let a vector be $(a_0,a_1,a_2,a_3)$. If you know all the components are positive and less than some number $b$, you can just compute $a_0+ba_1+b^2a_2+b^3a_3$ and compare these. That gets you what you want, a unique number for each vector as long as they are what you expect. Whether it is worth computing this for each vector is up to you. I would be surprised if this is the slow point for your code. Of course, if $b$ is the size of the largest number that fits in an int you haven't saved any storage at all.
If you don't know the size of your vectors but all the components are positive whole numbers you can use the Cantor Pairing function. You can find many mentions of it on the site. You can pair $a_0 $ and $a_1$, $a_2$ and $a_3$ and finally the two pairs. This is more computation but handles numbers of any size. If you have negative numbers you need to make them positive some what or adapt the function.
$endgroup$
$begingroup$
That's wrong. Hashing does not give a unique number for a row. No such function can exist if the size of vector is greater then the size of the hash.
$endgroup$
– freakish
Dec 13 '18 at 7:55
$begingroup$
@freakish: hashing as a general term, does apply in the sense of taking many numbers and making one. The base representation and the pairing function are included. They will give unique results when given the proper input.
$endgroup$
– Ross Millikan
Dec 13 '18 at 14:58
add a comment |
$begingroup$
What you are looking for is a hash function in the general sense. Let a vector be $(a_0,a_1,a_2,a_3)$. If you know all the components are positive and less than some number $b$, you can just compute $a_0+ba_1+b^2a_2+b^3a_3$ and compare these. That gets you what you want, a unique number for each vector as long as they are what you expect. Whether it is worth computing this for each vector is up to you. I would be surprised if this is the slow point for your code. Of course, if $b$ is the size of the largest number that fits in an int you haven't saved any storage at all.
If you don't know the size of your vectors but all the components are positive whole numbers you can use the Cantor Pairing function. You can find many mentions of it on the site. You can pair $a_0 $ and $a_1$, $a_2$ and $a_3$ and finally the two pairs. This is more computation but handles numbers of any size. If you have negative numbers you need to make them positive some what or adapt the function.
$endgroup$
What you are looking for is a hash function in the general sense. Let a vector be $(a_0,a_1,a_2,a_3)$. If you know all the components are positive and less than some number $b$, you can just compute $a_0+ba_1+b^2a_2+b^3a_3$ and compare these. That gets you what you want, a unique number for each vector as long as they are what you expect. Whether it is worth computing this for each vector is up to you. I would be surprised if this is the slow point for your code. Of course, if $b$ is the size of the largest number that fits in an int you haven't saved any storage at all.
If you don't know the size of your vectors but all the components are positive whole numbers you can use the Cantor Pairing function. You can find many mentions of it on the site. You can pair $a_0 $ and $a_1$, $a_2$ and $a_3$ and finally the two pairs. This is more computation but handles numbers of any size. If you have negative numbers you need to make them positive some what or adapt the function.
answered Dec 12 '18 at 18:42
Ross MillikanRoss Millikan
295k23198371
295k23198371
$begingroup$
That's wrong. Hashing does not give a unique number for a row. No such function can exist if the size of vector is greater then the size of the hash.
$endgroup$
– freakish
Dec 13 '18 at 7:55
$begingroup$
@freakish: hashing as a general term, does apply in the sense of taking many numbers and making one. The base representation and the pairing function are included. They will give unique results when given the proper input.
$endgroup$
– Ross Millikan
Dec 13 '18 at 14:58
add a comment |
$begingroup$
That's wrong. Hashing does not give a unique number for a row. No such function can exist if the size of vector is greater then the size of the hash.
$endgroup$
– freakish
Dec 13 '18 at 7:55
$begingroup$
@freakish: hashing as a general term, does apply in the sense of taking many numbers and making one. The base representation and the pairing function are included. They will give unique results when given the proper input.
$endgroup$
– Ross Millikan
Dec 13 '18 at 14:58
$begingroup$
That's wrong. Hashing does not give a unique number for a row. No such function can exist if the size of vector is greater then the size of the hash.
$endgroup$
– freakish
Dec 13 '18 at 7:55
$begingroup$
That's wrong. Hashing does not give a unique number for a row. No such function can exist if the size of vector is greater then the size of the hash.
$endgroup$
– freakish
Dec 13 '18 at 7:55
$begingroup$
@freakish: hashing as a general term, does apply in the sense of taking many numbers and making one. The base representation and the pairing function are included. They will give unique results when given the proper input.
$endgroup$
– Ross Millikan
Dec 13 '18 at 14:58
$begingroup$
@freakish: hashing as a general term, does apply in the sense of taking many numbers and making one. The base representation and the pairing function are included. They will give unique results when given the proper input.
$endgroup$
– Ross Millikan
Dec 13 '18 at 14:58
add a comment |
Thanks for contributing an answer to Mathematics Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3037000%2fis-there-a-mathematical-operation-to-ensure-that-two-vectors-are-perfectly-equal%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
If you assign some unique value to each vector then you have to compare those values. For a computer comparing a vector (as in sequence of bytes) and comparing a number is pretty much the same thing. I don't see any gain in doing that (except for a given language limitations). If you don't know anything about the matrix then the only sensible thing is to compare row by row. This can be parallelized though (even on gpu).
$endgroup$
– freakish
Dec 12 '18 at 17:51
$begingroup$
However if you expect that those matrices are often not going to be perfectly similar (which, as I've mentioned is something you have to know a priori) and those vectors are big then I guess hashing is an option. Although it can give you false positives.
$endgroup$
– freakish
Dec 12 '18 at 17:54
$begingroup$
A matrix is two-dimensional. You're talking about a three-dimensional array. It's not any different from determining if two members of a one-dimensional array are identical. The fastest thing to do, so far as I know, is to sort the array and then compare adjacent elements. (This is on a single CPU.)
$endgroup$
– saulspatz
Dec 12 '18 at 17:54
$begingroup$
you are using a computer so you can write a function that converts a vector with four components into a string with separators: $"v1;v2;v3;v4"$ and make comparison based on the results of this function
$endgroup$
– Vasya
Dec 12 '18 at 17:59
$begingroup$
@Vasya that's a really bad idea. you are increasing the size of the data to parse, not reducing it
$endgroup$
– Federico
Dec 12 '18 at 18:04