Simple Neural Network: nonlinear system of equations? [closed]
I defined a very simple neural network: 2 inputs, 1 hidden layer with 2 nodes, and one output node.
For each input pattern $\vec{x} \in \mathbb{R} \times \mathbb{R}$ and associated output $o \in \mathbb{R}$, the resulting nonlinear equation is:

$$w_{o0}\,\sigma(x_{0} W_{i00} + x_{1} W_{i10}) + w_{o1}\,\sigma(x_{0} W_{i01} + x_{1} W_{i11}) = o,$$

where $W_i$ is the $2 \times 2$ weight matrix of the input connections, $\sigma(x) = \frac{1}{1+\exp(-x)}$ is the logistic sigmoid, and $\vec{w}_o$ is the weight vector of the two connections into the output node.



Given a dataset of $n$ (pattern, output) examples, there will be $n$ such nonlinear equations in the six unknown weights.
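Written out for a dataset $(\vec{x}^{(1)}, o^{(1)}), \ldots, (\vec{x}^{(n)}, o^{(n)})$, the system is (the residual form $F_k$ is my notation, introduced only to state it compactly):

$$F_k(w) = w_{o0}\,\sigma\big(x^{(k)}_{0} W_{i00} + x^{(k)}_{1} W_{i10}\big) + w_{o1}\,\sigma\big(x^{(k)}_{0} W_{i01} + x^{(k)}_{1} W_{i11}\big) - o^{(k)} = 0, \qquad k = 1, \ldots, n,$$

that is, $n$ equations in the six unknowns $W_{i00}, W_{i10}, W_{i01}, W_{i11}, w_{o0}, w_{o1}$; for $n > 6$ the system is generically overdetermined.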



I'm asking how to find the solutions of this nonlinear system, as an alternative way to solve the learning problem without backpropagation.
I've implemented an optimizer for the stated problem; if anyone is interested, I can provide the C sources (email: fportera2@gmail.com).
nonlinear-system

asked Nov 25 at 11:22 by Filippo Portera; edited Nov 30 at 8:51

closed as too broad by the_candyman, Alexander Gruber Nov 30 at 3:13

Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question.
2 Answers
For simplicity, let us say that the output $y$ of your network is given by

$$y = f(x, w),$$

where $f$ is a nonlinear function from the space of inputs $x$ and weights $w$ to the space of outputs $y$.

In general, we cannot demand that the network learn the target exactly for each input, i.e.

Find $w$ such that $$f(x,w) = o,$$

where $o$ is the target vector.

Indeed, demanding this means solving exactly the nonlinear system of equations you reported, which is in general a very difficult problem, often without a solution.

The idea behind training a neural network is instead to minimize the distance between output and target, i.e.:

Find $w$ such that $$\|f(x,w) - o\|$$ is minimized,

where $\| \cdot \|$ is a norm.

This approach is better for two reasons:

1. In general it is easier to solve than the system $f(x,w) = o$;

2. It gives your network a certain "elasticity": you do not really want the outputs to be exactly equal to the targets, since allowing some slack is what lets the network classify new inputs not used for training.

Besides these considerations, IMHO backpropagation is just an appealing name for an optimization procedure that exploits the structure of the function $f$, inherited from the topology of the neural network. Specifically, to find the derivative of the objective function you use the chain rule, and the structure of that derivative suggests that the "error is backpropagated". But you are just computing a derivative in order to solve an optimization problem!
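To make the contrast concrete, here is a minimal C sketch (not the author's optimizer; the XOR dataset, initialization, learning rate, and iteration count are my assumptions) that minimizes $\|f(x,w)-o\|^2$ for the 2-2-1 network with plain gradient descent, using finite-difference gradients so no backpropagation machinery is needed:

```c
/*
 * Minimal sketch (not the author's code): fit the 2-2-1 network by
 * minimizing the squared residual ||f(x,w) - o||^2 with plain gradient
 * descent, using central finite differences instead of backpropagation.
 * The XOR data below is an assumed toy dataset.
 */
#include <stdio.h>
#include <math.h>

#define N 4   /* number of (pattern, output) examples */
#define P 6   /* weights: Wi00, Wi10, Wi01, Wi11, wo0, wo1 */

static const double X[N][2] = {{0,0},{0,1},{1,0},{1,1}};
static const double O[N]    = {0, 1, 1, 0};   /* targets */

static double sigmoid(double z) { return 1.0 / (1.0 + exp(-z)); }

/* Network output for one pattern, matching the equation in the question. */
static double f(const double *x, const double *w) {
    double h0 = sigmoid(x[0]*w[0] + x[1]*w[1]);  /* hidden node 0 */
    double h1 = sigmoid(x[0]*w[2] + x[1]*w[3]);  /* hidden node 1 */
    return w[4]*h0 + w[5]*h1;
}

/* Sum of squared residuals of the n nonlinear equations. */
static double loss(const double *w) {
    double s = 0.0;
    for (int i = 0; i < N; i++) {
        double r = f(X[i], w) - O[i];
        s += r * r;
    }
    return s;
}

int main(void) {
    double w[P] = {0.5, -0.4, 0.3, 0.8, -0.6, 0.7};  /* arbitrary init */
    const double lr = 0.5, h = 1e-6;

    for (int it = 0; it < 20000; it++) {
        double g[P];
        for (int j = 0; j < P; j++) {        /* central-difference gradient */
            double wp[P], wm[P];
            for (int k = 0; k < P; k++) wp[k] = wm[k] = w[k];
            wp[j] += h;
            wm[j] -= h;
            g[j] = (loss(wp) - loss(wm)) / (2.0 * h);
        }
        for (int j = 0; j < P; j++) w[j] -= lr * g[j];  /* descent step */
    }
    printf("final loss: %g\n", loss(w));
    return 0;
}
```

Note that with this architecture (no bias terms) the final loss need not reach zero, which is exactly the point above: one settles for a minimizer of the norm rather than an exact solution of the system.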
answered Nov 25 at 11:47 by the_candyman; edited Nov 25 at 11:53
• Can I ask if there is a formalization of "a very difficult problem and often without a solution"?
  – Filippo Portera
  Nov 25 at 12:07












          • @FilippoPortera Linear systems are easy. Thanks to linear algebra, everything is known about linear systems and their solutions. Nonlinear systems are not easy since there is no general theory to account for them.
            – the_candyman
            Nov 25 at 12:18










• I've followed the article google.com/… and it turns out that solving the associated nonlinear system as an optimization problem with gradient descent is much slower than backpropagation and, in addition, does not find a solution of the system.
  – Filippo Portera
  Nov 29 at 7:39












• If you want, I can give you the C sources of my optimizers.
  – Filippo Portera
  Nov 29 at 11:37



































I think there are at least some numerical solution attempts:

https://www.lakeheadu.ca/sites/default/files/uploads/77/docs/RemaniFinal.pdf

and:

http://scientificbulletin.upm.ro/papers/2011-1/Finta-The-Gradient-Method-for-Overdetermined-Nonlinear-Systems.pdf
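For concreteness, the kind of iteration the second reference studies, stated for this network (a sketch in my notation, with $F(w) = (F_1(w), \ldots, F_n(w))^{\top}$ the stacked residuals of the $n$ equations), is

$$w_{t+1} = w_t - \alpha_t\, J_F(w_t)^{\top} F(w_t),$$

where $J_F(w)$ is the $n \times 6$ Jacobian of $F$ and $\alpha_t > 0$ is a step size. This is gradient descent on $\tfrac{1}{2}\|F(w)\|_2^2$, so its stationary points need not be exact solutions of $F(w) = 0$ when the system is overdetermined.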






answered Nov 25 at 19:51 by Filippo Portera; edited Nov 29 at 7:56























• The risk is [overfitting](https://en.wikipedia.org/wiki/Overfitting). Optimization, i.e. finding weights so that a certain norm is minimized, is the better choice rather than seeking an "exact solution".
  – the_candyman
  Nov 29 at 11:39










• At present, the problem is that I'm not converging towards a solution of the nonlinear system. If you want, I can provide the code I've implemented.
  – Filippo Portera
  Nov 29 at 11:46










• Dear Filippo, if your problem is to solve a nonlinear system, then I warmly suggest you create a new question on this specific topic.
  – the_candyman
  Nov 29 at 16:59