1. What Are Conjugate Gradients, Levenberg-Marquardt, Etc.?
|
Answer» Training a neural network is, in most cases, an exercise in numerical optimization of a usually nonlinear objective function. ("Objective function" means whatever function you are trying to optimize; it is a slightly more general term than "error function" in that it may include other quantities such as penalties for weight decay.) Methods of nonlinear optimization have been studied for hundreds of years, and there is a huge literature on the subject in fields such as numerical analysis, operations research, and statistical computing; see, e.g., Bertsekas (1995), Bertsekas and Tsitsiklis (1996), Fletcher (1987), and Gill, Murray, and Wright (1981). Masters (1995) has a good elementary discussion of conjugate gradient and Levenberg-Marquardt algorithms in the context of NNs.
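To make the conjugate gradient idea concrete, here is a minimal sketch (not part of the sources cited above; all function names are illustrative) of the *linear* conjugate gradient method for solving A x = b with A symmetric positive definite. Nonlinear variants such as Fletcher-Reeves adapt this same step/direction-update pattern to a general error surface, which is how it appears in NN training.

```python
# Minimal linear conjugate gradient sketch (illustrative, pure Python).
# Solves A x = b for symmetric positive-definite A; the nonlinear CG
# used in NN training follows the same structure with line searches.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    n = len(b)
    x = [0.0] * n
    # Initial residual r = b - A x; first search direction is the residual
    # (i.e., the steepest-descent direction for the quadratic objective).
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
    p = list(r)
    rs_old = dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)          # exact step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new ** 0.5 < tol:
            break
        beta = rs_new / rs_old               # Fletcher-Reeves-style update
        # New direction is conjugate (A-orthogonal) to the previous ones.
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

For an n-dimensional quadratic the method terminates in at most n iterations in exact arithmetic, which is the property that motivates its use on the locally quadratic regions of an error surface.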
|