
Recurrent Neural Networks for Prediction: P8

Pages: 14      File type: pdf      Size: 203.97 KB


Document information:

Data-Reusing Adaptive Learning Algorithms. In this chapter, a class of data-reusing learning algorithms for recurrent neural networks is analysed. This is achieved starting from a case of feedforward neurons, through to the case of networks with feedback, trained with gradient descent learning algorithms. It is shown that the class of data-reusing algorithms outperforms the standard (a priori) algorithms for nonlinear adaptive filtering in terms of the instantaneous prediction error.
Content extracted from the document:
Recurrent Neural Networks for Prediction
Authored by Danilo P. Mandic, Jonathon A. Chambers
Copyright (c) 2001 John Wiley & Sons Ltd
ISBNs: 0-471-49517-4 (Hardback); 0-470-84535-X (Electronic)

8 Data-Reusing Adaptive Learning Algorithms

8.1 Perspective

In this chapter, a class of data-reusing learning algorithms for recurrent neural networks is analysed. This is achieved starting from a case of feedforward neurons, through to the case of networks with feedback, trained with gradient descent learning algorithms. It is shown that the class of data-reusing algorithms outperforms the standard (a priori) algorithms for nonlinear adaptive filtering in terms of the instantaneous prediction error. The relationships between the a priori and a posteriori errors, learning rate and the norm of the input vector are derived in this context.

8.2 Introduction

The so-called a posteriori error estimates provide us with, roughly speaking, some information after computation. From a practical point of view, they are valuable and useful, since real-life problems are often nonlinear, large, ill-conditioned, unstable or have multiple solutions and singularities (Hlavacek and Krizek 1998). The a posteriori error estimators are local in a computational sense, and the computational complexity of a posteriori error estimators should be far less expensive than the computation of an exact numerical solution of the problem. An account of the essence of a posteriori techniques is given in Appendix F.

In the area of linear adaptive filters, the most comprehensive overviews of a posteriori techniques can be found in Treichler (1987) and Ljung and Soderstrom (1983). These techniques are also known as data-reusing techniques (Douglas and Rupp 1997; Roy and Shynk 1989; Schnaufer and Jenkins 1993; Sheu et al. 1992). The quality of an a posteriori error estimator is often measured by its efficiency index, i.e. the ratio of the estimated error to the true error. It has been shown that the a posteriori approach in the neural network framework introduces a kind of normalisation of the employed learning algorithm (Mandic and Chambers 1998c). Consequently, it is expected that the instantaneous a posteriori output error \bar{e}(k) is smaller in magnitude than the corresponding a priori error e(k) for a non-expansive nonlinearity \Phi (Mandic and Chambers 1998c; Treichler 1987).

8.2.1 Towards an A Posteriori Nonlinear Predictor

To obtain an a posteriori RNN-based nonlinear predictor, let us, for simplicity, consider a NARMA recurrent perceptron, the output of which can be expressed as

    y(k) = \Phi(u^T(k) w(k)),                                               (8.1)

where the information vector

    u(k) = [x(k-1), ..., x(k-M), 1, y(k-1), ..., y(k-N)]^T                  (8.2)

comprises both the external input and feedback signals. As the updated weight vector w(k+1) is available before the arrival of the next input vector u(k+1), an a posteriori output estimate \bar{y}(k) can be formed as

    \bar{y}(k) = \Phi(u^T(k) w(k+1)).                                       (8.3)

The corresponding instantaneous a priori and a posteriori errors at the output neuron of a neural network are given, respectively, as

    e(k) = d(k) - y(k)                  (a priori error),                   (8.4)
    \bar{e}(k) = d(k) - \bar{y}(k)      (a posteriori error),               (8.5)

where d(k) is some teaching signal. The a posteriori outputs (8.3) can be used to form an a posteriori information vector

    \bar{u}(k) = [x(k-1), ..., x(k-M), 1, \bar{y}(k-1), ..., \bar{y}(k-N)]^T,   (8.6)

which can replace the a priori information vector (8.2) in the output (8.3) and weight update calculations (6.43)-(6.45).
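To make the quantities in (8.1)-(8.5) concrete, the following is a minimal Python sketch of one time instant of a NARMA recurrent perceptron, computing the a priori and a posteriori outputs and errors. It is illustrative only and not taken from the book: the function name, the argument layout, and the choice of tanh as a non-expansive nonlinearity \Phi are assumptions.

    import numpy as np

    def perceptron_errors(x_past, y_past, w_k, w_k1, d_k, phi=np.tanh):
        # Information vector u(k) = [x(k-1),...,x(k-M), 1, y(k-1),...,y(k-N)]^T, as in (8.2).
        u = np.concatenate((np.asarray(x_past, dtype=float), [1.0],
                            np.asarray(y_past, dtype=float)))
        y_apriori = phi(u @ w_k)     # a priori output (8.1), using w(k)
        y_apost = phi(u @ w_k1)      # a posteriori output (8.3), using the updated weights w(k+1)
        e_apriori = d_k - y_apriori  # a priori error (8.4)
        e_apost = d_k - y_apost      # a posteriori error (8.5)
        return e_apriori, e_apost

    # Hypothetical usage with M = 3 external inputs and N = 2 feedback terms (weights arbitrary):
    # e1, e2 = perceptron_errors([0.2, -0.1, 0.4], [0.3, 0.1],
    #                            np.zeros(6), 0.05 * np.ones(6), d_k=0.25)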
Replacing the a priori information vector (8.2) with its a posteriori counterpart (8.6) also results in greater accuracy (Ljung and Soderstrom 1983). An alternative representation of such an algorithm is the so-called a posteriori error gradient descent algorithm (Ljung and Soderstrom 1983; Treichler 1987), explained later in this chapter.

A simple data-reusing algorithm for linear adaptive filters

The procedure of calculating the instantaneous error, output and weight update may be repeated a number of times, keeping the same external input vector x(k) and teaching signal d(k), which results in improved error estimation. Let us consider such a data-reusing LMS algorithm for FIR adaptive filters, described by (Mandic and Chambers 2000e)

    e_i(k) = d(k) - x^T(k) w_i(k),
    w_{i+1}(k) = w_i(k) + \eta e_i(k) x(k),                                 (8.7)
    subject to |e_{i+1}(k)| \leq \gamma |e_i(k)|,  0 < \gamma < 1,  i = 1, ..., L.

[Figure: Data-reusing algorithms, linear case; input: speech signal recording]
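A minimal Python sketch of one time instant of the data-reusing LMS iteration (8.7) is given below. It is illustrative only: the function name and the default values of \eta, \gamma and L are assumptions, and the stopping rule is one possible reading of the constraint |e_{i+1}(k)| \leq \gamma |e_i(k)|, namely that reuse continues only while the error keeps shrinking by at least the factor \gamma.

    import numpy as np

    def data_reusing_lms_step(x, d, w, eta=0.01, L=4, gamma=0.9):
        # One time instant k: the input vector x(k) and teaching signal d(k)
        # are reused up to L times with the same learning rate eta.
        e = d - x @ w                         # e_1(k) = d(k) - x^T(k) w_1(k)
        for _ in range(L):
            w = w + eta * e * x               # w_{i+1}(k) = w_i(k) + eta * e_i(k) * x(k)
            e_prev, e = e, d - x @ w          # e_{i+1}(k), recomputed with the same x(k), d(k)
            if abs(e) > gamma * abs(e_prev):  # stop reusing once the reduction condition fails
                break
        return w, e

With L = 1 the sketch performs a single weight update per sample, as in the standard (a priori) LMS algorithm; larger L reuses the same data pair to refine the estimate.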
