Recurrent Neural Networks for Prediction, Part 9
Number of pages: 12
File type: pdf
File size: 419.39 KB
Document information:
A Class of Normalised Algorithms for Online Training of Recurrent Neural Networks

A normalised version of the real-time recurrent learning (RTRL) algorithm is introduced. This has been achieved via local linearisation of the RTRL around the current point in the state space of the network. Such an algorithm provides an adaptive learning rate normalised by the L2 norm of the gradient vector at the output neuron. The analysis is general and also covers simpler cases of feedforward networks and linear FIR filters...
Content extracted from the document:
Recurrent Neural Networks for Prediction
Authored by Danilo P. Mandic, Jonathon A. Chambers
Copyright © 2001 John Wiley & Sons Ltd
ISBNs: 0-471-49517-4 (Hardback); 0-470-84535-X (Electronic)

9 A Class of Normalised Algorithms for Online Training of Recurrent Neural Networks

9.1 Perspective

A normalised version of the real-time recurrent learning (RTRL) algorithm is introduced. This has been achieved via local linearisation of the RTRL around the current point in the state space of the network. Such an algorithm provides an adaptive learning rate normalised by the $L_2$ norm of the gradient vector at the output neuron. The analysis is general and also covers the simpler cases of feedforward networks and linear FIR filters.

9.2 Introduction

Gradient-descent-based algorithms for training neural networks, such as the backpropagation, backpropagation through time, recurrent backpropagation (RBP) and real-time recurrent learning (RTRL) algorithms, typically suffer from slow convergence when dealing with statistically nonstationary inputs. In the area of linear adaptive filters, similar problems with the LMS algorithm have been addressed by utilising normalised algorithms, such as NLMS. We therefore introduce a normalised RTRL-based learning algorithm, with the idea of imposing on the training of RNNs the same stabilisation and convergence effects that normalisation imposes on the LMS algorithm.

In the area of linear FIR adaptive filters, it is shown (Soria-Olivas et al. 1998) that a normalised gradient-descent-based learning algorithm can be derived starting from the Taylor series expansion of the instantaneous output error of an adaptive FIR filter, given by

$$e(k+1) = e(k) + \sum_{i=1}^{N} \frac{\partial e(k)}{\partial w_i(k)}\,\Delta w_i(k) + \frac{1}{2!}\sum_{i=1}^{N}\sum_{j=1}^{N} \frac{\partial^2 e(k)}{\partial w_i(k)\,\partial w_j(k)}\,\Delta w_i(k)\,\Delta w_j(k) + \cdots \tag{9.1}$$

From the mathematical description of LMS¹ from Chapter 2, we have

$$\frac{\partial e(k)}{\partial w_i(k)} = -x(k-i+1), \qquad i = 1, 2, \ldots, N, \tag{9.2}$$

and

$$\Delta w_i(k) = \mu(k)\,e(k)\,x(k-i+1), \qquad i = 1, 2, \ldots, N. \tag{9.3}$$

Due to the linearity of the FIR filter, the second- and higher-order partial derivatives in (9.1) vanish. Combining (9.1)–(9.3) yields

$$e(k+1) = e(k) - \mu(k)\,e(k)\,\|x(k)\|_2^2, \tag{9.4}$$

for which the nontrivial solution gives the learning rate of a normalised LMS algorithm,

$$\mu_{\mathrm{NLMS}}(k) = \frac{1}{\|x(k)\|_2^2}. \tag{9.5}$$

The stability analysis of adaptive algorithms can be undertaken using contractive operators and fixed point iteration. For a contractive operator $T$, it follows that

$$\|T z_1 - T z_2\| \leq \gamma\,\|z_1 - z_2\|, \qquad 0 \leq \gamma < 1, \quad z_1, z_2 \in \mathbb{R}^N. \tag{9.6}$$

The convergence analysis of LMS, for instance, can be undertaken starting from the misalignment² vector $v(k) = w(k) - \tilde{w}(k)$, by setting $z_1 = v(k+1)$, $z_2 = v(0)$ and $T = [I - \mu(k)\,x(k)\,x^{\mathrm{T}}(k)]$ (Gholkar 1990). Detailed convergence analysis for a class of gradient-based learning algorithms for recurrent neural networks is given in Chapter 10.

9.3 Overview

A class of normalised gradient-based algorithms is derived, starting from the LMS algorithm for linear adaptive filters and progressing to a normalised algorithm for training recurrent neural networks. For each case the adaptive learning rate has been derived. Stability of such algorithms is addressed in Chapter 10. The normalised algorithms are shown to outperform standard algorithms with a fixed learning rate.

¹ The two core equations for adaptation of the LMS algorithm are $e(k) = d(k) - x^{\mathrm{T}}(k)\,w(k)$ and $w(k+1) = w(k) + \mu(k)\,e(k)\,x(k)$.
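As a concrete illustration of footnote 1 and Equation (9.5), the following minimal Python sketch (our own illustration, not code from the book; the function and variable names are ours) adapts a one-step-ahead FIR predictor with both the fixed-step LMS update and the normalised step $\mu_{\mathrm{NLMS}}(k) = 1/\|x(k)\|_2^2$. A small regulariser `eps` is added to the denominator as a common numerical safeguard that is not part of (9.5).

```python
import numpy as np

def fir_predict(x, order=4, mu_fixed=0.05, eps=1e-8):
    """One-step-ahead prediction of a signal x with an order-N FIR filter,
    adapted by LMS (fixed step) and by NLMS (adaptive step of Eq. (9.5))."""
    N = order
    w_lms, w_nlms = np.zeros(N), np.zeros(N)
    e_lms, e_nlms = np.zeros(len(x)), np.zeros(len(x))
    for k in range(N, len(x)):
        xk = x[k - N:k][::-1]   # tap-input vector [x(k-1), ..., x(k-N)]
        d = x[k]                # desired response: the next sample
        # LMS (footnote 1): e(k) = d(k) - x^T(k)w(k), w(k+1) = w(k) + mu e(k) x(k)
        e_lms[k] = d - w_lms @ xk
        w_lms = w_lms + mu_fixed * e_lms[k] * xk
        # NLMS: mu(k) = 1/||x(k)||_2^2, Eq. (9.5); eps guards against a zero input
        e_nlms[k] = d - w_nlms @ xk
        w_nlms = w_nlms + (e_nlms[k] / (xk @ xk + eps)) * xk
    return e_lms, e_nlms

# Example usage on a coloured AR(1) signal, whose input power varies over time
rng = np.random.default_rng(42)
x = np.zeros(1000)
for k in range(1, len(x)):
    x[k] = 0.9 * x[k - 1] + rng.standard_normal()
e_lms, e_nlms = fir_predict(x)
```

Because the normalised step divides by the instantaneous input power, the NLMS update is invariant to the scaling of the input, which is precisely the stabilising effect the chapter seeks to transfer to RTRL-trained networks.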
² The misalignment vector is defined as $v(k) = w(k) - \tilde{w}(k)$, where $\tilde{w}(k)$ is the set of optimal weights of the system.

[Figure: averaged squared prediction error in dB over 1000 iterations for the LMS, NLMS, NGD and NNGD algorithms; only the axis labels and curve legends are recoverable from the extraction.]
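To make the contraction condition (9.6) concrete for this misalignment vector, here is a small numerical sketch (our own, not from the book). In the noiseless case, $d(k) = \tilde{w}^{\mathrm{T}} x(k)$, the misalignment evolves as $v(k+1) = [I - \mu(k)\,x(k)\,x^{\mathrm{T}}(k)]\,v(k)$; with the NLMS step this operator is an orthogonal projection, so a single update can never increase $\|v(k)\|_2$ (the mapping is non-expansive, $\gamma \leq 1$, with strict contraction emerging over many sufficiently rich inputs).

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
w_opt = rng.standard_normal(N)   # optimal weights w~ of footnote 2 (assumed known here)
w = np.zeros(N)                  # adaptive weights w(k)

norms = [np.linalg.norm(w - w_opt)]          # ||v(0)||
for k in range(200):
    xk = rng.standard_normal(N)              # input vector x(k)
    d = w_opt @ xk                           # noiseless desired response
    mu = 1.0 / (xk @ xk)                     # NLMS learning rate, Eq. (9.5)
    w = w + mu * (d - w @ xk) * xk           # LMS weight update of footnote 1
    norms.append(np.linalg.norm(w - w_opt))  # ||v(k+1)||

# The misalignment norm never increases from one update to the next
assert all(b <= a + 1e-12 for a, b in zip(norms, norms[1:]))
print(f"||v(0)|| = {norms[0]:.3f}, ||v(200)|| = {norms[-1]:.2e}")
```

Turning such per-step bounds into a full convergence proof via fixed point iteration is the subject of the analysis referred to above in Chapter 10.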
Related keywords: neural networks, artificial neural network, neural network prediction