
Recurrent Neural Networks for Prediction P1

Content extracted from the document:
Recurrent Neural Networks for Prediction
Authored by Danilo P. Mandic, Jonathon A. Chambers
Copyright © 2001 John Wiley & Sons Ltd
ISBNs: 0-471-49517-4 (Hardback); 0-470-84535-X (Electronic)

1 Introduction

Artificial neural network (ANN) models have been extensively studied with the aim of achieving human-like performance, especially in the field of pattern recognition. These networks are composed of a number of nonlinear computational elements which operate in parallel and are arranged in a manner reminiscent of biological neural interconnections. ANNs are known by many names such as connectionist models, parallel distributed processing models and neuromorphic systems (Lippmann 1987). The origin of connectionist ideas can be traced back to the Greek philosopher, Aristotle, and his ideas of mental associations. He proposed some of the basic concepts, such as that memory is composed of simple elements connected to each other via a number of different mechanisms (Medler 1998).

While early work in ANNs used anthropomorphic arguments to introduce the methods and models used, today neural networks used in engineering are related to algorithms and computation and do not question how brains might work (Hunt et al. 1992). For instance, recurrent neural networks have been attractive to physicists due to their isomorphism to spin glass systems (Ermentrout 1998). The following properties of neural networks make them important in signal processing (Hunt et al. 1992): they are nonlinear systems; they enable parallel distributed processing; they can be implemented in VLSI technology; they provide learning, adaptation and data fusion of both qualitative (symbolic data from artificial intelligence) and quantitative (from engineering) data; they realise multivariable systems.

The area of neural networks is nowadays considered from two main perspectives. The first perspective is cognitive science, which is an interdisciplinary study of the mind. The second perspective is connectionism, which is a theory of information processing (Medler 1998). The neural networks in this work are approached from an engineering perspective, i.e. to make networks efficient in terms of topology, learning algorithms, ability to approximate functions and capture dynamics of time-varying systems.

From the perspective of connection patterns, neural networks can be grouped into two categories: feedforward networks, in which graphs have no loops, and recurrent networks, where loops occur because of feedback connections. Feedforward networks are static, that is, a given input can produce only one set of outputs, and hence carry no memory. In contrast, recurrent network architectures enable the information to be temporally memorised in the networks (Kung and Hwang 1998). Based on training by example, with strong support of statistical and optimisation theories (Cichocki and Unbehauen 1993; Zhang and Constantinides 1992), neural networks are becoming one of the most powerful and appealing nonlinear signal processors for a variety of signal processing applications. As such, neural networks expand signal processing horizons (Chen 1997; Haykin 1996b), and can be considered as massively interconnected nonlinear adaptive filters. Our emphasis will be on dynamics of recurrent architectures and algorithms for prediction.
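The static/dynamic distinction drawn above can be made concrete with a minimal numerical sketch. This example is not from the book; the function names, the choice of tanh as the nonlinearity Φ, and the scalar feedback weight v are illustrative assumptions.

```python
import numpy as np

def phi(v):
    """Tanh nonlinearity, one common choice for the activation Phi."""
    return np.tanh(v)

def feedforward_node(x, w):
    """Static map: a given input x always yields the same output."""
    return phi(np.dot(w, x))

def recurrent_node(x, y_prev, w, v):
    """The previous output is fed back, so the node carries memory:
    the same input can produce different outputs at different time steps."""
    return phi(np.dot(w, x) + v * y_prev)

w = np.array([0.5, -0.3])
x = np.array([1.0, 2.0])

print(feedforward_node(x, w))   # identical on every call with this x

y = 0.0
for k in range(3):              # output evolves even though x is fixed
    y = recurrent_node(x, y, w, v=0.8)
    print(y)
```

The feedback term v * y_prev is what gives the recurrent node its memory: its output depends on the entire input history, not just the current sample.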
1.1 Some Important Dates in the History of Connectionism

In the early 1940s the pioneers of the field, McCulloch and Pitts, studied the potential of the interconnection of a model of a neuron. They proposed a computational model based on a simple neuron-like element (McCulloch and Pitts 1943). Others, like Hebb, were concerned with the adaptation laws involved in neural systems. In 1949 Donald Hebb devised a learning rule for adapting the connections within artificial neurons (Hebb 1949). A period of early activity extends up to the 1960s with the work of Rosenblatt (1962) and Widrow and Hoff (1960). In 1958, Rosenblatt coined the name 'perceptron'. Based upon the perceptron (Rosenblatt 1958), he developed the theory of statistical separability. The next major development is the new formulation of learning rules by Widrow and Hoff in their Adaline (Widrow and Hoff 1960). In 1969, Minsky and Papert (1969) provided a rigorous analysis of the perceptron. The work of Grossberg in 1976 was based on biological and psychological evidence. He proposed several new architectures of nonlinear dynamical systems (Grossberg 1974) and introduced adaptive resonance theory (ART), which is a real-time ANN that performs supervised and unsupervised learning of categories, pattern classification and prediction. In 1982 Hopfield pointed out that neural networks with certain symmetries are analogues to spin glasses.

A seminal book on ANNs is by Rumelhart et al. (1986). Fukushima explored competitive learning in his biologically inspired Cognitron and Neocognitron (Fukushima 1975; Widrow and Lehr 1990). In 1971 Werbos developed a backpropagation learning algorithm, which he published in his doctoral thesis (Werbos 1974). Rumelhart et al. rediscovered this technique in 1986 (Rumelhart et al. 1986). Kohonen (1982) introduced self-organised maps for pattern recognition (Burr 1993).

1.2 The Structure of Neural Networks

In neural networks, computational models or nodes are connected through weights that are adapted during use to improve performance. The main idea is to achieve good performance via dense interconnection of simple computational elements. The simplest node provides a linear combination of N weights w1, ..., wN and N inputs x1, ..., xN, and passes the result through a nonlinearity Φ, as ...
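The extraction is cut off before the equation itself. As a hedged reconstruction from the surrounding definitions (N inputs x1, ..., xN, N weights w1, ..., wN, nonlinearity Φ), the standard form such a node computes is

$$ y = \Phi\left( \sum_{i=1}^{N} w_i x_i \right), $$

where many formulations also include a bias (threshold) term inside Φ(·); whether the truncated sentence does so cannot be determined from the extract.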
