Radial Basis Function Networks: Algorithms
Introduction to Neural Networks: Lecture 13
© John A. Bullinaria, 2004

1. The RBF Mapping
2. The RBF Network Architecture
3. Computational Power of RBF Networks
4. Training an RBF Network
5. Unsupervised Optimization of the Basis Functions
6. Finding the Output Weights


The Radial Basis Function (RBF) Mapping

We are working within the standard framework of function approximation. We have a set of N data points in a multi-dimensional space such that every D dimensional input vector x^p = {x_i^p : i = 1, ..., D} has a corresponding K dimensional target output t^p = {t_k^p : k = 1, ..., K}. The target outputs will generally be generated by some underlying functions g_k(x) plus random noise. The goal is to approximate the g_k(x) with functions y_k(x) of the form

    y_k(x) = \sum_{j=0}^{M} w_{kj} \phi_j(x)

We shall concentrate on the case of Gaussian basis functions

    \phi_j(x) = \exp\left( -\frac{\| x - \mu_j \|^2}{2 \sigma_j^2} \right)

in which we have basis centres {µ_j} and widths {σ_j}. Naturally, the way to proceed is to develop a process for finding the appropriate values for M, {w_kj}, {µ_ij} and {σ_j}.
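A minimal NumPy sketch of these Gaussian basis functions may help fix the notation; the helper name gaussian_basis, the array shapes, and the toy data below are illustrative assumptions rather than part of the lecture.

    import numpy as np

    def gaussian_basis(X, centres, widths):
        """phi_j(x) = exp(-||x - mu_j||^2 / (2 sigma_j^2)) for each input row and each centre."""
        # X: (N, D) inputs, centres: (M, D) basis centres mu_j, widths: (M,) widths sigma_j.
        sq_dists = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)   # (N, M)
        return np.exp(-sq_dists / (2.0 * widths[None, :] ** 2))

    # Quick check: an input that sits exactly on a centre gives phi = 1 for that basis function.
    X = np.array([[0.0, 0.0], [1.0, 1.0]])
    centres = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
    widths = np.array([0.5, 0.5, 0.5])
    print(gaussian_basis(X, centres, widths))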


The RBF Network Architecture

We can cast the RBF mapping into a form that resembles a neural network:

[Network diagram: D input units x_i are connected to M basis-function units φ_j(x_i, µ_ij, σ_j) by the "weights" µ_ij, and the basis-function units are connected to K output units y_k by the weights w_kj.]

The hidden to output layer part operates like a standard feed-forward MLP network, with the sum of the weighted hidden unit activations giving the output unit activations. The hidden unit activations are given by the basis functions φ_j(x, µ_j, σ_j), which depend on the "weights" {µ_ij, σ_j} and input activations {x_i} in a non-standard manner.
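To make the two-layer structure explicit, here is a hedged sketch of the corresponding forward pass. It reuses the gaussian_basis helper from the previous sketch; the class name RBFNetwork and the treatment of the j = 0 term as a bias basis function phi_0 = 1 are assumptions of the sketch.

    import numpy as np  # gaussian_basis is the helper defined in the previous sketch

    class RBFNetwork:
        """D inputs -> M Gaussian basis functions -> K linear outputs."""

        def __init__(self, centres, widths, W):
            self.centres = centres   # (M, D): the input-to-hidden "weights" mu_ij
            self.widths = widths     # (M,):   the widths sigma_j
            self.W = W               # (M + 1, K): hidden-to-output weights w_kj, row 0 for the bias

        def hidden_activations(self, X):
            # Non-standard layer: activations come from the basis functions, not a weighted sum.
            Phi = gaussian_basis(X, self.centres, self.widths)        # (N, M)
            return np.hstack([np.ones((Phi.shape[0], 1)), Phi])       # prepend phi_0 = 1

        def forward(self, X):
            # Standard linear output layer: y_k(x) = sum_j w_kj phi_j(x).
            return self.hidden_activations(X) @ self.W                # (N, K)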


Computational Power of RBF Networks

Intuitively, we can easily understand why linear superpositions of localised basis functions are capable of universal approximation. More formally:

Hartman, Keeler & Kowalski (1990, Neural Computation, vol. 2, pp. 210-215) provided a formal proof of this property for networks with Gaussian basis functions in which the widths {σ_j} are treated as adjustable parameters.

Park & Sandberg (1991, Neural Computation, vol. 3, pp. 246-257; and 1993, Neural Computation, vol. 5, pp. 305-316) showed that with only mild restrictions on the basis functions, the universal function approximation property still holds.

As with the corresponding proofs for MLPs, these are existence proofs which rely on the availability of an arbitrarily large number of hidden units (i.e. basis functions). However, they do provide a theoretical foundation on which practical applications can be based with confidence.


Training RBF Networks

The proofs about computational power tell us what an RBF network can do, but nothing about how to find all its parameters/weights {w_kj, µ_ij, σ_j}.

Unlike in MLPs, in RBF networks the hidden and output layers play very different roles, and the corresponding "weights" have very different meanings and properties. It is therefore appropriate to use different learning algorithms for them.

The input to hidden "weights" (i.e. basis function parameters {µ_ij, σ_j}) can be trained (or set) using any of a number of unsupervised learning techniques. Then, after the input to hidden "weights" are found, they are kept fixed while the hidden to output weights are learned.

Since this second stage of training involves just a single layer of weights {w_kj} and linear output activation functions, the weights can easily be found analytically by solving a set of linear equations. This can be done very quickly, without the need for a set of iterative weight updates as in gradient descent learning.
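As an illustration of this second stage, the least-squares solve below finds the hidden-to-output weights with the basis functions held fixed; np.linalg.lstsq is one standard way of solving the resulting linear system, and the helper names carry over from the earlier sketches.

    def solve_output_weights(X, T, centres, widths):
        """Fit the hidden-to-output weights analytically, with the basis functions kept fixed.

        X: (N, D) inputs, T: (N, K) targets.
        Returns W of shape (M + 1, K) minimising ||Phi W - T||^2.
        """
        Phi = gaussian_basis(X, centres, widths)
        Phi = np.hstack([np.ones((Phi.shape[0], 1)), Phi])    # bias basis function phi_0 = 1
        W, *_ = np.linalg.lstsq(Phi, T, rcond=None)            # linear least squares, no iteration
        return W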


Basis Function Optimization

One major advantage of RBF networks is the possibility of choosing suitable hidden unit/basis function parameters without having to perform a full non-linear optimization of the whole network. We shall now look at several ways of doing this:

1. Fixed centres selected at random
2. Orthogonal least squares
3. K-means clustering (a sketch of this option follows below)

These are all unsupervised techniques, which will be particularly useful in situations where labelled data is in short supply, but there is plenty of unlabelled data (i.e. inputs without output targets). Next lecture we shall look at how we might get better results by performing a full supervised non-linear optimization of the network instead.

With either approach, determining a good value for M remains a problem. It will generally be appropriate to compare the results for a range of different values, following the same kind of validation/cross-validation methodology used for optimizing MLPs.
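As a concrete sketch of option 3 above, the following places the centres by K-means clustering of the unlabelled inputs; the simple Lloyd-style loop and the nearest-neighbour width heuristic are assumptions of the sketch, not prescriptions from this slide.

    def kmeans_centres(X, M, n_iters=100, seed=0):
        """Choose M basis centres by K-means clustering of the (unlabelled) inputs X."""
        rng = np.random.default_rng(seed)
        centres = X[rng.choice(len(X), size=M, replace=False)].copy()   # start from random data points
        for _ in range(n_iters):
            # Assign every input to its nearest centre, then move each centre to the mean of its inputs.
            d = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            for j in range(M):
                if np.any(labels == j):
                    centres[j] = X[labels == j].mean(axis=0)
        # One common heuristic for the widths: the distance from each centre to its nearest neighbour.
        cc = np.sqrt(((centres[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2))
        np.fill_diagonal(cc, np.inf)
        widths = cc.min(axis=1)
        return centres, widths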
