## 1. Introduction

1 Overview

The BP (Back Propagation) neural network was proposed in 1986 by a research group led by Rumelhart and McClelland; see their paper "Learning representations by back-propagating errors", published in Nature.

A BP neural network is a multi-layer feedforward network trained by the error back-propagation algorithm, and it is one of the most widely used neural network models. A BP network can learn and store a large number of input-output mappings without any prior knowledge of the mathematical equations describing those mappings. Its learning rule is to use steepest descent to continuously adjust the weights and thresholds of the network through back propagation, minimizing the network's sum of squared errors.

2 The basic idea of the BP algorithm

Last time we saw that the multilayer perceptron hit a bottleneck: how to obtain the weights of the hidden layer. Since we cannot obtain the hidden-layer weights directly, can we adjust them indirectly, using the error between the actual output and the expected output observed at the output layer? The BP algorithm is designed around exactly this idea. Its basic idea is that the learning process consists of two phases: forward propagation of the signal and backward propagation of the error.

In forward propagation, input samples enter at the input layer, are processed layer by layer through the hidden layers, and are passed on to the output layer. If the actual output of the output layer differs from the expected output (the teacher signal), learning enters the error back-propagation stage.

In back propagation, the output error is transmitted back toward the input layer, layer by layer through the hidden layers, and is apportioned to all the units of each layer. The resulting per-unit error signal is then used as the basis for correcting each unit's weights.

The specific procedures of these two processes will be introduced later.

The signal flow diagram of the BP algorithm is shown in the following figure.

3 BP network characteristic analysis - the three BP elements

When we analyze an ANN, we usually start with its three elements, namely:

1) network topology;

2) transfer function;

3) learning algorithm.

Together, the characteristics of these elements determine the functional characteristics of the ANN. Therefore, we also study the BP network in terms of these three elements.

3.1 Topological structure of the BP network

As mentioned last time, a BP network is really a multilayer perceptron, so its topology is the same as that of a multilayer perceptron. Since single-hidden-layer (three-layer) perceptrons can already solve simple nonlinear problems, they are the most widely used. The topology of the three-layer perceptron is shown in the figure below.

One of the simplest three-layer BP networks:

3.2 Transfer function of the BP network

The transfer function used by a BP network is a nonlinear transformation function: the sigmoid function (also known as the S function). Both the function itself and its derivative are continuous, which makes it very convenient to work with. Why this function is chosen will be explained further when we introduce the learning algorithm of the BP network.

The unipolar sigmoid function curve is shown in the figure below.

The bipolar sigmoid function curve is shown in the figure below.
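As a concrete reference, the two curves can be defined in a few lines; this is a Python/NumPy sketch (the function names are ours, not from the source), which also notes why these functions suit BP: their derivatives are simple expressions of their own outputs.

```python
import numpy as np

def unipolar_sigmoid(x):
    """Unipolar (logistic) sigmoid: output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def bipolar_sigmoid(x):
    """Bipolar sigmoid (tanh-shaped): output in (-1, 1)."""
    return (1.0 - np.exp(-x)) / (1.0 + np.exp(-x))

# Both derivatives are expressible in terms of the function's own output,
# which is what makes them convenient in the BP weight-update rule:
#   unipolar: f'(x) = f(x) * (1 - f(x))
#   bipolar:  f'(x) = (1 - f(x)**2) / 2
print(unipolar_sigmoid(0.0))  # 0.5
print(bipolar_sigmoid(0.0))   # 0.0
```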

3.3 The learning algorithm of the BP network

The learning algorithm of the BP network is the BP algorithm, also called the δ algorithm (in studying ANNs we will find many terms with multiple names). Taking the three-layer perceptron as an example: when the network output is not equal to the expected output, there is an output error E, defined as follows.

Next, we will introduce the specific process of BP network learning and training.

4 Training decomposition of BP network

Training a BP neural network really means adjusting two sets of parameters: the network's weights and biases. The training process of a BP neural network has two parts:

forward transmission, in which the output value is propagated layer by layer;

reverse feedback, in which the weights and biases are adjusted layer by layer in the reverse direction.

Let's look at forward transmission first.

Forward transmission (feed-forward)

Before training the network, we need to initialize the weights and biases randomly: each weight takes a random real number in [-1, 1], and each bias takes a random real number in [0, 1]. Then forward transmission begins.

Training a neural network takes many iterations. Each iteration uses all the records in the training set, but each individual training step uses only one record. In the abstract:

```
while termination conditions are not met:
    for record in dataset:
        trainModel(record)
```
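As a concrete illustration of one forward pass, here is a minimal Python/NumPy sketch of a single-hidden-layer network. The layer sizes reuse the n, p, q names from section 5.1, the weight and bias ranges follow the initialization described above, and the input values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n inputs, p hidden neurons, q outputs (section 5.1 names).
n, p, q = 3, 4, 2

# Random initialization as described above: weights in [-1, 1], biases in [0, 1].
W1 = rng.uniform(-1, 1, (p, n)); b1 = rng.uniform(0, 1, p)
W2 = rng.uniform(-1, 1, (q, p)); b2 = rng.uniform(0, 1, q)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x):
    """One forward pass: input -> hidden -> output, layer by layer."""
    hidden = sigmoid(W1 @ x + b1)
    output = sigmoid(W2 @ hidden + b2)
    return hidden, output

x = np.array([0.5, -0.2, 0.1])
hidden, output = forward(x)
print(output.shape)  # (2,)
```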

4.1 Backpropagation
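The error signals described in section 2 translate into concrete weight updates. Below is a minimal Python/NumPy sketch of one backpropagation step for a three-layer network, using the unipolar sigmoid (whose derivative is o(1-o), as noted in section 3.2); the layer sizes, learning rate eta, and sample values are illustrative assumptions, not the source's own code.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q, eta = 3, 4, 2, 0.5  # assumed sizes and learning rate

W1 = rng.uniform(-1, 1, (p, n)); b1 = rng.uniform(0, 1, p)
W2 = rng.uniform(-1, 1, (q, p)); b2 = rng.uniform(0, 1, q)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, d):
    """Forward pass, then propagate the error back and adjust weights/biases."""
    global W1, b1, W2, b2
    y = sigmoid(W1 @ x + b1)          # hidden-layer output
    o = sigmoid(W2 @ y + b2)          # network output
    # Output-layer error signal: (d - o) scaled by the sigmoid derivative o(1-o).
    delta_o = (d - o) * o * (1 - o)
    # Hidden-layer error signal: output deltas distributed back through W2.
    delta_h = (W2.T @ delta_o) * y * (1 - y)
    # Gradient-descent (delta-rule) updates on weights and biases.
    W2 += eta * np.outer(delta_o, y); b2 += eta * delta_o
    W1 += eta * np.outer(delta_h, x); b1 += eta * delta_h
    return 0.5 * np.sum((d - o) ** 2)  # sum-of-squared-errors E

x = np.array([0.1, 0.9, 0.3]); d = np.array([1.0, 0.0])
errors = [train_step(x, d) for _ in range(200)]
print(errors[0] > errors[-1])  # True: the error shrinks as training proceeds
```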

4.2 Training termination conditions

Each round of training uses all the records in the data set. When do we stop? There are two stopping conditions:

set a maximum number of iterations, e.g. stop training after 100 passes over the data set;

compute the prediction accuracy of the network on the training set, and stop training once it reaches a given threshold.
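The two stopping conditions can be combined in one loop; this sketch is ours (the `max_epochs` and `target_accuracy` values, and the toy model, are assumptions for illustration).

```python
# Sketch of the two stopping conditions: an iteration cap and an accuracy threshold.
def train(dataset, train_model, accuracy_on, max_epochs=100, target_accuracy=0.95):
    for epoch in range(max_epochs):                   # condition 1: iteration cap
        for record in dataset:                        # one pass over all records
            train_model(record)
        if accuracy_on(dataset) >= target_accuracy:   # condition 2: accuracy threshold
            return epoch + 1
    return max_epochs

# Toy usage: a stand-in "model" whose accuracy crosses the threshold on epoch 5.
state = {"epochs": 0}
dataset = [1, 2, 3]
def train_model(record): pass
def accuracy_on(ds):
    state["epochs"] += 1
    return min(1.0, state["epochs"] / 5)
print(train(dataset, train_model, accuracy_on))  # 5
```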

5 The specific process of BP network operation

5.1 Network structure

The input layer has n neurons, the hidden layer has p neurons, and the output layer has q neurons.

5.2 Variable definition

Step 9: Judging the rationality of the model

Judge whether the network error meets the requirements.

When the error reaches the preset accuracy, or the number of learning iterations exceeds the designed maximum, the algorithm ends.

Otherwise, select the next learning sample and its corresponding expected output, return to step 3, and enter the next round of learning.

6 Design of the BP network

In designing a BP network, one should generally consider the number of layers, the number of neurons in each layer, the activation function, the initial values, and the learning rate. Below are some selection principles.

6.1 Number of layers

Theory has proved that a network with biases and at least one sigmoid hidden layer plus a linear output layer can approximate any rational function. Increasing the number of layers can further reduce the error and improve accuracy, but it also complicates the network. In addition, a single-layer network with only a nonlinear activation function is not worth using: any problem such a single-layer network can solve can certainly be solved by an adaptive linear network, and the adaptive linear network computes faster. For problems that can only be solved with a nonlinear function, single-layer accuracy is not high enough, and the desired result can only be achieved by increasing the number of layers.

6.2 Number of hidden layer neurons

Training accuracy can be improved by using one hidden layer and increasing its number of neurons, which is structurally much simpler than increasing the number of layers. Generally speaking, we use accuracy and training time to quantify the quality of a neural network design:

(1) When the number of neurons is too small, the network cannot learn well, the number of training iterations is relatively large, and the training accuracy is not high.

(2) When there are too many neurons, the network becomes more powerful and the accuracy higher, but the number of training iterations grows larger and overfitting may occur.

From this, we get the principle of selecting the number of neurons in the hidden layer of the neural network: on the premise that the problem can be solved, add one or two neurons to speed up the error reduction speed.
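That principle can be sketched as a simple search over candidate hidden-layer sizes; everything here (the candidate sizes, the error threshold, the toy error table, and the `pick_hidden_size` helper) is an illustrative assumption, not a method from the source.

```python
# Selection principle: keep the smallest hidden-layer size that solves the
# problem, then add a spare neuron to speed up error reduction.
def pick_hidden_size(train_error_for, candidates=(2, 3, 4, 6, 8), threshold=0.05):
    for p in candidates:              # smallest first: prefer the simplest network
        if train_error_for(p) <= threshold:
            return p + 1              # "add one or two neurons" per the principle
    return candidates[-1]

# Toy stand-in for "train a network with p hidden neurons and report its error".
toy_error = {2: 0.30, 3: 0.12, 4: 0.04, 6: 0.02, 8: 0.02}
print(pick_hidden_size(lambda p: toy_error[p]))  # 5
```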

6.3 Selection of initial weights

Generally, the initial weights are random numbers in (-1, 1). In addition, after analyzing how a two-layer network trains a function, Widrow et al. proposed a strategy of choosing the initial weight magnitude as s^(1/r), where r is the number of inputs and s is the number of neurons in the first layer.
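A short Python/NumPy sketch of both options; the exact scaling formula attributed to Widrow et al. above is reproduced as an assumption, as is the helper's name.

```python
import numpy as np

rng = np.random.default_rng(42)

def init_weights(s, r, widrow_scale=False):
    """Initial weights drawn uniformly from (-1, 1); optionally rescaled so
    their magnitude follows the s**(1/r) heuristic described in the text
    (the precise formula is an assumption)."""
    W = rng.uniform(-1, 1, (s, r))
    if widrow_scale:
        W *= s ** (1.0 / r)
    return W

W = init_weights(s=4, r=3)
print(W.shape)  # (4, 3)
```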

6.4 Learning rate

The learning rate is generally chosen between 0.01 and 0.8. A large learning rate may make the system unstable, while a small one makes convergence too slow and requires longer training. For a more complex network, different learning rates may be needed at different positions on the error surface. To reduce the number of trials and the time spent searching for a good rate, a more suitable approach is a variable, adaptive learning rate, giving the network different rates at different stages of training.
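One common adaptive scheme grows the rate while the error keeps falling and shrinks it when the error rises; this sketch and its growth/shrink factors (1.05 and 0.7) are assumptions for illustration, not values from the source.

```python
# Adaptive learning rate: reward progress, back off on regressions.
def adapt_learning_rate(eta, prev_error, new_error, grow=1.05, shrink=0.7):
    return eta * grow if new_error < prev_error else eta * shrink

eta = 0.1
eta = adapt_learning_rate(eta, prev_error=1.0, new_error=0.8)  # error fell
print(round(eta, 4))  # 0.105
eta = adapt_learning_rate(eta, prev_error=0.8, new_error=0.9)  # error rose
print(round(eta, 4))  # 0.0735
```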

6.5 Selection of the expected error

When designing the network, the expected error value should also be determined, through comparison and training, as an appropriate value; what is "appropriate" is relative to the number of hidden-layer nodes required. Under normal circumstances, two networks with different expected error values can be trained at the same time, and one of them finally chosen on the basis of overall factors.

7 Limitations of the BP network

The BP network has the following problems:

(1) Long training time: this is mainly caused by a learning rate that is too small, and can be improved with a varying or adaptive learning rate.

(2) Inability to train at all: this mainly shows up as network paralysis. To avoid it, one option is to choose smaller initial weights; another is to use a smaller learning rate.

(3) Local minima: the gradient descent method used here may converge to a local minimum; better results may be obtained by using a multilayer network or more neurons.

8 Improvement of the BP algorithm

The main goals of improving the BP algorithm are to speed up training and to avoid falling into local minima. Common improvements include the momentum-factor method, an adaptive learning rate, a varying learning rate, and shrinking (shifting) the action function. The basic idea of the momentum-factor method is to add, on top of the back-propagation update, a term proportional to the previous weight change, and then generate the new weight change according to the back-propagation rule. The adaptive-learning-rate method targets specific problems. The principle of the varying-learning-rate method is: if the sign of the derivative of the objective function with respect to a weight stays the same over several successive iterations, that weight's learning rate is increased; conversely, if the sign alternates, its learning rate is reduced. Shrinking the action function means translating it, that is, adding a constant to it.
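The momentum-factor update described above can be written in a few lines; the learning rate eta, momentum factor alpha, and gradient values here are illustrative assumptions.

```python
# Momentum-factor update: Delta(t) = -eta * grad + alpha * Delta(t-1).
# The alpha term carries the previous weight change forward, smoothing descent.
def momentum_update(w, grad, prev_delta, eta=0.1, alpha=0.9):
    delta = -eta * grad + alpha * prev_delta
    return w + delta, delta

w, prev = 0.5, 0.0
w, prev = momentum_update(w, grad=1.0, prev_delta=prev)
print(round(w, 4))   # 0.4
w, prev = momentum_update(w, grad=1.0, prev_delta=prev)
print(round(w, 4))   # 0.21  (the momentum term enlarges the second step)
```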

## 2. The source code

```
function varargout = Processing(varargin)
% PROCESSING MATLAB code for Processing.fig
% PROCESSING, by itself, creates a new PROCESSING or raises the existing
% singleton*.
%
% H = PROCESSING returns the handle to a new PROCESSING or the handle to
% the existing singleton*.
%
% PROCESSING('CALLBACK', hObject, eventData, handles, ...) calls the local
% function named CALLBACK in PROCESSING.M with the given input arguments.
%
% PROCESSING('Property', 'Value', ...) creates a new PROCESSING or raises the
% existing singleton*. Starting from the left, property value pairs are
% applied to the GUI before Processing_OpeningFcn gets called. An
% unrecognized property name or invalid value makes property application
% stop. All inputs are passed to Processing_OpeningFcn via varargin.
%
% *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one
% instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES
% Edit the above text to modify the response to help Processing
% Last Modified by GUIDE v2.5 22-May-2021 22:54:29

% Begin initialization code - DO NOT EDIT
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @Processing_OpeningFcn, ...
                   'gui_OutputFcn',  @Processing_OutputFcn, ...
                   'gui_LayoutFcn',  [], ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end
if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT
% --- Executes just before Processing is made visible.
function Processing_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject   handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles   structure with handles and user data (see GUIDATA)
% varargin  command line arguments to Processing (see VARARGIN)
% Choose default command line output for Processing
handles.output = hObject;
setappdata(handles.Processing, 'X', 0);
setappdata(handles.Processing, 'bw', 0);
% Update handles structure
guidata(hObject, handles);
% UIWAIT makes Processing wait for user response (see UIRESUME)
% uiwait(handles.figure1);
% --- Outputs from this function are returned to the command line.
function varargout = Processing_OutputFcn(hObject, eventdata, handles)
% varargout cell array for returning output args (see VARARGOUT);
% hObject   handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles   structure with handles and user data (see GUIDATA)
% Get default command line output from handles structure
varargout{1} = handles.output;
% --- Executes on button press in pushbutton12.
function pushbutton12_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton12 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% --- Executes on button press in pushbutton13.
function pushbutton13_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton13 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% --- Executes on button press in pushbutton8.
function pushbutton8_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton8 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% --- Executes on button press in pushbutton14.
function pushbutton14_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton14 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% --- Executes on button press in pushbutton15.
function pushbutton15_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton15 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% --- Executes on button press in pushbutton16.
function pushbutton16_Callback(hObject, eventdata, handles)
% hObject   handle to pushbutton16 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles   structure with handles and user data (see GUIDATA)
% --- Executes on button press in InputImage.
function InputImage_Callback(hObject, eventdata, handles)
% hObject   handle to InputImage (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles   structure with handles and user data (see GUIDATA)
[filename, pathname] = uigetfile({'*.jpg'; '*.bmp'; '*.tif'; '*.*'}, 'load image');
file = [pathname, filename];
% global S  % a global variable S to save the initial image path for subsequent restore operations
% S = file;
X = imread(file);
set(handles.imageshow1, 'HandleVisibility', 'ON');
axes(handles.imageshow1);
imshow(X);
handles.img = X;
guidata(hObject, handles);
setappdata(handles.Processing, 'X', X);
% --- Executes on button press in pushbutton9.
function pushbutton9_Callback(hObject, eventdata, handles)
% hObject   handle to pushbutton9 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles   structure with handles and user data (see GUIDATA)
% --- Executes during object creation, after setting all properties.
function imageshow1_CreateFcn(hObject, eventdata, handles)
% hObject   handle to imageshow1 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles   empty - handles not created until after all CreateFcns called
% global fname fpath see
% [fname, fpath] = uigetfile('*.*', 'open');
% see = [fpath fname];
% I = imread(see);
% axes(handles.axes1);
% imshow(I, 'notruesize');
% title('Original picture');
function B = boundaries(BW, conn, dir)
% Function inputs:
%   BW:   binary image matrix
%   conn: 4 or 8 (connectivity)
%   dir:  'cw' or 'ccw' (tracing direction)
% Returns the coordinates of the edge pixels of the connected regions in the
% binary image.
if nargin < 3
    % nargin returns the number of arguments the function was called with.
    dir = 'cw';
end
if nargin < 2
    conn = 8;
end
L = bwlabel(BW, conn);  % matrix L, same size as BW, labeling the connected components in BW
% Initialize the result cell array B
numObjects = max(L(:));  % the largest label in L, i.e. the number of connected components
if numObjects > 0
    B = {zeros(0, 2)};            % cell array with a single element
    B = repmat(B, numObjects, 1); % replicate B to numObjects-by-1
else
    B = {};
end
% Pad the image boundary with 0-valued pixels
Lp = padarray(L, [1 1], 0, 'both');
% Relate each search direction to a linear-index offset
M = size(Lp, 1);  % size(X,1) returns the number of rows
if conn == 8
    % Order is N NE E SE S SW W NW.
    offsets = [-1, M-1, M, M+1, 1, -M+1, -M, -M-1];
else
    % Order is N E S W.
    offsets = [-1, M, 1, -M];
end
% Start direction for the next search
if conn == 8
    next_search_direction_lut = [8 8 2 2 4 4 6 6];
else
    next_search_direction_lut = [4 1 2 3];
end
% Next direction to visit
if conn == 8
    next_direction_lut = [2 3 4 5 6 7 8 1];
else
    next_direction_lut = [2 3 4 1];
end
START = -1;
BOUNDARY = -2;
scratch = zeros(100, 1);
% Find candidate starting locations for boundaries.
[rr, cc] = find((Lp(2:end-1, :) > 0) & (Lp(1:end-2, :) == 0));
rr = rr + 1;
for k = 1:length(rr)
    r = rr(k);
    c = cc(k);
    if (Lp(r,c) > 0) & (Lp(r-1,c) == 0) & isempty(B{Lp(r,c)})
        % We've found the start of the next boundary. Compute its linear
        % offset, record which boundary it is, mark it, and initialize the
        % counter for the number of boundary pixels.
        idx = (c-1)*size(Lp,1) + r;
        which = Lp(idx);
        scratch(1) = idx;
        Lp(idx) = START;
        numpixels = 1;
        currentpixel = idx;
        initial_departure_direction = [];
        done = 0;
        next_search_direction = 2;
        while ~done
            % Find the next boundary pixel.
            direction = next_search_direction;
            found_next_pixel = 0;
            for kk = 1:length(offsets)  % renamed from k to avoid shadowing the outer loop
                neighbor = currentpixel + offsets(direction);
                if Lp(neighbor) ~= 0
                    % Found the next boundary pixel.
                    if (Lp(currentpixel) == START) & ...
                            isempty(initial_departure_direction)
                        % We are making the initial departure from the
                        % starting pixel.
                        initial_departure_direction = direction;
                    elseif (Lp(currentpixel) == START) & ...
                            (initial_departure_direction == direction)
                        % We are about to retrace our path.
                        % That means we're done.
                        done = 1;
                        found_next_pixel = 1;
                        break;
                    end
                    % Take the next step along the boundary.
                    next_search_direction = ...
                        next_search_direction_lut(direction);
                    found_next_pixel = 1;
                    numpixels = numpixels + 1;
                    if numpixels > size(scratch, 1)
                        % Double the scratch space.
                        scratch(2*size(scratch, 1)) = 0;
                    end
                    scratch(numpixels) = neighbor;
                    if Lp(neighbor) ~= START
                        Lp(neighbor) = BOUNDARY;
                    end
                    currentpixel = neighbor;
                    break;
                end
                direction = next_direction_lut(direction);
            end
            if ~found_next_pixel
                % If there is no next neighbor, the object must just have a
                % single pixel.
                numpixels = 2;
                scratch(2) = scratch(1);
                done = 1;
            end
        end
        % Convert linear indices to row-column coordinates and save in the
        % output cell array (subtract 1 to undo the padding offset).
        [row, col] = ind2sub(size(Lp), scratch(1:numpixels));
        B{which} = [row-1, col-1];
    end
end
if strcmp(dir, 'ccw')
    for k = 1:length(B)
        B{k} = B{k}(end:-1:1, :);
    end
end
```

## 3. Running results

## 4. Remarks

Version: 2014a