Summary: | In an extreme learning machine (ELM), the randomly weighted inputs to the hidden layer are often mapped into the saturation region of the activation function. Moreover, the inputs and outputs of the hidden layer rarely follow a common, well-behaved distribution, which leads to poor generalization performance. To address this problem, we study an ELM that optimizes the affine transformation (AT) of the activation function under a Gaussian distribution criterion. The proposed algorithm introduces a new linear relationship for the input data in the hidden layer: gradient descent is used to optimize the scaling and translation parameters of the objective function so that the hidden-layer outputs closely follow a Gaussian distribution. Computing the affine parameters from this Gaussian criterion keeps the hidden nodes independent of one another while preserving a strong dependence on the input data. Experimental results on real classification datasets and an image regression dataset show that the hidden-layer outputs do not fit a uniform distribution well but do tend toward a Gaussian distribution, which generally yields better results. Compared with the original ELM algorithm and the AT-ELM1 algorithm, the proposed method achieves significant improvements overall.
|
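The pipeline the summary describes (random ELM input weights, gradient descent on per-node scaling and translation parameters so the hidden-layer outputs track a Gaussian target, then least-squares output weights) can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy data, the target moments (mean 0.5, std 0.15), the moment-matching loss, the learning rate, and the iteration count are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy regression data (illustrative only)
X = rng.standard_normal((200, 5))
T = rng.standard_normal((200, 1))

n_hidden = 20
W = rng.standard_normal((5, n_hidden))   # random ELM input weights (kept fixed)
Z = X @ W                                # hidden-layer pre-activations

# One (scale, shift) affine pair per hidden node: g(a * z + b)
a = np.ones(n_hidden)
b = np.zeros(n_hidden)

# Assumed Gaussian target for the hidden outputs: N(0.5, 0.15^2)
mu_t, sd_t = 0.5, 0.15
lr = 0.1
for _ in range(500):
    H = sigmoid(Z * a + b)               # (n_samples, n_hidden)
    S = H * (1.0 - H)                    # sigmoid'(x) = H * (1 - H)
    mu = H.mean(axis=0)
    sd = H.std(axis=0) + 1e-12
    # Gradients of the per-node mean and std w.r.t. a and b
    dmu_da = (S * Z).mean(axis=0)
    dmu_db = S.mean(axis=0)
    dvar_da = 2.0 * ((H * S * Z).mean(axis=0) - mu * dmu_da)
    dvar_db = 2.0 * ((H * S).mean(axis=0) - mu * dmu_db)
    dsd_da = dvar_da / (2.0 * sd)
    dsd_db = dvar_db / (2.0 * sd)
    # Moment-matching loss per node: (mu - mu_t)^2 + (sd - sd_t)^2
    ga = 2.0 * (mu - mu_t) * dmu_da + 2.0 * (sd - sd_t) * dsd_da
    gb = 2.0 * (mu - mu_t) * dmu_db + 2.0 * (sd - sd_t) * dsd_db
    a -= lr * ga
    b -= lr * gb

# Standard ELM readout: solve the output weights by least squares
H = sigmoid(Z * a + b)
beta = np.linalg.pinv(H) @ T
```

After optimization, each hidden node's output distribution has (approximately) the target mean and standard deviation, so the nodes avoid the saturation region; the output weights `beta` are then obtained in closed form via the pseudoinverse, as in a standard ELM.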