Initialize the given weight and bias.
the weight to initialize
the data format of the weight, indicating the dimension order of the weight. "output_first" means the output is in the lower dimension; "input_first" means the input is in the lower dimension.
In short, Xavier initialization helps signals reach deep into the network.
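During the training of a deep neural network:
1. If the weights in a network start too small, then the signal shrinks as it passes through each layer until it’s too tiny to be useful.
2. If the weights in a network start too large, then the signal grows as it passes through each layer until it’s too massive to be useful.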
Xavier initialization makes sure the weights are ‘just right’, keeping the signal in a reasonable range of values through many layers.
For more details, see the paper [Understanding the difficulty of training deep feedforward neural networks](http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf).
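
As a rough illustration of the idea, here is a minimal Python sketch of the uniform variant of Xavier initialization described in the paper, where each weight is drawn from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)). The function name `xavier_uniform`, its parameters, and the mapping of "output_first" to a (fan_out, fan_in) layout are assumptions made for this example, not the API documented here. The zero bias is likewise just a common companion choice.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, data_format="output_first", seed=None):
    """Return a 2-D weight matrix (list of rows) drawn from U(-limit, limit).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Bound from the normalized initialization in Glorot and Bengio's paper.
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    # Assumption: "output_first" stores the weight as (fan_out, fan_in),
    # "input_first" stores it as (fan_in, fan_out).
    if data_format == "output_first":
        rows, cols = fan_out, fan_in
    elif data_format == "input_first":
        rows, cols = fan_in, fan_out
    else:
        raise ValueError("unknown data_format: " + repr(data_format))
    return [[rng.uniform(-limit, limit) for _ in range(cols)]
            for _ in range(rows)]


# A layer with 256 inputs and 128 outputs.
weights = xavier_uniform(fan_in=256, fan_out=128, seed=0)
# Assumption: the bias is simply set to zero in this sketch.
bias = [0.0] * 128
```

Because the bound shrinks as the layer gets wider, the variance of each layer's output stays roughly comparable to the variance of its input, which is what keeps the signal from vanishing or blowing up across many layers.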