Softmax Function

The softmax function normalizes ("squashes") a K-dimensional vector z of arbitrary real values into a K-dimensional vector of real values in the range [0, 1] that sum to 1.

The output of the softmax function can be used to represent a categorical distribution – that is, a probability distribution over K different possible outcomes, as illustrated below:

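Using a vector z of K real values:

\[ z = (z_1, z_2, \dots, z_K) \]

then the sum of e (Euler's number) raised to the power of each value of z is:

\[ \sum_{k=1}^{K} e^{z_k} \]

and the normalized value for each component j = 1, ..., K of the K-dimensional vector z is:

\[ \sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \]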
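
The normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name softmax and the sample input are chosen here for the example, and subtracting the maximum before exponentiating is an added numerical-stability step that is not part of the definition (it leaves the result unchanged because the common factor cancels in the ratio).

```python
import math

def softmax(z):
    """Return the softmax of a sequence of real numbers as a list of floats."""
    # Assumption for this sketch: shift by the maximum so math.exp never
    # overflows on large inputs. The shift cancels in the ratio below.
    m = max(z)
    # Raise e to the power of each (shifted) value of z.
    exps = [math.exp(v - m) for v in z]
    # Total of all exponentiated values.
    total = sum(exps)
    # Divide each exponentiated value by the total to normalize.
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)       # approximately [0.090, 0.245, 0.665]
print(sum(probs))  # 1.0, up to floating-point rounding
```

Each output lies between 0 and 1 and the outputs sum to 1, so the result can be read as a categorical distribution over K = 3 outcomes.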

Note that each value of z is used as a power of e (Euler's number) rather than being used directly. Exponentiation maps every real value, including negative ones, to a positive value, and it gives disproportionately more weight to the larger values of z.
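
A small comparison may help show why the exponential is used. The sketch below assumes an arbitrary sample vector with one negative entry and contrasts dividing by the plain sum (called naive here, a name chosen only for this example) with the softmax normalization.

```python
import math

# Hypothetical sample vector containing a negative value.
z = [-1.0, 0.0, 3.0]

# Naive normalization: divide each value by the plain sum of z.
# The result sums to 1 but contains a negative entry and an entry above 1,
# so it cannot be read as a probability distribution.
naive = [v / sum(z) for v in z]

# Softmax normalization: exponentiate first, then divide by the total.
exps = [math.exp(v) for v in z]
soft = [e / sum(exps) for e in exps]

print(naive)  # [-0.5, 0.0, 1.5]
print(soft)   # approximately [0.017, 0.047, 0.936]
```

The negative input becomes a small positive probability, and the largest input receives most of the total weight.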

References