Softmax Function

The softmax function is a mathematical function commonly used in machine learning, particularly in neural networks for multi-class classification tasks.

It transforms a vector of raw output scores (often referred to as logits) from the final layer of a neural network into a probability distribution over multiple classes.

The softmax function is defined mathematically as follows:

$$\mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$$

Where:

  • z is the input vector of raw scores (logits) from the neural network.
  • e is the base of the natural logarithm (approximately 2.718).
  • z_i is the raw score for class i.
  • The denominator sums the exponentials of all class scores, normalizing the outputs so that they sum to 1.
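
As a concrete illustration, here is a minimal NumPy sketch of the formula above (the helper name `softmax` is our own, not taken from any particular library). Subtracting the maximum score before exponentiating is a standard numerical-stability trick; it leaves the result unchanged because softmax is invariant to adding a constant to every component.

```python
import numpy as np

def softmax(z):
    """Map a vector of raw scores (logits) to a probability distribution."""
    z = np.asarray(z, dtype=float)
    # Subtracting the max is mathematically a no-op for softmax,
    # but it prevents overflow when exponentiating large scores.
    shifted = z - np.max(z)
    exps = np.exp(shifted)
    return exps / np.sum(exps)
```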

Properties:

  1. The outputs are always between 0 and 1.
  2. The sum of all outputs equals 1.
  3. Each output can be interpreted as the probability that the input belongs to the corresponding class.
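
Continuing the sketch above, a quick check illustrates these properties on a sample logit vector (the specific numbers are only an example):

```python
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
print(probs)        # approximately [0.659, 0.242, 0.099]: each value lies in (0, 1)
print(probs.sum())  # 1.0 (up to floating-point rounding)
```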