# Normalization

Normalization is a technique often applied as part of data preparation for machine learning. The goal of normalization is to change the values of numeric columns in the dataset to a common scale, without distorting differences in the ranges of values or losing information.

### Min-Max Normalization

Min-Max normalization is one of the most common ways to normalize data. For every feature, the minimum value of that feature gets transformed into a 0, the maximum value gets transformed into a 1, and every other value gets transformed into a decimal between 0 and 1.

**Process:**

`x_normalized = (x - min(x)) / (max(x) - min(x))`

**Where:**

- x_normalized is the normalized value of the feature.
- x is the original value of the feature.
- min(x) is the minimum value of the feature across the dataset.
- max(x) is the maximum value of the feature across the dataset.

**Example:**

| Sample Input | 10 | 25 | 30 |
|---|---|---|---|
| Sample Output | 0 | 0.75 | 1 |
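The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not the product's implementation: the function name `min_max_normalize` is hypothetical, and mapping a constant feature (where max equals min) to 0.0 is an assumption made here to avoid dividing by zero.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a constant feature has no spread, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


normalized = min_max_normalize([10, 25, 30])
print(normalized)  # [0.0, 0.75, 1.0]
```

Running it on the sample input reproduces the sample output row.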

### Unit Normalization

Unit normalization divides every entry in a column (feature) by the magnitude of that column, so that the resulting feature vector has length 1 (a unit vector).

**Process:**

`x_normalized = x / ||x||`

**Where:**

- x_normalized is the normalized value of the feature.
- x is the original value of the feature.
- ||x|| is the magnitude (Euclidean norm) of the feature, calculated as `||x|| = sqrt(x1^2 + x2^2 + ... + xn^2)`.
- x1, x2, ..., xn are the original values of the feature.

**Example:**

| Sample Input | 10 | 25 | 30 |
|---|---|---|---|
| Sample Output | 0.248 | 0.620 | 0.744 |
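As a rough sketch of the same calculation in plain Python, the snippet below divides each value by the Euclidean norm of the column. The function name `unit_normalize` is hypothetical, and returning zeros for an all-zero column is an assumption made here, since the formula is undefined when the magnitude is 0.

```python
import math


def unit_normalize(values):
    """Divide each value by the Euclidean norm so the column has length 1."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        # Assumption: an all-zero column cannot be scaled, so return zeros.
        return [0.0 for _ in values]
    return [v / norm for v in values]


normalized = unit_normalize([10, 25, 30])
print([round(v, 3) for v in normalized])  # [0.248, 0.62, 0.744]
```

The squares of the normalized values sum to 1, which is a quick way to check the result.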

### Mean Normalization

Mean normalization centers each feature on its mean and scales it by its range (maximum minus minimum), so that the normalized values sum to 0.

**Process:**

`x_normalized = (x - mean(x)) / (max(x) - min(x))`

**Where:**

- x_normalized is the normalized value of the feature.
- x is the original value of the feature.
- mean(x) is the mean of the feature across the dataset.
- min(x) is the minimum value of the feature across the dataset.
- max(x) is the maximum value of the feature across the dataset.

**Example:**

| Sample Input | 10 | 25 | 30 |
|---|---|---|---|
| Sample Output | -0.583 | 0.167 | 0.417 |
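A minimal Python sketch of mean normalization follows, with the subtraction grouped before the division. The function name `mean_normalize` is hypothetical, and mapping a constant feature to 0.0 is an assumption made here to avoid dividing by a zero range. Outputs are rounded to three decimals.

```python
def mean_normalize(values):
    """Center values on their mean, then scale by the range (max - min)."""
    mean = sum(values) / len(values)
    span = max(values) - min(values)
    if span == 0:
        # Assumption: a constant feature has zero range, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - mean) / span for v in values]


normalized = mean_normalize([10, 25, 30])
print([round(v, 3) for v in normalized])  # [-0.583, 0.167, 0.417]
```

Because every value is shifted by the mean before scaling, the normalized values sum to 0 up to floating-point error.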

### Mean-Std Normalization

The data can be normalized by subtracting the mean (µ) of each feature and dividing by its standard deviation (σ). This way, each feature has a mean of 0 and a standard deviation of 1. Putting features on a comparable scale in this way often helps gradient-based training converge faster.

**Process:**

`x_normalized = (x - mean(x)) / std(x)`

**Where:**

- x_normalized is the normalized value of the feature.
- x is the original value of the feature.
- mean(x) is the mean of the feature across the dataset.
- std(x) is the standard deviation of the feature across the dataset. The example below corresponds to the sample standard deviation (dividing by n - 1).

**Example:**

| Sample Input | 10 | 25 | 30 |
|---|---|---|---|
| Sample Output | -1.121 | 0.320 | 0.801 |
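The calculation can be sketched with Python's standard `statistics` module. The function name `standardize` is hypothetical. The choice of `statistics.stdev` (sample standard deviation, dividing by n - 1) is an assumption made because it matches the sample figures for this input; the population version, `statistics.pstdev`, would give different numbers. Outputs are rounded to three decimals.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    # statistics.stdev requires at least two data points.
    std = statistics.stdev(values)
    if std == 0:
        # Assumption: a constant feature has no spread, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


normalized = standardize([10, 25, 30])
print([round(v, 3) for v in normalized])  # [-1.121, 0.32, 0.801]
```

With this choice, the normalized values have a mean of 0 and a sample standard deviation of 1.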

Last Updated 2023-10-09 18:18:15 +0530