In data processing, scaling is a method used to normalize the range of the independent variables, or features, of a dataset. It is also known as data normalization and is generally performed during the data preprocessing step.
MIN-MAX scaling, or MIN-MAX normalization, is the simplest method: it rescales the range of each feature to [0, 1] or sometimes [−1, 1]. Data rescaled to [0, 1] is said to be on a unity-normed scale. The choice of target range depends on the nature of the data.
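Written out as a formula, the rescaling looks roughly as follows. The symbols $x_i$ (an original value) and $\tilde{x}_i$ (its scaled value) are notation introduced here for illustration, not taken from the text above. The second line shows one common way to reach [−1, 1] from the [0, 1] result.

```latex
\tilde{x}_i = \frac{x_i - \min(x)}{\max(x) - \min(x)} \in [0, 1]
\qquad\text{and}\qquad
2\,\tilde{x}_i - 1 \in [-1, 1]
```

Note that the expression is undefined when all values are equal, since the denominator is then zero.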
Creating data and MIN-MAX scaling:
import matplotlib.pyplot as plt
import numpy as np

# create N log-transformed random data points
N = 42
data = np.log(np.random.rand(N))*234 + 934

# get min and max
dataMin = min(data)
dataMax = max(data)

# now min-max scale
dataS = (data-dataMin) / (dataMax-dataMin)

# now plot: original data on the left, scaled data on the right
fig,ax = plt.subplots(1,2,figsize=(8,4))
ax[0].plot(1+np.random.randn(N)/20,data,'ks')
ax[0].set_xlim([0,2])
ax[0].set_xticks([])
ax[0].set_ylabel('Original data scale')
ax[0].set_title('Original data')

ax[1].plot(1+np.random.randn(N)/20,dataS,'ks')
ax[1].set_xlim([0,2])
ax[1].set_xticks([])
ax[1].set_ylabel('Unity-normed data scale')
ax[1].set_title('Scaled data')

plt.show()
Scaling to an arbitrary data range:
## any arbitrary data range

# step 1 is to [0,1] normalize as above

# step 2:
newMin = 4
newMax = 8.7
dataSS = dataS*(newMax-newMin) + newMin

# test it!
print([min(dataSS), max(dataSS)])
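The two steps (normalize to [0, 1], then stretch and shift to the new range) can be wrapped into a single reusable function. The following is a minimal sketch, not a definitive implementation: the function name `rescale_to_range`, its parameter names, and the choice to raise an error for constant data are all assumptions made here for illustration.

```python
import numpy as np

def rescale_to_range(x, new_min=0.0, new_max=1.0):
    """Min-max scale x into [new_min, new_max] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        # Assumption: treat constant data as an error, since the
        # denominator of the min-max formula would be zero.
        raise ValueError("min-max scaling is undefined when all values are equal")
    unit = (x - x_min) / (x_max - x_min)         # step 1: scale to [0, 1]
    return unit * (new_max - new_min) + new_min  # step 2: map to target range

# Example data generated the same way as in the text, with a fixed seed
rng = np.random.default_rng(0)
sample = np.log(rng.random(42)) * 234 + 934

scaled = rescale_to_range(sample, 4, 8.7)
print([scaled.min(), scaled.max()])
```

Calling the function with the default arguments gives the plain [0, 1] scaling, and passing `-1, 1` gives the [−1, 1] variant, so one helper covers all the target ranges mentioned here.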