How to avoid Multicollinearity in Categorical Data?
By default, pd.get_dummies creates one indicator column per category, and this silently introduces multicollinearity into your data: the indicator columns always sum to 1, so any one of them can be predicted exactly from the others (the "dummy variable trap"). In this tutorial, we walk through a simple example of two ways to deal with it: pd.get_dummies(drop_first=True) and Lasso regression. Linear regression assumes that the features are not multicollinear, so handling it is an important step in building a machine learning model.
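As a rough illustration of both approaches, here is a minimal sketch. The toy DataFrame, the column names city and price, and the Lasso alpha value are assumptions made up for this example and do not come from the video. It checks the rank of the design matrix with and without drop_first=True, then fits a Lasso model on the full encoding.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso

# Hypothetical toy data: one categorical feature and a numeric target.
df = pd.DataFrame({
    "city": ["NY", "LA", "SF", "NY", "LA", "SF", "NY", "LA"],
    "price": [10.0, 20.0, 30.0, 11.0, 19.0, 31.0, 9.0, 21.0],
})

# Default encoding: one indicator column per category (LA, NY, SF).
full = pd.get_dummies(df["city"], dtype=float)
row_sums = full.sum(axis=1)  # every row sums to 1

# Add an intercept column: it equals the sum of the indicators,
# so the 4-column design matrix does not have full column rank.
X_full = np.column_stack([np.ones(len(full)), full.to_numpy()])
rank_full = np.linalg.matrix_rank(X_full)

# drop_first=True drops the first category ("LA"), which becomes the baseline.
reduced = pd.get_dummies(df["city"], drop_first=True, dtype=float)
X_reduced = np.column_stack([np.ones(len(reduced)), reduced.to_numpy()])
rank_reduced = np.linalg.matrix_rank(X_reduced)

print("full encoding:   ", X_full.shape[1], "columns, rank", rank_full)
print("drop_first=True: ", X_reduced.shape[1], "columns, rank", rank_reduced)

# Alternative: keep every indicator and let the L1 penalty shrink
# coefficients, possibly setting redundant ones to exactly zero.
# alpha=0.1 is an arbitrary choice for this sketch, not a tuned value.
lasso = Lasso(alpha=0.1)
lasso.fit(full, df["price"])
print(dict(zip(full.columns, lasso.coef_)))
```

With drop_first=True, each remaining coefficient is read relative to the dropped baseline category. With Lasso, which coefficients end up at zero depends on alpha, so in practice it would be tuned, for example by cross-validation.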
To view the video
- Click here
- Click on the image below