If you want to learn data science, you’re going to have to learn statistics.
In discussing large-scale datasets, we often talk about the rationales for omitting certain columns and keeping others, and the difficulties of doing so. One reason for eliminating a column is that it is merely a placeholder, such as an arbitrary ID number, and will certainly have no effect, linear or otherwise, on our target variable. Sometimes the data in one or more columns is so mangled, with so little hope of restoring it, that it is in everyone’s best interest if we put those columns out of our misery. Another particularly common reason for column elimination, especially when employing linear regression, is multicollinearity.
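To make these three reasons concrete, here is a minimal sketch in Python with pandas. Everything in it is an assumption made for illustration: the helper name `columns_to_drop`, the toy columns (`id`, `sqft`, `sqm`, `age`, `notes`), and the thresholds of 50% missing values and 0.95 absolute correlation. It treats a column that is mostly missing as a stand-in for "mangled," and it uses pairwise correlation as a rough first screen for multicollinearity, which will not catch every case.

```python
import numpy as np
import pandas as pd


def columns_to_drop(df, id_cols=(), max_missing=0.5, max_abs_corr=0.95):
    """Return a sorted list of column names flagged for removal.

    Hypothetical helper: the thresholds are illustrative defaults,
    not recommended values.
    """
    # 1. Placeholder columns (e.g. arbitrary IDs) named by the caller.
    flagged = {c for c in id_cols if c in df.columns}

    # 2. Columns where the fraction of missing values is too high.
    missing_frac = df.isna().mean()
    flagged |= set(missing_frac[missing_frac > max_missing].index)

    # 3. Numeric columns that are almost perfectly correlated with an
    #    earlier column we are keeping (a rough multicollinearity screen).
    numeric = df.drop(columns=list(flagged)).select_dtypes(include="number")
    corr = numeric.corr().abs()
    cols = list(corr.columns)
    for i, a in enumerate(cols):
        if a in flagged:
            continue
        for b in cols[i + 1:]:
            if b not in flagged and corr.loc[a, b] > max_abs_corr:
                flagged.add(b)

    return sorted(flagged)


# Toy data, invented for this example.
rng = np.random.default_rng(0)
n = 200
sqft = rng.normal(1500, 300, n)
df = pd.DataFrame({
    "id": np.arange(n),                         # arbitrary row identifier
    "sqft": sqft,
    "sqm": sqft * 0.0929,                       # exact linear function of sqft
    "age": rng.normal(30, 10, n),               # generated independently of sqft
    "notes": [np.nan] * (n - 5) + ["ok"] * 5,   # almost entirely missing
})

print(columns_to_drop(df, id_cols=["id"]))
```

Here `sqm` is flagged because it carries the same information as `sqft`; which member of a correlated pair to keep is a judgment call, and this sketch simply keeps whichever column appears first.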