Data Leakage and How to Avoid It

In this blog post we talk about what is known in machine learning as data leakage. Please do not confuse it with the leakage of data to the public.
Data leakage in machine learning means predicting the output with a feature that will not be available at prediction time. In many cases, such a feature carries information about the very value we are trying to predict.
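
To make the definition more concrete, here is a minimal sketch of one way to check for leaky features. It assumes a hypothetical loan-default model and assumes we know the date on which each feature's value becomes known. The feature names, the dates, and the helper split_features are all invented for illustration and do not come from any library.

```python
from datetime import date

# Hypothetical loan-default example: the date on which each feature's
# value becomes known. All names and dates are invented for illustration.
FEATURE_KNOWN_ON = {
    "income": date(2023, 1, 10),
    "loan_amount": date(2023, 1, 10),
    "collection_calls": date(2023, 9, 1),  # only recorded after a default
}


def split_features(feature_known_on, prediction_date):
    """Separate features that are known by prediction_date from leaky ones."""
    usable, leaky = [], []
    for name, known_on in feature_known_on.items():
        if known_on <= prediction_date:
            usable.append(name)
        else:
            leaky.append(name)
    return sorted(usable), sorted(leaky)


# The prediction is made when the loan is requested, on 2023-01-15.
usable, leaky = split_features(FEATURE_KNOWN_ON, date(2023, 1, 15))
print("usable:", usable)  # usable: ['income', 'loan_amount']
print("leaky:", leaky)    # leaky: ['collection_calls']
```

In this sketch, collection_calls would probably predict a default very well in training, but it only exists once the outcome has already happened, so a model that relies on it cannot be used when the real prediction has to be made.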
