Data leakage is the use, during model development, of a value that will not be available at prediction time. Such a value often contains the very information you are trying to predict.
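
A small sketch may make the definition concrete. Everything in it is made up for illustration: the toy loan records, the `collections_calls` column (assumed to be logged only after a loan has defaulted), and the one-rule "model". It compares how the rule scores when the leaky column is filled in with how it scores when that column is still unknown, which is modelled here as 0.

```python
# Toy loan data, invented for illustration. "income" is known when the loan is
# approved; "collections_calls" is assumed to be logged only after a default,
# so it cannot be known at prediction time.
history = [
    {"income": 30, "collections_calls": 4, "defaulted": True},
    {"income": 80, "collections_calls": 0, "defaulted": False},
    {"income": 45, "collections_calls": 2, "defaulted": True},
    {"income": 50, "collections_calls": 0, "defaulted": False},
    {"income": 90, "collections_calls": 0, "defaulted": False},
    {"income": 35, "collections_calls": 0, "defaulted": False},
]


def predict_default(row):
    """A stand-in 'model' that relies entirely on the leaky feature."""
    return row["collections_calls"] > 0


def accuracy(rows, actual):
    """Fraction of rows where the prediction matches the actual outcome."""
    hits = sum(predict_default(row) == y for row, y in zip(rows, actual))
    return hits / len(rows)


labels = [row["defaulted"] for row in history]

# During development the leaky column is filled in, so the rule looks perfect.
dev_accuracy = accuracy(history, labels)

# At prediction time the value does not exist yet; model that as 0 for everyone.
at_prediction_time = [{**row, "collections_calls": 0} for row in history]
live_accuracy = accuracy(at_prediction_time, labels)

print(f"development: {dev_accuracy:.2f}, at prediction time: {live_accuracy:.2f}")
```

On this toy data the rule is right for every row during development, but once the leaky value is missing it predicts "no default" for everyone and is right only for the four loans that did not default.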