To train and verify a machine learning model, a dataset is split into training, validation, and test sets. The majority of the data is used for training the model. The validation data is used to check the results of training while the model is still being developed, for example when comparing training runs or tuning settings. The test data is held back until the end and used to evaluate whether the model generalizes to data it has not seen.
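As a rough illustration of such a split, the sketch below shuffles a dataset and cuts it into three parts using only the Python standard library. The function name `split_dataset`, the 80/10/10 proportions, and the fixed seed are assumptions chosen for the example; the text above says only that the majority of the data goes to training.

```python
import random


def split_dataset(samples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle samples and split them into train, validation, and test lists.

    The default fractions (10% validation, 10% test) are illustrative only.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    # Copy before shuffling so the caller's data is left untouched.
    shuffled = list(samples)
    # A seeded generator makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    n_train = len(shuffled) - n_val - n_test  # the remainder goes to training

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # prints: 80 10 10
```

Shuffling before slicing keeps any ordering in the original data from leaking into one subset, and the three slices never overlap, so no sample used for training also appears in the validation or test set.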