Data Leakage: A Specific Question

Does anachronistic data leakage, I mean, can it be foolproof prevented in this way: The testing data isn’t collected until after the model is fit on the training data?

It seems that would solve the problem. I wonder how practical it is in practice, or if I am even right in the first place. It seems that if the training and testing data are separated temporally, there’s no way the test data could leak backward in time to the training data.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s