NEW 'Factor List' tool for downloading data for AI/ML

jlittleton · December 24, 2023, 1:14am

I am assuming that a zscore scoped to the entire dataset uses the same mean and std for every week. If so, one overlapping week should be enough to scale one dataset to the other: every week within a dataset shares that dataset's mean and std, so it does not matter how the data varies over time. You just need to scale (and shift) the second dataset so that the one overlapping week lines up. Since all weeks in a dataset share the same scaling, the two datasets should then line up everywhere.

But I need to verify this. I have not had time yet… My plan is to download a few months for each dataset with one month of overlap, scale the second dataset using one week, and then check the other overlapping weeks to make sure the zscores are identical.
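A minimal sketch of that check, on synthetic data rather than real downloads. It assumes the "entire dataset" scope really does apply one mean and one std to every value in a download; in that case the two sets of zscores are related by a straight line (slope = ratio of the stds, intercept from the difference in means), which can be fitted on a single shared week. The array shapes, week ranges and the helper `zscore_whole_dataset` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw factor values: 20 weeks x 50 stocks, drifting over time.
n_weeks, n_stocks = 20, 50
raw = rng.normal(size=(n_weeks, n_stocks)) + 0.1 * np.arange(n_weeks)[:, None]

def zscore_whole_dataset(block):
    """Z-score with ONE mean and std over the whole block (the assumption being tested)."""
    return (block - block.mean()) / block.std()

# Two simulated downloads that overlap on weeks 8..11.
a = zscore_whole_dataset(raw[0:12])   # weeks 0..11
b = zscore_whole_dataset(raw[8:20])   # weeks 8..19

# Fit z_a = slope * z_b + intercept using only ONE overlapping week (week 8).
# Row 0 of b and row 8 of a are the same week, same stocks, same order.
slope, intercept = np.polyfit(b[0], a[8], deg=1)

# Apply the fitted scaling to the whole second dataset.
b_rescaled = slope * b + intercept

# Check the OTHER overlapping weeks (9..11), which were not used in the fit.
max_err = np.abs(b_rescaled[1:4] - a[9:12]).max()
print(f"slope={slope:.4f} intercept={intercept:.4f} max_err={max_err:.2e}")
```

If the assumption holds, `max_err` should be at floating-point noise level. With real downloads, missing values or any outlier trimming applied before the zscore would break the exact straight-line relation, so a clearly non-zero error on the held-out weeks would point to one of those.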

I agree that this will not work at all if you use the per-date zscore scope. Also, my goal is not to recover the original mean and std, just to make the two datasets line up.

EDIT: made a new thread so I can avoid hijacking this one any more than I already have…