All,

I downloaded the weekly returns for a P123 model (details of the model not pertinent here) under statistics in P123 and asked ChatGPT if my results were statistically significant. I have the ChatGPT desktop app for Mac and was trying it out with some screenshots and downloads from P123.

This was my original question:

*You have a screenshot of the out-of-sample results of a stock model I am investing in. In addition, I have uploaded a csv file with the result of my model (column head "Model"), a benchmark ("S&P 1500 Pure Value") and "Excess Return" which is the excess returns of my model versus the benchmark. Are my results statistically significant?*

I had to prompt it to do a paired t-test, to do it as a one-sided test, and to ask whether to use a Wilcoxon Signed-Rank test (and/or the Student's t-test). The Wilcoxon Signed-Rank test is a good choice for data that is not normally distributed. Each time, though, it ran whichever test it was using correctly. For example, I initially got a correct two-sided Student's t-test for the model (probably the default answer most of the time).

Final answer:

*Yes, the p-value of 0.021 from the Wilcoxon Signed-Rank test is for a two-sided test. This means it tests whether there is a statistically significant difference in either direction between the returns of your model and the benchmark. If you have a specific direction in mind (e.g., you believe your model performs better), you could perform a one-sided test, which would have a p-value half of the two-sided test. However, since the two-sided p-value is already below 0.05, the result is significant for a one-sided test as well.*
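
For anyone who wants to check this kind of answer outside ChatGPT, here is a minimal sketch of the tests discussed above, using SciPy. It is not the code ChatGPT ran, and the weekly returns below are synthetic stand-ins for the "Model" and "S&P 1500 Pure Value" columns of the csv file. The function name `significance_tests` and all the numbers are assumptions made for illustration only.

```python
import numpy as np
from scipy import stats


def significance_tests(model, benchmark):
    """Paired t-test and Wilcoxon Signed-Rank test on weekly returns.

    Hypothetical helper for illustration; inputs are two equal-length
    sequences of weekly returns for the model and the benchmark.
    """
    model = np.asarray(model, dtype=float)
    benchmark = np.asarray(benchmark, dtype=float)
    excess = model - benchmark  # corresponds to the "Excess Return" column

    # Paired Student's t-test: two-sided (the usual default) and one-sided
    # (alternative hypothesis: the model beats the benchmark).
    t_two = stats.ttest_rel(model, benchmark, alternative="two-sided")
    t_one = stats.ttest_rel(model, benchmark, alternative="greater")

    # Wilcoxon Signed-Rank test on the paired differences; it does not
    # assume the differences are normally distributed.
    w_two = stats.wilcoxon(excess, alternative="two-sided")
    w_one = stats.wilcoxon(excess, alternative="greater")

    return {
        "mean_excess": float(excess.mean()),
        "t_two_sided_p": float(t_two.pvalue),
        "t_one_sided_p": float(t_one.pvalue),
        "wilcoxon_two_sided_p": float(w_two.pvalue),
        "wilcoxon_one_sided_p": float(w_one.pvalue),
    }


# Synthetic data only (an assumption): about five years of weekly returns.
rng = np.random.default_rng(0)
benchmark = rng.normal(0.002, 0.02, size=260)
model = benchmark + rng.normal(0.002, 0.01, size=260)

results = significance_tests(model, benchmark)
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

To run this on the real download, the two arrays would be replaced by the matching columns of the csv file (for example, loaded with pandas). The one-sided versions only make sense if the direction (model better than benchmark) was chosen before looking at the results.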

**TL;DR: You don't have to know a lot about statistics to do statistics now: just the name of the test you want. You do not have to know how to program with Python. Also, AI/ML is coming. ChatGPT may be able to supplement some of what P123 is doing to make machine learning and statistics easy (or easier at least) and understandable.**

Jim