Live vs Simulated - FactSet vs Compustat

Dan,

Thank you. That is definitely encouraging. And remember, this is my baseline belief, so it is not entirely surprising.

For further informed discussion, and so that we do not keep repeating the same tests and wondering what will happen each time, could you or someone make clear exactly what P123 is taking a snapshot of? My understanding is that P123 has used snapshots since June 2020, but it was never clear to me whether that covers just fundamentals or also earnings estimates. I think that is central to understanding any similarities or differences between sims and ports since June 2020.

If everything is in the snapshots, then there should never be any differences, and any that do appear would have to be human error or a coding problem.

If earnings estimates are not snapshotted it might explain this in Jeff’s post above (Jeff please correct me if I misread your post or did not provide adequate context):

BTW, that should already be in the documentation somewhere, IMHO.

Again thank you.

Jim

Dan,

I am not sure whether what you have done makes me feel better or worse:

  1. There are ALMOST ALWAYS differences in the stocks bought and sold on each rebalance. Images: Live first then sim.

  2. The ranks at sale can look pretty different. Same image. Significantly different for: TIJAY

  3. So why, at the end of the day, were the results so similar? One will not be able to notice much difference between the sim and port when they are both performing at the level of the benchmark. This is because randomly replacing one stock that will probably perform at the level of the benchmark with another stock that will probably perform at the level of the benchmark will not be noticed.

This is like my wife’s discretionary stock picks being randomly replaced with my daughter’s discretionary stock picks: the performance is not likely to change.
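To make the arithmetic behind point 3 concrete, here is a minimal sketch. It assumes an equal-weighted portfolio and uses made-up returns; the function names are hypothetical and nothing here comes from P123.

```python
def portfolio_return(returns):
    """Equal-weighted portfolio return from a list of per-stock returns."""
    return sum(returns) / len(returns)


def replacement_impact(returns, index, new_return):
    """Change in portfolio return when the holding at `index` is swapped
    for a stock that returns `new_return` instead."""
    swapped = list(returns)
    swapped[index] = new_return
    return portfolio_return(swapped) - portfolio_return(returns)


# Hypothetical case 1: every holding performs at the benchmark level (10%).
benchmark_like = [0.10, 0.10, 0.10, 0.10, 0.10]
impact_same = replacement_impact(benchmark_like, 0, 0.10)

# Hypothetical case 2: the replaced holding was a real winner (50%).
with_winner = [0.50, 0.10, 0.10, 0.10, 0.10]
impact_winner = replacement_impact(with_winner, 0, 0.10)  # (0.10 - 0.50) / 5

print(impact_same, impact_winner)
```

With five equal-weighted holdings, a swap only moves the portfolio return by one fifth of the gap between the two stocks, so the swap is invisible when the gap is zero and very visible when the replaced stock was a strong outperformer.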

If Warren Buffett let my daughter randomly replace a few stocks in his portfolio, people might notice. Here is an image with comparison to the S&P 600 Value Small-cap.

The other thing is that actual earnings estimates revisions only make up about 4.5% of this sim/port ranking system. I do not claim to know if earnings estimates revisions are a particular problem but it has been suggested.

Marco did look at 2 examples where the sim and port had different holdings, and he attributed it to differences in the earnings estimates data, so it is not just suggested but the proven cause in 2 cases.

I admit that I might reassess that if P123 ever lets us know for sure whether it uses snapshots on earnings estimates data. I am open to new information when it arrives.

On balance I am going to put your anecdotal results into the category of showing a pretty significant discrepancy. I value your opinion if you disagree. And I could have made a mistake.

Thank you.

Best,

Jim




We snapshot the data for preliminaries so that it doesn’t get overwritten by data for the final filing. In addition, we assign effective dates depending on when we actually receive the data from FactSet; prior to July 2020, we set the effective dates to one day after the announcement.

We do not snapshot other data, which will, on rare occasions, change a little. I did a little test of this from one week to another using the same as-of date and found that out of the Russell 3000 stocks, only 23 of them (less than 1%) had one of the following six fields changed: CurFYEPSMean, CurFYEPS4WkAgo, #AnalystsCurFY, NextFYEPSMean, NextFYEPS4WkAgo, #AnalystsNextFY. In some cases, an analyst was scrubbed, in others, an analyst was added, and a few had tiny differences in EPS. Not one of the revisions was major.

Over how many weeks?

Both one and two weeks. It’s a pretty easy test. Take the Russell 3000 and run a screen with various ShowVar items on a specific as-of date in the past. Then run exactly the same screen with the same as-of date a week later. Repeat a week or two later. Compare the results and see if anything has changed. Just make sure the as-of date is the same (you can check because the “Start” columns should match). Also be aware that tickers may change.
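A minimal sketch of the comparison step described above, assuming each screen run has been loaded into a dict that maps ticker to field values. That layout, the function name `diff_snapshots`, and the placeholder ticker XYZ are assumptions for illustration, not anything P123 provides. The ABT values are the ones quoted later in this thread.

```python
# The six fields named in the test; the dict layout is an assumption.
FIELDS = [
    "CurFYEPSMean", "CurFYEPS4WkAgo", "#AnalystsCurFY",
    "NextFYEPSMean", "NextFYEPS4WkAgo", "#AnalystsNextFY",
]


def diff_snapshots(earlier, later, fields=FIELDS):
    """Return {ticker: {field: (old, new)}} for every field that changed.

    Both arguments map ticker -> {field: value} and should come from screens
    run with the same as-of date. Tickers present in only one snapshot are
    skipped, because a ticker change is not a data revision.
    """
    changes = {}
    for ticker in sorted(earlier.keys() & later.keys()):
        changed = {
            f: (earlier[ticker].get(f), later[ticker].get(f))
            for f in fields
            if earlier[ticker].get(f) != later[ticker].get(f)
        }
        if changed:
            changes[ticker] = changed
    return changes


# Same as-of date, screens run a week apart. XYZ is a made-up ticker.
week1 = {
    "ABT": {"CurFYEPSMean": 5.03, "CurFYEPS4WkAgo": 4.98, "#AnalystsCurFY": 20},
    "XYZ": {"CurFYEPSMean": 1.10, "CurFYEPS4WkAgo": 1.08, "#AnalystsCurFY": 4},
}
week2 = {
    "ABT": {"CurFYEPSMean": 5.06, "CurFYEPS4WkAgo": 5.01, "#AnalystsCurFY": 19},
    "XYZ": {"CurFYEPSMean": 1.10, "CurFYEPS4WkAgo": 1.08, "#AnalystsCurFY": 4},
}

changes = diff_snapshots(week1, week2)
for ticker, fields_changed in changes.items():
    print(ticker, fields_changed)
```

Matching on ticker is the weak point here, since tickers can change between runs; a stable identifier would be safer if one is available in the export.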

Yuval,

So, that is a clear example of look-ahead bias, isn’t it? Information that was not available when you rebalanced the port is being added to the sim data (after you rebalanced the port). It can and does happen, just as you have said all along.

My auto port and sim transactions for this week show a change in one out of 2 of the buys. Auto port with no positions forced into the universe. I do not think my overall experience with my ports is consistent with the use of the word rare. Perhaps, something like a new analyst is often associated with a higher rank and shows up in my sims frequently enough to be relatively common. Or perhaps a one week sample may not be a sound basis for concluding a large position in a port is a good idea. Definitely the latter is true.

Or maybe it is just the size of the potential impact when look-ahead bias occurs: earnings surprises often have very quick earnings revisions. Look-ahead bias seeing an earnings revision could also mean you are acting on an earnings surprise before it happens.

How many large earnings surprises would you have to see in your crystal ball to make money? Although it is not immediately clear that look-ahead bias that blatant can occur now with the lag.

Anyway, I think you should expect to see differences in holdings. No one should be surprised anymore when this happens. You probably should not bother posting when you see it. It is just a matter of whether it affects the returns, and if so, by how much. You might start an auto port right next to any port you fund and not change the auto port. Start a new auto port if you do change your funded port.

I doubt you will ever see anything definitive as individual members do not have access to enough data. That was worth learning I think. Everyone comes to this with their own biases or prior beliefs but you will never really know for sure.

Thank you.

Best,

Jim




Yuval,

Thank you for your detailed description of the data. I would just like to understand the data and have no point to make with my questions here.

When you say effective dates, does that apply to announcements only? Nothing whatsoever is done with estimates, correct? P123 does not keep a record of when that data is received or any effective dates on earnings estimates data? I assume that is something P123 has no plans of doing if it is not already being done?

When the fields change, what is causing that? Is it a random correction of some error for example–with no real effect? One might fear look-ahead bias but one would have to be a little paranoid to think FactSet systematically does that on purpose.

Do you have a belief as to how likely it is that this represents look-ahead bias of some sort? To be clear, I do not have an opinion on this and I am just asking. Or maybe I will assume the data is probably okay until I see a mechanism for malignant look-ahead bias on a regular basis (post lag at P123). While I have questions about the impact, even correction of errors in hindsight is a type of look-ahead bias isn’t it? Albeit maybe not so bad of an example.

Again, thank you for your detailed description. Any additional information, whether or not it is related to my questions, is welcome. Thank you in advance.

Jim

You’re correct, Jim, on all counts here.

Like I said, when an estimate changes it’s usually because FactSet has scrubbed or added an analyst. But sometimes an estimate will change without an analyst being scrubbed or added–a few pennies are added or subtracted. The changes tend to be pretty minor.

ABT is a typical example of these changes, which, as I said, occur in less than 1% of the Russell 3000. The number of analysts changed from 20 to 19; the current FY EPS estimate changed from $5.03 to $5.06; and the 4-week-ago EPS estimate changed from $4.98 to $5.01.

Another typical example is GIII, all of whose estimates went up by between $0.05 and $0.08, which is less than 2%, without any analyst changes.

But I did find a couple of rotten apples. ANAB has 7 analysts. Initially, the current FY EPS estimate was -$0.75; later it became $0.50. Next FY EPS estimate changed only by a penny, from -$3.22 to -$3.21. This might appear to be look-ahead bias at first glance, since the test was performed several weeks ago and the current FY EPS estimate has grown steadily over the last 13 weeks from -$2.00 to $1.80. However, I think it’s just FactSet correcting a mistake.

The same thing happened with GIC, which has only 2 analysts. The current FY EPS estimate changed from $1.24 to $1.75. $1.24, again, seems to be a mistake, since the estimate history simply doesn’t go down that low.

Out of 3000 stocks, 2 rotten apples doesn’t seem so bad. The other 21 changes are all very minor, and most of them are due to a change in the number of analysts.
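For anyone who wants to put numbers on "minor" versus "rotten apple", here is a rough sketch using the estimates quoted above. The 5% threshold and the rule that a sign flip always counts as major are arbitrary assumptions made for illustration; they are not FactSet or P123 definitions, and the function names are hypothetical.

```python
def relative_change(old, new):
    """Size of a revision as a fraction of the old estimate's magnitude."""
    return abs(new - old) / abs(old)


def classify(old, new, threshold=0.05):
    """Label a revision 'unchanged', 'minor', or 'major'.

    Assumptions: a sign flip or a zero starting value is always 'major',
    and the 5% threshold is an arbitrary cut-off.
    """
    if old == new:
        return "unchanged"
    if old == 0 or (old < 0) != (new < 0):
        return "major"
    return "major" if relative_change(old, new) > threshold else "minor"


# (old, new) estimates quoted in this post.
revisions = {
    "ABT CurFY": (5.03, 5.06),
    "ANAB CurFY": (-0.75, 0.50),
    "ANAB NextFY": (-3.22, -3.21),
    "GIC CurFY": (1.24, 1.75),
}
labels = {name: classify(old, new) for name, (old, new) in revisions.items()}

# Shares of the Russell 3000 sample reported in the test.
share_changed = 23 / 3000  # stocks with any of the six fields changed
share_major = 2 / 3000     # the two "rotten apples"

for name, label in labels.items():
    print(name, label)
print(f"{share_changed:.2%} changed, {share_major:.2%} major")
```

Under these assumptions ABT and the ANAB next-year change come out as minor, while the ANAB and GIC current-year changes come out as major, which matches the description in the post.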

If you consider error correction to be look-ahead bias, and if look-ahead bias invalidates your results completely, then you’ll never be happy with FactSet. They correct errors. It’s their policy. And because we don’t get any effective dates for those corrections, we’re stuck with that.

I believe, however, that the error corrections are MINOR, deliberate, and IN NO WAY DESIGNED TO PRODUCE BETTER BACKTESTS. Remember that FactSet is not in the backtesting business. Their entire business model is built around providing full and accurate data at all times to all people. That is why they correct errors when they see them.

I see absolutely no evidence of any look-ahead bias besides error correction here.

Yuval, Thank you. -Jim

All,

I do not want to spend any more time on this. It is worth revisiting if someone has more or new data (which I do not).

I do want to point out that Yuval and I do not have much disagreement.

We might have a mild disagreement on the use of the word ‘rare’ although we were using it in different contexts. And there is a subjective component to the term.

The only important question is how all of this affects returns. I do not have an opinion on that. So maybe a disagreement on the only thing that is important would not even be possible. Not even theoretically.

I did learn a lot from both Yuval and Dan. I will change how I look at this going forward based on their input. So I am glad for the discussion and appreciate it.

Jim

Jim,

I haven’t joined this discussion so far since I don’t know enough about the differences between FactSet and Compustat data and how the earnings estimates ultimately got scrubbed or revised.

However, if 1 out of 5 stocks in my stock portfolio is different between the sims and ports on a constant basis (20%), I think most P123 members would be concerned about whether the sims can be trusted and would ask questions like the ones you have asked here (you are also doing it on behalf of other P123 users who are less knowledgeable about these issues).

By the way, I am not saying this just for the sake of showing some support.

Regards
James
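The 1-in-5 figure in the post above can be tracked week by week with a simple mismatch ratio. This is only a sketch: the function name and the tickers are made up, and it assumes you have the port and sim holdings for the same rebalance date as plain lists.

```python
def holdings_mismatch(port, sim):
    """Fraction of the port's holdings that the sim does not also hold."""
    port_set, sim_set = set(port), set(sim)
    if not port_set:
        raise ValueError("port has no holdings")
    return len(port_set - sim_set) / len(port_set)


# Hypothetical 5-stock port and sim that differ in one name.
port = ["AAA", "BBB", "CCC", "DDD", "EEE"]
sim = ["AAA", "BBB", "CCC", "DDD", "FFF"]

mismatch = holdings_mismatch(port, sim)  # 1 of 5 port holdings is missing from the sim
print(f"{mismatch:.0%} of port holdings are not in the sim")
```

Logging this number at each rebalance would show whether the mismatch is a constant feature or an occasional event, though it says nothing about the effect on returns.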


Hi all,
Thanks for all the posts, very valuable info.
However, I am puzzled. I have been following my strategy for the last 4 weeks and comparing live with simulated. The results are different, and some holdings are different too.
Why is that? I read somewhere in this topic that we should expect different holdings between the live and simulated strategy, but why? Even for my strategy that only has 5 holdings?
Help!
Thanks in advance