Fundamental Data II

No, that’s not the case. We have been keeping track of effective dates ever since we switched to FactSet. So post-June 2020, a sim would not have data earlier than a live strategy. Of course, prior to then, all the live strategies were running on Compustat data, so the sims and live strategies would be completely different.

Yuval,

Thank you. I had noticed what you said about post-June 2020 data. I was hoping it was as simple as that. I assume that applies to earnings estimates too: maybe some differences for some reason, but no look-ahead bias given the lag and any snapshots.

The sim does do well since June 2020 (as shown below). I probably need to start a new port and try it again. There are some good reasons why the port and sims should not be exactly alike (e.g., adding or removing cash) and luck can play a part in the short run.

Very much appreciated. I will fund the port. Not with a lot of money to start but fund it.

Best,

Jim


Yuval,

I thought I had seen ranks change a lot in the transactions for the same ticker on the same day, which is when I finally pulled the plug on the port.

I will show you if it happens again. Maybe we can decide together if that is okay (should it happen).

Jim

Just to clarify

FactSet does provide two key dates for filing data historically:

eps report date - when company pre announces
filing date - when company filed with sec

These are not the actual effective dates when FactSet made the data available in their datafeeds. The FactSet effective dates could be the same day for bigger companies, but are likely a day or two later. FactSet claims to process 97% of the data for US companies within three days (although for TTD that was obviously not the case; it took them a week).

So before June 2020 we use their fields and just add 1 day to them. The market has already reacted to the data so it should not be a huge bias using them. Post June 2020 we use the dates we recorded.
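A minimal sketch of the date rule described above, in Python. The function name, the exact cutoff day (the thread only says "June 2020"), the use of a calendar day rather than a business day, and the fallback when no recorded date exists are all assumptions made for illustration, not P123's actual implementation.

```python
from datetime import date, timedelta
from typing import Optional

# Assumed cutoff: the thread only says "June 2020"; the exact day is a placeholder.
RECORDING_START = date(2020, 6, 1)


def effective_date(vendor_date: date, recorded_date: Optional[date] = None) -> date:
    """Return the date on which a data point is treated as available.

    vendor_date   -- the FactSet field (eps report date or filing date)
    recorded_date -- the date the data was actually seen in the feed, if recorded
    """
    if vendor_date >= RECORDING_START and recorded_date is not None:
        # Post-cutoff: use the date that was recorded when the data arrived.
        return recorded_date
    # Pre-cutoff (or nothing recorded, an assumed fallback): vendor date plus one day.
    # A calendar day is assumed here; the thread does not say calendar vs. business day.
    return vendor_date + timedelta(days=1)


if __name__ == "__main__":
    print(effective_date(date(2019, 3, 4)))                    # pre-cutoff
    print(effective_date(date(2021, 3, 4), date(2021, 3, 8)))  # post-cutoff
```

The point of the sketch is only that the two regimes differ: before the cutoff the availability date is derived from the vendor's field, and after it the date is an observation.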

And one last fact is that FactSet overwrites the prelim data when the final data is processed

What this means in strategies is the following:

If you use the setting ‘Include Prelims’ for the PIT method the final data is made available on the eps report date. If you use ‘Exclude Prelim’ then the final data is made available on the filing date. If you depend on cashflow items for your strategy it’s probably best to not use Prelim date to avoid look ahead since cashflow data is usually missing in a preliminary press release.
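To make the look-ahead risk concrete, here is a small Python sketch. The function names and the sample dates are hypothetical, and the one-day lag mentioned earlier is ignored for simplicity. It assumes, as stated above, that the final data overwrites the prelim data and is then exposed on whichever date the setting selects.

```python
from datetime import date


def final_data_available_on(eps_report_date: date, filing_date: date,
                            include_prelims: bool) -> date:
    """Date on which the (overwritten) final data becomes visible in a sim."""
    return eps_report_date if include_prelims else filing_date


def cashflow_lookahead_days(eps_report_date: date, filing_date: date,
                            include_prelims: bool) -> int:
    """Days by which a sim sees cash flow items earlier than the filing date.

    Assumes cash flow items were absent from the press release and only
    became public with the filing.
    """
    visible = final_data_available_on(eps_report_date, filing_date, include_prelims)
    return (filing_date - visible).days


if __name__ == "__main__":
    report, filing = date(2021, 2, 4), date(2021, 2, 19)  # hypothetical dates
    print(cashflow_lookahead_days(report, filing, include_prelims=True))
    print(cashflow_lookahead_days(report, filing, include_prelims=False))
```

With these made-up dates, 'Include Prelims' exposes cash flow items 15 days before the filing, while 'Exclude Prelim' exposes them on the filing date.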

Hope this helps

NOTE: eps report date and filing date could be the same if a company announces and files the same day

NOTE: FactSet also has one single row for restated data so that’s another source of some compromises we had to make.

FYI this is FactSet Fundamental data update policy where they claim to process 97% of the prelims in three days… except for TTD of course :frowning:



xxx

In light of the previous comments remind us again why the decision was made to move to FactSet from Compustat?

From my perspective it was a step backwards as the data is now less reliable.

Compustat changed the redistribution terms on renewal. They would only allow us to expose 5 years of history for users w/o a direct license from them.

You can still get Compustat + P123 if you get a license from them (around $15K/year and up depending on your firm)

Those rat bastards!

It’s a good biz selling data. Try putting FDS MSCI SPGI TRI in a watchlist and see the combined performance

1Y 50%
3Y 180%
5Y 350%

P123,

Thank you for the very clear explanations above.

Best,

Jim

Jim,

I’m not sure in which thread you mentioned this, but I can corroborate that I have experienced the same thing: rebalance recommendations change in the middle of the day. I have never seen it before, or maybe it happened and I never noticed.

I had a look at my live strats and sims.
Actually, live strats do in some cases (recently!) better than the sim (the sim often finds more stocks and sells earlier than the live strat). No idea why.

On those strats that do better live I am using the following buy filters

Rank > 90
RankPrev(1)/RankPrev(0) < 0.90 or RankPrev(2)/RankPrev(0) < 0.8 or RankPrev(3)/RankPrev(0) < 0.8 or RankPrev(4)/RankPrev(0) < 0.8

CurQEPS1WkAgo < CurQEPSMean (would not work with prelim off, since I would lose weeks on buying the stock on small caps; same with all other estimates).

CurFYSales1WkAgo < CurFYSalesMean

NextFYSales1WkAgo < NextFYSalesMean
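For readers who want to see how these filters combine, here is a rough Python sketch. It is not P123 syntax and makes several assumptions: that the separate rules are ANDed together, that `RankPrev(0)` is the current rank and `RankPrev(n)` the rank n periods earlier, and that each estimate is passed in as a hypothetical `(value_one_week_ago, current_mean)` pair.

```python
from typing import Sequence, Tuple

Estimate = Tuple[float, float]  # (value one week ago, current mean) -- assumed layout


def rank_rose(rank_history: Sequence[float]) -> bool:
    """True if the rank 1-4 periods ago was sufficiently below the current rank.

    rank_history[0] is the current rank, rank_history[n] the rank n periods ago.
    """
    current = rank_history[0]
    limits = {1: 0.90, 2: 0.8, 3: 0.8, 4: 0.8}
    return any(rank_history[n] / current < limit for n, limit in limits.items())


def passes_buy_filters(rank_history: Sequence[float], cur_q_eps: Estimate,
                       cur_fy_sales: Estimate, next_fy_sales: Estimate) -> bool:
    """All filters must hold (ANDing the rules is an assumption)."""
    if rank_history[0] <= 90:  # Rank > 90
        return False
    if not rank_rose(rank_history):
        return False
    # Each estimate must have been lower a week ago than its current mean.
    return all(week_ago < mean for week_ago, mean in
               (cur_q_eps, cur_fy_sales, next_fy_sales))


if __name__ == "__main__":
    estimates = ((1.00, 1.05), (500.0, 510.0), (560.0, 570.0))  # made-up numbers
    print(passes_buy_filters([95, 80, 90, 90, 90], *estimates))
    print(passes_buy_filters([95, 94, 93, 92, 91], *estimates))
```

Every input here (ranks and estimate means) is something a data vendor can revise after the fact, which is why these rules are a natural place to look for differences between live and simulated results.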

In the ranking system I use cashflow (Quarter).

Most important question: are those buy rules affected by look-ahead bias? (I think we discussed this on Jim’s initiative and with his help; P123 found the problem and fixed it.)

Also important: how big is the cash flow problem? Does the preliminary cash flow get overwritten by the non-preliminary data (not sure if I understood this right)? If yes, that would mean my sim has a look-ahead bias (not huge, it’s only 5% of the ranking, but still).

Not so important: any idea why some strats are doing better live?

Best Regards
Andreas

All,

I do not pretend to know what is going on. But I do have a simple and honest question: Is this supposed to happen? If it is, then I have some follow-up questions like does P123 snapshot all of the data since June 2020?

This is JUST ONE REBALANCE DATE. Transactions for the port, then the sim, for the same ticker and same date. RANKS AT THE FAR RIGHT are different enough to be significant, I think.

Why and what might I be missing? I am not saying I know the answer, but not understanding how this could happen, I closed my port which was not doing as well as the sim. This port is closed and no longer funded. I closed it just after 10/11/21 with some of the transactions (and reasons for closing it) shown below.

To summarize, the port was not doing as well as the sim, and the port had different transactions than the sim, in part, because the ranks on the transactions were different for the port and sim. The universe and ranking system are the same as confirmed by this screenshot. Port first then sim as before.

Thank you for any understanding of how this could happen or why I should fund this port again with no concerns…

Best,

Jim







Jim -

Your ranking system is extremely susceptible to tiny changes in estimates, and we don’t have any way to track how FactSet is backfilling those. A-Mark, for instance, has only two analysts. I’ve attached FactSet’s estimate history for those two analysts on the three estimate measures that comprise most of your ranking system. You can see that between the end of August and mid-September–four weeks prior to your sell date–all the estimates changed drastically, some going up, some down, some going up and then down, some going down and then up. If FactSet just adjusted the date of one or two of those changes, or adjusted the EPS estimate, that would cause the rank to change. What’s more, FactSet might have added one analyst recently and then backfilled his/her estimates.
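A quick arithmetic illustration of why thin analyst coverage makes a consensus so fragile, in Python. The estimate values are made up and the helper name is hypothetical; nothing here reflects A-Mark's actual estimates.

```python
from statistics import mean
from typing import Sequence


def consensus_shift(estimates: Sequence[float], analyst_index: int,
                    revised_value: float) -> float:
    """Change in the mean estimate when one analyst's number is revised."""
    revised = list(estimates)
    revised[analyst_index] = revised_value
    return mean(revised) - mean(estimates)


if __name__ == "__main__":
    # Hypothetical EPS estimates; one analyst moves from 2.00 to 2.40.
    two_analysts = [2.00, 2.40]
    twenty_analysts = [2.00] + [2.40] * 19

    print(round(consensus_shift(two_analysts, 0, 2.40), 4))
    print(round(consensus_shift(twenty_analysts, 0, 2.40), 4))
```

With two analysts, a single revision moves the mean by half the size of the revision; with twenty, the same revision moves it by a twentieth. A backfilled or re-dated revision therefore changes a mean-based factor far more for a thinly covered stock.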

I would advise using a ranking system that relies on far more factors, preferably largely uncorrelated and unrelated.

- Yuval

Yuval,

What makes my ranking system so susceptible?

I guess you have looked at it.

It has VERY NORMAL VALUE FACTORS one being EBITDAQ/EV. And normal sentiment factors that I learned from Marc and have been posted numerous times over the years by many members.

Ultimately you may be right. My ranking system has only 6 COMPLETELY normal factors that everyone would recognize!!!

Only one buy and sell rule that basically ensures that there have been no recent analyst downgrades.

I would argue what I did worked to prevent overfitting. The sim continues to perform beyond my wildest dreams

So I think a lot of noise factors may drown out the signal, making the sim and port more closely aligned.

So I agree completely actually.

Take home points:

  1. This is supposed to happen.

  2. P123 does not take a snapshot of all of the data.

Did I miss anything?

Thank you for expanding my knowledge of P123 and FactSet data.

Your comments are helpful and much appreciated.

Jim

Yuval,

As far as correlation is concerned, I performed PCA and factor analysis to reduce correlation between my nodes.

My sentiment and value nodes are completely uncorrelated.

Doing PCA and factor analysis stops me from adding a lot of noise factors which I say again: did prevent overfitting.

And one more question. The sentiment data is misleading in the sim no matter how many other factors I add, right? The misleading data just gets diluted by adding more factors. I don’t have to hire an AI expert for that right?

Extremely helpful. I did not fully understand FactSet’s earnings estimates data.

Honestly, I thought the lag recently added to that data had made it the most reliable data available at P123.

Jim

Yuval, please have a look at my question, thank you :-)))

Perhaps less is more? The difficulties we’re all having with prelims are why most research papers for factor analysis only use annual, final data that has been audited.

Preliminary data might just be completely useless for the type of analysis we do. During prelims, data is incomplete, unstandardized, and can change. Some factors in your systems will fall back, others will not. Add to this that markets have knee-jerk reactions that create value traps, and you have a complete mess during earnings. And having Compustat vs FactSet does not change the narrative much.

I’m not saying we’ll get rid of prelims. All I’m saying is that P123 should default to a much more stable mode. For example, when a user starts a new strategy, it should default to avoiding buying or selling during preliminary data (it’s possible to do this now, but it’s a bit convoluted).

Marco,

Thank you very much for taking this seriously.

I do not use preliminary data, at least to the extent that checking the box to exclude preliminary data completely eliminates it.

Yuval has just said earnings estimates are not at all reliable.

That being said, is FactSet’s PIT offering of earnings estimates data not a possibility? Is it really PIT?

As it is, I think Yuval is right about the earnings estimate data.

That is why I have been importing Zacks rank data into P123 using InList.

Again, I am extremely grateful for the information you have provided and for addressing this in a serious manner.

Jim