Measure - MATRIX AND CORRELATIONS BETWEEN LONG-ONLY FACTOR RETURNS

I have 50 nodes and want to pick out the ones that have historically been least correlated with each other. Does anyone know how I can go about finding them?

Is it possible to measure this, for example by creating a correlation matrix?

Here are the 50 nodes I want to test:

NetFCFQ/MktCap Universe Higher
FCFQ/mktcap Universe Higher
(NetFCFQ/ShsOutMR)/Close(0) Universe Higher
FCFPSQ/Price Universe Higher
(NetFCFQ / AstTotQ) + (NetFCFTTM/mktcap) industry Higher
NetFCFQ / AstTotQ Universe Higher
NetFCFQ/AstTotQ Universe Higher
FCFQ/EV Universe Higher
(OperCashFlQ-CapExQ)/EV Universe Higher
(mktcap + DbtTotQ) / Eval(EBITDAq>0,EBITDAq,NA) Sector Lower
(OperCashFlQ+ IntExpQ-CapExQ)/EV Universe Higher
OperCashFlQ/EV Universe Higher
IsNA( (EBIT(0,Qtr)-EBIT(4,Qtr))/ABS(EBIT(4,Qtr)) - (AstTotQ-AstTot(4,Qtr))/ABS(AstTot(4,Qtr)), 0) Universe Higher
(OCFPSTTM-(CapEx(0,TTM)/ShsOutMR))/ Price Universe Higher
OpIncPSQ/EVPS Universe Higher
NetFCFPSQ/Price industry Higher
NetFCFTTM / (Price * Vol3mAvg) Universe Higher
OpIncBDeprQ/EV Universe Higher
EBITDA(0,QTR)/EV Universe Higher
FCFYield Universe Higher
OpIncGr%PYQ Sector Higher
(NetFCFPSTTM + DivPSTTM) / Price Universe Higher
FCFTTM/mktcap Universe Higher
GrossProfitQ/AstTotQ Universe Higher
(FCFPSTTM + DivPSTTM) / Price Universe Higher
GR%PYQ(FCF) Universe Higher
FCFGr%PYQ Universe Higher
EV2EBITDAQ Universe Lower
ROE%Q Universe Higher
GR%PYQ(NetFCF) Universe Higher
NetFCFGr%PYQ Universe Higher
(opincq-opincpyq)/max(2,abs(opincpyq)) Sector Higher
IncBTaxQ/EV Universe Higher
PEInclRDQ Universe Lower
OperCashFlQ/AstTotQ Universe Higher
(OpMgn%Q - OpMgn%PYQ) / abs(OpMgn%PYQ) Universe Higher
OpMgn%Gr%PYQ Universe Higher
CurFYEPSMean/abs(CurFYEPS13WkAgo) Universe Higher
EBITDAGr%PYQ Universe Higher
NetFCFTTM / AstTotTTM Universe Higher
eval(NextFYEPS8WkAgo>0,(NextFYEPSMedian-NextFYEPS8WkAgo)/NextFYEPS8WkAgo,NA) Universe Higher
GrossProfit%AssetsQ Universe Higher
NextFYEPSMean/abs(NextFYEPS13WkAgo) Universe Higher
AstTurnQ industry Higher
NetFCFPSTTM/price Universe Higher
NetFCFPSTTM / Price Universe Higher
(OperCashFlTTM - CapExTTM + 0.8*IntExpTTM) / EV Universe Higher
(opercashflttm-capexttm+0.87*intexpttm)/$ev Universe Higher
NetFCFTTM / MktCap Universe Higher
NetFCFTTM/mktcap Universe Higher

I guess you could use the API to download the ranks for each factor for each stock in your universe, and then calculate the correlation between the factors that way. You might have some issues for stocks with many NAs, not sure how to deal with that.
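One common way to handle the NAs is pairwise deletion: for each pair of factors, use only the stocks that have a rank for both. Below is a minimal sketch of that idea in plain Python. The data layout (`{factor: {ticker: rank}}`), the tickers, the rank values, and the `min_overlap` threshold are all made up for illustration; how you actually download the ranks is not shown here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(ranks, min_overlap=3):
    """Build a symmetric factor-by-factor correlation matrix.

    ranks: {factor_name: {ticker: rank or None}} (assumed layout).
    Each pair of factors uses only tickers where both ranks are present.
    Pairs with fewer than min_overlap shared tickers get None.
    """
    names = list(ranks)
    matrix = {a: {} for a in names}
    for i, a in enumerate(names):
        for b in names[i:]:
            common = [t for t in ranks[a]
                      if ranks[a][t] is not None and ranks[b].get(t) is not None]
            if len(common) < min_overlap:
                r = None
            else:
                r = pearson([ranks[a][t] for t in common],
                            [ranks[b][t] for t in common])
            matrix[a][b] = r
            matrix[b][a] = r
    return matrix

# Hypothetical ranks for three of the nodes; None marks an NA.
example_ranks = {
    "FCFYield":   {"AAA": 90.0, "BBB": 70.0, "CCC": 50.0, "DDD": 30.0, "EEE": None},
    "EV2EBITDAQ": {"AAA": 80.0, "BBB": 60.0, "CCC": 40.0, "DDD": 20.0, "EEE": 10.0},
    "ROE%Q":      {"AAA": 10.0, "BBB": None, "CCC": 50.0, "DDD": 70.0, "EEE": 90.0},
}
corr = correlation_matrix(example_ranks)
for name, row in corr.items():
    print(name, row)
```

With many NAs the overlap between two factors can get small, so the correlation for that pair becomes noisy. The `min_overlap` cut-off is one way to flag those pairs instead of trusting them.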

Thank you. I tried this solution, even though it is a lot of manual work: non correlated factors

It gave some results:

I’ve also spent quite some time determining correlations and the significance of the relation between different factors. This can be useful to get some idea of how factors relate. However, in my opinion, a high correlation between two factors doesn’t necessarily mean that one of them should be excluded.

Let me give you an analogy to explain what I mean.

Let’s say you are in the business of buying apartments to rent out. Each time you go to buy an apartment, you first rank all apartments in the neighbourhood based on a set of variables to determine their relative value. You most likely check the foundation, the state of the roof, the price and rent per square foot or m2 of the apartment vs the neighbourhood, the general state of the apartment (kitchen, bathroom, etc.), and some other relevant stuff.

Now let’s say a friend of yours comes along and tells you: “You know, I determined that there is a 95% correlation between the state of the roof and the state of the foundation. So you do not have to check the roof anymore from now on.”

What would you do? Just leave the roof be? Never check it again? I think that would be a mistake. You might buy an apartment without a roof and think everything is OK, because you checked the foundation.

Now this is not a perfect example, especially because with an apartment you might save some actual time by not checking the roof, and that time could be allocated somewhere else. But for a stock ranking system, you do not really save any time. So why not check all attributes anyhow?

That’s my 2 cents.

Yes, you are right. I tried to compose a multi-factor system out of 25 of the least correlated nodes. It did not improve anything… :face_with_peeking_eye:
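For anyone who wants to repeat the experiment, picking a subset of least correlated nodes from a correlation matrix can be sketched as a simple greedy selection. This is only one possible heuristic, not necessarily what was done above: start with the node that has the lowest total absolute correlation to all others, then keep adding the node whose highest absolute correlation with the already chosen ones is smallest. The node names and correlation values below are invented, and treating a missing correlation as 1.0 is an assumption made to stay on the conservative side.

```python
def pick_least_correlated(corr, k):
    """Greedily pick k node names from a symmetric correlation matrix.

    corr: {name: {other_name: correlation or None}} (assumed layout).
    A missing (None) correlation is treated as 1.0, i.e. fully correlated.
    """
    names = list(corr)
    if k >= len(names):
        return names

    def abs_corr(a, b):
        r = corr[a][b]
        return 1.0 if r is None else abs(r)

    # Seed with the node that is least correlated with everything else overall.
    first = min(names, key=lambda a: sum(abs_corr(a, b) for b in names if b != a))
    chosen = [first]
    while len(chosen) < k:
        remaining = [n for n in names if n not in chosen]
        # Add the node whose worst-case correlation to the chosen set is lowest.
        best = min(remaining, key=lambda n: max(abs_corr(n, c) for c in chosen))
        chosen.append(best)
    return chosen

# Hypothetical correlation matrix for four nodes.
example_corr = {
    "A": {"A": 1.0, "B": 0.9, "C": 0.1, "D": 0.2},
    "B": {"A": 0.9, "B": 1.0, "C": 0.2, "D": 0.3},
    "C": {"A": 0.1, "B": 0.2, "C": 1.0, "D": 0.8},
    "D": {"A": 0.2, "B": 0.3, "C": 0.8, "D": 1.0},
}
selected = pick_least_correlated(example_corr, 2)
print(selected)
```

A greedy pick like this is not guaranteed to find the best possible subset, and low correlation between nodes says nothing about whether the nodes are any good individually, which fits with the result above.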