How to mechanically assign a score to a ranking system

InspectorSector · February 8, 2007, 7:32pm

stenci -

The idea behind the buckets is that we are working with the ranking performance system as it stands. You get a nice summary at the end of a ranking performance run with performance data organized into buckets. If we don’t use that method then how do you plan to collect the data for the metric? Will you run week 1, collect the top x% of the stocks, run week 2, collect the top x%, run week 3, etc. Then consolidate the results?

The two Excel spreadsheets may be useful, but I am primarily concerned with reducing the amount of time and effort needed to optimize a ranking system. So the spreadsheet should be able to vary the weightings for the various input factors, run the performance and come back with the metric. I don’t really care for a hybrid ranking / simulation spreadsheet.

Steve

stenci · February 8, 2007, 8:39pm

Steve, as you say the result from the buckets is a nice summary. They are good for a graphical representation. And you can have 5 buckets or 200 buckets. 5 are faster to read but less accurate, 200 are more accurate. If you make a 3000-bar graphic, it is difficult to read but more accurate.

Whether you calculate the slope, the standard deviation or the ratio that I mentioned above, you can apply it to a set of 5, 200 or 3000 buckets.
I’m just saying forget the buckets and think of stocks.

Example:
10 12 14 16 18 - 20 22 24 26 28 = 14 24
10 28 10 10 12 - 42 20 28 20 10 = 14 24
These are two groups of 10 stocks each. If you divide each group into two buckets you get the same result: the first bucket averages 14 and the second averages 24 in both cases.
But if you look at the data you see that the first group is well ordered while the second is a mess.
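As a rough check of the example above, the sketch below computes the two bucket averages for each group and then counts how often a stock returns less than the stock ranked just before it. The helper names `bucket_means` and `negative_deltas` are made up for illustration, and counting drops between neighbours is only one assumed way to measure "well ordered"; it is not necessarily the ratio mentioned earlier in the thread.

```python
def bucket_means(returns, n_buckets):
    """Split returns (already in rank order) into equal buckets and average each."""
    size = len(returns) // n_buckets
    return [sum(returns[i * size:(i + 1) * size]) / size for i in range(n_buckets)]


def negative_deltas(returns):
    """Count how often a stock returns less than the stock ranked just before it."""
    return sum(1 for prev, cur in zip(returns, returns[1:]) if cur < prev)


# The two 10-stock groups from the example, in rank order.
ordered = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
messy = [10, 28, 10, 10, 12, 42, 20, 28, 20, 10]

print(bucket_means(ordered, 2), negative_deltas(ordered))  # [14.0, 24.0] 0
print(bucket_means(messy, 2), negative_deltas(messy))      # [14.0, 24.0] 4
```

The bucket averages are identical, while the per-stock count separates the two groups.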

There are many statistical tools to smooth a set of data, and bucketing is just one of them. My point is: buckets are a great tool when we make graphics, they are not that good when we make numbers.

The 2 add-ins are not meant to replace the standard tools.
If you find the add-in for the simulations useful, then very likely you will find the add-in for the rankings useful too.

Stefano

jtbaccarat · February 8, 2007, 10:32pm

Steve, et al.

This thread is very important, and it is clear we all have different approaches, which is what makes the world interesting. I hope we all find utility in the various suggestions and that some great tools are created from the various suggestions that give each of us something to make our work easier, more flexible, etc. Good suggestions, folks.

I guess what I am trying to say is that as I go through my own ranking system methodology a key component is how many stocks are actually in the buckets, in addition to the other metrics that have been suggested. I find that the number of stocks is crucial, at least as far as I operate.

So, in addition to the other proposed metrics by the community, I propose that there should be a metric that shows the number of stocks in the buckets.

Hovering the mouse over a bucket is a simple, quick check of the number of stocks in it, but a numerical listing, something like the following example, would be even better:

Bucket    Number of Stocks
95        115
90        234
85        443
80        555
etc.

And, of course, if that is useful, then it would be good to populate the number of stocks in the buckets into the Excel work sheets.

crakes · February 8, 2007, 11:10pm

All,

I think to declare a ranking system “good” it should be one that comes close to capturing the actual returns of the market buckets.

An example of what I mean:

If the S&P500 Universe returns are ranked over a 5 year period and the top bucket has 40% gains and the bottom bucket -5% gains, then that would be the benchmark for assessing ranking systems for that Universe, for the time period, for the number of buckets chosen. The benchmark would be variable by these 3 parameters…

A ranking system with a “perfect score” would capture the returns per bucket that actually occurred. So the “goodness” of the ranking system would be determined by the “nearness of replication” of the ranking buckets against the benchmark buckets.

A ranking system cannot be any better than the actual spread across buckets for a Universe, for a time period, for a given number of buckets… agreed?

So, if we are agreed, that leaves 2 action items:

(1) The statisticians amongst us can figure out how to quantify the “nearness of replication”, and
(2) to an earlier point - can we get some guidance on the Reverse Engineering feature? (should have put this one in a different post - sorry!)

Thanks,
Carl

crakes · February 8, 2007, 11:14pm

Actually, being able to see the actual ranking for stocks in a Universe across a time period for a number of buckets would be valuable in itself - that is if we could then click on the buckets and see the stocks. Could lead to some interesting re-engineering efforts! Hmmm - feature request?

jtbaccarat · February 8, 2007, 11:36pm

Crakes,

Your ‘nearness of replication’ concept is very close to the mark: it ties in all of the metrics.

Right now the only way to test the nearness is to create a simulation; maybe the solution involves tying the ranking system and the simulation together, to determine ‘nearness’?

crakes · February 8, 2007, 11:49pm

I was thinking of a regression/correlation of the ranking datapoints (return/bucket) against the benchmark. But there may be more powerful statistical tools.
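A minimal sketch of the correlation idea, assuming the per-bucket returns of the ranking system and of the benchmark are already available as two lists of equal length. The numbers below are hypothetical, and `pearson` is just a hand-written helper, not anything provided by P123.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of bucket returns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical bucket returns in percent, lowest bucket first.
benchmark = [-5.0, 5.0, 15.0, 25.0, 40.0]
ranking = [2.0, 6.0, 11.0, 15.0, 22.0]

print(round(pearson(benchmark, ranking), 3))
```

One caveat with this choice: correlation ignores scale, so a ranking system that captures only half of the benchmark spread can still score close to 1. A regression slope, or the average gap per bucket, would have to be looked at alongside it.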

InspectorSector · February 9, 2007, 2:21am

OK Stenci -

Your approach seems to be reasonable, so long as 3000 buckets don’t unduly load the server. I don’t really see a need for that much resolution. One observation I have right off the bat is that the no. of negative delta stocks should be the numerator and the no. of stocks should be the denominator. Why? Because it will be necessary to run multiple time frames and average the results. For this purpose the variable should be the numerator and the constant the denominator.

I hope you aren’t offended but I personally don’t use the existing EXCEL add-in. I think it is a remarkable piece of software, but I realized during the trial period that I was more interested in ranking system development and testing, simply because the ranking system is the foundation. Testing the system with a bunch of rules thrown in to complicate matters seems like trying to run before one learns to walk. You have to test the ruggedness of the ranking system before you test the ruggedness of the system that uses the ranking system.

I am interested in two things:
(1) sensitivity analysis of a ranking system. I would like to be able to increment and decrement each ranking factor weighting individually (by some percentage) and see the change in metric. Does the ranking system improve or get worse with a change in weighting factor? I would like to have the ranking system tested over multiple time frames and the metric averaged over the time frames. Each ranking factor sensitivity should be tested in succession and at the end of it all a summary should come up for each ranking factor. Obviously the weights have to add up to 100 so as a weight is increased or decreased then the rest of the factor weights need to be normalized. This should all be done with a minimum of keystrokes.
(2) Universe analysis. I would like to be able to quickly run the ranking system performance on a list of Universes and visually see the results (20 buckets is fine plus metric)
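The weight adjustment described in (1) can be sketched as follows. This covers only the bookkeeping of bumping one weight and renormalizing the rest so the total stays at 100; actually running the ranking performance and collecting the metric is left out. The factor names, the starting weights and the function names are all hypothetical, and the sketch assumes at least two factors with non-zero weight.

```python
def bump_weight(weights, factor, pct):
    """Scale one factor's weight by pct percent, then rescale the others so the total stays 100."""
    new_weight = weights[factor] * (1 + pct / 100)
    rest_total = sum(w for name, w in weights.items() if name != factor)
    scale = (100 - new_weight) / rest_total
    return {name: (new_weight if name == factor else w * scale)
            for name, w in weights.items()}


def sensitivity_cases(weights, pct):
    """Yield (factor, signed pct, adjusted weights) for an up move and a down move of each factor."""
    for factor in weights:
        for signed in (pct, -pct):
            yield factor, signed, bump_weight(weights, factor, signed)


# Hypothetical factor names and weights, summing to 100.
base = {"value": 40.0, "momentum": 35.0, "quality": 25.0}
for factor, signed, adjusted in sensitivity_cases(base, 10):
    print(factor, signed, {name: round(w, 2) for name, w in adjusted.items()})
```

Each adjusted set of weights would then be run over the chosen time frames and the metric averaged, giving one up and one down reading per factor for the summary.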

The rest of you guys -

Great ideas. But what I described above is what I am interested in.

Steve

jerrodmason · February 9, 2007, 5:37pm

Maybe I’m having a Senior Moment, but aren’t there by definition an equal number of stocks in each bucket? If not, then what does a bucket mean?

DennyHalwes · February 9, 2007, 6:18pm

Jerry,

In general there is a nearly equal number of stocks in each bucket for most Ranking Systems that have many factors. However, with Ranking Systems that have few factors it is not uncommon for the ranks of many stocks to cluster around certain values, which leaves very few stocks in some buckets. The extreme of this is single factor Ranking Systems. With many of them there will be many buckets with no stocks at all. (See Dan’s Single Factor Ranking System Thread)

Also, if you are evaluating various filters in the Ranking System Performance, Mkt Cap for example, there will be a different number of stocks in the buckets because the universes have different total numbers of stocks.

Denny

jtbaccarat · February 9, 2007, 6:35pm

Jerrod,

That’s not my understanding, and not from my actual empirical testing, which is why I would like to know how many stocks are ‘in’ the bucket. It is a fairly crucial fact IMHO.

Jerrod, you hit the nail on the head: “Maybe I’m having a Senior Moment, but aren’t there by definition an equal number of stocks in each bucket? If not, then what does a bucket mean?”
Exactly (“what does a bucket mean”): wouldn’t it be nice to have basic information on a bucket (other than the annualised return) such as the number of stocks in the bucket. Personally, I often find myself flipping to the screener page and using a rank filter rule to try to determine how many stocks are in a bucket or system, or series of buckets. The number of buckets is definitely contained in the P123 software code somewhere, so it would be nice to have the basic granular data (number of stocks or other suggestions, but number of stocks is fundamental) displayed/presented somehow to the user.

The confusion/misunderstanding may have arisen due to the phrase ‘equally weighted’, which has nothing to do with the number of stocks in the bucket.

Try it out for yourself and you will see.

BTW, for clarification, equal-weighted concerns how the index (or in this case, the bucket) returns are calculated. A Google search will show the various methodologies (there are three big categories of indices) for creating an index (a big bucket):

Equally Weighted (P123 Buckets)
Price Weighted (DJIA)
DJIA Methodology
Market Cap Weighted (S&P500) (go to the S&P website for further info).

Here’s a good article on indices (buckets), and there are some other links to similar articles to the right.

Fool.com Construction of an Index

Also the ease of calculation from easiest to hardest is: Equal, Price, Market Cap.

But, to reiterate, equally weighted does not mean that each bucket has the same number of stocks: it means each bucket’s (or little index’s) returns are calculated based on equal weighting (it makes the return calculation much easier, and is probably a wise choice by P123; a happy medium).

That said, it does help to understand the differences between the three types of indices. It is important to understand that a P123 bucket is an equal-weighted index and it is important to understand what the bucket’s returns tell you as opposed to what they would tell you if they were market cap weighted (chosen with a radio button, maybe).

For instance, the Dow 30 is price weighted, meaning that each of the thirty stocks contributes in proportion to its share price, regardless of its market cap (there is a divisor in the DJIA, which makes price-weighted more complicated than equal-weighted). Alternatively, the S&P500 is market cap weighted.
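To make the difference between the three weighting schemes concrete, here is a small sketch on a hypothetical three-stock bucket. It assumes a single holding period with no changes in share count, so a price-weighted return is each stock's return weighted by its starting price and a cap-weighted return is weighted by starting market cap. All of the figures are invented.

```python
def weighted_return(returns, weights):
    """Average the returns using the given weights (weights need not sum to 1)."""
    total = sum(weights)
    return sum(r * w for r, w in zip(returns, weights)) / total


# Hypothetical three-stock bucket: period return, starting price, starting market cap.
returns = [0.10, -0.05, 0.30]
prices = [100.0, 20.0, 5.0]
caps = [50_000.0, 2_000.0, 300.0]

equal = weighted_return(returns, [1.0] * len(returns))  # every stock counts the same
price = weighted_return(returns, prices)                # high-priced stocks dominate
cap = weighted_return(returns, caps)                    # large companies dominate

print(round(equal, 4), round(price, 4), round(cap, 4))
```

In this made-up bucket the small stock with the 30% gain lifts the equal-weighted figure well above the other two, which is the kind of difference a market cap weighted option would expose.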

P123 likely decided that equal weighted was the easiest to implement, simplest to understand, and I think they made the right decision. (It would be nice to have a market cap weighted radio button, though, as well).

If the radio button was clicked you would likely find the returns would gradually approach the ‘market’, but that all depends on what filters you have chosen (small cap, etc.).

I guess what it really comes down to is that there is a lot going on behind the scenes in a ranking (index) methodology, and some basic information like the number of stocks in a bucket would be VERY helpful.

So, no, the buckets do not contain the same number of stocks, which is why it would be a very good piece of information for all ranking systems.

To look at it a different way, imagine an index for which the number of component stocks is always changing. It is natural the number of stocks will change each time we change the index methodology (ranking system), but I would at least like to know how many there are. Think of each new ranking system you create as a similar process to the WSJ Editors deciding every few years which stocks to include in the DJIA. The analogy is apt. Ranking System = Index Methodology. And, in a way, when you create a simulation based on that ranking system (index) you are creating an index fund. Now you know what goes on inside the big fund companies (at least for their index funds).

Also, if you frequently use the Eval function, or boolean ranking, or other filtering (as an index creator would) then you will find that buckets do not always contain the same number of stocks (or, more accurately: almost never).

jtbaccarat · February 9, 2007, 7:19pm

Denny,

Agreed.

jerrodmason · February 9, 2007, 9:15pm

OK, Denny, I remember now. This part all comes back to tie-breaking. If ranking quanta were sufficiently small, there would (except for your next point) be identical numbers in each bucket. But simple ranking systems have large quanta, and so you wind up with a gazillion stocks tied for 43rd best, and none being 42nd best.
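The tie effect can be illustrated with a toy sketch. It assumes ranks run from 0 to 100 and that a bucket simply covers a fixed slice of that range; how P123 really assigns ranks to tied stocks is not stated in this thread, so the coarse ranks below are purely hypothetical.

```python
from collections import Counter


def bucket_counts(ranks, n_buckets):
    """Count stocks per bucket, where bucket i covers ranks from i*width up to (i+1)*width."""
    width = 100 / n_buckets
    counts = Counter(min(int(r // width), n_buckets - 1) for r in ranks)
    return [counts.get(i, 0) for i in range(n_buckets)]


# Small quanta: 100 stocks, every rank distinct.
fine = [i + 0.5 for i in range(100)]
# Large quanta: a hypothetical single factor that takes only three values, so stocks tie.
coarse = [20.0] * 50 + [60.0] * 30 + [95.0] * 20

print(bucket_counts(fine, 10))    # [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
print(bucket_counts(coarse, 10))  # [0, 0, 50, 0, 0, 0, 30, 0, 0, 20]
```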

Yup. This is an artifact of the annoying Rank-Before-Filter procedure that had been discussed earlier.

[/SeniorMoment] Thanks.

jerrodmason · February 9, 2007, 9:37pm

Amen. A long time ago I put up a feature request to allow drilling down into a bucket. Ideally, to my thinking:
Level 0: Mean return (what we have now)
Level 1: Basic statistics on the bucket contents (number, mean, standard deviation, min, max)
Level 2: Raw data on every transaction in the bucket. Rank, dates, return.

All data should be downloadable to Excel.
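As an illustration of what Level 1 could contain, the sketch below summarizes one bucket from a plain list of per-stock returns. The function name and the sample returns are hypothetical; it only assumes the raw Level 2 data were available in some form.

```python
import statistics


def bucket_summary(returns):
    """Basic statistics for the returns of the stocks in one bucket."""
    return {
        "count": len(returns),
        "mean": statistics.mean(returns),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(returns) if len(returns) > 1 else 0.0,
        "min": min(returns),
        "max": max(returns),
    }


# Hypothetical per-stock returns (percent) for one bucket.
sample = [12.0, -3.5, 8.0, 25.0, 1.5]
for name, value in bucket_summary(sample).items():
    print(name, round(value, 2))
```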

martinfierro · February 9, 2007, 10:08pm

With all due respect, do we really need a ranking system of ranking systems?
I believe the effort dedicated to this would be much better employed on one of the many requests that have been proposed during the last couple of years.
Martin

jtbaccarat · February 9, 2007, 10:09pm

Jerrod,

Good man; Amen to that, Right on… etc. Excel, yep, and a simple display of the data values, like in the proposed Level?, below, similar to what exists in Simulations->.

Level?: Like the Simulations->Holdings->Ranks->Include Composite & Factor Ranks (i.e., but in this example, for the bucket chosen you can see what the mean values are of all the factors and composites within the bucket; this will give a huge amount of information and would be very helpful in deciphering exactly what we are looking at ‘inside’ a bucket; or as you aptly put it: ‘what does the bucket mean?’)

jtbaccarat · February 9, 2007, 10:18pm

Hi Martin,

(1) P123 is completely built around Ranking Systems.
(2) Any successful model will have its success based on the ranking system; if one does not believe that, then one should not use a ranking system at all.
(3) Martin, I assume you have some personal methodology of evaluating a ranking system and you do not randomly pick a ranking system. Therefore, would it not be a nice feature in P123 so you could use your personal methodology to find a ranking system either in the public domain or in your private ranking systems that meets your criteria?

Everything is always open to debate…which we can all be thankful for.

Cheers

martinfierro · February 9, 2007, 10:26pm

jtbaccarat:

Let me try to verbalize what is really an intuitive perception.

The concept of ranking implies a single valuation dimension: absolute positions along a single axis. How we define that axis is irrelevant; the key point is that the ranking is a scalar number, therefore one dimensional.

However, in my opinion, ranking systems cannot be evaluated along a single dimension. Absolute return per bucket is, indeed, a valuable parameter. But there are other elements that become visible only when the ranking is applied in a trading system. As you point out, ranking is an integral part of P123, but at the end of the day P123 is about trading.

Therefore, when selecting ranking systems I look at how they interact with other parts of the trading methodology, what kind of drawdowns they deliver, % profitable, etc. The same ranking system, when applied in different trading systems, will deliver very different results.

I am not implying that this would be useless. Just that given the limited resources that P123 has, perhaps these resources would be better used for some other requests.

Martin

o806 · February 10, 2007, 1:35am

Yes, Yes, Yes. This is a great way to do it.
And Level 1 (Basic statistics) should include Max Draw Down.
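For reference, maximum drawdown is the largest peak-to-trough fall of a value series, expressed as a fraction of the peak. A minimal sketch, assuming the bucket's cumulative value over time is available as a list (the sample curve below is hypothetical):

```python
def max_drawdown(equity):
    """Largest fall from a running peak, as a fraction of that peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest value seen so far
        worst = max(worst, (peak - value) / peak)  # fall from that peak
    return worst


# Hypothetical cumulative value of one bucket over six periods.
curve = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]
print(round(max_drawdown(curve), 4))  # 0.3333, the fall from 120 to 80
```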

InspectorSector · February 10, 2007, 3:13am

Martin -

Somebody correct me if I am wrong but I believe that the ranking system metric is for Stefano and Stefano is only working on EXCEL add-ins. The simulation add-in is essentially done and the next obvious step is to improve the user’s interface for ranking systems via an EXCEL add-in. I’m not sure why this activity would impact the other feature requests out there.

No matter how you look at it, trading systems are only as good as the ranking system they are built on top of. You can apply the existing ranking systems to any trading system or trading methodology you want but you won’t end up with a largecap or high liquidity trading system. This is the last frontier and ranking systems will have to be built from the ground up to make inroads. The ranking systems are out there. I’ve already demonstrated that.

My personal opinion is that without more effective tools for designing and ruggedizing ranking systems we will soon come to the end of the road for new ports. There is only so much you can do with what is out there. Right now I feel as if I am shuffling the same small and microcap stocks between one port and another.

Steve