To Quant Or Not To Quant, That Is The Question

Are we talking here about a product thing or a marketing thing?

I’ve been under the impression that we are valuable in a family office context. Can it be argued that this is what Andreas will be doing? I’m not aware of the family office market being tied to the sort of hi-quant for which Jim advocates. My impression is that family office users are part of a broad cross section, some of whom we serve and some of whom we don’t.

If I’m wrong from the product standpoint (i.e. if the family office users are disproportionately hi-quant) please tell us more.

As to marketing, assuming p123 can now do more with the family office segment, that’s a different story. I think we’ve been equal-opportunity marketers; we market horribly to all segments. I’m not a marketing pro however, so in this area, I’m about questions, not answers. Anything you can share that can help us identify and communicate to this customer group would be greatly appreciated. I suspect that it can, in fact, be a strong growth area for p123, so I’d love to hear about how we can market to it. (As amateurish as I am in marketing, I think I know enough to assume that in going from rotten to better to good, effective targeting is an important early step).

** Edit ** I’m saying that Quant or No Quant is a symptom, not the issue.

I am horrible at marketing myself so take anything I say with a grain of salt. It is just my observation that Family Office is significant and growing. And there are a lot of these types of customers hanging around P123. It isn’t so much a question of the tools P123 provides, but how the site is presented. P123 needs to sell an image that it is “here for the Family Office”. Make it known that you have the platform, tools and resources (consultants) to hold the customer’s hand, get things going, and going successfully in a painless fashion. You want to get that message out. Treat it as a separate business unit because right now what you have is a very sophisticated platform that intimidates the average user. There is a manhood issue here. Invariably they come to someone like me to implement their ideas, not because it is too complicated for them to tackle, but because they “don’t have the time”. There is a technical barrier that needs to be overcome, but ego doesn’t allow them to come and ask for help. These people have money to spend, but you need to hold their hands and convince them that P123 is the answer.

SteveT has a great question. What is a quant? Then I guess there is hi-quant?

Then there is running literally thousands of optimizations in the rank performance which is….basic fundamental analysis. You know the average basic fundamental analyst at P123.

Personally, I do not see much differentiation in the labels here.

The bottom line should be…well the bottom line. Is it better? Will it attract customers? How hard would it be?

Again, I think not so hard but could be wrong. It is definitely better and it will be noticed.

I think it could even be easier for P123 in the long run.

-Jim

For what it’s worth, here are my two cents on the subject. To increase interest in your service, Portfolio123 should do the following:

A) Focus less on promoting optimized ranking systems. The severe underperformance of the DM models should have set off alarm bells. Perhaps the overemphasis on super-optimized quant models (and not the changing dynamics of the market, i.e., growth vs. value) is the main culprit. This poor performance has led to 1) the people designing these models dropping out of your service and 2) newcomers looking at the results and getting discouraged (That DM ranking screen is a horrible advertisement for your service!). It is hard to believe that these DM builders would have outperformed their benchmarks if only they had more quant tools. (Note: I’m all for new tools! I just don’t think that is the solution to the main problem of over-optimized models leading to poor performance, leading to fewer subscribers)

B) Focus more attention, and marketing, on (yes, lowly) screens. The average do-it-yourselfer loves screens and understands them! I would argue that most screens are less optimizable than ranking systems. I have no data to show if they in fact perform better, but try this experiment: Start a new screen and put in just three factors: Your favorite value, quality, and growth fundamental factors (your factors are probably better than mine). Next, take the top 20% of each factor. Add a minimum liquidity value (try $100K a day) and eliminate the most shorted stocks (try bottom 50%). Run a two-year backtest. My simple three-factor screen (and probably your screen, too) would have beaten over 80% of all the DMs! I realize these aren’t out-of-sample results, but, still, how is that possible? After all, people spent thousands of hours optimizing (torturing?) the data into ranked systems.
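
A minimal sketch of the screen described above, in Python with pandas. The column names (value, quality, growth, avg_dollar_volume, short_interest), the function name and the synthetic universe are assumptions made for illustration; they are not P123 factors or P123 data. It also assumes that higher factor values are better and reads "bottom 50%" as dropping the most-shorted half of the universe.

```python
import numpy as np
import pandas as pd

FACTORS = ["value", "quality", "growth"]  # hypothetical column names, higher = better


def three_factor_screen(df, min_dollar_volume=100_000):
    """Return the rows that pass every rule of the simple screen."""
    # Percentile-rank each factor across the universe (1.0 = highest value).
    factor_pct = df[FACTORS].rank(pct=True)
    in_top_20 = (factor_pct > 0.80).all(axis=1)

    # Minimum liquidity: average daily dollar volume.
    liquid = df["avg_dollar_volume"] >= min_dollar_volume

    # Assumed reading of "bottom 50%": drop the most-shorted half.
    lightly_shorted = df["short_interest"].rank(pct=True) <= 0.50

    return df[in_top_20 & liquid & lightly_shorted]


# Synthetic universe, for illustration only (not real market data).
rng = np.random.default_rng(0)
n = 2000
universe = pd.DataFrame(
    {
        "value": rng.normal(size=n),
        "quality": rng.normal(size=n),
        "growth": rng.normal(size=n),
        "avg_dollar_volume": rng.lognormal(mean=13, sigma=1.5, size=n),
        "short_interest": rng.uniform(0, 0.3, size=n),
    },
    index=[f"T{i:04d}" for i in range(n)],
)
picks = three_factor_screen(universe)
print(f"{len(picks)} of {n} stocks pass the screen")
```

With only five rules and no fitted weights there is very little to tune, which is the point being made about screens versus optimized ranking systems.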

P.S., I believe the “showing off” of optimized, simulated portfolios in threads (that show amazing stats, of course) is wrong and misleading, especially without proper disclosure. Why do people rarely, if ever, include out-of-sample performance? As a relative newcomer, I look at these beautiful charts and then look up the person’s DM performance and say “what the hell?”

P.P.S., I really enjoy Portfolio123!

Doug

Here is my two bits.

I was able to use some very powerful tools at WorldQuant. They had Python integration and every analytic you could shake a stick at. After a number of years they reduced pay for the Python-generated strategies as people were cranking out tens of thousands of hyper-optimized systems that failed. Last year their fund did horribly. Then last month they let go all of their part-time consultants (hundreds of them, me included) and shut down WebSim (the quant platform). It didn’t work the way they had hoped. A failed experiment. They had even offered a free 1-year program to learn quant techniques but this didn’t produce what they wanted.

It is my belief that slowing down the process and creating systems without excessive automation and machine learning is better. That being said, I do think there is room for P123 to automate some data-collecting processes such as having a ‘factor page’ where hundreds of factors are displayed with charts showing the bucketed returns with a user-defined time frame and universe. I am not against making my life easier, but with each automated step there is an increasing responsibility for the creator to know what he is doing.
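
As a rough illustration of what one chart on such a ‘factor page’ would compute, here is a sketch in Python with pandas that splits a factor into equal-count buckets and averages the forward return in each. The function name, the bucket count and the synthetic data are assumptions for illustration only; this is not an existing P123 feature.

```python
import numpy as np
import pandas as pd


def bucketed_returns(factor, fwd_return, n_buckets=5):
    """Mean forward return per factor bucket (bucket 1 = lowest factor values)."""
    # qcut splits the factor into equal-count buckets; labels=False gives 0..n-1.
    bucket = pd.qcut(factor, q=n_buckets, labels=False) + 1
    return fwd_return.groupby(bucket).mean()


# Synthetic data, for illustration only: a weak link between factor and return.
rng = np.random.default_rng(1)
factor = pd.Series(rng.normal(size=1000))
fwd_return = 0.01 * factor + pd.Series(rng.normal(scale=0.05, size=1000))
print(bucketed_returns(factor, fwd_return))
```

Restricting the inputs to a chosen universe and date range before calling the function would give the user-defined time frame and universe mentioned above.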

I am dubious as to how many hard-core quants would be attracted to P123 with a few added tools. I think there are other platforms that have more granular data (tick bars), pay-per-use analytics and Python integration such as Quantopian which are a better fit for that market. My personal experience is that hard-core quants want total control at the programming level and that is not something P123 can provide.

I am not in disagreement with you, Jim, that more quant-like tools could help a certain subset of people already using P123. Maybe a few people would sign up as well. But I worry that the money made from this would be too little, too late. I view P123 as being at a critical juncture and I want it to grow even if it’s in ways that don’t immediately benefit me. My thought would be to target a mass market of lower-hanging fruit until earnings climb and then build out additional tools.

These are my ideas on how to boost revenue and profit for P123 while leaving the door open for more features down the road whether it be for quants, retail investors or someone in the middle.

  1. A separate site focused just on themed stock screens. Every retail investor I have ever known has been overwhelmed and afraid when they see the vast tools of P123. I have been here 8 years and I am still learning new things and feel overwhelmed at times. Make a site like AAII stock screens. Names and themes that people already know and love. Net nets, Piotroski, Graham, etc. etc. Have half the screens for free and the other half available for something like $10 per month. Eventually, some people will take the plunge to develop their own and sign up.
    https://www.aaii.com/stockideas/performance

  2. Bridge the gap between P123 programmers and capital. Showcase our work to family offices and smaller RIAs. Become the workplace so P123 users can make money. P123 can be like agents helping us find work and P123 takes 25% off the top plus charges the RIA or family office platform fees to house the models that we design for them. Become the Elance or UpWork of the stock model world.

  3. As the revenues come in, by all means keep working on the tools, more analytics and features which are requested by firms and P123 users.

I can attest that P123 still has loads of juice. But I don’t find it coming from doing the same old thing over and over. People I consult with are always pushing me to test out new theories and ideas. Ones that I am not comfortable with initially. And every once in a while something beautiful emerges.

I would be happy to work on themed stock screens (although I think P123 already has quite a few) designed in the fashion of AAII and on a separate website. I would work for free on this.

I’m not sure what “quant” means, or if I should feel offended at being called one. /s

P123 is a great service. The Point-In-Time database is worth the subscription price on its own. The simulation and screening engines are powerful and flexible. I’m a Python programmer, and I consider P123 superior to any of its competitors. My only real feature requests (international data and older historical data) are already in the works. We use P123 to effectively manage our investments, and have the real returns to convince us it works.

I’m not an expert on machine learning, but I’m a competent enough programmer and have done the work to gather the data and test many of the ideas thrown around on these forums (for example, running various forms of equity curves through scikit-learn’s machine learning algorithms). I have not been impressed with the results so far. Most of the machine learning field appears to be about designing ever more complex variations on linear regression. I have had much more success employing the same boringly basic methods Marc and Yuval so effectively advocate. And P123 provides excellent tools to support those methods.

Python, and programming / machine learning generally, cannot do anything that you can’t do in Excel. Python just does whatever you tell it to do faster and without human interaction. It is not possible for P123 to build a tool that is as powerful as Python without that tool also being at least as complicated as Python.

Kurtis,

Thank you.

I would be interested in what you actually saw being run as far as programs.

Boosting? A Random Forest?

I might also add that I will bet $1,000 here and now they were not running what I run. And for $200,000 you can have what I have been running.

BTW. I am not as married to this as one might think. For the longest time I kept things a big secret. Then it became obvious a lot of people had access to machine learning tools with stock data—as you illustrate here. The secret was already out.

But it has never been that important to me that everyone here at P123 have access to this. It would allow me to do some things with more data, but I care a little less today about whether other people at P123 have access than I did even yesterday.

That having been said I have re-read some of my posts and I get that I can give a really hard sell. I truly get it.

Anyway, would be interested in what they were running if you know.

Thanks again!

-Jim

Jim,

Guaranteed that you would win $1,000 and $200,000 in your bet. My point is more along the lines of the average user not being able to benefit from these tools. WQ tried to train quants and made them prove their skills in a contest before being hired. They gave them free tools. And it failed. That’s the group I could see P123 enticing with a few extra tools.

Guys who really know how to handle the tools like you might be more inclined towards a different platform whether that be Quantopian or otherwise. I am just concerned that P123 with some additional quant add-ons might not bring in the dollars they need to survive.

Andrew,

You do not post much so I just want to say I have always been impressed with your math skills. I do not even know your training and it does not matter. You either had a lot of training and/or you paid attention in class.

You have helped me with a couple of things I think are important. I probably do not remember this accurately but I remember you saying I worry about slippage too much. You were clearly correct about that.

I also think you first exposed me to i.i.d. I probably still do not get all my assumptions correct when I do some of the things I do. But perhaps I do better.

Anyway, Thanks

-Jim

Kurtis,

Marco turned me onto Quantopian. I keep going back and looking but I do not seem to be able to get what I want. It could be my programming skills but I do not think so.

They do have some cool stuff there like pairs trading.

I need that m x n matrix that has the ticker and date as the (hierarchical) row index and the P123 factors or functions as the columns. The label is the next week’s returns. I suspect this is created by P123 when running a sim.

With that you can run a multiple regression, a kernel regression, a support vector machine, Random Forest, Boosting…. literally dozens of algorithms.

You can cross-validate or do a walk-forward validation to reduce the overfitting.
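
To make the walk-forward idea concrete, here is a minimal sketch using only NumPy: for each period, fit a multiple regression on all earlier periods and predict that period, so no prediction ever sees future data. The function name, the 52-period minimum training window and the synthetic data are assumptions for illustration; this is not what anyone in this thread actually runs, and any of the other algorithms mentioned could stand in for the regression step.

```python
import numpy as np


def walk_forward_ols(X, y, periods, min_train_periods=52):
    """Fit OLS on all periods before t, then predict period t, for each t in turn."""
    preds = np.full(len(y), np.nan)
    unique_periods = np.unique(periods)  # sorted ascending
    for t in unique_periods[min_train_periods:]:
        train, test = periods < t, periods == t
        # Design matrix with an intercept column.
        A = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        preds[test] = coef[0] + X[test] @ coef[1:]
    return preds


# Synthetic panel, for illustration only: 104 weeks x 50 stocks, 3 factors.
rng = np.random.default_rng(2)
n_periods, n_stocks = 104, 50
periods = np.repeat(np.arange(n_periods), n_stocks)
X = rng.normal(size=(n_periods * n_stocks, 3))
y = X @ np.array([0.02, 0.01, 0.0]) + rng.normal(scale=0.05, size=len(X))

preds = walk_forward_ols(X, y, periods)
oos = ~np.isnan(preds)
print("out-of-sample correlation:", np.corrcoef(preds[oos], y[oos])[0, 1])
```

Only the out-of-sample predictions are scored, which is what makes this a check on overfitting rather than another in-sample fit.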

Not all of these will beat a P123 sim. But I can say a little deep reading of some of the advanced texts and well………

Anyway, I very much appreciate this. And as I said, this is more to share with you and perhaps Marc or Marco than to convince anyone.

Appreciated.

-Jim

Interesting thoughts coming in.

Generally, I’m sensing that my view of the hi-quant opportunity is reasonable; that whatever its virtues may be feature for feature, it’s not adding up as a viable commercial path for us.

So the family office thing is about marketing but also, it seems from Steve’s latest post on the topic, a matter of site presentation. Kurtis also made some interesting site-presentation points and some others I need to digest more. I’m very on board with all of Doug’s views and want to think more about some details he puts forth. All in all, much to chew on here.

Jim, I don’t necessarily grasp all the points folks have been addressing to you, but I hope you find their perspectives helpful.

Marc,

Thank you for looking at this. Their perspectives, and yours, are very helpful.

-Jim

I just read this thread. Sorry I’ve been doing coding again, and time flies when you are having fun.

Judging from the response (zero) to what we just released, “New Opinion & Watchlist functions!”, it’s safe to say that these are not the enhancements you, the active forum user, were hoping for. But this is in line with what we’ve been doing for over a year: working on the new (free) Invest section, linking with brokers to make the investing process easier, real time prices, planning an advisor biz, and adding tools for the “stock pickers”. We didn’t completely stop enhancing the Research tools, but all recent additions have been small. The last big effort, if I recall, was the variable position sizing. A great, complex tool that very few use.

Yes, we’ve taken a breather with the Research tools. I’m surprised nobody yet screamed: why the f&^%*! are you doing things nobody wants?? Obviously we think that once all the pieces come together many will want it. But let’s back up a little to what started this…

About a year ago S&P changed the rules, and wanted even more changes for the next renewal. I’d rather not get into specifics; it was BAD. I’ve made peace with it. Data providers are all similar, if not the same, and all have hungry sales people with hungry wives & babies. You just have to learn to play the game. It also doesn’t help when you are developing a product that may be viewed as cannibalizing some of their own; which is what would happen if we become more quantish.

With that in mind, and the fact that the quant clientele didn’t exactly grow exponentially, we decided to explore different directions that would not need expensive data, and would not piss off providers (with Factset we have a rev sharing agreement which has a better chance of a long term relationship).

That’s why we’ve taken a break from making P123 more quant-ish. And yes, some of us do not have the quant know-how, me included. I’m more of a stock picker/programmer, and my statistics is long in the past (also it doesn’t help that I’ve done really well stock picking biotechs using a simple screen and the fundamental chart). But we do have quant people that can help us like Marc & Riccardo, university professors, students, and WE DO want to get back to work on Research and making P123 more appealing to quants/programmers. That’s fun too. Some quant additions are low-hanging fruit and would please many. Other things I would like are APIs, automation, better long short, IC, parameters for custom factors, and some of Jrinne’s suggestions which I would love to discuss at some point.

So what are some low-hanging fruits? What are the quant additions that would generate good ROI (with ROI defined as quant guy happiness inversely related to development time)?

Thanks

Thanks a million for these suggestions. Allow me to go through them one by one.

  1. What do you mean by custom data series? I’m intrigued. I can call up what I think of as custom data series using the Custom Series tool. For example, if I want IBM’s operating margin as a series, I would create a custom series with two rules: UnivSubset("IBM") and UnivAvg("1=1","OpMgn%TTM"). I could then name that series and use it with the GetSeries command. The custom series feature is very powerful. But you probably mean something else; please do explain.
  2. We have a 500-character limit. Would a higher limit do the trick? Or is something else called for? I find it easier to manage data by embedding custom formulas rather than putting everything into one formula. In order to calculate intrinsic value, for example, I use about thirty different custom formulae, but each one can then be used for other things too.
  3 through 5. Much of this can be done, and has been done, by users using Python and Ruby, but I agree, in principle, that it would be good for P123 to make it easier, and I will discuss your suggestions soon.
  6. This is quite ambitious, but again, I’ll certainly bring it up for discussion. If, in the meantime, you could address my questions about 1 and 2 I’d appreciate it.

Thanks!

We are going to be starting a series of webinars to address these concerns, probably within the next few weeks. Stay tuned. If a user wants to conduct a webinar for other users, that might be worthwhile too.

Great points. There’s a very basic problem that’s at the root of this.

Portfolio123 offers users a chance to design and backtest strategies. Because of this capability, users are going to want to optimize those strategies, and the most active users are the ones who will spend the most time doing so. I speak from experience because I was one of those users myself, and I’m still trying to figure out whether or not to optimize, and what alternatives there are to doing so, and what exactly constitutes optimization. So the urge to optimize is inherent in what P123 offers. It takes a tremendous amount of discipline to fight that urge. For us to fight optimization would be like a videogame maker telling its players not to try to win. People are always going to treat P123 like a game, no matter what we do. It’s part of the fun of using this site: look, Ma, my backtest gets a 50% CAGR!

If P123 takes the stance of “optimization will harm your out-of-sample performance,” we’d be basically telling our users not to use our tools. And we can’t KNOW that it’s the case. I’ve been optimizing systems for four years and have maintained a 30% CAGR during that time. Maybe I would have done even better with less optimized systems. I don’t know.

I agree, even though I have been guilty of this myself. If people want to crow about their out-of-sample performance, as Andreas and I do quite often, that’s immodest but harmless. In a way, so is crowing about a 50%-per-annum CAGR in a simulation, except that it encourages overoptimization and data-mining. But the worst is to present a simulated model as if it were actually implemented. That is a no-no.

I have north of a thousand custom formulas because I have to embed them. Last I checked the character limit was 250. It would be helpful if the custom formula character limit was increased substantially.

MORE IMPORTANTLY, allow the formula input box to be expanded, so that one can see all that is being entered.

It’s 500 characters now. Thanks for the suggestion, and I’ll see what can be done.

Thank you Yuval

Marco,

Thank you.

P123 is a wonderful tool! It works well exactly as is. Machine learning does work better for certain situations. For example, the linear (and normalized) equations now used for the ranks are one reason model performance drops off (generally) with models that have more than 5 stocks (I would love to show you this some day). But it is a fact that P123 cannot do as well as other methods with more than 5 stocks when the data is not linear.

This is all to say P123 works well. And if there are any limitations one can work around most of them—at least to some extent. For example, run three 5 stock models instead of a 15 stock model.

Getting to the point. All of Python’s useful features could be unleashed if we could have access to the m x n matrix (also called an array or DataFrame) that you (must) create when you run a sim. The array would have a hierarchical ticker and date row index. The columns would be the returns over the rebalance period (e.g., next week) and all of the factors or functions in the sim. The data points would still be the rank.
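
For readers who have not seen one, here is a small sketch of the kind of DataFrame described above: a hierarchical (ticker, date) row index, one column per factor rank, and the next-period return as the label. The tickers, column names, prices and the build_panel helper are made up for illustration; how P123 actually stores sim data internally is not known here.

```python
import numpy as np
import pandas as pd


def build_panel(ranks, prices):
    """Join factor ranks with the next-period return used as the label.

    ranks:  DataFrame indexed by (ticker, date), one column per factor rank.
    prices: Series with the same (ticker, date) index.
    """
    prices = prices.sort_index()
    next_price = prices.groupby(level="ticker").shift(-1)
    panel = ranks.sort_index().copy()
    panel["next_return"] = next_price / prices - 1.0
    # The last date of each ticker has no next period, so it has no label.
    return panel.dropna(subset=["next_return"])


tickers = ["AAA", "BBB"]  # made-up tickers
dates = pd.date_range("2020-01-03", periods=3, freq="W-FRI")
index = pd.MultiIndex.from_product([tickers, dates], names=["ticker", "date"])
ranks = pd.DataFrame(
    {
        "value_rank": [90, 80, 85, 20, 30, 25],
        "growth_rank": [40, 55, 60, 70, 65, 75],
    },
    index=index,
)
prices = pd.Series([10.0, 11.0, 12.1, 20.0, 10.0, 5.0], index=index)

panel = build_panel(ranks, prices)
print(panel)
```

Each row is one stock on one rebalance date, so the rank columns can be passed to a learning algorithm as features and next_return as the target.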

Question: Isn’t this matrix sitting in memory or on the hard drive at some point during a sim run? If not can it be created easily?

If so, why not move this matrix over to a PC loaded with Python (all Open Source) and allow us to have the most advanced quant system that one could create? With no download to the user that a data provider could object to.

Outline of points:

  1. Python is free. The extensive Libraries are also free and Open Source.

  2. The Matrix already exists or may not be hard to create.

  3. It could be done on a laptop (for one individual). I defer to you as to how much you want to scale this up. But I would pay for a PC to get this started. And what I really mean is you could probably buy the equipment with reasonable user fees. Equipment that can do a lot with a little: one server and not that big of a server, I think.

Summary: The overhead may not be big.

How useful really?

First: not like Quantopian. Quantopian is extremely limited FOR WHAT WE DO HERE. P123 is better even without the Python. Perhaps we can take this as a given for now.

There is a huge body of people with machine learning skill out there.

MARC IS A MACHINE LEARNER. He really is. The econometrics he learned when getting his degree is a type of machine learning. It is also called multiple regression.

RedShield and pvdb are two P123 members who have already explicitly expressed interest in doing multiple regression. RedShield ran a hedge fund that used multiple regression as its quantitative method.

Maybe not Marc but everyone with a degree in Finance would want to try an econometrics model.

But there is an entire world of machine learners out there who use other methods. Tom Yani, another P123 user, has used BigML to run Random Forests, which is a common and relatively easy machine learning method. RedShield has run some Random Forests.

There are a few million people who run (and believe in) Random Forests. If they invest in stocks they will want to check it out. Where else could they do it? Nowhere else.

There is a huge body of machine learners who have gone beyond multiple regression and Random Forests. They may be in Silicon Valley, work at NASA, work in universities, work for Google or Facebook. If you count all of the technophiles in Europe and Asia then I believe you have a huge market. They will want to run Support Vector Machines, Kernel Regressions, LASSO Regression, Ridge Regression, LOESS, MARS, EARTH, Random Forests, non-linear regressions, polynomial regressions, splines, C4.5 ... even Deep Learning. There are a lot of people that do this. I see them on the internet.

All of the above algorithms can be run with Python and the libraries. I have done all of them (except deep learning) on a laptop (some with R). We do not need to understand them to make them available. Only Python and the data we already have would need to be managed.

I can show you that some of the methods above work well.

You probably already know that pandas in Python was originally developed at AQR Capital Management (by Wes McKinney) for their own use. De Prado from AQR Capital Management has a book about how this can be used by retail investors. It is a serious tool now available as Open Source.

Bottom line: one matrix (per customer), one PC (possibly scaled up) and you have a HUGE market. And it is state of the art quantitative investing.

Ease of use:

For me the hardest part of Python is the data wrangling. After that the programming is not hard. The scripts are extremely short. People could share the scripts. Maybe P123 could write a few scripts.

I had one Fortran Course in college. I did audit a DOS course after I got out of Medical School so I could use Windows. If I can do it anyone interested in quantitative investing can do it: even if it is a hobby—as is my situation.

So that is it. If that m x n matrix is sitting in memory after running a sim you could reap a big profit from it, I think.

If that m x n matrix already exists you should use it.

What is a pitch deck? Should I get one;-)

Thank you Marco. I would love to answer any questions. Maybe I could even show a few examples through email, over the internet or even in person.

-Jim