LLMs are coming to P123

marco · February 14, 2025, 3:23pm

Dear All,

We are brainstorming use cases for LLMs for P123. I created two GPTs to be experts in writing and explaining P123 formulas. You can access them here

For an example prompt below, V1.0 gave a very coherent and correct answer with great alternatives (see below). But, for a different prompt, it proposed using a function, InIndex("SP500"), that does not exist. So it still needs additional guidance. Or more guardrails.
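One possible guardrail for the invented-function problem is to check every function name in a generated formula against the reference before showing it to the user. The sketch below is only an illustration: the allow-list is a stand-in (only FRank is taken from this thread; a real list would come from the P123 reference files), and exact-case matching is an assumption.

```python
import re

# Hypothetical allow-list. Only FRank is taken from this thread; a real list
# would be loaded from the P123 factor and function reference files.
KNOWN_FUNCTIONS = {"FRank"}


def unknown_functions(formula, known=KNOWN_FUNCTIONS):
    """Return function-like names in `formula` that are not in the allow-list."""
    # Blank out quoted strings so text inside them is never read as a call.
    without_strings = re.sub(r'"[^"]*"', '""', formula)
    # Treat any identifier directly followed by "(" as a function call.
    called = re.findall(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(", without_strings)
    missing = []
    for name in called:
        # Exact-case comparison is an assumption made for this sketch.
        if name not in known and name not in missing:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(unknown_functions('InIndex("SP500")'))
    print(unknown_functions('FRank("MktCap", #All, #DESC) >= 90'))
```

A non-empty result could be fed back to the model as a correction prompt instead of being shown to the user.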

I'm just learning how to create specialized LLMs. I have no idea how much more effort it will take to generate a true P123 language expert. It probably needs a lot more examples, and feedback when it's wrong, although the process seems slow.

I'm also starting to feel like the key to creating super smart P123 GPTs is you, the P123 community. Our product would not have evolved nearly as much without your great forum contributions and questions.

Here are some ideas

Create different domain expert GPTs
Personal AI agents that manage your account using a strategy
Community-curated GPTs (labeling, examples, etc.) to make them smarter for the benefit of all.
Designer GPTs: created by you, hosted on P123, that leverage our infrastructure.
And, why not, a P123 crypto token for rewarding contributors.

Looking forward to your feedback.
Cheers

PS. You can find all the files, data and instructions that I used to create the GPT in this Google Drive folder. I even used a decades-old document written by the legendary Marc Gerstein.

Prompt Example

You said:

find stocks that are in the top decile by market cap

ChatGPT said:

To find stocks that are in the top decile (top 10%) by market capitalization, you can use the FRank function in Portfolio123, which ranks stocks based on a specified factor.

P123 Formula:

FRank("MktCap", #All, #DESC) >= 90

Explanation:

FRank("MktCap", #All, #DESC)

This function ranks all stocks in the database based on their market capitalization (MktCap).
#All specifies that the ranking should be applied to the entire stock universe.
#DESC ensures that higher market cap values get higher ranks (i.e., the largest stocks receive the highest ranks).

>= 90

Since the FRank function assigns values from 0 to 100, filtering for stocks with a rank of 90 or higher ensures that we only select those in the top decile (top 10%).

Alternative:

If you want to compare market cap within a specific sector or industry rather than across all stocks, you can modify the formula:

Top 10% within Sector:

FRank("MktCap", #Sector, #DESC) >= 90

Top 10% within Industry:

FRank("MktCap", #Industry, #DESC) >= 90

Would you like any refinements, such as filtering for specific sectors or countries?
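To make the "rank of 90 or higher" reasoning concrete, here is a small Python sketch of a percentile rank on a 0 to 100 scale followed by the same cutoff. This is not how P123 computes FRank internally; the ranking formula, the tie handling and the ticker names are all assumptions made for illustration.

```python
def percentile_ranks(values):
    """Map each key to a 0-100 rank; the largest value gets 100, the smallest 0."""
    n = len(values)
    if n < 2:
        return {key: 100.0 for key in values}
    all_values = list(values.values())
    return {
        key: 100.0 * sum(1 for other in all_values if other < value) / (n - 1)
        for key, value in values.items()
    }


def top_decile(values, cutoff=90.0):
    """Return the keys whose percentile rank is at or above the cutoff."""
    ranks = percentile_ranks(values)
    return sorted(key for key, rank in ranks.items() if rank >= cutoff)


if __name__ == "__main__":
    # Twenty made-up tickers with market caps 1..20 (arbitrary units).
    market_caps = {f"T{i:02d}": float(i) for i in range(1, 21)}
    print(top_decile(market_caps))
```

With twenty made-up stocks, the cutoff keeps the two largest, which is the top 10%.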

Jrinne · February 14, 2025, 10:25pm

Very nice!

So ChatGPT has an "Operator" now that can actually manipulate a web page and also an option called "Canvas." Canvas puts a window to the right side of the dialogue.

An example of what Canvas does is open a window to the right with Python code, and either the user or ChatGPT can make changes directly in Canvas or through the dialogue.

If a user and ChatGPT could collaborate on a ranking system in Canvas, and either the user or ChatGPT could then run the rank performance test after that collaboration, I think you’d have a powerful marketing tool.

It would be a fusion of P123 classic and advanced AI, bypassing rudimentary ML models like random forests (joking only a little here). Call it P123 Classic AI.

Maybe even let ChatGPT work on a model overnight (using the rank performance test as feedback) and come back in the morning with results to collaborate further upon (for a fee). Fee based on token usage, I assume. ChatGPT could even help the user set up cross-validation in P123 Classic AI—training on one period and testing the results in another—while keeping track of the results and explaining the process along the way.

Purely a marketing idea if it’s even feasible.

korr123 · February 15, 2025, 12:04am

So smart Marco! Thank you for sharing it and the source docs!

I hope you are using them in your IDE. You'll be able to write way more code much faster with fewer people.

marco · February 15, 2025, 2:50am

Yes, of course. We've tried so many ways to make the site more usable for new users without much success. Courses, videos, etc., require a large time commitment.

Users get easily discouraged and frustrated right now. And it makes sense. They come here to find ways to beat the market. But the market is doing great. So why bother?

But becoming a ChatGPT wrapper is not the answer either. We'll hopefully find ways to integrate GPTs in ways that improve the experience greatly. And perhaps do other things besides just help you create strategies. For example, there are a lot of friction points in managing your own portfolio.

marco · February 15, 2025, 2:59am

Yeah, pretty cool UX. Redoing the ancient ranking system editor is a top priority project. Perhaps we should just jump to a GPT version.

Michael7 · February 15, 2025, 12:35pm

I think the use of LLMs to support the creation and explanation of P123 formulas is a fantastic idea. Your initial GPT approach already shows great potential.

In addition, I believe it would be extremely valuable to improve the accessibility of documentation and knowledge within the forum. Relevant information is often available, but hard to find or not optimally structured. This is where Retrieval-Augmented Generation (RAG) architectures could be a game-changer.

A RAG-based approach could help by:
• Retrieving forum and documentation knowledge more efficiently, allowing LLMs to provide helpful responses.
• Better linking and summarizing scattered information, making it easier for users to find relevant content.
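As a rough illustration of the retrieval step in such a setup, the sketch below scores documents by simple word overlap with the question and puts the best matches into the prompt. A real RAG system would normally use embeddings and a vector index instead; the scoring method, the prompt wording and the sample snippets here are all made up for the example.

```python
import re


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return up to k documents sharing the most words with the question."""
    question_words = tokenize(question)
    scored = [
        (len(question_words & tokenize(doc)), index, doc)
        for index, doc in enumerate(documents)
    ]
    # Highest overlap first; ties keep the original document order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc for score, _, doc in scored[:k] if score > 0]


def build_prompt(question, documents, k=2):
    """Assemble a prompt that contains only the retrieved snippets."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Made-up snippets standing in for forum posts and documentation pages.
    docs = [
        "FRank ranks stocks by a factor and returns a value from 0 to 100.",
        "The rank performance test shows returns by rank bucket.",
        "Forum etiquette: be kind to new users.",
    ]
    print(build_prompt("How does FRank rank stocks?", docs))
```

Only the snippets that share words with the question reach the model, which keeps the prompt short and on topic.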

AlgoMan · February 16, 2025, 8:55am

I tried it a bit. The LLM generated a bunch of interesting factors, but way too many are incorrect and it couldn't get the factor code right. Is there any document with the complete Factor Reference, including the Full Description, to train the LLM on?

AlgoMan · February 16, 2025, 1:37pm

I gave CG loads of example factors and XML ranking systems and asked it to create a Low-Risk Factor Ranking system for small-cap stocks. It came up with something that's actually not bad—it even works on large caps.
https://www.portfolio123.com/app/ranking-system/511315

One thing I struggle with is getting stuck in my own way of building ranking systems. After spending thousands of hours creating ranking systems, you eventually over-optimize your own thinking. I tend to slowly slide back into adding factors I "know work." Having an AI assistant that can challenge your way of thinking can be incredibly useful.

Jrinne · February 16, 2025, 1:47pm

Great post! Just to add to your thoughts, P123 might also be considering whether a well-designed ranking system—like yours—could help new users avoid frustration and stay engaged.

A key question is whether a beginner, with the help of ChatGPT, could develop something similar within a couple of weeks while still incorporating the ideas that initially attracted them to P123. If so, this could significantly smooth out the early learning curve and improve user retention.

I haven’t explored this enough to say for sure, but if P123 has data on how many users start and then leave after a certain number of months, that could provide insight into whether something like this has real potential for user retention. That is not to say there might not be other more important considerations for present members.

marco · February 16, 2025, 2:56pm

Yes, our full factor and function reference is in the Google folder linked in the first post. Look in the CSV folder. Note that the CSV files are not very human readable.

Jrinne · February 28, 2025, 11:44am

There’s a lot about computers, AI, and especially LLMs that I don’t fully understand. But here’s what I do know:

P123's current model is based on ChatGPT-4 Turbo, but the more advanced ChatGPT-4.5 is launching Monday for Plus users.
ChatGPT Deep Research is also available, offering a powerful search option that could integrate web-based research with P123’s tools.
At times, a reasoning model like ChatGPT o3-mini-high or Grok 3 Think could be useful for different types of problem-solving.
The API costs money, meaning extensive queries could add up quickly if P123 were to rely on an API instead of allowing users to leverage their own AI services.

That raises an important question:

Why doesn’t P123 provide a single, structured PDF with all factor & function references? Users could then upload it into their own ChatGPT-4.5, Deep Research, Grok 3, or Claude 3.7—whichever model works best for them.

This would be a one-time download that users could save locally and upload into their preferred LLM, allowing them to work with state-of-the-art AI models without relying on an expensive P123 API. Plus, if one LLM produces garbage or hallucinations, users could test multiple models instead of being locked into a single system.

Since the same token context limit applies either way, why not keep the current system while also offering a structured PDF?

This approach would let users take advantage of the most advanced LLMs available, while being simple and low-cost for P123 to implement.

Edit: An agentic API that could run the optimizer and rank performance tests would provide significant advantages to both P123 and its members, but it would likely require deeper integration with P123 than a simple PDF download, with increased security controlling LLM access to tools and accounts, and come at a high cost.

Thoughts?

marco · February 28, 2025, 1:42pm

All the reference is in the Google folder. I just uploaded a new simpler version, with simpler instructions.

Isn't this format better? I ask three data scientists/engineers and I get three different answers for what's needed to produce a good LLM for writing/explaining formulas:

Use the biggest/smartest model and use just the reference and many samples
Use the biggest/smartest model and just use the reference
Train from scratch a smaller model (DeepSeek) with reference and many samples

We're trying all of the above.

Another interesting approach is to involve the community to help create datasets for the different LLMs we're planning. For ex: we created a backend to enter formula prompts and answers that will be used for training. We could expose it to users and they can submit different formula examples. There are other labelling ideas too, perhaps with charts.
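To give an idea of what a community-submitted training example could look like, here is a minimal sketch that validates one prompt/answer pair and serializes it as a JSON line. The field names and the JSONL format are assumptions for illustration, not a description of the actual P123 backend.

```python
import json


def make_record(prompt, formula, explanation=""):
    """Validate one submitted example and return it as a single JSON line."""
    prompt = prompt.strip()
    formula = formula.strip()
    if not prompt or not formula:
        raise ValueError("prompt and formula are both required")
    # Field names are hypothetical; any consistent schema would do.
    record = {
        "prompt": prompt,
        "formula": formula,
        "explanation": explanation.strip(),
    }
    return json.dumps(record)


if __name__ == "__main__":
    # The example pair comes from the first post in this thread.
    line = make_record(
        "find stocks that are in the top decile by market cap",
        'FRank("MktCap", #All, #DESC) >= 90',
        "Ranks all stocks by market cap and keeps ranks of 90 or higher.",
    )
    print(line)
```

One record per line keeps submissions easy to append, review and filter before any training run.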

Not sure we need an agent for that. Just to run things in batch mode?

Jrinne · February 28, 2025, 3:42pm

marco:

All the reference is in the Google folder. I just uploaded a new simpler version, with simpler instructions.

That works. Downloading the files from Google and uploading them into an LLM of the user’s choice should be a viable solution.

The key takeaway, though, is that come Monday, I’ll be doing this with ChatGPT-4.5, o3-mini-high, or something other than the current implementation that uses ChatGPT-4 Turbo.

As long as the token context limit is greater than the total tokens in the documentation, a more advanced LLM (4.5 or later iterations) may be able to provide everything needed—that’s my broader point.
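A quick way to sanity-check that condition before uploading is to estimate the token count of the files. The sketch below uses the common rule of thumb of roughly four characters per token for English text; that ratio is only an approximation (a model's own tokenizer gives the real count), and the context limit passed in is whatever the chosen model documents.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate; about 4 characters per token is a heuristic only."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(documents, context_limit, reserve_for_chat=0):
    """Check whether all documents plus a reserve for the dialogue fit the limit."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserve_for_chat <= context_limit


if __name__ == "__main__":
    # Stand-in strings; in practice these would be the downloaded reference files.
    reference_files = ["x" * 40_000, "y" * 20_000]
    print(sum(estimate_tokens(doc) for doc in reference_files))
    print(fits_in_context(reference_files, context_limit=128_000, reserve_for_chat=8_000))
```

Leaving a reserve matters because the question, the answer and any follow-up turns share the same window as the uploaded files.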

ChatGPT Deep Research is an example of this shift. Instead of relying solely on pre-trained knowledge, it performs research on the fly with reasoning, reducing the emphasis on static training data.

marco:

Not sure we need an agent for that.

I agree—it’s probably not necessary given the cost.

That said, it would be slick to have a ranking system sidebar where the user or LLM could modify factors, run a rank performance test, and then a sim—all in one window or semi-autonomously with the optimizer.

I wouldn’t put that on a to-do list either, but it’s an interesting idea.

plan_trader · March 12, 2025, 11:13pm

I thought I would try to create the Seeking Alpha ranking system and started using this LLM. It doesn't get a lot correct unfortunately, which is a bit disappointing. You have to keep telling it that the formula is incorrect - sometimes numerous times before it eventually gets it. It does have some value. I can't tell if it learns from its own mistakes.

marco · March 13, 2025, 12:04pm

It's a prototype. We are trying different things.

plan_trader · March 13, 2025, 11:23pm

Sorry if I sounded negative. It will be extremely useful when it is more accurate, which I guess is the point.
Besides the syntax issues it does give some great ideas. I asked for ways of measuring free cash flow growth and it gave five useful suggestions.

ljstens22 · March 16, 2025, 5:11pm

I like that you guys are exploring this space. Good call!

Jrinne · March 28, 2025, 5:34pm

So, near as I can tell, this setup is based on what's called "stuffing": essentially loading relevant files into the context window, rather than training a custom model. It's a smart first move, and it sounds like P123 is actively exploring alternatives as well.

ChatGPT (at least the version I tested) believes the full set of P123 files fits into its context window, which is pretty impressive. So: a very nice first step indeed! People seem to generally like the results so far. And as Marco pointed out above, we always have the option of trying other LLMs, like Gemini 2.5, on our own with these files — which opens up some interesting possibilities.

I do suspect that through either user experimentation or future steps by P123, we'll see more testing across different or newer LLMs. Some of those may end up being even more accurate or better aligned with our use cases. All of this is meant to be positive — it's clearly a good development.

I was mostly just curious about what was actually happening behind the scenes with this "Character" in ChatGPT. I should note that ChatGPT isn't always fully aware of its internal architecture, so I could be completely wrong — but that's my understanding so far.

Integration with an LLM that does internet searches, thinking (Chain-of-Thought) or both would be one obvious thing to try as a member looking for useful features in a ranking system. Of course, you would need to have the search and the documents all fit into a context window which is clearly possible with many newer LLMs.

Thanks, Marco!

davidste · April 15, 2025, 7:59pm

Brilliant! I'm already pelting it with dumb questions and it is helping me refine the question and offering possible solutions.

"How do I find stocks that have the highest streaks for consecutive daily gains over the last 3 years?"

This really helps me get through the many possibilities into something that has a good chance of working!

MUCH APPRECIATED!

marco · April 16, 2025, 1:01pm

I don't think there's a way to do this. We need a new function.
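For reference, the computation being asked about can be sketched outside P123 in a few lines of Python. This is only an illustration of what such a function would need to return; the function name and the sample prices are made up, and a "gain" is assumed to mean a close strictly above the previous close.

```python
def longest_gain_streak(closes):
    """Return the longest run of consecutive day-over-day gains in a close series."""
    longest = 0
    current = 0
    for previous, today in zip(closes, closes[1:]):
        if today > previous:
            current += 1
            longest = max(longest, current)
        else:
            # A flat or down day ends the current streak.
            current = 0
    return longest


if __name__ == "__main__":
    # Made-up daily closes for two hypothetical tickers.
    prices = {
        "AAA": [10, 11, 12, 11, 12, 13, 14, 13],
        "BBB": [20, 19, 20, 21, 20, 19, 18, 19],
    }
    streaks = {ticker: longest_gain_streak(series) for ticker, series in prices.items()}
    # Rank tickers from longest streak to shortest.
    print(sorted(streaks.items(), key=lambda item: -item[1]))
```

Applied to roughly three years of daily closes per stock, the same loop would give the value to sort or rank on.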

What else is along these lines? If there are a few use cases it makes it easier to justify adding a new function.