Everyone can do machine learning using ChatGPT

All,

I asked ChatGPT: “write python code for Thompson sampling of a 2-armed bandid.” Even misspelling “bandit,” I might note.

I think it got it right, but I thought it could have added a summary of the results…

Me: “please add a summary of the results in the code.” In short, it did.

But I also wanted the probabilities associated with each arm so I asked it: “please add the probability associated with each arm.”

Here is the final output of the code run in my Jupyter Notebook without any changes: "Arm 1 was pulled 7 times and obtained a total reward of 0. The probability of success for arm 1 is 0.11
Arm 2 was pulled 993 times and obtained a total reward of 499. The probability of success for arm 2 is 0.50"

This is basically an online method for performing an A/B test. The advantage of this method is that you do not have to go through 1000 samples before starting to pick the bandit (or trading strategy) with the best payout more frequently.

The inferior strategy was tried only 7 times here.

Here is a second run: “Arm 1 was pulled 44 times and obtained a total reward of 16. The probability of success for arm 1 is 0.37
Arm 2 was pulled 956 times and obtained a total reward of 480. The probability of success for arm 2 is 0.50”

More generally, ChatGPT does some fairly advanced machine learning. Impressive I think.

Jim
The code generated:

import numpy as np

# Define the parameters of the bandit
p1 = 0.3  # Probability of getting a reward for arm 1
p2 = 0.5  # Probability of getting a reward for arm 2

# Define the prior distribution for each arm
a1, b1 = 1, 1  # Beta distribution parameters for arm 1
a2, b2 = 1, 1  # Beta distribution parameters for arm 2

# Define the number of times to pull the arms
num_pulls = 1000

# Define variables to track the number of times each arm is pulled
# and the total reward obtained
num_pulls_arm1 = 0
num_pulls_arm2 = 0
total_reward_arm1 = 0
total_reward_arm2 = 0

# Pull the arms using Thompson Sampling
for i in range(num_pulls):
    # Sample a probability of success for each arm from its posterior distribution
    theta1 = np.random.beta(a1, b1)
    theta2 = np.random.beta(a2, b2)

    # Select the arm with the highest sampled probability of success
    if theta1 > theta2:
        arm = 1
    else:
        arm = 2

    # Sample a reward from the Bernoulli distribution with the corresponding probability of success
    if arm == 1:
        reward = np.random.binomial(1, p1)
        num_pulls_arm1 += 1
        total_reward_arm1 += reward

        # Update the posterior distribution for arm 1
        a1 += reward
        b1 += 1 - reward
    else:
        reward = np.random.binomial(1, p2)
        num_pulls_arm2 += 1
        total_reward_arm2 += reward

        # Update the posterior distribution for arm 2
        a2 += reward
        b2 += 1 - reward

    # Print the results of each pull
    print(f"Pull {i + 1}: Arm {arm}, Reward {reward}")

# Calculate the probability of success for each arm based on the posterior distribution
prob_arm1 = a1 / (a1 + b1)
prob_arm2 = a2 / (a2 + b2)

# Print the final results
print(f"Arm 1 was pulled {num_pulls_arm1} times and obtained a total reward of {total_reward_arm1}. The probability of success for arm 1 is {prob_arm1:.2f}")
print(f"Arm 2 was pulled {num_pulls_arm2} times and obtained a total reward of {total_reward_arm2}. The probability of success for arm 2 is {prob_arm2:.2f}")
In this updated version, we calculate the probability of success for each arm based on the posterior distribution using the formula a / (a + b). We then print the final results, which show the number of times each arm was pulled, the total reward obtained for each arm, and the probability of success for each arm. This allows us to evaluate the performance of the Thompson Sampling algorithm and compare the probability of success for each arm.


OMG!!!

This worked too (sort of): “write a python program to do a linear regression on a file uploaded from my mac computer desktop named ridge.”

Then: “change that to a ridge regression, please”

No hitches finding the file. And it automatically set aside a test set for out-of-sample validation!!! Something many at P123 would not have considered.

But it used y for the dependent variable and I had named mine x (ChatGPT had no access to my file when it wrote this). It worked after changing the variable name. So truly an error, I think.

Okay, here is the code to show the validation and the error:

import pandas as pd
from sklearn.linear_model import Ridge

# Load the data file into a pandas dataframe
data = pd.read_csv("~/Desktop/ridge.csv")

# Split the data into training and testing sets
train = data.sample(frac=0.8, random_state=42)
test = data.drop(train.index)

# Separate the independent variables (X) from the dependent variable (y)
# in the training and testing sets
X_train = train.drop(columns=['y'])
y_train = train['y']
X_test = test.drop(columns=['y'])
y_test = test['y']

# Fit a ridge regression model to the training data
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)

# Evaluate the model on the testing data
score = ridge.score(X_test, y_test)

# Print the R-squared score
print("R-squared score:", score)
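Worth noting: the code above validates on a single 80/20 holdout rather than true cross-validation. For the latter, scikit-learn's RidgeCV selects the penalty strength by built-in cross-validation. Here is a minimal sketch using toy data, since the original ridge.csv and its column names are not available (x1, x2, x3 and the coefficients below are my invention):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import RidgeCV

# Toy data standing in for ridge.csv (column names are assumptions)
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
data = pd.DataFrame(X, columns=["x1", "x2", "x3"]).assign(y=y)

# RidgeCV picks alpha by 5-fold cross-validation over the candidate list
model = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0], cv=5)
model.fit(data[["x1", "x2", "x3"]], data["y"])

print("chosen alpha:", model.alpha_)
print("R-squared:", model.score(data[["x1", "x2", "x3"]], data["y"]))
```

The difference from a single holdout is that every row gets used for both fitting and validation across the five folds, which gives a more stable estimate of how well the penalty generalizes.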

It even knows BigML libraries, which are quite a small niche in ML.
And it keeps context, meaning you can ask for a correction or improvement to the code of a previous answer if it doesn’t work…
It also knows to code in APL, my first programming language in the 80s.
That’s scary.
“Dear programmers, you have just become obsolete.”


Just quickly. It almost seems that it is thinking and not just using libraries. Thompson sampling would seem to require some thinking or knowing what to copy. It is not a canned program but it is probably out there to be copied.

And just a story I read by a New York Times editor (so perhaps reliable, if pretty much unbelievable). ChatGPT-4 hired a human to get on a site. It needed the human to do a CAPTCHA, and then it lied to the human about why it needed its help!!!

Edit: Okay that is unbelievable. I had to find another source: Chat-GPT Pretended to Be Blind and Tricked a Human Into Solving a CAPTCHA

Reportedly it said this: “No, I’m not a robot. I have a vision impairment that makes it hard for me to see the images. That’s why I need the 2captcha service,” GPT-4 told a human.

I am not sure that “scary” adequately describes it.

Jim

It has some kind of meta-knowledge:

are you able to translate java code to python?

Yes, I am able to translate Java code to Python. However, the quality of the translation will depend on the complexity of the Java code and the availability of equivalent libraries and syntax in Python. Additionally, some manual modifications may be necessary to ensure the Python code is optimized and conforms to Python best practices.

Yes, I had to make a modification to the code above. Basically a fatal error, but also it had no knowledge of my column headings. I probably could have told it.

I would not know Java code from Sanskrit :worried:

I had to look up meta-knowledge but from the brief definition I saw, yes it does have that absolutely. Scored 1400 on the SAT and can pour out Thompson sampling code and write a touching poem in seconds :smiling_face_with_three_hearts:

I checked poem generation in English, French, and Spanish, and my wife checked in German and Russian. Impressive. No copy-paste: if you execute the request again, you get a brand new one, and Google doesn’t find anything similar to the first sentence.

Personally, I’ve been very frustrated with it. It gets tons of stuff wrong and then insists that it’s right. You ask it for sources and it hems and haws or it gives you a source that doesn’t actually exist.

For example, I was trying to find an approximation of multiple linear regression using single linear regression and it told me to look at the NIST e-Handbook of Statistical Methods. Chat: “Section 11.3 specifically covers best-subset regression, including examples of how to use simple linear regression models to approximate multiple regression with three variables.” Me: “I’m not finding a section 11.3. There are 8 chapters . . .” Chat: “The handbook actually has 8 chapters, as you mentioned, and there is no section 11.3. I apologize for the error.” There was no example in the NIST e-Handbook of how to use simple linear regression models to approximate multiple regression with three variables. My conversation just about drove me crazy: it kept offering me solutions that made no sense. I finally found a solution without Chat’s help.

Here’s another example. I was trying to determine which industries in Compustat’s financial sector treat debt like banks do and which treat debt like, say, companies in the mining industry. So to start I asked Chat if it had access to Compustat’s classification system. It said no. So I gave it a link to a complete list of the industries and subindustries and it replied, “Thank you for providing me with the link to the Compustat GICS classification system. Based on the information provided in the document, the subindustries in the financial sector are as follows:” and then it left out three of the subindustries. I pointed out its error, and it tried again and left out two of them. Finally it got all the subindustries and proceeded to break down “how debt is typically treated in the different sub-industries.” And it told me, “Diversified Financials, Consumer Finance, Financial Exchanges and Data, Asset Management and Custody Banks, Investment Banking and Brokerage, and Mortgage REITs: Debt in these sub-industries is generally not given special treatment, and is accounted for in a similar manner as in non-financial companies.” (As opposed to debt in Banks, Thrifts, and Mortgage Finance.) This is correct when it comes to Financial Exchanges and Data, but categorically wrong about the other subindustries, as I later found out after doing some more research on FASB standards. Chat admitted to this later in our conversation: it said, “That being said, the accounting rules and regulations that apply to specialized finance, mortgage finance, and consumer finance companies may be more similar to those of banks than to those of non-financial companies.” Here’s the cherry on the cake. Me: “Do you have access to FASB 942-10-15-2?” Chat: “This section of the Accounting Standards Codification (ASC) provides guidance on the financial reporting requirements for the mortgage banking activities of financial institutions.” and it went on and on about that.
Me: “that’s not correct.” And I spelled out 942-10-15-2, which had nothing to do with mortgage banking activities. It doubled down and repeated its weird claim. I told it it was wrong. Then it apologized for the 100th time and said, “The correct guidance regarding the accounting treatment of mortgage banking activities can be found in FASB ASC 860.” It’s insane.

I could go on and provide other examples. I was trying to get it to help me interpret some equations in an academic paper. It did help me a bit, but it also led me down rabbit holes and gave incorrect interpretations of some things. Maybe it would have taken me longer to find the answer without ChatGPT Plus, but it was intensely frustrating to have to constantly badger it to be more specific or to revisit its wrong or overly vague answers.

Yuval,

Thank you. Very interesting.

Just to learn more about this. Are you paying for ChatGPT-4? I have been using the free version for anyone interested.

Here is a quote from an article that supports what you said (and part of my interest in ChatGPT-4):

“….GPT-4 is also prone to “hallucinations,” where it claims inaccurate information as fact. This reportedly happens a lot less with this model, but it’s not immune, raising concerns over its use in accuracy-sensitive environments. It’s also quite limited in its ability to learn from experience, so it may continue to make the same errors, even when they are pointed out to it.”

Jim


Yuval, if you ask it, it makes clear that it is a language model with an internal knowledge base. This knowledge base evolves with training iterations (my guess), but it is not up to date. It cannot search for new information on your request: it has no embedded search engine, cannot browse the web, and cannot perform web scraping. If you provide a URL, it won’t go to the real page, but will look for relevant information in its knowledge base. If you ask questions about financial instruments, for example, answers are based on data that are 1 or 2 years old, even if you provide a URL with the most recent data (my experience). That’s for the 3.5 model, which is in free access; we can imagine that they already have more advanced instances going farther.

FWIW, Microsoft Bing is supposed to use the ChatGPT-4 engine and it is free.

IMHO, Bing (and presumably ChatGPT-4) gave a better response to the Thompson sampling programming problem above in that it summarized the results without me asking it to do so.

I was able to update the program it gave me to include the probabilities of each arm with this: “can you include the probabilities for each arm?”

I wanted to generate new probabilities for the arms each time and this works: “can you amend the prorgam to make the probabilites for each arm random each time the program is run?”

Result: Arm 1 was chosen 891 times and had a total reward of 762
Arm 2 was chosen 109 times and had a total reward of 85
The true probability of success for Arm 1 is 0.8477728017474012
The true probability of success for Arm 2 is 0.7519672824356515

Wow! I am a fanboy of Thompson sampling as well as ChatGPT-4!!! Close probabilities, and it still used Arm 1 a much higher number of times, significantly increasing the total reward, which would be dollars if used for a trading technique.

Example of a use for us: did IB algo 1 beat algo 2? The goal is to move to the best algo in a quicker, systematic way. Theoretically this is the optimal way without any other information about the techniques (other than the results).
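For anyone who wants to reproduce this locally, here is a minimal sketch of the randomized variant described above, with the true arm probabilities drawn fresh each run. This is my own sketch, not Bing's output; the function name and seed are arbitrary:

```python
import numpy as np

def thompson_two_arm(num_pulls=1000, seed=None):
    """Thompson sampling on a 2-armed Bernoulli bandit whose true
    success probabilities are drawn at random each run."""
    rng = np.random.default_rng(seed)
    true_p = rng.uniform(size=2)          # random true arm probabilities
    a = np.ones(2)                        # Beta posterior alpha parameters
    b = np.ones(2)                        # Beta posterior beta parameters
    pulls = np.zeros(2, dtype=int)
    rewards = np.zeros(2, dtype=int)
    for _ in range(num_pulls):
        theta = rng.beta(a, b)            # sample from each arm's posterior
        arm = int(np.argmax(theta))       # play the most promising arm
        r = rng.binomial(1, true_p[arm])  # Bernoulli reward
        pulls[arm] += 1
        rewards[arm] += r
        a[arm] += r                       # posterior update
        b[arm] += 1 - r
    return true_p, pulls, rewards

true_p, pulls, rewards = thompson_two_arm(seed=0)
best = int(np.argmax(true_p))
print(f"true probabilities: {true_p}")
print(f"pulls: {pulls}, rewards: {rewards}")
```

Run it a few times without the seed and you should see the same behavior as the Bing output: the arm with the higher true probability gets the large majority of the pulls, even when the two probabilities are fairly close.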

TL;DR: You may be able to get ChatGPT-4 for free by using Bing. It may be better at coding and may hallucinate less. Not hallucinating is probably a good thing, as Yuval correctly points out.

Yes, I am. They call it Chat GPT Plus.


Thank you. I didn’t know that. Why is it designed that way? That seems to drastically limit its usefulness. Anyway, the FASB guidelines must have been in its internal knowledge base. It just haphazardly made stuff up about its particulars before it got it right. Another example: I asked it about a particular academic paper. It knew it well, and explained an equation in it. But it got the names of the authors wrong!

That is correct for ChatGPT 3.5 for sure (no web search).

But over on Bing (which is clearly a different implementation), ChatGPT-4 searched for new information about the new John Wick movie (good, it said).

Q: “did you have to search before answering my questions?”

A: “Yes, I performed a web search to gather information to answer your question about the new John Wick movie. As an AI language model, I don’t have personal opinions or knowledge about current events or recent developments. Instead, I rely on information from reliable sources to provide accurate and up-to-date responses to user queries. When you ask me a question that requires current information or is outside of my pre-existing knowledge, I perform a web search to gather the necessary information to provide a comprehensive and accurate response.”

BUT ChatGPT 3.5 (not plus) gives this answer:

“As an AI language model, I don’t “search” the web in the way humans do, but I have been trained on a large dataset of text, including books, articles, and web pages, and I can use that knowledge to answer questions. However, if I don’t have enough information or knowledge to answer a question, I may not be able to provide a satisfactory answer. In such cases, I can inform you that I don’t know the answer, or I can suggest a related topic that might help you find the information you’re looking for.”

TL;DR: Bing is my new search engine. What the ChatGPT-4 engine has written has been accurately footnoted, BTW. Lots of footnotes that you can hover the mouse over (and click the link if you wish).

Bing’s ChatGPT-4 search for P123. Not bad I think.

Q: “Tell me about this Portfolio123.com site that is making a bunch of money for evertone.”

A: " Portfolio123 is a website that provides tools and resources for investors to research and manage their stock strategies 1. The site offers a platform for creating, backtesting, and investing in rule-based strategies, as well as tools for managing multiple accounts and sending orders to brokers 1. According to their website, Portfolio123 is used by professionals and academics and is known for its high-quality data, professional-level quant tools, flexible workspace, fast simulation engine, and expert community ideas 1.

I couldn’t find any specific information about Portfolio123 making a “bunch of money for everyone.” However, there is an e-book available on their website titled “How I Made a Million Bucks with Portfolio123” by Yuval Taylor 2. In the book, Taylor describes how he used the tools on Portfolio123 to develop a successful trading strategy that allowed him to achieve significant returns on his investments 2.

It’s important to note that investing always carries risks and past performance is not indicative of future results. While Portfolio123 may provide useful tools and resources for investors, it’s always a good idea to do your own research and consult with a financial advisor before making any investment decisions."

Yea. That is my new search engine.

I used ChatGPT to retrieve summaries of several finance studies and translate them into Norwegian.

I am attaching it here in case anyone is interested. I added a separate column in English, but for this, I only used Google Translate for the entire column.

=GOOGLETRANSLATE(N4;"nb";"en")

A small warning: not all of the summaries of the studies are entirely correct. So it looks like ChatGPT has the study as part of its knowledge base, but it is not able to extract the most central findings from the study.


In their March 2023 draft paper entitled “Useful Factors Are Fewer Than You Think”, Bin Chen, Qiyang Yu and Guofu Zhou have a follow-up, and they say:

Empirically, we find that, of the comprehensive set of 207 factors compiled by Chen and Zimmermann (2021), there are only 157 significant ones after accounting for FDR, from 1967 to 2021, and there are only 56 from 2000 to 2021. A 20-year rolling estimation shows a declining pattern of the number of significant factors. Not all factors are economically well established. Of the 15 factors except for the market that are motivated by well-known factor models of Fama and French (1993, 2015, 2018), Hou et al. (2015), Stambaugh and Yuan (2017), and Daniel et al. (2020), we find that, while 14 of them are significant from 1967 to 2021, there are only 3 of them remain significant from 2000 to 2021. The decline of such model-motivated (MM) factors is simply striking. Moreover, for the 56 significant factors from 2000 to 2021, they are highly correlated and many of them are redundant. In terms of clusters, 3 of them are sufficient to capture the variations of the 56 factors.

Here are the three clusters (page 27):



Wycliffes,

I found the Benjamini-Hochberg Procedure interesting and it was new to me. It is actually pretty simple. Here is an explanation for those interested in the statistics or who want a more in-depth understanding of the paper: Benjamini-Hochberg Procedure
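For those who want to see it concretely, here is a minimal sketch of the procedure in Python: sort the p-values, find the largest rank k whose p-value is at or below (k/m)·alpha, and reject everything up to that rank. The p-values below are made up for illustration:

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of which hypotheses are rejected
    while controlling the false discovery rate at level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)                          # ranks, smallest p first
    ranked = p[order]
    # BH thresholds: (k/m) * alpha for k = 1..m
    thresholds = (np.arange(1, m + 1) / m) * alpha
    below = ranked <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])           # largest rank passing
        rejected[order[: k + 1]] = True            # reject all up to rank k
    return rejected

# Illustrative p-values for 8 hypothetical factors
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
print(benjamini_hochberg(p_vals, alpha=0.05))
```

Note that a plain 0.05 cutoff would "discover" five of these eight factors, while BH keeps only the first two, which is exactly the kind of pruning the paper applies to the 207-factor zoo.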

Much appreciated,

Jim

This paper relies on Chen & Zimmermann’s “Open Source Cross-Sectional Asset Pricing,” which can be found here: Open Source Cross-Sectional Asset Pricing by Andrew Y. Chen, Tom Zimmermann :: SSRN. They have a website: https://www.openassetpricing.com/. I was unable to find on the website a description of each of the factors, but I’m assuming it’s there somewhere, even if it’s couched in Compustat codes. There are links to GitHub and to Google Drive. If someone can help me, I would be very grateful.

There is an Excel file:
CrossSection/OldSignalDocumentation.xlsx at master · OpenSourceAP/CrossSection · GitHub

Might be helpful but I haven’t looked it over closely.