Turing Test and Possible Web Scraping and Sentiment Gauge

Azouz, SteveA and others,

So have you all heard of ChatGPT and OpenAI.com?

Elon Musk has commented, but his comments were a bit political and I will avoid quoting them. But he is aware.

CNN article here

Also, the Turing test asks whether an AI is as good as a human: whether a human chatting with it can be fooled. CAPTCHA stands for this: *Completely Automated Public Turing test to tell Computers and Humans Apart*.

So maybe I am gullible, but I could be fooled.

If you have not kept up on your CME (continuing medical education), for example, just ask the AI:

**Q: What is the differential diagnosis of anterior uveitis?**

A: The differential diagnosis of anterior uveitis includes infectious, autoimmune, and inflammatory causes such as herpes simplex virus, syphilis, toxoplasmosis, sarcoidosis, Behcet’s disease, and Fuchs’ heterochromic iridocyclitis.

Better answer than I could give. Too good, in fact. So it fails the Turing test. Conclusion: not a human. But still. Want a short story in the horror genre?

Topic: Breakfast

Two-Sentence Horror Story: He always stops crying when I pour the milk on his cereal. I just have to remember not to let him see his face on the carton.

TL;DR It is highly capable at language recognition. It has multiple modes, including ones that could be used to assess sentiment after web scraping. And maybe not that expensive.

Jim

BTW, this does have an API that can be called from Python, with various options on which engine to use for language tasks (at different costs).
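To make the "various engines at different costs" point concrete, here is a minimal sketch of what such a request might look like. The engine names and per-token prices below are assumptions based on the legacy OpenAI completion engines of that era (check the current docs); actually sending the request would need the `openai` package or an HTTP POST to the API endpoint plus an API key, which is omitted so the sketch stays self-contained.

```python
import json

# Assumed legacy engine names and rough prices per 1K tokens;
# these are illustrative, not current pricing.
ENGINE_COSTS_PER_1K_TOKENS = {
    "text-davinci-003": 0.02,    # most capable, most expensive
    "text-curie-001": 0.002,
    "text-ada-001": 0.0004,      # cheapest, least capable
}

def build_completion_request(prompt, engine="text-davinci-003", max_tokens=64):
    """Build the JSON payload for a completion request.

    Sending it (via the `openai` package or a raw HTTP POST) is
    deliberately left out; this only shows the payload shape.
    """
    assert engine in ENGINE_COSTS_PER_1K_TOKENS, "unknown engine"
    return json.dumps({
        "model": engine,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # deterministic output for factual queries
    })

payload = build_completion_request(
    "What is the differential diagnosis of anterior uveitis?"
)
print(payload)
```

The engine choice is the main cost lever: the same prompt costs roughly 50x more on the largest engine than on the smallest, so cheap engines make sense for bulk work like scoring thousands of scraped comments.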

News stories about it everywhere. Usually along the lines of the end of homework, school papers etc (the students will just have the AI do it). I just used it to help compose an email and then copied the results into the email.

No doubt it could be used to determine sentiment from any source, including web-scraped data.

I have tried it out a bit, for example to see if it can write the python code to open a website like www.portfolio123.com, but it doesn’t seem to be able to do that.

I am learning a lot about how AI solves coding problems, but to my question, are there other similar easy solutions where you write what you want and then an AI creates the code for you?

I have tested the code from OpenAI in both Colab and jupyter.org.

Whycliffes,

Cool that you know so much Python!

I may not understand everything in your post fully. I think you are asking how to web scrape a site with P123 being an example of a site you might scrape for sentiment.

This is a topic I do not know much about. I can tell you that P123 data can be scraped. When I first heard about it I thought it was probably a big secret, but Marco seems to be aware of it and even facilitated it, I think. At least several members have talked about it in the forum. I have also talked about it in emails with people. Has the API made some of that unnecessary? Probably, I would think.

But I have never done any web scraping. Others could discuss it more. If I were to start doing it, I would probably look at this book (which I have not read yet): Web Scraping with Python: Collecting More Data from the Modern Web
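For anyone curious what the scraping step even looks like, here is a minimal sketch using only the standard library. The HTML string is made up; in real use the page would come from `urllib.request` or the `requests` package (and the site's terms of service should allow scraping). The extracted paragraphs are the raw text you would later feed to a sentiment model.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect the visible text inside <p> tags."""
    def __init__(self):
        super().__init__()
        self._in_p = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p and data.strip():
            self.paragraphs.append(data.strip())

# Made-up HTML standing in for a fetched page.
html = ("<html><body>"
        "<p>P123 ranking worked well this year.</p>"
        "<p>Fees went up.</p>"
        "</body></html>")

parser = TextExtractor()
parser.feed(html)
print(parser.paragraphs)
```

Dedicated libraries like BeautifulSoup (covered in the book above) do this with far less ceremony, but the idea is the same: fetch, parse, extract text.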

Second, I do think OpenAI would work PERFECTLY for someone looking to get data on sentiment from any text source. They discuss looking at sentiment on their site as well as training on new data. The language skills are simply amazing based on my playing with the site. Like I said: close to human in its abilities. The only concern would be the cost of the service.

I am not sure that it would be something I could undertake, but Azouz, for example, is a highly sophisticated programmer, and I think he has some professional resources that he could put into a project like that. SteveA is a professional with great programming skills. There are others I have in mind as possibly finding this useful, e.g., Dnevin123 and Quantonomics. They seem to have some resources or larger investments and some skills. Georg and ETFOptimize (Chris) could consider it for their ETFs. Even reading Powell texts and executing trades, I suppose, though that would be very advanced due to speed issues. Maybe sorting through multiple company quarterly projections from CEOs before the open.

P123 could look at this at some point. But I would guess they would not want to make it a priority now. And I would definitely trust their judgement on that.

It is probably too much for me at the end of the day. I am old-school math and statistics trained: before much programming. I am particularly slow at munging data, which I imagine would be a big part of what you would have to do after scraping the web.

Uploading a csv file and running, say, a random forest on the data does not really take any programming skill. I am not a good programmer.
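To show how little programming that really takes, here is a minimal sketch using scikit-learn. The tiny inline CSV, its column names, and its values are all made up for illustration; in practice you would point `csv.DictReader` (or pandas) at your uploaded file.

```python
import csv
import io
from sklearn.ensemble import RandomForestClassifier

# Made-up stand-in for an uploaded CSV: two features and a label.
CSV_DATA = """f1,f2,label
0.1,0.2,0
0.2,0.1,0
0.9,0.8,1
0.8,0.9,1
"""

rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
X = [[float(r["f1"]), float(r["f2"])] for r in rows]
y = [int(r["label"]) for r in rows]

# Fit a random forest and classify two new points.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict([[0.15, 0.15], [0.85, 0.85]]))
```

That is the whole workflow: read the file, split features from labels, call `fit`, call `predict`. The hard part, as noted above, is munging the data into that clean shape in the first place.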

I do not know if that helps but so cool that you are such a good programmer!!! And thank you for your interest.

I’m sorry, but I don’t think I expressed myself well here. I am not a good coder, and I am not looking for a way to test “scraping”. It was more to show how chat.openai works in relation to writing Python code.

My point was that this is a very fascinating AI, and for someone like me who can’t code in Python, it was amazing to see how chat.openai can solve complex programming with simple language commands from the user. :slight_smile:

Thank you, Whycliffes. I find it truly amazing myself. It costs just pennies for everyday stuff (and I am on my free trial), so I keep it open in my browser so I can sound knowledgeable on the phone, for example. You never know when someone might ask you something like this, and it could seem like I remembered it on the phone:

Q: What are the layers of the retina?

A: The layers of the retina are the ganglion cell layer, inner nuclear layer, outer plexiform layer, inner plexiform layer, outer nuclear layer, photoreceptor layer, and retinal pigment epithelium.

I agree. Truly amazing. And we have “not being good at Python” in common :wink:

BTW, for serious professional use, it can learn from new data. And I used it on this last sentence: it seemed to think the commas were correct, but it did not like my use of BTW, replacing it with “By the way.”
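On the "learn from new data" point: at the time, OpenAI fine-tuning expected a JSONL file with one `{"prompt": ..., "completion": ...}` object per line (the newer chat-model format uses `"messages"` instead, so check current docs). Here is a sketch of building such a file; the example sentiment pairs are entirely made up.

```python
import json

# Made-up training examples pairing text with a sentiment label,
# in the legacy prompt/completion fine-tune format.
examples = [
    {"prompt": "P123 ranking systems beat my old screens.\nSentiment:",
     "completion": " positive"},
    {"prompt": "The backtester was down all morning.\nSentiment:",
     "completion": " negative"},
]

# One JSON object per line is the JSONL convention.
jsonl = "\n".join(json.dumps(e) for e in examples)
print(jsonl)
```

The resulting file would then be uploaded through the fine-tuning API; the point here is just that preparing the training data is ordinary text wrangling, not deep ML work.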

Not that it could ever replace Yuval’s old job as an editor. My job has already been replaced, however. All medical decisions are made by an AI at the big insurance companies. Well, almost all, but it is getting very difficult to get a human who will even listen to my opinions on the phone (and even then there is a 50-50 chance I will be “accidentally” disconnected after listening to Muzak for 2 hours). Of course, the patient always has the option of paying out of pocket for things that the insurance company will not authorize. Like the $3.5 million-per-dose (marked-up) drug that just got FDA approval (for hemophilia). $3.5-Million Hemophilia Gene Therapy Is World’s Most Expensive Drug

It is not like Skynet in Terminator. It is worse already!!! So, seriously, AIs are making life-and-death decisions. No one can afford $3.5 million for a life-saving drug (without a mostly automated favorable decision from the insurance company).

Serious stuff no matter what your perspective on it is. And might as well have an AI on your side, where possible. Including when you invest. As well as P123: whatever category you place its data and tools in.

Jim

Jim - I would be careful with that. AI has a long way to go before it can create anything useful. I just went through an exercise at work of evaluating an AI-based electronics design tool that would supposedly take an engineer from a block diagram to a finished schematic with the push of a button. Needless to say, it didn’t work; there was a team of programmers in India hard-coding my designs in the background, or at least the programmers were trying to translate my designs into schematics.

This is much different than AI-based trading though. I still look forward to P123 providing tools in that area.

Yes, you’re absolutely right. I’m not good at Python, but tried to get the code on a simple task, creating a folder and a word document in google drive, with some text in it. This is the answer I got:

It didn’t work. After many rounds of new recommendations, it still doesn’t work.
I’m going to test the chat out more, and I’m learning a lot from seeing how it tries to solve the problem, but so far it has only been able to solve very simple tasks, not the document on Google Drive.

I’ve been testing it a bit for programming. It’s super convenient, but in my experience not completely reliable, depending on the language and the problem. Unfortunately it is very “confident”, and gives the impression of being able to solve anything. I’m finding that it’s important to be critical of any code it produces, and only use suggestions that make sense to you.

@sama is the CEO of OpenAI, worth a follow if you’re on twitter:

It will be very interesting (and a bit scary) to see where this technology goes from here!

First, I am confused. Is anyone saying I have a new idea? A new idea that we are going to decide, in the forum, whether it might be commercially viable or not? Wow. What I am saying is that the technology (which is already being used) has improved to the point that some of you who are particularly good programmers could consider using it too (in your own creative way). Having said that, I am not saying it is for anyone in particular (and certainly not for everyone) reading this, and it is not for me (probably). Image suggesting this might already be in use:

Also, you must know that Google does some of this. Some of it published publicly and other data that you have to pay for: Analyzing Sentiment. I am assuming that, at times, the data is useful to those paying for it.

Regarding the quote, I literally do not understand the idea that the AI is “too confident” with regard to this discussion. He was discussing the factual question/answer setting, where generally you want one answer to a question like who was the 35th president of the United States. And you are not going to spend a lot of time agonizing over whether a 99.999% chance that it was John F. Kennedy is good enough for your homework assignment. But you can get a probability for any output, including that one, if you want. With regard to sentiment, the confidence settings for the question/answer portion of ChatGPT are not too interesting. Or pertinent, I think.

Here are some statements (totally made up as an example) about P123 and what the AI thinks the sentiment might be for each quote, ALONG WITH THE PROBABILITY. On the second statement it thought there was about a 57% chance it was positive. Image to follow (the probability shown is for the second quote):

It was pretty confident about the first (I can see why) but still only 99.98% confident that the sentiment for the first quote is positive.

On quote 3 it is pretty sure that this is a neutral comment (kind of interesting): 99.82% sure the comment is neutral. Not how a human generally thinks, for sure. We would want to put it in a positive or negative bin and then assess the probability.

You would have to figure out how you want to aggregate the sentiment for each…whatever. You may not want to look at tweets.

But you know you can get a probability (and an aggregated probability) and set the confidence level for any action you might take, right? That is not a programming challenge (even for me). As an example, you could average the sentiment and act when the average chance that the sentiment is positive (for all of the comments) is above 75%. Or act when 85% of the comments were assessed to be positive, etc.
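The two aggregation rules above can be sketched in a few lines. The comment labels and probabilities below are invented for illustration (they echo the made-up P123 quotes earlier); the crude assumption that a non-positive label contributes `1 - prob` to the chance of being positive is mine, not anything the model outputs.

```python
# Each comment: (label assigned by the model, probability of that label).
# Values are made up for illustration.
scored = [
    ("positive", 0.9998),
    ("positive", 0.5692),
    ("neutral", 0.9982),
    ("positive", 0.8100),
]

def p_positive(label, prob):
    """Chance a comment is positive, given the model's top label.
    Simplification: non-positive labels contribute (1 - prob)."""
    return prob if label == "positive" else 1.0 - prob

# Rule 1: act when the average chance of positive sentiment exceeds 75%.
avg = sum(p_positive(l, p) for l, p in scored) / len(scored)
act_avg = avg > 0.75

# Rule 2: act when at least 85% of comments were classified positive.
share_positive = sum(1 for l, _ in scored if l == "positive") / len(scored)
act_share = share_positive >= 0.85

print(round(avg, 4), share_positive, act_avg, act_share)
# → 0.5952 0.75 False False
```

Neither rule triggers on this sample, which is the point: the per-comment probabilities are just inputs, and you choose the threshold that matches your risk tolerance.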

**TL;DR.** Confidence is not a problem. This technology is being used already, and probably in ways that have not been discussed in this thread. The technology is more usable now (for a good programmer with resources) than it has ever been, with arguably a quantum leap in availability for the retail investor recently.

Jim

I wasn’t trying to comment on or critique AI/ML in general; you know a lot more about this topic than me. My only and very narrow point was about the present state of ChatGPT’s attempts at programming. ChatGPT will happily suggest code to solve a given problem, but occasionally the solution will be completely wrong. In these cases the code will superficially look good, and ChatGPT will not indicate in any way that it is unsure about how to solve the problem. In my experience ChatGPT often doubles down and defends its bad code against critique, happily making up plausible-sounding (but wrong) explanations for it.

(If you’re aware of a setting or prompt in ChatGPT that alleviates these issues, I’d be very happy to hear about it!)

So it is important not to blindly copy-and-paste a solution into your program, or to trust the results without understanding what the code does. Which, for me, has been very helpful for learning more about programming, since I’m forced to understand all of ChatGPT’s suggestions.

test_user

Is it possible we are talking about completely different things? And that if forced to discuss one topic at a time we might agree? Both you and Whycliffes keep talking about using ChatGPT to create code.

So on that topic I agree. Or I think I would agree, if I could make it create any code at all. I made a brief attempt to do that and could not. **So on that point I agree.** But then again, why would I use it for that?

Now Sam Altman has a short post that may be taken out of context, but these words appear in the post: “factual queries.” The default setting for factual queries is to give one and only one answer: the most probable answer. So he is right, that could give a false sense of confidence. For factual queries. Agreement again.

However, my original post and all my posts afterward were related to measuring sentiment. I gave a small example above (P123 sentiment). An example where “confidence” takes on an entirely different meaning. Is a 56.92% probability “very confident”?

Especially considering that OpenAI allows you to train the model on your own data, there is no evidence that it is particularly bad for sentiment analysis. One could research the Davinci engine if one were interested. Is there a better language engine that you are aware of?

Anyway, Whycliffes and test_user, I cannot get it to create any Python code either.

Jim

My bad. I scrolled the thread and saw a reference to code generated by ChatGPT, which I have some experience with. My answer ignored all the earlier posts, thereby missing a lot of context. Bad habit of mine.

I’ll ask ChatGPT to summarize the thread next time :wink:

test_user

It has a TL;DR feature. I was hoping to use it for some of my posts. I could not get it to work (too many tokens). I did really try it.

But it probably would have helped :wink:

Thank you!!!