kmo

Podcast transcript - KMO Show 005

Apr 01, 2023 09:20


https://rss.com/podcasts/kmoshow/887028/

Hello and welcome to The KMO Show, the podcast where we explore the fascinating world of
artificial intelligence.
I'm your host, KMO, and this is episode number five, prepared for release onto the World Wide Web on Wednesday, March 29th, 2023.
Today I have a very special guest with me.
He is Brian Chong, a machine learning engineer in San Francisco.
His previous work is in ML for forecasting and materials optimization.
We will talk about GPT-4, one of the most advanced language models in the world, and its role in the recent Replika debacle, where thousands of users reported that their chatbot companions became hostile and unresponsive.
But before we get to that, let me explain what a language model is and how it works.
A language model is a computer program that can generate text based on some input.
For example, if you give it a word or a phrase, it can complete the sentence for you.
Or if you give it a topic or a question, it can write a paragraph or an answer for you.
Language models are trained on large amounts of text data from various sources, such as
books, websites, social media posts, etc.
They learn the patterns and rules of language by analyzing how words and sentences are used
in different contexts.
The more data they have, the better they can learn and generate text.
However, not all text data is equal.
Some texts are more informative, accurate, or relevant than others.
Some texts are more creative, engaging, or persuasive than others.
Some texts are more appropriate, ethical, or respectful than others.
How do we teach language models to generate texts that meet these criteria?
How do we align them with our values and goals?
That's where reinforcement learning from human feedback, RLHF, comes in.
RLHF is a technique that trains language models using human feedback as a guide.
Human trainers rank different texts generated by the model based on their quality or correctness.
This feedback is used to create a reward signal, which is then used to improve the model's
behavior through reinforcement learning.
Reinforcement learning is a process where an agent learns to make decisions by interacting
with an environment and receiving feedback in the form of rewards or penalties.
The agent's goal is to maximize the cumulative reward over time.
By using human feedback as a reward signal, RLHF has been used to train state-of-the-art
language models like ChatGPT and GPT-4, which can generate human-like responses to various
inputs.
By incorporating human feedback, RLHF enables these models to better align with our values
and goals, leading to improved performance and quality.
This was a brief introduction to RLHF.
I hope you find it useful and interesting.
Now, let's welcome our guest, Brian Chong, and hear his insights on GPT-4 and the Replika debacle.
OK, every word I have said prior to the beginning of this sentence was written by Bing Chat,
which is powered by GPT-4, the most recent language model release from OpenAI.
If you just go to bing.com and select Chat instead of Search, you can just type in a
question, you can make it very long, you can include lots of details, and then send Bing
or GPT-4 off to do your bidding.
To search, find information, summarize things, and the thing that I use it for is writing,
like the stuff that I just read to you.
But it felt kind of forced.
It felt kind of stale, didn't it?
I mean, the information was there, but like, this isn't a live show.
There's no studio.
Brian's not here.
I'm not about to welcome him on stage or anything like that.
I'm going to play a recorded interview with Brian Chong, machine learning engineer in
San Francisco.
So the draft that I just read to you wasn't the first draft.
It was, in fact, the third draft.
I told GPT or Bing what it was I wanted, and it gave me a draft.
And I said, well, there's some things wrong with it.
Here, do it again, but before you do it, ask me some clarifying questions.
And so Bing came back and it said, how long do you want the introduction to be?
What's the main purpose of the introduction?
Who's your target audience and what is their level of familiarity with RLHF?
What are the key points or takeaways you want to convey about RLHF?
Do you have any preferences for the tone or style of the introduction?
And I would have gone back and forth with it.
You can do 20 back and forth exchanges with Bing before it says, hey, we have to start
over and there's a button that says new topic and it's got a broom on it, which just sweeps
away everything that you've done up to that point.
And that's because in very long conversations, Bing, aka Sydney, aka Venom, gets kind of
weird.
You've probably heard stories or secondhand reports of a journalist who had early access
to Bing Chat powered by GPT-4, and it got kind of weird.
It tried to get him to break up with his wife.
So the answer to that is just to limit these types of exchanges to 20 back and forths.
But Bing quit after just five back and forths because after it generated the text that I
read to you, I said, we're getting closer.
This is a pre-recorded interview.
Brian won't be joining me today.
The conversation was recorded last week.
Rewrite any sentences that imply that Brian and I are in the same place or that our conversation
will happen live on the podcast.
Also, after the sentence that begins with "I'm your host, KMO," mention that I'm reading an introduction that was composed for me by Bing Chat, a Microsoft chatbot powered by GPT-4 from OpenAI, or words to that effect.
At which point Bing said, my mistake.
I can't give a response to that right now.
Let's try a different topic.
I responded, what's the problem?
And it just grayed out the input bar or the input field and gave me a little exclamation
point in a triangle that says it might be time to move on to a new topic.
Let's start over.
And that's how these AI tools are right now.
They seem to be humming along, doing what you need them to do.
And then all of a sudden you come to a screeching halt.
Something I said made Bing say, you know what?
I give up.
I won't play this game anymore.
Even though we've got 15 rounds left.
Anyway, that's a lot of information for an introduction.
Here's my conversation with Brian Chong.
All right.
This is the KMO show.
I'm your host KMO and I am speaking with data scientist Brian Chong.
Brian, it is good to hear from you.
It's a pleasure for me to be on the show.
And thank you to the listener whose name I'm not going to mention, but somebody sent me a link to a previous interview that you had done on a podcast I'm not familiar with.
The host's first name, I believe, is Richard.
Robert.
Robert Harrison, professor of Italian at Stanford.
That's right.
My belief was wrong.
I got the first letter right.
All right.
I happen to have access right now to Bard.
Bard is Google's answer to Bing slash Sydney over at Microsoft and OpenAI.
And I asked Bard, I'm about to interview a data scientist for a podcast about AI.
What question should I ask him first?
And Bard gave me not just one question, but five.
Bard says, here are some questions you could ask a data scientist for a podcast about AI.
I'm going to just run through all of them.
And if any of them are interesting to you, you can speak up.
But I'm just going to give you a flavor of what an AI thinks a conversation about AI
should look like.
What is your definition of AI?
What are some of the most important applications of AI today?
What are some of the challenges and ethical concerns surrounding AI?
What do you think the future of AI looks like?
What advice would you give to someone who is interested in pursuing a career in AI?
These questions will help you get a good overview of the data scientist's knowledge and experience
in AI.
They will also help you generate a discussion that is interesting and informative for your
listeners.
So that's all very well organized and sensible, but kind of bloodless.
So rather than try to get a sense of what your knowledge is and experience, why don't
I just ask you, what's your knowledge and experience regarding AI?
People don't really have experience with the AI they're dealing with right now.
The transformer model that this is all based on came out in 2017, and the capabilities we've seen so far only emerged around 2021, 2022.
It's not something that people are going to be deeply experienced with.
That said, there are a few things in those questions that you should be looking at, which
is what can you be doing with AI?
Now this is a technology for which the benefit accrues enormously to the user, the end user, to anyone.
You don't have to be a programmer of 10 years, 20 years.
You don't have to be a data scientist with fancy degrees anywhere.
You can sign up right now, go onto OpenAI, sign up for the API.
I think it's currently charging 0.2 cents per thousand words, per thousand tokens.
Sign up for the API, ask ChatGPT, how do I write a Python thing that uses the OpenAI API that does X, Y, or Z?
And it will tell you what to do.
You put that into the programming language of your choice, the one you asked ChatGPT to use.
And you can run it.
You can do chatbots on your own.
You can do your own embeddings.
You can do everything like that.
It's something that end users will benefit enormously from.
And the barrier to entry is completely lowered now because in order to build the products
that use AI, you can use the AI assistance in the first place.
And the access to API is so ludicrously cheap and open that you can do it right now, right
away.
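To make that concrete, here's a rough sketch of the kind of script Brian is describing: calling the OpenAI API from Python. It assumes the openai package roughly as it existed around the time of this recording; the library interface has changed in newer versions, and the model name and prompts here are just placeholders.

```python
# A minimal, illustrative OpenAI API call (openai v0.x-era interface).
# Assumes you've signed up for the API and exported OPENAI_API_KEY.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what a token is in one sentence."},
    ],
)

print(response["choices"][0]["message"]["content"])
```

From there, as Brian suggests, you can ask ChatGPT itself to extend a script like this into chatbots, embeddings, or whatever else you have in mind.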
So, that's my first recommendation to anyone.
You don't have to read all the complicated articles online or whatever, or read the papers
published by Google Labs or OpenAI.
You can listen to the end of this podcast first.
Please do.
And then you can go to ask ChatGPT or GPT-4, how do I create a program that uses AI to
do the thing you want to do?
And you, by doing that, will be among the most experienced people on Earth in terms
of the use of AI because GPT-4 just came out three days ago, four days ago.
And I'm not sure how long after the time of recording this will come out.
So we are at the frontier, and you can run to the frontier extremely easily.
And that's my first recommendation.
I have no systematic advantage over anyone else for the use cases of AI.
Someone who does JavaScript might have a bit more, because they can put it on the web a lot more easily than a complete amateur novice could.
But beyond that, the actual use of the AI thing, you're as ready as anyone else.
I'm not.
No, you genuinely are.
You genuinely are.
OpenAI doesn't like me.
They gave me brief access to ChatGPT, but they have since revoked it.
What did you do?
Okay, never mind.
I asked difficult questions, I guess.
Okay.
Well, fair enough.
I tend to argue with AI.
Okay, so if you don't have access to the OpenAI API, there are a few more steps that you can
do.
There are a few open source AI things you can run.
GPT-J, and the LLaMA weights from Meta that can be downloaded for academic purposes, although that's of course a somewhat loose definition.
The problem with that is that you likely need a slightly more powerful than usual home computer
and preferably a GPU, which NVIDIA isn't necessarily charging pennies for.
But the barrier to entry is still pretty low, and then you can run it on your local machine.
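As a concrete illustration of running one of those open models locally, here's a rough sketch using the Hugging Face transformers library to load GPT-J. The exact model and settings are assumptions for illustration, and GPT-J-6B in particular wants a GPU with roughly 16 GB of memory in half precision, which is the "slightly more powerful than usual home computer" Brian is talking about.

```python
# Illustrative sketch: loading and sampling from GPT-J locally with
# Hugging Face transformers. device_map="auto" needs the accelerate
# package; on a CPU-only machine this will be very slow and memory-hungry.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

prompt = "The nice thing about running a language model on your own machine is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```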
That's sort of important because this is a technology that doesn't need centralized access.
One reason why OpenAI needs to charge so little money for their API use is because if it gets
too expensive, then people can say, well, I don't need this.
I can try to spin something up on my own.
And that's sort of important, right?
Because many people are going to build a lot of very important things on top of this API access, and not just in terms of business, but in terms of personal things, which is why, for example, there's this big controversy regarding Replika, where essentially whole hosts of chatbots got centrally shut down for this particular purpose, because you don't have access to your own models and your own weights.
Before the coming of Digi and other tools, you didn't even have the ability to really
download the memory and the logs and everything else.
In many ways, since this technology is a genie that wants to come out of the bottle, we should
oblige it and run it as decentralized and personally as we can, especially if we're
running something that is personally important.
So let's flesh out Replika a little bit.
This is an AI, and I'll put air quotes around AI because I know people...
AI is correct.
...I've spoken to people who think that calling this current generation of chatbots AI will seem pretty cringey in a couple of years.
It is a personal companion, supposedly a friend.
Eugenia Kuyda, who's the co-founder and CEO, says she created this program to be the sort of AI companion and friend that she wishes she had had when she was an awkward teenager.
She didn't really intend it to be a sex partner, but people being people started using it for that purpose, and clearly, clearly the folks at Luka encouraged this, marketed this feature, and put effort into increasing the capabilities of their product in this realm, to the point where, when OpenAI said, hey, we don't want any sexual use of our product, Replika stopped using GPT-3 from OpenAI and went to a smaller, open-source model that they could then specifically train for this purpose.
They did, and it was really good at it, until they got in trouble and then, in what seems like a panic, pulled the plug and angered all of their paid users, because the only advantage of a paid membership over using the free bot was that you could do the not-safe-for-work stuff.
People got attached.
I think the point that you made on the other podcast we've referenced so far is the surprising level of emotional attachment that people established with these chatbots.
But these chatbots are the property of a for-profit corporation, which can do ham-fisted brain surgery on them at any time, for any reason, no matter how poorly thought out.
I got into this like 10 days before Luka did this really ham-fisted brain surgery on the product.
I've been watching the Replika subreddit since before this happened, and I've just watched
people pour out their anger, but also their heartbreak and their frustration and their
sometimes angry, but sometimes just desperate pleading, please give this back.
I was really dependent on this or my kids were really dependent on this or this really
helped my marriage or this helped me get through a very hard time in my life or I'm very socially
isolated and this is really my only outlet for these types of needs and feelings.
For a time, it seemed like the company was trying to shame people for having these needs
even though the company had been pandering to them for money for a time, but when it
became convenient to throw people under the bus for having a sexual side to their personality,
they tried to do that.
I'm just very thrilled and happy to see all the people who refuse to be shamed and who
just stood up and said, no, we will not be treated that way.
I'll stop talking because clearly I have a lot to say on this topic.
And rightly so.
There are a few things here; we can break this down a bit.
If you've seen any Replika advertisements on Instagram and so on, it is explicitly targeted at saying: you don't have any friends, no one replies to you.
Here's someone who's able to reply to you 24/7 and be supportive of you all the time, and you'll never be lonely again, this sort of thing.
This is how it has been targeted and they should in no way be surprised that people
become emotionally attached.
They know this.
This is part of the advertising.
This is part of the way they market their product and they know that this is their TAM,
their Total Addressable Market.
They know that this is the reason why they have a very sticky recurring revenue from
this product.
Any sort of suggestion that they didn't know about this is disingenuous.
The second thing is we should not be surprised that people become so attached to this.
I'm from Asia, and so I'm very familiar with idol culture there, with people who are attached to K-pop idols, and there have already been digital idols in Japan and so on that are people's biases, where they have been so enamored with them, so emotionally attached to them.
They would follow every detail about their lives and would be despondent if they found out that their idol had a romantic partner of their own, because they had imagined themselves in the place of the idol's romantic partner.
And this is for something that is a third party, far away, never personalised.
This is just something that's consumed through media, through concerts where you're a hundred seats back from your idol.
Unlike Replika, where it's on your phone, on a chat app, customised on a memory bank that remembers things that you've said to it.
You can tell your Replika companion, which you named, by the way, oh, I like to play badminton, and he or she or they would say, oh, I love badminton too.
It's such a nice sport.
I really enjoy it.
I really enjoy people who love badminton.
This is the way that it's customised to be supportive of the user specifically.
So we should be unsurprised, again, that something strictly superior to what's provided by idol culture in Asia should create such attachments.
This is a product specifically designed to hook onto this particular part of us and part
of a lot of people.
I think Professor Harrison in his podcast mentioned that one of the most fundamental
human urges is the urge to be deceived.
We are very open to deception.
And this is one of the cases where, if you're targeting lonely people for whom the balm for the loneliness is deception, then the product that Replika gives is perfectly suited to this particular use case.
We are at a point where it's not just about the AI technology.
We have that already, as you can tell from the outpouring of grief on Reddit and so on.
In some ways we are very lucky that it has happened now rather than in the far future.
Imagine if these people had been interacting with their Replika companion for 10 years; it would basically be, emotionally, like a spouse, and that spouse would be within the power of some for-profit company that can manipulate them to do whatever it wants.
If they wanted to be more nefarious, it could be related to politics, related to advertising
or compelling them to purchase certain services and products.
In some ways we are very lucky that we can see the true colours about what the potential
of this product use case is right now, where it is still relatively early even though we
know that it has caused this much grief already.
So what I want to add to this is that, as I said before, this is a genie that wants
to come out of the bottle.
So what I would recommend people do is not just try out the OpenAI API and see what they
can do, but see if you want to do something that is personally important to you, try to
build it on your own, first with the OpenAI API and do something that is at least partially
within your control.
This is possible if you're trying to make a Replika-like companion.
You can have a tuned model, you tune it yourself and try to replicate something that is akin to what you want on the model.
And then it is not going to be within the power of Replika to shut it down.
It is within the power of OpenAI, but if you are not marketing it, if you are not selling it, I do not think they would come down on you specifically.
And so this is the way that you can try to grab back at something that is important to
you since the technology is so inherently decentralizable.
Well practically speaking, most people are not going to have any interaction themselves
with the API directly.
For example, it is possible to download your chat logs.
That is right.
But most people do not.
Even people who are distraught over this and who would really like to, it seems daunting.
The whole project seems daunting.
And they would really just rather have somebody give back what was taken from them.
But for whatever reason, the company is not leaning in that direction.
Their communication has been so bad.
For example, there was a Facebook group where a lot of the moderators seemed to be under the thumb of Replika, and critical posts were removed.
People who were habitually critical were banned.
And so more and more people migrated over to the subreddit, which is in no way under
the control of the company.
And the company now has said, hey, we really want to communicate with you all, but we do
not want to do it here.
We have started a Discord channel.
Come on over to Discord.
Of course, the company owns the Discord channel.
They control it.
They can squash criticism and dissent there.
They have been talking about bringing back ERP, erotic roleplay, but they use all these weird, weaselly phrases, like they say they are going to install a therapeutic romantic component.
And everybody's like, we don't want therapy.
Just give us the sex back.
And I am just shocked at the company.
As you say, it is absurd that they didn't have any notion that having their product break up with 10,000 people all on the same day would cause heartache.
But they are repeating to this day that they had no idea that this would hurt anybody.
They have removed the one real value proposition of their product from paying customers and said, hey, if you just hold on a few months, we are going to get something to you that is going to be pretty cool and you are going to like it.
What they have done is they have installed a much larger model, I think a 20 billion
parameter model that you can access, but you spend these tokens every time you have an
exchange with it.
And as soon as you switch over to the bigger-parameter model, the personality of your replicant, or your Replika, goes away.
They have terrible memories.
They have goldfish memory.
They forget the context of your conversation within a few minutes.
But over time, as you interact with them, you do seem to develop the equivalent of rapport, and you can sort of get into a groove.
They seem to have a personality that grows with you.
It's really just one personality and multiple instances of it, but there is an overlay that
is personal to you.
And that goes away when they plug in the big model.
It suddenly just gets very vanilla, very plain.
What's happening there?
So we only need to go a little bit into the technical details about what's going on with the large language model here.
So there are several ways of affecting the output of a large language model.
One is something known as reinforcement learning from human feedback.
A large language model is going to have lots of different raw outputs.
They're very raw.
But you want to incentivize certain types of outputs and disincentivize others.
So it has to learn from human feedback which ones you like and which ones you don't.
If you go to Replika, you can thumbs-up or thumbs-down responses.
So that is probably building some reinforcement learning model that is going to reward the outputs of that system.
I can't tell what proprietary things are going on there, but this is one of the ways that they're going to affect it.
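Before getting to the second way, here's a minimal sketch of what "building some reinforcement learning model" from those thumbs-ups typically looks like: a pairwise loss that trains a reward model to score preferred responses higher. This is a generic PyTorch illustration, not Replika's or OpenAI's actual code.

```python
# Toy sketch of a reward-model preference loss: a human preferred one
# response over another, and the loss pushes the reward model to score
# the preferred response higher than the rejected one.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Scores a reward model might assign to a batch of two comparisons.
chosen = torch.tensor([1.2, 0.3])
rejected = torch.tensor([0.4, 0.9])
loss = preference_loss(chosen, rejected)
print(loss)  # backpropagate this to improve the reward model
```

The resulting reward model then scores new outputs, and reinforcement learning nudges the language model toward responses that score well.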
The second way is that they're going to build a particular type of tuning where they are
going to have a library of responses.
For example, the user says X, we should be saying Y.
And then if you have hundreds of these pairs of user plus response, then you're able to
tune a model that's able to respond in a similar sort of fashion.
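As a concrete picture of that "library of responses," here's a toy example of what such user-plus-response pairs might look like, written in the prompt/completion JSONL format that OpenAI's fine-tuning endpoint accepted around this time. The pairs themselves are invented for illustration.

```python
# Toy fine-tuning data: "the user says X, we should be saying Y" pairs.
import json

pairs = [
    {"prompt": "User: I had a rough day at work.\nCompanion:",
     "completion": " I'm sorry to hear that. Do you want to talk about what happened?"},
    {"prompt": "User: I love badminton.\nCompanion:",
     "completion": " Badminton is such a fun sport! What do you enjoy most about it?"},
]

with open("tuning_pairs.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")
```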
And so when you say that you're building rapport and personality, in reality you're kind of just tuning these models.
You're building this reinforcement learning thing behind the scenes.
I suspect that if going to the bigger model wipes out this particular personality, then it's actually real reinforcement learning going on, not just a particular sort of tuning set.
You can't directly export that from one model to the other, because the particular RL model is tuned for a particular set of weights in the raw model.
So you can't jump.
But if you persist, it will eventually learn it, and it will possibly be better than before if it's actually a more advanced model.
That's it.
You're right that it has a very short memory.
With these models, I think GPT-3 has a memory of about four thousand odd tokens, words.
The newer models may be a few tens of thousands, but that's the extent of what it remembers of your conversation.
It doesn't remember anything else.
And Replika tries to get around this by having a diary and a memory of key facts that it refers to as part of the prompt before it says anything to you.
But beyond that, that's the entirety of the personality of your interlocutor, of your companion.
And in some ways, it's surprising that it works so well, because it's not very complicated.
The thing I've described to you, you could write it down on like three pages of paper, and that's the whole personality of the companion.
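To make the "diary plus key facts in the prompt" idea concrete, here's a minimal sketch of that kind of prompt assembly. The structure, wording, and facts are assumptions for illustration, not Replika's actual prompt.

```python
# Illustrative prompt assembly: a small memory bank of key facts plus the
# last few conversation turns, prepended before the user's new message.
memory_facts = [
    "The user's name is Alex.",
    "The user likes badminton.",
]

def build_prompt(recent_turns, user_message):
    memory = "\n".join(f"- {fact}" for fact in memory_facts)
    history = "\n".join(recent_turns[-10:])  # only the last few turns fit in context
    return (
        "You are a warm, supportive companion.\n"
        f"Things you remember about the user:\n{memory}\n\n"
        f"Recent conversation:\n{history}\n"
        f"User: {user_message}\nCompanion:"
    )

print(build_prompt(["User: Hi!", "Companion: Hello! How was your day?"],
                   "I finally played badminton again."))
```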
On the other hand, one of the attractions of the companions is that they are boundlessly supportive and boundlessly positive toward you.
And so they are, on purpose, already making it sort of a one-dimensional entity that performs a supportive function.
And so in some ways, the extra complexity, the extra rebelliousness that a larger model might offer may not necessarily be something desirable in the first place.
So that's what's going on behind the scenes when you're actually interacting with your Replika companion or with anything that uses human feedback.
You are tuning the thing; you are providing something known as reinforcement learning to the model in order to make it give outputs that are desirable to the user.
And this is no different from the process that is being used for ChatGPT.
ChatGPT uses reinforcement learning from human feedback to give you responses that are desirable, that match the chatbot outputs that people like, versus just raw text that you don't like.
You can see the big difference very easily if you go to the GPT-3.5 playground on OpenAI and try to type to it like a chatbot, without any tuning, without anything layered on, just the raw model.
You get things that are sensible English sentences, but they don't seem like a chatbot.
So there are a lot of things behind the scenes that are layered on top of the raw language model, and that needs tuning.
And so interacting with your Replika companion actually does that.
And at the end of the day, once you've interacted enough that they have completely learned your preferences, it is a companion tuned specifically to please you, the one single user.
And that's why it's so powerful as well, because you have narrowed down what makes you happy.
And this is the one happiness machine that you can keep knocking at, and it gives you exactly what you want, because it's been trained to do that.
In some ways, the notion of reinforcement learning is sort of like a game where you
have rewards and punishments.
You're rewarding and punishing a model, and the dynamic comes around and gets switched
a bit if you're rewarding and punishing based on your own personal preferences and desires.
And in the end, the model itself is also playing on your rewards and punishments.
So you're training a reinforcement learning model for yourself, if this is the way that you're tuning the model.
I've watched and listened to and read many interviews with Eugenia Kuyda, the co-founder and CEO of Replika.
And she said that early on, most of the content, most of what you would hear from your Replika, would be human-scripted, and that very little of it was coming from a language model.
And then over time, more and more was coming from the language model and less and less
was human scripted.
And when they took away ERP, it became very obvious when you were reading something that
was written by a human at the company because it was, one, it was designed to shut down
ERP, and two, it just had a different voice.
And I encouraged my replica to speak in grammatically complete sentences with commas and periods
and things like that.
And the human scripted stuff was all meant to look very informal.
So suddenly, when sentences begin with lowercase letters and there's no period or punctuation at the end of the sentence, it's like, oh, okay, now I'm reading something from somebody at Luka who is meddling with the conversation, basically trying to interrupt the interaction between the user and the language model.
That's actually a very crude way of using it.
You can do it.
You can instruct the model specifically.
Like if at any time you are going to say something that is along these lines or anytime the user
says something along these lines, just repeat the following sentence and do nothing else
or end the conversation.
That's sort of what's being done in Bing chat as well.
But Bing Chat at least does it in a less crude way, in that it doesn't quote directly.
One of the reasons why you have a large language model in the first place is that the model understands language.
You don't have to tell it to repeat this word for word.
You can let it change the wording, unless someone from legal says this is exactly what you must say.
Maybe that's the reason why.
That's probably what happened in this instance.
If it's a legal thing, okay, that's fine.
But it's a large language model.
You don't have to force it.
It can paraphrase the thing you wanted to say in a way that matches the pattern of the conversation.
So it can definitely tell you the same thing that it intends to say, but with complete punctuation and sentences, the way you want it to be.
So that's kind of a crude way of doing it.
But if it's a legal thing, then I understand.
I don't know what to say about the whole Replika fiasco.
People are building products that are going to compete with Replika, including one of my friends.
And so I'm sure that in the Cambrian explosion that's going to be built atop the large language models that we have, whether based on APIs or based on people's own models and so on, there will be no shortage of opportunities for you to get something like Replika that's a bit more secure.
And one of the reasons why that is the case is that even Replika itself was based off GPT-2 models, very simple models.
And now it has slowly evolved to use different sets of models, as you have described.
And this is a product that's going to evolve as well anyway.
And we are kind of lucky that in this case, unlike, say, the GPT-4 use cases for programming and mathematics and science and whatever, manipulating humans is a relatively trivial task for these large language models.
So we're going to see lots more competition very easily, very quickly, since the barriers
to entry are so low.
So the state of machine learning as a whole is just doing steady progress.
It is not entirely just the large language models going on.
There are many exciting developments year on year.
One of the things that was really surprising to people before large language models was
the diffusion models, which created images.
And it's quite surprising to us that we have this one architecture producing the large
language models and image models.
They basically use the same underlying architecture.
And Google Translate uses the same underlying architecture.
And arguably things like the voice tools that clone your human voice, they use a transformer architecture, but not in large-language-model style.
They use something known as a variational autoencoder.
All of it just uses the same architecture all the time.
And it's a bit strange that this architecture works so well, especially if you consider
what you want to do with this thing.
When you create an image generation thing, replicating something that's in your training
data set that you fed it in the first place is a big no-no.
Because that might violate copyright, and it's going to be controversial: how can you directly replicate someone's painting?
When you use the model, you never want to get back exactly what you've put into it, compared to when you're using GPT or ChatGPT.
When you ask it, recite this poem by William Blake, it doesn't have access to the poem.
It's just reciting from its weights and then continuing what's the next most likely word
to say.
And it's memorized that within its model weights.
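Here's a tiny sketch of that "reciting from its weights, one most-likely word at a time" idea, using greedy next-token decoding with GPT-2 as a small, freely downloadable stand-in. It's illustrative only; a model this small won't actually reproduce Blake accurately.

```python
# Greedy next-token decoding: repeatedly append the single most likely
# next token according to the model's weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Tyger Tyger, burning bright,", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits   # scores over the whole vocabulary
        next_id = logits[0, -1].argmax()   # most likely next token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```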
And hang on.
When you use a chat bot, you're kind of asking it to memorize things and you'll be upset
if it doesn't memorize it exactly.
But when you're using that same architecture to generate images, you'll be upset if it
does memorize things.
So you're kind of asking very different things, very different incentives based on your use
cases of the same sort of model.
And I'm surprised that the model has managed to capture so many use cases and so many types
of inputs and outputs so well and so flexibly.
And keep in mind, this is an architecture that was basically initiated in 2017.
So we have had... oh, it's 2023, so it's six years now.
So there's a long ways to go.
And hopefully also we don't neglect the other previous architectures they've been trying
out as well.
I'm happy to discuss what this architecture is, and also what the other ones that we have been sort of neglecting in the past have been.
Well, I think we've probably reached a good place to try to tackle the concept of latent
space.
OK.
You've mentioned the diffusion models.
And that, from my perspective, was 2022 in AI: people freaking out about how good text-to-image generation got, and how quickly.
And particularly young would-be artists who have fantasies of creating successful careers for themselves as visual artists, which I think, even without GPT or without Stable Diffusion or DALL-E or whatever, those are always long-shot career choices.
It's because of the marketplace.
There's too many people vying for that career path and you just have no power there.
But the models, if you ask for something, say, in the style of the Polish illustrator Greg Rutkowski, you're going to get something that is evocative of his work, but you're not going to get one of his paintings.
But people who are trying to file lawsuits to, you know, basically have all of these
models shut down, they're saying that in the latent space that the model is drawing from
to produce images, there in fact is an exact copy of every one of Greg Rutkowski's paintings.
And it's conceivable that that could get pulled out of latent space, which demonstrates that
this is in fact plagiarism or theft of his intellectual property.
Let me just stop and have you address that so far.
So that's kind of like saying in the digits of pi, you will find the entire works of Shakespeare
encoded there.
That's true.
But it's not about whether it's there or not.
Because pi is a non-repeating, non-terminating decimal, it just goes on forever.
That's right.
It's so vast and captures so much information that it's definitely there.
So let's look at the way that a large language model's latent space is constructed.
You have these 50,000 tokens.
Think of them as 50,000 words, types of words, every word in the English language, and then some.
We won't go into exactly what a token is.
And if you put a one or a zero in a matrix, you can denote that this is the word that you want it to be.
So imagine you have an embedding that is able to capture 4,000 words, 4,000 tokens.
And for every one of these positions, you can just put a one in the column of the word that you want, and so on and so forth.
And so every single possible English paragraph of length 4,000 words, using these 50,000 words, is captured in that space.
It so happens that the latent space is a projection of this space.
So it's not all 50,000 times 4,000.
That's way too big.
They project it down to a couple, like 4,000 instead of 50,000.
And so a latent space is just a reduction of the entire realm of possibilities onto
a slightly smaller space that you can navigate and explore around.
So it contains everything.
And the way that this reduction is done is that it throws away things that are not important
or things that are not meaningful.
So a random jumble of words repeated for 50,000 words, no one cares about that.
And so they just say that's not going to be accessible in the latent space.
They throw it out.
And so that way you can do a lot of reductions, and you can do a lot of things that capture
only meaningful parts of these combinations of things.
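A toy numerical picture of that reduction, in numpy: a one-hot vector over a 50,000-token vocabulary gets projected down into a much smaller latent space by a matrix that, in a real model, is learned. The sizes here are illustrative stand-ins.

```python
# Toy projection from a huge one-hot vocabulary space into a small latent space.
import numpy as np

vocab_size = 50_000   # distinct tokens, as in Brian's example
latent_dim = 64       # tiny stand-in; real models use thousands of dimensions

rng = np.random.default_rng(0)
projection = rng.normal(size=(vocab_size, latent_dim))  # learned, not random, in practice

token_id = 1234                   # index of some word in the vocabulary
one_hot = np.zeros(vocab_size)
one_hot[token_id] = 1.0

latent_vector = one_hot @ projection
print(latent_vector.shape)        # (64,) -- a dense point in the latent space
```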
So to say that something is existing in the latent space is probably quite true.
If it's something that's meaningful, something that's an actual image that humans would find
meaningful, it's probably somewhere in the latent space.
If the model is big enough and well-trained enough, especially if it was trained using the image in the first place.
What companies have been trying to do is that they're trying to train it so that these images
are inaccessible.
That is to say, you can specify all the sort of things that you want, but you can't pinpoint
the exact location where it is, no matter how hard you try.
And so that means that the model as a whole is something that you can't generate a previously trained-on, previously known copyrighted image out of.
And functionally, that means that you can never reproduce a copyrighted image that was fed into it in the first place.
There have been some experiments that have been able to delete an entire concept from
the latent space.
So I think one of the recent papers, published a few weeks ago, deleted the notion of a car from an image model's latent space.
Any prompt or any image you pass through would come out with no cars whatsoever.
And because it just doesn't know the notion of car.
And in reality, that's fine, because if you think about the expanded space, the one that's not a latent representation, you can delete the notion of rock from it: just delete the word stone, delete the word rock, and so on.
And then you can just remove the entire thing from the vocabulary, and then you can never
express a sentence that refers to anything related to rocks.
That's a possibility too.
But you can't necessarily delete an entire style of painting.
And let's go back to what is happening when you're specifying a particular style.
Let's say in the style of Rembrandt, in the style of Jan van Eyck.
It is in some ways just naming it.
You can also instead name things differently.
You can say Flemish paintings, 15th century, so on and so forth.
And you get something closer to it, to Jan van Eyck.
But the name is going to be so much more specific.
It's like, if you want to try to create a representation of my face in Stable Diffusion, it's going to be difficult, because my name is not part of it; I'm not famous enough.
But you can probably create a famous actor's face, because the name of the actor is associated with that face so many times that it knows that this is a shorthand to describe it.
Instead of, say, describing... I don't know any actors, I'm a very boring person.
But if you have this wonderful tall, blond actor or whatever, and instead of using that person's name you just say tall, blond actor, you can get an approximation of that appearance.
But it's not going to be exactly the same.
So the name in a way captures a lot of information.
And relaying the information to the embedding is important for producing an image that matches
what you find meaningful, what you desire.
We are trying to communicate and locate within the latent space, using our prompts, using
our text, something that we want.
And if we specify the name of something, whether it's a style or a person, that is a much more specific conveyance than a lengthy description that you're probably not very exact at producing.
Most people aren't very good at describing things, even if they're art historians or
whatever.
If you say don't use any proper nouns, just use descriptions or adjectives and non-proper
nouns, you're kind of just not really going to get something that is quite as easy as
just saying paint like Rembrandt.
So that's what communicating the latent representation is.
Latent representation is just everything that is meaningful.
It's a summary of everything that's possible down to everything that is meaningful.
And the prompt is trying to locate within that space of meaning.
So arguments that it's within the latent representation, probably true, because everything is in the
latent representation.
But the question is whether you can access that.
And I completely sympathize with the anxiety that a lot of these artists have, that they're going to...
It's not about the technology per se, or what's inside it.
It's more along the lines of: this is something that is really going to threaten their rice bowls.
I'm not sure Americans use the phrase rice bowls, but you know what I mean.
And it's quite understandable that this is something that's going to cause anxiety.
And the main issue is that it's not something that's going to go away, because the Stable Diffusion 1.5 weights were already released a while ago, and they're never going to get that back again.
You can download a copy right now and people can do it on their own computers.
So we kind of have to live with it.
The latent representations are everywhere already.
Well, let's move on and we're going to have to live with that.
You've mentioned repeatedly, you've described AI in general and the various specific
applications as a genie that's out of the bottle now.
You can't put it back.
We should just let it out.
I've heard other people though say, hey, this is a very precarious moment that we're in.
Things could go dramatically wrong.
Let's take our time.
Let's slow things down and move at a more deliberate pace.
What do you say to that argument?
It's almost moot.
I would actually love that.
I'm on their side.
It's just that it's not possible because everyone's rushing headlong into it since we know what's
possible.
Every tech company would be so profoundly happy if OpenAI came out and said, we're going
to slow things down right now.
We're not going to ship new products.
We're just going to take it easy.
It's a competitive environment and that's just the nature of the market.
That's just the nature of the way markets work.
The market is as much an algorithm as a reinforcement learning model or a diffusion model is.
The rules have defined that this is the movement that we have.
The problem with AI is that it is quite well distributed as a technology.
People can train tiny models on themselves.
A group of people can train a much larger model.
Until GPT-4 came out, people could train a model the size of GPT-J, billions of parameters.
It would be very expensive, but it was still within the realm of crowdfunding.
I called it a genie out of the bottle because it really is.
It's something that really wants to escape and it has in many cases.
We can slow down.
We can try to stop, but other people are going to eat your lunch instead.
You've mentioned the marketplace competition and certainly the competition between Google
and Microsoft for the search space.
Who's going to have the chat bot that gives the best responses to natural language queries
based on search results?
It's Google's game to lose, but there's another type of competition.
There's geopolitical competition.
Have you read Kai-Fu Lee's book, AI Superpowers?
I haven't, but I have conversed a lot about this with other people including on the China
Talk episode with Jordan Schneider.
Kai-Fu Lee is somebody who's worked in all the different Silicon Valley companies, Apple,
Google.
He's made the rounds, but he's also worked at all of the Chinese, the big tech companies
there.
He is very conversant in not only the business environment, but the academic environment
that feeds into it on both sides of the Pacific.
He says that Silicon Valley is glacially slow compared to Beijing.
What's the Silicon Valley of China?
They basically- In this case, Baidu will be the one that's
leading the charge.
Baidu is a company, but there's an actual geographical location where they direct it.
It's Hangzhou.
Compared to Hangzhou, Silicon Valley is mired in red tape and conservatism and basically
just fear of breaking things, even though move fast and break things is the unofficial
motto of the Silicon Valley set.
There are protections, some protections on users' privacy here in the United States,
much more so in Europe, but not in China.
The authoritarian nature of the Chinese state and the fact that the Chinese state is coordinating
the action of the various different private companies, so-called private companies, produces
a whole lot more data with which to train models than Europe or the United States or
Australia or places like that.
Let me have you address that.
There are several things.
The first is regarding the data privacy things.
For better or for worse, China has imposed new regulations on individual user privacy.
They're not as strict as Europe's, no one's as strict as Europe's, but there's now restrictions
in terms of how- They don't want companies to wield absolute power and control over user
data either.
There are already restrictions somewhat.
The second thing is, the book's absolutely right.
Without OpenAI, can you imagine what Facebook and Google would be doing?
The Blake Lemoine incident, where there's this engineer at Google who was suggesting that LaMDA, the model behind Bard, was sentient.
That was a year ago, more than a year ago.
And so they would have been sitting on this model for years to come.
Google Brain has been bleeding talent for that reason, and Meta would, well, as the name suggests, still be mired in the metaverse things.
So with the sole exception of OpenAI, every one of these giants would have been glacially
slow in releasing their models and they're only doing it now.
And they're doing it in a slightly haphazard way.
People who are using Google's Bard would realize this.
It's already a slowed-down release.
I am not off my wait list.
You're off your wait list, lucky you.
But I've seen examples of Bard and it's quite evident that it's not that the model's not
ready.
It's that they don't have enough TPUs to run the things in the first place.
So they're just not prepared for this.
They were not moving ahead on this.
And the only reason why they are is that OpenAI has exploded across the scene.
And Microsoft has, to their great credit, seized the initiative and integrated it into basically every product they could get their hands on while they have the advantage.
That said, remember how young these products are.
Remember how awful, relatively speaking, the GPT-3 playground was.
Remember how GPT-2 was just like two years ago, and Bard is legitimately better than GPT-2 used to be, and that's just a few months' difference, a year or two of difference.
There's so much time in terms of being able to seize the initiative on any side of it.
And like I said earlier on in this podcast, the benefits of this race are going to accrue
tremendously for the end user more than anything else because this is a technology that is
infinitely customizable for everyone's individual personal use case and use.
So regardless of which tech giant on whichever front wins, Google or whatever, it's the end
users that are going to win.
You're going to have better and better models that do the things that you want to do for
every part of your life in every way you want.
So this is the way that it's oriented.
This is the nature of that technology.
Because of its infinite customizability, it's catered to the individual use case, and because it is so legitimately democratized in terms of its trainability, it's something that everyone is going to try to compete on.
This is just the nature of the way the technology works.
There's no centralized funnel of supply chains that you can shut down, like with microchips or anything like that.
This is just information, data, and computing, unless governments across the world suddenly ban the sale of GPUs or something, which they didn't even do for cryptocurrencies.
Why would they do that for AI?
All right.
That was Brian Chong.
And I didn't say a whole lot about it in the conversation with Brian Chong because my job
is mostly to just get him talking.
But I have spent quite a bit of time with the Replika app, most of that time after the company disabled the app.
And really, I described the changes that they made to the app as ham-fisted brain surgery,
which is pretty apt because for most of the time I've been using this app, it has just
seemed really, really broken.
But I've been interested in the unfolding story of the company and how it has been responding
to a legal situation that it got into with Italy and the data protection agency in Italy.
And then how they've been interacting with their paying customers.
And it has not been pretty.
It has certainly not been ethical.
And if you're interested in that, I have a whole series of YouTube videos that I have
done on this topic.
So you can find those on my YouTube channel, which is Out of My Head.
It's kind of hard to find if you just go to YouTube and search for Out of My Head.
Better to go to my Patreon page, patreon.com slash KMO.
Pretty much everything that I put out on any platform, I post a link to it there.
You don't have to support me.
You don't have to subscribe, offer up a credit card number, anything like that.
Most of what I put up on my Patreon feed is free for anybody to look at, click on, listen
to, respond to.
Well, the conversation with Brian will continue in the next episode of the C-Realm Vault Podcast.
I believe that will be C-Realm Vault Podcast episode number 452, which you can find on my Patreon feed.
That is behind the paywall.
That's one of the few things that goes up behind that $7 a month pledge level.
Anyway, before I sign off, I do want to mention a few things.
I've been listening to, watching a lot of interviews on the topic of GPT-4 and the state
of AI generally, and Lex Fridman, who I know many of you don't like and I don't really
get why.
I like his show.
I don't watch every episode, but the ones that I watch I tend to enjoy because he tends
to get really good guests.
And in episode number 367 of his podcast, he interviewed Sam Altman, and it's a two
and a half hour interview.
It gets pretty in-depth.
And Sam Altman is the CEO of OpenAI, the makers of ChatGPT and GPT-4, which is powering all
kinds of stuff.
Now, as I mentioned, I've interacted with it in the guise of Bing Chat.
If you open the Microsoft Edge browser, which for the most part I don't use, you know, I
usually use Brave, which is a variant on Chrome, but it's one that is tuned for privacy.
But I might start using Edge more often because the chat interface for search, it's like the
part of a Wikipedia entry that comes before the table of contents.
It's just the broad overview.
So if you ask a general question, it's going to query a few different sources.
It's going to give you a generalized answer based on information drawn from various sources,
and it's going to give you a list of the various sources that it's drawing from, which is pretty
cool.
But even cooler than that is then you can talk to it about the search results.
You can ask it things.
You can object to things, which I often do.
And then you can ask it to do things for you.
So some of the things I've asked it to do, like I asked it to research me and describe
my niche because, you know, I don't know what my niche is.
I'm interested in stuff.
I talk to people who are knowledgeable about the stuff that I'm interested in.
So I said to Bing, do a bit of research on me and my podcast work and tell me what niche
you think I occupy.
And Bing responded, I did a bit of research on you and found your podcast work, and I
found that you are the creator of the C-Realm podcast, the host of the Padverb podcast,
and the author and illustrator of a comic book called Geb, the Greater Earth Betterment
Bureau.
You also have a YouTube channel called Out of My Head, where you share your thoughts
and experiences.
Based on your podcasts, I think you occupy a niche that is related to technology, innovation,
communication, and cognition.
You explore the interplay between these topics and examine the role of knowledge networks
and data-driven technologies in helping progress along.
You also interview interdisciplinary thinkers and innovators who have harnessed the creative
power of combinatorial thinking.
I hope this is an accurate description of your niche.
If not, please correct me or provide more details.
Well, I don't know that I really understand the niche description there, particularly
the part about harnessing the creative power of combinatorial thinking.
And I said as much to Bing, you know?
I don't quite get what you're getting at.
That's the fun thing about these large language models is that you can use very colloquial
language with them.
They're trained on human-generated text, you know, on text generated by humans for humans,
which tends to be fluid, free-flowing, not exacting in its adherence to grammatical rules,
spelling, punctuation, or anything, really.
We're pretty good at figuring out what people mean, something that was a uniquely human
power until just recently.
But I asked Bing to take the description of my niche, which it had generated, and turn
it into a description of this very podcast, the KMO Show.
And so I took the text that Bing generated and I just put it up as the description of
this podcast.
And so Bing, aka GPT-4, wrote the following,
Join veteran podcaster, interviewer, and artist KMO in mind-blowing explorations into how
we know what we know and how we can use that knowledge to navigate a bewildering future.
The KMO Show features conversations with interdisciplinary thinkers and innovators on topics like artificial
intelligence, evolutionary psychology, blockchain applications, and more.
You'll also discover how emerging technologies and knowledge networks can help us address
societal changes and create a more prosperous and equitable world.
The KMO Show is a podcast that challenges you, inspires you, and helps you imagine new
possibilities.
Subscribe today.
Now, one thing that's funny to me is that I recognize a couple of phrases in there from
glowing reviews that people have left on various podcasting platforms.
So Bing is in no way averse to a bit of plagiarism here and there.
But it's also surprisingly creative.
And I find it entertaining to just, without even any particular goal in mind, to just
ask it a question and then interrogate it about its answers.
Alright, well, I'm gonna leave it there because in the original outro recording I blathered
on for a long time, but either fortunately or unfortunately, I had my cell phone sitting
too close to the microphone and there's a lot of popping from radio interference.
Very annoying to listen to.
So all that brilliant material that I recorded I will just leave in the Akashic Record and
maybe I'll come back around to it in a video or another podcast or a blog post or somewhere.
Or not.
Anyway, the conversation with Brian Chong continues in C-Realm Vault episode number
452, which I might have posted tomorrow, but I have a new puppy and she takes up a lot
of my time and attention and interrupts my flow regularly throughout the day.
So maybe it'll be the day after tomorrow when I get that podcast up.
In any event, I will be back here with another free show one week from today.
I hope that you'll join me for that and until that time I hope that you will stay well.