People have loved playing around with the advances AI has brought to images, making their friends look old or putting them in a Van Gogh painting - but no one has made it easy for people to use the recent advances in natural language processing to create their own AIs that write, generate and converse from textual data.
How would your friend speak as an AI? Could you get an AI to rewrite the last episode of Game of Thrones? Which celebrity would you most like to speak with?
Once off the ground, this can be extended into developing conversational AI for videogames, helping draft documents just as you would, and more - democratizing access to NLP.
This isn't only for fun; there are important ethical considerations. Many people don't appreciate how convincing today's text and voice cloning is, which leaves them vulnerable to scams and fake news. By letting people play with the current state of the art, we give them insight into its power and limitations, and soothe anxieties about advancing technology.
What it does
Joe.AI uses transcripts to understand a celebrity, friend or document - and then generates new text and audio in their style. As a centrepiece, we cloned Joe Rogan - so that you can be a guest on his podcast today.
How I built it
We leveraged machine learning models pretrained on millions of text documents (such as all of Wikipedia and the news from the last 5 years) and fine-tuned them on podcast transcripts and audio for our purposes - just as users will be able to do. We used the state-of-the-art ML models GPT-2 and DC-TTS for text and audio respectively.
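As a rough illustration of the data preparation behind fine-tuning, the sketch below turns a list of transcript turns into speaker-tagged lines like those in the example transcripts further down. The `format_turns` helper, the (speaker, utterance) input structure and the sample turns are assumptions made for illustration, not necessarily the exact pipeline used for Joe.AI.

```python
from typing import List, Tuple

HOST_TAG = "[Joe Rogan]"  # tag that appears in the example transcripts
GUEST_TAG = "[INPUT]"     # tag used for the other speaker


def format_turns(turns: List[Tuple[str, str]], host: str = "Joe Rogan") -> str:
    """Convert (speaker, utterance) pairs into one tagged line per turn.

    Hypothetical helper: the real training format may differ.
    """
    lines = []
    for speaker, utterance in turns:
        text = " ".join(utterance.split())  # collapse stray whitespace
        if not text:
            continue  # skip empty turns
        tag = HOST_TAG if speaker == host else GUEST_TAG
        lines.append(f"{tag}: {text}")
    return "\n".join(lines)


# Made-up sample turns, only to show the output shape.
sample = [
    ("Joe Rogan", "What do you make of   the whole thing?"),
    ("Guest", "I haven't really been paying attention."),
]
training_text = format_turns(sample)
print(training_text)
```

Text in this shape can then be fed to whichever fine-tuning script is used for the language model; tagging each turn lets the model learn who is speaking and lets a user's message be passed in under the `[INPUT]` tag at generation time.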
Challenges I ran into
A model is only as good as the data you give it. We ran into large challenges in training the audio model, and in integrating the various components. Because the models are standalone and take up significant GPU memory, voice and text generation can't run at the same time. In future, we would fix this by batching user requests on separate cloud machines.
Accomplishments that I'm proud of
Getting the text generation model to generate incredibly realistic conversations. This was a careful balance between training the model to imitate Joe's personality and not overtraining it to the extent that it lost the ability to generalise.
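One common way to strike this kind of balance is early stopping: measure the loss on held-out transcripts after each checkpoint and keep the checkpoint from just before that loss starts rising. The sketch below assumes this approach; the `best_checkpoint` function, the `patience` parameter and the loss values are hypothetical, and the writeup does not state that this is how the balance was found.

```python
from typing import List, Optional


def best_checkpoint(val_losses: List[float], patience: int = 2) -> Optional[int]:
    """Return the index of the checkpoint to keep under early stopping.

    Scanning stops once validation loss has failed to improve for
    `patience` consecutive checkpoints. Returns None for an empty list.
    """
    best_idx = None
    best_loss = float("inf")
    stale = 0  # checkpoints seen since the last improvement
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss, stale = i, loss, 0
        else:
            stale += 1
            if stale >= patience:
                break  # validation loss keeps rising: likely overtraining
    return best_idx


# Made-up validation losses, one per saved checkpoint.
losses = [3.2, 2.9, 2.7, 2.8, 2.95, 2.6]
chosen = best_checkpoint(losses, patience=2)
print(chosen)
```

A small `patience` stops training sooner and keeps more of the pretrained model's general ability; a larger one lets the model drift further towards the target speaker before stopping.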
What I learned
How to sensitively balance training regimes on ML models, and how to integrate ML models with a website for production. Also, the often surprising things that an AI will say!
What's next for Joe.AI
Productionising what we used to train Joe Rogan into an easy-to-use product, so people can use their conversations with friends or other data with our models. Then we will see what consumers use the product for. We have many potential applications to explore, such as:
- Generating scripts and audio for video game characters, drastically reducing production costs.
- Drafting messages, improving productivity.
- Generating voices for figures in wax museums, like US Presidents!
- Rewriting the final episode of Game of Thrones.
Example transcripts: (Both sides generated by the model)
[Joe Rogan]: You're basically advocating that if you have a problem, you should put your money where your mouth is, and not contribute to something.
[INPUT]: This is the best. The only thing that could stop me -- Well, everyone, to be clear. I'm only talking about my own personal life when I say this, and I'm sure that I'm wrong. But I think, just at a basic level, anybody in a position of authority has no business telling somebody else, no matter what their qualifications or what their position with the rest of society, what to do. People who have superpowers are never going to be able to convince a judge that their amazing powers are not real.
[Joe Rogan]: What do you make of the whole whole Trump thing?
[INPUT]: I mean, there's obviously a huge amount of distrust of the mainstream media, a huge amount of distrust of the political system. I haven't really been paying the most detailed attention to what the politics of this are in this country, but it seems like, there's also a massive amount of enthusiasm for a brand new, different, different way of organizing and behaving. And I think, more and more people are realizing that this system is not going to provide a fair, just, equal distribution of resources, and I think, like any generation, we have to look to where the lessons are that we're bringing to bear, and not to throw our hands up and move on.
[Joe Rogan]: Well, in the last election, in the last four years, have you thought about what could have been done to change the outcome and change the way you perceived the direction of this country, that you -- To your knowledge, did you consider that maybe if you'd done a little more of those little plays in 2016 that might have moved things towards a different outcome?
[INPUT]: To a certain extent. Yeah. I mean, I think, like, I think that, you know, the real takeaway is there's an opportunity out there, where we -- You know, you see, like, we've seen revolutions happen before, where people have had too much. It's not just a one-time thing that happens to you. Where we have a situation where we have enough resources to do something monumental. So, in a sense, when things look like they're going so wrong, it's a moment to look at what the lessons ought to be. And I think that -- Again, I'm not talking about this one particular time. Like I think that the -- In other words, I think the real takeaway is, like, the power of ideas and the ability to amplify them, the ability to engage. And I think that's what made the Arab Spring possible. It was a fundamental engagement with the way governments and the elites treat their own citizens, and being able to say, okay, look, this is why I'm doing this particular thing, and how do I amplify that message if possible?
[Joe Rogan]: It's a tough conversation in many ways because, again, there are things that we need to learn as an individual, and then, as a society, as a world, that we can all learn from.
[INPUT]: The lesson that I would hope is, you know, you have -- You know, if we look at a lot of these problems of inequality, racism, sexism, all the -- you know, we saw this in the Ferguson case that really showed how unequal this country actually is. There was some people, I would guess around ninety per cent, were black. Pretty much, the rest was white. And we don't have a sense how this is going to -- How this will shake out over time.
[Joe Rogan]: You know, on the topic of policing, what can we-- What's the right kind of policing? And what's the wrong kind of policing? And what kinds of techniques are in play that actually reduce racial tensions? And what are the mistakes that police departments are making, and are they making good ones and are they making bad ones? And just general guidelines for policing.
[INPUT]: Hey, Sam, I have this interesting quote. I have this quote from a journalist, and it relates to the idea of training police departments. And I thought you could read it, but I don't want to read it. And you know, what it is is this idea that in cities, we train police departments to be like the best SWAT teams in the world, because that's what we think police officers should do. But what the fuck should police departments do?
[Joe Rogan]: Because we have so many people that want to use police as a means to an end
##################################### Input from us
When was the last time we talked, 10000 years ago?
Joe Rogan: I don't know. I imagine about 3000 and then I think we'll be back to that. That's a realistic date that I would bring that is a million years. That would be the best estimate. I don't know how much younger that might actually be, but just the idea of us really sitting back and having a billion years of evolution, you know, is pretty exciting. It's probably the most exciting thing about humanity.
Do you think we will change much over the next billion years? Or have we fully evolved.
Joe Rogan: You know, when I look at the world, I see, you know, this is what appears to be a very complex multicellular creature with eyes and a nose.
Is that the only thing we will have in the end, eyes and a nose? No heart?
Joe Rogan: It's not the heart, but you get the idea. There's a lot going on.
Sounds like there is only two things going on.
Joe Rogan: Well, there are a whole lot of things. But I will say, there's probably 20% of the stuff that doesn't even get noticed, that is, are more intelligent than us. But we're not thinking much about it. And in fact, I'm sure some of it is good. Like, I think, probably 90% of what we call the physical world is probably more magical than real.
How can we access that 90%, through psychedelics?
Joe Rogan: Well, obviously, you can reach it through drugs.
Joe Rogan: Yeah, you can reach it through drugs. So the real question is, I mean, in terms of the mind, I think, we're in an insane phase of technology, right?
Yeah, I'm most excited for neuralink - an exciting new technology that will let us access it without drugs.
Joe Rogan: Neuralink is a chip that is implanted in the brain.