Superfast AI 2/13/23

Google’s Bard vs Bing and ChatGPT, RunwayML’s Gen-1 model, and robots responding to natural language prompts.

Happy Super Bowl weekend! I hope your team won, otherwise… we don’t have to talk about it. We can just talk about the ads and the half-time show. ✨

Anyway, today we’re covering Google’s Bard vs Bing and ChatGPT, RunwayML’s Gen-1 model, and robots responding to natural language prompts.

Let’s dive in.

🗞️ News

Google vs Microsoft

Last week brought two major media events: Microsoft’s unveiling of ChatGPT-powered Bing, and Google’s unveiling of Bard, its rival chatbot.

The overall sentiment: Bing had a successful press conference, while Google’s event flopped a bit.

Microsoft released a demo of ChatGPT x Bing, which sent the Bing app soaring from 175th place in the app store to the 2nd spot. Why? To get on the Bing-ChatGPT waitlist, you need to sign up in the app. Good move.

Google, meanwhile, couldn’t find a Pixel phone to demo Bard on, and its chatbot’s responses (based on LaMDA) included inaccurate results.

What’s going on with Google? Last week, it announced a hefty $300M investment in Anthropic, but used its own in-house LaMDA model to launch Bard to the public. Google also has plenty of ties to AI talent: an in-house lab in Google Brain, its DeepMind subsidiary, and now an investment in the cutting-edge research company Anthropic. Is Google falling behind, or is this just 0-1 for now?

It’ll be exciting to see how the developments with both Microsoft and Google unfold over the coming weeks and months. We’ll be keeping our eyes peeled 👀 that’s for sure.

RunwayML releases Gen-1

Gen-1 is their video[+text]-to-video model (link). Think of it like GarageBand or iMovie for video, except you can change a video’s style with one line of text (“make this video claymation-style”; “turn this video into the style of a charcoal sketchbook”). Check out the demo video here:

Speaking of GarageBand, last week I wrote about Google’s text-to-music model MusicLM and its awesome capabilities so far.

The Batch covered an analysis of the MusicLM comparison here, which is worth checking out in full, but a few quick stats stood out to me. Researchers compared MusicLM to Riffusion and Mubert, two other text-to-music models, and determined that MusicLM performed the best:

  • MusicLM created the best match (of music to text prompt) 30% of the time

  • Riffusion: 15.2%

  • Mubert: 9.3%

It will be interesting to see how AI music generation progresses, and whether it affects musicians the way DALL-E, Stable Diffusion, and Midjourney have affected visual artists. Imagine a world where:

  • lyrics are written with LLMs like ChatGPT

  • music is composed with models like MusicLM

  • storyboards are created with image models like DALL-E, and

  • all of these parts are pieced together with video models like RunwayML’s Gen-1

Exciting stuff!
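For the curious, here’s a rough Python sketch of how a pipeline like that might be wired together. Every function below is a made-up placeholder for illustration, not a real API, so treat it as a thought experiment rather than working integration code:

```python
# A hypothetical, end-to-end generative media pipeline.
# All four helpers are placeholders standing in for a ChatGPT-style LLM,
# a MusicLM-style text-to-music model, a DALL-E-style text-to-image model,
# and a Gen-1-style video model. None of these are real APIs.

def write_lyrics(theme: str) -> str:
    """Placeholder for an LLM call that drafts song lyrics."""
    return f"(lyrics about {theme})"

def compose_music(prompt: str) -> bytes:
    """Placeholder for a text-to-music call."""
    return b"audio-bytes"

def storyboard_image(prompt: str) -> bytes:
    """Placeholder for a text-to-image call."""
    return b"image-bytes"

def render_video(frames: list[bytes], audio: bytes, style: str) -> bytes:
    """Placeholder for a video[+text]-to-video call."""
    return b"video-bytes"

def music_video(theme: str) -> bytes:
    # Chain the models: lyrics -> soundtrack -> storyboard frames -> styled video.
    lyrics = write_lyrics(theme)
    audio = compose_music(f"an upbeat track for these lyrics: {lyrics}")
    frames = [storyboard_image(line) for line in lyrics.splitlines()]
    return render_video(frames, audio, style="claymation")

music_video("a road trip along the coast")
```

The interesting bit is that the “glue” between the stages is just natural language prompts.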

A report from a16z on everyday AI

a16z breaks down the different use cases of generative AI in everyday consumer products, including:

  • Search and Product Discovery

  • Education

  • Human Connection

  • Mental Health

  • Content Creation

  • Gaming

  • E-commerce

Check out the full article here.

Assorted links

  • Some examples of how you can jailbreak ChatGPT (link)

  • Quora has a new chatbot named Poe (link)

📚 Concepts & Learning

Innovator’s Dilemma

The innovator’s dilemma is the idea that incumbents are unable or unwilling to disrupt their core business to adopt new technology, which is what allows new entrants, like startups, to disrupt legacy industries.

Is this what we’re seeing with Google and OpenAI? If Google adopts a primarily chat-based search product, how will that affect its ads revenue?

Forbes has a fuller piece which you can check out here, and Pete Huang has a great summary which you can check out here.

We teach kids the basics first, what about AI and robots?

In this article, the authors explore how training improves when models are taught the basics about the world before diving into complex tasks. The shorthand version: people learn simpler concepts before moving on to more complex ones (arithmetic before calculus), so what if we applied the same learning schema to machine learning models? Do they learn better? Faster? Do they retain more?

Researchers in a project called SayCan tested this idea. They built a text-to-action model that helps a robot complete multi-step tasks in a test environment from ambiguous, indirect (natural) language. In the video below, the robot is taught to help clean up a spill with a sponge by learning:

  • what a sponge is

  • where it is located in the room

  • that it should pick up the sponge

  • that it should transport it to the relevant spill location and

  • that it should drop it off

That’s a lot of steps for an abstract prompt like “I spilled my drink, can you help?”

You can check out the full video here:
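For a sense of how this kind of system works under the hood, here’s a toy Python sketch of the general SayCan idea: score each low-level skill by how useful a language model thinks it is, multiplied by how feasible the robot’s value function says it is, then execute the best one and repeat. The scoring functions below are hand-written stand-ins I made up for illustration, not SayCan’s actual models:

```python
# Toy sketch of SayCan-style planning: combine an LLM's "usefulness" score
# with a learned affordance (feasibility) score for each low-level skill.
# Both scoring functions are made-up placeholders, not the real SayCan models.

SKILLS = ["find a sponge", "pick up the sponge", "go to the spill",
          "put down the sponge", "done"]

def llm_usefulness(instruction: str, history: list[str], skill: str) -> float:
    """Placeholder: how likely the LLM thinks `skill` is the right next step.
    In SayCan this comes from LLM token probabilities; here it's a toy
    hand-written ordering so the demo produces a sensible plan."""
    next_step = SKILLS[len(history)] if len(history) < len(SKILLS) else "done"
    return 1.0 if skill == next_step else 0.1

def affordance(skill: str) -> float:
    """Placeholder: a learned value function estimating, from the robot's
    current observation, the probability that `skill` would succeed."""
    return 1.0

def plan(instruction: str, max_steps: int = 10) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        best = max(SKILLS, key=lambda s: llm_usefulness(instruction, history, s) * affordance(s))
        if best == "done":
            break
        history.append(best)  # a real robot would execute the skill here
    return history

print(plan("I spilled my drink, can you help?"))
# ['find a sponge', 'pick up the sponge', 'go to the spill', 'put down the sponge']
```

The affordance term is what keeps the language model honest: a skill only gets picked if the robot can actually carry it out right now.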

This makes me think of Rice Robotics, a Hong Kong-based robotics company that builds food delivery robots that autonomously navigate buildings, call elevators, and move across multiple floors. You can check out a demo here:

This also makes me think of Atlas from Boston Dynamics, which released a fantastic demo video showing off the newest robot’s versatility just three weeks ago. You can check it out here:

What’s next for robotics and AI?

🎁 Miscellaneous

Is ChatGPT politically biased?

  • Check out this quick run-down by Morning Brew (link)


Wild stat

  • ChatGPT saw 672 million visits in January 2023 alone (link)

What did you think about this week’s newsletter? Send me a DM on Twitter @barralexandra

That’s it! Have a great day and see you next week! 👋

Thanks for reading Superfast AI. If you enjoyed this post, feel free to share it with any AI-curious friends. Cheers!