Superfast AI 12/18/22

The cost of AI, watermarks, and custom AI-generated gift ideas... superfast!

Hi everyone, welcome to the second Superfast AI newsletter. Today we’ll cover ML hardware, generative AI watermarks, and AI alignment methods… superfast!

Okay… let’s dive in.

🗞 News

Hardware

Cerebras Systems has unveiled Andromeda (link)

  • Andromeda is a 13.5-million-core AI supercomputer, with more cores than 1,953 Nvidia A100 GPUs.

  • It delivers over 1 exaflop of AI compute and 120 petaflops of dense compute.

  • Andromeda demonstrates near-perfect linear scaling on large language model workloads using simple data parallelism.

Why this matters:

  • Scale: Hardware matters. ML training is constrained by hardware capabilities, and Andromeda is one of the largest AI supercomputers ever built. That scale lets it handle complex workloads more quickly and efficiently, which can translate into significant time and cost savings for users.

  • Near-perfect scaling: As additional CS-2 systems are added, training time drops in near-perfect proportion. This is especially valuable for workloads with long sequence lengths, which are difficult or impossible to train on GPUs.
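As a toy illustration of what "near-perfect linear scaling" means (the timings below are hypothetical, not Cerebras benchmarks):

```python
def scaling_efficiency(t_one: float, t_n: float, n: int) -> float:
    """Parallel scaling efficiency: 1.0 means perfectly linear,
    i.e. n systems finish the same job exactly n times faster."""
    return t_one / (n * t_n)

# Hypothetical timings: one system takes 160 hours; 16 systems take 10.5 hours.
eff = scaling_efficiency(160, 10.5, 16)
print(f"{eff:.2%} of ideal linear scaling")
```

If doubling the number of systems halves the training time, efficiency stays at 1.0; anything above roughly 0.9 at 16 systems is what "near-perfect" is gesturing at.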

The cost of ML training

Where do hardware and software meet? How can startups leverage state-of-the-art hardware to build best-in-class software? Cerebras and Cirrascale have a solution:

  1. Cerebras Systems, a maker of wafer-scale AI accelerators, and Cirrascale, a cloud computing partner, have launched a service that lets companies rent their ML training systems to build proprietary models. In this article, the two companies provide the first public data on the cost of training GPT-level models on their systems.

  2. The pricing information was released in partnership with a current customer, Jasper AI. Jasper AI leverages the training system built by Cerebras and Cirrascale to support their own AI-driven content creation services.

  3. Why does this matter? This training service offers a cost-effective alternative to buying an Andromeda-class AI supercomputer outright, which costs nearly $30M. That lowers the barrier to entry for startups that want to build in the ML space.

Want to learn more? Dive into the full article here.

Related: GPT-3 quality models for <$500k (Mosaic ML) and Chinchilla Scaling Laws (Arxiv)

OpenAI is funding AI startups

Want to learn more about their investment thesis? Check out this article with their COO, Brad Lightcap.

Stack Overflow bans ChatGPT outputs (link)

But how do they verify this? Check out the section on Watermarks in the Concepts & Learning section below.

📚 Concepts & Learning

💧 Watermarks

Scott Aaronson, a researcher at OpenAI, is working on cryptographic methods to trace GPT-generated text. Here are the key takeaways:

  • A prototype tool was developed to statistically watermark outputs of a text model like GPT to make it harder to pass off GPT output as human-generated.

  • Watermarking works by selecting tokens pseudorandomly with a cryptographic pseudorandom function keyed by a secret known only to OpenAI. The output still looks random to a reader, but anyone holding the key can detect the bias.

  • The watermarking approach is theoretically analyzable and has a rigorous upper bound on the number of tokens needed to distinguish watermarked from non-watermarked text with a certain level of confidence.
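A minimal sketch of the idea (toy code, not OpenAI's actual scheme — the key, vocabulary, and candidate-sampling step are all invented for illustration): a keyed PRF scores (context, token) pairs, generation nudges token choice toward high-scoring tokens, and the key holder detects the skew statistically.

```python
import hashlib
import hmac
import random

KEY = b"secret-key-held-by-the-provider"  # hypothetical watermark key

def prf_score(context: str, token: str) -> float:
    # Keyed pseudorandom function mapping (context, token) -> [0, 1).
    msg = f"{context}|{token}".encode()
    digest = hmac.new(KEY, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def generate_watermarked(vocab, length, rng):
    # At each step, draw a few candidate tokens and emit the one the
    # PRF favors; to a reader the choice still looks random.
    text = []
    for _ in range(length):
        context = " ".join(text[-4:])
        candidates = rng.sample(vocab, 5)
        text.append(max(candidates, key=lambda t: prf_score(context, t)))
    return text

def detect_score(tokens):
    # Average PRF score over the text. Watermarked text skews toward
    # 5/6 (the mean of the max of five uniform draws); ordinary text
    # averages about 0.5.
    scores = [
        prf_score(" ".join(tokens[max(0, i - 4):i]), tok)
        for i, tok in enumerate(tokens)
    ]
    return sum(scores) / len(scores)

vocab = [f"word{i}" for i in range(1000)]
rng = random.Random(0)
watermarked = generate_watermarked(vocab, 60, rng)
ordinary = [rng.choice(vocab) for _ in range(60)]
```

The gap between the two detection scores grows with text length, which is why the approach admits a rigorous bound on how many tokens are needed to distinguish the two with a given confidence.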

Dive into his full lecture here.

Related: The Montreal AI Ethics Institute recently dove into the same subject — how will we be able to distinguish machine-generated text from original human writing as it becomes more entangled with our everyday work? Check out MAIEI’s full article here.

🥪 Sandwiching

  • What is it? Sandwiching is an experimental method for testing scalable oversight ideas in AI alignment.

  • The problem? As AI scales, it has the potential to become more sophisticated and intelligent than humans. As it does, humans may no longer be able to test the AI’s alignment with human values and goals using existing methods.

  • What’s the new method? Sandwiching pairs a language model with a non-expert human to solve a hard task. Then, those results are compared to results produced by an expert human. If some procedure can enable the language model and a non-expert to produce similar results to an expert, then we have hope that this procedure will allow the non-expert to oversee the expert even when the expert is… a sophisticated model.

  • Why does it matter? Sandwiching helps researchers uncover whether AI systems are aligned with human values and goals, which is essential for the safe and ethical development and use of AI. By testing oversight methods that are scalable as AI improves (and potentially becomes more intelligent than humans), researchers can gain valuable insights into how to design and implement aligned models.

You can read the original discussion by Ajeya Cotra here.

You can read Sam Bowman’s recent discussion here.

  • Sandwiching experiments can be slow, requiring multiple iterations and many human subjects to measure average performance accurately.

  • Some sandwiching experiments may be trialed more quickly by using a weaker language model as a proxy for the non-expert human, but it is difficult to determine which experiments those are.
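The sandwiching setup above can be sketched as a toy experiment. Everything here is invented for illustration — the "model", its 20% error rate, and the primality task come from me, not from Cotra's or Bowman's posts — but it captures the shape: a non-expert who cannot solve the task pairs with an unreliable model, and we compare the pair's answers against a ground-truth expert.

```python
import random

def expert_is_prime(n: int) -> bool:
    # Ground-truth "expert": trial division (slow but always correct).
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def model_claim(n: int, rng):
    # Toy "model": claims a factor of n, or None meaning "n is prime".
    # It is unreliable: 20% of the time it asserts the opposite of the truth.
    if rng.random() < 0.2:
        return 2 if expert_is_prime(n) else None
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return d
    return None

def nonexpert_with_model(n: int, claim) -> bool:
    # Oversight procedure: the non-expert cannot factor n, but *can*
    # check a claimed factor with a single division.
    if claim is not None and 1 < claim < n and n % claim == 0:
        return False  # verified composite
    return True       # unverified "prime" claim, accepted by default

rng = random.Random(0)
trials = range(100, 300)
correct = sum(
    nonexpert_with_model(n, model_claim(n, rng)) == expert_is_prime(n)
    for n in trials
)
accuracy = correct / len(trials)
```

Note the asymmetry: lies the non-expert can check (a bogus factor) get caught, while lies in the unverifiable direction slip through. Finding procedures that close that gap is exactly what sandwiching experiments probe.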


🎁 Miscellaneous

AI Generated Gifts

Thinking about holiday gifts this year? What about a personalized, AI-generated profile picture? Mix that with Etsy or Redbubble or just hit print and… boom! You have some custom swag for your friends and family.

Here are some AI profile picture generators:

What’s on Twitter

Do you hate making decks? Try this:

For all those Miss Excel fans out there, here’s the ChatGPT version:

Can AI grade your homework?

What’s on GitHub

  • Awesome ChatGPT prompts here.

  • Two awesome lists of curated tools, demos, docs using ChatGPT and GPT-3 here and here.

From the web

  • Save the date: AI Film Festival hosted by Runway ML: (link)

  • Looking for an interactive AI dictionary? Try this.

  • Enjoy a rap about the business model canvas and more in this witty blog by a Wharton professor on how he might use AI to help him teach classes (link)

🎨 Generative AI

AI as a prompt engineer

Recently, there have been some great threads about language models designing prompts for image models (very recursive). Here’s a great example of AI “making a movie”:

So I thought I’d give this a try! Here’s what I did:

ChatGPT prompt: Can you generate an image prompt for DALL-E? Make it winter and holiday themed.

ChatGPT output: A festive Christmas scene featuring a snow-covered village with colorful houses and shops, a cozy fireplace in the foreground, and a group of happy carolers singing under the twinkling lights of a Christmas tree.

Image outputs:

Those images look pretty standard. So I asked ChatGPT to give me a more unusual prompt.

Updated ChatGPT output: A surreal, stylized depiction of a winter wonderland, featuring towering ice sculptures resembling abstract, geometric shapes and patterns, with a brilliant, glowing sun setting behind them. The scene is dotted with colorful, festive lights, and a group of happy carolers can be seen singing under the twinkling lights of a Christmas tree in the distance.

Image outputs:

What do you think?
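The two-step chain above can be sketched in a few lines. The stand-in functions below are placeholders I made up; in practice they would be calls to a chat API and an image-generation API.

```python
def chain_prompts(chat_model, image_model, theme: str):
    # Step 1: ask the language model to write an image prompt.
    image_prompt = chat_model(
        f"Can you generate an image prompt for DALL-E? Make it {theme} themed."
    )
    # Step 2: hand that prompt to the image model.
    return image_prompt, image_model(image_prompt)

# Stand-in functions so the sketch runs without any API access.
fake_chat = lambda prompt: "A festive snow-covered village at dusk"
fake_image = lambda prompt: f"<image rendered from: {prompt}>"

prompt, image = chain_prompts(fake_chat, fake_image, "winter and holiday")
```

Keeping the models as plain callables makes it easy to swap in real API clients, or to loop the chain ("give me a more unusual prompt") by feeding the first output back to the chat model.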

AI as a marketing agent

Which ad resonates with you the most?

Prompt: Generate social ad copy for the product: Wireless Earbuds.

That’s it! Have a great week and see you next Sunday! 👋

Thanks for reading Superfast AI. If you enjoyed this post, feel free to share it with any AI-curious friends. Cheers!

🎤 ChatGPT, can you tell me a joke?

… I guess not. 😄