This week in multimodal AI art! (07/May - 14/May)

Multi-purpose agent GATO announced; MindsEye Lite, CLIP-CLOP (a collage maker), Latent Princess Generator and more released

multimodal ai art

May 15, 2022

Follow on Twitter, come hang on Discord, consider supporting on Patreon. Try MindsEye, our image generation GUI

* code released

Multi-modal single models:

- GATO announced (Paper, Blog post)

by DeepMind

DeepMind announced GATO, a generalist agent capable of performing tasks ranging from image captioning to playing Atari. The innovation here is that it uses the same pre-trained model and weights for all of those tasks, rather than several models stitched together. A huge step for multi-modal models and beyond! Very promising, but nothing open source has been released.

DeepMind @DeepMind

Gato🐈a scalable generalist agent that uses a single transformer with exactly the same weights to play Atari, follow text instructions, caption images, chat with people, control a real robot arm, and more: dpmd.ai/Gato Paper: dpmd.ai/Gato-paper 1/

Collage synthesizer updates:

- CLIP-CLOP collage assembler released (Colab, Paper, GitHub)

by DeepMind

A collage assembler using CLIP! You can load your own image datasets and type a prompt, and CLIP will guide rotating, resizing and moving the images around the canvas to fulfill the prompt.

AK @ak92501

CLIP-CLOP: CLIP-Guided Collage and Photomontage abs: arxiv.org/abs/2205.03146
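
To make the idea of "a score guides how each image is moved, resized and rotated" more concrete, here is a minimal sketch. It is not the CLIP-CLOP code: it swaps in plain random hill climbing as the optimizer and a toy score function in place of CLIP, and every name in it (`PatchPose`, `toy_score`, `perturb`, `optimise_collage`) is hypothetical.

```python
import math
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PatchPose:
    """Placement of one collage image on the canvas (hypothetical structure)."""
    x: float      # horizontal centre, in canvas units (0..1)
    y: float      # vertical centre, in canvas units (0..1)
    scale: float  # relative size
    angle: float  # rotation in radians


def toy_score(poses):
    """Stand-in for a CLIP image-text similarity score (an assumption).

    Rewards images that sit near the canvas centre at scale 1.0 with no
    rotation. A real system would render the collage and compare it with
    the prompt using CLIP instead.
    """
    total = 0.0
    for p in poses:
        total -= (p.x - 0.5) ** 2 + (p.y - 0.5) ** 2
        total -= (p.scale - 1.0) ** 2
        total -= math.sin(p.angle) ** 2
    return total


def perturb(pose, rng, step=0.05):
    """Return a copy of the pose with a small random move, resize and rotation."""
    return replace(
        pose,
        x=pose.x + rng.uniform(-step, step),
        y=pose.y + rng.uniform(-step, step),
        scale=max(0.1, pose.scale + rng.uniform(-step, step)),
        angle=pose.angle + rng.uniform(-step, step),
    )


def optimise_collage(poses, score_fn, steps=500, seed=0):
    """Hill-climb the poses: keep a random change only if the score improves."""
    rng = random.Random(seed)
    best = list(poses)
    best_score = score_fn(best)
    for _ in range(steps):
        i = rng.randrange(len(best))
        candidate = best.copy()
        candidate[i] = perturb(best[i], rng)
        candidate_score = score_fn(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


start = [PatchPose(0.1, 0.9, 0.5, 1.0), PatchPose(0.8, 0.2, 1.5, -0.7)]
final, final_score = optimise_collage(start, toy_score)
print(f"score before: {toy_score(start):.3f}, after: {final_score:.3f}")
```

Swapping `toy_score` for a function that renders the canvas and asks CLIP how well it matches the prompt gives the general shape of a CLIP-guided collage loop.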

Text-to-Image synthesizers:

- MindsEye Lite released (Hugging Face Spaces)

by multimodalart (yes, that's us!)

We are happy to have released MindsEye Lite this week! It is a tool for using simplified versions of many multi-modal AI art models in one place. It can be good for quick experiments and for introducing new people to this technology. Try it out! For the full version, try out MindsEye beta.

multimodal ai art @multimodalart

I've released MindsEye Lite👁️🧠: a UI that runs multiple text-to-image models without Colabs or logins - directly on Hugging Face Spaces Run Diffusion, DALLE replicas, VQGAN+CLIP. Try it out and consider sending it to someone that hasn't tried AI art yet! huggingface.co/spaces/multimo…

- Centipede Diffusion v3 Inpainting Upgrade (colab)

by Zalring

We covered Centipede Diffusion here three weeks ago: it generates images with Latent Diffusion and then upscales them with Disco Diffusion. It has now been updated with an inpainting UI. Check it out!

Zalring @ZalringTW

Centipede Diffusion V3 is out: with real-time mask drawing for inpainting and Real-ESRGAN upscaling. colab.research.google.com/github/Zalring…

- Latent Princess Generator (Colab)

by Dango233 and multimodalart

We have released Dango233's new model! Dango is a prolific developer in this space and it was an honor to work with him to put Latent Princess Generator in a Colab and release it. Despite the name, it is a general purpose model and one of the best available so far. We are still trying to figure out the default settings, so if you get something good, share it with us on Twitter.

multimodal ai art @multimodalart

Latent Princess Generator v1 released! 👸 In the last few days I've worked with Dango233 to release his newest work: a very specially flavored CLIP Guided Latent Diffusion approach Results are superb - try it out on the Colab (soon on MindsEye) colab.research.google.com/github/Dango23…

New CLIP and CLIP-like models:

- Multi-Modal-Comparators released (GitHub)

by David Marx (@DigThatData)

David Marx released a very useful library: Multi-Modal-Comparators. It is basically a wrapper for all CLIP and CLIP-like models available online. It can be plugged into any multimodal AI art model (VQGAN+CLIP, Guided Diffusion, etc.) so that new CLIP models beyond OpenAI's can be leveraged. It was released together with the new Pytti-Tools, and I have hooked it up to the Latent Princess Generator as well.

David Marx @DigThatData

Been working on a new tool to facilitate quickly adding support for new CLIP perceptors to AI art colabs. The tool is modality agnostic (i.e. I'll be adding models for other modalities soon) and can "mock" the OpenAI CLIP API for "drop-in" support! github.com/dmarx/Multi-Mo… 1/2
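
To illustrate what "a wrapper for all CLIP-like models" can look like, here is a minimal sketch of the general pattern: a shared interface plus a registry of loaders. It is not the Multi-Modal-Comparators API. Every name here (`Perceptor`, `ToyPerceptor`, `register`, `load`, `score`) is hypothetical, and the toy model stands in for a real CLIP checkpoint.

```python
import math
from typing import Callable, Dict, List, Protocol


class Perceptor(Protocol):
    """Common interface every wrapped model must offer (hypothetical)."""

    def encode_text(self, text: str) -> List[float]: ...

    def encode_image(self, pixels: List[float]) -> List[float]: ...


class ToyPerceptor:
    """Stand-in for a real CLIP-like model; it only folds values into a vector."""

    def __init__(self, dim: int) -> None:
        self.dim = dim

    def encode_text(self, text: str) -> List[float]:
        vec = [0.0] * self.dim
        for i, ch in enumerate(text):
            vec[i % self.dim] += ord(ch) / 255.0
        return vec

    def encode_image(self, pixels: List[float]) -> List[float]:
        vec = [0.0] * self.dim
        for i, value in enumerate(pixels):
            vec[i % self.dim] += value
        return vec


# Maps a model name to a zero-argument loader, so models are built on demand.
_REGISTRY: Dict[str, Callable[[], Perceptor]] = {}


def register(name: str, factory: Callable[[], Perceptor]) -> None:
    """Make a model available under a name."""
    _REGISTRY[name] = factory


def load(name: str) -> Perceptor:
    """Build the model registered under the given name."""
    return _REGISTRY[name]()


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score(model: Perceptor, prompt: str, pixels: List[float]) -> float:
    """How well the image matches the prompt, according to one model."""
    return cosine_similarity(model.encode_text(prompt), model.encode_image(pixels))


register("toy-small", lambda: ToyPerceptor(dim=4))
register("toy-large", lambda: ToyPerceptor(dim=8))

image = [0.2, 0.4, 0.6, 0.8, 1.0, 0.1, 0.3, 0.5]
for name in ("toy-small", "toy-large"):
    print(name, round(score(load(name), "a castle", image), 3))
```

Because the art model only ever calls `encode_text` and `encode_image`, switching to a different perceptor is a matter of changing the name passed to `load`.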


Learning AI Art:

- AIAIArt course (GitHub, Discord)

AIAIArt is a free and open source AI art course by John Whitaker. Live classes run on Twitch for the next few Saturdays at 4 PM UTC. All previous classes are recorded and remain available as Google Colabs at the GitHub link.

