This (longer) week in multimodal AI art (25/Jun - 05/Jul)
This week we got a VQ-Diffusion Colab, 1-line Disco Diffusion, face editing with CLIP, a CLIP in Turkish and an amazing synthetic Aesthetic Captions dataset
Hi all! The newsletter is again on a bit of a weird timeline - this time because I’m on vacation. However, the updates this week are really exciting!
Text-to-image updates
VQ-Diffusion Colab released*
Discoart released*
Deep Image Diffusion Prior released*
TeCM-CLIP - manipulate faces with CLIP*
MGAD - image prompts to Diffusion*
CLIP and CLIP-like models
TrCLIP - Turkish CLIP released*
Datasets
Simulacra Aesthetic Captions released*
Education
* code released
Text-to-Image synthesizers:
- VQ-Diffusion Colab released (Colab)
by Cene655
VQ-Diffusion is a new approach to text-to-image generation released by Microsoft, in which a diffusion model operates on a VQ-VAE latent space. We reported on it here a few weeks ago, and now a Colab notebook is out for it.



- Discoart released (GitHub)
by Jina AI
A Python library to run the famous Disco Diffusion text-to-image model with one line of code, while still supporting most of the colab notebook features.


- Deep Image Diffusion Prior released (GitHub)
by @nousr_ and @laion_ai
Deep Image Diffusion Prior is a technique that combines the DALL-E 2 CLIP text-to-image embedding with the Deep Image Prior technique by Katherine Crowson and Daniel Russell to visualize the features in CLIP's weights corresponding to activations from your prompt.


- MGAD - image prompts to Diffusion (GitHub)
by Nisha Huang
MGAD enables more modalities (other than text) to be used as inputs for text-to-image diffusion models - so image prompts can be used to help guide and give style to images.


- TeCM-CLIP - manipulate faces with CLIP (GitHub)
by Nisha Huang
TeCM-CLIP is a face editing/manipulation tool that lets you edit pre-existing faces using natural-language text prompts.


New CLIP and CLIP-like models:
TrCLIP - Turkish CLIP released (GitHub)
by Yusuf Anı
Yusuf Anı released TrCLIP on GitHub - a CLIP model for the Turkish language. More information will be out after the INISTA 2022 conference.
Datasets:
Simulacra Aesthetic Captions (GitHub)
by John David Pressman
JD Pressman released Simulacra Aesthetic Captions - a dataset of images generated from text, together with their prompts and the aesthetic scores given to them by users. This dataset enables prompt analysis, training aesthetic predictors (such as LAION's), and many more use-cases as listed on the GitHub page.
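To give a rough idea of what "prompt analysis" on a dataset like this could look like, here is a minimal sketch that averages user ratings per prompt and lists the best-scoring ones. Everything in it is an assumption for illustration: the `ratings` rows are made-up sample data, and the function names `mean_rating_per_prompt` and `top_prompts` are hypothetical. The real dataset's storage format and field names may differ, so check the GitHub page before adapting it.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (prompt, numeric user rating).
# These are invented for illustration and are NOT taken from the real dataset,
# whose storage format and field names may differ.
ratings = [
    ("a castle at sunset, oil painting", 8),
    ("a castle at sunset, oil painting", 7),
    ("a blurry photo of a dog", 3),
    ("a blurry photo of a dog", 4),
    ("portrait of a robot, trending on artstation", 9),
]


def mean_rating_per_prompt(rows):
    """Group ratings by prompt and return {prompt: mean rating}."""
    by_prompt = defaultdict(list)
    for prompt, rating in rows:
        by_prompt[prompt].append(rating)
    return {prompt: mean(values) for prompt, values in by_prompt.items()}


def top_prompts(rows, k=2):
    """Return the k prompts with the highest mean rating, best first."""
    scores = mean_rating_per_prompt(rows)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    for prompt, score in top_prompts(ratings):
        print(f"{score:.1f}  {prompt}")
```

The same per-prompt averages could also serve as simple training targets for an aesthetic predictor, though a real pipeline would work from the images themselves rather than from prompt text alone.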