Inference using Conditional Diffusion Model

In this notebook we show how to perform inference using GT4SD and the models implemented in Diffusers.

Let's see how to use a latent diffusion model (LDM) to generate images conditioned on text. The first time you run the cells below, several tokenizers, encoders, and UNet models will be downloaded.

In [1]:

from gt4sd.algorithms.registry import ApplicationsRegistry

NOTE: Redirects are currently not supported in Windows or MacOs.
Using TensorFlow backend.

INFO:toxsmi.utils.wrappers:Class weights are (1, 1).
15:11:41   Class weights are (1, 1).
INFO:toxsmi.utils.wrappers:Class weights are (1, 1).
15:11:41   Class weights are (1, 1).
INFO:tape.models.modeling_utils:Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
15:11:43   Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .

In [2]:

prompt = "Generative models are cool!"

algorithm = ApplicationsRegistry.get_application_instance(
    target=None,
    algorithm_type='generation',
    domain='vision',
    algorithm_name='DiffusersGenerationAlgorithm',
    algorithm_application='LDMTextToImageGenerator',
    prompt=prompt
)

INFO:gt4sd.algorithms.core:runnning DiffusersGenerationAlgorithm with configuration=LDMTextToImageGenerator(algorithm_version='CompVis/ldm-text2im-large-256', modality='token2image', model_type='latent_diffusion_conditional', scheduler_type='discrete', prompt='Generative models are cool!')
15:11:54   runnning DiffusersGenerationAlgorithm with configuration=LDMTextToImageGenerator(algorithm_version='CompVis/ldm-text2im-large-256', modality='token2image', model_type='latent_diffusion_conditional', scheduler_type='discrete', prompt='Generative models are cool!')
INFO:gt4sd.algorithms.generation.diffusion.core:ensure artifacts for the application are present.
15:11:54   ensure artifacts for the application are present.
INFO:gt4sd.s3:starting syncing
15:11:54   starting syncing
INFO:gt4sd.s3:syncing complete
15:11:55   syncing complete
INFO:gt4sd.s3:starting syncing
15:11:55   starting syncing
INFO:gt4sd.s3:syncing complete
15:11:55   syncing complete

{'cross_attention_dim'} was not found in config. Values will be initialized to default values.
{'set_alpha_to_one'} was not found in config. Values will be initialized to default values.

In [3]:

image = list(algorithm.sample(1))[0]

  0%|          | 0/50 [00:00<?, ?it/s]
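The cell above builds a full list from `sample` only to read its first element. A minimal sketch of an alternative follows, assuming `sample` returns an ordinary Python iterable (which the `list(...)` call suggests). `first_item` and `fake_sample` are hypothetical names used for illustration and are not part of GT4SD.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def first_item(items: Iterable[T]) -> T:
    """Return the first element of an iterable without building a list."""
    for item in items:
        return item
    raise ValueError("the iterable produced no items")


def fake_sample(number_of_items: int) -> Iterator[str]:
    """Stand-in for algorithm.sample so the sketch runs without any model."""
    for index in range(number_of_items):
        yield f"image-{index}"


first = first_item(fake_sample(1))
```

With a real algorithm instance the call would read `image = first_item(algorithm.sample(1))`. The explicit `ValueError` makes an empty result easier to diagnose than the `IndexError` raised by `list(...)[0]`.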

In [4]:

image

Out[4]:

In [ ]:

prompt = "Generative models on the moon!"

algorithm = ApplicationsRegistry.get_application_instance(
    target=None,
    algorithm_type='generation',
    domain='vision',
    algorithm_name='DiffusersGenerationAlgorithm',
    algorithm_application='LDMTextToImageGenerator',
    prompt=prompt
)


In [ ]:

image = list(algorithm.sample(1))[0]

In [ ]:

image

Out[ ]:

In [ ]:

prompt = "Generative models for scientific discovery!"

algorithm = ApplicationsRegistry.get_application_instance(
    target=None,
    algorithm_type='generation',
    domain='vision',
    algorithm_name='DiffusersGenerationAlgorithm',
    algorithm_application='LDMTextToImageGenerator',
    prompt=prompt
)


In [ ]:

image = list(algorithm.sample(1))[0]

In [ ]:

image

Out[ ]:
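The prompt cells above differ only in the prompt string. As a sketch, the repeated call can be wrapped in a small helper. `build_request` and `generate_images` are hypothetical names, not GT4SD functions, and the registry is passed in as an argument so the helper can be tried with a stand-in object before any model is downloaded. The keyword arguments are copied from the cells above.

```python
from typing import Any, Dict, Iterable, Iterator, List


def build_request(
    prompt: str, application: str = "LDMTextToImageGenerator"
) -> Dict[str, Any]:
    """Collect the keyword arguments used in the notebook cells."""
    return {
        "target": None,
        "algorithm_type": "generation",
        "domain": "vision",
        "algorithm_name": "DiffusersGenerationAlgorithm",
        "algorithm_application": application,
        "prompt": prompt,
    }


def generate_images(
    registry: Any,
    prompts: Iterable[str],
    application: str = "LDMTextToImageGenerator",
    samples_per_prompt: int = 1,
) -> Dict[str, List[Any]]:
    """Instantiate one algorithm per prompt and collect its samples."""
    images: Dict[str, List[Any]] = {}
    for prompt in prompts:
        algorithm = registry.get_application_instance(
            **build_request(prompt, application)
        )
        images[prompt] = list(algorithm.sample(samples_per_prompt))
    return images


class _StandInAlgorithm:
    """Mimics only the sample() method; yields strings instead of images."""

    def __init__(self, prompt: str) -> None:
        self.prompt = prompt

    def sample(self, number_of_items: int) -> Iterator[str]:
        for index in range(number_of_items):
            yield f"{self.prompt} #{index}"


class _StandInRegistry:
    """Mimics only get_application_instance(); downloads nothing."""

    @staticmethod
    def get_application_instance(**kwargs: Any) -> _StandInAlgorithm:
        return _StandInAlgorithm(kwargs["prompt"])


demo = generate_images(
    _StandInRegistry, ["on the moon", "on mars"], samples_per_prompt=2
)
```

In the notebook, the same helper would be called as `generate_images(ApplicationsRegistry, ["Generative models on the moon!", "Generative models for scientific discovery!"])`. Each prompt still triggers a full instantiation, so this removes the repetition in the code but not the runtime cost.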

Switching to a different diffusion model is as easy as changing one line of code: set algorithm_application to StableDiffusionGenerator. Here we use Stable Diffusion to generate images conditioned on text. To use this model, you must first accept its license and authenticate with Hugging Face.

In [ ]:

!huggingface-cli login
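Before instantiating the generator, it can help to check that a token is actually available. The sketch below only inspects environment variables and a token file. The variable names `HF_TOKEN` and `HUGGING_FACE_HUB_TOKEN` and the path `~/.cache/huggingface/token` are assumptions about where the Hugging Face tooling keeps credentials and may differ between versions. `find_hf_token` is a hypothetical helper, not a library function.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence

# Assumed locations; adjust them to match your installed huggingface_hub.
ASSUMED_ENV_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")
ASSUMED_TOKEN_FILES = (Path.home() / ".cache" / "huggingface" / "token",)


def find_hf_token(
    env: Optional[Mapping[str, str]] = None,
    token_files: Sequence[Path] = ASSUMED_TOKEN_FILES,
) -> Optional[str]:
    """Return the first non-empty token found, or None if there is none."""
    env = os.environ if env is None else env
    # Environment variables take priority over files in this sketch.
    for name in ASSUMED_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    for path in token_files:
        if path.is_file():
            value = path.read_text().strip()
            if value:
                return value
    return None
```

A cell such as `if find_hf_token() is None: print("Run huggingface-cli login first")` gives an early hint before the download starts. A found token does not prove that the license has been accepted, so the instantiation can still fail for that reason.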

In [5]:

prompt = "Generative models on mars!"

algorithm = ApplicationsRegistry.get_application_instance(
    target=None,
    algorithm_type='generation',
    domain='vision',
    algorithm_name='DiffusersGenerationAlgorithm',
    algorithm_application='StableDiffusionGenerator', # authenticate on huggingface
    prompt=prompt
)

INFO:gt4sd.algorithms.core:runnning DiffusersGenerationAlgorithm with configuration=StableDiffusionGenerator(algorithm_version='CompVis/stable-diffusion-v1-4', modality='token2image', model_type='stable_diffusion', scheduler_type='discrete')
14:16:40   runnning DiffusersGenerationAlgorithm with configuration=StableDiffusionGenerator(algorithm_version='CompVis/stable-diffusion-v1-4', modality='token2image', model_type='stable_diffusion', scheduler_type='discrete')
INFO:gt4sd.algorithms.generation.diffusion.core:ensure artifacts for the application are present.
14:16:40   ensure artifacts for the application are present.
INFO:gt4sd.s3:starting syncing
14:16:40   starting syncing
INFO:gt4sd.s3:syncing complete
14:16:41   syncing complete
INFO:gt4sd.s3:starting syncing
14:16:41   starting syncing
INFO:gt4sd.s3:syncing complete
14:16:41   syncing complete

Downloading:   0%|          | 0.00/14.9k [00:00<?, ?B/s]

ftfy or spacy is not installed using BERT BasicTokenizer instead of ftfy.

In [6]:

image = list(algorithm.sample(1))[0]

0it [00:00, ?it/s]

In [7]:

image

Out[7]:

In [ ]:

prompt = "Draw a logo for the Generative Toolkit for Scientific Discovery (GT4SD) project!"

algorithm = ApplicationsRegistry.get_application_instance(
    target=None,
    algorithm_type='generation',
    domain='vision',
    algorithm_name='DiffusersGenerationAlgorithm',
    algorithm_application='StableDiffusionGenerator', # authenticate on huggingface
    prompt=prompt
)


In [ ]:

image = list(algorithm.sample(1))[0]

In [ ]:

image

Out[ ]:
