Shap-E: A Generative AI Model for 3D Assets

OpenAI has just released Shap-E, a generative model that creates 3D assets from text prompts and images, capable of producing both textured meshes and neural radiance fields for a variety of 3D outputs.

In this tutorial, we'll walk you through setting up Shap-E on Google Colab (free) and running code to generate 3D objects from text prompts and images. Thanks to Google Colab, you don't need a powerful GPU of your own, because we will use the one provided by Google.

The code we'll run is taken from the openai/shap-e repository on GitHub.

1. Quick demo

In this short demo, we will install and run Shap-E on Google Colab.

2. Set up Shap-E on Google Colab

Open Google Colab.

Click File > New Notebook to create a new Colab notebook.

3. Enable GPU on Google Colab

Next, we need to enable a graphics processing unit (GPU) in our Colab notebook. A GPU is usually required for resource-intensive tasks such as deep learning.

To enable GPU in Google Colab, follow these steps:

  • Open the new Colab notebook you just created.

  • Click the "Runtime" menu in the top toolbar.

  • Select "Change Runtime Type" from the drop-down menu.

  • In the "Runtime Type" dialog, select "GPU" from "Hardware Accelerator"

  • Click "Save" to apply the changes.
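To confirm the change took effect, you can run a quick check in a new cell. This is just an optional sanity check: torch comes preinstalled on Colab, and nvidia-smi is available on GPU runtimes.

!nvidia-smi  # shows the GPU that Colab assigned to you

import torch
print(torch.cuda.is_available())  # should print True on a GPU runtime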

4. Install Shap-E

In Google Colab, we need to first clone the Shap-E repository from GitHub and then install the required packages. To do this, follow these steps:

Step 1. In the first cell of the Colab notebook, paste the following code:

!git clone https://github.com/openai/shap-e.git

This command clones the Shap-E repository from GitHub into your Colab environment, downloading the code, examples, and files required to use Shap-E.

Run the cell by clicking the play button or pressing Shift+Enter.

Step 2. In a new cell, paste the following code:

%cd shap-e

This command changes the current working directory to the shap-e folder, which is where we cloned the Shap-E repository in the previous step. We need to install the required packages in this folder.

Run the cell by clicking the play button or pressing Shift+Enter.

Step 3. In another new cell, paste the following code:

!pip install -e .

This command will install the packages required by Shap-E in your Colab environment. The -e flag installs the package in "editable" mode, meaning that any changes made to the package files will be reflected in the installed package without requiring a reinstall.

Run the cell to complete the installation.

Now that the Shap-E repository has been cloned and the required packages are installed, you can proceed to generate 3D objects with the code in the following sections.
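If you want a quick sanity check that the editable install worked, importing the package in a new cell is enough (nothing is downloaded at this point; the printed path should point inside the cloned shap-e folder):

import shap_e
print(shap_e.__file__)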

5. Use Shap-E to generate 3D objects from text

To generate 3D objects from text prompts, follow these steps:

Step 1. In a new cell in the Colab notebook, paste the following code:

import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, gif_widget

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
xm = load_model('transmitter', device=device)
model = load_model('text300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))

batch_size = 4
guidance_scale = 15.0
prompt = "a shark"

latents = sample_latents(
    batch_size=batch_size,
    model=model,
    diffusion=diffusion,
    guidance_scale=guidance_scale,
    model_kwargs=dict(texts=[prompt] * batch_size),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)

render_mode = 'nerf'  # you can change this to 'stf'
size = 64  # this is the size of the renders; higher values take longer to render.

cameras = create_pan_cameras(size, device)
for i, latent in enumerate(latents):
    images = decode_latent_images(xm, latent, cameras, rendering_mode=render_mode)
    display(gif_widget(images))

This code sets up the necessary imports, loads the Shap-E models, and configures generation parameters such as the text prompt and rendering options. The prompt in this example is "a shark", but you can change it to whatever object you want to generate.

Step 2. Run the cell to generate 3D objects from the text prompt. The output is displayed as animated GIFs showing each resulting 3D object from different angles.

You can experiment with different text prompts and rendering options by changing the prompt, render_mode, and size variables in your code.
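If you also want to keep the turntable animations as files rather than only viewing them inline, the frames returned by decode_latent_images can be written out with Pillow. This is a minimal sketch, assuming the frames are PIL images (which is how gif_widget consumes them); it reuses xm, latents, cameras, and render_mode from the cell above:

for i, latent in enumerate(latents):
    images = decode_latent_images(xm, latent, cameras, rendering_mode=render_mode)
    # Save the frames as an animated GIF next to the notebook.
    images[0].save(f'render_{i}.gif', save_all=True, append_images=images[1:], duration=100, loop=0)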

6. Save the generated 3D object as a mesh

To save the resulting 3D object as a mesh file (PLY format), follow these steps:

Step 1. In a new cell, paste the following code:

from shap_e.util.notebooks import decode_latent_mesh

for i, latent in enumerate(latents):
    with open(f'example_mesh_{i}.ply', 'wb') as f:
        decode_latent_mesh(xm, latent).tri_mesh().write_ply(f)

Step 2. Run the cell to save the generated 3D objects as PLY files. These files will be saved in the shap-e folder in your Colab environment.

With a batch size of 4, you will get four files, named example_mesh_0.ply through example_mesh_3.ply.
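If you prefer OBJ files, the same decoded mesh can be written out in that format too. This is a small variation on the cell above, assuming the tri_mesh() object exposes a write_obj method alongside write_ply (as it does in the upstream example notebooks):

from shap_e.util.notebooks import decode_latent_mesh

for i, latent in enumerate(latents):
    mesh = decode_latent_mesh(xm, latent).tri_mesh()
    with open(f'example_mesh_{i}.obj', 'w') as f:  # note: text mode for OBJ
        mesh.write_obj(f)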

Step 3. To download a generated PLY file to your local computer, click the folder icon in the left sidebar of Colab, navigate to the shap-e folder, right-click the PLY file you want, and select "Download" to save it to your local computer.

Now you can use these generated 3D objects in any 3D modeling software that supports PLY files.
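For a quick programmatic check inside Colab itself, you can also open one of the meshes with a library such as trimesh. This is an optional sketch; trimesh is not installed by default, so the first line adds it with pip:

!pip install -q trimesh

import trimesh
mesh = trimesh.load('example_mesh_0.ply')
print(mesh.vertices.shape, mesh.faces.shape)  # number of vertices and faces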

7. Use Shap-E to generate 3D objects from images

You can also use Shap-E to generate 3D objects from images.

To do this, we will use the sample corgi image provided with the repository's examples.

First, download the image and upload it to your Colab environment so that it matches the path used in the code below (example_data/corgi.png, relative to the shap-e folder): create an example_data folder inside shap-e and upload corgi.png there, or adjust the path passed to load_image.

To upload a file, hover over a directory in the file browser on the left and you'll see a three-dot menu. Click it, then click "Upload" and select corgi.png.
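Alternatively, the repository already ships this sample image, so you can copy it into place instead of uploading it manually. This assumes the file lives at shap_e/examples/example_data/corgi.png, which is where current versions of the repo keep it:

import os, shutil

os.makedirs("example_data", exist_ok=True)  # the folder the code below expects
shutil.copy("shap_e/examples/example_data/corgi.png", "example_data/corgi.png")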

Next, assuming you have GPU enabled and Shap-E installed, run the following code:

import torch

from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, gif_widget
from shap_e.util.image_util import load_image

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

xm = load_model('transmitter', device=device)
model = load_model('image300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))

batch_size = 4
guidance_scale = 3.0

image = load_image("example_data/corgi.png")

latents = sample_latents(
    batch_size=batch_size,
    model=model,
    diffusion=diffusion,
    guidance_scale=guidance_scale,
    model_kwargs=dict(images=[image] * batch_size),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)

render_mode = 'nerf' # you can change this to 'stf' for mesh rendering
size = 64 # this is the size of the renders; higher values take longer to render.

cameras = create_pan_cameras(size, device)
for i, latent in enumerate(latents):
    images = decode_latent_images(xm, latent, cameras, rendering_mode=render_mode)
    display(gif_widget(images))

The results don't look that great, but with some tweaking, or with other input images, you may get better output.
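If you want to experiment, the easiest knobs to turn are the input image itself, guidance_scale, and karras_steps. As a rough sketch (these values are guesses, not tuned settings), you could re-run the sampling step with more steps and your own uploaded image; my_image.png here is a hypothetical file name:

image = load_image("example_data/my_image.png")  # any image you have uploaded

latents = sample_latents(
    batch_size=batch_size,
    model=model,
    diffusion=diffusion,
    guidance_scale=4.0,  # slightly above the 3.0 used earlier, for stronger image adherence
    model_kwargs=dict(images=[image] * batch_size),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=128,  # doubled from 64; slower, but sometimes cleaner geometry
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)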

8. Conclusion

OpenAI's Shap-E is a powerful tool that enables users to generate 3D objects from text and images.

By leveraging Google Colab, you can easily set up and run Shap-E without any complicated installation or powerful hardware.


Original link: Shap-E hands-on 3D model generation — BimAnt
