Generate stable toy brick structures from text prompts.
Create a video from an image with camera motion.