Generate spatial audio from images (and optionally text)
Generate music from text descriptions
Generate 3D models and videos from images