This is supercool!!
LLaVA-3D: adds 3D-awareness to LVMs without compromising 2D understanding capabilities.
Method: they developed a unified architecture that maps 2D CLIP patch features to their corresponding positions in 3D space, enabling joint 2D and 3D vision-language instruction tuning.
Project: https://zcmax.github.io/projects/LLaVA-3D/
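The core idea (2D patch features mapped to 3D positions) can be sketched as back-projecting each patch center into world space using its depth and the camera parameters. A minimal sketch with a pinhole camera model, assuming per-patch depth and known intrinsics/extrinsics; the function name and exact feature-fusion step are illustrative, not the project's actual code:

```python
import numpy as np

def unproject_patches(patch_uv, depth, K, cam_to_world):
    """Lift 2D patch centers to 3D world coordinates (pinhole model).

    patch_uv:     (N, 2) pixel coords of patch centers
    depth:        (N,)   depth at each patch center
    K:            (3, 3) camera intrinsics
    cam_to_world: (4, 4) camera-to-world extrinsics
    Returns (N, 3) 3D positions, aligned index-for-index with the
    2D patch features they would be paired with.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # back-project: pixel -> camera-frame ray, scaled by depth
    x = (patch_uv[:, 0] - cx) / fx * depth
    y = (patch_uv[:, 1] - cy) / fy * depth
    pts_cam = np.stack([x, y, depth, np.ones_like(depth)], axis=1)  # (N, 4)
    # camera frame -> world frame
    return (cam_to_world @ pts_cam.T).T[:, :3]

# toy example: 2 patch centers, identity extrinsics
uv = np.array([[320.0, 240.0], [100.0, 50.0]])
depth = np.array([2.0, 4.0])
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
pts = unproject_patches(uv, depth, K, np.eye(4))
# a patch at the principal point lands on the optical axis: (0, 0, depth)
```

The resulting 3D positions would then be encoded (e.g. as position embeddings) and combined with the 2D patch features, so the same visual tokens carry both image-plane and scene-level information.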