From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities Paper • 2401.15071 • Published Jan 26, 2024 • 34
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection Paper • 2311.10122 • Published Nov 16, 2023 • 26