Juan Delgadillo

juandelgadillo

https://us.jdelgadillo.com/huggingface

juandelgadillo's activity

upvoted 4 papers 6 days ago

Instant Facial Gaussians Translator for Relightable and Interactable Facial Rendering

Paper • 2409.07441 • Published 9 days ago • 9

SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories

Paper • 2409.07440 • Published 9 days ago • 6

Agent Workflow Memory

Paper • 2409.07429 • Published 9 days ago • 25

Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models

Paper • 2409.07452 • Published 9 days ago • 18

upvoted 2 papers 12 days ago

From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents

Paper • 2409.03512 • Published 15 days ago • 25

GenAgent: Build Collaborative AI Systems with Automated Workflow Generation -- Case Studies on ComfyUI

Paper • 2409.01392 • Published 18 days ago • 9

upvoted a paper 19 days ago

Efficient Detection of Toxic Prompts in Large Language Models

Paper • 2408.11727 • Published 30 days ago • 11

upvoted 2 papers about 1 month ago

HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors

Paper • 2408.06019 • Published Aug 12 • 12

VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads

Paper • 2407.18245 • Published Jul 25 • 7

upvoted a paper about 2 months ago

IMAGDressing-v1: Customizable Virtual Dressing

Paper • 2407.12705 • Published Jul 17 • 12

upvoted 2 papers 2 months ago

Towards Building Specialized Generalist AI with System 1 and System 2 Fusion

Paper • 2407.08642 • Published Jul 11 • 9

Magic Insert: Style-Aware Drag-and-Drop

Paper • 2407.02489 • Published Jul 2 • 20

upvoted 8 papers 3 months ago

4K4DGen: Panoramic 4D Generation at 4K Resolution

Paper • 2406.13527 • Published Jun 19 • 7

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Paper • 2406.15319 • Published Jun 21 • 60

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Paper • 2406.11896 • Published Jun 14 • 18

The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

Paper • 2406.10601 • Published Jun 15 • 65

HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors

Paper • 2406.12459 • Published Jun 18 • 11

Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis

Paper • 2406.06216 • Published Jun 10 • 17

Depth Anything V2

Paper • 2406.09414 • Published Jun 13 • 91

Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion

Paper • 2406.04338 • Published Jun 6 • 34

upvoted a paper 4 months ago

FIFO-Diffusion: Generating Infinite Videos from Text without Training

Paper • 2405.11473 • Published May 19 • 53

upvoted 7 papers 5 months ago

AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation

Paper • 2404.12753 • Published Apr 19 • 41

AniClipart: Clipart Animation with Text-to-Video Priors

Paper • 2404.12347 • Published Apr 18 • 12

EdgeFusion: On-Device Text-to-Image Generation

Paper • 2404.11925 • Published Apr 18 • 21

MeshLRM: Large Reconstruction Model for High-Quality Mesh

Paper • 2404.12385 • Published Apr 18 • 25

Dynamic Typography: Bringing Words to Life

Paper • 2404.11614 • Published Apr 17 • 41

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing

Paper • 2404.05717 • Published Apr 8 • 24

ByteEdit: Boost, Comply and Accelerate Generative Image Editing

Paper • 2404.04860 • Published Apr 7 • 24

upvoted 16 papers 6 months ago

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Paper • 2404.03653 • Published Apr 4 • 32

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation

Paper • 2404.02733 • Published Apr 3 • 20

ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion

Paper • 2403.18818 • Published Mar 27 • 24

Garment3DGen: 3D Garment Stylization and Texture Generation

Paper • 2403.18816 • Published Mar 27 • 20

LLM Agent Operating System

Paper • 2403.16971 • Published Mar 25 • 64

Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation

Paper • 2403.16990 • Published Mar 25 • 24

FlashFace: Human Image Personalization with High-fidelity Identity Preservation

Paper • 2403.17008 • Published Mar 25 • 18

SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions

Paper • 2403.16627 • Published Mar 25 • 20

MusicHiFi: Fast High-Fidelity Stereo Vocoding

Paper • 2403.10493 • Published Mar 15 • 16

DragAnything: Motion Control for Anything using Entity Representation

Paper • 2403.07420 • Published Mar 12 • 12

Pix2Gif: Motion-Guided Diffusion for GIF Generation

Paper • 2403.04634 • Published Mar 7 • 14

StableDrag: Stable Dragging for Point-based Image Editing

Paper • 2403.04437 • Published Mar 7 • 25

Design2Code: How Far Are We From Automating Front-End Engineering?

Paper • 2403.03163 • Published Mar 5 • 93

AtomoVideo: High Fidelity Image-to-Video Generation

Paper • 2403.01800 • Published Mar 4 • 20

VisionLLaMA: A Unified LLaMA Interface for Vision Tasks

Paper • 2403.00522 • Published Mar 1 • 44

OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Paper • 2403.01779 • Published Mar 4 • 26

upvoted 14 papers 7 months ago

MOSAIC: A Modular System for Assistive and Interactive Cooking

Paper • 2402.18796 • Published Feb 29 • 23

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

Paper • 2402.16840 • Published Feb 26 • 23

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 590

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Paper • 2402.17485 • Published Feb 27 • 185

Music Style Transfer with Time-Varying Inversion of Diffusion Models

Paper • 2402.13763 • Published Feb 21 • 9

Aria Everyday Activities Dataset

Paper • 2402.13349 • Published Feb 20 • 28

YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Paper • 2402.13616 • Published Feb 21 • 45

How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts

Paper • 2402.13220 • Published Feb 20 • 12

Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability

Paper • 2402.12225 • Published Feb 19 • 5

Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots

Paper • 2402.10329 • Published Feb 15 • 13

EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7 • 19

LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation

Paper • 2402.05054 • Published Feb 7 • 25

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation

Paper • 2402.04324 • Published Feb 6 • 23

Training-Free Consistent Text-to-Image Generation

Paper • 2402.03286 • Published Feb 5 • 64

upvoted 2 papers 8 months ago

Anything in Any Scene: Photorealistic Video Object Insertion

Paper • 2401.17509 • Published Jan 30 • 16

Repositioning the Subject within Image

Paper • 2401.16861 • Published Jan 30 • 13