license: apache-2.0
pipeline_tag: mask-generation

EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Pretrained Models

Latency is measured on an NVIDIA Jetson AGX Orin and throughput on an NVIDIA A100 GPU, both with TensorRT in fp16. Data transfer time is included.

| Model | Resolution | COCO mAP | LVIS mAP | Params | MACs | Jetson Orin Latency (bs1) | A100 Throughput (bs16) | Checkpoint |
|---|---|---|---|---|---|---|---|---|
| EfficientViT-SAM-L0 | 512x512 | 45.7 | 41.8 | 34.8M | 35G | 8.2ms | 762 images/s | link |
| EfficientViT-SAM-L1 | 512x512 | 46.2 | 42.1 | 47.7M | 49G | 10.2ms | 638 images/s | link |
| EfficientViT-SAM-L2 | 512x512 | 46.6 | 42.7 | 61.3M | 69G | 12.9ms | 538 images/s | link |
| EfficientViT-SAM-XL0 | 1024x1024 | 47.5 | 43.9 | 117.0M | 185G | 22.5ms | 278 images/s | link |
| EfficientViT-SAM-XL1 | 1024x1024 | 47.8 | 44.4 | 203.3M | 322G | 37.2ms | 182 images/s | link |

Table 1: Summary of all EfficientViT-SAM variants. COCO mAP and LVIS mAP are measured using ViTDet's predicted bounding boxes as the prompt. End-to-end Jetson Orin latency and A100 throughput are measured with TensorRT and fp16.
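The table describes an accuracy/latency trade-off, so one way to read it is to pick the most accurate variant that fits a latency budget. The sketch below is illustrative only: it hard-codes the COCO mAP and Jetson Orin latency figures from the table, and the `Variant` class and `best_under_budget` helper are hypothetical names, not part of the EfficientViT package.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    coco_map: float          # COCO mAP from the table
    orin_latency_ms: float   # Jetson Orin latency at batch size 1, from the table


# Figures copied from the table above.
VARIANTS = [
    Variant("EfficientViT-SAM-L0", 45.7, 8.2),
    Variant("EfficientViT-SAM-L1", 46.2, 10.2),
    Variant("EfficientViT-SAM-L2", 46.6, 12.9),
    Variant("EfficientViT-SAM-XL0", 47.5, 22.5),
    Variant("EfficientViT-SAM-XL1", 47.8, 37.2),
]


def best_under_budget(budget_ms: float) -> Optional[Variant]:
    """Return the highest-mAP variant whose Orin latency fits the budget, or None."""
    candidates = [v for v in VARIANTS if v.orin_latency_ms <= budget_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v.coco_map)


choice = best_under_budget(15.0)
print(choice.name if choice else "no variant fits")  # prints "EfficientViT-SAM-L2"
```

The latencies apply only to the measured setup (Jetson AGX Orin, TensorRT, fp16), so budgets for other hardware would need their own measurements.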

Usage

# Segment anything: build the model, then wrap it for prompted or automatic use.
from efficientvit.sam_model_zoo import create_sam_model
from efficientvit.models.efficientvit.sam import (
    EfficientViTSamAutomaticMaskGenerator,
    EfficientViTSamPredictor,
)

# Load the XL1 variant from a local checkpoint, move it to the GPU, and switch to eval mode.
efficientvit_sam = create_sam_model(
    name="xl1", weight_url="assets/checkpoints/sam/xl1.pt",
)
efficientvit_sam = efficientvit_sam.cuda().eval()

# Predictor for prompt-based segmentation.
efficientvit_sam_predictor = EfficientViTSamPredictor(efficientvit_sam)

# Generator that segments a whole image without prompts.
efficientvit_mask_generator = EfficientViTSamAutomaticMaskGenerator(efficientvit_sam)
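
Once a mask generator exists, its output usually needs filtering before use. The sketch below assumes the generator returns records in the same format as the original Segment Anything automatic mask generator: a list of dicts with keys such as `area` and `predicted_iou`. That format, and the commented `generate(image)` call, are assumptions to check against the EfficientViT source. `select_masks` is a hypothetical helper, and the records here are stand-ins so the snippet runs without a GPU or checkpoint.

```python
def select_masks(records, min_predicted_iou=0.9, top_k=5):
    """Keep masks at or above the IoU threshold and return the largest ones first."""
    confident = [r for r in records if r["predicted_iou"] >= min_predicted_iou]
    return sorted(confident, key=lambda r: r["area"], reverse=True)[:top_k]


# Assumed real usage (not executed here; image would be an HxWx3 uint8 RGB array):
#   records = efficientvit_mask_generator.generate(image)

# Stand-in records carrying only the two fields this helper reads.
fake_records = [
    {"area": 1200, "predicted_iou": 0.95},
    {"area": 300, "predicted_iou": 0.99},
    {"area": 5000, "predicted_iou": 0.62},
    {"area": 2500, "predicted_iou": 0.91},
]

selected = select_masks(fake_records, min_predicted_iou=0.9, top_k=2)
print([r["area"] for r in selected])  # prints [2500, 1200]
```

Sorting by area after thresholding on predicted IoU is only one reasonable policy. Ranking by score instead would need just a different sort key.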