
Vision Transformer (ViT) for Music Genre Classification

Model Overview

Model Name: ghermoso/vit-eGTZANplus
Task: Image Classification
Dataset: egtzan_plus
Model Architecture: Vision Transformer (ViT)
Fine-tuned from: google/vit-base-patch16-224-in21k, trained further on the egtzan_plus dataset.

It achieves the following results on the evaluation set:

Loss: 0.8358
Accuracy: 0.7460
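
As a minimal sketch of how the checkpoint could be queried, the snippet below loads it through the transformers image-classification pipeline and picks the highest-scoring label. It assumes the model expects an image rendering of an audio clip (for example a spectrogram), which this card does not state. The file name spectrogram.png and the helper names load_classifier and top_prediction are hypothetical and used only for illustration.

```python
from typing import Dict, List


def load_classifier(model_id: str = "ghermoso/vit-eGTZANplus"):
    """Build an image-classification pipeline for the fine-tuned checkpoint."""
    # Imported lazily so top_prediction() works without transformers installed.
    from transformers import pipeline

    return pipeline("image-classification", model=model_id)


def top_prediction(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring entry from a pipeline result.

    The pipeline returns a list of {"label": str, "score": float} dicts.
    """
    if not predictions:
        raise ValueError("no predictions to rank")
    return max(predictions, key=lambda p: p["score"])


if __name__ == "__main__":
    classifier = load_classifier()
    # Hypothetical input: an image file derived from an audio clip.
    results = classifier("spectrogram.png")
    print(top_prediction(results))
```

Running the script downloads the checkpoint on first use, so it needs network access plus the transformers, torch, and Pillow packages.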



