---
license: mit
datasets:
- deepghs/anime_classification
metrics:
- accuracy
pipeline_tag: image-classification
tags:
- art
---

This model predicts the type of an anime image, choosing among the following four categories:
- 3D: Images rendered in 3D, such as those from MikuMikuDance or Koikatsu.
- Bangumi: Screenshots from anime videos.
- Comic: Images of manga that contain a significant amount of text or panel sequences.
- Illustration: General anime illustrations.
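
As a rough illustration of how a four-way prediction maps onto these categories, the sketch below applies a softmax to raw scores and picks the most probable label. The label order, the `classify_from_logits` helper, and the example logits are assumptions made for illustration, not the model's actual output interface.

```python
import math

# Label names follow the four categories above; their order is an assumption.
LABELS = ["3d", "bangumi", "comic", "illustration"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_from_logits(logits):
    """Return (label, probability) for the highest-scoring category."""
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs[best]


# Hypothetical logits for a single image.
label, prob = classify_from_logits([0.2, 3.1, -1.0, 1.4])
```

Returning the probability alongside the label lets a caller reject low-confidence predictions instead of always accepting the top class.
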

| Model | FLOPs | Accuracy | Confusion Matrix | Description |
|---|---|---|---|---|
| caformer_s36 | 22.10G | 88.19% | Confusion Matrix | Model: caformer_s36 from timm |
| caformer_s36_plus | 22.10G | 93.47% | Confusion Matrix | Model: caformer_s36.sail_in22k_ft_in1k_384 (pretrained) from timm |
| mobilenetv3 | 0.63G | 88.96% | Confusion Matrix | Model: mobilenetv3_large_100 from timm |
| mobilenetv3_dist | 0.63G | 91.98% | Confusion Matrix | Distilled from caformer_s36_plus, using mobilenetv3_large_100 with focal loss |
| mobilenetv3_sce | 0.63G | 89.92% | Confusion Matrix | Model: mobilenetv3_large_100 from timm, using SCELoss as the loss function |
| mobilenetv3_sce_dist | 0.63G | 92.35% | Confusion Matrix | Distilled from caformer_s36_plus, using mobilenetv3_large_100 with SCELoss |
| mobilevitv2_150 | 9.09G | 88.21% | Confusion Matrix | Model: mobilevitv2_150 from timm |
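
Several rows above mention focal loss. As a minimal sketch of what that term usually means, the function below computes the standard focal loss for one sample, which down-weights examples the model already classifies confidently. The `gamma` value and the example probabilities are assumptions; the table does not state the hyperparameters used in training.

```python
import math


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t) ** gamma * log(p_t).

    probs  -- predicted class probabilities (should sum to 1)
    target -- index of the true class
    gamma  -- focusing parameter; 2.0 is a common default, assumed here
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical predictions over (3d, bangumi, comic, illustration).
easy = focal_loss([0.05, 0.90, 0.02, 0.03], target=1)  # confident and correct
hard = focal_loss([0.40, 0.30, 0.10, 0.20], target=1)  # uncertain
```

With `gamma=0` the expression reduces to ordinary cross-entropy, so the parameter controls how strongly easy examples are discounted.
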

| Name | FLOPs | Params | Accuracy | AUC | Confusion | Labels |
|---|---|---|---|---|---|---|
| caformer_s36 | 22.10G | 37.22M | 88.19% | N/A | confusion | 3d, bangumi, comic, illustration |
| caformer_s36_plus | 22.10G | 37.22M | 93.47% | 0.9891 | confusion | 3d, bangumi, comic, illustration |
| caformer_s36_v1.1_focal | 22.10G | 37.22M | 95.99% | 0.9967 | confusion | 3d, bangumi, comic, illustration, not_painting |
| caformer_s36_v1.2_focal | 22.10G | 37.22M | 97.23% | 0.9982 | confusion | 3d, bangumi, comic, illustration, not_painting |
| caformer_s36_v1.3_focal | 22.10G | 37.22M | 97.16% | 0.9982 | confusion | 3d, bangumi, comic, illustration, not_painting |
| caformer_s36_v1.4_focal | 22.10G | 37.22M | 95.82% | 0.9967 | confusion | 3d, bangumi, comic, illustration, not_painting |
| caformer_s36_v1 | 22.10G | 37.22M | 94.72% | 0.9934 | confusion | 3d, bangumi, comic, illustration, not_painting |
| mobilenetv3 | 0.63G | 4.18M | 88.96% | N/A | confusion | 3d, bangumi, comic, illustration |
| mobilenetv3_dist | 0.63G | 4.18M | 91.98% | 0.9879 | confusion | 3d, bangumi, comic, illustration |
| mobilenetv3_sce | 0.63G | 4.18M | 89.92% | 0.9786 | confusion | 3d, bangumi, comic, illustration |
| mobilenetv3_sce_dist | 0.63G | 4.18M | 92.35% | 0.9854 | confusion | 3d, bangumi, comic, illustration |
| mobilenetv3_v1.2_dist | 0.63G | 4.18M | 96.53% | 0.9972 | confusion | 3d, bangumi, comic, illustration, not_painting |
| mobilenetv3_v1.3_dist | 0.63G | 4.18M | 96.41% | 0.9973 | confusion | 3d, bangumi, comic, illustration, not_painting |
| mobilenetv3_v1.4_dist | 0.63G | 4.18M | 94.77% | 0.9950 | confusion | 3d, bangumi, comic, illustration, not_painting |
| mobilenetv3_v1_dist | 0.63G | 4.18M | 94.04% | 0.9928 | confusion | 3d, bangumi, comic, illustration, not_painting |
| mobilevitv2_150 | 9.09G | 9.79M | 88.21% | N/A | confusion | 3d, bangumi, comic, illustration |
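
One way to read the table above is as a trade-off between compute and accuracy. The sketch below picks the most accurate model that fits a FLOPs budget, using the figures from the table. The `best_under_budget` helper is hypothetical, and the rows that include `not_painting` use a five-label set, so their accuracy may not be directly comparable to the four-label rows.

```python
# (name, GFLOPs, accuracy in percent), copied from the table above.
MODELS = [
    ("caformer_s36", 22.10, 88.19),
    ("caformer_s36_plus", 22.10, 93.47),
    ("caformer_s36_v1.1_focal", 22.10, 95.99),
    ("caformer_s36_v1.2_focal", 22.10, 97.23),
    ("caformer_s36_v1.3_focal", 22.10, 97.16),
    ("caformer_s36_v1.4_focal", 22.10, 95.82),
    ("caformer_s36_v1", 22.10, 94.72),
    ("mobilenetv3", 0.63, 88.96),
    ("mobilenetv3_dist", 0.63, 91.98),
    ("mobilenetv3_sce", 0.63, 89.92),
    ("mobilenetv3_sce_dist", 0.63, 92.35),
    ("mobilenetv3_v1.2_dist", 0.63, 96.53),
    ("mobilenetv3_v1.3_dist", 0.63, 96.41),
    ("mobilenetv3_v1.4_dist", 0.63, 94.77),
    ("mobilenetv3_v1_dist", 0.63, 94.04),
    ("mobilevitv2_150", 9.09, 88.21),
]


def best_under_budget(max_gflops):
    """Return the (name, GFLOPs, accuracy) row with the highest accuracy
    among models whose cost does not exceed max_gflops, or None."""
    candidates = [m for m in MODELS if m[1] <= max_gflops]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m[2])


small = best_under_budget(1.0)   # lightweight deployment
large = best_under_budget(25.0)  # no practical compute limit
```
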