Model Card for Model ID
This model is intended as a baseline for using Donut on RVL-CDIP. It was trained on a small subset of RVL-CDIP (specifically, 100 images from the dataset).
Model Details
The model uses Donut, loaded through the VisionEncoderDecoder architecture in Transformers, as the backbone for an end-to-end document classification task.
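As a minimal inference sketch, assuming this checkpoint follows the same conventions as Naver's RVL-CDIP Donut checkpoint (a `<s_rvlcdip>` task prompt and an output of the form `<s_class><label/></s_class>`), classification could look like the code below. The `model_id` argument, the function names, and the task prompt are assumptions and are not confirmed by this card. `extract_class` is a simplified stand-in for the processor's own `token2json`.

```python
import re

# Assumed task prompt, borrowed from Naver's RVL-CDIP checkpoint.
# This card does not state which prompt the model was trained with.
TASK_PROMPT = "<s_rvlcdip>"


def extract_class(sequence):
    """Return the label from '<s_class><label/></s_class>', or None if absent."""
    match = re.search(r"<s_class>\s*<([^<>/]+)/>\s*</s_class>", sequence)
    return match.group(1).strip() if match else None


def classify(image, model_id):
    """Classify a PIL image. Needs torch and transformers, imported lazily."""
    import torch
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    # Encode the page image and the task prompt that starts decoding.
    pixel_values = processor(image.convert("RGB"), return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        TASK_PROMPT, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
        )

    # Decode with special tokens kept, since the label itself is a special token.
    sequence = processor.batch_decode(outputs)[0]
    return extract_class(sequence)
```

Because the model was trained on only 100 images, predictions from a call such as `classify(Image.open("page.png"), model_id)` should be treated as a baseline, not as a production-quality result.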
Downstream Use
This model can be used as a starting point for fine-tuning on document classification tasks in other domains, such as food documents or financial documents. For further downstream fine-tuning, please refer to the original model from Naver.
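When fine-tuning on a new label set, Donut-style classification is typically trained to generate a short tagged sequence instead of a class index. The sketch below builds such targets. It assumes the `<s_class>...</s_class>` format used by the original RVL-CDIP setup; the label names and helper functions are hypothetical placeholders, not part of this model.

```python
# Hypothetical label set for a new domain; replace with your own classes.
LABELS = ["invoice", "receipt", "menu"]


def label_token(label):
    """Categorical special token for one class, e.g. '<invoice/>'."""
    return f"<{label}/>"


def build_target(label, eos_token="</s>"):
    """Target sequence the decoder would be trained to generate for one document."""
    return f"<s_class>{label_token(label)}</s_class>{eos_token}"


def new_special_tokens(labels):
    """Tokens to register with the tokenizer before fine-tuning."""
    return ["<s_class>", "</s_class>"] + [label_token(name) for name in labels]


if __name__ == "__main__":
    for name in LABELS:
        print(build_target(name))
```

The tokens returned by `new_special_tokens` would typically be registered with `processor.tokenizer.add_tokens(...)`, followed by `model.decoder.resize_token_embeddings(len(processor.tokenizer))`, so that each class maps to a single decoder token.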