Model Card for DataScienceProject/Vit
This model classifies images as either 'real' or 'fake (AI-generated)' using a Vision Transformer (ViT).
Our goal is to classify the source of an image with at least 85% accuracy and at least 80% recall.
Model Description
This model leverages the Vision Transformer (ViT) architecture, which applies self-attention mechanisms to process images. The model classifies images into two categories: 'real' and 'fake (AI-generated)'. It captures intricate patterns and features that help distinguish between the two categories without the need for Convolutional Neural Networks (CNNs).
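A minimal sketch of how such a classifier can be instantiated, assuming the Hugging Face transformers library and the google/vit-base-patch16-224 base model listed at the bottom of this card (the label mapping shown is illustrative):

```python
# Minimal sketch: load a ViT backbone with a 2-class classification head.
from transformers import ViTForImageClassification

model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",
    num_labels=2,                      # 'real' vs. 'fake (AI-generated)'
    id2label={0: "real", 1: "fake"},   # illustrative label mapping
    label2id={"real": 0, "fake": 1},
    ignore_mismatched_sizes=True,      # replace the original 1000-class head
)
```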
Direct Use
This model can be used to classify images as 'real art' or 'fake art' based on visual features learned by the Vision Transformer.
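As a rough usage sketch (the checkpoint id DataScienceProject/Vit is taken from the model tree at the bottom of this card; point it to wherever the fine-tuned weights are stored):

```python
# Sketch: classify a single image with the fine-tuned model.
import torch
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTForImageClassification.from_pretrained("DataScienceProject/Vit")  # assumed checkpoint id
model.eval()

image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
predicted = logits.argmax(-1).item()
print(model.config.id2label.get(predicted, predicted))
```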
Out-of-Scope Use
The model may not perform well on images outside the scope of art or where the visual characteristics are drastically different from those in the training dataset.
Recommendations
Run the training code on a PC with an NVIDIA GPU at least as powerful as an RTX 3060 and a CPU with at least 6 cores, or use Google Colab.
How to Get Started with the Model
Prepare data: organize your images into class-labelled folders (as sketched below) and run the code.
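One possible layout, assuming one folder per class (the folder names below are illustrative, not part of the released dataset):

```python
# Sketch: one folder per class, loaded with torchvision's ImageFolder.
# Assumed layout (folder names are illustrative):
#   data/train/real/*.jpg
#   data/train/fake/*.jpg
from torchvision import datasets, transforms

train_dataset = datasets.ImageFolder("data/train", transform=transforms.ToTensor())
print(train_dataset.classes)  # class names are inferred from the folder names
```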
Model Architecture
Training Details
- Dataset: DataScienceProject/Art_Images_Ai_And_Real_
- Preprocessing: images are resized, converted to RGB, transformed into tensors, and stored in a custom torch Dataset.
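A minimal sketch of that preprocessing pipeline, wrapped in a custom torch Dataset (the 224x224 target size matches the ViT base model; paths and labels are assumptions):

```python
# Sketch of the preprocessing described above: resize, RGB conversion,
# tensor conversion, wrapped in a custom torch Dataset.
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


class ArtImageDataset(Dataset):
    """Hypothetical dataset: (image_path, label) pairs -> (tensor, label)."""

    def __init__(self, samples, image_size=224):
        self.samples = samples  # list of (path, label) tuples
        self.transform = transforms.Compose([
            transforms.Resize((image_size, image_size)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, label = self.samples[idx]
        image = Image.open(path).convert("RGB")  # force RGB
        return self.transform(image), torch.tensor(label)
```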
Training Hyperparameters
optimizer = optim.Adam(model.parameters(), lr=0.001)
num_epochs = 10
criterion = nn.CrossEntropyLoss()
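A minimal training-loop sketch built around these hyperparameters (the batch size and the train_dataset variable are assumptions carried over from the data-preparation sketches above):

```python
# Minimal training-loop sketch using the hyperparameters above.
# Assumptions: `train_dataset` comes from the preprocessing sketch earlier,
# and the batch size of 32 is illustrative (not stated in this card).
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from transformers import ViTForImageClassification

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224", num_labels=2, ignore_mismatched_sizes=True
).to(device)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

optimizer = optim.Adam(model.parameters(), lr=0.001)
num_epochs = 10
criterion = nn.CrossEntropyLoss()

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images).logits        # HF ViT returns an output object with .logits
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"epoch {epoch + 1}: mean loss = {running_loss / len(train_loader):.4f}")
```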
Evaluation
The model takes 15-20 minutes to run on our dataset with the following PC hardware: CPU: Intel i9-13900, RAM: 32 GB, GPU: NVIDIA RTX 3080. Your mileage may vary.
Testing Data, Factors & Metrics
- precision
- recall
- f1
- confusion_matrix
- accuracy
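A minimal sketch of computing these metrics with scikit-learn (the test_loader, model, and device names are assumptions carried over from the training sketch):

```python
# Sketch: compute the listed metrics with scikit-learn.
# Assumptions: `model`, `device`, and a `test_loader` DataLoader over the
# held-out test split come from the training sketch above.
import torch
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

all_preds, all_labels = [], []
model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        logits = model(images.to(device)).logits
        all_preds.extend(logits.argmax(dim=-1).cpu().tolist())
        all_labels.extend(labels.tolist())

print("accuracy :", accuracy_score(all_labels, all_preds))
print("precision:", precision_score(all_labels, all_preds))
print("recall   :", recall_score(all_labels, all_preds))
print("f1       :", f1_score(all_labels, all_preds))
print("confusion matrix:\n", confusion_matrix(all_labels, all_preds))
```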
Results
- test accuracy = 0.92
- precision = 0.893
- recall = 0.957
- f1 = 0.924
Summary
This model is by far the best of the approaches we tried (CNN, ResNet, CNN + ELA).
Model tree for DataScienceProject/Vit
Base model
google/vit-base-patch16-224