donut running on runpod

#6
by ilovepie - opened

Hello, I am running this model successfully on my local machine. My server does not have a GPU and the CPU is too slow, so I want to try RunPod, but I have no idea how to use it. I created a serverless RunPod endpoint selecting the Donut model. I tried the OpenAI-compatible interface but I am not sure about the parameters.

import base64

from openai import OpenAI

client = OpenAI(
    base_url=f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1",
    api_key=api_key,
)

Encode your image to base64

def encode_image_to_base64(image_path):
    """Encodes an image file to base64."""
    try:
        with open(image_path, "rb") as image_file:
            encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
        return encoded_string
    except Exception as e:
        raise ValueError(f"Error encoding image: {str(e)}")

Replace with your image path

image_path = "./img_cuts/img_f10.png"
base64_image = encode_image_to_base64(image_path)

Create the payload

payload = {
    "model": "jinhybr/OCR-Donut-CORD",  # Specify the Donut model
    "messages": [
        {"role": "system", "content": "receipt"},  # Classification prompt
        {"role": "user", "content": "what is the text?"},  # VQA prompt
        {"role": "user", "content": base64_image}  # Base64-encoded image
    ]
}
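As a side note, OpenAI-compatible vision endpoints usually expect the image inside the same user message, as an `image_url` content part carrying a data URI, rather than as a separate message with raw base64. Whether the RunPod worker serving Donut accepts this format depends on the worker, so this is a hedged sketch, not a confirmed schema:

```python
def build_vision_payload(model, question, image_b64):
    """Builds a chat payload with the image as an image_url content part
    (the standard OpenAI vision message shape; worker support is assumed)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
                ],
            }
        ],
    }

# payload = build_vision_payload("jinhybr/OCR-Donut-CORD",
#                                "what is the text?", base64_image)
```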

Send the request

try:
    chat_completion = client.chat.completions.create(**payload)
    print(chat_completion)
except Exception as e:
    print(f"Error sending request: {str(e)}")

Is this the correct way? Is there another way, maybe just a POST request with JSON? Could you please advise?
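On the POST question: RunPod serverless endpoints can also be called directly via their `/runsync` (synchronous) or `/run` (asynchronous) routes with a JSON body of the form `{"input": {...}}`. The keys inside `"input"` below (`"image"`, `"prompt"`) are assumptions; the real schema is defined by whatever handler your worker runs, so check your worker's documentation. A minimal stdlib-only sketch:

```python
import base64
import json
import urllib.request


def build_runsync_request(endpoint_id, api_key, image_b64, prompt):
    """Builds the URL, headers, and JSON body for a RunPod /runsync call.
    The "input" keys ("image", "prompt") are handler-specific assumptions."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {"input": {"image": image_b64, "prompt": prompt}}
    return url, headers, body


def run_donut_sync(endpoint_id, api_key, image_path, prompt="<s_cord-v2>"):
    """Reads and base64-encodes the image, then POSTs it synchronously.
    The default prompt is an assumption about the worker's expected task token."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    url, headers, body = build_runsync_request(endpoint_id, api_key, image_b64, prompt)
    req = urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)

# Usage (with your endpoint_id and api_key):
# result = run_donut_sync(endpoint_id, api_key, "./img_cuts/img_f10.png")
# print(result)
```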
