Get Image Embeddings
#2
by
grimpeaper23
- opened
Hi! Is it possible to get the image embeddings from the model? I don't need the decoder outputs, but I'd love to take the encoder representations alone. Apologies if this is already mentioned in the documentation and I just missed it!
Hello! There is no dedicated function to do this directly. However, you can retrieve the encoder representations very easily:
- I suggest you clone the repository directly: https://gitlab.teklia.com/dla/doc-ufcn
- Install the environment using
pip install -e .
- To keep only the encoder part, you can simply comment out or remove the last part of the forward method in the model.py file, so that it returns right after the fourth dilated block:
def forward(self, input_tensor):
    """
    Define the forward step of the network.
    It consists in 4 successive dilated blocks followed by 3
    convolutional blocks, a final convolution and a softmax layer.
    :param input_tensor: The input tensor.
    :return: The output tensor.
    """
    with autocast(enabled=self.amp):
        tensor = self.dilated_block1(input_tensor)
        out_block1 = tensor
        tensor = self.dilated_block2(self.pool(tensor))
        out_block2 = tensor
        tensor = self.dilated_block3(self.pool(tensor))
        out_block3 = tensor
        tensor = self.dilated_block4(self.pool(tensor))
        return tensor
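If you would rather not edit model.py, a PyTorch forward hook is another way to grab an intermediate output. The snippet below is only a minimal sketch of that technique on a toy network: `TinyNet`, its `decoder` layer, and the `capture_output` helper are hypothetical names made up for illustration, and only the attribute name `dilated_block4` is borrowed from the forward method above. How to reach the underlying torch module from the doc-ufcn wrapper is not covered here, so treat that part as something to check in the repository.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real network: one encoder block, one decoder."""

    def __init__(self):
        super().__init__()
        # Named after the last encoder block in the forward method above.
        self.dilated_block4 = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        # Hypothetical decoder layer, only here so there is something to skip.
        self.decoder = nn.Conv2d(8, 2, kernel_size=1)

    def forward(self, x):
        return self.decoder(self.dilated_block4(x))


def capture_output(model, layer_name, batch):
    """Run one forward pass and return the output of the named submodule."""
    captured = {}

    def hook(module, inputs, output):
        # Store a detached copy so the features carry no autograd history.
        captured["features"] = output.detach()

    handle = getattr(model, layer_name).register_forward_hook(hook)
    try:
        with torch.no_grad():
            model(batch)
    finally:
        # Always remove the hook so later forward passes are unaffected.
        handle.remove()
    return captured["features"]


net = TinyNet().eval()
features = capture_output(net, "dilated_block4", torch.zeros(1, 3, 32, 32))
print(features.shape)
```

The decoder still runs in this variant, so it costs a bit more compute than truncating the forward method, but the installed package stays untouched.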
You can then follow the example given in the documentation and apply the model as follows:
import cv2
from doc_ufcn import models
from doc_ufcn.main import DocUFCN

# IMAGE_PATH must be set to the path of your input image.
image = cv2.cvtColor(cv2.imread(IMAGE_PATH), cv2.COLOR_BGR2RGB)

# Download the pretrained model and its parameters, then load it on CPU.
model_path, parameters = models.download_model('generic-historical-line')
model = DocUFCN(len(parameters['classes']), parameters['input_size'], 'cpu')
model.load(model_path, parameters['mean'], parameters['std'], mode="eval")

# With the modified forward method, the raw output is the encoder representation.
_, raw_output, _, _ = model.predict(image, raw_output=True)
print(raw_output.shape)
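If you need a fixed-size embedding vector rather than a spatial feature map, one common option is to average over the spatial dimensions. This is a minimal sketch, not something the library provides: `global_average_pool` is a hypothetical helper, and it assumes the encoder output can be brought into a NumPy array of shape (channels, height, width), which you should confirm against the shape printed above.

```python
import numpy as np


def global_average_pool(feature_map):
    """Collapse a (channels, height, width) array into a (channels,) vector."""
    feature_map = np.asarray(feature_map, dtype=np.float32)
    if feature_map.ndim != 3:
        raise ValueError(
            f"expected a 3D (channels, height, width) array, got {feature_map.ndim}D"
        )
    # Average each channel over its height and width.
    return feature_map.mean(axis=(1, 2))


# Dummy feature map standing in for the encoder output (assumed layout).
dummy = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)
embedding = global_average_pool(dummy)
print(embedding.shape)
```

Averaging discards spatial layout, so it suits image-level similarity or retrieval better than tasks that need to know where something is on the page.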
Got it. I implemented a similar workaround. Thank you!
grimpeaper23 changed discussion status to closed