Steps to reproduce the model in a legal dataset

#1
by wilfoderek - opened

Amazing work my friend!
Could you please share the necessary steps, or provide any documentation, that would enable us to replicate the experiment in the legal domain?
Thanks in advance my friends.

wilfoderek changed discussion title from Steps to reproduce in a legal dataset to Steps to reproduce the model in a legal dataset
WhereIsAI org

Many thanks for following our work.
Our secret is to use angle optimization. You can build upon our model and fine-tune it on your data using angle optimization.
We have provided a friendly training interface; prepare your data and train your model with a few lines of code. Refer to https://github.com/SeanLee97/AnglE#2-custom-train.
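As a rough illustration of the "prepare your data" step, here is a minimal sketch that turns sentence pairs into records for training. It assumes the pair format with `text1`, `text2`, and `label` fields; please check the linked README for the exact formats it accepts. The helper `to_pair_records` and the legal-domain example sentences are hypothetical, made up for illustration only.

```python
import json


def to_pair_records(rows):
    """Convert (text_a, text_b, score) tuples into text1/text2/label dicts.

    The field names are an assumption about the expected pair format;
    verify them against the AnglE README before training.
    """
    records = []
    for text_a, text_b, score in rows:
        text_a, text_b = text_a.strip(), text_b.strip()
        if not text_a or not text_b:
            continue  # skip pairs where one side is empty
        records.append({"text1": text_a, "text2": text_b, "label": float(score)})
    return records


# Hypothetical legal-domain pairs: label 1.0 = similar, 0.0 = unrelated.
legal_rows = [
    ("The lessee shall pay rent monthly.", "Rent is due from the tenant every month.", 1.0),
    ("The lessee shall pay rent monthly.", "The court dismissed the appeal.", 0.0),
    ("  ", "An empty pair that should be dropped.", 0.0),
]

records = to_pair_records(legal_rows)

# One JSON object per line, which is easy to load back as a dataset later.
jsonl = "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
print(jsonl)
```

Records like these can then be loaded as a dataset and passed to the training interface described in the README.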

SeanLee97 changed discussion status to closed

I would like to test it on Spanish text.
How can I achieve this? Any suggestions are welcome.

WhereIsAI org

Unfortunately, UAE was only fine-tuned on English datasets.

For Spanish, there is a semantic textual similarity dataset, SemEval-2015 Task 2. You could train on it using AnglE and evaluate the performance. xlm-roberta-large is a good choice for the backbone model.

To improve generalization, you should collect more training data.
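For the evaluation step, semantic textual similarity is commonly scored by the Spearman correlation between the cosine similarity of each sentence pair's embeddings and the gold similarity score. Below is a minimal, dependency-free sketch of that metric. The toy 2-d vectors stand in for real model embeddings and the gold scores are invented, so treat everything here as an illustrative assumption rather than our exact evaluation script.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


# Toy embeddings for three sentence pairs (placeholders for model outputs).
pair_embeddings = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical direction
    ([1.0, 0.0], [1.0, 1.0]),  # partially aligned
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal
]
gold_scores = [5.0, 3.0, 0.5]  # invented gold scores on a 0-5 scale

predicted = [cosine(u, v) for u, v in pair_embeddings]
score = spearman(predicted, gold_scores)
print(f"Spearman correlation: {score:.4f}")
```

With a real model, `predicted` would come from encoding both sentences of each Spanish test pair and comparing the two embeddings.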
