AmelieSchreiber committed on
Commit
66fb9b9
1 Parent(s): b73ac0e

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -45,7 +45,7 @@ This model was trained on approximately 70,000 proteins with binding site and ac
  The training split was a random 85/15 split for this version, and does not consider anything in the way of family or sequence
  similarity. New iterations of the model have been trained on larger datasets (over 200,000 proteins), with the split such that
  there are no overlapping families, however they seem to overfit much earlier and have significantly worse performance in terms
- of the training metrics (precision, recall, and F1).
+ of the training metrics (precision, recall, and F1). To address this we plan to implement LoRA (and hopefully QLoRA).
 
  Training Metrics for the Model in the form of the `trainer_state.json` can be
  [found here](https://huggingface.co/AmelieSchreiber/esm2_t6_8M_general_binding_sites_v2/blob/main/trainer_state.json).
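
The README text in this diff contrasts a random 85/15 split with a split in which no protein family appears on both sides. A minimal sketch of the second idea follows, assuming each protein comes with a family label. The `family_aware_split` helper, the `(protein_id, family_id)` record layout, and the toy IDs are all hypothetical and are not taken from the repository.

```python
import random
from collections import defaultdict


def family_aware_split(records, test_fraction=0.15, seed=0):
    """Split (protein_id, family_id) pairs so no family spans both sets.

    Hypothetical helper for illustration; not the author's pipeline.
    """
    # Group protein IDs by family.
    by_family = defaultdict(list)
    for protein_id, family_id in records:
        by_family[family_id].append(protein_id)

    # Shuffle families (not proteins) so whole families move together.
    families = sorted(by_family)
    random.Random(seed).shuffle(families)

    target = test_fraction * len(records)
    train, test = [], []
    for family in families:
        # Fill the test set family by family until it reaches the target size.
        bucket = test if len(test) < target else train
        bucket.extend(by_family[family])
    return train, test


# Toy data: 20 proteins spread over 5 made-up families.
records = [(f"P{i}", f"fam{i % 5}") for i in range(20)]
train_ids, test_ids = family_aware_split(records)
```

Because whole families are assigned together, the test fraction is only approximate: a large family can push the test set past the 15% target.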