FinBERT for Financial News Sentiment Regression

DISCLAIMER: This model has been tested successfully on a test set drawn from the same distribution as the training data. However, it is not production-ready: it would probably need to be updated continuously, and it should ideally be trained on more than two years of historical data. It would also need a supplementary assessment of bias, security and consistency.

Introduction

Analyzing the sentiment of financial news is a complex task. It requires a deep understanding of financial jargon, knowledge of the context of each company, and awareness of how the whole economy interacts as an ecosystem.

The FinBERT model classifies sentiment as either positive or negative. However, binary classification is too simple and does not reflect reality.

RavenPack has an excellent, large, hand-labelled dataset with a continuous sentiment label that ranges from -1 to 1. We collected data from the two previous years for training and tested on data from the following two weeks. Additionally, we took one-year and six-month subsamples of the dataset to see how the model scales with more data and whether more data helps the model.
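The date-based split described above can be sketched roughly as follows. This is only an illustration, not the code used for the experiments: the record layout `(date, headline, score)`, the cutoff date, the sample rows and the helper names `months_before` and `split_by_date` are all assumptions made for the example.

```python
from datetime import date, timedelta


def months_before(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day` (day clamped to 28)."""
    total = day.year * 12 + (day.month - 1) - months
    return date(total // 12, total % 12 + 1, min(day.day, 28))


def split_by_date(records, cutoff: date, train_months: int, test_days: int = 14):
    """Split records into a training window that ends at `cutoff` and a test
    window that covers the following `test_days` days.

    Each record is assumed to be a (date, headline, score) tuple.
    """
    train_start = months_before(cutoff, train_months)
    test_end = cutoff + timedelta(days=test_days)
    train = [r for r in records if train_start <= r[0] < cutoff]
    test = [r for r in records if cutoff <= r[0] < test_end]
    return train, test


# Hypothetical rows, only to show the mechanics of the split.
records = [
    (date(2019, 8, 1), "headline a", 0.1),
    (date(2020, 9, 1), "headline b", -0.3),
    (date(2021, 3, 1), "headline c", 0.5),
    (date(2021, 7, 5), "headline d", 0.2),
    (date(2021, 7, 20), "headline e", -0.1),
]
cutoff = date(2021, 7, 1)  # hypothetical boundary between training and test data

for months in (6, 12, 24):
    train, test = split_by_date(records, cutoff, months)
    print(months, "months:", len(train), "training rows,", len(test), "test rows")
```

Keeping the test window fixed while only the training window grows means the three subsamples are compared on the same two weeks of news.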

In this repository you can find the different models by changing the branch name. The main branch holds the model trained on the whole dataset. We have also uploaded the best regressor, FinEAS, to the Hub: https://huggingface.co/LHF/FinEAS

Note that the predictions of this HF model range from 0 to 1, where 0 is negative, 0.5 is neutral and 1 is positive.
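If you want to compare Hub predictions with labels on the -1 to 1 scale, a simple rescaling may be enough. The sketch below assumes the two scales are related linearly (0 maps to -1, 0.5 to 0, 1 to 1); this mapping is an assumption, and the function names are hypothetical.

```python
def to_signed_score(p: float) -> float:
    """Map a prediction in [0, 1] to the [-1, 1] scale (assumed linear mapping)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"expected a value in [0, 1], got {p}")
    return 2.0 * p - 1.0


def to_unit_score(s: float) -> float:
    """Inverse mapping: a label in [-1, 1] back to the [0, 1] scale."""
    if not -1.0 <= s <= 1.0:
        raise ValueError(f"expected a value in [-1, 1], got {s}")
    return (s + 1.0) / 2.0


for p in (0.0, 0.5, 1.0):
    print(p, "->", to_signed_score(p))
```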

Evaluation

Training period	FinEAS	FinBERT
6 months	0.0044	0.0050
12 months	0.0036	0.0034
24 months	0.0033	0.0040
*Evaluated on data from the following two weeks.

Code

You can find the code for this model at the following link: https://github.com/lhf-labs/finance-news-analysis-bert

Citation

@misc{gutierrezfandino2021fineas,
      title={FinEAS: Financial Embedding Analysis of Sentiment}, 
      author={Asier Gutiérrez-Fandiño and Miquel Noguer i Alonso and Petter Kolm and Jordi Armengol-Estapé},
      year={2021},
      eprint={2111.00526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}