BERTopic-2024-05-02-165545
This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("antulik/BERTopic-2024-05-02-165545")
topic_model.get_topic_info()
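To assign topics to new documents, something like the following minimal sketch may work. It assumes network access to the Hugging Face Hub and that an embedding model reference was saved with the model (otherwise one would need to be passed to `BERTopic.load` via its `embedding_model` argument). The names `assign_topics` and `EXAMPLE_DOCS` are hypothetical and used only for illustration.

```python
from typing import List, Tuple

REPO_ID = "antulik/BERTopic-2024-05-02-165545"

# Hypothetical example inputs, not taken from the training data.
EXAMPLE_DOCS = [
    "The Rangers clinched a playoff spot last night.",
    "How do I free up disk space under DOS?",
]


def assign_topics(docs: List[str], repo_id: str = REPO_ID) -> List[Tuple[str, int]]:
    """Load the model from the Hub and return (document, topic_id) pairs."""
    # Imported lazily so this snippet can be imported without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load(repo_id)
    # transform() returns (topics, probabilities); only the topic IDs are used here.
    topics, _ = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]
```

Calling `assign_topics(EXAMPLE_DOCS)` returns one topic ID per document. An ID of -1 marks a document that BERTopic treats as an outlier.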
Topic overview
- Number of topics: 18
- Number of training documents: 1000
Overview of all topics:
Topic ID | Topic Keywords | Topic Frequency | Label |
---|---|---|---|
-1 | theism - church - what - to - about | 12 | -1_theism_church_what_to |
0 | x11r5 - pc - toolkit - application - program | 258 | 0_x11r5_pc_toolkit_application |
1 | nhl - playoffs - rangers - hockey - league | 97 | 1_nhl_playoffs_rangers_hockey |
2 | performance - ram - drivers - monitor - speed | 92 | 2_performance_ram_drivers_monitor |
3 | dos - windows - disk - software - files | 82 | 3_dos_windows_disk_software |
4 | government - states - are - batf - against | 76 | 4_government_states_are_batf |
5 | amp - amps - amplifier - ampere - current | 66 | 5_amp_amps_amplifier_ampere |
6 | scripture - christians - sin - commandment - christian | 47 | 6_scripture_christians_sin_commandment |
7 | nasa - spacecraft - space - solar - spaceship | 40 | 7_nasa_spacecraft_space_solar |
8 | patients - biological - medicine - studies - doctors | 40 | 8_patients_biological_medicine_studies |
9 | - - - - | 38 | 9____ |
10 | bikes - motorcycle - bike - riding - rider | 32 | 10_bikes_motorcycle_bike_riding |
11 | encryption - security - encrypted - privacy - secure | 27 | 11_encryption_security_encrypted_privacy |
12 | armenians - armenian - armenia - turks - genocide | 23 | 12_armenians_armenian_armenia_turks |
13 | paganism - faith - christianity - christians - atheists | 21 | 13_paganism_faith_christianity_christians |
14 | contacted - address - mail - contact - email | 19 | 14_contacted_address_mail_contact |
15 | foolish - quotation - said - quote - hypocrisy | 18 | 15_foolish_quotation_said_quote |
16 | palestinians - palestinian - antisemitism - gaza - israel | 12 | 16_palestinians_palestinian_antisemitism_gaza |
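As a quick sanity check on the table, the frequencies can be tallied to confirm that they add up to the 1000 training documents and to see each topic's share. This is a small sketch using only the numbers listed above; `TOPIC_FREQUENCIES` and `topic_shares` are illustrative names, not part of BERTopic.

```python
# Topic frequencies copied from the table above, keyed by topic ID.
TOPIC_FREQUENCIES = {
    -1: 12, 0: 258, 1: 97, 2: 92, 3: 82, 4: 76, 5: 66, 6: 47, 7: 40,
    8: 40, 9: 38, 10: 32, 11: 27, 12: 23, 13: 21, 14: 19, 15: 18, 16: 12,
}


def topic_shares(frequencies):
    """Return each topic's fraction of all documents."""
    total = sum(frequencies.values())
    return {topic_id: count / total for topic_id, count in frequencies.items()}


total_docs = sum(TOPIC_FREQUENCIES.values())
shares = topic_shares(TOPIC_FREQUENCIES)

print(total_docs)              # 1000
print(len(TOPIC_FREQUENCIES))  # 18 (including the -1 outlier topic)
print(f"{shares[-1]:.1%}")     # 1.2%
print(f"{shares[0]:.1%}")      # 25.8%
```

The count of 18 topics includes the outlier topic -1, and the largest topic (0) covers roughly a quarter of the documents.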
Training hyperparameters
- calculate_probabilities: False
- language: english
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 1)
- nr_topics: None
- seed_topic_list: [['drug', 'cancer', 'drugs', 'doctor'], ['windows', 'drive', 'dos', 'file'], ['space', 'launch', 'orbit', 'lunar']]
- top_n_words: 10
- verbose: False
- zeroshot_min_similarity: 0.7
- zeroshot_topic_list: None
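The hyperparameters above correspond to keyword arguments of the `BERTopic` constructor, so a comparable model could be configured roughly as sketched below. This is an assumption-laden sketch: the card does not state the training corpus, the embedding model, or the UMAP and HDBSCAN settings, so library defaults would apply and the result would not necessarily reproduce this model. `TRAINING_PARAMS` and `build_topic_model` are hypothetical names.

```python
# Hyperparameters as listed in this model card.
TRAINING_PARAMS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    # Seed words nudge the model toward these themes (guided topic modeling).
    "seed_topic_list": [
        ["drug", "cancer", "drugs", "doctor"],
        ["windows", "drive", "dos", "file"],
        ["space", "launch", "orbit", "lunar"],
    ],
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(params=TRAINING_PARAMS):
    """Create an unfitted BERTopic instance with the listed hyperparameters."""
    # Imported lazily so this snippet can be imported without bertopic installed.
    from bertopic import BERTopic

    # Embedding, UMAP, and HDBSCAN models are left at library defaults (assumption).
    return BERTopic(**params)
```

The returned model is unfitted; it would still need a call such as `fit_transform(docs)` on a corpus of your own before topics are available.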
Framework versions
- Numpy: 1.23.5
- HDBSCAN: 0.8.33
- UMAP: 0.5.6
- Pandas: 2.0.3
- Scikit-Learn: 1.2.2
- Sentence-transformers: 2.7.0
- Transformers: 4.40.1
- Numba: 0.58.1
- Plotly: 5.15.0
- Python: 3.10.12