
BERTopic-2024-05-02-165545

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

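from bertopic import BERTopic

# Download the trained model from the Hugging Face Hub and load it
topic_model = BERTopic.load("antulik/BERTopic-2024-05-02-165545")

# Return a pandas DataFrame with one row per topic
topic_model.get_topic_info()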
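
A loaded model can also assign topics to documents it has not seen. The following is a minimal sketch, assuming the loaded object exposes BERTopic's `transform` method, which returns predicted topic IDs together with probabilities. The helper name `assign_topics` is hypothetical and not part of BERTopic.

```python
from typing import Any, List, Tuple


def assign_topics(topic_model: Any, docs: List[str]) -> List[Tuple[str, int]]:
    """Pair each document with the topic ID predicted by the model.

    `topic_model` is expected to behave like a fitted BERTopic instance,
    i.e. `transform(docs)` returns a (topics, probabilities) tuple.
    """
    topics, _probabilities = topic_model.transform(docs)
    # Cast to plain int so the result does not depend on NumPy integer types.
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]
```

With `topic_model` loaded as shown earlier, `assign_topics(topic_model, ["Which encryption scheme keeps my email private?"])` would return the document paired with a topic ID from the overview table; an ID of -1 marks an outlier. Depending on how the model was saved, an embedding model may need to be supplied through the `embedding_model` argument of `BERTopic.load` before `transform` can embed new text.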

Topic overview

  • Number of topics: 18 (17 topics plus the outlier topic -1)
  • Number of training documents: 1000
Topic ID | Topic Keywords | Topic Frequency | Label
-1 | theism - church - what - to - about | 12 | -1_theism_church_what_to
0 | x11r5 - pc - toolkit - application - program | 258 | 0_x11r5_pc_toolkit_application
1 | nhl - playoffs - rangers - hockey - league | 97 | 1_nhl_playoffs_rangers_hockey
2 | performance - ram - drivers - monitor - speed | 92 | 2_performance_ram_drivers_monitor
3 | dos - windows - disk - software - files | 82 | 3_dos_windows_disk_software
4 | government - states - are - batf - against | 76 | 4_government_states_are_batf
5 | amp - amps - amplifier - ampere - current | 66 | 5_amp_amps_amplifier_ampere
6 | scripture - christians - sin - commandment - christian | 47 | 6_scripture_christians_sin_commandment
7 | nasa - spacecraft - space - solar - spaceship | 40 | 7_nasa_spacecraft_space_solar
8 | patients - biological - medicine - studies - doctors | 40 | 8_patients_biological_medicine_studies
9 | - - - - | 38 | 9____
10 | bikes - motorcycle - bike - riding - rider | 32 | 10_bikes_motorcycle_bike_riding
11 | encryption - security - encrypted - privacy - secure | 27 | 11_encryption_security_encrypted_privacy
12 | armenians - armenian - armenia - turks - genocide | 23 | 12_armenians_armenian_armenia_turks
13 | paganism - faith - christianity - christians - atheists | 21 | 13_paganism_faith_christianity_christians
14 | contacted - address - mail - contact - email | 19 | 14_contacted_address_mail_contact
15 | foolish - quotation - said - quote - hypocrisy | 18 | 15_foolish_quotation_said_quote
16 | palestinians - palestinian - antisemitism - gaza - israel | 12 | 16_palestinians_palestinian_antisemitism_gaza
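
The frequencies in the table can be checked against the stated number of training documents. The sketch below copies the counts by hand into a dictionary and computes each topic's share; the names `TOPIC_FREQUENCIES` and `topic_shares` are illustrative only and do not come from BERTopic.

```python
# Topic frequencies copied from the table above, keyed by topic ID.
TOPIC_FREQUENCIES = {
    -1: 12, 0: 258, 1: 97, 2: 92, 3: 82, 4: 76, 5: 66, 6: 47, 7: 40,
    8: 40, 9: 38, 10: 32, 11: 27, 12: 23, 13: 21, 14: 19, 15: 18, 16: 12,
}


def topic_shares(frequencies):
    """Return each topic's fraction of all documents."""
    total = sum(frequencies.values())
    return {topic_id: count / total for topic_id, count in frequencies.items()}


total_docs = sum(TOPIC_FREQUENCIES.values())
shares = topic_shares(TOPIC_FREQUENCIES)

print(total_docs)            # 1000
print(f"{shares[-1]:.1%}")   # 1.2%
print(f"{shares[0]:.1%}")    # 25.8%
```

The counts sum to the 1000 training documents, with 12 of them (1.2%) left in the outlier topic -1 and roughly a quarter assigned to topic 0.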

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: [['drug', 'cancer', 'drugs', 'doctor'], ['windows', 'drive', 'dos', 'file'], ['space', 'launch', 'orbit', 'lunar']]
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
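
For reference, the settings listed above map directly onto keyword arguments of the `BERTopic` constructor. The sketch below assumes a BERTopic release whose constructor accepts the zero-shot parameters; the names `HYPERPARAMS` and `build_untrained_model` are hypothetical. The card does not list the embedding, UMAP, or HDBSCAN sub-models, so a model built this way uses library defaults and is not guaranteed to reproduce the topics reported here.

```python
# Hyperparameters as listed on this card.
HYPERPARAMS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    # Seed words for guided topic modeling: one list per intended topic.
    "seed_topic_list": [
        ["drug", "cancer", "drugs", "doctor"],
        ["windows", "drive", "dos", "file"],
        ["space", "launch", "orbit", "lunar"],
    ],
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model():
    """Return a fresh, unfitted BERTopic configured with HYPERPARAMS."""
    # Imported lazily so the settings can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMS)
```

Calling `build_untrained_model().fit_transform(docs)` on a list of strings would then train a new model with the same top-level configuration.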

Framework versions

  • Numpy: 1.23.5
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.6
  • Pandas: 2.0.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.7.0
  • Transformers: 4.40.1
  • Numba: 0.58.1
  • Plotly: 5.15.0
  • Python: 3.10.12
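
Loading a saved model in an environment with different library versions can sometimes cause problems, so it may help to compare the local installation against the versions above. This is a minimal sketch using only the standard library; the mapping from the display names on this card to PyPI distribution names (for example UMAP to `umap-learn`) is an assumption, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from this card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.6",
    "pandas": "2.0.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.7.0",
    "transformers": "4.40.1",
    "numba": "0.58.1",
    "plotly": "5.15.0",
}


def compare_versions(expected):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: card={wanted} installed={installed} ({status})")
```

A reported difference does not necessarily mean the model will fail to load; it only flags where the local environment departs from the one used for training.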