bertopic_first

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("DobreMihai/bertopic_first")

topic_model.get_topic_info()
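
Once the model is loaded, it can also assign topics to new reviews. The sketch below is a minimal illustration, not part of the original card: `assign_topics` and `main` are hypothetical helper names, and the two review strings are made up. It relies only on `BERTopic.transform` and `BERTopic.get_topic`, and it assumes the saved model can embed new text without an embedding model being passed to `BERTopic.load`.

```python
def assign_topics(topic_model, docs, top_n=5):
    """Return (document, topic_id, top keywords) for each document."""
    # transform() returns the predicted topic IDs and (optionally) probabilities.
    topics, _ = topic_model.transform(docs)
    results = []
    for doc, topic_id in zip(docs, topics):
        topic_id = int(topic_id)
        # get_topic() returns (word, score) pairs, or False for an unknown ID.
        words = topic_model.get_topic(topic_id) or []
        keywords = [word for word, _ in words[:top_n]]
        results.append((doc, topic_id, keywords))
    return results


def main():
    # Imported here so the helper above can be reused without BERTopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load("DobreMihai/bertopic_first")
    reviews = [  # invented examples, not taken from the training data
        "The alarm is loud enough to wake me up",
        "Too many ads unless you pay for the subscription",
    ]
    for doc, topic_id, keywords in assign_topics(topic_model, reviews):
        print(topic_id, keywords, "<-", doc)
```

Calling `main()` downloads the model from the Hub. A topic ID of -1 marks a document the model treats as an outlier.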

Topic overview

  • Number of topics: 50 (topics 0 to 48 plus the outlier topic -1)
  • Number of training documents: 24020
Topic ID | Topic Keywords | Topic Frequency | Label
-1 | it - be - to - the - alarm | 10 | -1_it_be_to_the
0 | app - math - the - alarm - to | 9997 | 0_app_math_the_alarm
1 | alarm - snooze - not - be - it | 5678 | 1_alarm_snooze_not_be
2 | subscription - - - - | 1906 | 2_subscription___
3 | picture - photo - take - the - emergency | 1326 | 3_picture_photo_take_the
4 | wake - up - it - annoying - love | 603 | 4_wake_up_it_annoying
5 | snooze - - - - | 293 | 5_snooze___
6 | loud - - - - | 288 | 6_loud___
7 | barcode - scan - code - scanner - qr | 285 | 7_barcode_scan_code_scanner
8 | mission - the - be - you - to | 282 | 8_mission_the_be_you
9 | easy - use - simple - very - and | 279 | 9_easy_use_simple_very
10 | loud - - - - | 256 | 10_loud___
11 | help - up - early - wake - helpful | 246 | 11_help_up_early_wake
12 | ring - not - do - it - sometimes | 243 | 12_ring_not_do_it
13 | app - work - open - not - phone | 195 | 13_app_work_open_not
14 | ads - - - - | 179 | 14_ads___
15 | volume - vibrate - mute - sound - vibration | 176 | 15_volume_vibrate_mute_sound
16 | work - use - great - easy - well | 170 | 16_work_use_great_easy
17 | star - give - it - because - five | 162 | 17_star_give_it_because
18 | prevent - phone - off - power - switch | 140 | 18_prevent_phone_off_power
19 | loud - - - - | 130 | 19_loud___
20 | work - off - sometimes - not - go | 128 | 20_work_off_sometimes_not
21 | music - song - spotify - own - file | 127 | 21_music_song_spotify_own
22 | weather - - - - | 110 | 22_weather___
23 | annoying - job - but - work - it | 88 | 23_annoying_job_but_work
24 | student - helpful - for - useful - very | 73 | 24_student_helpful_for_useful
25 | perfect - word - good - amazing - be | 66 | 25_perfect_word_good_amazing
26 | reliable - easy - dependable - use - and | 58 | 26_reliable_easy_dependable_use
27 | minute - 10 - set - scroll - add | 53 | 27_minute_10_set_scroll
28 | aap - very - this - student - good | 50 | 28_aap_very_this_student
29 | android - 10 - work - update - support | 45 | 29_android_10_work_update
30 | reliable - alarm - very - sometimes - not | 40 | 30_reliable_alarm_very_sometimes
31 | alarmy - thank - premium - wake - much | 37 | 31_alarmy_thank_premium_wake
32 | application - student - very - excellent - study | 31 | 32_application_student_very_excellent
33 | exit - message - quote - love - smile | 28 | 33_exit_message_quote_love
34 | paywall - behind - lock - feature - real | 25 | 34_paywall_behind_lock_feature
35 | mb - space - storage - size - take | 22 | 35_mb_space_storage_size
36 | easy - set - setup - up - to | 21 | 36_easy_set_setup_up
37 | squat - premium - the - mission - do | 21 | 37_squat_premium_the_mission
38 | overheat - hot - phone - heat - run | 19 | 38_overheat_hot_phone_heat
39 | star - give - deserve - it - scarey | 18 | 39_star_give_deserve_it
40 | add - instal - pretty - plaster - interfaceit | 16 | 40_add_instal_pretty_plaster
41 | uninstall - deactivate - logo - not - let | 14 | 41_uninstall_deactivate_logo_not
42 | develper - legend - describe - word - clear | 13 | 42_develper_legend_describe_word
43 | team - thank - alarmy - lot - help | 13 | 43_team_thank_alarmy_lot
44 | update - ui - the - version - new | 13 | 44_update_ui_the_version
45 | love - get - perfect - time - thay | 13 | 45_love_get_perfect_time
46 | scare - medication - it - weapon - secret | 12 | 46_scare_medication_it_weapon
47 | accurate - dependable - 8n - fashion - ti | 11 | 47_accurate_dependable_8n_fashion
48 | procrastinator - exms - help - anoye - hv | 11 | 48_procrastinator_exms_help_anoye
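
The frequencies above are heavily skewed toward the first two topics, and three separate topics share the single keyword "loud". A short worked calculation makes both points concrete. It is only an illustration: the counts are copied by hand from the table, and the `share` helper is a hypothetical name rather than part of BERTopic.

```python
# Document counts copied from the topic table (topic ID -> frequency).
topic_counts = {0: 9997, 1: 5678, 6: 288, 10: 256, 19: 130}
total_documents = 24020  # "Number of training documents" in the overview


def share(topic_ids, counts=topic_counts, total=total_documents):
    """Fraction of all training documents assigned to the given topics."""
    return sum(counts[t] for t in topic_ids) / total


top_two_share = share([0, 1])              # (9997 + 5678) / 24020, about 0.653
loud_total = sum(topic_counts[t] for t in (6, 10, 19))  # 288 + 256 + 130 = 674

print(f"Topics 0 and 1 cover {top_two_share:.1%} of the documents")
print(f"The three 'loud' topics cover {loud_total} documents in total")
```

So roughly two thirds of the reviews fall into topics 0 and 1, while the three "loud" topics together hold 674 documents.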

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 50
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.85
  • zeroshot_topic_list: ['android', 'premium*', 'ads', 'math', 'subscription', 'update', 'camera', 'shake', 'weather', 'snooze', 'loud', 'doesn', 'off']
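
For readers who want to train a comparable model, the hyperparameters above map directly onto keyword arguments of the `BERTopic` constructor. The sketch below is an assumption about how they could be passed, not the author's training script. The card does not state the embedding model, the preprocessing, or the training data. `HYPERPARAMETERS`, `merged_kwargs` and `build_topic_model` are hypothetical names.

```python
# Values copied verbatim from the "Training hyperparameters" list.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 50,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.85,
    "zeroshot_topic_list": [
        "android", "premium*", "ads", "math", "subscription", "update",
        "camera", "shake", "weather", "snooze", "loud", "doesn", "off",
    ],
}


def merged_kwargs(**overrides):
    """Return a copy of the listed settings with any overrides applied."""
    return {**HYPERPARAMETERS, **overrides}


def build_topic_model(**overrides):
    """Create an unfitted BERTopic instance using the listed settings."""
    # Imported lazily so the settings can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(**merged_kwargs(**overrides))
```

A call such as `build_topic_model().fit_transform(docs)` would then train on your own list of documents. The resulting topics will differ from the table above unless the same data and embedding model are used.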

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.38.post1
  • UMAP: 0.5.6
  • Pandas: 2.2.1
  • Scikit-Learn: 1.5.1
  • Sentence-transformers: 3.0.1
  • Transformers: 4.44.0
  • Numba: 0.60.0
  • Plotly: 5.23.0
  • Python: 3.10.14
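
To check whether a local environment matches these versions, the installed packages can be compared against the list. This is a minimal sketch using only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) are my mapping and do not appear in the card, and `find_version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the "Framework versions" list, keyed by PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.38.post1",
    "umap-learn": "0.5.6",
    "pandas": "2.2.1",
    "scikit-learn": "1.5.1",
    "sentence-transformers": "3.0.1",
    "transformers": "4.44.0",
    "numba": "0.60.0",
    "plotly": "5.23.0",
}


def find_version_mismatches(expected=EXPECTED_VERSIONS):
    """Map each differing package to (expected, installed or None)."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in find_version_mismatches().items():
    print(f"{name}: expected {wanted}, found {installed}")
```

A mismatch does not necessarily prevent the model from loading, but matching versions is the safest starting point when results differ.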