---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# xsum_55555_3000_1500_test

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("KingKazma/xsum_55555_3000_1500_test")

topic_model.get_topic_info()
```

## Topic overview

* Number of topics: 26
* Number of training documents: 1500

The table below gives an overview of all topics. Topic -1 collects the documents that BERTopic treats as outliers.

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | said - mr - also - would - people | 5 | -1_said_mr_also_would |
| 0 | police - said - mr - court - heard | 716 | 0_police_said_mr_court |
| 1 | syria - turkey - syrian - military - said | 112 | 1_syria_turkey_syrian_military |
| 2 | foul - win - kick - half - shot | 72 | 2_foul_win_kick_half |
| 3 | growth - year - bank - business - economy | 68 | 3_growth_year_bank_business |
| 4 | council - said - building - development - new | 63 | 4_council_said_building_development |
| 5 | england - cricket - captain - test - wicket | 48 | 5_england_cricket_captain_test |
| 6 | league - club - season - loan - transfer | 42 | 6_league_club_season_loan |
| 7 | sport - gold - world - athlete - olympic | 38 | 7_sport_gold_world_athlete |
| 8 | film - music - best - star - song | 36 | 8_film_music_best_star |
| 9 | party - labour - mr - leader - said | 33 | 9_party_labour_mr_leader |
| 10 | ireland - wales - leinster - rugby - player | 32 | 10_ireland_wales_leinster_rugby |
| 11 | care - nhs - hospital - patient - said | 27 | 11_care_nhs_hospital_patient |
| 12 | road - crash - police - collision - car | 26 | 12_road_crash_police_collision |
| 13 | dog - animal - greyhound - racing - owner | 23 | 13_dog_animal_greyhound_racing |
| 14 | ship - beach - said - lifeguard - rnli | 22 | 14_ship_beach_said_lifeguard |
| 15 | school - education - child - council - said | 20 | 15_school_education_child_council |
| 16 | wales - bill - welsh - labour - assembly | 19 | 16_wales_bill_welsh_labour |
| 17 | eu - uk - european - europe - referendum | 18 | 17_eu_uk_european_europe |
| 18 | fire - blaze - bus - flame - said | 18 | 18_fire_blaze_bus_flame |
| 19 | mr - president - besigye - maduro - election | 16 | 19_mr_president_besigye_maduro |
| 20 | race - froome - stage - second - lap | 13 | 20_race_froome_stage_second |
| 21 | rail - train - rmt - scotrail - transport | 10 | 21_rail_train_rmt_scotrail |
| 22 | planet - earth - electron - theory - mars | 10 | 22_planet_earth_electron_theory |
| 23 | ryder - cup - tour - pga - mcilroy | 7 | 23_ryder_cup_tour_pga |
| 24 | email - lazar - fbi - guccifer - ferizi | 6 | 24_email_lazar_fbi_guccifer |

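
To make the table easier to reason about, the following is a minimal sketch that recomputes each topic's share of the training documents and rebuilds a label from its keywords. The frequencies are copied from the table; the helper names `topic_shares` and `make_label` are hypothetical, and the label rule (topic ID followed by the first four keywords) is inferred from the rows shown here rather than taken from BERTopic's source.

```python
# Topic frequencies copied from the overview table (topic ID -> document count).
TOPIC_FREQUENCIES = {
    -1: 5, 0: 716, 1: 112, 2: 72, 3: 68, 4: 63, 5: 48, 6: 42, 7: 38,
    8: 36, 9: 33, 10: 32, 11: 27, 12: 26, 13: 23, 14: 22, 15: 20,
    16: 19, 17: 18, 18: 18, 19: 16, 20: 13, 21: 10, 22: 10, 23: 7, 24: 6,
}


def topic_shares(frequencies):
    """Return each topic's fraction of all documents (hypothetical helper)."""
    total = sum(frequencies.values())
    return {topic: count / total for topic, count in frequencies.items()}


def make_label(topic_id, keywords):
    """Rebuild a label as ID plus the first four keywords (rule inferred from the table)."""
    return "_".join([str(topic_id), *keywords[:4]])


shares = topic_shares(TOPIC_FREQUENCIES)
total_documents = sum(TOPIC_FREQUENCIES.values())

print(f"documents: {total_documents}")
print(f"topic 0 share: {shares[0]:.1%}")
print(f"outlier share: {shares[-1]:.1%}")
print(make_label(0, ["police", "said", "mr", "court", "heard"]))
```

The counts sum to the 1500 training documents reported above, and topic 0 alone accounts for a little under half of them, so the remaining topics are comparatively small.
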
## Training hyperparameters

* calculate_probabilities: True
* language: english
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: False

## Framework versions

* Numpy: 1.22.4
* HDBSCAN: 0.8.33
* UMAP: 0.5.3
* Pandas: 1.5.3
* Scikit-Learn: 1.2.2
* Sentence-transformers: 2.2.2
* Transformers: 4.31.0
* Numba: 0.57.1
* Plotly: 5.13.1
* Python: 3.10.12
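
Loading a saved model can be sensitive to library versions, so it may help to compare a local environment against the framework versions listed above. The sketch below uses only the standard library; the mapping from the names on this card to PyPI distribution names (for example UMAP to `umap-learn`) is an assumption, and `compare_versions` is a hypothetical helper, not part of BERTopic.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this card, keyed by the assumed PyPI distribution name.
CARD_VERSIONS = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.57.1",
    "plotly": "5.13.1",
}


def compare_versions(expected):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed)
    return report


for package, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the report is only a starting point when debugging differences from the training environment.
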