---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# BERTopic

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, install BERTopic:

```
pip install -U bertopic
```

You can then load the model and inspect its topics:

```python
from bertopic import BERTopic

# Load the fitted model from the Hugging Face Hub
topic_model = BERTopic.load("keonju/BERTopic")

# One row per topic, with its size and label
topic_model.get_topic_info()
```
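
Beyond inspecting the fitted topics, a loaded model can also assign topics to unseen text. The sketch below assumes the `topic_model` object from the snippet above and relies on BERTopic's `transform` and `get_topic` methods; the helper name `summarize_predictions` and its `n_words` parameter are illustrative and not part of the library.

```python
from typing import List, Sequence, Tuple


def summarize_predictions(
    topic_model, docs: Sequence[str], n_words: int = 5
) -> List[Tuple[str, int, str]]:
    """Assign a topic to each document and list that topic's top keywords.

    `topic_model` is expected to behave like a fitted BERTopic model:
    `transform(docs)` returns (topic_ids, probabilities) and
    `get_topic(topic_id)` returns (word, score) pairs, or False if unknown.
    """
    docs = list(docs)
    topic_ids, _ = topic_model.transform(docs)
    rows = []
    for doc, topic_id in zip(docs, topic_ids):
        topic_id = int(topic_id)  # may arrive as a NumPy integer
        words = topic_model.get_topic(topic_id) or []
        keywords = " - ".join(word for word, _ in words[:n_words])
        rows.append((doc, topic_id, keywords))
    return rows
```

For example, `summarize_predictions(topic_model, ["Methane emissions from a drained peatland"])` returns one `(document, topic ID, keywords)` tuple per input. A topic ID of -1 marks a document that the model treats as an outlier.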

## Topic overview

* Number of topics: 158
* Number of training documents: 10158

<details>
<summary>Click here for an overview of all topics.</summary>

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | and - the - of - in - to | 10 | -1_and_the_of_in |
| 0 | holocene - china - the - monsoon - bp | 3858 | 0_holocene_china_the_monsoon |
| 1 | energy - biofuels - production - biodiesel - bioenergy | 291 | 1_energy_biofuels_production_biodiesel |
| 2 | coal - coals - the - basin - seams | 248 | 2_coal_coals_the_basin |
| 3 | yr - holocene - the - bp - and | 205 | 3_yr_holocene_the_bp |
| 4 | hg - mercury - mehg - of hg - in | 202 | 4_hg_mercury_mehg_of hg |
| 5 | ch4 - methane - emissions - fluxes - flux | 159 | 5_ch4_methane_emissions_fluxes |
| 6 | data - forest - spectral - for - mapping | 118 | 6_data_forest_spectral_for |
| 7 | bp - the - holocene - pollen - lake | 116 | 7_bp_the_holocene_pollen |
| 8 | wetlands - wetland - and - are - of | 104 | 8_wetlands_wetland_and_are |
| 9 | co2 - ecosystem - nee - exchange - net | 103 | 9_co2_ecosystem_nee_exchange |
| 10 | species - of - fen - the - restoration | 100 | 10_species_of_fen_the |
| 11 | peat - tropical - peatlands - palm - peatland | 98 | 11_peat_tropical_peatlands_palm |
| 12 | pb - lead - atmospheric - metal - deposition | 96 | 12_pb_lead_atmospheric_metal |
| 13 | the - lake - of the - of - poland | 93 | 13_the_lake_of the_of |
| 14 | pm2 - haze - burning - air - aerosol | 90 | 14_pm2_haze_burning_air |
| 15 | doc - catchments - carbon - organic carbon - export | 88 | 15_doc_catchments_carbon_organic carbon |
| 16 | the - carbon - of - co2 - of the | 73 | 16_the_carbon_of_co2 |
| 17 | wetland - wetlands - classification - mapping - and | 69 | 17_wetland_wetlands_classification_mapping |
| 18 | uv - ozone - o3 - isoprene - elevated | 67 | 18_uv_ozone_o3_isoprene |
| 19 | mediterranean - the - glacial - iberian - during | 66 | 19_mediterranean_the_glacial_iberian |
| 20 | media - compost - growing media - growing - biochar | 63 | 20_media_compost_growing media_growing |
| 21 | 137cs - of 137cs - sup - ce sup - radiocaesium | 63 | 21_137cs_of 137cs_sup_ce sup |
| 22 | testate - amoebae - testate amoebae - of testate - amoeba | 62 | 22_testate_amoebae_testate amoebae_of testate |
| 23 | peat - pyrolysis - lignin - gc - of | 62 | 23_peat_pyrolysis_lignin_gc |
| 24 | cu - zn - metals - peat - elements | 62 | 24_cu_zn_metals_peat |
| 25 | alkanes - alkane - chain - values - plants | 61 | 25_alkanes_alkane_chain_values |
| 26 | permafrost - active layer - thermal - ground - layer | 60 | 26_permafrost_active layer_thermal_ground |
| 27 | streams - diatom - species - macroinvertebrate - stream | 60 | 27_streams_diatom_species_macroinvertebrate |
| 28 | records - the - of - record - ireland | 60 | 28_records_the_of_record |
| 29 | water - flow - groundwater - recharge - runoff | 59 | 29_water_flow_groundwater_recharge |
| 30 | habitat - species - breeding - bird - nest | 57 | 30_habitat_species_breeding_bird |
| 31 | brgdgts - gdgts - glycerol - brgdgt - branched | 56 | 31_brgdgts_gdgts_glycerol_brgdgt |
| 32 | deposition - nitrogen - nitrogen deposition - sphagnum - of | 55 | 32_deposition_nitrogen_nitrogen deposition_sphagnum |
| 33 | oil sands - sands - fen - oil - reclamation | 54 | 33_oil sands_sands_fen_oil |
| 34 | fire - burned - severity - burning - post fire | 54 | 34_fire_burned_severity_burning |
| 35 | acidification - deposition - acid - ph - catchment | 54 | 35_acidification_deposition_acid_ph |
| 36 | farm - land - agricultural - farmers - policy | 53 | 36_farm_land_agricultural_farmers |
| 37 | cdom - doc - dom - dissolved organic - dissolved | 53 | 37_cdom_doc_dom_dissolved organic |
| 38 | redd - indonesia - deforestation - in indonesia - forest | 50 | 38_redd_indonesia_deforestation_in indonesia |
| 39 | ash - wood ash - wood - growth - of wood | 49 | 39_ash_wood ash_wood_growth |
| 40 | fungal - fungi - mycorrhizal - species - root | 49 | 40_fungal_fungi_mycorrhizal_species |
| 41 | stand - growth - models - tree - stands | 49 | 41_stand_growth_models_tree |
| 42 | smouldering - smoldering - spread - peat - combustion | 49 | 42_smouldering_smoldering_spread_peat |
| 43 | pollen - of pollen - vegetation - of - from | 49 | 43_pollen_of pollen_vegetation_of |
| 44 | arsenic - as - of as - fe - of arsenic | 49 | 44_arsenic_as_of as_fe |
| 45 | ch4 - methane - production - peat - methanogenesis | 47 | 45_ch4_methane_production_peat |
| 46 | africa - the - bp - south - late | 46 | 46_africa_the_bp_south |
| 47 | soc - carbon - soil - stocks - land | 45 | 47_soc_carbon_soil_stocks |
| 48 | soil - organic - carbon - soil organic - soils | 45 | 48_soil_organic_carbon_soil organic |
| 49 | wetlands - constructed - wetland - treatment - phosphorus | 43 | 49_wetlands_constructed_wetland_treatment |
| 50 | microbial - rare - soil - bacterial - diversity | 43 | 50_microbial_rare_soil_bacterial |
| 51 | litter - decomposition - mass loss - litter decomposition - mass | 39 | 51_litter_decomposition_mass loss_litter decomposition |
| 52 | co2 - pco2 - emissions - carbon - ch4 | 39 | 52_co2_pco2_emissions_carbon |
| 53 | soc - carbon - wetland - wetlands - soil | 39 | 53_soc_carbon_wetland_wetlands |
| 54 | countries - emissions - emission - to - climate | 38 | 54_countries_emissions_emission_to |
| 55 | services - ecosystem - ecosystem services - es - pes | 37 | 55_services_ecosystem_ecosystem services_es |
| 56 | catalyst - peat - pyrolysis - char - catalysts | 37 | 56_catalyst_peat_pyrolysis_char |
| 57 | clearfelling - water - phosphorus - buffer - nutrient | 35 | 57_clearfelling_water_phosphorus_buffer |
| 58 | forest - forests - trees - tree - stands | 35 | 58_forest_forests_trees_tree |
| 59 | carbon - climate - atmosphere - earth - carbon cycle | 34 | 59_carbon_climate_atmosphere_earth |
| 60 | tephra - volcanic - cryptotephra - eruptions - tephras | 34 | 60_tephra_volcanic_cryptotephra_eruptions |
| 61 | testate - arcellinida - coi - species - amoebae | 34 | 61_testate_arcellinida_coi_species |
| 62 | methane - methanogenic - community - methanogen - methanogens | 34 | 62_methane_methanogenic_community_methanogen |
| 63 | consolidation - soil - embankment - road - the | 33 | 63_consolidation_soil_embankment_road |
| 64 | species - spider - bogs - spiders - habitat | 33 | 64_species_spider_bogs_spiders |
| 65 | evaporation - energy - model - was - the | 33 | 65_evaporation_energy_model_was |
| 66 | phosphorus - catchment - in - tp - concentrations | 33 | 66_phosphorus_catchment_in_tp |
| 67 | co2 - ch4 - marsh - wetland - emissions | 33 | 67_co2_ch4_marsh_wetland |
| 68 | runoff - peat - channels - flow - catchment | 33 | 68_runoff_peat_channels_flow |
| 69 | nutrient - nitrogen - fertilizer - litter - of | 32 | 69_nutrient_nitrogen_fertilizer_litter |
| 70 | brazil - bp - the - of - in the | 31 | 70_brazil_bp_the_of |
| 71 | tsunami - holocene - the - volcanic - deposits | 30 | 71_tsunami_holocene_the_volcanic |
| 72 | climate change - change - climate - biodiversity - ecosystem | 30 | 72_climate change_change_climate_biodiversity |
| 73 | gpr - resistivity - radar - penetrating - penetrating radar | 29 | 73_gpr_resistivity_radar_penetrating |
| 74 | holocene - the - andes - and - bp | 29 | 74_holocene_the_andes_and |
| 75 | permafrost - soc - soil - soils - arctic | 28 | 75_permafrost_soc_soil_soils |
| 76 | policy - forest - owners - arguments - forest owners | 28 | 76_policy_forest_owners_arguments |
| 77 | bog - poland - peatland - europe - ca | 28 | 77_bog_poland_peatland_europe |
| 78 | ch4 - oxidation - methane - paddy - aom | 28 | 78_ch4_oxidation_methane_paddy |
| 79 | enzyme - enzymes - eea - soil - activities | 28 | 79_enzyme_enzymes_eea_soil |
| 80 | channel - catchment - flow - bends - model | 28 | 80_channel_catchment_flow_bends |
| 81 | soil - soil science - science - of soil - eu | 27 | 81_soil_soil science_science_of soil |
| 82 | pahs - pah - polycyclic aromatic - polycyclic - aromatic | 27 | 82_pahs_pah_polycyclic aromatic_polycyclic |
| 83 | n2o - n2o emissions - emissions - emission - nitrous | 26 | 83_n2o_n2o emissions_emissions_emission |
| 84 | peat water - adsorption - electrocoagulation - brackish peat - brackish peat water | 26 | 84_peat water_adsorption_electrocoagulation_brackish peat |
| 85 | mangrove - mangroves - carbon - coastal - b2 | 26 | 85_mangrove_mangroves_carbon_coastal |
| 86 | species - retention - alien - richness - forests | 25 | 86_species_retention_alien_richness |
| 87 | colloidal - river - elements - fe - colloids | 25 | 87_colloidal_river_elements_fe |
| 88 | sulfate - sulfur - 34s - peat - sulphur | 24 | 88_sulfate_sulfur_34s_peat |
| 89 | caribou - habitat - woodland caribou - populations - wolf | 24 | 89_caribou_habitat_woodland caribou_populations |
| 90 | food - agriculture - food system - change - covid 19 | 24 | 90_food_agriculture_food system_change |
| 91 | microbial - community - microbial community - communities - bacterial | 23 | 91_microbial_community_microbial community_communities |
| 92 | sorption - cu - ions - ii - cu ii | 22 | 92_sorption_cu_ions_ii |
| 93 | fire - fires - algorithm - frp - hotspot | 22 | 93_fire_fires_algorithm_frp |
| 94 | choice - wtp - preferences - valuation - choice experiment | 22 | 94_choice_wtp_preferences_valuation |
| 95 | nematodes - earthworm - soil - food - nematode | 22 | 95_nematodes_earthworm_soil_food |
| 96 | conservation - orangutan - habitat - forest - species | 21 | 96_conservation_orangutan_habitat_forest |
| 97 | cushion - accumulation - peat - amazonian - vegetation | 21 | 97_cushion_accumulation_peat_amazonian |
| 98 | ch4 - oxidation - ch4 oxidation - uptake - ch4 uptake | 20 | 98_ch4_oxidation_ch4 oxidation_uptake |
| 99 | tidal - sediment - coastal - delta - the | 20 | 99_tidal_sediment_coastal_delta |
| 100 | emissions - co2 - ghg - n2o - table | 20 | 100_emissions_co2_ghg_n2o |
| 101 | methane - ph - cytochrome - methanotrophs - acetic acid | 20 | 101_methane_ph_cytochrome_methanotrophs |
| 102 | patterns - model - self organization - evolutionary - self | 20 | 102_patterns_model_self organization_evolutionary |
| 103 | nitrogen - denitrification - n2o - soil - n2 | 20 | 103_nitrogen_denitrification_n2o_soil |
| 104 | birch - rotation - biomass - buds - biomass production | 19 | 104_birch_rotation_biomass_buds |
| 105 | fire - wildfire - fires - wildfires - health | 19 | 105_fire_wildfire_fires_wildfires |
| 106 | grazing - heathland - heather - moorland - england | 19 | 106_grazing_heathland_heather_moorland |
| 107 | emissions - fire - burning - fire emissions - biomass burning | 19 | 107_emissions_fire_burning_fire emissions |
| 108 | peat - landslides - failure - of peat - peat compaction | 18 | 108_peat_landslides_failure_of peat |
| 109 | biochar - straw - soil - fe - bc | 18 | 109_biochar_straw_soil_fe |
| 110 | ecosystem - respiration - carbon - ecosystem respiration - meadow | 17 | 110_ecosystem_respiration_carbon_ecosystem respiration |
| 111 | wetland - wetlands - risk - of wetland - the wetland | 17 | 111_wetland_wetlands_risk_of wetland |
| 112 | dom - thm - groundwater - molecular - organic | 17 | 112_dom_thm_groundwater_molecular |
| 113 | geochemistry - landscape geochemistry - rocks - peat - mafic | 17 | 113_geochemistry_landscape geochemistry_rocks_peat |
| 114 | tundra - ch4 - n2o - fluxes - antarctic | 16 | 114_tundra_ch4_n2o_fluxes |
| 115 | cellulose - sphagnum - isotopic - isotope - δ18ocel | 16 | 115_cellulose_sphagnum_isotopic_isotope |
| 116 | solute - transport - chloride - peat - pore | 16 | 116_solute_transport_chloride_peat |
| 117 | charcoal - fire - fires - holocene - fire history | 15 | 117_charcoal_fire_fires_holocene |
| 118 | ghg - agricultural - dairy - abatement - emissions | 15 | 118_ghg_agricultural_dairy_abatement |
| 119 | palm - oil - palm oil - sustainability - industry | 15 | 119_palm_oil_palm oil_sustainability |
| 120 | humic - humic substances - substances - acids - fluorescence | 15 | 120_humic_humic substances_substances_acids |
| 121 | canopy - ndvi - pri - lue - phenological | 15 | 121_canopy_ndvi_pri_lue |
| 122 | pollen - bog - peat - the - human impact | 15 | 122_pollen_bog_peat_the |
| 123 | marshes - tidal - marshes are - salt - or | 15 | 123_marshes_tidal_marshes are_salt |
| 124 | soil - prediction - mapping - covariates - dsm | 15 | 124_soil_prediction_mapping_covariates |
| 125 | si - of si - silicon - biogenic - protozoic | 14 | 125_si_of si_silicon_biogenic |
| 126 | et - evapotranspiration - le - wetland - rice | 14 | 126_et_evapotranspiration_le_wetland |
| 127 | forest - finland - forests - stock - management | 14 | 127_forest_finland_forests_stock |
| 128 | iodine - 129i - sorption - iodide - the sorption | 14 | 128_iodine_129i_sorption_iodide |
| 129 | palm - oil - palm oil - smallholders - certification | 14 | 129_palm_oil_palm oil_smallholders |
| 130 | dndc - model - models - soil - carbon | 14 | 130_dndc_model_models_soil |
| 131 | snow - thaw - cover - sca - data | 14 | 131_snow_thaw_cover_sca |
| 132 | stx2 - microbiota - gut - gut microbiota - microbial | 13 | 132_stx2_microbiota_gut_gut microbiota |
| 133 | dom - doc - organic - dissolved organic - of dom | 13 | 133_dom_doc_organic_dissolved organic |
| 134 | forest - cbm - ontario - cfs3 - cbm cfs3 | 13 | 134_forest_cbm_ontario_cfs3 |
| 135 | wind - wind farms - farms - onshore - onshore wind | 13 | 135_wind_wind farms_farms_onshore |
| 136 | uranium - of uranium - 232th - th - ar | 13 | 136_uranium_of uranium_232th_th |
| 137 | groundwater - springs - spring - gdes - discharge | 13 | 137_groundwater_springs_spring_gdes |
| 138 | fire - forest - boreal - burned - fires | 13 | 138_fire_forest_boreal_burned |
| 139 | metal - metals - cd - sediments - zn | 13 | 139_metal_metals_cd_sediments |
| 140 | slr - sea level - coastal - sea - sea level rise | 13 | 140_slr_sea level_coastal_sea |
| 141 | damo - methane - anaerobic - oxidation - aom | 12 | 141_damo_methane_anaerobic_oxidation |
| 142 | temperature - microbial - soil - co2 - pd | 12 | 142_temperature_microbial_soil_co2 |
| 143 | soil - respiration - root - soil respiration - enchytraeid | 12 | 143_soil_respiration_root_soil respiration |
| 144 | kerp - fusiformisporites - permian - genus - flora | 11 | 144_kerp_fusiformisporites_permian_genus |
| 145 | dust - dust deposition - dust sources - deposition - atmospheric dust | 11 | 145_dust_dust deposition_dust sources_deposition |
| 146 | methane - sources - ch4 - les - de | 11 | 146_methane_sources_ch4_les |
| 147 | n2o - n2o emissions - emissions - permafrost - n2o fluxes | 11 | 147_n2o_n2o emissions_emissions_permafrost |
| 148 | australia - mis - record - ka - crater | 11 | 148_australia_mis_record_ka |
| 149 | oc - fjords - fjord - lakes - of oc | 10 | 149_oc_fjords_fjord_lakes |
| 150 | fe - reduction - fe iii - sr10 - iron | 10 | 150_fe_reduction_fe iii_sr10 |
| 151 | loading - eutrophication - nitrogen - coastal - phytoplankton | 10 | 151_loading_eutrophication_nitrogen_coastal |
| 152 | model - wetlands - groundwater - water - the wetlands | 10 | 152_model_wetlands_groundwater_water |
| 153 | co2 - soil - co2 efflux - soil co2 efflux - soil co2 | 10 | 153_co2_soil_co2 efflux_soil co2 efflux |
| 154 | transfer - transfer functions - transfer function - testate - functions | 10 | 154_transfer_transfer functions_transfer function_testate |
| 155 | peat - spain - bog - matter - autofluorescent | 10 | 155_peat_spain_bog_matter |
| 156 | isbas - insar - subsidence - motion - deformation | 10 | 156_isbas_insar_subsidence_motion |

</details>
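
The `Label` column in the table above appears to follow the pattern `<topic ID>_<keyword>_<keyword>_...`, with multi-word keywords keeping their spaces. A minimal sketch for splitting such a label, assuming that no keyword itself contains an underscore (the helper name `parse_label` is hypothetical):

```python
from typing import List, Tuple


def parse_label(label: str) -> Tuple[int, List[str]]:
    """Split a label such as '4_hg_mercury_mehg_of hg' into ID and keywords."""
    topic_id, *keywords = label.split("_")
    return int(topic_id), keywords


print(parse_label("15_doc_catchments_carbon_organic carbon"))
print(parse_label("-1_and_the_of_in"))
```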

## Training hyperparameters

* calculate_probabilities: False
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 3)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 30
* verbose: False
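
These values correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how a model with the same settings might be configured. It is an assumption about how the listed values map onto the constructor, not the author's training script, and the embedding model, UMAP, and HDBSCAN settings actually used are not recorded in this card. The names `TRAINING_PARAMS` and `build_model` are illustrative.

```python
# Hyperparameters copied from the list above.
TRAINING_PARAMS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 3),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 30,
    "verbose": False,
}


def build_model(**overrides):
    """Create an unfitted BERTopic model with the listed settings.

    Keyword arguments override individual entries of TRAINING_PARAMS.
    """
    # Imported lazily so the settings can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(**{**TRAINING_PARAMS, **overrides})
```

Calling `build_model().fit_transform(docs)` on a list of documents would then train a new model. The resulting topics will differ from the ones listed here unless the same documents and sub-models are used.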

## Framework versions

* Numpy: 1.22.4
* HDBSCAN: 0.8.29
* UMAP: 0.5.3
* Pandas: 1.5.3
* Scikit-Learn: 1.2.2
* Sentence-transformers: 2.2.2
* Transformers: 4.30.2
* Numba: 0.56.4
* Plotly: 5.13.1
* Python: 3.10.12
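
Loading a saved model can be sensitive to library versions, so it may help to compare the local environment against the list above. The following sketch uses only the standard library. The mapping from the names in this card to PyPI distribution names (for example `UMAP` to `umap-learn`) is an assumption, and `EXPECTED_VERSIONS` and `version_mismatches` are illustrative names.

```python
from importlib import metadata

# Versions listed above, keyed by (assumed) PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.30.2",
    "numba": "0.56.4",
    "plotly": "5.13.1",
}


def version_mismatches(expected=None):
    """Return {package: (expected, installed)} for packages that differ.

    `installed` is None when the package is not installed.
    """
    expected = EXPECTED_VERSIONS if expected is None else expected
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in version_mismatches().items():
    print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; it only points to where the environment differs from the one used for training.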