eBiltegia

File
GSMOTE_v2.pdf (275.8 KB)
Title
Generalized SMOTE: A universal generation oversampling technique for all data types in imbalanced learning
Authors
Cernuda, Carlos
Reguera-Bakhache, Daniel
Aguirre, Aitor
Iturbe Urretxa, Mikel
Garitano, Iñaki
Zurutuza, Urko
Publication date
2021
Research group
Análisis de datos y ciberseguridad
Version
Postprint
Document type
Conference contribution
Language
English
Rights
© The authors, 2021
Access
Open access
URI
https://hdl.handle.net/20.500.11984/13905
Identifier
https://caepia20-21.uma.es/inicio_files/caepia20-21-actas.pdf
Published in
19th Conference of the Spanish Association for Artificial Intelligence (CAEPIA). Málaga, 2021
Publisher
CAEPIA
Keywords
Imbalanced Learning
Oversampling Techniques
Abstract
A common problem in classification tasks is class imbalance, which occurs when one or more classes are heavily underrepresented compared to the rest; those minority classes are usually the ones of interest. A natural solution is to correct the imbalance with sampling methods, the most widely used being the Synthetic Minority Oversampling TEchnique (SMOTE). Like all other oversampling techniques, SMOTE relies on distances or similarities to focus on the neighborhoods of minority samples when generating synthetic samples, so it is meant for purely numerical data. Nevertheless, it is very common to collect categorical data or to discretize numeric attributes as a preprocessing step, in which case one is limited to random sampling approaches to correct the imbalance. Some approaches have been proposed to deal with mixed-type or purely categorical data, but they ignore part of the information in the samples or end up being almost random. We propose GSMOTE, a generalization of the SMOTE method that is suitable for any data type. To determine neighborhoods, the distance between samples is obtained by transforming Gower's General Similarity Coefficient into a novel General Distance Coefficient, in which the part that measures similarity between categories of categorical variables is replaced by a recently presented similarity measure called the Variable Entropy measure, inspired by Shannon's entropy. GSMOTE has been tested on six public imbalanced datasets with different characteristics and imbalance levels.
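The neighborhood step described in the abstract can be illustrated with a minimal sketch of a Gower-style distance over mixed numeric and categorical features. This is not the authors' implementation: the function names `gower_distance` and `nearest_minority_neighbors` are hypothetical, and categorical features are compared by simple matching (the classical Gower choice) rather than by the Variable Entropy measure, whose definition is not given in this record.

```python
from typing import List, Sequence, Union

Value = Union[float, str]


def gower_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    is_categorical: Sequence[bool],
    ranges: Sequence[float],
) -> float:
    """Gower-style distance in [0, 1] between two mixed-type samples.

    Assumption: categorical features use simple matching, not the
    Variable Entropy measure mentioned in the abstract.
    """
    total = 0.0
    for x, y, cat, rng in zip(a, b, is_categorical, ranges):
        if cat:
            # Categorical: 0 if the categories match, 1 otherwise.
            total += 0.0 if x == y else 1.0
        else:
            # Numeric: absolute difference normalised by the feature range.
            total += abs(x - y) / rng if rng > 0 else 0.0
    return total / len(a)


def nearest_minority_neighbors(
    index: int,
    minority: Sequence[Sequence[Value]],
    is_categorical: Sequence[bool],
    ranges: Sequence[float],
    k: int,
) -> List[int]:
    """Indices of the k minority samples closest to minority[index]."""
    dists = [
        (gower_distance(minority[index], s, is_categorical, ranges), j)
        for j, s in enumerate(minority)
        if j != index
    ]
    dists.sort()
    return [j for _, j in dists[:k]]


# Toy minority class: one numeric feature (range 4.0) and one categorical.
minority = [(1.0, "red"), (2.0, "red"), (5.0, "blue"), (1.5, "blue")]
is_categorical = [False, True]
ranges = [4.0, 0.0]  # the range entry is ignored for categorical features

print(gower_distance(minority[0], minority[1], is_categorical, ranges))
print(nearest_minority_neighbors(0, minority, is_categorical, ranges, k=2))
```

In a SMOTE-like procedure, neighbors found this way would be the candidates from which synthetic minority samples are generated; how GSMOTE builds those samples for categorical features is not described here, so the sketch stops at the neighborhood search.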
Collections
  • Conferences - Engineering [436]
