Seminar on Internet Technologies (Winter 2016/2017): Difference between revisions

Line 127: Line 127:
|-
|-
|-
|-
| '''Learning from Imbalanced Data'''   
| '''Learning from Imbalanced Data (assigned to Oleh Astappiev)'''   
When training classifiers, a commonly encountered problem is imbalanced data. In the case of a binary classifier, for instance, this means that one class is heavily overrepresented in the available data. Training classifiers on this kind of dataset has been a long-standing problem. In this work, your task is to i) precisely introduce the imbalanced data problem, ii) discuss the state of the art of approaches for mitigating it (from the perspective of both learning algorithms and data manipulation techniques), and iii) identify which issues remain open today. Note that this topic requires a background in data science, in particular in classification algorithms, and a comparatively high reading effort.
| [https://www.net.informatik.uni-goettingen.de/people/David_Koll David Koll]