Danila2016/news-classification

My code for the news scraping / topic prediction Kaggle competition.

You can find scraping code (better than mine) in the competition's Code section.

Before running train_kernel_svm.py, build the Cython extension:

python setup.py build_ext --inplace

Contents:

  • train_kernel_svm.py - approach #1
  • train_rubert.py - approach #2 (trained on different data)
  • fusion.py - late fusion of both approaches (gives +0.5% over approach #2 alone)
  • fix_known_documents.py - sets the labels of test documents that also appear in the training set (needs 4 GB RAM)
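
For readers unfamiliar with late fusion, here is a minimal sketch of one common form: a weighted average of the per-class scores from two models, followed by an argmax. It is not taken from fusion.py. The function names, the example scores, and the weight of 0.3 are all assumptions made for illustration, and it assumes both models output comparable per-class probabilities.

```python
def late_fusion(probs_a, probs_b, weight_a=0.5):
    """Blend two lists of per-document class probabilities.

    weight_a is the (hypothetical) weight given to the first model;
    the second model gets 1 - weight_a.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same documents")
    fused = []
    for row_a, row_b in zip(probs_a, probs_b):
        fused.append([weight_a * a + (1.0 - weight_a) * b
                      for a, b in zip(row_a, row_b)])
    return fused


def predict_labels(fused):
    """Return the index of the highest fused score for each document."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


# Made-up scores for two documents and three classes.
svm_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]   # stand-in for approach #1
bert_probs = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]  # stand-in for approach #2

fused = late_fusion(svm_probs, bert_probs, weight_a=0.3)
labels = predict_labels(fused)
```

In practice the weight would be tuned on a held-out validation set rather than fixed by hand.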

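The idea behind fix_known_documents.py, reusing the training label when a test document also appears in the training set, can be sketched as a lookup keyed by a hash of the normalized text. This is an illustrative sketch, not the script's actual code. The normalization rule, the use of MD5 as a key, and all names below are assumptions.

```python
import hashlib


def text_key(text):
    """Hash of the lowercased, whitespace-collapsed text (assumed normalization)."""
    normalized = " ".join(text.lower().split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def fix_known_documents(train_texts, train_labels, test_texts, predicted_labels):
    """Replace a prediction with the training label when the test text was seen in training."""
    known = {}
    for text, label in zip(train_texts, train_labels):
        # Keep the first label seen for a given text.
        known.setdefault(text_key(text), label)
    return [known.get(text_key(text), predicted)
            for text, predicted in zip(test_texts, predicted_labels)]


# Made-up example data.
train_texts = ["Rates rise again", "Team wins cup"]
train_labels = ["economy", "sport"]
test_texts = ["team  wins CUP", "New phone released"]
predicted = ["politics", "tech"]

fixed = fix_known_documents(train_texts, train_labels, test_texts, predicted)
```

Storing fixed-size hashes instead of the full texts keeps the lookup table small, which matters when the training set is large.
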
Tested on macOS and Kaggle.
