Topic Modelling with Latent Dirichlet Allocation: How to pre-process data and tune your model. New tutorial.

Ólavur Mortensen gensim, Machine Learning, Open Source, programming, Student Incubator

If you’ve learned how to train topic models in Gensim but aren’t getting satisfying results, we have a new tutorial on GitHub that will help you get on the right track. It focuses mainly on pre-processing text data for the LDA model, and also gives some tips on how to set the model’s parameters.
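
To give a flavour of the kind of pre-processing that tends to matter for LDA, here is a minimal sketch in plain Python. It is not taken from the tutorial: the function names, the toy documents and the thresholds are all assumptions made for illustration. The `no_below` and `no_above` parameters are named after the arguments of Gensim's `Dictionary.filter_extremes`, which applies the same kind of document-frequency filter.

```python
import re
from collections import Counter

# Assumption: purely alphabetic tokens are enough for this toy example,
# so numbers and punctuation are discarded by the pattern itself.
TOKEN_RE = re.compile(r"[a-z]+")


def tokenize(text, min_len=2):
    """Lowercase the text, keep alphabetic tokens, drop very short ones."""
    return [tok for tok in TOKEN_RE.findall(text.lower()) if len(tok) >= min_len]


def filter_vocabulary(tokenized_docs, no_below=2, no_above=0.5):
    """Keep tokens found in at least `no_below` documents and in at most
    a fraction `no_above` of all documents."""
    doc_freq = Counter()
    for doc in tokenized_docs:
        doc_freq.update(set(doc))  # count each token once per document
    n_docs = len(tokenized_docs)
    return {
        tok
        for tok, df in doc_freq.items()
        if df >= no_below and df / n_docs <= no_above
    }


def to_bow(doc, vocab):
    """Bag-of-words counts for one document, restricted to `vocab`."""
    return Counter(tok for tok in doc if tok in vocab)


# Hypothetical toy corpus, only here to exercise the functions.
docs = [
    "Topic models find themes in a collection of documents.",
    "LDA is a topic model; tuning LDA takes some care.",
    "Pre-processing text matters: 42 stray numbers add noise to documents.",
    "Tuning takes patience, and text inside the documents matters.",
]

tokenized = [tokenize(d) for d in docs]
vocab = filter_vocabulary(tokenized, no_below=2, no_above=0.5)
bows = [to_bow(doc, vocab) for doc in tokenized]

print(sorted(vocab))
print(bows[1])
```

With a corpus this small the thresholds are only illustrative. On real data, very rare tokens add noise and very common ones carry little topical signal, so both cut-offs are worth tuning alongside the model's own parameters.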

Feel free to continue the discussion on the Gensim mailing list, and share your thoughts and experience with data pre-processing, training and tuning the LDA model.