Gensim at PyCon France 2016 | RARE Technologies

PyCon France 2016 was held in Rennes from the 13th to the 16th of October at Telecom Bretagne.

We were lucky to have beautiful weather on all the days we were there.

Gensim had a presence on both conference days: Bhargav Srinivasa presented his talk, “Topic Modelling with Python and Gensim”, on day 1, and I presented my workshop, “Twitter User Classification with Gensim and Scikit-learn” (it had a pretty boring-sounding title before this!), on day 2.

Sprint day (Friday)

I arrived in Rennes on the second day of the sprints and got the opportunity to interact with the maintainers of libraries such as CPython and maplearn. During the sprint, I added the new MLPClassifier and MLPRegressor released in scikit-learn 0.18 to maplearn. There was also great Japanese food for lunch! There was a party planned for Friday evening too, but unfortunately I couldn’t attend it 🙁

Day 1 (Saturday)

I sat in on a few talks before Bhargav’s. Most talks were in French; however, I could still follow most of them because of the common language of Python that we all spoke! Bhargav’s talk was at noon and drew a great audience. The talk went really well but, more importantly, the questions asked afterwards were very interesting. Hopefully we answered all of them well too! Even after the talk, people showed a lot of interest in gensim and how it could cater to their use cases. We were glad to have introduced gensim to an audience that didn’t know about it before!

Bhargav presenting at PyCon France 2016

My workshop

My workshop involved building an end-to-end machine learning pipeline: a Twitter user classification model built right from scratch. This covers everything from building our own custom dataset, through preprocessing, to finally deploying the machine learning model. I explored different classification techniques, ranging from a simple count-based classification to a more complicated classification using topic modeling and GloVe vectors. I also played around a lot with the topic coherence pipeline in gensim and explored concepts such as filtering an LDA model (the next step of “LDA as LSI”, which I talked about in my last post) and using topic coherence to choose the optimal number of topics for an LDA model, among many other use cases. You can find the GitHub repository for my workshop here.
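To give a flavour of the “simple count-based classification” end of that range, here is a minimal sketch in plain Python. It is not the code from the workshop repository: the `CountNaiveBayes` class, the `tokenize` helper, the labels and the toy texts are all hypothetical and made up for illustration, and the technique shown is a multinomial Naive Bayes over raw word counts with add-one smoothing. In practice a library such as scikit-learn would do this job.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


class CountNaiveBayes:
    """Multinomial Naive Bayes over raw word counts (hypothetical helper)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Start from the log prior of the class.
            score = math.log(n_docs / total_docs)
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen during training
                # Add-one (Laplace) smoothed log likelihood of the word.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data standing in for user tweets and their classes.
train_texts = [
    "new python release and gensim tutorial",
    "training a topic model in python",
    "great goal in the football match tonight",
    "the team won the match after extra time",
]
train_labels = ["tech", "tech", "sports", "sports"]

model = CountNaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("python topic model tutorial")
print(prediction)
```

The more complicated approaches mentioned above replace the raw counts with richer features such as topic distributions or word vectors, while the classification step stays conceptually the same.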

Since this was a 90-minute workshop, I didn’t expect as much attendance as at Bhargav’s talk, but I still got a decent number of people! The audience participation was great and, again, the questions were intriguing. I do hope I was able to explain all the concepts in my workshop clearly, and that the workshop helps people solve their own research and business problems better in the future.

Conclusion

All in all, I had a wonderful time at PyCon France 2016! Kudos to the organisers for running it so well! The people, venue, food, organisation… basically everything was amazing! I hope our presentations did justice both to the wonderful event and to the brilliant features of gensim!

Cheers to PyCon France 2016 and here’s to many more PyCons with gensim!