2nd September, 2017 The final blog post in the GSoC 2017 series summarising all the work that I did this summer can be found here. 15th August, 2017 During the last two weeks, I have been working primarily on adding a Python implementation of Facebook Research’s fastText model to Gensim. I was also simultaneously working on completing the tasks left for adding scikit-learn API for …
Parul’s Google Summer of Code 2017 Live-Blog: a chronicle of adding training and topic visualizations in gensim
19th August 2017 For the last phase of my project, I’ll be adding a visualization that attempts to overcome some of the limitations of the topic model visualizations already available. Current visualizations focus more on topics or topic-term relations, leaving no scope to comprehensively explore the document entity. I’ll work on an interface which would allow us to interactively …
Google Summer of Code 2017 – Performance improvement in Gensim and fastText
July 20, 2017 This week, I’ve mostly worked on implementing native unsupervised fastText (PR #1482) in gensim. It’s quite challenging, as I had to look into the fastText C++ code and read the research paper to properly understand how it works, and then had to figure out its similarities with the word2vec code. After lots of discussion with mentors, we …
Google Summer of Code 2017 – Week 1 of Integrating Gensim with scikit-learn and Keras
This is my first post as part of Google Summer of Code 2017 working with Gensim. I will be working on the project ‘Gensim integration with scikit-learn and Keras’ this summer. I stumbled upon Gensim while working on a project which utilized the Word2Vec model. I was looking for functionality to suggest words semantically similar to a given input word and Gensim’s …
Dealing mergeytocin: how to run an open source sprint. Based on 8 gensim sprints in 5 countries in 12 months.
In this blog post I want to tell you what it takes to organize an open source coding sprint: find a venue, set an agenda and then actually run it.
RRP #3: Andy Müller on scikit-learn and open source
Archive of RRP Podcast Episodes
Text Summarization in Python: Extractive vs. Abstractive techniques revisited
This blog post is a gentle introduction to text summarization and can serve as a practical summary of the current landscape. It describes how we, a team of three students in the RaRe Incubator programme, experimented with existing algorithms and Python tools in this domain. We compare modern extractive methods like LexRank, LSA, Luhn and Gensim’s existing TextRank summarization module …