Chinmaya’s Google Summer of Code 2017 Live-Blog: a Chronicle of Integrating Gensim with scikit-learn and Keras

Chinmaya Pancholi gensim, Student Incubator

2nd September, 2017 The final blog post in the GSoC 2017 series, summarising all the work that I did this summer, can be found here. 15th August, 2017 During the last two weeks, I worked primarily on adding a Python implementation of Facebook Research’s fastText model to Gensim. At the same time, I was completing the remaining tasks for adding scikit-learn API for …

Parul’s Google Summer of Code 2017 Live-Blog: a chronicle of adding training and topic visualizations in gensim

Parul Sethi gensim, Student Incubator

19th August 2017 For the last phase of my project, I’ll be adding a visualization that attempts to overcome some of the limitations of the topic model visualizations already available. Current visualizations focus more on topics or topic-term relations, leaving little scope to explore the document entity comprehensively. I’d work on an interface which would allow us to interactively …

Google Summer of Code 2017 – Performance improvement in Gensim and fastText

Prakhar Pratyush gensim, Student Incubator

July 20, 2017 This week, I’ve mostly worked on implementing native unsupervised fastText (PR #1482) in gensim. It’s quite challenging: I had to look into the fastText C++ code and read the research paper to properly understand how it works, and then figure out how it resembles the word2vec code. After lots of discussion with mentors, we …

Google Summer of Code 2017 – Week 1 of Integrating Gensim with scikit-learn and Keras

Chinmaya Pancholi gensim, Student Incubator

This is my first post as part of Google Summer of Code 2017 working with Gensim. I will be working on the project ‘Gensim integration with scikit-learn and Keras’ this summer. I stumbled upon Gensim while working on a project which used the Word2Vec model. I was looking for functionality to suggest words semantically similar to a given input word and Gensim’s …

Archive of RRP Podcast Episodes

Radim Řehůřek podcast

Subscribe with RSS, iTunes, YouTube, Stitcher, SoundCloud. Episode #4: Leonid Boytsov on kNN search and information retrieval. Leo, a PhD researcher from the Language Technologies Institute at Carnegie Mellon University, talks about fast approximate search in modern information retrieval. How does his NMSLIB library compare to Facebook's FAISS and Spotify's Annoy? Episode #3: Andy Müller on scikit-learn ...

Text Summarization in Python: Extractive vs. Abstractive techniques revisited

Pranay, Aman and Aayush gensim, Student Incubator, summarization

This blog post is a gentle introduction to text summarization and can serve as a practical summary of the current landscape. It describes how we, a team of three students in the RaRe Incubator programme, have experimented with existing algorithms and Python tools in this domain. We compare modern extractive methods like LexRank, LSA, Luhn and Gensim’s existing TextRank summarization module …

Gensim switches to semantic versioning

Lev Konstantinovskiy gensim, Open Source

Starting with release 1.0.0, Gensim adopts semantic versioning. Time has gone by in a flash, but Gensim has reached maturity. It's been cited in nearly 500 academic papers, used commercially in dozens of companies, been the focus of many coding sprints and meetups, and has generally withstood the test of time. Between the continued Gensim support by our parent company, rare-technologies.com, and our open Student ...