Text Summarization with Gensim

Ólavur Mortensen, programming

Text summarization is one of the newest and most exciting fields in NLP, allowing developers to quickly find meaning and extract keywords and phrases from documents. RaRe Technologies’ newest intern, Ólavur Mortensen, walks the user through the text summarization features in Gensim.

The Gensim implementation is based on the popular “TextRank” algorithm and was recently contributed by the good people from the Engineering Faculty of the University of Buenos Aires. This is the first of many publications from Ólavur, and we expect to continue our educational apprenticeship program with students like Ólavur to help them showcase their talents.

The following example was written in IPython Notebook (recently renamed “Jupyter”). Feel free to install the Gensim package and step through the tutorial. The source code for the notebook is available under gensim/docs/notebooks.
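For readers curious about what a TextRank-style summarizer does before opening the notebook, here is a minimal pure-Python sketch of the general idea: build a graph whose nodes are sentences, weight the edges by how similar two sentences are, and score the nodes with a PageRank-style iteration. This is not Gensim's implementation. The naive sentence splitter, the word-overlap similarity, and all function names (`split_sentences`, `tokenize`, `similarity`, `rank_sentences`, `summarize_sketch`) are assumptions made for illustration only.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(words_a, words_b):
    """Shared-word count normalised by sentence lengths (illustrative choice)."""
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # keeps the log-based denominator strictly positive
    overlap = len(words_a & words_b)
    return overlap / (math.log(len(words_a)) + math.log(len(words_b)))


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration over the similarity graph."""
    token_sets = [tokenize(s) for s in sentences]
    n = len(sentences)
    # weights[i][j] is the edge weight between sentence i and sentence j.
    weights = [
        [0.0 if i == j else similarity(token_sets[i], token_sets[j]) for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge between j and i.
            incoming = sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores


def summarize_sketch(text, num_sentences=2):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(top[:num_sentences])]


sample = (
    "Gensim is a Python library for topic modelling. "
    "Gensim includes a summarization module for Python users. "
    "The summarization module ranks sentences with a graph algorithm. "
    "Bananas are yellow."
)
summary = summarize_sketch(sample, num_sentences=2)
print(summary)
```

A sentence that shares no words with the rest of the text receives no score from its neighbours, so it stays at the baseline value and is unlikely to make it into the summary. A real implementation would use a proper tokenizer, stop-word filtering, and a more careful similarity function.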

Are you also interested in sharpening your open source skills or contributing to open source projects? Get in touch using the contact form below.

Comments 6

  1. turja chaudhuri

    Cannot install gensim inside an Anaconda Python 2.7 setup, even with pip. Any ideas?
    Error stack trace:

    File "C:\Users\turjac591\AppData\Local\Continuum\Anaconda2\lib\site-packages\gensim\__init__.py", line 6, in <module>
    from gensim import parsing, matutils, interfaces, corpora, models, similarities, summarization

    File "C:\Users\turjac591\AppData\Local\Continuum\Anaconda2\lib\site-packages\gensim\models\__init__.py", line 7, in <module>
    from .coherencemodel import CoherenceModel

    File "C:\Users\turjac591\AppData\Local\Continuum\Anaconda2\lib\site-packages\gensim\models\coherencemodel.py", line 23, in <module>
    from gensim import interfaces

    ImportError: cannot import name interfaces

  2. Ólavur Mortensen (Post Author)

    To summarize multiple documents, you could just concatenate the documents and run the algorithm on the result. There may be some caveats to that method, however; I haven’t tried it.

    You should ask that question on the Gensim mailing list and see if someone else has an idea about it (https://groups.google.com/forum/#!forum/gensim).

  3. Pingback: Text Summarization | IMPULSE
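
The concatenation approach suggested in the author's reply above can be sketched roughly as follows. The helper name `summarize_collection` and its `separator` parameter are hypothetical, and the summarizer is passed in as a plain callable so the snippet does not depend on any particular library version. Like the original suggestion, this is untested on real corpora.

```python
def summarize_collection(documents, summarizer, separator="\n\n"):
    """Join several documents into one text and summarize it in a single pass.

    `summarizer` is any callable that takes a string and returns a summary.
    """
    # Drop empty or whitespace-only documents so they add nothing to the text.
    cleaned = [doc.strip() for doc in documents if doc and doc.strip()]
    if not cleaned:
        raise ValueError("no non-empty documents to summarize")
    combined = separator.join(cleaned)
    return summarizer(combined)


# Stand-in summarizer for demonstration: keep only the first line of the text.
def first_line(text):
    return text.splitlines()[0]


docs = ["First document. It has two sentences.", "   ", "Second document."]
print(summarize_collection(docs, first_line))
```

With a Gensim release that still ships the `gensim.summarization` module (it was removed in Gensim 4.0), the `summarize` function from that module could be passed as `summarizer`. One likely caveat is that a long document contributes more sentences than a short one, so it may dominate the combined summary.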
