A free package for Corpus Linguistics and text analysis with Python.


It contains, among other things, tools for creating:

  • Corpora;
  • Frequency wordlists;
  • Keywords;
  • Concordance lines;
  • Collocates;
  • N-gram lists;
  • Dispersion plots;
  • Excel data files.
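
For readers new to these terms, the sketch below shows what a frequency wordlist and a set of concordance lines look like, using only the Python standard library. It does not use Kitconc and does not reflect its API: the function names (tokenize, wordlist, kwic), the horizon parameter and the simple English-only tokenization rule are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def wordlist(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()


def kwic(tokens, node, horizon=3):
    """Return keyword-in-context lines: up to `horizon` tokens on each side of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - horizon):i])
            right = " ".join(tokens[i + 1:i + 1 + horizon])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample text, not taken from any corpus.
text = "The cat sat on the mat. The dog sat on the log."
tokens = tokenize(text)
freq = wordlist(tokens)
conc = kwic(tokens, "sat", horizon=2)

for word, count in freq:
    print(word, count)
for left, node, right in conc:
    print(f"{left:>10} [{node}] {right}")
```

Kitconc's own tools cover the same ground for whole corpora; see the Guides section for the actual calls.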

The package is built on established packages for scientific research: numpy, nltk, pandas and xlsxwriter. All of them are available in the Anaconda Platform.

Advantages and Features

  • Speed: Kitconc runs faster on PyPy and Cython implementations of Python.
  • Cross-platform: It works with any operating system (Windows, Mac, Linux).
  • Multi-language: It is possible to use it with different languages.
  • Customizable: Use it with existing corpora or with your own texts and resources.
  • Part-of-speech (POS): It supports data annotation for POS.
  • Simple: It comes with a GUI for the most common analysis tools.
  • Powerful: Developers can use the library to develop their own tools to meet specific needs.
  • Familiar formats: Results are available as pandas DataFrames and can be exported to Excel spreadsheets. Compiled texts are saved as numpy arrays in binary files (.npy).

Overview of key aspects

Installation requirements

Kitconc requires a Python 3.6 (or later) installation along with:

  • numpy;
  • nltk;
  • pandas;
  • xlsxwriter.

Installing the Anaconda Platform is an easy way to get all of these at once.

Getting Started

Make sure you have Python 3.6 (or later) and the required packages.
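
One way to check both conditions before installing is a short script like the one below. It is a minimal sketch, not part of Kitconc: the helper names python_ok and missing_packages are hypothetical, and the package names are assumed to match their import names, which holds for the four packages listed under Installation requirements.

```python
import sys
from importlib.util import find_spec

# Import names of the packages Kitconc depends on.
REQUIRED = ("numpy", "nltk", "pandas", "xlsxwriter")


def python_ok(minimum=(3, 6)):
    """Return True if the running interpreter is at least version `minimum`."""
    return sys.version_info >= minimum


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if not python_ok():
    print("Python 3.6 or later is required; found", sys.version.split()[0])

missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

The check only looks the packages up and does not import them, so it runs quickly even when large packages such as pandas are installed.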

Quick install using pip

$ pip install kitconc

Or go to the Download page for more installation options.

Learn how to use the library

The Guides section has example code for the most important features.

How to cite Kitconc?

Moreira Filho, J. L. (2020). Kitconc v. 2.x. [software]. Available at: http://ilexis.net.br/kitconc/