Corpora

This page lists language resources for getting started with Kitconc: raw text collections and compiled corpora that are ready to import.

Raw texts

Name       Language    Files    Size
Job ads    en          95       97 KB
Bulas      pt          100      439 KB

Compiled corpora

Download and import existing corpora:

Name         Language   Files   Tokens        Types     Size     Original source
Mac-morpho   pt-br      3       945,751       54,235    4.5 MB   NILC
Leg2kids     pt-br      23      153,790,830   452,304   503 MB   NILC
Job ads      en         95      27,378        4,467     436 KB   Web
BNC baby     en         182     4,621,403     85,771    16 MB    OTA
Corona       en         5       3,331,238     84,124    16 MB    corpusdata.org
Portuguese   pt         1       10,918,906    216,962   42 MB    corpusdata.org
Movies       en         1       2,048,225     40,917    8 MB     corpusdata.org
COCA         en         8       10,858,881    192,235   43 MB    corpusdata.org

How to import corpora using the Kitconc GUI

1. Open the Kitconc GUI;
2. Click the 'File' menu and select 'Import...';
3. Browse to the downloaded corpus file and click 'Open'.

How to import corpora using Python code

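from kitconc.kit_corpus import Corpus

# Set up the corpus 'corona' inside the workspace folder
corpus = Corpus('E:/KITCONC-WORKSPACE', 'corona', language='english')
# Import the compiled corpus from the downloaded zip file
corpus.add_from_export('E:/CORPORA/corona.zip')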
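
Because the compiled corpora are downloaded as zip files, it may help to confirm that a download is complete before importing it. The following is a minimal sketch using only the Python standard library. The helper name check_corpus_archive is hypothetical and is not part of Kitconc.

```python
import zipfile
from pathlib import Path


def check_corpus_archive(path):
    """Return the archive path as a string if it is a readable zip file.

    Hypothetical helper for illustration only; not part of Kitconc.
    """
    archive = Path(path)
    if not archive.is_file():
        raise FileNotFoundError(f"Corpus archive not found: {archive}")
    if not zipfile.is_zipfile(archive):
        raise ValueError(f"Not a valid zip archive: {archive}")
    with zipfile.ZipFile(archive) as zf:
        # testzip() returns the name of the first corrupt member, or None
        bad_member = zf.testzip()
    if bad_member is not None:
        raise ValueError(f"Corrupt member in archive: {bad_member}")
    return str(archive)
```

Under these assumptions, the value returned by check_corpus_archive('E:/CORPORA/corona.zip') could be passed to corpus.add_from_export in the import code above, so that an incomplete download fails with a clear message before the import starts.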