Welcome to pressagio’s documentation!
Pressagio is a library that predicts text based on n-gram models. For example, you can send a string and the library will return the most likely word completions for the last token in the string.
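To make the idea of completing the last token concrete, here is a minimal sketch of n-gram based completion using only the Python standard library. It is not pressagio's implementation: the function names `build_bigram_counts` and `complete` are hypothetical, the toy corpus is made up, and the sketch uses only bigram counts, with no smoothing and no combination of several n-gram orders.

```python
from collections import Counter, defaultdict


def build_bigram_counts(text):
    """Count how often each word follows each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for previous, current in zip(words, words[1:]):
        counts[previous][current] += 1
    return counts


def complete(counts, phrase, limit=5):
    """Rank completions for the last (possibly partial) token of phrase."""
    tokens = phrase.lower().split()
    if phrase.endswith(" ") or not tokens:
        # The last word is finished: predict a new word from an empty prefix.
        context = tokens[-1] if tokens else None
        prefix = ""
    else:
        # The last token is a partial word: the word before it is the context.
        context = tokens[-2] if len(tokens) > 1 else None
        prefix = tokens[-1]
    candidates = counts.get(context, Counter())
    # Most frequent first; ties are broken alphabetically.
    ranked = sorted(
        ((word, n) for word, n in candidates.items() if word.startswith(prefix)),
        key=lambda item: (-item[1], item[0]),
    )
    return [word for word, _ in ranked[:limit]]


corpus = "the cat sat on the mat and the cat ate the cake"
model = build_bigram_counts(corpus)
print(complete(model, "on the c"))  # ['cat', 'cake']
```

In this toy model, "cat" ranks above "cake" because it follows "the" twice in the corpus while "cake" follows it only once.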
Example Usage
The repository contains two example scripts in the folder example that demonstrate how to build a language model and how to use that model for prediction. You can read the code of those two scripts to see how to use pressagio in your own projects. Here is how to use the two scripts to predict the next word in a phrase.
First, you have to build a language model. We will use the script example/text2ngram.py to add 1-, 2- and 3-grams of a given text to a sqlite database. For demonstration purposes we will use a simple text file that comes with pressagio’s tests. You have to run the script three times, once for each n-gram size, so that a table is created for each of them:
$ python example/text2ngram.py -n 1 -o test.sqlite tests/test_data/der_linksdenker.txt
$ python example/text2ngram.py -n 2 -o test.sqlite tests/test_data/der_linksdenker.txt
$ python example/text2ngram.py -n 3 -o test.sqlite tests/test_data/der_linksdenker.txt
This will create a file test.sqlite in the current directory. We can now use this database to get a prediction for a phrase. We will use the script example/predict.py, which uses the configuration file example/example_profile.ini. Note that you will always need a configuration file if you want to use the built-in predictor. To get a prediction, call:
$ python example/predict.py
['warm', 'der', 'und', 'die', 'nicht']
The script simply prints a list of predictions.
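The model-building step above, one run per n-gram size into a single sqlite file, can be illustrated with a short standard-library sketch. This is an assumption-laden illustration rather than the code of example/text2ngram.py: the helper names `ngram_counts` and `store_ngrams`, the table names `ngrams_1` to `ngrams_3`, and the column names are hypothetical, and the real script may tokenize the text and lay out its tables differently.

```python
import sqlite3
from collections import Counter


def ngram_counts(words, n):
    """Count every run of n consecutive words."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def store_ngrams(connection, words, n):
    """Write n-gram counts into one table per n (hypothetical schema)."""
    table = f"ngrams_{n}"
    columns = [f"word_{i}" for i in range(n)]
    connection.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        f"({', '.join(c + ' TEXT' for c in columns)}, occurrences INTEGER)"
    )
    placeholders = ", ".join("?" for _ in range(n + 1))
    connection.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})",
        [gram + (count,) for gram, count in ngram_counts(words, n).items()],
    )
    connection.commit()


words = "the cat sat on the mat and the cat ate".split()
connection = sqlite3.connect(":memory:")  # in-memory stand-in for test.sqlite
for n in (1, 2, 3):
    # One pass per n-gram size, mirroring the three script invocations.
    store_ngrams(connection, words, n)

top = connection.execute(
    "SELECT word_0, word_1, occurrences FROM ngrams_2 "
    "ORDER BY occurrences DESC, word_0, word_1 LIMIT 1"
).fetchone()
print(top)  # ('the', 'cat', 2)
```

Keeping one table per n-gram size means a predictor can query each order separately and decide for itself how to weight them.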
Running the tests
$ python -m unittest discover