Snowball stemming library collection for Javascript
===================================================

What is Stemming?
-----------------

Stemming maps different forms of the same word to a common "stem" - for
example, the English stemmer maps *connection*, *connections*, *connective*,
*connected*, and *connecting* to *connect*.  So a search for *connected*
would also find documents which only have the other forms.

This stem form is often a word itself, but this is not always the case as this
is not a requirement for text search systems, which are the intended field of
use.  We also aim to conflate words with the same meaning, rather than all
words with a common linguistic root (so *awe* and *awful* don't have the same
stem), and over-stemming is more problematic than under-stemming so we tend not
to stem in cases that are hard to resolve.  If you want to always reduce words
to a root form and/or get a root form which is itself a word then Snowball's
stemming algorithms likely aren't the right answer.


How to use library
------------------

The stemming algorithms generally expect the input text to use composed accents
(Unicode NFC or NFKC) and to have been folded to lower case already.

You can use each stemming modules from Javascript code - e.g to use them
with node:

.. code-block:: javascript

    var EnglishStemmer = require('english-stemmer.js');

    var stemmer = new EnglishStemmer();
    console.log(stemmer.stemWord("testing"));

You'll need to bundle ``base-stemmer.js`` and whichever languages you want
stemmers for (e.g. ``english-stemmer.js`` for English).

FIXME: Document how to use in a web browser.

The stemmer code is re-entrant, but not thread-safe if the same stemmer object
is used concurrently in different threads.

If you want to perform stemming concurrently in different threads, we suggest
creating a new stemmer object for each thread.  The alternative is to share
stemmer objects between threads and protect access using a mutex or similar
but that's liable to slow your program down as threads can end up waiting for
the lock.
