Moses Support Digest:Building POS language model with SRILM
[Moses-support] Building POS language model with SRILM
Hi,
The Moses manual recommends using the following switches when building a language model with SRILM:
  -interpolate -kndiscount
I assume this recommendation applies specifically to surface-string language models. For a part-of-speech language model, Kneser-Ney discounting seems inappropriate because its discounts are estimated from counts-of-counts, and the counts-of-counts for POS tags are unusual: very few tags occur only once or twice in a given corpus.
Are there particular switches that are recommended for building a POS language model with SRILM?
Regards,
Ben Gottesman
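
The counts-of-counts concern in the question can be checked on a tagged corpus before training. The following is a minimal Python sketch, not part of SRILM: the function names and the toy tag sequences are hypothetical, and the discount formulas are the modified Kneser-Ney estimates from Chen and Goodman, where n_r is the number of distinct n-grams seen exactly r times. The estimates divide by n1, n2 and n3, so they are undefined when any of those is zero, which is easy to hit with a small tag set.

```python
from collections import Counter


def ngram_counts(sentences, order):
    """Count n-grams of the given order over lists of POS tags."""
    counts = Counter()
    for tags in sentences:
        for i in range(len(tags) - order + 1):
            counts[tuple(tags[i:i + order])] += 1
    return counts


def counts_of_counts(counts, max_count=4):
    """Return {r: number of distinct n-grams that occur exactly r times}."""
    coc = Counter(counts.values())
    return {r: coc.get(r, 0) for r in range(1, max_count + 1)}


def modified_kn_discounts(coc):
    """Modified Kneser-Ney discounts (D1, D2, D3+), or None if undefined.

    Y   = n1 / (n1 + 2*n2)
    D1  = 1 - 2*Y*n2/n1
    D2  = 2 - 3*Y*n3/n2
    D3+ = 3 - 4*Y*n4/n3
    """
    n1, n2, n3, n4 = coc[1], coc[2], coc[3], coc[4]
    if n1 == 0 or n2 == 0 or n3 == 0:
        return None  # a required count-of-counts is zero
    y = n1 / (n1 + 2 * n2)
    return (1 - 2 * y * n2 / n1, 2 - 3 * y * n3 / n2, 3 - 4 * y * n4 / n3)


# Toy POS-tagged sentences, invented purely for illustration.
toy_sentences = [
    ["DT", "NN", "VBZ", "DT", "NN"],
    ["DT", "NN", "VBZ", "DT", "JJ", "NN"],
    ["DT", "JJ", "NN", "VBZ", "DT", "NN"],
    ["PRP", "VBZ", "DT", "NN"],
]

unigram_coc = counts_of_counts(ngram_counts(toy_sentences, 1))
print("unigram counts-of-counts:", unigram_coc)
print("modified KN discounts:", modified_kn_discounts(unigram_coc))
```

On this toy data no tag occurs exactly three times, so the sketch reports the discounts as undefined. Running the same check per n-gram order on a real corpus shows at which order the counts-of-counts become well behaved.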
Re: [Moses-support] Building POS language model with SRILM
Hi,
You are correct that for POS LMs the lower-order n-gram counts are very different and smoothing is less relevant.
You could train a 7-gram LM with Good-Turing smoothing for the lower-order n-grams and Kneser-Ney for the higher-order n-grams.
I have done this occasionally.
-phi
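
One way to express the mixed setup suggested in the reply is with SRILM's per-order switches. The sketch below only assembles an `ngram-count` command line in Python and prints it; it does not run SRILM. It assumes the per-order options `-kndiscountN` and `-interpolateN` from the `ngram-count` manual page, and that orders given no discounting option fall back to Good-Turing, which is SRILM's default. The file names, the helper name and the cut-over at order 4 are hypothetical, since the reply does not say which orders count as "lower"; check the flags against the installed SRILM version before use.

```python
import shlex


def pos_lm_command(text, lm, order=7, kn_from=4):
    """Build an ngram-count command: Good-Turing below kn_from, KN from kn_from up.

    Orders without an explicit discounting flag are left to SRILM's default
    (Good-Turing). kn_from=4 is an arbitrary illustrative choice.
    """
    args = ["ngram-count", "-order", str(order), "-text", text, "-lm", lm]
    for n in range(kn_from, order + 1):
        # Per-order Kneser-Ney discounting with interpolation at that order.
        args += [f"-kndiscount{n}", f"-interpolate{n}"]
    return args


# Hypothetical input and output paths.
cmd = pos_lm_command("corpus.pos", "pos.7gram.lm")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell. Moving `kn_from` up or down is a cheap way to compare perplexity on held-out POS sequences for different cut-over points.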
NOTICE: This is digested from the Moses-support mailing list, which supports the Moses SMT decoder.