Moses Support Digest: Building POS language model with SRILM

[Moses-support] Building POS language model with SRILM
Hi,
The Moses manual recommends using the following switches when building a language model with SRILM:
  -interpolate -kndiscount
I assume this recommendation applies specifically to surface-string language models. For a part-of-speech language model, KN discounting seems inappropriate because its discounts are estimated from counts-of-counts, and the counts-of-counts for POS tags are odd: very few POS tags occur only once or twice in a given corpus.
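To make the counts-of-counts point concrete, here is a minimal sketch in Python. The toy sentence, its tags, and the helper names `counts_of_counts` and `kn_discount` are made up for illustration. The discount formula D = n1 / (n1 + 2*n2) is the standard Kneser-Ney estimate, not necessarily what SRILM computes internally for its modified variant.

```python
from collections import Counter

def counts_of_counts(tokens, max_count=4):
    """Return {k: number of distinct types seen exactly k times} for k = 1..max_count."""
    type_counts = Counter(tokens)
    freq_of_freq = Counter(type_counts.values())
    return {k: freq_of_freq.get(k, 0) for k in range(1, max_count + 1)}

def kn_discount(n1, n2):
    """Standard Kneser-Ney estimate D = n1 / (n1 + 2*n2).

    Returns None when n1 or n2 is zero, because the estimate is then
    undefined or degenerate (D = 0 or D = 1).
    """
    if n1 == 0 or n2 == 0:
        return None
    return n1 / (n1 + 2 * n2)

# Toy data (illustrative only): one sentence and a hand-assigned tag sequence.
words = "the cat sat on the mat while the dog sat near a door".split()
tags = "DT NN VB IN DT NN IN DT NN VB IN DT NN".split()

word_coc = counts_of_counts(words)  # many singleton words
tag_coc = counts_of_counts(tags)    # no singleton tags

print("words:", word_coc, "D =", kn_discount(word_coc[1], word_coc[2]))
print("tags: ", tag_coc, "D =", kn_discount(tag_coc[1], tag_coc[2]))
```

On surface words the singleton count is large and the discount is well defined. On the tag sequence no tag occurs exactly once, so the unigram estimate breaks down, which is the situation described above.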
Are there particular switches that are recommended for building a POS
language model with SRILM?
Regards,
Ben Gottesman


Re: [Moses-support] Building POS language model with SRILM

Hi,
You are correct that for POS LMs the lower-order n-gram counts are very different, and smoothing is less relevant there.
You could train a 7-gram LM with Good-Turing smoothing for the lower-order n-grams and Kneser-Ney for the higher-order n-grams.
I have done this occasionally.
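One way such a mixed setup might be expressed is through SRILM's per-order options to `ngram-count`. The sketch below only assembles a command line in Python. It assumes that `-kndiscountN` and `-interpolateN` are applied to the higher orders and that orders without a discounting flag fall back to SRILM's default Good-Turing discounting. The cut-over order (`kn_from=4`), the file names, and the helper `build_ngram_count_args` are hypothetical; the reply does not say where the switch was made or which exact options were used.

```python
def build_ngram_count_args(text_path, lm_path, order=7, kn_from=4):
    """Assemble an ngram-count command line as a list of arguments.

    Orders kn_from..order get Kneser-Ney discounting with interpolation.
    Orders below kn_from get no discounting flag, so SRILM's default
    (Good-Turing) is assumed to apply to them.
    """
    args = ["ngram-count", "-order", str(order), "-text", text_path, "-lm", lm_path]
    for n in range(kn_from, order + 1):
        args += [f"-kndiscount{n}", f"-interpolate{n}"]
    return args

# Hypothetical file names: a POS-tagged corpus in, an ARPA-format LM out.
example_args = build_ngram_count_args("corpus.pos", "pos.7gram.lm")
print(" ".join(example_args))
```

The resulting list can be passed to `subprocess.run` on a machine where SRILM is installed. Whether orders 1 to 3 are the right ones to leave on Good-Turing depends on the tag set and corpus size, so the counts-of-counts should be inspected first.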

-phi

NOTICE: This is digested from the Moses-support mailing list, which provides support for the Moses SMT decoder.
