I Love Natural Language Processing

Moses Support Digest:POS LM

[Moses-support] POS LM

Hi

Factored model training uses a pos.lm for the target language. I would like to know the steps involved in preparing the pos.lm and which input parameters it requires.

-Doren


Re: [Moses-support] POS LM

Hi Doren,

please read the tutorial on factored models:

http://www.statmt.org/moses/?n=Moses.FactoredTutorial

-phi

Re: [Moses-support] POS LM

Hi Doren,

I’ve used SRILM to generate POS LMs. The LM, as you might expect, needs to be trained on a corpus consisting of sequences of POS tags instead of sequences of surface forms, e.g. instead of

The cat sat on the mat

the corpus should contain

DET N V P DET N

or whatever.
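As a rough illustration of that conversion, here is a minimal Python sketch. It assumes the tagged corpus is in a factored format with one sentence per line and tokens written as word|POS; the function names, the separator and the position of the POS factor are assumptions for the example, not something specified in this thread.

```python
def pos_sequence(factored_line, pos_index=1, sep="|"):
    """Return the space-separated POS tags of one factored sentence.

    Assumes each token looks like word|POS (hypothetical layout);
    pos_index selects which factor holds the POS tag.
    """
    tags = []
    for token in factored_line.split():
        factors = token.split(sep)
        tags.append(factors[pos_index])
    return " ".join(tags)


def write_pos_corpus(src_path, dst_path, pos_index=1):
    """Write a POS-only corpus, one sentence per line, for LM training."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(pos_sequence(line, pos_index) + "\n")


print(pos_sequence("The|DET cat|N sat|V on|P the|DET mat|N"))
```

The resulting file can then be given to the LM toolkit in place of the surface-form corpus.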

Furthermore, the set of POS tags is probably small as vocabularies go, so smoothing methods that rely on counts-of-counts, such as Kneser-Ney, are inappropriate. The SRILM website’s FAQ recommends Witten-Bell discounting (command line option ‘-wbdiscount’) for such cases. (See question C3, answer (b) in the FAQ.)
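To see why counts-of-counts break down, recall that the Kneser-Ney discount is typically estimated as D = n1 / (n1 + 2*n2), where n1 and n2 are the numbers of n-grams seen exactly once and twice. The sketch below is only a toy illustration at the unigram level, on a made-up corpus; it is not how SRILM itself computes its statistics.

```python
from collections import Counter


def counts_of_counts(sentences, max_count=4):
    """Return {k: number of distinct tags seen exactly k times}."""
    unigram_counts = Counter(tag for s in sentences for tag in s.split())
    freq_of_freq = Counter(unigram_counts.values())
    return {k: freq_of_freq.get(k, 0) for k in range(1, max_count + 1)}


# Toy POS corpus (made up): four tags, each seen at least 1000 times.
corpus = ["DET N V P DET N"] * 1000
print(counts_of_counts(corpus))  # {1: 0, 2: 0, 3: 0, 4: 0}
```

With every tag seen many times, n1 and n2 are both zero and the discount estimate is undefined, which is the situation Witten-Bell discounting avoids.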

Also because the vocabulary is small, you can get away with using
higher-order n-grams than you would use for a surface LM.
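Putting these two points together, an SRILM invocation might be assembled as in the Python sketch below. The file names and the order of 7 are illustrative assumptions, not values recommended in this thread; -order, -text, -lm and -wbdiscount are ngram-count options.

```python
def build_ngram_count_cmd(pos_corpus, lm_out, order=7):
    """Build an SRILM ngram-count command line for a POS LM.

    pos_corpus and lm_out are hypothetical file names; order=7 is only
    an example of a higher order than is usual for a surface LM.
    """
    return [
        "ngram-count",
        "-order", str(order),
        "-text", pos_corpus,   # corpus of POS sequences, one sentence per line
        "-lm", lm_out,         # output LM file
        "-wbdiscount",         # Witten-Bell discounting instead of Kneser-Ney
    ]


cmd = build_ngram_count_cmd("train.pos", "pos.lm")
print(" ".join(cmd))
```

With SRILM installed and ngram-count on the PATH, the list could be passed to subprocess.run(cmd, check=True); the best order for a given tag set is something to determine experimentally.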

Other than that, it’s the same as preparing a surface LM.

Regards,
Ben

NOTICE: This is digested from the Moses-support mailing list, which provides support for the Moses SMT decoder.

Related posts:

  1. Moses Support Digest:Building POS language model with SRILM
  2. Moses Support Digest:Aligned phrase counts
  3. Moses Support Digest:Moses Error in training phrase
  4. Moses Support Digest:Translation from English to Foreign Language
  5. Moses Support Digest:Moses step 1 – data preparation step
  6. Moses Support Digest:How to run giza++ with a dictionary
  7. Moses Support Digest:Hierarchical rule extraction
  8. Moses Support Digest:About giza++ options when running moses
  9. Moses Support Digest:Binarized SRILM
  10. Moses Support Digest: Moses seems to hang

Written by 52nlp

December 30th, 2009 at 8:52 pm

Posted in Moses,SMT
