Moses Support Digest:POS LM
[Moses-support] POS LM
Hi
Factored model training uses a pos.lm for the target language. I want to know the steps involved in preparing the POS LM and what input parameters it needs.
-Doren
Re:[Moses-support] POS LM
Hi Doren,
please read the tutorial on factored models:
http://www.statmt.org/moses/?n=Moses.FactoredTutorial
-phi
Re:[Moses-support] POS LM
Hi Doren,
I’ve used SRILM to generate POS LMs. The LM, as you might expect, needs to be trained on a corpus consisting of sequences of POSes instead of sequences of surface forms, e.g. instead of
The cat sat on the mat
the corpus should contain
DET N V P DET N
or whatever.
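As a rough illustration of that corpus preparation, here is a minimal Python sketch. It assumes the tagged text is already in the Moses factored format, with factors joined by `|` and the POS tag as the second factor; the function name `pos_sequence` and the tag set are made up for this example.

```python
def pos_sequence(factored_line, factor_index=1, sep="|"):
    """Return the chosen factor of every token, joined by spaces.

    Assumes each token looks like surface|POS (hypothetical layout;
    adjust factor_index if the POS tag sits in another position).
    """
    tags = []
    for token in factored_line.split():
        factors = token.split(sep)
        tags.append(factors[factor_index])
    return " ".join(tags)


line = "The|DET cat|N sat|V on|P the|DET mat|N"
print(pos_sequence(line))  # DET N V P DET N
```

Running this over every line of the tagged target side gives the POS-only corpus that the LM is trained on.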
Furthermore, the set of POSes is probably small as vocabularies go, so smoothing methods that rely on counts-of-counts, such as Kneser-Ney, are inappropriate. The SRILM website’s FAQ discusses this problem.
Also because the vocabulary is small, you can get away with using
higher-order n-grams than you would use for a surface LM.
Other than that, it’s the same as preparing a surface LM.
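To make the two points above concrete, here is a hedged sketch that assembles an SRILM `ngram-count` call for a POS corpus. The choice of Witten-Bell discounting (`-wbdiscount`) as the non-Kneser-Ney smoother, the order of 7, and the file names `corpus.pos` and `pos.lm` are assumptions for illustration, not settings given in this thread.

```python
import shlex


def srilm_pos_lm_command(corpus, lm_out, order=7):
    """Build an ngram-count argument list for a POS language model.

    Assumptions: Witten-Bell discounting stands in for Kneser-Ney,
    and order=7 is just an example of a "higher-order" setting.
    """
    return [
        "ngram-count",
        "-text", corpus,       # POS-only training corpus
        "-order", str(order),  # higher order than a typical surface LM
        "-wbdiscount",         # smoothing that does not need counts-of-counts
        "-lm", lm_out,         # output LM in ARPA format
    ]


cmd = srilm_pos_lm_command("corpus.pos", "pos.lm")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` if SRILM is on the path.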
Regards,
Ben
NOTICE: This is digested from the Moses-support mailing list, which provides support for the Moses SMT decoder.