Moses Support Digest: About GIZA++ options when running Moses
[Moses-support] About giza++ options when running moses
hi all,
Regarding GIZA++ options, I found this on the Moses website:
--------------------------------------------------
GIZA++ Options
GIZA++ takes a lot of parameters to specify the behavior of the training process and limits on sentence length, etc. Please refer to the corresponding documentation for details on this. Parameters can be passed on to GIZA++ with the switch --giza-option. For instance, if you want to change the number of iterations for the different IBM Models to 4 iterations of Model 1, 0 iterations of Model 2, 4 iterations of the HMM Model, 0 iterations of Model 3, and 3 iterations of Model 4, you can specify this by
train-phrase-model.perl [...] --giza-option m1=4,m2=0,mh=4,m3=0,m4=3
--------------------------------------------------
So, if I want to use only IBM Model 1, I can just set the parameters like this:
train-phrase-model.perl [...] --giza-option m1=5,m2=0,mh=0,m3=0,m4=0
and if I want to stop at IBM Model 3:
train-phrase-model.perl [...] --giza-option m1=5,m2=5,mh=5,m3=3,m4=0
Is that right?
When I print the logs, I also see Model 5 and Model 6 mentioned, which confused me.
I need your help.
2009-12-11
Best regards,
Lee Xianhua
Re:[Moses-support] About giza++ options when running moses
Yes, that’s right.
Model 6 is described in this journal article.
http://aclweb.org/anthology-new/J/J03/J03-1002.pdf
It also explains some of the other parameters and reasonable sequences of model iterations.
Adam
Re:[Moses-support] About giza++ options when running moses
Hi,
I once tried to use GIZA with the IBM1 model in the described manner (… --giza-option m1=5,m2=0,mh=0,m3=0,m4=0 …), and the training stopped after the following error message:
ERROR: Giza did not produce the output file ibm1/giza.est-eng/est-eng.A3.final. Is your corpus clean (reasonably-sized sentences)? at
/home/smt/tools/moses/scripts/training/train-factored-phrase-model.perl line 734.
After that I tried running GIZA separately from the training script, and it appears that it doesn’t generate any output file. Is there a way of telling GIZA not only to stop after the IBM1 model iterations but also to actually save the model?
Thanks in advance,
Mark
Re:[Moses-support] About giza++ options when running moses
If you run GIZA only up to model 1, the final output will be
src-tgt.A1.*
instead of src-tgt.A3.final
You have to modify the script to manually rename it.
Also, this page may help you on GIZA parameters.
http://geek.kyloo.net/software/doku.php/mgiza:configure
–Q
Re:[Moses-support] About giza++ options when running moses
Finally got it to work. In order to get the IBM1 alignments from GIZA, in addition to
--giza-option m1=5,m2=0,mh=0,m3=0,m4=0
you have to set the IBM1 dump frequency to at most the number of iterations: t1=5. Also, the training script sets nodumps=1, which has to be overridden, so the required parameters are
--giza-option m1=5,m2=0,mh=0,m3=0,m4=0,t1=5,nodumps=0
IBM2 etc. alignments can be obtained in the same way.
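As a minimal sketch of the rule above (the dump frequency t1 must not exceed the number of Model 1 iterations, and nodumps must be set back to 0), the option string can be generated rather than typed by hand. The function name giza_option_model1 is hypothetical and not part of Moses or GIZA++; only the parameter names m1, m2, mh, m3, m4, t1 and nodumps come from this thread.

```python
def giza_option_model1(iterations=5):
    """Build a --giza-option value that stops after IBM Model 1.

    Hypothetical helper: it only formats the parameters discussed in
    this thread and does not call GIZA++ or the Moses training script.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    settings = [
        ("m1", iterations),   # Model 1 iterations
        ("m2", 0),            # skip Model 2
        ("mh", 0),            # skip the HMM model
        ("m3", 0),            # skip Model 3
        ("m4", 0),            # skip Model 4
        ("t1", iterations),   # dump Model 1 output at the last iteration
        ("nodumps", 0),       # override nodumps=1 set by the training script
    ]
    # No spaces between entries, so the value stays a single shell argument.
    return ",".join(f"{name}={value}" for name, value in settings)


if __name__ == "__main__":
    print("--giza-option " + giza_option_model1(5))
```

Setting t1 equal to the iteration count is one choice that satisfies "at most the number of iterations"; it produces a single dump at the final iteration.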
Also, the final file will not be src-tgt.A1.final but src-tgt.A1.5. So, as noted by Qin Gao, you have to modify the training script to expect that name instead of src-tgt.A3.final. Someone has already done that for HMM alignments, for instance (the --hmm-align switch of the training script).
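If you would rather not edit the training script, one possible workaround (not something proposed in this thread, so treat it as an assumption) is to run GIZA separately and then copy the last Model 1 dump to the name the script looks for. The sketch below assumes the dumps are plain files named prefix.A1.N in one directory, where prefix is the src-tgt part (est-eng in the error message above). The function name expose_model1_alignment is hypothetical. Whether the training script accepts the copied file without further changes has not been verified here.

```python
import re
import shutil
from pathlib import Path


def expose_model1_alignment(giza_dir, prefix, target_suffix="A3.final"):
    """Copy the highest-numbered <prefix>.A1.<n> dump to <prefix>.A3.final.

    Hypothetical helper: the file layout is an assumption based on the
    names mentioned in this thread (src-tgt.A1.5, src-tgt.A3.final).
    """
    giza_dir = Path(giza_dir)
    pattern = re.compile(re.escape(prefix) + r"\.A1\.(\d+)")

    # Collect (iteration, path) pairs for every Model 1 alignment dump.
    candidates = []
    for path in giza_dir.iterdir():
        match = pattern.fullmatch(path.name)
        if match:
            candidates.append((int(match.group(1)), path))

    if not candidates:
        raise FileNotFoundError(f"no {prefix}.A1.<n> file in {giza_dir}")

    # The dump from the last iteration is the one to expose.
    _, newest = max(candidates, key=lambda item: item[0])
    target = giza_dir / f"{prefix}.{target_suffix}"
    shutil.copyfile(newest, target)
    return target


# Example call (paths are illustrative):
# expose_model1_alignment("ibm1/giza.est-eng", "est-eng")
```

Copying instead of renaming keeps the original dump in place, so the step can be repeated safely.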
By the way, even with these parameters GIZA still performs a parameter transfer from IBM2 to IBM3.
Mark
Re:[Moses-support] About giza++ options when running moses
hi all,
Thanks for your generous help. I set the parameters to --giza-option m1=5,m2=0,mh=0,m3=0,m4=0 and ran into some problems. On small corpora it seems OK:
I got the phrase table and reordering table. But it stopped on large corpora.
With --giza-option m1=5, m2=0, mh=0, m3=0, m4=0 , no “model” folder was generated.
With --giza-option m1=5, m2=0, mh=3, m3=3, m4=0 , I got the following in “model”:
phrase-table.0-0.half.f2n.part0000
phrase-table.0-0.half.f2n.part0001
but no phrase table or reordering table.
I’ll try setting the parameters to
--giza-option m1=5,m2=0,mh=0,m3=0,m4=0,t1=5,nodumps=0
and do what you advised.
I hope it works.
I probably need to read the GIZA++ source code to understand how it actually works.
Again, thanks to all of you for your generous help.
2009-12-11
Best regards,
Lee Xianhua
NOTICE: This is digested from the Moses-support mailing list, which provides support for the Moses SMT decoder.