Moses Support Digest: ConfusionNet::GetSubString error when using lattice with UTF8 input
[Moses-support] ConfusionNet::GetSubString error when using lattice with UTF8 input
Hi,
I'm having a strange problem: moses crashes when fed a lattice that contains non-ASCII characters. If the input is:
((('damit',1.0,1),),(('ist',1.0,1),('war',1.0,1),('sei',1.0,1),),(('der',1.0,1),('die',1.0,1),('das',1.0,1),),(('arbeitsplan',1.0,1),),)
then moses completes without any problems. However, if I change the
last edge into Chinese,
((('damit',1.0,1),),(('ist',1.0,1),('war',1.0,1),('sei',1.0,1),),(('der',1.0,1),('die',1.0,1),('das',1.0,1),),(('给',1.0,1),),)
... then moses crashes with the following error:
…
…
Translating: word lattice: 4
0 -- (damit , 0.000, 1)
1 -- (ist , 0.000, 1) (war , 0.000, 1) (sei , 0.000, 1)
2 -- (der , 0.000, 1) (die , 0.000, 1) (das , 0.000, 1)
3 -- (给 , 0.000, 1)
path stats for current CN:
CN (full): 8.000 15.000 18.000 9.000
CN (explored): 1 0 0 0
ERROR: call to ConfusionNet::GetSubString
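To rule out a malformed lattice, the input line can be checked independently of Moses. The sketch below is one way to do that in Python, assuming each PLF line is a valid Python tuple literal (which holds for lattices of the form shown in this post). `parse_plf` is a hypothetical helper written for illustration, not a Moses tool, and it only validates the input; it does not reproduce or explain the crash.

```python
import ast


def parse_plf(line):
    """Parse one PLF lattice line into columns of (word, score, jump) edges."""
    try:
        # PLF is written as nested tuples, so a literal parser is enough here.
        lattice = ast.literal_eval(line.strip())
    except (SyntaxError, ValueError) as exc:
        raise ValueError(f"not a well-formed PLF line: {exc}") from exc

    columns = []
    for i, column in enumerate(lattice):
        edges = []
        for word, score, jump in column:
            if not isinstance(word, str):
                raise ValueError(f"column {i}: edge label {word!r} is not a string")
            # An edge from node i must end at a node inside the lattice.
            if jump < 1 or i + jump > len(lattice):
                raise ValueError(f"column {i}: jump {jump} leaves the lattice")
            edges.append((word, float(score), int(jump)))
        columns.append(edges)
    return columns


# Same shape as the failing lattice, with the Chinese word on the last edge.
plf = ("((('damit',1.0,1),),(('ist',1.0,1),('war',1.0,1),('sei',1.0,1),),"
       "(('der',1.0,1),('die',1.0,1),('das',1.0,1),),(('给',1.0,1),),)")

for index, edges in enumerate(parse_plf(plf)):
    print(index, edges)
```

When reading the lattice from a file, opening it with `open(path, encoding="utf-8")` (assuming the file is saved as UTF-8) keeps the non-ASCII edge labels intact. Typographic quotes around the words, as sometimes introduced by mail clients or web pages, make the line fail this check.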
My command is:
./???/moses-cmd/src/moses
-inputtype 2 -weight-i 0
-config `pwd`/moses.ini
-input-file `pwd`/src.lattice
-verbose 3
And below is my moses.ini, although there’s nothing particularly
interesting about it:
# MERT optimized configuration
# decoder /???/moses-cmd/src/moses
# BLEU 0.517812 -> 0.517945 on dev /???/working/dev.zh
# We were before running iteration 7
# finished Tue Dec 22 02:44:42 SGT 2009
### MOSES CONFIG FILE ###
#########################
# input factors
[input-factors]
0
# mapping steps
[mapping]
0 T 0
# translation tables: source-factors, target-factors, number of scores, file
[ttable-file]
0 0 5 /my-working-directory/working/model/phrase-table.gz
# no generation models, no generation-file section
# language models: type(srilm/irstlm), factors, order, file
[lmodel-file]
0 0 5 /???/en.lm
# limit on how many phrase translations e for each phrase f are loaded
# 0 = all elements loaded
[ttable-limit]
20
# distortion (reordering) files
[distortion-file]
0-0 monotonicity-bidirectional-f 4 /???/working/model/reordering-table.gz
# distortion (reordering) weight
[weight-d]
0.008227
-0.085606
0.024868
0.064056
0.074998
# language model weights
[weight-l]
0.218681
# translation model weights
[weight-t]
0.042811
0.084593
0.147177
0.045780
0.073527
# no generation models, no weight-generation section
# word penalty
[weight-w]
-0.129677
[distortion-limit]
6
[drop-unknown]
1
Any help is greatly appreciated.
Regards,
Liu Chang
National University of Singapore
Re: [Moses-support] ConfusionNet::GetSubString error when using lattice with UTF8 input
Confusion network input causes this error when running with -verbose 3. You can avoid it by using a lower verbosity level.
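As a rough illustration, the command from the original post can be assembled with a lower -verbose value. The sketch below uses only the flags that appear in that command; the helper name `moses_argv` and the paths in the usage line are placeholders, not part of Moses.

```python
import shlex


def moses_argv(moses_bin, config, lattice, verbose=1):
    """Build the argument list from the original post with a chosen -verbose level."""
    return [
        moses_bin,
        "-inputtype", "2",      # lattice input, as in the original command
        "-weight-i", "0",
        "-config", config,
        "-input-file", lattice,
        "-verbose", str(verbose),
    ]


# Placeholder paths; substitute the real binary, moses.ini and lattice file.
argv = moses_argv("path/to/moses", "moses.ini", "src.lattice", verbose=1)
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` on a machine where Moses is installed.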
Re: [Moses-support] ConfusionNet::GetSubString error when using lattice with UTF8 input
That solved my problem. Thanks!
NOTICE: This is digested from the Moses-support mailing list, which provides support for the Moses SMT decoder.