[Moses-support] moses-irstlm memory racing with 5-gram lm
I’m troubleshooting a new moses system with these components:
1) GIZA++ (SVN rev 8, v 1.0.3)
2) IRSTLM (SVN rev 38, v 5.40.01)
3) Moses (SVN rev 3210, dated 4-26-2010)
4) Ubuntu-server 10.04 LTS 64-bit.
5) 3.4 GHz Pentium D with 4 GB RAM.
Using a 3-gram lm, the system works as expected. Training, tuning and
evaluating on a small (135K pairs) en-nl subset of europarl.v5 work fine. The
BLEU score was 23.
I then built a 5-gram model, edited the moses.ini config and started
mert-moses-new. It creates a filtered model, and then launches moses. The
memory usage grows and within 10 minutes, the system kills moses.
In both cases, the lm is only the target half of the bitext corpus, about
135K lines.
The lmodel-file entries in the two moses.ini files (3-gram first, then 5-gram):
[lmodel-file]
1 0 3 /media/models/irstlm/europarl.v5.mini/3-gram.nl.blm
[lmodel-file]
1 0 5 /media/models/irstlm/europarl.v5.mini/5-gram.nl.blm
I know of one other person who has had the same problem with the 4-1-2010
moses build and irstlm from March/April last year.
Any suggestions? Could it be the new Ubuntu or the g++-4.4.1 compiler?
Thanks,
Tom
Re: [Moses-support] moses-irstlm memory racing with 5-gram lm
I’m not an expert on irstlm, but I think you have to rename or softlink
the file so that its name ends in .mm to minimise memory usage:
http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel#ntoc7
Otherwise it’ll just allocate as much memory as needed to load the LM file.
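For what it's worth, a minimal sketch of the softlink approach. The `LM` variable and the empty stand-in file are assumptions made only so the snippet runs on a machine without the model; neither is part of IRSTLM or Moses.

```shell
# Path to the binarized IRSTLM model. If LM is not set, fall back to an empty
# stand-in file in a scratch directory (hypothetical, for illustration only).
if [ -z "${LM:-}" ]; then
    LM="$(mktemp -d)/5-gram.nl.blm"
    : > "$LM"
fi

# Give the model a second name ending in .mm without copying the file.
ln -sf "$LM" "$LM.mm"

# Show where the new name points.
ls -l "$LM.mm"
```

The [lmodel-file] line in moses.ini would then point at the .mm name instead of the plain .blm file.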
Re: [Moses-support] moses-irstlm memory racing with 5-gram lm
Problem solved.
To review the symptoms, I ran the following two mert-moses-new.pl command
lines:
CASE 1:
nice mert-moses-new.pl
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl/mert.en
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl/mert.nl
/usr/local/lib/moses-irstlm/moses-cmd/src/moses
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl/moses0.ini
--working-dir
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl
--rootdir /usr/local/lib/moses-irstlm/scripts
--mertdir=/usr/local/lib/moses-irstlm/mert
--nbest=50
--decoder-flags -v 0
CASE 2:
nice mert-moses-new.pl
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl/mert.en
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl/mert.nl
/usr/local/lib/moses-irstlm/moses-cmd/src/moses
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl/moses0.ini
--working-dir
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl
--rootdir /usr/local/lib/moses-irstlm/scripts
--mertdir=/usr/local/lib/moses-irstlm/mert
--nbest=50
--decoder-flags -v 0
Only one line (the lmodel-file) was different in the respective starting
config files:
CASE 1:
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl/moses0.ini:
[ttable-file]
0 0 0 5
/media/models/tables/europarl.v5.mini/en-nl/model.en-nl/phrase-table.gz
[lmodel-file]
1 0 3 /media/models/irstlm/europarl.v5.mini/3-gram.nl.blm.mm
[distortion-file]
0-0 wbe-msd-bidirectional-fe-allff 6
/media/models/tables/europarl.v5.mini/en-nl/model.en-nl/reordering-table.wbe-msd-bidirectional-fe.gz
CASE 2:
/media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl/moses0.ini:
[ttable-file]
0 0 0 5
/media/models/tables/europarl.v5.mini/en-nl/model.en-nl/phrase-table.gz
[lmodel-file]
1 0 5 /media/models/irstlm/europarl.v5.mini/5-gram.nl.blm.mm
[distortion-file]
0-0 wbe-msd-bidirectional-fe-allff 6
/media/models/tables/europarl.v5.mini/en-nl/model.en-nl/reordering-table.wbe-msd-bidirectional-fe.gz
In both cases, mert-moses-new.pl filtered the phrase table successfully.
In CASE 1, the tuning process continued and concluded with a final
moses.ini file with new weights. In CASE 2, however, mert-moses-new.pl
created run1.moses.ini. The moses process rapidly (less than 5 minutes)
consumed all RAM and virtual memory leaving nothing for other processes. It
never sent output to the run1.out file. The system killed moses and
mert-moses-new.pl. This happened whether moses was started by the
mert-moses-new.pl script or from the command line with the run1.moses.ini file.
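A possible safeguard while reproducing this kind of failure, which was not part of the runs described above: cap the decoder's virtual memory so it fails on its own instead of starving the rest of the machine. This is only a sketch. `run_capped` is a hypothetical helper, the limit is an arbitrary example value, and it relies on the shell's `ulimit -v` being available (it is in bash and dash).

```shell
# run_capped LIMIT_KB CMD [ARGS...]
# Run CMD in a subshell whose virtual memory is capped at LIMIT_KB kilobytes.
# The limit applies only inside the subshell, so the calling shell is unaffected.
run_capped() {
    limit_kb=$1
    shift
    (
        # Exit with 125 if the limit itself cannot be set.
        ulimit -v "$limit_kb" || exit 125
        "$@"
    )
}

# Example (assumed invocation, adjust paths and limit to your setup):
#   run_capped 3000000 moses -f run1.moses.ini < mert.en > run1.out
```

Memory-mapped files count toward this limit too, so it needs some headroom above the expected model size.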
Furthermore, I changed run1.moses.ini to use the binarized phrase and
reordering tables:
0 0 0 5
/media/models/tables/europarl.v5.mini/en-nl/model.en-nl/phrase-table.gz
changed to:
1 0 0 5
/media/models/tables/europarl.v5.mini/en-nl/model.en-nl/phrase-table
0-0 wbe-msd-bidirectional-fe-allff 6
/media/models/tables/europarl.v5.mini/en-nl/model.en-nl/reordering-table.wbe-msd-bidirectional-fe.gz
changed to:
0-0 wbe-msd-bidirectional-fe-allff 6
/media/models/tables/europarl.v5.mini/en-nl/model.en-nl/reordering-table.wbe-msd-bidirectional-fe
With this modified config from the command line (not mert-moses-new.pl),
moses loaded in seconds and translated stdin/stdout just fine. Only
configurations with the full .gz model and filtered model exhibited the
problems. The filtered model, by the way, is only 20 MB for phrase AND
reordering tables.
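The config change above can also be scripted. A rough sketch with sed, run on a scratch copy rather than on the real run1.moses.ini. It assumes each table entry sits on a single line in the file (the entries quoted above appear to be wrapped by the mail client) and that the binarized tables already exist under the base names. `INI` and `BIN_INI` are names made up for the example.

```shell
# Scratch config holding only the two entries that change (illustrative copy).
INI="$(mktemp)"
cat > "$INI" <<'EOF'
[ttable-file]
0 0 0 5 /media/models/tables/europarl.v5.mini/en-nl/model.en-nl/phrase-table.gz
[distortion-file]
0-0 wbe-msd-bidirectional-fe-allff 6 /media/models/tables/europarl.v5.mini/en-nl/model.en-nl/reordering-table.wbe-msd-bidirectional-fe.gz
EOF

# First expression: phrase table, first field 0 -> 1 and drop the .gz suffix.
# Second expression: reordering table, drop the .gz suffix only.
BIN_INI="$INI.binarized"
sed -e 's|^0 \(0 0 5 .*/phrase-table\)\.gz$|1 \1|' \
    -e 's|^\(0-0 .*/reordering-table\..*\)\.gz$|\1|' \
    "$INI" > "$BIN_INI"

cat "$BIN_INI"
```

Writing to a second file keeps the original config untouched, which makes it easy to switch back and compare the two behaviours.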
SOLUTION:
When I first built IRSTLM with MACHTYPE=x86_64, it created
$IRSTLM/bin/x86_64. Then, building moses using --with-irstlm=$IRSTLM
finished without fatal errors. I recently read the moses-support threads
about using a $SRILM/bin/i686 folder. So, I applied the same solution to
IRSTLM. I rebuilt IRSTLM and I created two symlinks:
ln -s $IRSTLM/bin/x86_64 $IRSTLM/bin/i686
ln -s $IRSTLM/lib/x86_64 $IRSTLM/lib/i686
Then I rebuilt moses with --with-irstlm=$IRSTLM.
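The symlink step can be made repeatable, so it is safe to run again after another IRSTLM rebuild. A small sketch, assuming the same bin/x86_64 and lib/x86_64 layout as above. When `IRSTLM` is not set, it builds a throwaway stand-in tree (hypothetical, only so the snippet runs anywhere).

```shell
# Use the real install if IRSTLM is set; otherwise create a scratch tree that
# mimics the layout produced by building IRSTLM with MACHTYPE=x86_64.
if [ -z "${IRSTLM:-}" ]; then
    IRSTLM="$(mktemp -d)"
    mkdir -p "$IRSTLM/bin/x86_64" "$IRSTLM/lib/x86_64"
fi

# Create the i686 aliases only where nothing with that name exists yet.
for sub in bin lib; do
    link="$IRSTLM/$sub/i686"
    if [ ! -e "$link" ] && [ ! -L "$link" ]; then
        ln -s "$IRSTLM/$sub/x86_64" "$link"
    fi
done

ls -l "$IRSTLM/bin" "$IRSTLM/lib"
```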
RESULTS: the mert-moses-new.pl script runs flawlessly with 3-gram and
5-gram IRSTLM language models and the exact same config files in CASE 1 and
CASE 2 above.
Go figure!
Hope this helps others.
Tom
Hi, what are the advantages of using a 3-gram model and a 5-gram model? Which of the two is better and why?
-Rex