Moses Support Digest: dictionary problem solved
[Moses-support] dictionary problem solved
Hi all,
The dictionary problem is finally solved: the “-d” option works well. The problem came from a silly mistake of mine. I had converted the dictionary file to UTF-8, but the other files were encoded as 7-bit ASCII. Sorry to have bothered you for such a long time…
I really appreciate your kind help, especially from Mark Fishel and Chris Dyer. You have helped this newcomer a lot.
When I googled this dictionary problem, all I found was my own question. So, for anyone who wants to use a dictionary and doesn’t know how, here is my advice:
1. Make sure all your text files use the same encoding.
2. Check your GIZA++ source code, find the variable “useDict”, and make sure it is set to true.
3. Add a “-d” option to your command, followed by your dictionary file. Each line of the dictionary should be in this format:
target-word-id source-word-id
The file must be sorted by target-word-id.
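To illustrate point 3, here is a minimal Python sketch that turns word pairs into "target-word-id source-word-id" lines sorted by target id. It assumes the .vcb files hold one "id word count" entry per line. The function names and the sample words are made up for illustration and are not part of GIZA++.

```python
def load_vocab(lines):
    """Map word -> integer id from .vcb-style lines (assumed "id word count")."""
    vocab = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            vocab[parts[1]] = int(parts[0])
    return vocab


def build_dictionary(pairs, target_vocab, source_vocab):
    """Return "target-id source-id" lines, sorted numerically by target id."""
    id_pairs = set()
    for target_word, source_word in pairs:
        # Skip pairs with a word that is missing from either vocabulary.
        if target_word in target_vocab and source_word in source_vocab:
            id_pairs.add((target_vocab[target_word], source_vocab[source_word]))
    # Tuples sort by target id first, then by source id.
    return [f"{t} {s}" for t, s in sorted(id_pairs)]


if __name__ == "__main__":
    # Hypothetical sample data; real input would be read from the .vcb files.
    target_vcb = ["2 hakgyo 10", "3 mul 7"]
    source_vcb = ["2 shui 9", "3 xuexiao 12"]
    pairs = [("mul", "shui"), ("hakgyo", "xuexiao"), ("unknown", "word")]
    for line in build_dictionary(pairs, load_vocab(target_vcb), load_vocab(source_vcb)):
        print(line)
```

Sorting the integer pairs rather than the text lines keeps the order numeric, so id 10 comes after id 9 instead of after id 1.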
Here’s my command line:
(Pay attention to the options that are set to 0 or 1; otherwise a lot of files may be generated.)
./GIZA++ \
  -CoocurrenceFile korean-chinese.cooc \
  -c korean-chinese-int-train.snt \
  -m1 5 -m2 0 -mh 5 -m3 3 -m4 3 \
  -model1dumpfrequency 1 \
  -model2dumpfrequency 1 \
  -model345dumpfrequency 1 \
  -hmmdumpfrequency 1 \
  -model4smoothfactor 0.4 \
  -nbestalignments 1 \
  -onlyaldumps 0 \
  -nodumps 0 \
  -nsmooth 4 \
  -d ck.txt \
  -o korean-chinese \
  -onlyaldumps 1 \
  -p0 0.999 \
  -s chinese.vcb \
  -t korean.vcb
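
Since my mistake was an encoding mismatch between the dictionary and the other files, a quick check before running the command may save time. The following is only a rough Python sketch: the function names and the labels it prints are my own invention, and it only distinguishes plain ASCII, UTF-8 (with or without a byte-order mark) and everything else.

```python
import sys


def classify_encoding(data: bytes) -> str:
    """Give a coarse label for the encoding of raw file contents."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8 with BOM"
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other"


def report(paths):
    """Return a dict mapping each path to its coarse encoding label."""
    results = {}
    for path in paths:
        with open(path, "rb") as handle:
            results[path] = classify_encoding(handle.read())
    return results


if __name__ == "__main__":
    results = report(sys.argv[1:])
    for path, label in results.items():
        print(f"{label}\t{path}")
    if len(set(results.values())) > 1:
        print("warning: the files do not all share one encoding")
```

Saved under a hypothetical name such as check_encoding.py, it could be run as: python check_encoding.py ck.txt chinese.vcb korean.vcb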
2009-12-23
Best regards,
Lee Xianhua
NOTICE: This is digested from the Moses-support mailing list, which provides support for the Moses SMT decoder.