Archive for November, 2009
A Cool Dictionary for Natural Language Processing
I came across Professor Bill Wilson’s “The Natural Language Processing Dictionary” by accident tonight, and thought it would be very cool for NLPers. Besides the NLP Dictionary, you can also find the Prolog, Artificial Intelligence and Machine Learning dictionaries on the same web page. Below is an excerpt from the dictionary:
Read the rest of this entry »
Moses Support Digest:Suffix arrays in Moses
[Moses-support] Suffix arrays in Moses
Hi all,
I’m just wanting to double-check the current state of the suffix array code in Moses. Can it be used to extract translation table entries on-the-fly?
Also, has anyone written up a paper on this in Moses? I’d like to know who to cite if this has been written up.
Cheers,
Lane
Read the rest of this entry »
Moses Support Digest:RDBMS for the decoder
[Moses-support] RDBMS for the decoder?
I wonder what the implications would be if an RDBMS were used for the decoder. I’m also wondering what Google Translate uses – is there any public info about them in this respect?
Catalin Braescu
Omlulu.com
Read the rest of this entry »
Moses Support Digest:openTMS supports Moses as a data source
[Moses-support] problem with boost library
openTMS is an Open Source Translation Memory initiative which has published its first software versions on www.opentms.de. The latest version also includes an interface for using MOSES within the translation process. MOSES is accessed through a so-called data source – a set of interface methods defining the access to translation sources.
For more details on the integration of MOSES in openTMS see also
http://www.opentms.de/?q=node/64.
Regards
Dr. Klemens Waldhör (Chief Architect of openTMS)
Read the rest of this entry »
Moses Support Digest:moses decoder results on cygwin and dos
[Moses-support] moses decoder results on cygwin and dos
Dear All
Running the moses decoder on cygwin and dos gives slightly different results, even though I’m using the same executable and the same models.
For example, translating from Welsh to English:
Welsh: bydd y bore ‘n oer .
English: the morning will be cold .
moses at cygwin: morning will be cold .
moses at dos: bydd the morning will be cold .
The main problem is that on dos, moses is always returning the first word of the source language, prepended to the translation itself. Easy to strip off but annoying. The translation itself is often slightly better on dos than on cygwin, as above (which is if anything even stranger).
Can anyone account for this strange behaviour? More important, how can I stop the first word of source language returning?
Thanks and best wishes
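The post calls the echoed source word "easy to strip off". A minimal Python sketch of that workaround is below. It does not explain or fix the underlying difference between cygwin and dos; it only post-processes the output. The function name `strip_echoed_source_word` is hypothetical and not part of Moses.

```python
def strip_echoed_source_word(source: str, translation: str) -> str:
    """Drop the first output token when it merely repeats the first source token."""
    src_tokens = source.split()
    out_tokens = translation.split()
    # Only strip when the decoder output starts with the first source word.
    if src_tokens and out_tokens and out_tokens[0] == src_tokens[0]:
        out_tokens = out_tokens[1:]
    return " ".join(out_tokens)


source = "bydd y bore 'n oer ."
dos_output = "bydd the morning will be cold ."
cygwin_output = "morning will be cold ."

print(strip_echoed_source_word(source, dos_output))
print(strip_echoed_source_word(source, cygwin_output))
```

One caveat with this heuristic: it would also remove a legitimate first word whenever source and translation genuinely start with the same token (a name or a number, for instance), so it is only a stopgap.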
Moses Support Digest:Code monkey available,Will work for peanuts
[Moses-support] ’Code monkey available. Will work for peanuts’ !
Hello,
I’m a graduate student at Bowling Green State University and I’m really interested in working on a term project using MOSES. Can anyone suggest some good project ideas related to AI techniques?
Thanks,
Arjun Upadhyaya
Read the rest of this entry »
Moses Support Digest:A translation chain prototype with Moses + IRSTLM
[Moses-support] Moses Support Digest:A translation chain prototype with Moses + IRSTLM
http://code.google.com/p/moses-for-mere-mortals/
This site offers a set of 3 scripts that, together, create a basic translation chain prototype (with Moses + IRSTLM) capable of processing very large corpora. The idea is to help build a translation chain for the real world, but it should also enable a quick evaluation of Moses for actual translation work and guide users in their first steps of using Moses.
Read the rest of this entry »
Moses Support Digest:Pulling source data
[Moses-support] Pulling source data
I am experimenting with the Moses application now, and I have it so that it is pulling in my data from two flat, aligned text files.
My question is, can I pull in data from a mysql database table rather than a text file, or would the best approach be to dump the data on a regular basis to a text file and then process from there?
Thanks,
John
Read the rest of this entry »
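The second option John mentions, dumping the database to text files on a regular basis, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module as a stand-in for MySQL, and the table name `segments`, the columns `source_text` and `target_text`, and the output file names are all hypothetical. The rows are toy data.

```python
import os
import sqlite3
import tempfile


def dump_parallel_corpus(conn, src_path, tgt_path):
    """Write one sentence pair per line to two aligned files; return the pair count."""
    rows = conn.execute(
        "SELECT source_text, target_text FROM segments ORDER BY id"
    )
    count = 0
    with open(src_path, "w", encoding="utf-8", newline="\n") as src, \
         open(tgt_path, "w", encoding="utf-8", newline="\n") as tgt:
        for source_text, target_text in rows:
            # Collapse internal whitespace (including newlines) so that line N
            # in one file always corresponds to line N in the other.
            src.write(" ".join(source_text.split()) + "\n")
            tgt.write(" ".join(target_text.split()) + "\n")
            count += 1
    return count


# Toy in-memory database standing in for the real MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE segments (id INTEGER PRIMARY KEY, source_text TEXT, target_text TEXT)"
)
conn.executemany(
    "INSERT INTO segments (source_text, target_text) VALUES (?, ?)",
    [("bydd y bore 'n oer .", "the morning will be cold ."),
     ("bore da .", "good morning .")],
)

out_dir = tempfile.mkdtemp()
src_path = os.path.join(out_dir, "corpus.cy")
tgt_path = os.path.join(out_dir, "corpus.en")
pair_count = dump_parallel_corpus(conn, src_path, tgt_path)
print(pair_count)
```

Writing both files in a single pass over one query keeps the two sides aligned line by line, which is the property the flat-file setup described in the post relies on.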
Moses Support Digest:Hierarchical rule extraction
[Moses-support] Hierarchical rule extraction
Hieu / Philipp,
I’d like to extract a hierarchical grammar using the Moses tools (which Philipp helpfully pointed out have some documentation at http://www.statmt.org/moses/?n=Moses.ChartDecoding)
Here’s the catch: I have already run train-factored-phrase-model.perl to extract a regular Moses phrase table using trunk. That process already ran GIZA++, and I’d really rather not have to re-run alignment over the exact same data set just to get the branched version of train-factored-phrase-model.perl to let me pass the additional -hierarchical -glue-grammar flags.
Is there a way to run the branch version of train-factored-phrase-model.perl with -hierarchical -glue-grammar, and have it re-use the existing alignments that were created by my previous run of (trunk) train-factored-phrase-model.perl?
Cheers,
Lane
Read the rest of this entry »
Moses Support Digest:Building POS language model with SRILM
[Moses-support] Building POS language model with SRILM
Hi,
The Moses manual recommends using the following switches when building a language model with SRILM:
-interpolate -kndiscount
I assume this recommendation applies specifically to surface-string language models. For a part-of-speech language model, KN-discounting is inappropriate because it is based on counts-of-counts, and the counts-of-counts for POSes are odd in that there are very few POSes that occur only once or twice in a given corpus.
Are there particular switches that are recommended for building a POS language model with SRILM?
Regards,
Ben Gottesman
Read the rest of this entry »
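Ben's point about counts-of-counts can be checked on any tagged corpus. The Python sketch below is an illustration, not SRILM's implementation: it counts how many distinct n-grams occur exactly r times and then applies the textbook absolute-discount estimate D = n1 / (n1 + 2*n2). The function names and the two toy sequences are made up for the example.

```python
from collections import Counter


def counts_of_counts(tokens, order=1):
    """Return {r: number of distinct n-grams of the given order seen exactly r times}."""
    ngrams = zip(*(tokens[i:] for i in range(order)))
    ngram_counts = Counter(ngrams)
    return Counter(ngram_counts.values())


def kn_discount(coc):
    """Textbook estimate D = n1 / (n1 + 2*n2); None when n1 and n2 are both zero."""
    n1, n2 = coc.get(1, 0), coc.get(2, 0)
    if n1 + 2 * n2 == 0:
        return None  # no singletons or doubletons, so the estimate is undefined
    return n1 / (n1 + 2 * n2)


# Toy surface-word sequence: mostly singletons, as in real word data.
words = "the cat sat on a mat near the door .".split()
# Toy POS-tag sequence: a tiny tag set in which every tag repeats.
tags = "DT NN VB DT NN VB DT NN VB . . .".split()

word_coc = counts_of_counts(words)
tag_coc = counts_of_counts(tags)
print(dict(word_coc), kn_discount(word_coc))
print(dict(tag_coc), kn_discount(tag_coc))
```

In the toy tag sequence no tag occurs exactly once or twice, so the estimate has nothing to work with, which is the situation the post describes for POS corpora. Running `counts_of_counts` on the real tagged training data (at each n-gram order) would show how severe the effect is before choosing a smoothing method.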