<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>52nlp&#039;s Learning Notes &#187; NLP</title>
	<atom:link href="http://www.52nlp.com/tag/nlp/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.52nlp.com</link>
	<description>Natural Language Processing, Machine Learning, Programming Skill, Mathematics</description>
	<lastBuildDate>Sat, 23 Apr 2011 05:17:18 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>A Cool Dictionary for Natural Language Processing</title>
		<link>http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/</link>
		<comments>http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/#comments</comments>
		<pubDate>Mon, 30 Nov 2009 15:53:53 +0000</pubDate>
		<dc:creator>52nlp</dc:creator>
				<category><![CDATA[NLP]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Dictionary]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[nlpers]]></category>

		<guid isPermaLink="false">http://www.52nlp.com/?p=163</guid>
		<description><![CDATA[I came across Professor Bill Wilson&#8217;s &#8220;The Natural Language Processing Dictionary&#8221; by accident tonight, and I think it is very cool for nlpers. Besides the NLP Dictionary, you can also find the Prolog, Artificial Intelligence and Machine Learning Dictionaries on the same web page. &#8230; <a href="http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/">Continue reading <span class="meta-nav">&#8594;</span></a>


Related posts:<ol><li><a href='http://www.52nlp.com/hello-world/' rel='bookmark' title='Permanent Link: Hello, Natural Language Processing World!'>Hello, Natural Language Processing World!</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-how-to-run-giza-with-a-dictionary/' rel='bookmark' title='Permanent Link: Moses Support Digest:How to run giza++ with a dictionary'>Moses Support Digest:How to run giza++ with a dictionary</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-dictionary-problem-solved/' rel='bookmark' title='Permanent Link: Moses Support Digest:dictionary problem solved'>Moses Support Digest:dictionary problem solved</a></li>
<li><a href='http://www.52nlp.com/from-nlpers-getting-started-in-nlp/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started in NLP'>From nlpers:Getting Started in NLP</a></li>
<li><a href='http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/' rel='bookmark' title='Permanent Link: Bayesian Modeling for Language Tutorial Reading'>Bayesian Modeling for Language Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-dictonary-use-during-training/' rel='bookmark' title='Permanent Link: Moses Support Digest:Dictonary use during training'>Moses Support Digest:Dictonary use during training</a></li>
<li><a href='http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/' rel='bookmark' title='Permanent Link: Graphical Models and Bayesian Networks Tutorial Reading'>Graphical Models and Bayesian Networks Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-lmodel-dub-parameter/' rel='bookmark' title='Permanent Link: Moses Support Digest:-lmodel-dub parameter'>Moses Support Digest:-lmodel-dub parameter</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I came across Professor Bill Wilson&#8217;s &#8220;The Natural Language Processing Dictionary&#8221; by accident tonight, and I think it is very cool for nlpers. Besides the NLP Dictionary, you can also find the Prolog, Artificial Intelligence and Machine Learning Dictionaries on the same web page. The following is quoted from the dictionary:<br />
<span id="more-163"></span><br />
&#8220;You should use The NLP Dictionary to clarify or revise concepts that you have already met. The NLP Dictionary is not a suitable way to begin to learn about NLP. Further information on NLP can be found in the class web page lecture notes section.</p>
<p>Other places to find out about artificial intelligence include the AAAI (American Association for Artificial Intelligence) AI Overview page or AI Reference Shelf.</p>
<p>If you wish to suggest an item or items that should be included, or if you found an item that you felt was unclear, please let me know (E-mail: billw at cse.unsw.edu.au).&#8221;</p>
<p>If you are interested in the NLP Dictionary and the others, you can visit them here:<br />
The Natural Language Processing Dictionary &#8211; URL: <a href="http://www.cse.unsw.edu.au/~billw/nlpdict.html" target="_blank">http://www.cse.unsw.edu.au/~billw/nlpdict.html</a><br />
The Prolog Dictionary &#8211; URL: <a href="http://www.cse.unsw.edu.au/~billw/prologdict.html" target="_blank">http://www.cse.unsw.edu.au/~billw/prologdict.html</a><br />
The Artificial Intelligence Dictionary &#8211; URL: <a href="http://www.cse.unsw.edu.au/~billw/aidict.html" target="_blank">http://www.cse.unsw.edu.au/~billw/aidict.html</a><br />
The Machine Learning Dictionary &#8211; URL: <a href="http://www.cse.unsw.edu.au/~billw/mldict.html" target="_blank">http://www.cse.unsw.edu.au/~billw/mldict.html</a></p>


<p>Related posts:<ol><li><a href='http://www.52nlp.com/hello-world/' rel='bookmark' title='Permanent Link: Hello, Natural Language Processing World!'>Hello, Natural Language Processing World!</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-how-to-run-giza-with-a-dictionary/' rel='bookmark' title='Permanent Link: Moses Support Digest:How to run giza++ with a dictionary'>Moses Support Digest:How to run giza++ with a dictionary</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-dictionary-problem-solved/' rel='bookmark' title='Permanent Link: Moses Support Digest:dictionary problem solved'>Moses Support Digest:dictionary problem solved</a></li>
<li><a href='http://www.52nlp.com/from-nlpers-getting-started-in-nlp/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started in NLP'>From nlpers:Getting Started in NLP</a></li>
<li><a href='http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/' rel='bookmark' title='Permanent Link: Bayesian Modeling for Language Tutorial Reading'>Bayesian Modeling for Language Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-dictonary-use-during-training/' rel='bookmark' title='Permanent Link: Moses Support Digest:Dictonary use during training'>Moses Support Digest:Dictonary use during training</a></li>
<li><a href='http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/' rel='bookmark' title='Permanent Link: Graphical Models and Bayesian Networks Tutorial Reading'>Graphical Models and Bayesian Networks Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-lmodel-dub-parameter/' rel='bookmark' title='Permanent Link: Moses Support Digest:-lmodel-dub parameter'>Moses Support Digest:-lmodel-dub parameter</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>From nlpers:Getting Started In Summarization</title>
		<link>http://www.52nlp.com/from-nlpersgetting-started-in-summarization/</link>
		<comments>http://www.52nlp.com/from-nlpersgetting-started-in-summarization/#comments</comments>
		<pubDate>Mon, 23 Nov 2009 11:37:02 +0000</pubDate>
		<dc:creator>52nlp</dc:creator>
				<category><![CDATA[NLP]]></category>
		<category><![CDATA[Summarization]]></category>

		<guid isPermaLink="false">http://www.52nlp.com/?p=69</guid>
		<description><![CDATA[&#8220;Getting Started In Summarization&#8221; is the second post of the &#8220;GSI&#8221; series on the nlpers blog. The following is from the nlpers blog: I&#8217;ll kick off the &#8220;Getting Started In&#8221; series with summarization, since it is near and dear to my heart. Warning: I &#8230; <a href="http://www.52nlp.com/from-nlpersgetting-started-in-summarization/">Continue reading <span class="meta-nav">&#8594;</span></a>


Related posts:<ol><li><a href='http://www.52nlp.com/from-nlpers-getting-started-in-nlp/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started in NLP'>From nlpers:Getting Started in NLP</a></li>
<li><a href='http://www.52nlp.com/moses-support-hierarchical-rule-extraction/' rel='bookmark' title='Permanent Link: Moses Support Digest:Hierarchical rule extraction'>Moses Support Digest:Hierarchical rule extraction</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-is-reordering-model-a-must-be-used-component-to-use/' rel='bookmark' title='Permanent Link: Moses Support Digest:Is reordering model a must-be-used component to use?'>Moses Support Digest:Is reordering model a must-be-used component to use?</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>&#8220;Getting Started In Summarization&#8221; is the second post of the &#8220;GSI&#8221; series on the <a href="http://nlpers.blogspot.com/">nlpers</a> blog. The following is from the nlpers blog:<span id="more-69"></span></p>
<p>I&#8217;ll kick off the &#8220;Getting Started In&#8221; series with summarization, since it is near and dear to my heart.</p>
<p>Warning: I should point out at the get-go that these posts are not intended to be comprehensive or present a representative sample of work in an area. These are blog posts, not survey papers, tutorials or anything of that sort. If I don&#8217;t cite a paper, it&#8217;s not because I don&#8217;t think it&#8217;s any good. It&#8217;s also probably not a good idea to base a career move on what I say.</p>
<p>Summarization is the task of taking a long utterance (or a collection of long utterances) and producing a short utterance. The most common case is when utterance=document, but there has also been work on speech summarization. There are roughly three popular types of summarization: headline generation, sentence extraction and sentence compression (or document compression). In sentence extraction, a summary is created by selecting a bunch of sentences from the original document(s) and gluing them together. In headline generation, a very short summary (10 words or so) is created by selecting a bunch of words from the original document(s) and gluing them together. In sentence (document) compression a summary is created by dropping words and phrases from a sentence (document).</p>
<p>One of the big problems in summarization is that if you give two humans the same set of documents and ask them to write a summary, they will do wildly different things. This happens because (a) it&#8217;s unclear what information is important and (b) each human has different background knowledge. This is partially alleviated by moving to a task-specific setting, like the query-focused summarization model, which has grown increasingly popular in the past few years. In the query-focused setting, a document (or document collection) is provided along with a user query that serves to focus the summary on a particular topic. This doesn&#8217;t fix problem (b), but goes a long way to fixing (a), and human agreement goes up dramatically. Another advantage to the query-focused setting is that, at least with news documents, the most important information is usually presented first (this is actually stipulated by many newspaper editors&#8217; guidelines). This means that producing a summary by just taking leading sentences often does incredibly well.</p>
<p>A related problem is that of evaluation. The best option is to do a human evaluation, preferably in some simulated real world setting. A reasonable alternative is to ask humans to write reference summaries and compare system output to these. The collection of Rouge metrics has been designed to automate this comparison (Rouge is essentially a collection of similarity metrics for matching human to system summaries). Overall, however, evaluation is a long-standing and not-well-solved problem in summarization.</p>
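<p>The idea of matching system output against human reference summaries can be illustrated with a short Python sketch. It computes an n-gram recall in the spirit of ROUGE-N: the number of reference n-grams that also occur in the system summary (clipped by how often they occur there), divided by the total number of reference n-grams. The function names and the crude tokenizer are assumptions made for illustration; the actual ROUGE package defines further variants and options that are not reproduced here.</p>

```python
from collections import Counter
import re


def ngrams(text, n):
    """Count the word n-grams of a text, using a crude lowercase tokenizer."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def rouge_n_recall(system, references, n=1):
    """Clipped n-gram matches with the references over total reference n-grams."""
    sys_counts = ngrams(system, n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref, n)
        # An n-gram is credited at most as often as it appears in the system summary.
        matched += sum(min(count, sys_counts[gram]) for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


if __name__ == "__main__":
    system_summary = "the cat sat on the mat"
    reference_summaries = ["the cat was on the mat"]
    print(rouge_n_recall(system_summary, reference_summaries, n=1))
    print(rouge_n_recall(system_summary, reference_summaries, n=2))
```

<p>Because the score is a recall, a system can raise it simply by producing longer output, which is one reason summaries are usually compared under a fixed length limit.</p>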
<p>Techniques for summarization vary by summary type (extraction, headline, compression).</p>
<p>The standard recipe for sentence extraction works as follows. A summary is created by first extracting the &#8220;best&#8221; sentence according to a scoring module. The second sentence is selected by finding the next-best sentence according to the scoring module, <span style="font-style: italic;">minus</span> some redundancy penalty (we don&#8217;t want to extract the same information over and over again). This process ends when the summary is sufficiently long. The scoring component assigns to each sentence in the original document collection a score that says how important it is (in query-focused summarization, for instance, the word overlap between the sentence and the query would be a start). The redundancy component typically computes the similarity between a sentence and the previously extracted sentences.</p>
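<p>The extraction recipe above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than any particular system: the crude tokenizer, the bag-of-words cosine used for both the query score and the redundancy term, the <code>penalty</code> weight and the <code>max_words</code> length budget are all illustrative choices that do not come from the post.</p>

```python
from collections import Counter
import math
import re


def tokens(text):
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_summary(sentences, query, max_words=30, penalty=1.0):
    """Greedily pick sentences by query score minus a redundancy penalty."""
    query_bag = Counter(tokens(query))
    bags = [Counter(tokens(s)) for s in sentences]
    chosen = []  # indices of extracted sentences, in extraction order
    remaining = set(range(len(sentences)))
    length = 0

    def adjusted(i):
        # Scoring component: word overlap (cosine) between sentence and query.
        score = cosine(bags[i], query_bag)
        # Redundancy component: similarity to the closest sentence already extracted.
        redundancy = max((cosine(bags[i], bags[j]) for j in chosen), default=0.0)
        return score - penalty * redundancy

    while remaining:
        best = max(sorted(remaining), key=adjusted)  # ties go to the earliest sentence
        chosen.append(best)
        remaining.remove(best)
        length += len(tokens(sentences[best]))
        if length >= max_words:  # the summary is sufficiently long
            break
    return [sentences[i] for i in chosen]


SENTENCES = [
    "The storm closed the airport on Monday.",
    "The airport was closed by the storm on Monday.",
    "Flights resumed on Tuesday after crews cleared the runway.",
]
QUERY = "airport storm"

if __name__ == "__main__":
    for sentence in extract_summary(SENTENCES, QUERY, max_words=15):
        print(sentence)
```

<p>With the sample sentences in the block and <code>max_words=15</code>, raising <code>penalty</code> from 0 to 1 changes the second pick from the near-duplicate airport sentence to the Tuesday sentence, which is the effect the redundancy term is meant to have.</p>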
<p>Headline generation and sentence compression have not yet reached a point of stability in the same way that sentence extraction has. A very popular and successful approach to headline generation is to train a hidden Markov model something like what you find in statistical machine translation (similar to IBM model 1, for those familiar with it). For sentence compression, one typically parses a sentence and then attempts to summarize it by dropping words and phrases (phrases = whole constituents).</p>
<p>Summarization has close ties to question answering and information retrieval; in fact, having some limited background in standard IR techniques (tf-idf, vector space model, etc.) is pretty much necessary in order to understand what goes on in summarization.</p>
<p>Here are some papers/tutorials/slides worth reading to get one started (I&#8217;d also recommend Inderjeet Mani&#8217;s book, <span style="font-style: italic;">Automatic Summarization</span> if you don&#8217;t mind spending a little money):</p>
<ol>
<li><a href="http://www.csee.umbc.edu/%7Eian/irF02/lectures/07Models-VSM.pdf">Background IR material</a></li>
<li>Sentence extraction
<ol>
<li><a href="http://www.isi.edu/%7Emarcu/acl-tutorial.ppt">Marcu&#8217;s ACL tutorial</a></li>
<li><a href="http://citeseer.ist.psu.edu/619827.html">Multi-Document Summarization By Sentence Extraction</a></li>
<li><a href="http://www.isi.edu/%7Ecyl/papers/NeATS-HLT2002-notebook-paper.pdf">Automated Multi-document Summarization in NeATS</a></li>
<li><a href="http://citeseer.ist.psu.edu/kupiec95trainable.html">A Trainable Document Summarizer</a></li>
</ol>
</li>
<li>Headline generation: <a href="ftp://ftp.umiacs.umd.edu/pub/bonnie/AutomaticHeadlineGenerationFinalRevisedAug7.pdf">Automatic Headline Generation for Newspaper Stories</a></li>
<li>Sentence compression: <a href="http://www.isi.edu/natural-language/mt/statsum.ps">Statistics-Based Summarization &#8212; Step One: Sentence Compression</a></li>
<li>Other:
<ol>
<li>Discussion: <a href="ftp://ftp.cl.cam.ac.uk/papers/ksj/ksj-whats-in-a-summary.ps.gz">What might be in a summary?</a></li>
<li>Automatic evaluation: <a href="http://www.isi.edu/%7Ecyl/papers/WAS2004.pdf">ROUGE: a Package for Automatic Evaluation of Summaries</a></li>
<li>Recent fun stuff:  <a href="http://www.cs.columbia.edu/%7Ehjing/papers/cutpaste.ps">Cut and paste based text summarization</a>, <a href="http://people.csail.mit.edu/regina/my_papers/statpar.ps">Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment</a> and <a href="http://www.cs.cmu.edu/afs/cs.cmu.edu/user/chiori/www/publication/EURASIP.pdf">A Statistical Approach for Automatic Speech Summarization</a></li>
</ol>
</li>
</ol>


<p>Related posts:<ol><li><a href='http://www.52nlp.com/from-nlpers-getting-started-in-nlp/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started in NLP'>From nlpers:Getting Started in NLP</a></li>
<li><a href='http://www.52nlp.com/moses-support-hierarchical-rule-extraction/' rel='bookmark' title='Permanent Link: Moses Support Digest:Hierarchical rule extraction'>Moses Support Digest:Hierarchical rule extraction</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-is-reordering-model-a-must-be-used-component-to-use/' rel='bookmark' title='Permanent Link: Moses Support Digest:Is reordering model a must-be-used component to use?'>Moses Support Digest:Is reordering model a must-be-used component to use?</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.52nlp.com/from-nlpersgetting-started-in-summarization/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>From nlpers:Getting Started in NLP</title>
		<link>http://www.52nlp.com/from-nlpers-getting-started-in-nlp/</link>
		<comments>http://www.52nlp.com/from-nlpers-getting-started-in-nlp/#comments</comments>
		<pubDate>Fri, 20 Nov 2009 13:58:35 +0000</pubDate>
		<dc:creator>52nlp</dc:creator>
				<category><![CDATA[NLP]]></category>
		<category><![CDATA[nlpers]]></category>

		<guid isPermaLink="false">http://www.52nlp.com/?p=63</guid>
		<description><![CDATA[　　The nlpers blog is very well known in the natural language processing world, but unfortunately it cannot be visited directly from China. From now on I will pick some useful posts from nlpers and mirror them here, in the hope &#8230; <a href="http://www.52nlp.com/from-nlpers-getting-started-in-nlp/">Continue reading <span class="meta-nav">&#8594;</span></a>


Related posts:<ol><li><a href='http://www.52nlp.com/from-nlpersgetting-started-in-summarization/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started In Summarization'>From nlpers:Getting Started In Summarization</a></li>
<li><a href='http://www.52nlp.com/hello-world/' rel='bookmark' title='Permanent Link: Hello, Natural Language Processing World!'>Hello, Natural Language Processing World!</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/' rel='bookmark' title='Permanent Link: Bayesian Modeling for Language Tutorial Reading'>Bayesian Modeling for Language Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/' rel='bookmark' title='Permanent Link: A Cool Dictionary for Natural Language Processing'>A Cool Dictionary for Natural Language Processing</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/' rel='bookmark' title='Permanent Link: Graphical Models and Bayesian Networks Tutorial Reading'>Graphical Models and Bayesian Networks Tutorial Reading</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>　　The <a href="http://nlpers.blogspot.com/" target="_blank">nlpers</a> blog is very well known in the natural language processing world, but unfortunately it cannot be visited directly from China. From now on I will pick some useful posts from nlpers and mirror them here, in the hope that they will serve as a bridge between Chinese NLP lovers and the nlpers blog. <span id="more-63"></span><br />
　　&#8220;Getting Started in NLP&#8221; was posted in 2006, but I think it is still very useful for NLP learners, especially beginners. I have translated it into Chinese; if you are interested, you can find the translation on my Chinese blog: <a href="http://www.52nlp.cn/getting-started-in-natural-language-processing">getting started in natural language processing</a>.<br />
　　The following is from the nlpers blog:</p>
<p>　　Since starting the blog, a few people have asked me how one can get started in NLP, while residing in a department lacking NLP researchers. This is a difficult question: I fell into NLP quite naturally when I was at CMU and made an easy transition to grad school at USC, both of which have awesome NLP groups. Lacking such internal support, one has to be much more ambitious to get to the point where one could do real research in the field. The obvious avenues for support are: reading books (which ones?) and papers (from where and by whom?), going to nearby conferences (which ones?) and experimentation (on what?). (New option: read and post to this blog!)<br />
　　The four standard books in the field are Statistical NLP (Manning + Schutze), Speech and Language Processing (Jurafsky + Martin), Statistical Language Learning (Charniak) and Natural Language Understanding (Allen). The latter two are much older, though some people prefer Charniak to Manning + Schutze. I would probably pick up Manning + Schutze if I could only buy one. From this book, I think that skimming Chapters 1, 4, 6 and 13 should give a reasonable (but not uniformly sampled) representation of background knowledge everyone should know. Unfortunately, this misses many topics: information extraction, summarization, question answering, dialog systems, discourse, morphology, ontologies, pragmatics, semantics, sentiment analysis and textual entailment.<br />
　　Finding good papers for beginners is hard. Without guidance, skimming titles and abstracts of papers published in ACL, NAACL, HLT or COLING since 2002 or 2003 should enable someone to find out what looks interesting to them. I know many advisors take this approach with new students. The ACL anthology is great for finding old papers. I&#8217;ll probably post at a later date about what are the &#8220;must reads&#8221; for the areas I know best. Once you&#8217;ve found a few papers you like, I&#8217;d check out the respective authors&#8217; web pages and see if they have any related work (best bet is probably to look at the advisor&#8217;s page: often s/he will have multiple students working on similar topics). Also, advisors often have course material and slides from tutorials: these are great places to get introductory-level material.<br />
　　If you happen to get lucky (I never have) and one of the above conferences is located nearby, I&#8217;d just go. Presentations of papers (if they&#8217;re good) are often better from the perspective of getting the high-level overview than the papers themselves, since papers have to be technically complete.<br />
　　I&#8217;m perhaps overcommitting myself, given my promise to talk more about structured prediction, but over the next few weeks/months, I&#8217;ll work on a &#8220;Getting Started in X&#8221; series. X will likely range over the set { summarization, sequence labeling, information extraction, machine translation, language modeling }. Requests for other topics will be heard, keeping in mind I&#8217;m not an expert in many areas.</p>


<p>Related posts:<ol><li><a href='http://www.52nlp.com/from-nlpersgetting-started-in-summarization/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started In Summarization'>From nlpers:Getting Started In Summarization</a></li>
<li><a href='http://www.52nlp.com/hello-world/' rel='bookmark' title='Permanent Link: Hello, Natural Language Processing World!'>Hello, Natural Language Processing World!</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/' rel='bookmark' title='Permanent Link: Bayesian Modeling for Language Tutorial Reading'>Bayesian Modeling for Language Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/' rel='bookmark' title='Permanent Link: A Cool Dictionary for Natural Language Processing'>A Cool Dictionary for Natural Language Processing</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/' rel='bookmark' title='Permanent Link: Graphical Models and Bayesian Networks Tutorial Reading'>Graphical Models and Bayesian Networks Tutorial Reading</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.52nlp.com/from-nlpers-getting-started-in-nlp/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Graphical Models and Bayesian Networks Tutorial Reading</title>
		<link>http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/</link>
		<comments>http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/#comments</comments>
		<pubDate>Wed, 18 Nov 2009 16:14:50 +0000</pubDate>
		<dc:creator>52nlp</dc:creator>
				<category><![CDATA[NLP]]></category>
		<category><![CDATA[Bayesian Networks]]></category>
		<category><![CDATA[Graphical Models]]></category>

		<guid isPermaLink="false">http://www.52nlp.com/?p=48</guid>
		<description><![CDATA[The following is an excerpt from &#8220;A Brief Introduction to Graphical Models and Bayesian Networks&#8221; by Kevin Murphy. Books In reverse chronological order. Daphne Koller and Nir Friedman, &#8220;Probabilistic graphical models: principles and techniques&#8221;, MIT Press 2009 Adnan Darwiche, &#8220;Modeling and reasoning &#8230; <a href="http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/">Continue reading <span class="meta-nav">&#8594;</span></a>


Related posts:<ol><li><a href='http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/' rel='bookmark' title='Permanent Link: Bayesian Modeling for Language Tutorial Reading'>Bayesian Modeling for Language Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/from-nlpers-getting-started-in-nlp/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started in NLP'>From nlpers:Getting Started in NLP</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-tuning-tree-based-models/' rel='bookmark' title='Permanent Link: Moses Support Digest:tuning tree-based models'>Moses Support Digest:tuning tree-based models</a></li>
<li><a href='http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/' rel='bookmark' title='Permanent Link: A Cool Dictionary for Natural Language Processing'>A Cool Dictionary for Natural Language Processing</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>The following is an excerpt from &#8220;<a href="http://people.cs.ubc.ca/~murphyk/Bayes/bnintro.html" target="_blank">A Brief Introduction to Graphical Models and Bayesian Networks</a>&#8221; by Kevin Murphy.<span id="more-48"></span></p>
<h2>Books</h2>
<p>In reverse chronological order.</p>
<ul>
<li>Daphne Koller and Nir Friedman, &#8220;Probabilistic graphical models: principles and techniques&#8221;, MIT Press 2009</li>
<li>Adnan Darwiche, &#8220;Modeling and reasoning with Bayesian networks&#8221;, Cambridge 2009</li>
<li>F. V. Jensen. &#8220;Bayesian Networks and Decision Graphs&#8221;. Springer. 2001.<br />
Probably the best introductory book available.</li>
<li>D. Edwards. &#8220;Introduction to Graphical Modelling&#8221;, 2nd ed. Springer-Verlag. 2000.<br />
Good treatment of <em>undirected</em> graphical models from a statistical perspective.</li>
<li>J. Pearl. &#8220;Causality&#8221;. Cambridge. 2000.<br />
The definitive book on using causal DAG modeling.</li>
<li>R. G. Cowell, A. P. Dawid, S. L. Lauritzen and D. J. Spiegelhalter. &#8220;Probabilistic Networks and Expert Systems&#8221;. Springer-Verlag. 1999.<br />
Probably the best book available, although the treatment is restricted to exact inference.</li>
<li>M. I. Jordan (ed). &#8220;Learning in Graphical Models&#8221;. MIT Press. 1998.<br />
Loose collection of papers on machine learning, many related to graphical models. One of the few books to discuss <em>approximate</em> inference.</li>
<li>B. Frey. &#8220;Graphical models for machine learning and digital communication&#8221;, MIT Press. 1998.<br />
Discusses pattern recognition and turbocodes using (directed) graphical models.</li>
<li>E. Castillo and J. M. Gutierrez and A. S. Hadi. &#8220;Expert systems and probabilistic network models&#8221;. Springer-Verlag, 1997.<br />
A <a href="http://personales.unican.es/gutierjm/BookCGH.html">Spanish version</a> is available online for free.</li>
<li> F. Jensen. &#8220;An introduction to Bayesian Networks&#8221;. UCL Press. 1996. Out of print.<br />
Superseded by his 2001 book.</li>
<li> S. Lauritzen. &#8220;Graphical Models&#8221;, Oxford. 1996.<br />
The definitive mathematical exposition of the theory of graphical models.</li>
<li> S. Russell and P. Norvig. &#8220;Artificial Intelligence: A Modern Approach&#8221;. Prentice Hall. 1995.<br />
Popular undergraduate textbook that includes a readable chapter on directed graphical models.</li>
<li> J. Whittaker. &#8220;Graphical Models in Applied Multivariate Statistics&#8221;, Wiley. 1990.<br />
This is the first book published on graphical modelling from a statistics perspective.</li>
<li> R. Neapolitan. &#8220;Probabilistic Reasoning in Expert Systems&#8221;. John Wiley &amp; Sons. 1990.</li>
<li> J. Pearl. &#8220;Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference.&#8221; Morgan Kaufmann. 1988.<br />
The book that got it all started! <!-- This is the first book published on directed graphical models from an AI/ cognitive science perspective (rather than a statistics perspective). --> A very insightful book, still relevant today.</li>
</ul>
<h2>Review articles</h2>
<ul>
<li> P. Smyth, 1998. <a href="http://ftp.ics.uci.edu/pub/smyth/papers/prl.ps.Z"> &#8220;Belief networks, hidden Markov models, and Markov random fields: a unifying view&#8221;</a>, Pattern Recognition Letters.</li>
<li> E. Charniak, 1991. <a href="http://people.cs.ubc.ca/%7Emurphyk/Bayes/Charniak_91.pdf">&#8220;Bayesian Networks without Tears&#8221;</a>, AI magazine.</li>
<li> Sam Roweis &amp; Zoubin Ghahramani, 1999. <a href="http://www.gatsby.ucl.ac.uk/%7Eroweis/papers/NC110201.pdf"> A Unifying Review of Linear Gaussian Models</a>, Neural Computation 11(2) (1999) pp.305-345</li>
</ul>
<h2>Exact Inference</h2>
<ul>
<li> C. Huang and A. Darwiche, 1996. <a href="http://www.aub.edu.lb/people/darwiche/Papers/ijar95.pdf"> &#8220;Inference in Belief Networks: A procedural guide&#8221;</a>, Intl. J. Approximate Reasoning, 15(3):225-263.</li>
<li> R. McEliece and S. M. Aji, 2000. <a href="http://people.cs.ubc.ca/%7Emurphyk/Bayes/GDL.pdf"> The Generalized Distributive Law</a>, IEEE Trans. Inform. Theory, vol. 46, no. 2 (March 2000), pp. 325&#8211;343.</li>
<li> F. Kschischang, B. Frey and H. Loeliger, 2001. <a href="http://www.cs.toronto.edu/%7Efrey/papers/fgspa.abs.html">Factor graphs and the sum product algorithm</a>, IEEE Transactions on Information Theory, February, 2001.</li>
<li> M. Peot and R. Shachter, 1991. &#8220;Fusion and propagation with multiple observations in belief networks&#8221;, Artificial Intelligence, 48:299-318.</li>
</ul>
<h2>Approximate Inference</h2>
<ul>
<li> M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul, 1997. <a href="http://www.cs.berkeley.edu/%7Ejordan/papers/variational-intro.ps.Z"> &#8220;An introduction to variational methods for graphical models.&#8221;</a></li>
<li> D. MacKay, 1998. <a href="http://www.cs.toronto.edu/%7Emackay/erice.ps.gz"> &#8220;An introduction to Monte Carlo methods&#8221;</a>.</li>
<li> <a name="Jaakkola98"> T. Jaakkola and M. Jordan, 1998. </a><a href="http://www.cs.berkeley.edu/%7Ejordan/papers/varqmr.ps.Z"> &#8220;Variational probabilistic inference and the QMR-DT database&#8221; </a></li>
</ul>
<h2>Learning</h2>
<ul>
<li> W. L. Buntine, 1994. <a href="http://www.ultimode.com/%7Ewray/lwgmJAIR.ps.Z"> &#8220;Operations for Learning with Graphical Models&#8221;</a>, J. AI Research, 159&#8211;225.</li>
<li> D. Heckerman, 1996. <a href="ftp://ftp.research.microsoft.com/pub/tr/TR-95-06.PS"> &#8220;A tutorial on learning with Bayesian networks&#8221;</a>, Microsoft Research tech. report, MSR-TR-95-06.</li>  <!--
<li> P. Krause, 1998. <A HREF="http://www.auai.org/auai-tutes.html/bayesUS_krause.ps.gz" mce_HREF="http://www.auai.org/auai-tutes.html/bayesUS_krause.ps.gz"> &#8220;Learning probabilistic networks&#8221;,</a> Philips Research Labs tech. report.
<li> N. Friedman, 1998. <a href="http://www.cs.huji.ac.il/~nir/Abstracts/Fr2.html" mce_href="http://www.cs.huji.ac.il/~nir/Abstracts/Fr2.html"> &#8220;The Bayesian Structural EM Algorithm&#8221;</a>, UAI. -->
</ul>
<h2>DBNs</h2>
<ul>
<li> L. R. Rabiner, 1989. <a href="http://people.cs.ubc.ca/%7Emurphyk/Bayes/rabiner.pdf">&#8220;A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition&#8221;</a>, Proc. of the IEEE, 77(2):257&#8211;286.</li>
<li> Z. Ghahramani, 1998. <a href="ftp://ftp.cs.toronto.edu/pub/zoubin/vietri.ps.gz"> Learning Dynamic Bayesian Networks </a> In C.L. Giles and M. Gori (eds.), <em> Adaptive Processing of Sequences and Data Structures </em>. Lecture Notes in Artificial Intelligence, 168-197. Berlin: Springer-Verlag.</li>
</ul>
<p><!--adsense--></p>


<p>Related posts:<ol><li><a href='http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/' rel='bookmark' title='Permanent Link: Bayesian Modeling for Language Tutorial Reading'>Bayesian Modeling for Language Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/from-nlpers-getting-started-in-nlp/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started in NLP'>From nlpers:Getting Started in NLP</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-tuning-tree-based-models/' rel='bookmark' title='Permanent Link: Moses Support Digest:tuning tree-based models'>Moses Support Digest:tuning tree-based models</a></li>
<li><a href='http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/' rel='bookmark' title='Permanent Link: A Cool Dictionary for Natural Language Processing'>A Cool Dictionary for Natural Language Processing</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bayesian Modeling for Language Tutorial Reading</title>
		<link>http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/</link>
		<comments>http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/#comments</comments>
		<pubDate>Sat, 14 Nov 2009 15:04:01 +0000</pubDate>
		<dc:creator>52nlp</dc:creator>
				<category><![CDATA[NLP]]></category>
		<category><![CDATA[Bayesian Modeling]]></category>

		<guid isPermaLink="false">http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/</guid>
		<description><![CDATA[This is a reprint of Sharon Goldwater&#8217;s &#8220;Reading list on Bayesian modeling for language&#8221;. People often ask me what they can read to learn more about recent Bayesian modeling techniques and their applications to language learning. Here is a list of &#8230; <a href="http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/">Continue reading <span class="meta-nav">&#8594;</span></a>


Related posts:<ol><li><a href='http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/' rel='bookmark' title='Permanent Link: Graphical Models and Bayesian Networks Tutorial Reading'>Graphical Models and Bayesian Networks Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/' rel='bookmark' title='Permanent Link: A Cool Dictionary for Natural Language Processing'>A Cool Dictionary for Natural Language Processing</a></li>
<li><a href='http://www.52nlp.com/moses-support-digesta-code-monkey-available-will-work-for-peanuts/' rel='bookmark' title='Permanent Link: Moses Support Digest:Code monkey available,Will work for peanuts'>Moses Support Digest:Code monkey available,Will work for peanuts</a></li>
<li><a href='http://www.52nlp.com/from-nlpers-getting-started-in-nlp/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started in NLP'>From nlpers:Getting Started in NLP</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-about-the-moses-chart-reordering/' rel='bookmark' title='Permanent Link: Moses Support Digest:about the moses-chart reordering'>Moses Support Digest:about the moses-chart reordering</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-any-documentation-about-the-multiple-decoding-path-functionality/' rel='bookmark' title='Permanent Link: Moses Support Digest: Any documentation about the Multiple Decoding Path functionality'>Moses Support Digest: Any documentation about the Multiple Decoding Path functionality</a></li>
<li><a href='http://www.52nlp.com/from-nlpersgetting-started-in-summarization/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started In Summarization'>From nlpers:Getting Started In Summarization</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-call-for-papers-pbml/' rel='bookmark' title='Permanent Link: Moses Support Digest: CALL FOR PAPERS &#8211; PBML'>Moses Support Digest: CALL FOR PAPERS &#8211; PBML</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>This is a reprint of Sharon Goldwater&#8217;s &#8220;<a href="http://homepages.inf.ed.ac.uk/sgwater/reading_list.html" target="_blank">Reading list on Bayesian modeling for language</a>&#8221;.<span id="more-44"></span></p>
<p>People often ask me what they can read to learn more about recent Bayesian modeling techniques and their applications to language learning.  Here is a list of the papers I have found to be most useful and relevant to my own research.  I try to emphasize the papers aimed at a slightly less technical/more cognitively inclined audience.  This is not intended to be a complete list, only a starting point.</p>
<hr /><p><strong> General introductory material </strong></p>
<p>Thomas L. Griffiths and Alan Yuille (2006). <a href="http://cocosci.berkeley.edu/tom/papers/tutorial.pdf">A primer on probabilistic inference.</a> Trends in Cognitive Sciences. Supplement to special issue on Probabilistic Models of Cognition (volume 10, issue 7).</p>
<ul>
<li> Reviews many of the basic concepts underlying probabilistic (especially Bayesian) modeling and inference, using simple examples.</li>
</ul>
<p>Sharon Goldwater (2006). <a href="http://homepages.inf.ed.ac.uk/sgwater/papers/thesis_1spc.pdf">Nonparametric Bayesian Models of Lexical Acquisition.</a> Unpublished doctoral dissertation, Brown University, 2006.</p>
<ul>
<li> Aimed primarily at computational linguists, but should (I hope) be accessible to anyone who has a basic familiarity with generative probabilistic models. Chapters 2 and 3 cover many useful topics, including Bayesian integration in finite and infinite models (i.e., Dirichlet distribution, Dirichlet process, Chinese restaurant process) and a brief introduction to sampling techniques (Gibbs sampling and Metropolis-Hastings sampling).</li>
</ul>
<p>Daniel J. Navarro, Thomas L. Griffiths, Mark Steyvers, and Michael D. Lee (2006). <a href="http://cocosci.berkeley.edu/tom/papers/indivdiffs_jmp.pdf">Modeling individual differences using Dirichlet processes.</a> Journal of Mathematical Psychology, 50, 101-122.</p>
<ul>
<li> A very nice introduction to Dirichlet processes aimed at cognitive scientists. Slightly more in-depth, covers the stick-breaking construction for the Dirichlet process (which is not in my thesis) as well as the Chinese restaurant process.</li>
</ul>
<p><strong> Bayesian language models for learning </strong></p>
<p>Sharon Goldwater, Thomas L. Griffiths, and Mark Johnson (2007). <a href="http://homepages.inf.ed.ac.uk/sgwater/papers/bucld07.pdf"> Distributional Cues to Word Segmentation: Context is Important.</a> Proceedings of the 31st Boston University Conference on Language Development.</p>
<p>Sharon Goldwater, Thomas L. Griffiths, and Mark Johnson (2006). <a href="http://homepages.inf.ed.ac.uk/sgwater/papers/acl06.pdf"> Contextual Dependencies in Unsupervised Word Segmentation.</a> Proceedings of Coling/ACL.</p>
<ul>
<li> These two papers apply the Dirichlet process and hierarchical Dirichlet process to word segmentation. The BUCLD paper is more conceptual, the ACL paper is more technical. For a more in-depth treatment, see also Chapter 5 of my thesis (above).</li>
</ul>
<p>Sharon Goldwater and Thomas L. Griffiths (2007). <a href="http://homepages.inf.ed.ac.uk/sgwater/papers/acl07-bhmm.pdf"> A Fully Bayesian Approach to Unsupervised Part-of-Speech Tagging.</a> Proceedings of the Association for Computational Linguistics.</p>
<ul>
<li>This paper provides a direct comparison between Bayesian methods (averaging over parameters and estimation using Gibbs sampling) and standard methods (estimating parameters directly using EM) using the same underlying model (a standard finite HMM).</li>
</ul>
<p>Mark Johnson (2007). <a href="http://acl.ldc.upenn.edu/D/D07/D07-1031.pdf"> Why Doesn&#8217;t EM Find Good HMM POS-Taggers? </a>Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).</p>
<ul>
<li>Includes Variational Bayes as well as Gibbs sampling and EM as estimation procedures. Results are somewhat contradictory to Goldwater and Griffiths, possibly due to the combination of a simpler model and more training data.</li>
</ul>
<p>Percy Liang, Slav Petrov, Michael I. Jordan, Dan Klein (2007). <a href="http://www.cs.berkeley.edu/%7Epliang/papers/hdppcfg-emnlp2007.pdf">The infinite PCFG using hierarchical Dirichlet processes.</a> Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP/CoNLL).</p>
<p>Jenny Rose Finkel, Trond Grenager and Christopher D. Manning (2007).  <a href="http://www.stanford.edu/%7Ejrfinkel/papers/infinite_tree.pdf">The Infinite Tree.</a> Proceedings of the Association for Computational Linguistics.</p>
<p>Mark Johnson, Thomas L. Griffiths, and Sharon Goldwater (2007).  <a href="http://homepages.inf.ed.ac.uk/sgwater/papers/nips07-adaptor.pdf">Adaptor Grammars: a Framework for Specifying Compositional Nonparametric Bayesian Models.</a> Advances in Neural Information Processing Systems 19.</p>
<ul>
<li>These three papers all deal with nonparametric models of syntax (dependency or context-free grammars). They might be a bit tough for those with less background in nonparametrics, although the exposition in Liang et al. is very nice.</li>
</ul>
<p>Thomas L. Griffiths, Mark Steyvers, and Joshua B. Tenenbaum (2007).  <a href="http://cocosci.berkeley.edu/tom/papers/topicsreview.pdf">Topics in semantic representation. </a> Psychological Review, 114, 211-244.</p>
<p>Thomas L. Griffiths, Mark Steyvers, David M. Blei, and Joshua B. Tenenbaum (2005).  <a href="http://cocosci.berkeley.edu/tom/papers/composite.pdf">Integrating topics and syntax.</a> Advances in Neural Information Processing Systems 17.</p>
<p>David Blei, Andrew Ng, and Michael Jordan (2003). <a href="http://www.cs.princeton.edu/%7Eblei/papers/BleiNgJordan2003.pdf">Latent Dirichlet allocation.</a> Journal of Machine Learning Research, 3:993-1022. (A shorter version appeared in NIPS 2002).</p>
<ul>
<li>These three papers are about Latent Dirichlet Allocation (a.k.a. topic models) for learning semantic structure. The Psych Review paper provides a less technical introduction and considers LDA as a cognitive model. The JMLR paper is the original one, suitable if you want more technical details. The NIPS paper is just cool.</li>
</ul>
<p>Fei Xu and Joshua B. Tenenbaum (2007). <a href="http://www.psych.ubc.ca/%7Efei/XuTenenbaum-PsychRev.pdf"> Word learning as Bayesian inference.</a> Psychological Review, 114, 245-272.</p>
<ul>
<li>Develops a Bayesian model to explain how children learn words at different levels of specificity (basic-level categories versus subordinate or superordinate).</li>
</ul>
<p><strong> Bayesian models of language processing </strong></p>
<p>This isn&#8217;t really my area, but here are a couple of interesting papers I know of:</p>
<p>Dennis Norris (2006).  <a href="http://www.mrc-cbu.cam.ac.uk/%7Edennis/BayesianReader.pdf">The Bayesian reader: explaining word recognition as an optimal Bayesian decision process.</a> Psychological Review, 113(2), 327-357.</p>
<p>Naomi Feldman and Thomas L. Griffiths (2007). <a href="http://cocosci.berkeley.edu/tom/papers/perceptualmagnet.pdf"> A rational account of the perceptual magnet effect.</a> Proceedings of the Twenty-Ninth Annual Conference of the Cognitive Science Society.</p>
<p><strong> Inference </strong></p>
<p>A bunch of the papers mentioned above have descriptions of sampling algorithms and/or variational inference procedures for specific models. For more general information on these topics, consider reading some of the following:</p>
<p>Sharon Goldwater (2006). <a href="http://homepages.inf.ed.ac.uk/sgwater/papers/thesis_1spc.pdf">Nonparametric Bayesian Models of Lexical Acquisition.</a> Unpublished doctoral dissertation, Brown University, 2006.</p>
<ul>
<li> As I mentioned above, there is a brief overview of Markov chain Monte Carlo methods (Gibbs sampling and Metropolis-Hastings) in Chapter 2. Examples of Gibbs sampling algorithms are described in chapters 4 and 5.</li>
</ul>
<p>Julian Besag (2000). <a href="http://citeseer.ist.psu.edu/cache/papers/cs/16898/http:zSzzSzwww.csss.washington.eduzSzPaperszSzwp9.pdf/besag00markov.pdf">Markov chain Monte Carlo for statistical inference.</a> Working paper no. 9.  University of Washington Center for Statistics and the Social Sciences.</p>
<ul>
<li> A longer and more technical introduction to Markov chain Monte Carlo methods.</li>
</ul>
<p>Mark Johnson, Thomas L. Griffiths, and Sharon Goldwater (2007). <a href="http://homepages.inf.ed.ac.uk/sgwater/papers/naacl07-mcmc-pcfg.pdf"> Bayesian Inference for PCFGs via Markov Chain Monte Carlo.</a> Proceedings of the North American Association for Computational Linguistics.</p>
<ul>
<li> How to do efficient sampling for PCFGs.</li>
</ul>
<p>Matthew Beal (2003). <a href="http://www.cse.buffalo.edu/faculty/mbeal/papers/beal03.pdf">Variational Algorithms for Approximate Bayesian Inference.</a> PhD thesis, Gatsby Computational Neuroscience Unit, University College London. (Or download individual chapters from <a href="http://www.cse.buffalo.edu/faculty/mbeal/thesis/"> here.</a>)</p>
<ul>
<li> I don&#8217;t know much about variational methods myself, but I&#8217;ve been told this is a good place to start.</li>
</ul>
<p><strong> Further Reading </strong></p>
<p>Yee Whye Teh, Michael Jordan, Matthew Beal, and David Blei (2006). <a href="http://www.cs.princeton.edu/%7Eblei/papers/TehJordanBealBlei2006.pdf"> Hierarchical Dirichlet processes. </a> Journal of the American Statistical Association, 2006. 101(476):1566-1581.</p>
<ul>
<li>The original HDP paper. Comprehensive, but I would suggest getting familiar with the ideas using some of the resources above before reading this one.</li>
</ul>
<p>Radford Neal (1993). <a href="http://omega.albany.edu:8008/neal.pdf">Probabilistic Inference Using Markov Chain Monte Carlo Methods.</a> Technical report CRG-TR-93-1. University of Toronto Department of Computer Science.</p>
<ul>
<li> Even more information about Markov chain Monte Carlo methods.</li>
</ul>
<p><!--adsense--></p>


<p>Related posts:<ol><li><a href='http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/' rel='bookmark' title='Permanent Link: Graphical Models and Bayesian Networks Tutorial Reading'>Graphical Models and Bayesian Networks Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/statistical-machine-translation-tutorial-reading/' rel='bookmark' title='Permanent Link: Statistical Machine Translation Tutorial Reading'>Statistical Machine Translation Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/a-cool-dictionary-for-natural-language-processing/' rel='bookmark' title='Permanent Link: A Cool Dictionary for Natural Language Processing'>A Cool Dictionary for Natural Language Processing</a></li>
<li><a href='http://www.52nlp.com/moses-support-digesta-code-monkey-available-will-work-for-peanuts/' rel='bookmark' title='Permanent Link: Moses Support Digest:Code monkey available,Will work for peanuts'>Moses Support Digest:Code monkey available,Will work for peanuts</a></li>
<li><a href='http://www.52nlp.com/from-nlpers-getting-started-in-nlp/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started in NLP'>From nlpers:Getting Started in NLP</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-about-the-moses-chart-reordering/' rel='bookmark' title='Permanent Link: Moses Support Digest:about the moses-chart reordering'>Moses Support Digest:about the moses-chart reordering</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-any-documentation-about-the-multiple-decoding-path-functionality/' rel='bookmark' title='Permanent Link: Moses Support Digest: Any documentation about the Multiple Decoding Path functionality'>Moses Support Digest: Any documentation about the Multiple Decoding Path functionality</a></li>
<li><a href='http://www.52nlp.com/from-nlpersgetting-started-in-summarization/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started In Summarization'>From nlpers:Getting Started In Summarization</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-call-for-papers-pbml/' rel='bookmark' title='Permanent Link: Moses Support Digest: CALL FOR PAPERS &#8211; PBML'>Moses Support Digest: CALL FOR PAPERS &#8211; PBML</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Statistical Machine Translation Tutorial Reading</title>
		<link>http://www.52nlp.com/statistical-machine-translation-tutorial-reading/</link>
		<comments>http://www.52nlp.com/statistical-machine-translation-tutorial-reading/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 13:08:10 +0000</pubDate>
		<dc:creator>52nlp</dc:creator>
				<category><![CDATA[NLP]]></category>
		<category><![CDATA[Statistical Machine Translation]]></category>

		<guid isPermaLink="false">http://www.52nlp.com/?p=29</guid>
		<description><![CDATA[　　I found this on Dr. David Kauchak&#8217;s webpage and think it is worth keeping here, so I am reprinting it. 　　The following is a list of papers that I think are worth reading for our discussion of machine translation. I&#8217;ve &#8230; <a href="http://www.52nlp.com/statistical-machine-translation-tutorial-reading/">Continue reading <span class="meta-nav">&#8594;</span></a>


Related posts:<ol><li><a href='http://www.52nlp.com/maximum-entropy-model-tutorial-reading/' rel='bookmark' title='Permanent Link: Maximum Entropy Model Tutorial Reading'>Maximum Entropy Model Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/graphical-models-and-bayesian-networks-tutorial-reading/' rel='bookmark' title='Permanent Link: Graphical Models and Bayesian Networks Tutorial Reading'>Graphical Models and Bayesian Networks Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/bayesian-modeling-for-language-tutorial-reading/' rel='bookmark' title='Permanent Link: Bayesian Modeling for Language Tutorial Reading'>Bayesian Modeling for Language Tutorial Reading</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-acl-wmt-2010-machine-translation-shared-task/' rel='bookmark' title='Permanent Link: Moses Support Digest:Call for Participation ACL WMT 2010 Machine Translation Shared Task'>Moses Support Digest:Call for Participation ACL WMT 2010 Machine Translation Shared Task</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-hierarchical-and-syntax-based-decoding-in-moses/' rel='bookmark' title='Permanent Link: Moses Support Digest: Hierarchical and syntax-based decoding in Moses'>Moses Support Digest: Hierarchical and syntax-based decoding in Moses</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-is-reordering-model-a-must-be-used-component-to-use/' rel='bookmark' title='Permanent Link: Moses Support Digest:Is reordering model a must-be-used component to use?'>Moses Support Digest:Is reordering model a must-be-used component to use?</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-copyright-application-issue-for-a-snippet-translation-system-using-moses/' rel='bookmark' title='Permanent Link: Moses Support Digest: Copyright application issue for a Snippet Translation System using MOSES'>Moses Support Digest: Copyright application issue for a Snippet Translation System using MOSES</a></li>
<li><a href='http://www.52nlp.com/from-nlpersgetting-started-in-summarization/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started In Summarization'>From nlpers:Getting Started In Summarization</a></li>
<li><a href='http://www.52nlp.com/from-nlpers-getting-started-in-nlp/' rel='bookmark' title='Permanent Link: From nlpers:Getting Started in NLP'>From nlpers:Getting Started in NLP</a></li>
<li><a href='http://www.52nlp.com/moses-support-digest-call-for-papers-pbml/' rel='bookmark' title='Permanent Link: Moses Support Digest: CALL FOR PAPERS &#8211; PBML'>Moses Support Digest: CALL FOR PAPERS &#8211; PBML</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>　　I found this on Dr. David Kauchak&#8217;s <a href="http://cseweb.ucsd.edu/~dkauchak/mt-tutorial/" target="_blank">webpage</a> and think it is worth keeping here, so I am reprinting it.</p>
<p>　　The following is a list of papers that I think are worth reading for our discussion of machine translation.  I&#8217;ve tried to give a short blurb about each of the papers to put them in context.  I&#8217;ve included a number of papers that I marked &#8220;OPTIONAL&#8221; that I think are interesting, but are either supplementary or the material is more or less covered in the other papers.<span id="more-29"></span><br />
　　If anyone would like more information on a particular topic or would like to discuss any of these papers, feel free to e-mail me　<strong>dkauchak<img src="http://cseweb.ucsd.edu/%7Edkauchak/at.gif" border="0" alt="" />cs ucsd edu</strong></p>
<p><strong><em>Part 1 (Jan. 19)</em></strong><br />
<a href="http://www.isi.edu/natural-language/mt/wkbk.rtf">A Statistical MT Tutorial Workbook</a>.  Kevin Knight.  1999.<br />
　　Very good introduction to word-based statistical machine translation. Written in an informal, understandable, tutorial-oriented style.<br />
　　Kevin Knight&#8217;s own account of how this workbook came about is interesting:<br />
　　“At the time, I was trying to align English sound sequences with Japanese sound sequences, and I knew that EM could do it for me. It was hard, though. I spent two years reading Brown et al 93. When I finally got it to work, I was pretty fired up, and I told David Yarowsky. He said, ‘EM is the answer to all the world’s problems.’ Wow! I figured everybody should know about it, so I wrote ‘A Statistical MT Tutorial Workbook’.”</p>
<p><a href="http://www.isi.edu/natural-language/mt/aimag97.ps">Automating Knowledge Acquisition for Machine Translation</a>. Kevin Knight. 1997.<br />
　　(OPTIONAL) Another tutorial oriented paper that steps through how one can learn from bilingual data.  Also introduces a number of important concepts for MT.</p>
<p><a href="http://cognet.mit.edu/library/books/view?isbn=0262133601">Foundations of Statistical NLP</a>, chapter 13.  Manning and Schutze. 1999.<br />
　　(OPTIONAL) Must be accessed from UCSD.  Overview of statistical MT. Spends a lot of time on sentence and word alignment of bilingual data.</p>
<p><a href="http://cognet.mit.edu/library/books/view?isbn=0262133601">Foundations of Statistical NLP</a>, chapter 6.  Manning and Schutze. 1999.<br />
　　(OPTIONAL) Must be accessed from UCSD. Discusses n-gram language modeling.  Language modeling is crucial for SMT and many other natural language applications.  I won&#8217;t spend much time discussing language modeling, but for those that are interested this is a good introduction.</p>
<p><strong><em>Part 2 (Jan. 26)</em></strong><br />
Word models:<br />
<a href="http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf"> The Mathematics of Statistical Machine Translation: Parameter Estimation</a>.  P. F. Brown, S. A. Della Pietra, V. J. Della Pietra and R.L. Mercer. 1993.<br />
　　(OPTIONAL)  All you ever wanted to know about word level models.  Describes IBM models 1-5 and parameter estimation for these models.  It&#8217;s about 50 pages and contains a lot of material for the interested reader.</p>
<p>Word model decoding:<br />
<a href="http://acl.ldc.upenn.edu/P/P97/P97-1047.pdf"> Decoding Algorithm in Statistical Machine Translation</a>. Ye-Yi Wang and Alex Waibel.  1997.<br />
　　Early paper discussing decoding of IBM model 2.  The paper provides a fairly good introduction to word-level decoding including multi-stack search (i.e. multiple beams) and rest cost estimation (heuristic functions).</p>
<p><a href="http://www-i6.informatik.rwth-aachen.de/Colleagues/och/DDMT01.ps">An Efficient A* Search Algorithm for Statistical Machine Translation</a>. Franz Josef Och, Nicola Ueffing, Hermann Ney. 2001.<br />
　　(OPTIONAL) One of many papers on decoding with word-based SMT.  They discuss the basic idea of viewing decoding as state space search and provide one method for doing this.  They describe decoding for Model 3 and suggest a few different heuristics that are admissible, leading to few search errors.</p>
<p>Phrase-based statistical MT:<br />
　　<a href="http://people.csail.mit.edu/people/koehn/publications/phrase2003.pdf"> Statistical Phrase-Based Translation</a>. Philipp Koehn, Franz Josef Och and Daniel Marcu. 2003.<br />
　　Good, short overview of phrase-based systems.  If you want more details, see the paper below.</p>
<p><a href="http://acl.ldc.upenn.edu/J/J04/J04-4002.pdf"> The Alignment Template Approach to Statistical Machine Translation</a>. Franz Josef Och and Hermann Ney. 2004.<br />
　　(OPTIONAL) This is a journal paper discussing one phrase-based statistical system, including decoding. This is more or less the system used at ISI and is probably the best current system (though syntax-based systems may beat these in the next few years).  Requires Acrobat 5 and access from UCSD.</p>
<p><strong><em>Part 3 (Feb. 2)</em></strong><br />
Phrase-based decoding:<br />
　　See the previous paper.</p>
<p>Syntax based translation:<br />
<a href="http://www.isi.edu/natural-language/projects/rewrite/whatsin.pdf"> What&#8217;s in a Translation Rule?</a> Galley, Hopkins, Knight and Marcu. 2004.<br />
　　This is the current system being investigated at ISI, and the hope is that these syntax-based systems will perform better than phrase-based systems. The paper is a bit tough to read since it&#8217;s a conference paper.</p>
<p><a href="http://www.isi.edu/natural-language/projects/rewrite/syntax.ps"> A Syntax-Based Statistical Translation Model</a>.  Yamada and Knight. 2001.<br />
　　(OPTIONAL) Predecessor model to Galley et al., but similar.</p>
<p>Syntax based decoding:<br />
<a href="http://cognet.mit.edu/library/books/view?isbn=0262133601"> Foundations of Statistical NLP, chapter 12. Manning and Schutze. 1999.</a><br />
　　Must be on campus.  This is a chapter on parsing (not actually decoding). However, since the above rules are very similar to PCFGs, decoding is very similar to parsing&#8230; just with more complications.</p>
<p><a href="http://www.isi.edu/natural-language/projects/rewrite/syndec.ps"> A Decoder for Syntax-Based Statistical MT</a>.  Kenji Yamada and Kevin Knight. 2001.<br />
　　(OPTIONAL) Decoder for the above Yamada and Knight model.</p>
<p><strong><em>Part 4 (Feb. 9)</em></strong><br />
Discriminative Training:<br />
<a href="http://www-i6.informatik.rwth-aachen.de/Colleagues/och/ACL02.ps">Discriminative Training and Maximum Entropy Models for Statistical Machine Translation</a>. Och and Ney. 2002.<br />
　　Learns how best to combine the different models (translation model, language model, etc.) using maximum entropy parameter estimation. This line of research is still very important and may be interesting to many of you since it&#8217;s very machine learningy.</p>
<p><a href="http://www.sfu.ca/%7Eanoop/papers/pdf/drmt.pdf">Discriminative Reranking for Machine Translation</a>. Shen, Sarkar and Och. 2004.<br />
　　(OPTIONAL) Given a ranked list of possible translations from the translation system, this paper uses the perceptron algorithm to learn a reranking of the sentences that improves the top translation.</p>
<p>MT Evaluation:<br />
<a href="http://www1.cs.columbia.edu/nlp/sgd/bleu.pdf"> BLEU: A Method for Automatic Evaluation of Machine Translation</a>. Papineni, Roukos, Ward and Zhu. 2001.<br />
　　Foundational method for automatically evaluating MT output, and still in use today.</p>
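<p>As a rough illustration of the idea behind BLEU, the sketch below computes clipped (modified) n-gram precisions and a brevity penalty for a single sentence. It is a simplified, unsmoothed sentence-level variant written for this reading list, not the reference implementation; the paper defines the score over a whole test corpus. The names <code>ngrams</code> and <code>sentence_bleu</code> are hypothetical helpers, and whitespace tokenization is assumed.</p>

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights (a sketch)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        # Without smoothing, any zero precision makes the whole score zero.
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty uses the reference length closest to the candidate length
    # (ties are broken in favour of the shorter reference).
    c = len(candidate)
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

<p>For example, calling <code>sentence_bleu("the the the the the the the".split(), ["the cat is on the mat".split(), "there is a cat on the mat".split()], max_n=1)</code> clips the seven occurrences of &#8220;the&#8221; to the two found in the first reference, which is the kind of over-generation the clipping step is meant to penalize.</p>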


]]></content:encoded>
			<wfw:commentRss>http://www.52nlp.com/statistical-machine-translation-tutorial-reading/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Maximum Entropy Model Tutorial Reading</title>
		<link>http://www.52nlp.com/maximum-entropy-model-tutorial-reading/</link>
		<comments>http://www.52nlp.com/maximum-entropy-model-tutorial-reading/#comments</comments>
		<pubDate>Wed, 04 Nov 2009 14:47:03 +0000</pubDate>
		<dc:creator>52nlp</dc:creator>
				<category><![CDATA[NLP]]></category>
		<category><![CDATA[Maximum Entropy Model]]></category>

		<guid isPermaLink="false">http://www.52nlp.com/maximum-entropy-model-tutorial-reading/</guid>
		<description><![CDATA[　　This post is reprinted from Dr Zhang&#8217;s Maximum Entropy Modeling Toolkit manual. This section lists some recommended papers for your further reference. 1. A Maximum Entropy Approach to Natural Language Processing [Berger et al., 1996] 　　A must-read paper on applying &#8230; <a href="http://www.52nlp.com/maximum-entropy-model-tutorial-reading/">Continue reading <span class="meta-nav">&#8594;</span></a>


]]></description>
			<content:encoded><![CDATA[<p>　　This post is reprinted from Dr Zhang&#8217;s <a href="http://homepages.inf.ed.ac.uk/lzhang10/maxent_toolkit.html" target="_blank">Maximum Entropy Modeling Toolkit</a> manual. This section lists some recommended papers for your further reference.<span id="more-20"></span></p>
<p><strong>1. A Maximum Entropy Approach to Natural Language Processing [Berger et al., 1996]</strong><br />
　　A must-read paper on applying the maxent technique to Natural Language Processing. This paper describes maxent in detail and presents an Incremental Feature Selection algorithm for incrementally constructing a maxent model, as well as several examples in statistical machine translation.</p>
<p><strong>2. Inducing Features of Random Fields [Della Pietra et al., 1997]</strong><br />
　　Another must-read paper on maxent. It deals with a more general framework, Random Fields, and proposes an Improved Iterative Scaling algorithm for estimating the parameters of Random Fields. This paper gives the theoretical background to Random Fields (and hence to maxent models). A greedy Field Induction method is presented to automatically construct a detailed random field from a set of atomic features. A word morphology application for English is developed.</p>
<p><strong>3. Adaptive Statistical Language Modeling: A Maximum Entropy Approach [Rosenfeld, 1996]</strong><br />
　　This paper applied the ME technique to the statistical language modeling task. More specifically, it built a conditional Maximum Entropy model that incorporated traditional N-gram, distant N-gram and trigger pair features. A significant perplexity reduction over the baseline trigram model was reported. Later, Rosenfeld and his group proposed a Whole Sentence Exponential Model that overcomes the computational bottleneck of the conditional ME model.</p>
<p><strong>4. Maximum Entropy Models For Natural Language Ambiguity Resolution [Ratnaparkhi, 1998]</strong><br />
　　This dissertation discusses in detail the application of maxent models to various natural language disambiguation tasks. Several problems were attacked within the ME framework: sentence boundary detection, part-of-speech tagging, shallow parsing and text categorization. Comparisons with other machine learning techniques (Naive Bayes, Transformation-Based Learning, Decision Trees, etc.) are given.</p>
<p><strong>5. The Improved Iterative Scaling Algorithm: A Gentle Introduction [Berger, 1997]</strong><br />
　　This paper describes the IIS algorithm in detail. The description is easier to understand than [Della Pietra et al., 1997], which involves more mathematical notation.</p>
<p><strong>6. Stochastic Attribute-Value Grammars [Abney, 1997]</strong><br />
　　Abney applied the Improved Iterative Scaling algorithm to parameter estimation for attribute-value grammars, whose parameters cannot be correctly estimated by the ERF (empirical relative frequency) method (though it works for PCFGs). Random Fields are the model of choice here, with general Metropolis-Hastings sampling used to calculate feature expectations under the newly constructed model.</p>
<p><strong>7. A comparison of algorithms for maximum entropy parameter estimation [Malouf, 2003]</strong><br />
　　Four iterative parameter estimation algorithms were compared on several NLP tasks. L-BFGS was observed to be the most effective parameter estimation method for Maximum Entropy models, much better than IIS and GIS. [Wallach, 2002] reported similar results on parameter estimation for Conditional Random Fields.</p>


]]></content:encoded>
			<wfw:commentRss>http://www.52nlp.com/maximum-entropy-model-tutorial-reading/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

