| # | Date | Topics | Assignment |
|---|---|---|---|
| 1 | April 10 | Introduction to this lecture. Tagging with HMMs. slides, slides | - install Python on your laptop. - learn the basics of Python if you are a novice. - read the note on HMMs by Michael Collins. - implement your own HMM-based POS tagger (a Viterbi sketch appears after the table). |
| 2 | April 17 | Text classification with naive Bayes classifiers. slides | - read "A Comparison of Event Models for Naive Bayes Text Classification" by McCallum and Nigam. - install MeCab, and try to find sentences that MeCab cannot analyze correctly. |
| 3 | April 24 | The method of Lagrange multipliers. Maximum likelihood estimation. Maximum a posteriori estimation. slides | - read Sections 1 and 2 of the tutorial on Lagrange multipliers by Dan Klein, and try to give an intuitive explanation of the method when the solution space is 3-dimensional. - implement a naive Bayes classifier (a sketch appears after the table); train it on this file, and test it on this file. Each line of these files consists of a class label (+1 or -1) and a segmented sentence. |
| 4 | May 1 | Maximum likelihood estimation. Maximum a posteriori estimation. Bag-of-words representation of documents. SVM. slides | - derive the MAP estimate of the multinomial model of naive Bayes classifiers (a worked derivation appears after the table). - derive the dual of the soft-margin SVM optimization problem (the primal and dual are stated after the table so you can check your result). - use an SVM tool (e.g., TinySVM) to train a model on this file and test it on this file; you need to write a script that converts these files into the tool's input format. - read Section 2.1 of this tutorial. |
| -- | May 8 | NO LECTURE | |
| 5 | May 15 | Named-entity extraction. Dependency analysis. slides | - read Sections 3 and 5.1 of the CaboCha paper, "Japanese Dependency Analysis using Cascaded Chunking" (CoNLL 2002), and answer the following questions: which static features are used? which dynamic features are used? are dynamic features effective, and if so, in what situations? which kernel function is used? what benefit does the use of that kernel function have? (The last question is not answered in the paper; think for yourself.) |
| 6 | May 22 | Log-linear models. Conditional random fields (CRFs). slides | - read Sections 1, 2, and 3 of the tutorial on CRFs. - read Section 6.3 of the book (in Japanese) to review CRFs, and try to understand the forward-backward algorithm (a sketch appears after the table). |
| 7 | May 29 | Forward-backward algorithm. Text summarization. slides | - read Yih et al. (2007) and learn how the weights on words are calculated in their work. |
| 8 | June 5 | Text summarization. slides | take a rest |
| 9 | June 12 | k-means clustering. EM. PLSI. slides | - derive the update equations for the product model. - answer the following questions with reference to Hofmann's paper: how is "document" integrated into the model? what is tempered EM, and what is the PLSI update equation when tempered EM is used? what is folding-in, and what kind of calculation does it require? - implement PLSI (a sketch of the EM step appears after the table), train it on this file, and calculate the perplexity of this file. |
| 10 | June 19 | LDA. slides | - implement Gibbs sampling for LDA (a sketch of one sampling sweep appears after the table), and train it on this file. Each line of the file corresponds to a document, represented as the set of nouns, verbs, adverbs, and adjectives that appear in it. |
| -- | June 26 | NO LECTURE | |
| 11 | July 3 | Check LDA code. slides | No assignment, but see the slides for details on the report submission (GRADING 1). |
| 12 | July 10 | Derivation of the update equations for LDA's Gibbs sampling. Sentiment analysis. slides, survey by Kaji-san | - watch this video (10 minutes). - GRADING 2: read the submission, write the review form, and send it to me by July 23rd. |
| 13 | July 17 | Linguistic resources. Conference presentations. slides | |
| 14 | July 24 | NO LECTURE | |
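
For the week-1 tagger assignment, here is a minimal sketch of Viterbi decoding for a bigram HMM. It assumes the transition and emission probabilities (`trans`, `emit`) have already been estimated from tagged data; the dictionary layout, the `"<s>"` start symbol, and the `1e-12` floor for unseen events are illustrative choices of mine, not part of the assignment.

```python
import math

def viterbi(words, tags, trans, emit):
    """words: list of tokens; tags: list of tag names;
    trans[(prev, cur)] and emit[(tag, word)] are probabilities."""
    # best[i][t]: log-probability of the best tag sequence for
    # words[0..i] ending in tag t; back[i][t]: the previous tag.
    best = [{t: -math.inf for t in tags} for _ in words]
    back = [{t: None for t in tags} for _ in words]
    for t in tags:  # initialization from the start symbol "<s>"
        best[0][t] = (math.log(trans.get(("<s>", t), 1e-12))
                      + math.log(emit.get((t, words[0]), 1e-12)))
    for i in range(1, len(words)):
        for t in tags:
            for p in tags:
                score = (best[i - 1][p]
                         + math.log(trans.get((p, t), 1e-12))
                         + math.log(emit.get((t, words[i]), 1e-12)))
                if score > best[i][t]:
                    best[i][t], back[i][t] = score, p
    tag = max(tags, key=lambda t: best[-1][t])  # best final tag
    path = [tag]
    for i in range(len(words) - 1, 0, -1):  # follow the back-pointers
        tag = back[i][tag]
        path.append(tag)
    return list(reversed(path))
```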
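For the week-3 classifier assignment, a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, assuming the line format described above (a +1/-1 label followed by a segmented sentence). The function names are mine, and the choice of the multinomial event model is one of the options compared by McCallum and Nigam, not prescribed by the course.

```python
import math
from collections import Counter, defaultdict

def train(lines):
    """lines look like '+1 w1 w2 w3 ...'."""
    doc_count = Counter()              # documents per class
    word_count = defaultdict(Counter)  # word frequencies per class
    for line in lines:
        label, *words = line.split()
        doc_count[label] += 1
        word_count[label].update(words)
    vocab = {w for c in word_count.values() for w in c}
    return doc_count, word_count, vocab

def classify(words, doc_count, word_count, vocab):
    total_docs = sum(doc_count.values())
    best, best_score = None, -math.inf
    for label in doc_count:
        total = sum(word_count[label].values())
        score = math.log(doc_count[label] / total_docs)  # log prior
        for w in words:
            # add-one (Laplace) smoothing over the vocabulary
            score += math.log((word_count[label][w] + 1)
                              / (total + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best
```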
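For the week-4 MAP assignment, one standard derivation, assuming a symmetric Dirichlet(alpha) prior on the class-conditional multinomial; the Lagrange-multiplier step mirrors the week-3 lecture.

```latex
% MAP estimate of \theta_{c,w} = P(w \mid c) under a symmetric
% Dirichlet(\alpha) prior; n_{c,w} = count of word w in class c.
% Maximize the log posterior subject to \sum_w \theta_{c,w} = 1:
L(\theta_c, \lambda)
  = \sum_{w} (n_{c,w} + \alpha - 1)\log\theta_{c,w}
  + \lambda\Bigl(1 - \sum_{w}\theta_{c,w}\Bigr)
% \partial L / \partial \theta_{c,w} = 0 gives
% \theta_{c,w} = (n_{c,w} + \alpha - 1)/\lambda, and the constraint
% fixes \lambda, so
\hat{\theta}_{c,w}
  = \frac{n_{c,w} + \alpha - 1}{\sum_{w'}\bigl(n_{c,w'} + \alpha - 1\bigr)}
```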
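For the week-4 dual-derivation assignment, the standard soft-margin primal and its dual are stated below so you can check your result; the derivation itself (forming the Lagrangian and eliminating w, b, and the slacks) is left to you.

```latex
% Primal: x_i are feature vectors, y_i \in \{+1, -1\},
% C is the slack penalty.
\min_{w,\,b,\,\xi}\; \tfrac{1}{2}\|w\|^2 + C\sum_i \xi_i
\quad\text{s.t.}\quad y_i(w \cdot x_i + b) \ge 1 - \xi_i,\;\; \xi_i \ge 0
% Dual:
\max_{\alpha}\; \sum_i \alpha_i
  - \tfrac{1}{2}\sum_{i,j}\alpha_i\alpha_j y_i y_j\,(x_i \cdot x_j)
\quad\text{s.t.}\quad \sum_i \alpha_i y_i = 0,\;\; 0 \le \alpha_i \le C
```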
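For the weeks 6-7 material, a minimal sketch of the forward-backward algorithm on a linear chain. It assumes `score[i][p][t]` is a nonnegative local potential for moving from tag `p` at position `i-1` to tag `t` at position `i` (for an HMM, `trans[p][t] * emit[t][word_i]`), with the start symbol `"<s>"` as the previous tag at position 0; for long sequences you would work in log space or rescale, which this sketch omits.

```python
def forward_backward(n, tags, score):
    # alpha[i][t]: total weight of all paths reaching tag t at position i
    alpha = [{t: 0.0 for t in tags} for _ in range(n)]
    for t in tags:
        alpha[0][t] = score[0]["<s>"][t]
    for i in range(1, n):
        for t in tags:
            alpha[i][t] = sum(alpha[i-1][p] * score[i][p][t] for p in tags)
    # beta[i][t]: total weight of all paths continuing from tag t at i
    beta = [{t: 0.0 for t in tags} for _ in range(n)]
    for t in tags:
        beta[n-1][t] = 1.0
    for i in range(n - 2, -1, -1):
        for t in tags:
            beta[i][t] = sum(score[i+1][t][q] * beta[i+1][q] for q in tags)
    z = sum(alpha[n-1][t] for t in tags)  # partition function
    # marginal probability of tag t at position i
    marginal = [{t: alpha[i][t] * beta[i][t] / z for t in tags}
                for i in range(n)]
    return marginal, z
```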
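For the week-9 PLSI assignment, a minimal sketch of one EM iteration for the model P(d, w) = sum_z P(z) P(d|z) P(w|z). The data structures are my own choices: `n[(d, w)]` holds word counts, and `pz`, `pd_z`, `pw_z` hold the current parameters; the zero-probability corner cases are ignored.

```python
from collections import defaultdict

def em_step(n, pz, pd_z, pw_z, topics):
    new_pz = defaultdict(float)
    new_pd = defaultdict(lambda: defaultdict(float))
    new_pw = defaultdict(lambda: defaultdict(float))
    for (d, w), count in n.items():
        # E-step: posterior P(z | d, w), up to normalization
        post = {z: pz[z] * pd_z[z][d] * pw_z[z][w] for z in topics}
        norm = sum(post.values())
        for z in topics:
            r = count * post[z] / norm  # expected count for topic z
            new_pz[z] += r              # M-step: accumulate
            new_pd[z][d] += r
            new_pw[z][w] += r
    # M-step: renormalize the accumulated expected counts
    total = sum(new_pz.values())
    pz = {z: new_pz[z] / total for z in topics}
    pd_z = {z: {d: v / new_pz[z] for d, v in new_pd[z].items()}
            for z in topics}
    pw_z = {z: {w: v / new_pz[z] for w, v in new_pw[z].items()}
            for z in topics}
    return pz, pd_z, pw_z
```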
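For the week-10 LDA assignment, a minimal sketch of one sweep of collapsed Gibbs sampling. It assumes the topic assignments `z` and the count tables `ndk` (document-topic), `nkw` (topic-word), and `nk` (topic totals) are kept consistent with each other; `alpha` and `beta` are the Dirichlet hyperparameters, `K` the number of topics, `V` the vocabulary size. After burn-in, the topic distributions are estimated from the counts.

```python
import random

def gibbs_sweep(docs, z, ndk, nkw, nk, K, V, alpha, beta):
    """docs: list of documents, each a list of word ids;
    z[d][i]: current topic of token i in document d."""
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # remove the token's current assignment from the counts
            ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
            # full conditional P(z = k | rest), up to normalization
            weights = [(ndk[d][k2] + alpha)
                       * (nkw[k2][w] + beta) / (nk[k2] + V * beta)
                       for k2 in range(K)]
            k = random.choices(range(K), weights=weights)[0]
            # add the token back under its newly sampled topic
            z[d][i] = k
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
```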