Term Project Guide

by: Abhishek Notetaker


This 4 page Study Guide was uploaded by Abhishek Notetaker on Sunday March 6, 2016. The Study Guide belongs to CSCI GA-2590 at NYU School of Medicine taught by Dr. Ralph Grishman in Fall 2016. Since its upload, it has received 26 views. For similar materials see Natural Language Processing in ComputerScienence at NYU School of Medicine.

Date Created: 03/06/16
Term Project

You must submit a term project on material connected to the course; this is worth 30% of your grade. You have wide latitude in what you do for the project. It may be a project based on Jet, a separate programming project, or a research paper. A Jet or programming project must be accompanied by a separate, well-written description of the project; an analysis of the data and your system's performance will be an important part of the grade. Joint projects are encouraged.

The general idea is to do something interesting which will require you to confront some of the 'real problems' of doing NLP. Real NLP is hard ... don't be too ambitious, or at least have a fall-back plan if your ambitions are not realized.

Possible projects

1. KBP IE. In this course, we present the use of Jet for a specific extraction task -- 'executive succession' -- using the pattern matching tools in Jet. One possible project involves adapting the system to other event or relation types. For the past several years we have participated in a multi-site evaluation organized by NIST, the English Slot Filling task of Knowledge Base Population. This involves extracting information about people and organizations comparable to what appears in Wikipedia infoboxes; for example, for a person, their date of birth, place of birth, date of death (if no longer alive), spouse, employer, organizational memberships, title, etc.; for an organization, its date of founding, headquarters location, number of employees, top employees, website, etc. We would expect you to select a few person or organization properties.

2. IE. As an alternative to KBP on general news, you could do a richer analysis of some narrow sublanguage within the news, such as weather forecasts, death notices, cooking recipes, sports results, etc. If you are considering one of these, you should begin by marking up a few documents by hand to see what is feasible.
Then you should
◦ gather a sample (perhaps 50) of relevant documents, probably by a Web search
◦ mark these up by hand for relevant events
◦ based on this mark-up, create some patterns
◦ test your patterns on this set of documents
◦ when you are satisfied, gather a new sample of relevant documents (maybe 20)
◦ mark these up by hand for relevant events
◦ score your system for recall and precision

3. You can do this extraction either with Jet or with your own program (e.g., with Python).

4. Wikification. For all names of certain types, link the name to the Wikipedia page describing that entity. A major challenge here is that many common names are ambiguous (there are multiple pages with that label), so you must build a classifier to select the appropriate page. Prof. Satoshi Sekine can provide some guidance and data for those interested in Wikification; see his project description.

5. Foreign language analysis: building a word segmenter, a POS tagger or even a name recognizer or chunker for another language. Dr. Yifan He can provide some guidance to those interested in processing Chinese.

6. Extract time expressions (including actual dates, "Friday", "last month", "two weeks ago") and normalize them (figure out a date or date range, given the date of an article).

7. Extension of Jet syntactic patterns to more constructs (e.g., a rich variety of modifiers for noun and verb groups). This should include some performance analysis.

8. Building your own HMM or MEMM and training it to identify names, noun groups, or time expressions. Note: if someone is familiar with genomics, we would be interested in a name tagger for such texts.

9. Implementing a bootstrap learner for relations (like Brin or Agichtein). Alternatively, conducting experiments with NYU's own bootstrapped relation learner, ICE, for a new domain. (We can accommodate at most a few students doing such experiments.)

10. Doing some experiment with word embeddings.
11. Research report on some topic not covered or minimally covered in the course (e.g., morphological analysis of morphologically rich languages; question answering, summarization, machine translation methods). The paper should show some understanding of what problems have and have not been addressed by current technology.

What were the best recent projects?

◦ extraction systems for {criminal verdicts, lay-offs, weather reports, sports game summaries, wedding announcements, death notices, recipes, music releases}, including evaluation (using Jet and using Python)
◦ extracting family relationships from the Bible
◦ noun / verb group patterns, with evaluation on larger corpus (using Jet)
◦ question-answering (NL data base interface) system for {stock quotes, train schedules}
◦ name recognizer / linker for Web pages (using HMM)
◦ adaptive name recognizer
◦ fine-grained sentiment analyzer using dependency relations
◦ sentiment analysis from tweets
◦ discovering relations by bootstrapping or distant supervision
◦ Chinese word segmentation

Due dates

Brief project description: March 28th, 11:55 PM; 1-2 paragraphs, submitted through NYU Classes. For group projects, one student should submit a full description; other students should submit a note "Group project with X".

Project: May 8th (Friday following last class, at 11:55 PM). 1 point (3%) penalty for submissions through May 11 at 11:55 PM; 1/2 point (1.5%) penalty for each additional weekday late.
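
For the extraction projects, the last step of the suggested procedure is to score the system for recall and precision against the hand mark-up of the held-out sample. The following is a minimal sketch of that step in Python, not a prescribed method. It assumes each event is represented as a hashable tuple and that a system event counts as correct only if it matches a hand-marked event exactly; the function name `score_extraction` and the `(document, event type, argument)` tuple layout are illustrative choices, not part of the course materials or of Jet.

```python
def score_extraction(gold_events, system_events):
    """Return (precision, recall, f1) under exact-match scoring.

    gold_events:   events marked up by hand (assumed: hashable tuples)
    system_events: events produced by the extraction system
    """
    gold = set(gold_events)
    system = set(system_events)
    correct = len(gold & system)  # events found by both annotator and system

    # Precision: fraction of system output that is correct.
    precision = correct / len(system) if system else 0.0
    # Recall: fraction of hand-marked events that the system found.
    recall = correct / len(gold) if gold else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical death-notice events: (document id, event type, person name).
gold = [
    ("doc1", "death", "John Smith"),
    ("doc1", "death", "Mary Jones"),
    ("doc2", "death", "Ann Lee"),
]
system = [
    ("doc1", "death", "John Smith"),
    ("doc2", "death", "Ann Lee"),
    ("doc2", "death", "Bob Ray"),  # spurious event, lowers precision
]

precision, recall, f1 = score_extraction(gold, system)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice; a project might reasonably relax it (for example, by giving partial credit for a correct event type with a partially matching argument), in which case the write-up should state the matching rule used.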

