Tom H. C. Anderson - Next Gen Market Research™

More Than Market Research - Gain The Information Advantage


Practical Sentiment Analysis and Lies

April 9th, 2012 · 4 Comments

Q&A with Prof. Bing Liu ahead of the Sentiment Analysis Symposium and Pre-Symposium Tutorial

The Sentiment Analysis Symposium in NYC is just a month away (May 8th), so I thought I’d check out who was teaching the pre-conference sentiment analysis tutorial this year. For those of us working with text analytics and in the New York area, Seth Grimes’s Sentiment Analysis Symposium has definitely made our annual must-attend list. However, what most seem to miss is the half-day workshop the day before the event each year. I started attending this component last year, when researchers from Amazon.com were teaching it, and decided it was well worth half a day in the city to get a more tactical POV on sentiment from someone who might have a slightly different use case or experience.

This year, data mining expert Bing Liu, a professor in the Computer Science Department at the University of Illinois at Chicago, will be teaching the workshop. Some of his work on text analytics and detecting fraud in online ratings was recently featured in the NY Times. Since I noticed we were connected on LinkedIn from a previous text analytics event, I called him up for a quick chat to learn a bit more about his work and what I might expect to learn at his pre-Symposium workshop. We had an interesting talk, and I subsequently sent him a few questions, as I thought others would be interested as well.

I plan on being at both the Symposium and the pre-Symposium workshop again this year. If you’re interested in attending, feel free to use my discount code (OdinText). Do let me know if you’ll be attending so we can meet up; it’s a relatively small and informal group.

Now on to the Q&A…

Tom: Bing, how did you get into text analytics and sentiment analysis?

Bing: My earlier research interests were in the areas of data mining and machine learning. Around 2000, I started to get interested in Web mining and machine learning using text data. These two topics led me to text on the Web. Reviews naturally came to mind because they are focused and well organized, which is great for data mining. I also quickly realized that sentiment analysis was a perfect research problem in its own right (I called it opinion mining then due to my data mining background). It had so many applications, as every individual and organization needs opinions for decision making. There was also a whole range of challenging research problems that had not been addressed by the natural language processing or linguistics communities. We started to work on it in 2003 and published our first paper at KDD-2004 (ACM SIGKDD International Conference on Knowledge Discovery and Data Mining). The paper basically defined the framework of feature-based or aspect-based sentiment analysis and opinion summarization, which is now widely used in industry and in research.

Tom: False website reviews are an interesting application, and one that I’ve been keeping my eye on. I noticed the New York Times recently covered some of your work in this area. This type of text analytics research seems to be much more difficult than most people think. Can you tell us a bit about this problem from the text analytics perspective, and how it is different from simpler use cases like identifying spam email, for instance?

Bing: Indeed, this is a very difficult problem. My group began to work on it around 2006 or 2007, as we realized this was an important problem and would become more and more important. When we started on it, we realized it was really hard. The main difficulty lies in the fact that it is very hard, if not impossible, to recognize fake reviews manually, as it is fairly easy to craft a fake review and pass it off as a genuine one. Email spam detection is a much easier problem because you will immediately recognize a spam mail when you see one. This means that spam and non-spam emails have clear differences, and that it is easy to produce training data for machine learning algorithms in order to build predictive models and to evaluate them.

However, for fake reviews, if one writes them very carefully, it is hard to recognize them just by reading the review text. In the extreme case, this is an impossible task logically. For example, one can write a genuine review for a good restaurant and post it as a fake review for a bad restaurant in order to promote the bad restaurant. There is no way to detect this fake review without considering information beyond the review text itself simply because one review cannot be both truthful and fake at the same time.
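The restaurant example suggests one kind of information beyond the text that can help: where else the same text appears. The sketch below is only an illustration under stated assumptions, not a method described in the interview. It assumes reviews arrive as hypothetical (business_id, text) pairs and flags any review text that shows up under more than one business.

```python
from collections import defaultdict


def normalize(text):
    """Case-fold and collapse whitespace so trivial edits do not hide a copy."""
    return " ".join(text.casefold().split())


def reused_reviews(reviews):
    """Map each normalized review text to the businesses it was posted under.

    `reviews` is an iterable of (business_id, text) pairs (a hypothetical
    input format). Only texts found under more than one business are returned.
    """
    seen = defaultdict(set)
    for business_id, text in reviews:
        seen[normalize(text)].add(business_id)
    return {text: sorted(ids) for text, ids in seen.items() if len(ids) > 1}


sample = [
    ("good_bistro", "Wonderful food and a lovely terrace."),
    ("bad_diner", "wonderful food and a   lovely terrace."),
    ("bad_diner", "Cold fries, slow service."),
]
flagged = reused_reviews(sample)
print(flagged)
```

Exact matching like this only catches verbatim reuse; a lightly reworded copy would slip through, which is part of why the problem is hard.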

Tom: What do you see as some of the applications of this type of research?

Bing: Review hosting sites or any general social media sites all want their reviews and user comments to be trustworthy. They are thus interested in fake review detection algorithms. All text analytics systems that use reviews or any opinion data need to worry about this problem too. Social media is here to stay. Its content is also being used more and more in applications.

Something has to be done to ensure the integrity of this valuable source of information before it becomes full of fake opinions, lies and deceptive information. After all, there are strong motivations for businesses and individuals to post fake reviews for profit and fame. It is also easy and cheap to do so. Writing fake reviews has already become a very cheap way of marketing and product promotion.

Tom: Have you found there are certain approaches that work better than others?

Bing: It is still too early to tell. Researchers currently use both linguistic features and atypical behaviors of reviewers to detect fakes. I feel that algorithms that mine atypical behaviors of reviewers and reviews tend to produce more interpretable and trustworthy results. For example, if all 5-star reviews for a hotel were posted only by people from the surrounding area of the hotel, these reviews are clearly suspicious. This is a simple example. More sophisticated fake reviews need more involved modeling and algorithms to detect them.
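To make the hotel example concrete, here is a minimal sketch of such a behavioral check. Everything in it is an assumption made for illustration: the `Review` fields, the 25 km radius, the 0.9 threshold, and the minimum review count are hypothetical and not taken from Bing's work.

```python
from dataclasses import dataclass


@dataclass
class Review:
    reviewer_id: str
    stars: int
    km_from_hotel: float  # hypothetical: distance of the reviewer's home area


def local_five_star_share(reviews, radius_km=25.0):
    """Fraction of 5-star reviews written by reviewers within radius_km."""
    five_star = [r for r in reviews if r.stars == 5]
    if not five_star:
        return 0.0
    local = sum(1 for r in five_star if r.km_from_hotel <= radius_km)
    return local / len(five_star)


def is_suspicious(reviews, radius_km=25.0, threshold=0.9, min_reviews=5):
    """Flag a hotel when nearly all of its 5-star reviews come from locals.

    min_reviews keeps hotels with only a handful of reviews from being flagged.
    """
    five_star_count = sum(1 for r in reviews if r.stars == 5)
    if five_star_count < min_reviews:
        return False
    return local_five_star_share(reviews, radius_km) >= threshold


hotel_a = [Review(f"u{i}", 5, 3.0) for i in range(6)] + [Review("u9", 2, 800.0)]
hotel_b = [Review(f"v{i}", 5, 3.0) for i in range(3)] + [
    Review(f"w{i}", 5, 500.0) for i in range(3)
]
print(is_suspicious(hotel_a), is_suspicious(hotel_b))
```

A flag like this is easy to interpret, which is the appeal of behavioral signals, but it is a prompt for a closer look rather than proof of fraud.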

Tom: It’s been my observation and experience that we as an industry are moving away from linguistic approaches to text (sure, some of the basics are useful) and toward machine learning and statistical approaches, which seem more powerful. What are your thoughts on this?

Bing: For most tasks, machine learning and statistical approaches are indeed more effective than purely linguistics-based approaches. Linguistic approaches are mostly based on heuristic rules and patterns (including grammar information). For those tasks that can be performed based on words, it is very hard for a linguistics-based approach to beat a statistical machine learning algorithm, simply because the signals used by a machine learning algorithm are far more numerous than the rules or patterns that a person can design. Plus, machine learning algorithms optimize for performance. That being said, in many tasks, linguistics-based signals and clues are used as features by machine learning algorithms.

Statistical approaches are not without their limits. Going forward, I believe that both linguistic knowledge and statistical modeling are important. We are working on integrating more linguistic knowledge into statistical modeling.
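For readers new to the statistical side, the sketch below shows what "performed based on words" can look like in practice: a tiny multinomial Naive Bayes classifier that learns from human-labeled examples. Naive Bayes is a swapped-in textbook technique chosen for brevity, not something Bing names here, and the training sentences are made up for the illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the log likelihood of each known word.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # words never seen in training carry no signal
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


texts = [
    "great room friendly staff",
    "loved the great breakfast",
    "dirty room rude staff",
    "terrible breakfast rude service",
]
labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(texts, labels)
print(model.predict("great staff"), model.predict("rude service"))
```

Every word becomes a signal with a learned weight, which is why such a model can draw on far more cues than a hand-written rule set. Linguistic knowledge can still enter as extra features, for example a negation marker attached to the following word.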

Tom: It seems to me a lot of folks get a little too caught up in differences between languages. My firm for instance has found it rather easy to add other European languages to our approach, and of course machine translation is always a possibility. What are your thoughts on this?

Bing: Yes, I agree. Although every language is different, languages are still similar in that they all consist of words and grammar. European languages have even more similarities due to their common roots. A learning algorithm can capture many types of grammatical regularities from any language if there is a sufficient amount of training data. For those tasks that need only word or lexical information, the same algorithm can be used for any language with almost no modification because an algorithm treats words as symbols. In that sense, it does not matter what language it is.
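The "words as symbols" point can be shown in a few lines. In this sketch, which is an illustration and not code from the interview, each distinct token is mapped to an integer id and a document becomes a vector of counts. The same two functions run unchanged on English and German input. Whitespace tokenization is an assumption that suits European languages; languages written without spaces between words would need a different tokenizer.

```python
def build_vocab(docs):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for doc in docs:
        for tok in doc.casefold().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(doc, vocab):
    """Turn a document into a vector of token counts over the vocabulary."""
    counts = [0] * len(vocab)
    for tok in doc.casefold().split():
        if tok in vocab:
            counts[vocab[tok]] += 1
    return counts


english_vocab = build_vocab(["good hotel", "bad hotel"])
german_vocab = build_vocab(["gutes hotel", "schlechtes hotel"])

print(encode("good good hotel", english_vocab))
print(encode("gutes Hotel", german_vocab))
```

Once text is reduced to id counts like these, a learning algorithm downstream never sees which language the symbols came from.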

Tom: What will you be covering during the tutorial at the sentiment symposium?

Bing: Sentiment analysis has been studied extensively for the past decade. A huge number of research papers have been published on it (probably more than 1,000). It is impossible to cover them all. Therefore, I will try to cover the main threads of research, focusing on those with aspects that can be of immediate use in practice.

In the tutorial, I will start with a short motivation and then go on to define the problem. This will provide an abstraction or statement of the problem, which will naturally introduce the key sub-problems. I will then discuss the current state-of-the-art approaches to solving these problems. Since this is a practical sentiment analysis tutorial, I will also describe how to build a practical sentiment analysis system based on my previous experience in building one. In the final part of the tutorial, I will introduce the problem of fake review detection.

A big thanks to Bing for our talk and the subsequent Q&A. Looking forward to meeting up at the Symposium.

@TomHCAnderson
@OdinText

[For those interested in more info about the sentiment tutorial, a syllabus and outline are available here]


Tags: Academia · Conferences · Datamining · Odin Text · OdinText · Sentiment Analysis · Sentiment Analysis Symposium · Text Analytics · Text Analytics Summit · Text Mining Guru · seth grimes · text mining · tomhcanderson

4 responses so far ↓

  • 1 Peter Szekeres // Apr 11, 2012 at 7:58 am

    Great interview. I really agree with Prof. Bing Liu. I think that the most effective sentiment analysis methods are those that combine knowledge lexicons, grammar rules, and statistical methods and assessments. By combining these I built a quite good sentiment analysis system for Hungarian webpages (accuracy: 80%).
    I think one important thing wasn’t mentioned above: handling irony. Maybe it can be similar to recognizing fake reviews…

  • 2 chris west // Apr 19, 2012 at 6:47 am

    I’m someone who’s coming into Text Analytics from the Marketing World. Can anyone explain (simply) how a ‘statistical machine learning algorithm’ works: what do you give it as inputs? Does it look for wide variances from ‘typical’ or ‘mean’ to spot possible fakes?
    Any help appreciated
    Chris

  • 3 Tom H C Anderson // Apr 19, 2012 at 8:30 am

    Chris, good question. Yes, most typically that approach has to do with the computer ‘learning’ how humans do it. So the target variable would be how humans have coded it, be it sentiment (Pos, Neg, etc.) or, in this case, I guess, dishonest or honest.

  • 4 Fake Reviews a Growing and Tenacious Problem in Social Media : Beyond Search // Apr 20, 2012 at 12:07 am

    [...] Analysis Symposium in New York City early next month. He has titled his interview, “Practical Sentiment Analysis and Lies.” [...]
