Application of a Probability-Based Algorithm to Extraction of Product Features from Online Reviews

Chris Scaffidi
Technical Report CMU-ISRI-06-111, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 2006

Keywords: information extraction, mining, personalization, product review

Abstract:

Prior research has demonstrated the viability of automatically extracting product features from online reviews. This paper presents a probability-based algorithm and compares it to an existing support-based approach. Specifically, I used each algorithm to extract features from seven Amazon.com product categories and then asked end users to rate the features in terms of helpfulness for choosing products. The end users preferred the features identified by the probability-based algorithm. This algorithm can identify features that comprise a single noun or two successive nouns (end users rated the two-noun features as more helpful than the single-noun ones). Even for collections of tens of thousands of reviews, it executes fast enough for practical use, at around 1 ms per review.

Preferred citation: C. Scaffidi. Application of a Probability-Based Algorithm to Extraction of Product Features from Online Reviews. Technical Report CMU-ISRI-06-111, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 2006.

Entry last Updated 2006-06-20.
The software used to index and search these papers is Marian - the on-line-braian, available at Marian's Home site.
This page is part of Mary Shaw's site in the School of Computer Science at Carnegie Mellon University. Use of any portion of this site to generate spam or other mass communications is forbidden. Comments to maintainer. Comments and suggestions are welcome.