What is Prevalence (Estimation) in Machine Learning?

According to Wikipedia's entry on Statistical Classification,

In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.

One example of such a task is sentiment analysis (or opinion mining), in which a given text is classified as having positive or negative polarity.

Now, in the very same context, let's assume that we would like to gauge an advertising campaign's success by estimating how many of the customers' comments during the campaign are positive. Here we are not interested in the opinion expressed in each individual comment but in the aggregated share of opinions, and that is prevalence estimation. As Fabrizio Sebastiani puts it,

With the rapid growth of data volume, prevalence estimation is becoming the more popular approach, because in many cases companies want classification results at an aggregated level. Prevalence Estimation is thus akin to classification evaluated at the aggregate (rather than at the individual) level.
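To make the difference in output concrete, here is a minimal Python sketch. It assumes we already have one predicted label per comment from some sentiment classifier. The `predictions` list and the `estimate_prevalence` helper are hypothetical names made up for this illustration, and counting predicted labels is only the most naive way to obtain a prevalence estimate.

```python
from collections import Counter


def estimate_prevalence(predicted_labels):
    """Return the fraction of items assigned to each class."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical per-comment predictions (the output of classification).
predictions = ["positive", "negative", "positive", "positive", "negative"]

# The aggregate view: one number per class instead of one label per comment.
prevalence = estimate_prevalence(predictions)
print(prevalence)
```

Classification answers "what is the polarity of this comment?", while the dictionary above answers "what share of all comments is positive?".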

Sebastiani also mentions that the evaluation measures used in prevalence estimation differ from those used in classification. It is important to know that if your task only needs the aggregated result, using a standard classifier and simply counting its predictions is suboptimal.
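A small made-up example may show why the two tasks need different measures. The sketch below compares two hypothetical classifiers on 100 comments, 20 of which are truly positive. It scores them with accuracy (a classification measure) and with the absolute error of the estimated positive share. The absolute error is chosen here only for simplicity and is an assumption of this sketch, not necessarily the measure Sebastiani recommends. All data and function names are invented.

```python
def accuracy(y_true, y_pred):
    """Fraction of items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def positive_share(labels):
    """Fraction of items labelled positive (1)."""
    return sum(1 for label in labels if label == 1) / len(labels)


def prevalence_error(y_true, y_pred):
    """Absolute difference between true and estimated positive share."""
    return abs(positive_share(y_true) - positive_share(y_pred))


# 100 comments: 20 positive (1), 80 negative (0).
y_true = [1] * 20 + [0] * 80

# Classifier A: finds every positive but flags 10 negatives as positive.
pred_a = [1] * 20 + [1] * 10 + [0] * 70

# Classifier B: misses 10 positives and flags 10 negatives as positive.
pred_b = [0] * 10 + [1] * 10 + [1] * 10 + [0] * 70

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(f"{name}: accuracy={accuracy(y_true, pred):.2f}, "
          f"prevalence error={prevalence_error(y_true, pred):.2f}")
```

Classifier A makes fewer individual mistakes, yet overestimates the positive share. Classifier B makes twice as many mistakes, but they cancel out at the aggregate level, so its prevalence estimate is exact. A measure that rewards individual correctness therefore does not tell us which model gives the better aggregate.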
