Online Discussion Participation Prediction Using Non-Negative Matrix Factorization


Fung, Y.-H., Li, C.-H., & Cheung, W. K. (2007). Online Discussion Participation Prediction Using Non-negative Matrix Factorization. In Proceedings of the 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops (pp. 284–287). Washington, DC, USA: IEEE Computer Society.

Introduction

This paper presents an approach for estimating user participation in Internet forum discussions, i.e., predicting whether a particular user will participate in a given discussion.

User participation follows Zipf's law:

The authors note that user participation frequency obeys Zipf's law, i.e., a user's posting frequency is inversely proportional to their rank in the "user activity" table. Consequently, the most active users post significantly more than inactive ones.
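To make the inverse-rank relationship concrete, here is a minimal sketch of the post counts an idealized Zipf law would produce. The function name `zipf_expected_counts` and the example numbers are hypothetical, and an exponent of exactly 1 is assumed; real forum data will only follow this shape approximately.

```python
def zipf_expected_counts(top_count, n_users):
    """Idealized Zipf law with exponent 1: count(rank) = top_count / rank."""
    return [top_count / rank for rank in range(1, n_users + 1)]

# Hypothetical forum whose most active user wrote 120 posts.
counts = zipf_expected_counts(top_count=120, n_users=5)
# Ranks 1..5 get 120, 60, 40, 30 and 24 posts respectively.
```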

Method

The method encodes discussions and users in an $$n \times m$$ matrix $$F$$, where $$f_{ij} \in F$$ is the number of posts written by user $$j$$ in discussion $$i$$. To account for the Zipf-distributed posting frequencies, the following normalization, inspired by pf/idf, is used:

$$x_{ij} = (pf_{ij}) \times (idf_j) = f_{ij} \times \log\frac{n}{n_j}$$

where $$n$$ is the total number of discussions and $$n_j$$ is the number of discussions in which user $$j$$ has participated. Furthermore, each discussion $$i$$ is normalized to unit Euclidean length by dividing its pf/idf vector by that vector's $$L_2$$ norm.
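The weighting and row normalization described above can be sketched with NumPy as follows. This is an illustration rather than the authors' code: the function name `pf_idf_normalize`, the toy matrix, and the handling of users or discussions without any posts are all assumptions.

```python
import numpy as np

def pf_idf_normalize(F):
    """Apply pf/idf weighting to a discussion-by-user count matrix,
    then scale every row (discussion) to unit Euclidean length."""
    F = np.asarray(F, dtype=float)
    n = F.shape[0]                       # total number of discussions
    n_j = np.count_nonzero(F, axis=0)    # discussions each user joined
    # Assumption: users without any post get idf = 0 (avoids division by zero).
    idf = np.zeros(F.shape[1])
    active = n_j > 0
    idf[active] = np.log(n / n_j[active])
    X = F * idf                          # x_ij = f_ij * log(n / n_j)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Assumption: all-zero rows are left as zeros instead of being divided.
    norms[norms == 0] = 1.0
    return X / norms

# Toy data: 3 discussions (rows) and 3 users (columns).
F = [[3, 0, 1],
     [0, 2, 1],
     [5, 0, 1]]
X = pf_idf_normalize(F)
```

In this toy matrix the third user posts in every discussion, so their idf is $$\log(3/3) = 0$$ and the whole column is weighted down to zero, which is the intended damping of very active users.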

Afterwards, Weighted Non-negative Matrix Factorization (WNMF), rather than Singular Value Decomposition (SVD), is used to find the latent factors from the observed participation frequencies, i.e., to estimate the model.
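This summary does not say which WNMF variant the paper uses, so the following is only a sketch of one common formulation: multiplicative updates that minimize the squared error over observed entries, selected by a binary mask. The names `wnmf`, `masked_error`, `M` and `k`, as well as the toy matrix, are assumptions for illustration.

```python
import numpy as np

def wnmf(X, M, k, n_iter=200, eps=1e-9, seed=0):
    """Factor X ~ U @ V with non-negative U, V, fitting only entries where M == 1."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, k)) + eps         # discussion factors
    V = rng.random((k, m)) + eps         # user factors
    MX = M * X                           # observed entries only
    for _ in range(n_iter):
        # Multiplicative updates keep U and V non-negative.
        U *= (MX @ V.T) / ((M * (U @ V)) @ V.T + eps)
        V *= (U.T @ MX) / (U.T @ (M * (U @ V)) + eps)
    return U, V

def masked_error(X, M, U, V):
    """Squared reconstruction error over the observed entries."""
    return float(np.sum(M * (X - U @ V) ** 2))

# Toy normalized matrix; entry (2, 2) is treated as unobserved.
X = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5],
              [1.0, 0.0, 0.0]])
M = np.ones_like(X)
M[2, 2] = 0.0
U, V = wnmf(X, M, k=2)
prediction = (U @ V)[2, 2]               # estimate for the held-out entry
```

Masking the held-out entry means it does not influence the fit, so `prediction` is an estimate derived purely from the latent factors learned on the remaining entries.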

Evaluation

The evaluation experiments observe user interaction in three different discussion boards and use the mean absolute error (MAE) to describe the method's performance:

$$MAE = \frac{1}{N} \sum_{(i,j) \in S_{test}} |x_{ij} - y_{ij}|$$
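As a small worked example of the metric, the sketch below assumes that $$x_{ij}$$ is the observed normalized value, $$y_{ij}$$ the predicted one, and $$N$$ the number of pairs in $$S_{test}$$; the summary does not spell these out. The function name and the numbers are made up for illustration.

```python
import numpy as np

def mean_absolute_error(X, Y, test_pairs):
    """Average |x_ij - y_ij| over the (i, j) pairs in the test set."""
    idx = np.asarray(test_pairs)
    rows, cols = idx[:, 0], idx[:, 1]
    return float(np.mean(np.abs(X[rows, cols] - Y[rows, cols])))

X = np.array([[0.2, 0.8],
              [0.5, 0.0]])               # observed values (hypothetical)
Y = np.array([[0.1, 0.6],
              [0.5, 0.3]])               # predicted values (hypothetical)
S_test = [(0, 0), (0, 1), (1, 1)]
mae = mean_absolute_error(X, Y, S_test)
# Absolute errors are 0.1, 0.2 and 0.3, so the MAE is about 0.2.
```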


Albert Weichselbraun
