Summary
This paper discusses Ember, a framework for predicting significant societal events such as (i) disease outbreaks (influenza-like illness and discrete incidents of rare disease), (ii) elections, (iii) domestic political crises, and (iv) civil unrest events, based on the following surrogates:
- news, blogs, social media
- tweets and Twitter activity
- Wikipedia edits
- restaurant reservations and availability information
- parking lot imagery
- physical indicators (temperature, humidity, vegetation index)
- economic data
- TOR routing statistics
- percentages of smiles in photos shared over social media
Method
The system combines text-mining approaches that operate on the surrogates with models and statistical approaches. Combining these models provides information on different aspects of an alert (e.g., one model predicts the epidemic curve while another identifies spatial clusters based on Twitter hashtags).
Ember also uses probabilistic collective reasoning leveraging the framework of probabilistic soft logic. A Bayesian system integrates different models and assessments into the final set of issued alerts.
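To make the fusion step concrete, here is a minimal sketch of how a Bayesian combination of several model outputs could work. It is not the authors' method: the paper summary above does not specify the fusion rule, so the sketch assumes that each model reports a posterior alert probability against a shared prior and that the models' evidence is conditionally independent given the event. The function name `fuse_alert_probability` and the example numbers are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fuse_alert_probability(prior: float, model_probs: list[float]) -> float:
    """Fuse per-model alert probabilities into one posterior.

    Assumption (not from the paper): every model shares the same prior and
    the models' evidence is conditionally independent given the event, so
    each model contributes its log-odds shift relative to that prior.
    """
    prior_logit = logit(prior)
    fused_logit = prior_logit + sum(logit(p) - prior_logit for p in model_probs)
    return 1.0 / (1.0 + math.exp(-fused_logit))


if __name__ == "__main__":
    # Hypothetical inputs: a 10% base rate and two models that each raise it.
    posterior = fuse_alert_probability(0.10, [0.30, 0.40])
    print(f"fused alert probability: {posterior:.2f}")
    # An alert would be issued only if the posterior clears a chosen threshold.
    print("issue alert:", posterior >= 0.5)
```

Under these assumptions, two models that each only moderately raise the probability above the base rate reinforce each other in the fused estimate. If the models draw on overlapping surrogates, the independence assumption overstates the combined evidence, which is one reason a real system would need a more careful integration scheme.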