Categories
Algorithms and Optimizations
NoDB: Efficient Query Execution on Raw Data Files
Alagiannis, Ioannis, Renata Borovica, Miguel Branco, Stratos Idreos, and Anastasia Ailamaki. NoDB: Efficient Query Execution on Raw Data Files. In Proceedi...
Applications
Big, Linked Geospatial Data and Its Application in Earth Observation
Integrating earth observation data with linked open data would pave the way for easy reuse and integration of these datasets. The article discusses how knowl...
Employment relations: a data driven analysis of job markets using online job boards and online professional networks
Career websites contain valuable data on employees, their skill sets, and employment history. This article uses k-means clustering on keywords describing ski...
Articles
Distributed Information Processing in Biological and Computational systems
Navlakha, S. & Bar-Joseph, Z., 2014. Distributed information processing in biological and computational systems. Communications of the ACM, 58(1), pp.94...
The Power of Social Media Analytics
Fan, W., & Gordon, M. D. (2014). The power of social media analytics. Communications of the ACM, 57(6), 74–81. doi:10.1145/2602574 Summa...
Social Media Analytics for Smart Health
Abbasi, A., Adjeroh, D., Dredze, M., Paul, M. J., Zahedi, F. M., Zhao, H., Ross, A. (2014). Social Media Analytics for Smart Health. IEEE Intelligent System...
Factors Influencing the Response Rate in Social Question and Answering Behavior
Liu, Z. & Jansen, B.J., 2013. Factors Influencing the Response Rate in Social Question and Answering Behavior. In Proceedings of the 2013 Conference on ...
Online Discussion Participation Prediction Using Non-Negative Matrix Factorization
Fung, Y.-H., Li, C.-H., & Cheung, W. K. (2007). Online Discussion Participation Prediction Using Non-negative Matrix Factorization. In Proceedings of th...
Large-scale Incremental Processing Using Distributed Transactions and Notifications
by Peng and Dabek This paper introduces Percolator and the corresponding processing pipeline called Caffeine, which are systems for incrementally processing ...
Dremel: Interactive Analysis of Web-Scale Datasets
by Melnik et al. in Proceedings of the 36th International Conference on Very Large Data Bases 2010 This paper covers Dremel, a scalable, interactive ad-hoc q...
Web Page Classification: Features and Algorithms
Qi, X. & Davison, B.D., 2009. Web page classification: Features and algorithms. ACM Comput. Surv., 41(2), pp.12:1–12:31. C...
A conceptual density-based approach for disambiguation of toponyms
by Buscaldi and Rosso. This article explores the use of word-sense-disambiguation (WSD) techniques for toponym resolution. The authors explain two algorithms w...
An empirical study of the effects of NLP components on Geographic IR performance
by Stokes et al. This article focuses on the impact of NLP components on the task of toponym resolution (TR) and geographic information retrieval (GIR) ...
Basic Ontology Data Integration Concepts
Data integration consists of two basic steps: (i) semantic enrichment and (ii) mapping discovery. LAV vs. GAV: LAV and GAV describe two approaches for integrat...
Information Revelation and Privacy in Online Social Networks (The Facebook case)
by Ralph Gross and Alessandro Acquisti (+) The article provides an excellent literature review on trust and intimacy in online networking and on the partici...
What Is Web 2.0
Design Patterns and Business Models for the Next Generation of Software by Tim O'Reilly (09/30/2005) An excellent article elaborating the concepts and idea...
Sharing Knowledge
by Peter Marks, Peter Polak, Scott McCoy and Dennis Galletta The idea of knowledge management systems (KMS) is unlocking knowledge heretofore only accessible...
A Value-Driven System for Autonomous Information Gathering
by Grass, J. and Zilberstein, S. Grass and Zilberstein introduce a framework for gathering information, by repeatedly selecting queries with the highest marg...
Improving Performance of Web Service Query Matchmaking with Automated Knowledge Acquisition
by Gupta et al. The article presents a system for simple interfacing with Web services using an HTML-based search engine, mapping the queries to Web service requ...
Towards a Query Optimizer for Text-Centric Tasks
by Panagiotis Ipeirotis et al. The idea of the article is to provide strategies for optimally choosing between different crawl-/query strategies (like scan, fil...
Big data
Big, Linked Geospatial Data and Its Application in Earth Observation
Integrating earth observation data with linked open data would pave the way for easy reuse and integration of these datasets. The article discusses how knowl...
Employment relations: a data driven analysis of job markets using online job boards and online professional networks
Career websites contain valuable data on employees, their skill sets, and employment history. This article uses k-means clustering on keywords describing ski...
Suffix array
The suffix array is a memory-efficient alternative to the suffix tree which provides a sorted list of string indices indicating the string’s suffixes.
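The definition above can be illustrated with a minimal sketch. It builds the suffix array by naive sorting (fine for short strings, not the linear-time constructions used in practice) and uses binary search to locate a pattern; the function names `build_suffix_array` and `find_occurrences` are hypothetical and chosen only for this illustration.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Return the start indices of all suffixes of text, in sorted suffix order.

    Naive construction: sorts the suffixes directly, which is simple but
    slow for long inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: List[int], pattern: str) -> List[int]:
    """Return all start positions of pattern in text, in ascending order."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in [start, lo) begin with pattern.
    return sorted(suffix_array[start:lo])


if __name__ == "__main__":
    sa = build_suffix_array("banana")
    print(sa)                                     # [5, 3, 1, 0, 4, 2]
    print(find_occurrences("banana", sa, "ana"))  # [1, 3]
```

The array stores only one integer per character of the input, which is where the memory advantage over a suffix tree comes from; the price is the extra binary search at query time.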
Dynamic feature scaling for online learning of binary classifiers
This article describes and evaluates different online feature scaling approaches and their impact on the performance of binary classifiers. Online feature...
40 years of suffix trees
Suffix trees are used in text searching, indexing, and statistics. This article describes the history, construction, current developments and applications of suf...
Data sketching
This article introduces three popular data structures that efficiently handle and summarize large data sets.
Rich Data, Poor Fields
This article shows how handheld devices and big data technology may multiply field yields and make farming more environmentally friendly.
The DARPA Twitter Bot Challenge
Subrahmanian, V. S., A. Azaria, S. Durst, V. Kagan, A. Galstyan, K. Lerman, L. Zhu, E. Ferrara, A. Flammini, and F. Menczer. The DARPA Twitter Bot Challenge....
It Probably Works
McMullen, T. (2015). It Probably Works. Commun. ACM, 58(11), 50–54. http://doi.org/10.1145/2814332 Introduction This article distinguishes between thre...
The New Smart Cities
Mone, G. (2015). The New Smart Cities. Commun. ACM, 58(7), 20–21. http://doi.org/10.1145/2771297 Summary This article discusses b...
Natural Language Processing for Health and Social Media
Abbasi, A. et al., 2014. Social Media Analytics for Smart Health. IEEE Intelligent Systems, 29(2), pp.60–80. Summary In this arti...
Social Media Analytics for Smart Health
Abbasi, A., Adjeroh, D., Dredze, M., Paul, M. J., Zahedi, F. M., Zhao, H., Ross, A. (2014). Social Media Analytics for Smart Health. IEEE Intelligent System...
Healthcare Intelligence: Turning Data Into Knowledge
Yang, H., Kundakcioglu, E., Li, J., Wu, T., Mitchell, J. R., Hara, A., Tsui, K.-L. (2014). Healthcare Intelligence: Turning Data into Knowledge. IEEE Intell...
Beyond Data and Analysis
Davis, C. K. (2014). Beyond Data and Analysis. Commun. ACM, 57(6), 39–41. doi:10.1145/2602326 Summary The article identifies competition whi...
Big Data and Its Technical Challenges
Jagadish, H. V., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J. M., Ramakrishnan, R., & Shahabi, C. (2014). Big Data and Its Technical Chal...
Data Science and Prediction
by Dhar, V. (2013). Data science and prediction. Communications of the ACM, 56(12), 64–73. This article provides insights into how data science complem...
Hazy: Making It Easier to Build and Maintain Big-Data Analytics
Kumar, Arun, Feng Niu, and Christopher Ré. Hazy: Making It Easier to Build and Maintain Big-data Analytics. Communications of the ACM 56, no. 3...
Book Review
Collective Intelligence
by Toby Segaran. An excellent guide to programming Web 2.0 applications, with code examples and excellent explanations of the techniques used. Similarity Metr...
Extreme Programming Explained
by Kent Beck and Cynthia Andres I would like to point out some of the concepts and ideas presented in the book: small improvements: analyse the current proc...
Web 2.0 by Tom Alby
Yesterday I finally got the time to read some of the new literature I ordered and I decided to start with Tom Alby's "Web 2.0". Relevant information for sema...
Conferences
WISE 2011 - Training a Named Entity Recognizer on the Web
by Urbansky et al. The authors distinguish between three approaches towards NER: use of hand-crafted rules (lexicons, rules), supervised machine learning, an...
Comparing the Sensitivity of Information Retrieval Metrics
by Radlinski and Craswell (SIGIR 2010) This paper compares user-behaviour-based IR metrics with the following standard IR metrics: Precision@k -- the preci...
Using Ontological Contexts to Assess the Relevance of Statements in Ontology Evolution
by Fouad Zablith, Mathieu d'Aquin, Marta Sabou, and Enrico Motta This work describes a method for judging the relevance of statements suggested by ontology evol...
Conference Thoughts
Some things I found interesting to mention. Educational: Traditional measures for association rules: support: P(XY), confidence: P(X|Y), lift ...
Evolutionary Clustering in Description Logics: Controlling Concept Formation and Drift in Ontologies
by Floriana Esposito et al. Detecting concept drift and new concepts in an ontology by analyzing the superclasses of its individuals.
Dynamic Taxonomies (FIND Workshop)
The term dynamic taxonomies refers to a multidimensional (multifaceted *check*) classification. Interesting ideas: Searching/Browsing: express s...
Text-based Information Retrieval (T-IR)
A workshop organized by Benno Stein. Content Extraction (from Web pages): Filter lines based on the content-to-tags ratio. a <- ASCII character...
Database
The Pathologies of Big Data
by Jacobs, Adam (2009). The pathologies of big data, Communications of the ACM, ACM, pages 36-44, 52(8) The article demonstrates the importance of a profound...
HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
Abouzeid, Azza, Bajda-Pawlikowski, Kamil, Abadi, Daniel, Rasin, Alexander and Silberschatz, Avi (2009). ''HadoopDB: An Architectural Hybrid of MapReduce and ...
Distributed Computing
HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
Abouzeid, Azza, Bajda-Pawlikowski, Kamil, Abadi, Daniel, Rasin, Alexander and Silberschatz, Avi (2009). ''HadoopDB: An Architectural Hybrid of MapReduce and ...
Evaluation Metrics
Evaluation Without Ground Truth in Social Media Research
Zafarani, Reza, and Huan Liu. Evaluation Without Ground Truth in Social Media Research. Communications of the ACM 58, no. 6 (May 2015): 54–60. doi:10.1...
Comparing the Sensitivity of Information Retrieval Metrics
by Radlinski and Craswell (SIGIR 2010) This paper compares user-behaviour-based IR metrics with the following standard IR metrics: Precision@k -- the preci...
Future Internet
Future Internet Socio-Economics - Challenges and Perspectives
by Hausheer et al. This is a very general, high-level article on the Future Internet which identifies issues like scalability and address space limitations b...
Context-Aware Systems and Implications for Future Internet
by Nigel Baker et al. The article presents a vision of how context can make network services more personalised and useful. The authors define context awareness ...
Geo
Spatial and Temporal Information
based on "Normalizing Spatial Information to Better Combine Criteria in Geographical Information Retrieval" by Palacio et al. (ECIR 2009). There are two types...
Evaluation and User Preference Study on Spatial Diversity
by Tang and Sanderson (ECIR 2010) This article presents a user study which shows that users prefer search results which are not only (i) relevant but also (i...
Judging the spatial relevance of documents for GIR
by Clough and Joho (Advances in Information Retrieval 2006) This article describes a pilot study which assesses both thematic and geographic relevance based...
Detecting Geographic Locations from Web Resources
by Wang et al. (GIR 2005) The authors distinguish between three different types of geographic locations: the provider location (= source locat...
Comparing the Sensitivity of Information Retrieval Metrics
by Radlinski and Craswell (SIGIR 2010) This paper compares user-behaviour-based IR metrics with the following standard IR metrics: Precision@k -- the preci...
An empirical study of the effects of NLP components on Geographic IR performance
by Stokes et al. This article focuses on the impact of NLP components on the task of toponym resolution (TR) and geographic information retrieval (GIR) ...
An evaluation dataset for the toponym resolution task
Leidner, Jochen L. (2006). ''An evaluation dataset for the toponym resolution task'', Computers, Environment and Urban Systems, pages 400-417 This paper moti...
A confidence-based framework for disambiguating geographic terms
Rauch, Erik and Bukatin, Michael and Baker, Kenneth (2003). ''A confidence-based framework for disambiguating geographic terms'', Proceedings of the HLT-NAAC...
Challenges and Resources for Evaluating Geographical IR
Martins, Bruno and Silva, Mário J. and Chaves, Marcirio Silveira (2005). ''Challenges and resources for evaluating geographical IR'', GIR '05: Proceedings of...
Habilitation
Progress in Business Informatics
based on ideas from the article "Perspektiven der Wirtschaftsinformatik aus Sicht der Informatik" by Matthias Jarke. The relational data model (Codd 19...
Information Diffusion
Model-Based Forecasting of Significant Societal Events
Ramakrishnan, Naren, Chang-Tien Lu, Madhav V. Marathe, Achla Marathe, Anil Vullikanti, Stephen Eubank, Scotland Leman, et al. Model-Based Forecasting of Sig...
An Open Digest-based Technique for Spam Detection
Damiani, E. et al., 2004. An Open Digest-based Technique for Spam Detection. In Proceedings of the 2004 International Workshop on Security in Parallel ...
Factors Influencing the Response Rate in Social Question and Answering Behavior
Liu, Z. & Jansen, B.J., 2013. Factors Influencing the Response Rate in Social Question and Answering Behavior. In Proceedings of the 2013 Conference on ...
Meme ranking to maximize post virality in microblogging platforms
Bonchi, F., Castillo, C., & Ienco, D. (2013). Meme ranking to maximize posts virality in microblogging platforms. Journal of Intelligent Information Syst...
Catching a Viral Video
Broxton, Tom, Yannet Interian, Jon Vaver, and Mirjam Wattenhofer. Catching a Viral Video. Journal of Intelligent Information Systems 40, no. 2 (April 1, 201...
Clash of the Contagions - Cooperation and Competition in Information Diffusion
by Seth A. Myers and Jure Leskovec, IEEE International Conference on Data Mining (ICDM 2012), Brussels, Belgium Introduction The authors present a statistica...
Targeting Online Communities to Maximise Information Diffusion
Belák, V., Lam, S. & Hayes, C., 2012. Targeting online communities to maximise information diffusion. In Proceedings of the 21st internati...
Finding Text Reuse in the Web
by Michael Bendersky and W. Bruce Croft (WSDM'09) This article discusses an approach for finding three different kinds of text reuse in the web: verbatim co...
Extracting influential nodes on a social network for information diffusion
by Kimura, M. et al. (Data Mining and Knowledge Discovery 2010; kimura2010) This paper covers the optimization problem of finding the most influential nodes o...
Social influence analysis in large-scale networks
by Tang et al. (tang2009), SIGKDD Tang et al. (2009) propose Topical Affinity Propagation (TAP) for determining the topic-level social influence of nodes in ...
Inferring networks of diffusion and influence
by Gomez Rodriguez, M.; Leskovec, J. & Krause, A. In order to study network diffusion, we need to identify the contagion (idea, information, virus, phra...
The flow of on-line information in global networks
by Kleinberg (SIGMOD 2010 Keynote; kleinberg2010) Kleinberg notes that to understand the dynamics of real-time information, we need ways to reason about ...
Identifying the Influential Bloggers in a Community
by Agarwal et al. (WSDM2008, agarwal2008) The goal of this article is to provide a method for identifying influential bloggers. Definitions: outlinks: resou...
Meme-tracking and the Dynamics of the News Cycle
by Leskovec et al. (leskovec2009, KDD2009) The authors introduce a framework for tracking short, distinctive phrases that travel intact through on-line text ...
Tracking Information Epidemics in Blogspace
by Eytan Adar and Lada A. Adamic (WI2005; adar2005) The authors study the paths along which information spreads in the "blog network". They consider the task...
A Measurement-driven Analysis of Information Propagation in the Flickr Social Network
by Cha et al. (cha2009) at WWW 2009 The authors present a study on viral information propagation which is based on crawls of the favorite markings of 2.5...
Learning information diffusion process on the web
by Wan, X. and Yang, J. (wan2007) The authors present an approach which identifies the diffusion process for a particular topic. Sets of documents with a give...
Finding and tracking subjects within an ongoing debate
by Rudy Prabowo and Mike Thelwall (prabowo2008) This article tracks subjects in postings and bulletins by identifying co-occurring terms which represent thes...
Learning influence probabilities in social networks
by Goyal et al. (goya2010) The article's authors learn influence models based on social graphs and an action log. Based on the learned models they are able ...
Information flow modeling based on diffusion rate for prediction and ranking
by Song et al. (song2007) Song et al. investigate the information flow in a user network. They try to (i) predict where information flows and (ii) who will m...
Information Extraction
Model-Based Forecasting of Significant Societal Events
Ramakrishnan, Naren, Chang-Tien Lu, Madhav V. Marathe, Achla Marathe, Anil Vullikanti, Stephen Eubank, Scotland Leman, et al. Model-Based Forecasting of Sig...
HYENA-live: Fine-Grained Online Entity Type Classification from Natural-language Text
Yosef, M. A., Bauer, S., Hoffart, J., Spaniol, M., & Weikum, G. (2013). HYENA-live: Fine-Grained Online Entity Type Classification from Natural-language...
From Names to Entities using Thematic Context Distance
Pilz, A., & Paaß, G. (2011). From Names to Entities Using Thematic Context Distance. In Proceedings of the 20th ACM International Conference...
Improving Efficiency and Accuracy in Multilingual Entity Extraction
Daiber, J. et al., 2013. Improving Efficiency and Accuracy in Multilingual Entity Extraction. In Proceedings of the 9th International Conference on Semantic...
Evaluating the Impact of Phrase Recognition on Concept Tagging
Mendes, P., Daiber, J., Rajapakse, R., Sasaki, F., & Bizer, C. (2012). Evaluating the Impact of Phrase Recognition on Concept Tagging. In Proceedings of...
Targeted disambiguation of ad-hoc, homogeneous sets of named entities
Wang, C., Chakrabarti, K., Cheng, T., & Chaudhuri, S. (2012). Targeted disambiguation of ad-hoc, homogeneous sets of named entities. In Proceedings of th...
Automatic Semantic Web Annotation of Named Entities
Charton, E., Gagnon, M., & Ozell, B. (2011). Automatic semantic web annotation of named entities. In Proceedings of the 24th Canadian conference on Adva...
A Comparison of Knowledge Extraction Tools for the Semantic Web
Gangemi, A., 2013. A Comparison of Knowledge Extraction Tools for the Semantic Web. In P. Cimiano et al., eds. The Semantic Web: Semantics and Big Data. Lect...
Automatic knowledge extraction from documents
Fan, J., Kalyanpur, A., Gondek, D. C., & Ferrucci, D. A. (2012). Automatic knowledge extraction from documents. IBM Journal of Research and Development, ...
Ensemble Semantics for Large-scale Unsupervised Relation Extraction
Min, B. et al., 2012. Ensemble semantics for large-scale unsupervised relation extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods ...
Relation Extraction and the Influence of Automatic Named-Entity Recognition
Giuliano, C., Lavelli, A. & Romano, L., 2007. Relation extraction and the influence of automatic named-entity recognition. ACM Transactions on Speech an...
Collective Cross-Document Relation Extraction Without Labelled Data
Yao, L., Riedel, S. & McCallum, A., 2010. Collective cross-document relation extraction without labelled data. In Proceedings of the 2010 Conference on ...
Mining competitor relationships from online news: A network based approach
Ma, Z., Pant, G. & Sheng, O.R.L., 2011. Mining competitor relationships from online news: A network-based approach. Electronic Commerce Research and Appl...
Evaluating Entity Linking with Wikipedia
Hachey, B. et al., 2013. Evaluating Entity Linking with Wikipedia. Artificial Intelligence, 194, pp.130–150. This article compares the performance of t...
IdentityRank: Named entity disambiguation in the news domain
Fernández, N. et al., 2012. IdentityRank: Named entity disambiguation in the news domain. Expert Systems with Applications, 39(10), pp.9207...
Open Information Extraction using Wikipedia
Wu, F. & Weld, D.S., 2010. Open information extraction using Wikipedia. In Proceedings of the 48th Annual Meeting of the Association for Computational Li...
Distributional Footprints of Deceptive Product Reviews
by Song Feng, Longfei Xing, Anupam Gogar and Yejin Choi 2012 The authors of this paper argue that there are natural distributions of opinions in reviews for ...
A Survey of Types of Text Noise and Techniques to Handle Noisy Text
by Subramaniam, L. V., Roy, S., Faruquie, T. A., & Negi, S. (2009). A survey of types of text noise and techniques to handle noisy text. Proceedings of T...
Isanette: A Common and Common Sense Knowledge Base for Opinion Mining
Cambria, E. et al., 2011. Isanette: A Common and Common Sense Knowledge Base for Opinion Mining. In Data Mining Workshops (ICDMW), 2011 IEEE 11th Internation...
Evaluation of Named Entity Recognition Systems
Monica Marrero et al. (2009) Evaluation of Named Entity Extraction Systems, Research In Computer Science, Vol. 41 (2009), pp. 47-58 This paper presents a num...
The Web is not a Person - An Analysis of the Performance of Named-Entity Recognition
Krovetz, R. et al. 2011. The web is not a person, Berners-Lee is not an organization, and African-Americans are not locations: an analysis of the performance...
Identifying Relations for Open Information Extraction
by Fader et al. This paper addresses two major shortcomings of state of the art open information extraction systems: uninformative extractions that omit cri...
Acquiring Competitive Intelligence from Social Media
by Dey et al. This article (i) discusses methodologies to obtain Web intelligence and (ii) presents case-studies which demonstrate how this information can b...
Finding Text Reuse in the Web
by Michael Bendersky and W. Bruce Croft (WSDM'09) This article discusses an approach for finding three different kinds of text reuse in the web: verbatim co...
Building and applying a concept hierarchy representation of a user profile
by Nanas et al. The focus of this paper is on building concept hierarchies and networks describing document repositories rather than on ontology evalu...
Redundancy-based information extraction
The notion of redundancy-based information extraction exploits the fact that much information on the Web is redundant, which leads to the consequence that ...
Domain relevance of terminology
based on Navigli, R. and Velardi, P. (2004). ''Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites'', Computational Linguistics, page...
Using a POS tagger for lexical knowledge acquisition
based on: Litz, Berenike, Langer, Hagen and Malaka, Rainer (2009). ''Trigrams'n'Tags for Lexical Knowledge Acquisition'', First International Conference on K...
The Pathologies of Big Data
by Jacobs, Adam (2009). The pathologies of big data, Communications of the ACM, ACM, pages 36-44, 52(8) The article demonstrates the importance of a profound...
Open Relation Extraction
[Banko:2008] Banko, Michele and Etzioni, Oren (2008). ''The Tradeoffs Between Open and Traditional Relation Extraction'', Proceedings of ACL-08: HLT, Associa...
Information Retrieval
A Survey of Types of Text Noise and Techniques to Handle Noisy Text
by Subramaniam, L. V., Roy, S., Faruquie, T. A., & Negi, S. (2009). A survey of types of text noise and techniques to handle noisy text. Proceedings of T...
Evaluation and User Preference Study on Spatial Diversity
by Tang and Sanderson (ECIR 2010) This article presents a user study which shows that users prefer search results which are not only (i) relevant but also (i...
Comparing the Sensitivity of Information Retrieval Metrics
by Radlinski and Craswell (SIGIR 2010) This paper compares user-behaviour-based IR metrics with the following standard IR metrics: Precision@k -- the preci...
Using text mining and sentiment analysis for online forum hotspot detection
by Nan Li and Desheng Dash Wu (Decision Support Systems) The authors combine the following five features to detect forum hotspots using either (i) K-means cl...
Domain relevance of terminology
based on Navigli, R. and Velardi, P. (2004). ''Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites'', Computational Linguistics, page...
The Pathologies of Big Data
by Jacobs, Adam (2009). The pathologies of big data, Communications of the ACM, ACM, pages 36-44, 52(8) The article demonstrates the importance of a profound...
Dynamic Taxonomies (FIND Workshop)
The term dynamic taxonomies refers to a multidimensional (multifaceted *check*) classification. Interesting ideas: Searching/Browsing: express s...
Text-based Information Retrieval (T-IR)
A workshop organized by Benno Stein. Content Extraction (from Web pages): Filter lines based on the content-to-tags ratio. a <- ASCII character...
An evaluation dataset for the toponym resolution task
Leidner, Jochen L. (2006). ''An evaluation dataset for the toponym resolution task'', Computers, Environment and Urban Systems, pages 400-417 This paper moti...
Collective Intelligence
by Toby Segaran. An excellent guide to programming Web 2.0 applications, with code examples and excellent explanations of the techniques used. Similarity Metr...
Getting Better Search Results
Human-aided filtering can make the difference (by Bob Zeidman). In this article, Bob shows how human-aided filtering can improve filtering accuracy. At fir...
Journals
Techniques and applications for sentiment analysis
Feldman, R. (2013). Techniques and applications for sentiment analysis. Communications of the ACM, 56(4), 82–89. doi:10.1145/2436256.2436274 The articl...
Enhancing Learning Objects with an Ontology-Based Memory
by Amal Zouaq and Roger Nkambou This paper presents an approach for building Learning Knowledge Objects (LKO) and an ontology learning component for the auto...
Automatising the learning of lexical patterns: An application to the enrichment of WordNet by extracting semantic relationships from Wikipedia
by Maria Ruiz-Casado et al. Ruiz-Casado et al. extend WordNet by extracting hyponym, hyperonym, holonym and meronym relations from the Simple English version...
Automatic hidden-web table interpretation, conceptualization, and semantic annotation
by Cui Tao and David W. Embley This paper presents an approach which solves the problem of hidden-web table interpretations for cases in which sibling pages ...
Automatic Fuzzy Ontology Generation for Semantic Web
by Tho et al. This article describes algorithms to automatically generate fuzzy ontologies. The authors identify concepts by combining fuzzy logic with FCA (=...
Review: Dr. Dobb’s Journal January 2008
The January issue covers the following interesting topics: Tag Clouds Jurgen Appelo provides hints for optimizing the presentation of tag clouds by linearizi...
Keynotes
KDIR Panel Discussion
The Information Butler (Andreas Dengel) Learn from best practices Recommends resources (similar to MISTRAL) Context Identification of context (Eye-tra...
Linking Open Data
Thematic Exploration of Linked Data
by Castano et al.; Very Large Data Search (VLDS) 2011 This article addresses the problem of organizing linked data, which features an inherent flat organizat...
Digital Intuition: Applying Common Sense Using Dimensionality Reduction
Havasi, C., Pustejovsky, J., Speer, R., & Lieberman, H. (2009). Digital Intuition: Applying Common Sense Using Dimensionality Reduction. Intelligent Syst...
Machine Learning
Growing Pains for Deep Learning
Edwards, C. (2015). Growing Pains for Deep Learning. Commun. ACM, 58(7), 14–16. http://doi.org/10.1145/2771283 Summary This article provides...
Digital Intuition: Applying Common Sense Using Dimensionality Reduction
Havasi, C., Pustejovsky, J., Speer, R., & Lieberman, H. (2009). Digital Intuition: Applying Common Sense Using Dimensionality Reduction. Intelligent Syst...
Isanette: A Common and Common Sense Knowledge Base for Opinion Mining
Cambria, E. et al., 2011. Isanette: A Common and Common Sense Knowledge Base for Opinion Mining. In Data Mining Workshops (ICDMW), 2011 IEEE 11th Internation...
Argument-based Machine Learning
Description Standard machine learning takes examples as input in the form of pairs (A, C), where A is an attribute value vector and C the class the example b...
Named Entity Linking
AGDISTIS - Graph-Based Disambiguation of Named Entities Using Linked Data
Summary This paper introduces a graph-based disambiguation approach for named entity linking that achieves higher F-measures than the state of the art and a ...
Ontology Building
Extracting Concepts
This article collects some thoughts on normalizing phrases to concepts. Examples: drive_car <- "drive a car", "you drive your car", "driving cars" and "...
Isanette: A Common and Common Sense Knowledge Base for Opinion Mining
Cambria, E. et al., 2011. Isanette: A Common and Common Sense Knowledge Base for Opinion Mining. In Data Mining Workshops (ICDMW), 2011 IEEE 11th Internation...
Remarks on Ontology Learning and Evaluation
This post contains some random remarks on ontology learning and evaluation: terms versus concepts: concepts are formed by grouping terms with the same meani...
Using Ontological Contexts to Assess the Relevance of Statements in Ontology Evolution
by Fouad Zablith, Mathieu d'Aquin, Marta Sabou, and Enrico Motta This work describes a method for judging the relevance of statements suggested by ontology evol...
Enhancing Learning Objects with an Ontology-Based Memory
by Amal Zouaq and Roger Nkambou This paper presents an approach for building Learning Knowledge Objects (LKO) and an ontology learning component for the auto...
Automatising the learning of lexical patterns: An application to the enrichment of WordNet by extracting semantic relationships from Wikipedia
by Maria Ruiz-Casado et al. Ruiz-Casado et al. extend WordNet by extracting hyponym, hyperonym, holonym and meronym relations from the Simple English version...
Automatic hidden-web table interpretation, conceptualization, and semantic annotation
by Cui Tao and David W. Embley This paper presents an approach which solves the problem of hidden-web table interpretations for cases in which sibling pages ...
Automatic Fuzzy Ontology Generation for Semantic Web
by Tho et al. This article describes algorithms to automatically generate fuzzy ontologies. The authors identify concepts by combining fuzzy logic with FCA (=...
Domain relevance of terminology
based on Navigli, R. and Velardi, P. (2004). ''Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites'', Computational Linguistics, page...
Ontology Evaluation
Remarks on Ontology Learning and Evaluation
This post contains some random remarks on ontology learning and evaluation: terms versus concepts: concepts are formed by grouping terms with the same meani...
Using Ontological Contexts to Assess the Relevance of Statements in Ontology Evolution
by Fouad Zablith, Mathieu d'Aquin, Marta Sabou, and Enrico Motta This work describes a method for judging the relevance of statements suggested by ontology evol...
On How to Perform a Gold Standard Based Evaluation of Ontology Learning
by K. Dellschaft and St. Staab This work provides an excellent overview of ontology evaluation measures, specifies criteria for good measures and introduces ...
Notions of Correctness when Evaluating Protein Name Taggers
by Olsson et al. This paper introduces six notions of correctness to evaluate the performance of protein name taggers: sloppy - which means that the propose...
Building and applying a concept hierarchy representation of a user profile
by Nanas et al. The focus of this paper is on building concept hierarchies and networks describing document repositories rather than on ontology evalu...
Metrics for Evaluation of Ontology-based Information Extraction
by Maynard et al. This article describes metrics for evaluating ontologies. The article covers the following metrics: precision, recall False positives cost...
A conceptual density-based approach for disambiguation of toponyms
by Buscaldi and Rosso This article explores the use of word-sense-disambiguation (WSD) techniques for toponym resolution. The authors explain two algorithms w...
Research venues
Checking Facts
The Web and social media produce massive amounts of data at different levels of quality and trustworthiness. New research focuses on creating methods for che...
Scalability
Borg, Omega and Kubernetes
Burns, B., Grant, B., Oppenheimer, D., Brewer, E., & Wilkes, J. (2016). Borg, Omega, and Kubernetes. Commun. ACM, 59(5), 50-57. https://doi.org/10...
Search Test Stop
Background on STS
The following two articles provide mathematical and statistical background on the search test stop model. The paper by Marcozzi presents approximations of th...
Acquisition and Adoption of New Technology
Information Acquisition and the Adoption of New Technology. *** (Mathematics) by McCardle, 1985 This paper presents a model for making adoption decisions on n...
Continuous Search Queries
Answering bounded continuous search queries in the world wide web by Kukulenz and Ntoulas. This article presents the application of optimal stopping theory t...
Reinforcement Learning
A reinforcement learning approach to dynamic resource allocation ** by David Vengerov Another paper on the application of the concept of 'utility' to compute...
Optimal Stopping Reloaded
On the use of hybrid reinforcement learning for autonomic resource allocation *** by Tesauro et al. This paper elaborates on the use of reinforcement learnin...
STS Literature
We divide the related literature on STS into two categories: (i) literature on optimal stopping and the foundations of the algorithm, and (ii) literatur...
A Value-Driven System for Autonomous Information Gathering
by Grass, J. and Zilberstein, S. Grass and Zilberstein introduce a framework for gathering information by repeatedly selecting queries with the highest marg...
Sentiment Detection
Polarity Shift Detection, Elimination and Ensemble: A Three-Stage Model for Document-Level Sentiment Analysis
Xia, Rui, Feng Xu, Jianfei Yu, Yong Qi, and Erik Cambria. Polarity Shift Detection, Elimination and Ensemble: A Three-Stage Model for Document-Level Sentime...
A Machine-Learning Approach to Negation and Speculation Detection for Sentiment Analysis
Cruz, Noa P., Maite Taboada, and Ruslan Mitkov. A Machine-Learning Approach to Negation and Speculation Detection for Sentiment Analysis. Journal of the As...
Combining Resources to Improve Unsupervised Sentiment Analysis at Aspect-Level
Jiménez-Zafra, S. M., Martín-Valdivia, M. T., Martínez-Cámara, E., & Ureña-López, L. A. (2016). Combining resources to improve unsupervised sentiment ana...
Aspect-Based Sentiment Analysis of Movie Reviews on Discussion Boards
Thet, Tun Thura, Jin-Cheon Na, and Christopher S. G. Khoo. Aspect-Based Sentiment Analysis of Movie Reviews on Discussion Boards. Journal of Information Scie...
Topic-based content and sentiment analysis of Ebola virus on Twitter and in the news
Kim, E. H.-J., Jeong, Y. K., Kim, Y., Kang, K. Y., & Song, M. (2015). Topic-based content and sentiment analysis of Ebola virus on Twitter and in the new...
Opinion Holder and Target Extraction for Verb-based Opinion Predicates - The Problem is Not Solved
Michael Wiegand, Marc Schulder, & Josef Ruppenhofer. (n.d.). Opinion Holder and Target Extraction for Verb-based Opinion Predicates -- The Problem is No...
Experimental evidence of massive-scale emotional contagion through social networks
Kramer, A. D. I., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedi...
Semantic Multi-Dimensional Scaling for Open-Domain Sentiment Analysis
Cambria, E., Song, Y., Wang, H., & Howard, N. (2013). Semantic Multi-Dimensional Scaling for Open-Domain Sentiment Analysis. IEEE Intelligent Systems, 9...
Factors Influencing the Response Rate in Social Question and Answering Behavior
Liu, Z. & Jansen, B.J., 2013. Factors Influencing the Response Rate in Social Question and Answering Behavior. In Proceedings of the 2013 Conference on ...
Techniques and applications for sentiment analysis
Feldman, R. (2013). Techniques and applications for sentiment analysis. Communications of the ACM, 56(4), 82-89. doi:10.1145/2436256.2436274 The articl...
Taking Refuge in Your Personal Sentic Corner
Cambria, E., Hussain, A. & Eckl, C., 2011. Taking Refuge in Your Personal Sentic Corner. In Proceedings of the Workshop on Sentiment Analysis where AI me...
Sentimantics: Lexical Sentiment Polarity Representations with Contextuality
Das, A. & Gambäck, B., 2012. Sentimantics: conceptual spaces for lexical sentiment polarity representation with contextuality. In Proceedi...
Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification
by Melville et al. (KDD 2009) Motivation: Before the rise of Web 2.0, companies published product information and reviews on Web sites ...
Isanette: A Common and Common Sense Knowledge Base for Opinion Mining
Cambria, E. et al., 2011. Isanette: A Common and Common Sense Knowledge Base for Opinion Mining. In Data Mining Workshops (ICDMW), 2011 IEEE 11th Internation...
Generating High-Coverage Semantic Orientation Lexicons from Overtly Marked Words and a Thesaurus
by Saif et al. 1. General: confirms the Pollyanna Hypothesis, which states that people have a preference for using positive words and expressions suggesting th...
The viability of web-derived polarity lexicons
by Velikovich et al. (Google research) This paper describes an approach for semi-automatically generating a sentiment lexicon from seed terms and a Web corpus...
Automatic Creation of a Reference Corpus for Political Opinion Mining in User-Generated Content
by Sarmento et al. (TSA 2009) This article presents an approach which applies manually-crafted lexico-syntactic patterns to collect highly opinionated commen...
What’s Great and What’s Not: Learning to Classify the Scope of Negation for Improved Sentiment Analysis
by Councill et al. - Proceedings of the Workshop on Negation and Speculation in NLP (July, 2010) This paper uses conditional random fields to dete...
Using text mining and sentiment analysis for online forum hotspot detection
by Nan Li and Desheng Dash Wu (Decision Support Systems) The authors combine the following five features to detect forum hotspots using either (i) K-means cl...
Leveraging Sentiment Analysis for Topic Detection
by Cai et al (IBM China Research Lab) *** The authors combine sentiment detection with identifying the terms that are highly correlated to a specific sentime...
Social Networks
The DARPA Twitter Bot Challenge
Subrahmanian, V. S., A. Azaria, S. Durst, V. Kagan, A. Galstyan, K. Lerman, L. Zhu, E. Ferrara, A. Flammini, and F. Menczer. The DARPA Twitter Bot Challenge....
Experimental evidence of massive-scale emotional contagion through social networks
Kramer, A. D. I., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedi...
Social Network Analysis and Mining for Business Applications
Bonchi, F., Castillo, C., Gionis, A., & Jaimes, A. (2011). Social Network Analysis and Mining for Business Applications. ACM Transactions on Intelligent ...
Mining competitor relationships from online news: A network-based approach
Ma, Z., Pant, G. & Sheng, O.R.L., 2011. Mining competitor relationships from online news: A network-based approach. Electronic Commerce Research and Appl...
Uncovering the overlapping community structure of complex networks in nature and society
Palla, G. et al., 2005. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043), p.814. Introd...
Predicting the Future with Social Media
by Asur, S., & Huberman, B. A. (2010). IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) ...
Collective Intelligence
by Toby Segaran An excellent guide to programming Web 2.0 applications, with code examples and excellent explanations of the techniques used. Similarity Metr...
Tracking Website Data-Collection and Privacy Practices with the iWatch Web Crawler
by Carlos Jensen et al. The paper presents an approach for analyzing the privacy practices of Web sites. The iWatch crawler garners information regarding Coo...
Privacy-Enhanced Sharing of Personal Content on The Web
by Mannan, Mohammad and van Oorschot, Paul This article introduces a technique for sharing personal data based on the contact lists of instant messaging (IM)...
Seven Privacy Worries in Ubiquitous Social Computing
Motahari et al. identify seven privacy threats: Inappropriate use by administrators legal obligations: e.g. police inadequate security design invasions (poo...
Information Revelation and Privacy in Online Social Networks (The Facebook case)
by Ralph Gross and Alessandro Acquisti (+) The article provides an excellent literature review on trust and intimacy in online networking and on the partici...
Social Phishing
by Tom N. Jagatic, Nathaniel A. Johnson, Markus Jakobsson, and Filippo Menczer This article covers various aspects of phishing attacks luring users into disc...
Correlating User Profiles from Multiple Folksonomies
by Martin Szomszor, Ivan Cantador, Harith Alani (+) The article focuses on comparing tag-clouds from multiple folksonomies (namely del.icio.us and flickr). A...
Online socialising…
...an article I got from Walter without any reference to the author. Two facts the article mentioned I found particularly interesting: News Corp. was paying a...
What Anyone Can Know
- The Privacy Risks of Social Networking Sites by David Rosenblum This article describes the professional and personal risks, bad privacy and a wrong expecta...
Statistics
It Probably Works
McMullen, T. (2015). It Probably Works. Commun. ACM, 58(11), 50-54. http://doi.org/10.1145/2814332 Introduction This article distinguishes between thre...
Receiver Operating Characteristic Analysis - A Primer
Eng, J., 2005. Receiver Operating Characteristic Analysis: A Primer. Academic Radiology, 12(7), pp.909-916. Introduction This ar...
Talks
Using a POS tagger for lexical knowledge acquisition
based on: Litz, Berenike, Langer, Hagen and Malaka, Rainer (2009). ''Trigrams'n'Tags for Lexical Knowledge Acquisition'', First International Conference on K...
Thesis
Noisy text analytics for sentiment analysis
The valence (polarity; sentiment; semantic orientation) of a document defines whether it has a positive or negative polarity or ...
Automatic Determination of the Relevance of News
Current Web intelligence systems such as the Media Watch on Climate Change (www.ecoresearch.net/climate) analyze extensive data collections and enrich...
Extraction of Ternary Relations from German-Language Texts
Introduction Current Web intelligence systems such as the Media Watch on Climate Change (www.ecoresearch.net/climate) analyze extensive data collections and...
Determining the Valence of Messages in Social Networks
The valence (sentiment; semantic orientation) of a document defines whether it has a positive or negative polarity or coverage ...
Processing Natural-Language Texts from Social Networks
The share of user-generated content has increased sharply as the World Wide Web evolved into the Web 2.0 or Social Web. Addit...
Extraction of Named Entities from German-Language Texts
Introduction Current Web intelligence systems such as the Media Watch on Climate Change (www.ecoresearch.net/climate) analyze extensive data collections and ...
Uncategorized
A Machine-Learning Approach to Negation and Speculation Detection for Sentiment Analysis
Cruz, Noa P., Maite Taboada, and Ruslan Mitkov. A Machine-Learning Approach to Negation and Speculation Detection for Sentiment Analysis. Journal of the As...
Using Word Association to Detect Multitopic Structures in Text Documents
Klahold, A. et al., 2014. Using Word Association to Detect Multitopic Structures in Text Documents. IEEE Intelligent Systems, 29(5), pp.40-46. ...
Near-optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions
Andoni, A. & Indyk, P., 2008. Near-optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions. Communications of the ACM, 51(1), pp....
Natural Language Processing for Health and Social Media
Abbasi, A. et al., 2014. Social Media Analytics for Smart Health. IEEE Intelligent Systems, 29(2), pp.60-80. Summary In this arti...
Mining comparative opinions from customer reviews for Competitive Intelligence
Xu, K., Liao, S. S., Li, J., & Song, Y. (2011). Mining comparative opinions from customer reviews for Competitive Intelligence. Decision Support Systems...
Hazy: Making It Easier to Build and Maintain Big-Data Analytics
Kumar, Arun, Feng Niu, and Christopher Ré. Hazy: Making It Easier to Build and Maintain Big-data Analytics. Communications of the ACM 56, no. 3...
Most Influential Data Mining Algorithms
As ranked by the IEEE International Conference on Data Mining 2006 (ICDM 2006): C4.5, k-means, support vector machines (SVM), Apriori, expectation maximization (...
Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification
by Melville et al. (KDD 2009) Motivation: Before the rise of Web 2.0, companies published product information and reviews on Web sites ...
Word Sense Disambiguation for Automatic Taxonomy Construction from Text-Based Web Corpora
by de Knijff et al. This paper covers a framework that extracts terms from Web corpora, uses word sense disambiguation (WSD) to determine the word's senses,...
Sparse Machine Learning Methods for Understanding Large Text Corpora
by Ghaoui et al. Sparse machine learning methods provide models that are easier to interpret by seeking a trade-off between goodness-of-fit and the sparsity ...
Evaluation in Information Retrieval
Manning, C.D., Raghavan, P. & Schütze, H., 2008. Introduction to Information Retrieval 1st ed., Cambridge University Press. Chapter 8 - Evaluation in inf...
Information flow modeling based on diffusion rate for prediction and ranking
by Song et al. (song2007) Song et al. investigate the information flow in a user network. They try to (i) predict where information flows and (ii) who will m...
Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification
by Melville et al. (IBM Watson Research Centre) ***** Analyzing blog posts raises a number of interesting questions: how to identify the subset of blogs discu...
What makes a helpful online Review?
Mudambi and Schuff This article presents a model for the helpfulness of customer reviews which is verified based on Amazon reviews. The article is quite inte...
KDIR causal knowledge
Automatic identification of quasi-experimental designs for discovering causal knowledge by Jensen et al. Introduction blac...
Context-Aware Systems and Implications for Future Internet
by Nigel Baker et al. The article presents a vision of how context can make network services more personalised and useful. The authors define context awareness ...
Socrates…
...described how his dialectic works. A precondition for being able to talk about 'things' is that we define those things (in contrast to the sophists, who u...
Evolutionary Clustering in Description Logics: Controlling Concept Formation and Drift in Ontologies
by Floriana Esposito et al. Detecting concept drift and new concepts in an ontology by analyzing the superclasses of its individuals.
Utilification
by John Wilkes, Jeffrey Mogul, and Jaap Suermondt This paper elaborates on utilification - the transfer of applications to utility computing. Utility comput...
A Utility Theoretic Approach to Determining Optimal Wait Times in Distributed Information Retrieval
This paper presents an approach for providing optimal access to multiple resources combined into a virtual resource through an information retrieval agent. The...
Designing a Better Shopbot
This paper describes optimizing the design of a shopbot (=shopping robot) which considers the intrinsic value of the product, the disutility from waiting, ...
Why Do Street-Smart People Do Stupid Things Online?
Bratus, Sergey, Masone, Chris and Smith, Sean W. (2008). ''Why Do Street-Smart People Do Stupid Things Online?'', IEEE Security and Privacy, IEEE Computer So...
An evaluation dataset for the toponym resolution task
Leidner, Jochen L. (2006). ''An evaluation dataset for the toponym resolution task'', Computers, Environment and Urban Systems, pages 400-417 This paper moti...
A confidence-based framework for disambiguating geographic terms
Rauch, Erik and Bukatin, Michael and Baker, Kenneth (2003). ''A confidence-based framework for disambiguating geographic terms'', Proceedings of the HLT-NAAC...
Information Diffusion
by Dimitry Zibold The following article summarizes some interesting aspects from Dimitry's research: A Shingle is a contiguous sub-sequence of tokens in a d...
Getting Better Search Results
Human-aided filtering can make the difference (by Bob Zeidman). Bob presents in this article how human-aided filtering can improve filtering accuracy. At fir...
Building, Sharing, and Merging Ontologies
by John F. Sowa An excellent article (available online here) recommended to me by one of the reviewers of our "ontology evolution paper". The paper starts with...
A methodology for constructing of philosophy ontology based on philosophical texts
by Jung-Min Kim et al. The paper presents a philosophy ontology created by the authors. Especially the methodology outlined and the literature review provid...
Record Matching in Digital Library Metadata
Kan, Min-Yen and Tan, Yee Fan: Record Matching in Digital Library Metadata, Communications of the ACM, Volume 51 (2), 91-94 The article provides an excellent ...
Optimization of Relational Preferences Queries
by Bernd Hafenrichter and Werner Kießling. The article provided us with an insight into personal preferences, which are often expressed as wishe...
Data Mining…
Google Tech Talk; Data1 Some interesting stuff we learned in the course
Web Site Success Metrics: Addressing the Duality of Goals
by France Belanger et al. The authors divide web site success metrics into audience (user) specific and goal (company) specific success metrics. ...
Software Architecture for Language Engineering
PhD thesis by Hamish Cunningham The thesis makes the case for GATE - the General Architecture for Text Engineering - a framework propagating code reuse for l...
Text Mining - Predictive Methods for analyzing unstructured information
by Sholom M. Weiss, Nitin Indurkhya, Tong Zhang, Fred J. Damerau. The book gives a very good beginner's introduction to text mining. In the first chapter so...
Web Dynamics edited by Mark Levene and Alexandra Poulovassilis
This book covers the following topics: evolution of web structure and content, presenting papers on a) the size of the web, b) methods for the m...
User Evaluation
Evaluation and User Preference Study on Spatial Diversity
by Tang and Sanderson (ECIR 2010) This article presents a user study which shows that users prefer search results which are not only (i) relevant but also (i...
Judging the spatial relevance of documents for GIR
by Clough and Joho (Advances in Information Retrieval 2006) This article describes a pilot study which assesses both thematic and geographic relevance based...
Comparing the Sensitivity of Information Retrieval Metrics
by Radlinski and Craswell (SIGIR 2010) This paper compares user-behaviour-based IR metrics with the following standard IR metrics: Precision@k -- the preci...
Web Intelligence
Social-media-based public policy informatics: Sentiment and network analyses of U.S. Immigration and border security
Chung, W., & Zeng, D. (2016). Social-media-based public policy informatics: Sentiment and network analyses of U.S. Immigration and border security. Journ...
Topic-based content and sentiment analysis of Ebola virus on Twitter and in the news
Kim, E. H.-J., Jeong, Y. K., Kim, Y., Kang, K. Y., & Song, M. (2015). Topic-based content and sentiment analysis of Ebola virus on Twitter and in the new...
It Probably Works
McMullen, T. (2015). It Probably Works. Commun. ACM, 58(11), 50-54. http://doi.org/10.1145/2814332 Introduction This article distinguishes between thre...
Social Network Analysis and Mining for Business Applications
Bonchi, F., Castillo, C., Gionis, A., & Jaimes, A. (2011). Social Network Analysis and Mining for Business Applications. ACM Transactions on Intelligent ...
Catching a Viral Video
Broxton, Tom, Yannet Interian, Jon Vaver, and Mirjam Wattenhofer. Catching a Viral Video. Journal of Intelligent Information Systems 40, no. 2 (April 1, 201...
Management Support with Structured and Unstructured Data - An Integrated Business Intelligence Framework
Baars, Henning, and Hans-Georg Kemper. Management Support with Structured and Unstructured Data - An Integrated Business Intelligence Framework. Information...
Data Warehousing and Analytics Infrastructure at Facebook
Thusoo, A. et al., 2010. Data warehousing and analytics infrastructure at Facebook. In Proceedings of the 2010 ACM SIGMOD International Conference on Managem...
An Overview of Business Intelligence Technology
Chaudhuri, S., Dayal, U. & Narasayya, V., 2011. An overview of business intelligence technology. Communications of the ACM, 54(8), pp.88-98. This a...
ACM Tech Pack on Business Intelligence and Data Management
Cupoli, P. et al., 2012. ACM Tech Pack on Business Intelligence/Data Management. ACM. Available at: http://techpack.acm.org/bi/. This tech pack contains a hi...
Web Intelligence Applications
This article collects real world use cases of Web and Business Intelligence applications. Use Cases Business Intelligence: companies use their own data sourc...
Discovering company revenue relations from news: A network approach
Ma, Z., Sheng, O.R.L. & Pant, G., 2009. Discovering company revenue relations from news: A network approach. Decision Support Systems, 47(4), pp.408...
Business Intelligence and Analytics: From Big Data to Big Impact
Chen, H., Chiang, R.H.L. & Storey, V.C., 2012. Business Intelligence and Analytics: From Big Data to Big Impact. MIS Quarterly, 36(4), pp.11...
A rule-based method for identifying the factor structure in customer satisfaction
Ahmad, A., Dey, L. & Halawani, S.M., 2012. A rule-based method for identifying the factor structure in customer satisfaction. Inf. Sci., 198, pp.118...
Web 2.0 Environmental Scanning and Adaptive Decision Support for Business Mergers and Acquisitions
Lau, R. et al., 2012. Web 2.0 Environmental Scanning and Adaptive Decision Support for Business Mergers and Acquisitions. Management Information Systems Qua...
Text-Mining the Voice of the People
Evangelopoulos, N., & Visinescu, L. (2012). Text-mining the voice of the people. Communications of the ACM, 55(2), 62. doi:10.1145/2076450.2076467 This a...
Predicting the Future with Social Media
by Asur, S., & Huberman, B. A. (2010). IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) ...
Acquiring Competitive Intelligence from Social Media
by Dey et al. This article discusses (i) methodologies to obtain Web intelligence and (ii) presents case-studies which demonstrate how this information can b...
Envisioning Intelligent Information Technologies through the Prism of Web Intelligence
Zhong et al. (Communications of the ACM) - coined the term Web Intelligence (see literature) This article introduces intelligent Information Technology (iIT)...
Data Mining for Web Intelligence
by Han and Chang This article discusses data mining as a key technology for bringing intelligence and direction to our Web interactions. At first they dis...
Web Services
Improving Performance of Web Service Query Matchmaking with Automated Knowledge Acquisition
by Gupta et al. The article presents a system for simple interfacing with Web services using an HTML-based search engine, mapping the queries to Web service requ...