Categories

Algorithms and Optimizations (1) Applications (2) Articles (17) Big data (17) Book Review (3) Conferences (7) Database (2) Distributed Computing (1) Evaluation Metrics (2) Future Internet (2) Geo (9) Habilitation (1) Information Diffusion (20) Information Extraction (30) Information Retrieval (11) Journals (6) Keynotes (1) Linking Open Data (2) Machine Learning (4) Named Entity Linking (1) Ontology Building (9) Ontology Evaluation (7) Research venues (1) Scalability (1) Search Test Stop (7) Sentiment Detection (20) Social Networks (15) Statistics (2) Talks (1) Thesis (6) Uncategorized (35) User Evaluation (3) Web Intelligence (19) Web Services (1)

Algorithms and Optimizations

Applications

Articles

The Power of Social Media Analytics

2 minute read

Fan, W., & Gordon, M. D. (2014). The power of social media analytics. Communications of the ACM, 57(6), 74-81. doi:10.1145/2602574 Summa...

Social Media Analytics for Smart Health

less than 1 minute read

Abbasi, A., Adjeroh, D., Dredze, M., Paul, M. J., Zahedi, F. M., Zhao, H., Ross, A. (2014). Social Media Analytics for Smart Health. IEEE Intelligent System...

Basic Ontology Data Integration Concepts

less than 1 minute read

Data integration consists of two basic steps: semantic enrichment and mapping discovery. LAV vs. GAV: LAV and GAV describe two approaches for integrat...

What Is Web 2.0

less than 1 minute read

Design Patterns and Business Models for the Next Generation of Software by Tim O'Reilly (09/30/2005). An excellent article elaborating the concepts and idea...

Sharing Knowledge

less than 1 minute read

by Peter Marks, Peter Polak, Scott McCoy and Dennis Galletta. The idea of knowledge management systems (KMS) is unlocking knowledge heretofore only accessible...

Big data

Suffix array

1 minute read

The suffix array is a memory-efficient alternative to the suffix tree which provides a sorted list of string indices indicating the string’s suffixes.
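The definition above can be illustrated with a minimal sketch. It assumes a naive construction that sorts the suffixes directly, which is fine for short strings but not the approach used for large inputs. The function names `build_suffix_array` and `find_occurrences` are hypothetical and chosen only for this illustration.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start indices of all suffixes of text, in lexicographic order.

    Naive construction: sorts the suffixes directly, so it is only
    suitable for short strings.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list[int], pattern: str) -> list[int]:
    """Return all start positions of pattern in text via binary search."""
    m = len(pattern)

    def prefix(k: int) -> str:
        # First m characters of the k-th smallest suffix.
        start = suffix_array[k]
        return text[start:start + m]

    # Lower bound: first suffix whose prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


suffix_array = build_suffix_array("banana")
print(suffix_array)                                     # [5, 3, 1, 0, 4, 2]
print(find_occurrences("banana", suffix_array, "ana"))  # [1, 3]
```

Because the array stores only integer indices into the original string, it needs less memory than a suffix tree while still supporting substring search by binary search.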

40 years of suffix trees

less than 1 minute read

Suffix trees are used in text searching, indexing, statistics. This article describes the history, construction, current developments and applications of suf...

Data sketching

less than 1 minute read

This article introduces three popular data structures that efficiently handle and summarize large data sets.

Rich Data, Poor Fields

less than 1 minute read

This article shows how handheld devices and big data technology may multiply field yields and make farming more environmentally friendly.

The DARPA Twitter Bot Challenge

1 minute read

Subrahmanian, V. S., A. Azaria, S. Durst, V. Kagan, A. Galstyan, K. Lerman, L. Zhu, E. Ferrara, A. Flammini, and F. Menczer. The DARPA Twitter Bot Challenge....

It Probably Works

3 minute read

McMullen, T. (2015). It Probably Works. Commun. ACM, 58(11), 50-54. http://doi.org/10.1145/2814332 Introduction: This article distinguishes between thre...

The New Smart Cities

less than 1 minute read

Mone, G. (2015). The New Smart Cities. Commun. ACM, 58(7), 20-21. http://doi.org/10.1145/2771297 Summary: This article discusses b...

Social Media Analytics for Smart Health

less than 1 minute read

Abbasi, A., Adjeroh, D., Dredze, M., Paul, M. J., Zahedi, F. M., Zhao, H., Ross, A. (2014). Social Media Analytics for Smart Health. IEEE Intelligent System...

Beyond Data and Analysis

1 minute read

Davis, C. K. (2014). Beyond Data and Analysis. Commun. ACM, 57(6), 39-41. doi:10.1145/2602326 Summary: The article identifies competition whi...

Big Data and Its Technical Challenges

1 minute read

Jagadish, H. V., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J. M., Ramakrishnan, R., & Shahabi, C. (2014). Big Data and Its Technical Chal...

Data Science and Prediction

3 minute read

by Dhar, V. (2013). Data science and prediction. Communications of the ACM, 56(12), 64-73. This article provides insights into how data science complem...

Book Review

Collective Intelligence

less than 1 minute read

by Toby Segaran. An excellent guide to programming Web 2.0 applications, with code examples and excellent explanations of the techniques used. Similarity Metr...

Extreme Programming Explained

less than 1 minute read

by Kent Beck and Cynthia Andres. I would like to point out some of the concepts and ideas presented in the book: small improvements: analyse the current proc...

Web 2.0 by Tom Alby

less than 1 minute read

Yesterday I finally got the time to read some of the new literature I ordered, and I decided to start with Tom Alby's "Web 2.0". Relevant information for sema...

Conferences

Conference Thoughts

less than 1 minute read

Some things I found interesting to mention. Educational: traditional measures for association rules: support: P(XY); confidence: P(X|Y); lift ...

Dynamic Taxonomies (FIND Workshop)

1 minute read

The term dynamic taxonomies refers to a multidimensional (multifaceted *check*) classification. Interesting ideas: Searching/Browsing: express s...

Text-based Information Retrieval (T-IR)

1 minute read

A workshop organized by Benno Stein. Content extraction (from Web pages): filter lines based on the content-to-tags ratio. a <- ASCII character...

Database

The Pathologies of Big data

less than 1 minute read

by Jacobs, Adam (2009). The pathologies of big data. Communications of the ACM, 52(8), pages 36-44. The article demonstrates the importance of a profound...

Distributed Computing

Evaluation Metrics

Future Internet

Geo

Spatial and Temporal Information

less than 1 minute read

Based on "Normalizing Spatial Information to Better Combine Criteria in Geographical Information Retrieval" by Palacio et al. (ECIR 2009). There are two types...

Habilitation

Progress in Business Informatics

less than 1 minute read

Based on ideas from the article "Perspektiven der Wirtschaftsinformatik aus Sicht der Informatik" by Matthias Jarke. The relational data model (Codd 19...

Information Diffusion

Catching a Viral Video

2 minute read

Broxton, Tom, Yannet Interian, Jon Vaver, and Mirjam Wattenhofer. Catching a Viral Video. Journal of Intelligent Information Systems 40, no. 2 (April 1, 201...

Finding Text Reuse in the Web

less than 1 minute read

by Michael Bendersky and W. Bruce Croft (WSDM'09). This article discusses an approach for finding three different kinds of text reuse in the web: verbatim co...

Tracking Information Epidemics in Blogspace

less than 1 minute read

by Eytan Adar and Lada A. Adamic (WI2005; adar2005). The authors study the paths along which information spreads in the "blog network". They consider the task...

Information Extraction

Evaluating Entity Linking with Wikipedia

1 minute read

Hachey, B. et al., 2013. Evaluating Entity Linking with Wikipedia. Artificial Intelligence, 194, pp. 130-150. This article compares the performance of t...

Finding Text Reuse in the Web

less than 1 minute read

by Michael Bendersky and W. Bruce Croft (WSDM'09). This article discusses an approach for finding three different kinds of text reuse in the web: verbatim co...

Redundancy-based information extraction

less than 1 minute read

Redundancy-based information extraction exploits the fact that much of the information on the Web is redundant, which leads to the consequences that ...

Domain relevance of terminology

less than 1 minute read

Based on Navigli, R. and Velardi, P. (2004). "Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites", Computational Linguistics, page...

The Pathologies of Big data

less than 1 minute read

by Jacobs, Adam (2009). The pathologies of big data. Communications of the ACM, 52(8), pages 36-44. The article demonstrates the importance of a profound...

Open Relation Extraction

less than 1 minute read

[Banko:2008] Banko, Michele and Etzioni, Oren (2008). "The Tradeoffs Between Open and Traditional Relation Extraction", Proceedings of ACL-08: HLT, Associa...

Information Retrieval

Domain relevance of terminology

less than 1 minute read

Based on Navigli, R. and Velardi, P. (2004). "Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites", Computational Linguistics, page...

The Pathologies of Big data

less than 1 minute read

by Jacobs, Adam (2009). The pathologies of big data. Communications of the ACM, 52(8), pages 36-44. The article demonstrates the importance of a profound...

Dynamic Taxonomies (FIND Workshop)

1 minute read

The term dynamic taxonomies refers to a multidimensional (multifaceted *check*) classification. Interesting ideas: Searching/Browsing: express s...

Text-based Information Retrieval (T-IR)

1 minute read

A workshop organized by Benno Stein. Content extraction (from Web pages): filter lines based on the content-to-tags ratio. a <- ASCII character...

Collective Intelligence

less than 1 minute read

by Toby Segaran. An excellent guide to programming Web 2.0 applications, with code examples and excellent explanations of the techniques used. Similarity Metr...

Getting Better Search Results

less than 1 minute read

Human-aided filtering can make the difference (by Bob Zeidman). In this article, Bob shows how human-aided filtering can improve filtering accuracy. At fir...

Journals

Review: Dr. Dobb’s Journal January 2008

less than 1 minute read

The January issue covers the following interesting topics: Tag Clouds: Jurgen Appelo provides hints for optimizing the presentation of tag clouds by linearizi...

Keynotes

KDIR Panel Discussion

less than 1 minute read

The Information Butler (Andreas Dengel): learn from best practices; recommends resources (similar to MISTRAL). Context: identification of context (Eye-tra...

Linking Open Data

Thematic Exploration of Linked Data

1 minute read

by Castano et al.; Very Large Data Search (VLDS) 2011. This article addresses the problem of organizing linked data, which features an inherent flat organizat...

Machine Learning

Growing Pains for Deep Learning

1 minute read

Edwards, C. (2015). Growing Pains for Deep Learning. Commun. ACM, 58(7), 14-16. http://doi.org/10.1145/2771283 Summary: This article provides...

Argument-based Machine Learning

less than 1 minute read

Description: Standard machine learning takes examples as input in the form of pairs (A, C), where A is an attribute value vector and C the class the example b...

Named Entity Linking

Ontology Building

Extracting Concepts

less than 1 minute read

This article collects some thoughts on normalizing phrases to concepts. Examples: drive_car <- "drive a car", "you drive your car", "driving cars" and "...

Remarks on Ontology Learning and Evaluation

less than 1 minute read

This post contains some random remarks on ontology learning and evaluation: terms versus concepts: concepts are formed by grouping terms with the same meani...

Domain relevance of terminology

less than 1 minute read

Based on Navigli, R. and Velardi, P. (2004). "Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites", Computational Linguistics, page...

Ontology Evaluation

Remarks on Ontology Learning and Evaluation

less than 1 minute read

This post contains some random remarks on ontology learning and evaluation: terms versus concepts: concepts are formed by grouping terms with the same meani...

Research venues

Checking Facts

less than 1 minute read

The Web and social media produce massive amounts of data at different levels of quality and trustworthiness. New research focuses on creating methods for che...

Scalability

Borg, Omega and Kubernetes

less than 1 minute read

Burns, B., Grant, B., Oppenheimer, D., Brewer, E., & Wilkes, J. (2016). Borg, Omega, and Kubernetes. Commun. ACM, 59(5), 50-57. https://doi.org/10...

Search Test Stop

Background on STS

less than 1 minute read

The following two articles provide mathematical and statistical background on the search test stop model. The paper by Marcozzi presents approximations of th...

Acquisition and Adoption of New Technology

less than 1 minute read

Information Acquisition and the Adoption of New Technology. *** (Mathematics) by McCardle, 1985. This paper presents a model for making adoption decisions on n...

Continuous Search Queries

1 minute read

Answering bounded continuous search queries in the World Wide Web by Kukulenz and Ntoulas. This article presents the application of optimal stopping theory t...

Reinforcement Learning

less than 1 minute read

A reinforcement learning approach to dynamic resource allocation ** by David Vengerov. Another paper on the application of the concept of 'utility' to compute...

Optimal Stopping Reloaded

2 minute read

On the use of hybrid reinforcement learning for autonomic resource allocation *** by Tesauro et al. This paper elaborates on the use of reinforcement learnin...

STS Literature

3 minute read

We divide the related literature on STS into two categories: (i) literature on optimal stopping and the foundations of the algorithm, and (ii) literatur...

Sentiment Detection

Social Networks

The DARPA Twitter Bot Challenge

1 minute read

Subrahmanian, V. S., A. Azaria, S. Durst, V. Kagan, A. Galstyan, K. Lerman, L. Zhu, E. Ferrara, A. Flammini, and F. Menczer. The DARPA Twitter Bot Challenge....

Predicting the Future with Social Media

2 minute read

by Asur, S., & Huberman, B. A. (2010). IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) ...

Collective Intelligence

less than 1 minute read

by Toby Segaran. An excellent guide to programming Web 2.0 applications, with code examples and excellent explanations of the techniques used. Similarity Metr...

Social Phishing

less than 1 minute read

by Tom N. Jagatic, Nathaniel A. Johnson, Markus Jakobsson, and Filippo Menczer. This article covers various aspects of phishing attacks luring users into disc...

Online socialising…

less than 1 minute read

...an article I got from Walter without any reference to the author. Two facts mentioned in the article that I found particularly interesting: News Corp. was paying a...

What Anyone Can Know

less than 1 minute read

- The Privacy Risks of Social Networking Sites by David Rosenblum. This article describes the professional and personal risks, bad privacy and a wrong expecta...

Statistics

It Probably Works

3 minute read

McMullen, T. (2015). It Probably Works. Commun. ACM, 58(11), 50-54. http://doi.org/10.1145/2814332 Introduction: This article distinguishes between thre...

Talks

Thesis

Noisy text analytics for sentiment analysis

1 minute read

The valence (polarity; sentiment; semantic orientation) of a document defines whether it has a positive or negative polarity or, respectively, report...

Uncategorized

Most Influential Data Mining Algorithms

less than 1 minute read

As ranked by the IEEE International Conference on Data Mining 2006 (ICDM 2006): C4.5, k-means, support vector machines (SVM), Apriori, expectation maximization (...

Evaluation in Information Retrieval

2 minute read

Manning, C.D., Raghavan, P. & Schütze, H., 2008. Introduction to Information Retrieval 1st ed., Cambridge University Press. Chapter 8 - Evaluation in inf...

What makes a helpful online Review?

less than 1 minute read

by Mudambi and Schuff. This article presents a model for the helpfulness of customer reviews which is verified based on Amazon reviews. The article is quite inte...

KDIR causal knowledge

less than 1 minute read

Automatic identification of quasi-experimental designs for discovering causal knowledge by Jensen et al. Introduction: blac...

Socrates…

less than 1 minute read

...described how his dialectic works. A precondition for being able to talk about 'things' is that we define those things (in contrast to the sophists, who u...

Utilification

less than 1 minute read

by John Wilkes, Jeffrey Mogul, and Jaap Suermondt. This paper elaborates on utilification - the transfer of applications to utility computing. Utility comput...

Designing a Better Shopbot

2 minute read

This paper describes optimizing the design of a shopbot (= shopping robot) which considers the intrinsic value of the product, the disutility from waiting, ...

Information Diffusion

less than 1 minute read

by Dimitry Zibold. The following article summarizes some interesting aspects from Dimitry's research: A shingle is a contiguous sub-sequence of tokens in a d...

Getting Better Search Results

less than 1 minute read

Human-aided filtering can make the difference (by Bob Zeidman). In this article, Bob shows how human-aided filtering can improve filtering accuracy. At fir...

Building, Sharing, and Merging Ontologies

less than 1 minute read

by John F. Sowa. An excellent article (available online here) recommended to me by one of the reviewers of our "ontology evolution paper". The paper starts with...

Record Matching in Digital Library Metadata

less than 1 minute read

Kan, Min-Yen and Tan, Yee Fan: Record Matching in Digital Library Metadata, Communications of the ACM, Volume 51 (2), 91-94. The article provides an excellent ...

Data Mining…

less than 1 minute read

Google Tech Talk; Data1 Some interesting stuff we learned in the course

User Evaluation

Web Intelligence

It Probably Works

3 minute read

McMullen, T. (2015). It Probably Works. Commun. ACM, 58(11), 50-54. http://doi.org/10.1145/2814332 Introduction: This article distinguishes between thre...

Catching a Viral Video

2 minute read

Broxton, Tom, Yannet Interian, Jon Vaver, and Mirjam Wattenhofer. Catching a Viral Video. Journal of Intelligent Information Systems 40, no. 2 (April 1, 201...

Web Intelligence Applications

1 minute read

This article collects real world use cases of Web and Business Intelligence applications. Use Cases Business Intelligence: companies use their own data sourc...

Text-Mining the Voice of the People

1 minute read

Evangelopoulos, N., & Visinescu, L. (2012). Text-mining the voice of the people. Communications of the ACM, 55(2), 62. doi:10.1145/2076450.2076467 This a...

Predicting the Future with Social Media

2 minute read

by Asur, S., & Huberman, B. A. (2010). IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) ...

Data Mining for Web Intelligence

less than 1 minute read

by Han and Chen-Chuan. This article discusses data mining as a key technology for bringing intelligence and direction to our Web interactions. At first they dis...

Web Services