WIQE-10
PhD Colloquium on Web Information and Quality Evaluation
Escuela Técnica Superior de Ingeniería Informática
Monday, September 13th, 2010
09:00-09:45 | Detection of Text Plagiarism and Web Vandalism Benno Stein [slides] |
09:45-10:30 | Using Web n-grams to Help Second-Language Speakers Martin Potthast [slides] |
10:30-11:15 | Detection of Cross-language Text Reuse Alberto Barrón-Cedeño [slides] |
11:15-11:45 | Coffee-break |
11:45-13:00 | Keynote - Entropy and Semantic: A Mathematical Approach to Authorship Attribution, Plagiarism Detection and Key Words Extraction Mirko Degli Esposti [slides] |
13:00-13:45 | Paraphrasing: Potential Applications for Plagiarism Detection Marta Vila [slides] |
13:45-14:15 | Wikipedia Vandalism: A First Attempt Santiago Mola |
14:15-16:00 | Lunch |
16:00-16:45 | The Impact of Toponym Disambiguation in Geographical Information Retrieval and Question Answering Davide Buscaldi |
16:45-17:30 | Making the most of a Web Search Session Matthias Hagen [slides] |
17:30-18:00 | Coffee-break |
18:00-19:00 | Discussion - Web Information and Quality Evaluation: On the Detection of Text Reuse, Plagiarism, Paraphrasing, and Wikipedia Vandalism |
Tuesday, September 14th, 2010
09:30-10:45 | Keynote - Visual Analysis of Unstructured Data Sets Michael Granitzer [slides] |
10:45-11:30 | Automatic Detection of Information Quality Flaws in Wikipedia Articles Maik Anderka [slides] |
11:30-12:00 | Coffee-break |
12:00-12:45 | On Filtering the Web Nedim Lipka [slides] |
12:45-13:30 | Networks, Crowds, and Markets: Reasoning about a Highly Connected World Tim Gollub [slides] |
13:30-14:15 | A General Bio-inspired Method to Improve the Short-text Clustering Task Diego Ingaramo, Marcelo Errecalde, Paolo Rosso [slides] |
14:15-16:00 | Lunch |
16:00-16:45 | Insight into Cluster Labeling Dennis Hoppe [slides] |
16:45-17:15 | A Semantic Role Labeling Application Lidia Moreno, Natividad Prieto |
17:15-18:00 | Feature Associations in Graph Structures for Unsupervised Entity Disambiguation Roman Kern [slides] |
18:00-18:30 | Drug-Drug Interaction Detection: A New Approach Based on Maximal Frequent Sequence Sandra García-Blasco [slides] |
18:30-19:00 | Coffee-break |
19:00-20:00 | Discussion - Web Information and Quality Evaluation: On Clustering and Labelling Information |
Wednesday, September 15th, 2010
09:00-09:45 | Cross-language Text Classification using Structural Correspondence Learning Peter Prettenhofer [slides] |
09:45-10:30 | Figurative Language Processing: Mining Underlying Knowledge from Social Media Antonio Reyes, Paolo Rosso [slides] |
10:30-11:15 | Assessing Information Quality Facets in Blogs and Web Pages Elisabeth Lex [slides] |
11:15-11:45 | Opinion Sharing via Ontology Matching Enrique Vallés |
11:45-12:15 | Coffee-break |
12:15-13:15 | Discussion - Web Information and Quality Evaluation: On the Classification of Objective and Subjective Information |
Mission
WIQE's motivation starts from the observation that today's information and data pools on the Web focus on the quantity of information rather than its quality, a fact observable through the increasing size of the blogosphere, the growing amount of artificially created data, the well-established copy & paste syndrome, and the lack of semantically enriched data. Intentional and unintentional information misuse, such as Wikipedia vandalism, spam blogs (splogs), and plagiarism, further decreases information quality on the Web.
The resulting decentralized, low-quality information leads to several problems:
- Information search requires robust methods for removing low-quality, non-credible information.
- Judging the quality, credibility, and reliability of information remains a manual, labour-intensive task.
- Users can hardly estimate the credibility of virtual persons in order to establish trusted relationships.
- Separating contradictory facts or outdated information from valuable information assets becomes a major challenge for information systems on the Web.
- Information is stored redundantly and is highly scattered across different places.
Overall, the Web today lacks quality-dependent filter mechanisms, automatic identification of misuse patterns, and tools to establish user trust in information and authors. The aim of the workshop is to meet emerging challenges in our information-flooded society by conducting both basic and applied research in the areas of information retrieval, data mining, and knowledge processing. The talks will cover a diverse set of research topics in the respective fields, including document clustering, algorithmic approaches to information quality, plagiarism and text reuse, query formulation, and domain adaptation in natural language processing. The focus will be especially on assessing information quality on the Web: this assessment is an important task because decisions are often based on information from multiple and sometimes unknown sources whose reliability and accuracy are questionable.
Organizing Committee
Paolo Rosso
Alberto Barrón-Cedeño
Natural Language Engineering Lab. - ELiRF
Universidad Politécnica de Valencia, Spain
http://users.dsic.upv.es/grupos/nle
Benno Stein
Web Technology and Information Systems Group
Bauhaus-Universität Weimar, Germany
http://www.webis.de