WAC3 - 2007

Web as Corpus 2007, UCLouvain, Louvain-la-Neuve (Belgium), September 15-16, 2007

Outline Program

Fri 14 September
20:00 Welcome Cocktail

Sat 15 September
Auditorium Socrate, Place Cardinal Mercier, Louvain-la-Neuve
9:30 – 9:45 Opening Session
9:45 – 10:45 Invited Speaker:
The Crúbadán Project: Corpus building for under-resourced languages
Kevin Scannell,
Saint Louis University, USA
10:45 – 11:15 Classifying Web corpora into domain and genre using automatic feature identification
Serge Sharoff,
University of Leeds, UK
Coffee break
11:45 – 12:15 A Human Evaluation of Filtering Functions for Pattern-based Extraction of Arbitrary Relations from the Web
Sebastian Blohm & Philipp Cimiano,
University of Karlsruhe, Germany
12:15 – 12:45 Identification of Languages and Encodings in a Multilingual Document
Anil Kumar Singh & Jagadeesh Gorla,
International Institute of Information Technology, Hyderabad, India
Lunch time
14:30 – 16:00 Overview, data preparation, scoring, results of Cleaneval
Marco Baroni, Francis Chantree, Adam Kilgarriff & Serge Sharoff
Coffee break
16:30 – 18:00 System descriptions

Sun 16 September
Auditorium Socrate, Place Cardinal Mercier, Louvain-la-Neuve
9:00 – 10:00 Invited Speaker (to be confirmed) or Panel: "A WAC search engine"
10:00 – 10:30 CorpEus, a 'web as corpus' tool designed for the agglutinative nature of Basque
Igor Leturia & Antton Gurrutxaga,
Elhuyar Foundation
Iñaki Alegria & Aitzol Ezeiza,
University of the Basque Country, Spain
10:30 – 11:00 Implementing a BNC-Compare-able Web Corpus
William Fletcher,
United States Naval Academy, USA
Coffee break
11:30 – 12:00 Yet another web crawler
Fabrice Issac,
Université Paris 13, France
12:00 – 12:30 textBox: a Tool for Written Corpus Linguistic Investigation
Emmanuel Cartier,
Université Paris 13, France
Lunch time
14:00 – 15:30 Panel: Lessons Learned, future Cleanevals
15:30 – 16:00 Closing Session

Invited speaker: Kevin Scannell

Kevin Scannell, of Saint Louis University, Missouri, USA, has been working with scholars of a range of smaller languages to develop web corpora for those languages; the website currently lists 135 corpora/languages.

Workshop Co-chairs

Prof. Cédrick Fairon, UCLouvain, Cental, fairon@tedm.ucl.ac.be
Prof. Gilles-Maurice de Schryver, Universiteit Gent, gillesmaurice.deschryver@ugent.be

Last update: August 2007