|
Outline Program
| Fri 14 September |
20:00 | Welcome Cocktail |
| Sat 15 September Auditorium Socrate, Place Cardinal Mercier, Louvain-la-Neuve |
9:30 – 9:45 | Opening Session |
9:45 – 10:45 | Invited Speaker: The Crúbadán Project: Corpus building for under-resourced languages Kevin Scannell,
Saint Louis University, USA |
10:45 – 11:15 | Classifying Web corpora into domain and genre using automatic feature identification Serge Sharoff,
University of Leeds, UK |
| Coffee break |
11:45 – 12:15 | A Human Evaluation of Filtering Functions for Pattern-based Extraction of Arbitrary Relations from the Web Sebastian Blohm & Philipp Cimiano,
University of Karlsruhe, Germany |
12:15 – 12:45 | Identification of Languages and Encodings in a Multilingual Document Anil Kumar Singh & Jagadeesh Gorla,
International Institute of Information Technology, Hyderabad, India |
| Lunch time |
14:30 – 16:00 Cleaneval | Overview, data preparation, scoring, results of Cleaneval Marco Baroni, Francis Chantree, Adam Kilgarriff & Serge Sharoff |
| Coffee break |
16:30 – 18:00 Cleaneval | System descriptions |
| Sun 16 September Auditorium Socrate, Place Cardinal Mercier, Louvain-la-Neuve |
9:00 – 10:00 | Invited Speaker (to be confirmed) or Panel: "A WAC search engine" |
10:00 – 10:30 | CorpEus, a 'web as corpus' tool designed for the agglutinative nature of Basque Igor Leturia & Antton Gurrutxaga,
Elhuyar Foundation; Iñaki Alegria & Aitzol Ezeiza,
University of the Basque Country, Spain |
10:30 – 11:00 | Implementing a BNC-Compare-able Web Corpus William Fletcher,
United States Naval Academy, USA |
| Coffee break |
11:30 – 12:00 | Yet another web crawler Fabrice Issac,
Université Paris 13, France |
12:00 – 12:30 | textBox : a Tool for Written Corpus Linguistic Investigation Emmanuel Cartier,
Université Paris 13, France |
| Lunch time |
14:00 – 15:30 Cleaneval | Panel: Lessons Learned, future Cleanevals |
15:30 – 16:00 | Closing Session |
|
|
Invited speaker: Kevin Scannell
Kevin Scannell, of Saint Louis University, Missouri, USA, has been working with scholars of a range of smaller languages to develop web corpora for those languages.
The project website currently lists 135 corpora/languages.
|
|
Workshop Co-chairs
Prof. Cédrick Fairon, UCLouvain, Cental, fairon@tedm.ucl.ac.be
Prof. Gilles-Maurice de Schryver, Universiteit Gent, gillesmaurice.deschryver@ugent.be
|
|
Last update: August 2007
|
|