Thomas François

Assistant Professor at the Université catholique de Louvain

Applied linguistics, Natural language processing

  

Welcome to my homepage!

I am Thomas François, an Assistant Professor at UCLouvain since September 1, 2018. My work focuses on efficient communication in various contexts (mostly professional).

Previously, I conducted research on readability, text simplification, and automatic complex word identification, all topics concerned with assessing language difficulty for various audiences (learners of a foreign language, readers with language difficulties, readers of specialized texts, etc.) and with different aims.

Brief description of my research path

I first completed a Ph.D. thesis at the Centre for Natural Language Processing (CENTAL, UCLouvain) that investigated the use of NLP technologies to enhance the assessment of text readability for French as a foreign language (FFL). A brief description of my Ph.D. project can be found on this page.

After the Ph.D., I spent a one-year research stay at IRCS (University of Pennsylvania) as a B.A.E.F. and Fulbright Fellow.

I subsequently held several post-doctoral research positions at CENTAL:

  • FNRS post-doc researcher (2015-2018): developing the DMesure project, which aims to provide tools that automatically analyse, in a more detailed manner, the readability of texts intended for learners of L2 French.
  • iMediate (2014-2015): Automatic annotation and classification of medical texts
  • SPORTIC (2010-2013): Generation of automatic audiovisual sport summaries

I was also appointed as an Invited Associate Professor in Applied Linguistics at UCLouvain (2013-2015).

To discover some of my research, have a look at this video.

Achievements and projects

Below are external links to resources and tools developed during the research projects I managed.

- CEFRLex, an international project aiming at developing graded lexicons for various European languages as a foreign language (based on the CEFR scale). It includes:

  • FLELex, a graded lexicon for French as a foreign language (based on the CEFR scale).
  • SVALex, a graded lexicon for Swedish as a foreign language (based on the CEFR scale).
  • SweLLex, a second language learners' productive graded vocabulary (based on the CEFR scale).
  • EFLLex, a graded lexicon for English as a foreign language (based on the CEFR scale).
  • NT2Lex, a graded lexicon for Dutch as a foreign language (based on the CEFR scale).

- AMesure, an on-line readability formula for French administrative texts.

- ReSyf, a lexical resource for French with graded and disambiguated synonyms.

- Dicaupro, a platform providing access to a dictionary of proverbs.

- A bibliography of readability studies, selected from my Ph.D. thesis bibliography.


Grants obtained as principal investigator:

  • FSR Seed Funding (2018-2022): Ph.D. scholarship on the following topic: "Simplification des textes de spécialité : d'une analyse des pratiques de rédaction à une plateforme d'aide à la rédaction claire."


Grants obtained in collaboration with Prof. Cédrick Fairon:

  • ALECTOR (2016-2020): Aide à la LECTure pour améliORer l'accès aux documents pour enfants dyslexiques (ANR project)
  • E-DMesure (2015-2018): Explicit knowledge-based models to predict the readability of L2 texts (Post-doc FNRS)
  • AMesure (2012, 2016, and 2017): Automatic evaluation of administrative texts
  • ALLuSIF (2013-2014): readability analysis and text simplification for French

Research interests

  • Business communication: efficient oral and written communication in professional contexts
  • Psycholinguistics: reading in a first and second language, reading comprehension testing
  • Text readability in L1 and L2
  • Automatic text simplification (ATS)
  • Data Mining: classification techniques (linear and logistic regression, KNN, decision trees, boosting, bagging, random forests, SVM...), clustering, deep learning, ...
  • Computational Linguistics: language modeling, finite-state automata, tagging and syntactic parsing
  • Statistics: descriptive and inferential statistics
  • French as a foreign language (FFL): teaching and didactics
  • iCALL: dialogue systems, automatic complex word identification
  • Programming languages: Python, R, Java, Perl, HTML, PHP, Javascript, SQL

Education

  • [2007 - 2011]

  • PhD, Computational Linguistics
    UCLouvain, Louvain-la-Neuve, Belgium

  • [2005 - 2007]

  • M.A. (Master, 1st year), Computational Linguistics (Summa cum laude)
    UCLouvain, Louvain-la-Neuve, Belgium

  • [2005 - 2006]

  • M.Res. (DES), French as a Second Language (Magna cum laude)
    UCLouvain, Louvain-la-Neuve, Belgium

  • [2002 - 2005]

  • M.A. (Master), Romance Philology (Magna cum laude)
    UCLouvain, Louvain-la-Neuve, Belgium

  • [2002 - 2004]

  • B.A. (DEC), English Philology (Cum laude)
    UCLouvain, Louvain-la-Neuve, Belgium

  • [2000 - 2002]

  • B.A. (Bach), Romance Philology (Cum laude)
    UCLouvain, Louvain-la-Neuve, Belgium

    Publications

    Books and Monographs

  • [2014]

  • François, T. and Bernhard, D. (eds.) Recent Advances in Automatic Readability Assessment and Text Simplification. In International Journal of Applied Linguistics (Special issue), 165:2, John Benjamins.

    Journal papers

  • [2018]

  • Gala, N., François, T., Javourey-Drevet, D., Ziegler, J.C. La simplification de textes, une aide à l'apprentissage de la lecture. In Langue française 199 (3), pp. 123-131, Armand Colin.

  • [2017]

  • Todirascu, A., François, T., Bernhard, D., Gala, N., Ligozat, A.-L., Khobzi, R. Chaînes de référence et lisibilité des textes : Le projet ALLuSIF In Langue française 195 (3), pp. 35-52, Armand Colin.

  • [2015]

  • François, T. When readability meets computational linguistics: a new paradigm in readability. In Revue Française de Linguistique Appliquée, 20(2), 79-97.

  • [2014]

  • François, T. and Bernhard, D. When text readability meets automatic text simplification. In François, T. and Bernhard, D. (eds.). Recent Advances in Automatic Readability Assessment and Text Simplification. John Benjamins, pp. 89-96.

  • [2013]

  • François, T. and Fairon, C. Les apports du TAL à la lisibilité du français langue étrangère. In Traitement Automatique des Langues (TAL), vol. 54(1): 171-202.

  • [2011]

  • François T., La lisibilité computationnelle : un renouveau pour la lisibilité du français langue première et seconde ? In International Journal of Applied Linguistics (ITL), vol. 160, 75-99.

  • [2011]

  • Cougnon L.-A. and François T., Étudier l'écrit SMS. Un objectif du projet sms4science. In Stähli, A. and Dürscheid, C. and Béguelin, M.-J. (eds.). La communication par SMS en Suisse. Usages et variétés linguistiques (Linguistik Online).

  • [2006]

  • Thonet A., Romain F., Rivera, J.-D. and François T., Des possibilités de l'enseignement du FLE en Syrie, août 2005 : compte rendu didactique, In Français 2000, 201-202 : 177-181.

    Peer-reviewed International Conference Proceedings

  • [2018]

  • Billami, M. and François, T. and Gala, N. ReSyf: a French lexicon with ranked synonyms In Proceedings of COLING 2018, August 20-26, Santa Fe, USA, pp. 2570-2581.

  • [2018]

  • Tack, A. and François, T. and Desmet, P. and Fairon, C. NT2Lex: A CEFR-Graded Lexical Resource for Dutch as a Foreign Language Linked to Open Dutch WordNet In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications (NAACL 2018).

  • [2018]

  • Dürlich, L. and François, T. EFLLex: A Graded Lexical Resource for Learners of English as a Foreign Language In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japan, 7-12 May.

  • [2017]

  • Volodina, E., Borin, L., Pilán, I., François, T. and Tack, A. En andraspråksordlista med CEFR-nivåer. In Svenskans beskrivning 35. Eds: Sköldberg, E., Andréasson, M., Adamsson Eryd, H., Lindahl, F., Lindström, S., Prentice, J. and Sandberg, M. Gothenburg. pp. 369-382.

  • [2017]

  • Tack, A. and François, T. and Roekhaut, S. and Fairon, C. Human and Automated CEFR-based Grading of Short Answers In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (EMNLP 2017).

  • [2016]

  • Volodina, E. and Pilán, I. and Llozhi, L. and Degryse, B. and François, T. SweLLex: second language learners' productive vocabulary In Proceedings of the joint 5th NLP4CALL and 1st NLP4LA workshops (SLTC 2016), November 16, Umeå, Sweden.

  • [2016]

  • Todirascu, A. and François, T. and Bernhard, D. and Gala, N. and Ligozat, A.-L. Are Cohesive Features Relevant for Text Readability Evaluation? In Proceedings of COLING 2016, December 13-16, Osaka, Japan.

  • [2016]

  • Lemaire, N. and François, T. and Debongnie, J.-C. and De Meyere, D. and Fauquert, B. and Klein, T. and Fairon, C. and Van Campenhoudt, M. L'enrichissement terminologique d'usage du projet iMediate : une collaboration tripartite terminologie/TAL/sciences de la santé In Actes du Second Congrès international du Réseau de Lexicographie (5-7 octobre 2015), Universidade de Santiago de Compostela. Ibéroamericana.

  • [2016]

  • Müller, A. and François, T. and Roekhaut, S. and Fairon, C. Classification automatique de dictées selon leur niveau de difficulté de compréhension et orthographique In Actes de la 23e Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2016). Paris, France, 4-8 July.

  • [2016]

  • Tack, A. and François, T. and Ligozat, A.-L., and Fairon, C. Modèles adaptatifs pour prédire automatiquement la compétence lexicale d'un apprenant de français langue étrangère In Actes de la 23e Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2016). Paris, France, 4-8 July.
    Award: Best Paper TALN 2016

  • [2016]

  • François, T. and Billami, M.B., and Gala, N. and Bernhard, D. Bleu, contusion, ecchymose : tri automatique de synonymes en fonction de leur difficulté de lecture et compréhension In Actes de la 23e Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2016). Paris, France, 4-8 July.

  • [2016]

  • François, T. and Volodina, E. and Pilán, I. and Tack, A. SVALex: a CEFR-graded lexical resource for Swedish foreign and second language learners In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016). Portoroz, Slovenia, 23-28 May, pp. 213-219.

  • [2016]

  • Brognaux, S. and François, T. and Saerens, M. Combining Manual and Automatic Prosodic Annotation for Expressive Speech Synthesis In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016). Portoroz, Slovenia, 23-28 May, pp. 3872-3879.

  • [2016]

  • Tack, A. and François, T. and Ligozat, A.-L., and Fairon, C. Evaluating Lexical Simplification and Vocabulary Knowledge for Learners of French: Possibilities of Using the FLELex Resource In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016). Portoroz, Slovenia, 23-28 May, pp. 230-236.

  • [2015]

  • De Meyere, D. and Klein, T. and François, T. and Debongnie, J.-C. and Radulescu, C. and Mbengo, N. and Ouro Kouro, M. and Coppieters't Wallant, Y. and Fairon, C. Automatic annotation of medical reports using SNOMED-CT: a flexible approach based on medical knowledge databases In Proceedings of the 7th Language Technology Conference (LTC2015).

  • [2015]

  • Bibauw, S. and François, T. and Desmet, P. Dialogue-based CALL: an overview of existing research In F. Helm, L. Bradley, M. Guarda, and S. Thouësny (Eds), Critical CALL - Proceedings of the 2015 EUROCALL Conference, Padova, Italy (pp. 1-8).

  • [2015]

  • Gala, N. and Billami, M. B. and François, T. and Bernhard, D. Graded lexicons: new resources for educational purposes and much more. In F. Helm, L. Bradley, M. Guarda, and S. Thouësny (Eds), Critical CALL - Proceedings of the 2015 EUROCALL Conference, Padova, Italy (pp. 1-6).

  • [2014]

  • François, T. An analysis of a French as a Foreign language corpus for readability assessment In Proceedings of the 3rd workshop on NLP for Computer-assisted Language Learning, NEALT Proceedings Series 22 / Linköping Electronic Conference Proceedings 107: 13-32.

  • [2014]

  • François, T. and Brouwers, L. and Naets, H. and Fairon, C. AMesure: une formule de lisibilité pour les textes administratifs In Actes de la 21e Conférence sur le Traitement automatique des Langues Naturelles (TALN 2014), Marseille, 467-472.

  • [2014]

  • Gala, N. and François, T. and Bernhard, D. and Fairon, C. Un modèle pour prédire la complexité lexicale et graduer les mots In Actes de la 21e Conférence sur le Traitement automatique des Langues Naturelles (TALN 2014), Marseille, 91-102.

  • [2014]

  • Brouwers, L. and Bernhard, D. and Ligozat, A.-L. and François, T. Syntactic Sentence Simplification for French In the 3rd International Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR 2014). Gothenburg, Sweden, 27 April.

  • [2014]

  • François, T., Gala, N., Watrin, P. and Fairon, C. FLELex: a graded lexical resource for French foreign learners In the 9th International Conference on Language Resources and Evaluation (LREC 2014). Reykjavik, Iceland, 26-31 May.

  • [2014]

  • Pho, V.-M., André, T., Ligozat, A.L., Grau, B., Illouz, G. and François, T. Multiple Choice Question Corpus Analysis for Distractor Characterization In the 9th International Conference on Language Resources and Evaluation (LREC 2014). Reykjavik, Iceland, 26-31 May.

  • [2013]

  • Gala, N., François, T. and Fairon, C. Towards a French lexicon with difficulty measures: NLP helping to bridge the gap between traditional dictionaries and specialized lexicons. In Proceedings of Electronic lexicography in the 21st century: thinking outside the paper (eLEX-2013). Tallinn, Estonia, October 2013.

  • [2013]

  • Todirascu, A. and François, T. and Gala, N. and Fairon, C. and Ligozat, A.-L. and Bernhard, D. Coherence and Cohesion for the Assessment of Text Readability In Proceedings of 10th International Workshop on Natural Language Processing and Cognitive Science (NLPCS 2013), 11-19.

  • [2013]

  • Boubel, N. and François, T. and Naets, H. Automatic extraction of contextual valence shifters In Proceedings of Recent Advances in Natural Language Processing (RANLP 2013).

  • [2012]

  • François, T. and Fairon, C. An "AI readability" formula for French as a foreign language In Proceedings of the 2012 Conference on Empirical Methods in Natural Language Processing (EMNLP 2012), Jeju, 466-477.

  • [2012]

  • François, T. and Miltsakaki, E. Do NLP and machine learning improve traditional readability formulas? In Proceedings of the First Workshop on Predicting and improving text readability for target reader populations (PITR2012), Montréal, June 7, 49-57.

  • [2012]

  • Brouwers, L. and Bernhard, D. and Ligozat, A.-L. and François, T. Simplification syntaxique de phrases pour le français In Actes de la conférence conjointe JEP-TALN-RECITAL 2012, volume 2: TALN, pages 211-224, Montpellier.

  • [2011]

  • François T. and Watrin, P. On the Contribution of MWE-based Features to a Readability Formula for French as a Foreign Language In Proceedings of Recent Advances in Natural Language Processing (RANLP 2011), Hissar, September 14-16, 441-447.

  • [2011]

  • Watrin, P. and François T. An N-gram frequency database reference to handle MWE extraction in NLP applications In Proceedings of the 2011 Workshop on MultiWord Expressions: from Parsing and Generation to the Real World (ACL Workshop), Portland, Oregon, June 23, 2011, 83-91.

  • [2011]

  • François T. and Watrin, P., Quel apport des unités polylexicales dans une formule de lisibilité pour le français langue étrangère ? In Actes de la 18e Conférence sur le Traitement automatique des Langues Naturelles (TALN 2011), Montpellier, vol. 2, 15-20.

  • [2010]

  • Goldman, J.-P., François T., Roekhaut, S. and Simon, A.C. Étude statistique de la durée pausale dans différents styles de parole In Actes des 28èmes journées d'étude sur la parole (JEP), Mons, Belgium, 25-28 May 2010, 161-164.

  • [2010]

  • Cougnon L.-A. and François T., Quelques contributions des statistiques à l'analyse sociolinguistique d'un corpus de SMS In Proceedings of 10th International Conference JADT, 9-11 June 2010, Sapienza University of Rome, volume 1, 619-630.

  • [2009]

  • François T., Modèles statistiques pour l'estimation automatique de la difficulté de textes de FLE, In Rencontre des Etudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2009), Senlis, 24-26/06/2009.

  • [2009]

  • François T., Combining a Statistical Language Model with Logistic Regression to Predict the Lexical and Syntactic Difficulty of Texts for FFL, In Proceedings of the EACL 2009 Student Research Workshop, Athens, 2 April 2009, 19-27 (corrected version).
    Original version.

    Invited talks

  • [2016]

  • Aberdeen, August 29, 2016, Invited lecture to One-Day NLP/NLG Workshop, University of Aberdeen
    Lecture: "Various NLP approaches to assess lexical difficulty for reading"

  • [2015]

  • Mainz, August 27, 2015, Invited lecture to the DRV-Sommerschule 2015 (August 24-28, 2015), Johannes Gutenberg-Universität Mainz
    Lecture: "Informing linguistics with statistics: the case of the SMS corpora"

  • [2015]

  • Aberdeen, July 23, 2015, Invited lecture to the NLG Summer School 2015 (July 20-24, 2015), University of Aberdeen
    Lecture: "Readability: a one-hundred-year-old field still in his teens"

  • [2012]

  • Grenoble, June 04, 2012, Invited talk as recipient of the Thesis Award ATALA 2012, at Conference JEP-TALN-RECITAL 2012, Université Stendhal.
    Talk: "Les apports du traitement automatique du langage à la lisibilité du français langue étrangère"

    Oral communications (in conferences, seminars, workshops, etc.)

  • [2018]

  • Bruges, July 6, 2018, Conference CALL 2018
    Authors: Thomas François, Núria Gala, Elena Volodina, Anaïs Tack, Ildikó Pilán, Luise Dürlich, Patrick Watrin, Piet Desmet and Cédrick Fairon
    Communication : "The CEFRLex project: multilingual CEFRLex graded lexical resources for foreign language learning, teaching and research" (to come)

  • [2018]

  • Bruges, July 5, 2018, Conference CALL 2018
    Authors: Serge Bibauw, Thomas François and Piet Desmet
    Communication : "Insights from a multilevel meta-analysis on the effectiveness of dialogue-based CALL" (to come)

  • [2018]

  • Bruges, July 4, 2018, Conference CALL 2018
    Authors: Anaïs Tack, Thomas François, Piet Desmet and Cédrick Fairon
    Communication : "Making Sense of L2 Lexical Complexity with NT2Lex, a CEFR-graded Lexicon Linked to Open Dutch WordNet" (to come)

  • [2018]

  • Louvain-la-Neuve, May 29, 2018, Formation SMCS (Text Mining)
    Communication : "Pré-traitements des données : quelles informations extraire ?"

  • [2018]

  • Fukuoka, May 16, 2018, Seinan Gakuin University
    Communication : "Le traitement automatique du langage au service de l'apprentissage des langues étrangères : 10 ans de recherches au Cental"

  • [2018]

  • Fukuoka, May 15, 2018, Fukuoka University
    Communication : "The CEFRLex project: multilingual CEFR-graded lexical resources"

  • [2018]

  • Besançon, April 27, 2018, (Seminar of the Lucien Tesnière team).
    Communication : "Introduction aux approches automatisées pour évaluer la difficulté du langage"

  • [2017]

  • Aix-en-Provence, December 15, 2017, LPL (Seminar of the LPL).
    Communication : "Approches automatiques pour l'évaluation de la difficulté du langage" (Podcast de la conférence)

  • [2017]

  • Paris, March 28, 2017, INALCO (Seminar of the ERTIM team).
    Communication : "Des formules de lisibilité à l'analyse de la difficulté"

  • [2017]

  • Leuven, February 10, 2017, Computational Linguistics in the Netherlands 27 (CLIN 2017).
    Authors: Anaïs Tack, Thomas François, Piet Desmet and Cédrick Fairon
    Communication : "Introducing NT2Lex: A Machine-readable CEFR-graded Lexical Resource for Dutch as a Foreign Language"

  • [2016]

  • Darmstadt, September 15, 2016, Technische Universität Darmstadt (Seminar).
    Communication : "NLP methods for assessing lexical difficulty"

  • [2016]

  • Marseille, June 26, 2016, Aix Marseille University (Séminaire du LIF).
    Communication : "Méthodes TAL pour la prédiction automatisée de la difficulté lexicale"

  • [2016]

  • New York, June 15-18, 2016, ICSB World Conference
    Authors: Benoit Gailly, Mahamadou Biga-Diambeidou, Christian Gnekpe, Thomas François, Patrick Watrin, and Françoise de Viron
    Communication : "Using Natural Language Processing in Entrepreneurship Research: Exploring the Perceived Benefits of Business Incubation"

  • [2016]

  • East Lansing, May 10-14, 2016, Presentation at the CALICO Conference 2016, Michigan State University
    Authors: Serge Bibauw, Thomas François, Piet Desmet
    Communication : "Instructional design and natural language processing in dialogue-based CALL"

  • [2016]

  • Fukuoka, January 13, 2016, Fukuoka University
    Communication : "La prédiction automatisée de la difficulté lexicale en FLE"

  • [2015]

  • Lille, December 10, 2015, Lille 3 University (invited lecture)
    Communication : "Apports du TAL à l'ALAO: un exemple concret avec FLELex"

  • [2015]

  • Santiago de Compostela, October 5-7, 2015, 2nd International Congress RELEX (RELEX2015)
    Authors: Nathalie Lemaire, Thomas François, Jean-Claude Debongnie, Damien De Meyere, Benjamin Fauquert, Thierry Klein, Cédrick Fairon, Marc Van Campenhoudt
    Communication : "L'enrichissement terminologique d'usage du projet iMediate: une collaboration tripartite terminologie/TALN/sciences de la santé."

  • [2015]

  • Louvain-la-Neuve, November 6, 2015, Université catholique de Louvain (Séminaire du CENTAL)
    Communication : "La prédiction automatisée de la difficulté lexicale par la combinaison de ressources et de méthodes d'apprentissage automatisé."

  • [2015]

  • Herstmonceux castle, UK, August 13, 2015, Presentation at the ENEL WG3 Meeting
    Authors: Elena Volodina, Ildikó Pilán, Thomas François
    Communication : "Introducing SVALex: a corpus-based lexical resource for second language learning"

  • [2015]

  • Boulder, May 25, 2015, Presentation at the CALICO Conference 2015, University of Colorado
    Authors: Serge Bibauw, Thomas François, Piet Desmet
    Communication : "Conversational agents for language learning: state of the art and avenues for research on task-based agents"

  • [2015]

  • Gothenburg, February 5, 2015, University of Gothenburg
    Communication : "Dmesure and FLELex: two approaches of textual complexity for French as a foreign language"

  • [2015]

  • Fukuoka, January 7, 2015, Fukuoka University
    Communication : "Dmesure et FLELex: deux approches de la difficulté textuelle pour le français langue étrangère"

  • [2014]

  • Leuven, December 17, 2014, Katholieke Universiteit Leuven (KUL) (to come)
    Communication : "Assessing the lexical complexity in French: FLELex and ReSyF"

  • [2014]

  • Tübingen, July 11, 2014, Universität Tübingen (Seminar at SFS).
    Communication : "Challenges for specializing readability formulas: a case study for administrative texts"

  • [2014]

  • Marseille, March 25, 2014, Aix Marseille University (Séminaire du LIF).
    Communication : "La lisibilité computationnelle : limites et défis"

  • [2014]

  • Paris, January 14, 2014, Maison de la Recherche, Université Paris-Sorbonne (journée d'étude).
    Communication (in collaboration with Boubel, N.) : "Étude linguistique des phénomènes de modification de polarité dans le domaine de la fouille d'opinion."

  • [2013]

  • Louvain-la-Neuve, November 15, 2013, Université catholique de Louvain (Séminaire du CORE).
    Communication : "La lisibilité computationnelle : les apports du TAL et de l'apprentissage automatisé à la lisibilité"

  • [2013]

  • Louvain-la-Neuve, October 4, 2013, Université catholique de Louvain (Séminaire du CENTAL).
    Communication : "Le TAL pour l'assistance à la lecture : lisibilité et simplification automatique de textes"

  • [2013]

  • Lille, June 28, 2013, Université Lille 3 (Savoirs, Textes, Langage).
    Communication : "Le TAL et l'assistance à la lecture : lisibilité et simplification automatique de textes"

  • [2013]

  • Leuven, May 3, 2013, Katholieke Universiteit Leuven (KUL).
    Communication : "Computational readability: limitations and challenges"

  • [2012]

  • New York, September 21, 2012, City University of New York (CUNY).
    Communication : "Computational readability: need for a domain-oriented approach?"

  • [2012]

  • Montréal, June 08, 2012, Université du Québec à Montréal (UQAM).
    Communication : "Les apports du TAL à la lisibilité du FLE".

  • [2012]

  • Philadelphia, February 09, 2012, "CLUNCH", University of Pennsylvania.
    Communication : "A readability formula for French as a foreign language".

  • [2011]

  • Montpellier, September 29, 2011, "Séminaires sud4science, n°5", Maison des Sciences de l'Homme de Montpellier.
    Communication : "Une approche statistique des corpus de SMS : outils et défis".

  • [2011]

  • Namur, May 21, 2011, "Journée des doctorants de l'école doctorale en langues et lettres", Facultés Universitaires Notre-Dame de la Paix.
    Communication : "Dmesure : une plateforme internet pour la lisibilité du français langue étrangère".

  • [2011]

  • Louvain-la-Neuve, May 13, 2011, "Séminaires du CENTAL", Université catholique de Louvain.
    Communication : "Une formule de lisibilité computationnelle pour le français langue étrangère ou seconde".

  • [2011]

  • Louvain-la-Neuve, February 28, 2011, "Séminaires de l'IL&C", Université catholique de Louvain.
    Communication : "Dmesure : une plateforme de lisibilité pour le français langue étrangère".

  • [2011]

  • Ghent, February 11, 2011, "Computational Linguistics in the Netherlands" (CLIN21), University College Ghent.
    Communication (in collaboration with Naets, H.) : "Dmesure: a readability platform for French as a foreign language".

  • [2010]

  • Courtrai, November 20, 2010, "Séminaires de l'ITEC", Université KULeuven.
    Communication : "Dmesure: a readability formula for French as a foreign language".

  • [2009]

  • Grenoble, November 27, 2009, "Conférences Industries de la Langue", Université Stendhal.
    Communication : "Lisibilité du français langue étrangère et TAL : une manière de renouveau".

  • [2009]

  • Marne-la-Vallée, May 18, 2009, "Séminaire interne de linguistique de l'IGM", Université de Paris-Est Marne-la-Vallée.
    Communication : "Modèles statistiques pour l'estimation automatique de la difficulté lexicale et syntaxique en FLE".

  • [2008]

  • Brussels, May 17, 2008, "Journée des doctorants de l'école doctorale en langues et lettres", Université libre de Bruxelles (ULB).
    Communication : "Prédire automatiquement la difficulté d'exercices à trous pour des apprenants FLE: une approche TAL".

    Ph.D. Thesis

  • [2011]

  • François T., Les apports du traitement automatique du langage à la lisibilité du français langue étrangère, Ph.D. Thesis, Université Catholique de Louvain. Thesis supervisors: Cédrick Fairon and Anne Catherine Simon.

    Master's Theses

  • [2006]

  • François T., L'apprentissage des pronoms appellatifs qui régissent la rencontre francophone (France ou Belgique) à des Espagnols dans le cadre du cours de FLE, Master's Thesis, Université Catholique de Louvain. Thesis supervisors: Luc Collès and Geneviève Fabry.

  • [2005]

  • François T., La symbolique des couleurs dans "Madame Bovary" et "la Regenta", Master's Thesis, Université Catholique de Louvain. Thesis supervisor: Jean-Claude Polet.

    Professional experience

  • [09/2018 - 08/2022]

  • Faculty of Philosophy, Arts and Letters, UCLouvain, Belgium
    Chargé de cours (Assistant Professor) at UCLouvain

  • [10/2015 - 08/2018]

  • CENTAL, UCLouvain, Belgium
    Chargé de recherche FNRS (Postdoc researcher) at CENTAL

  • [10/2013 - 09/2018]

  • Faculty of Philosophy, Arts and Letters, UCLouvain, Belgium
    Invited Associate Professor

  • [05/2013 - 09/2015]

  • CENTAL, UCLouvain, Belgium
    Computational linguist and computer scientist at CENTAL

  • [12/2012 - 09/2015]

  • CENTAL, UCLouvain, Belgium
    Post-doc researcher at CENTAL

  • [11/2011 - 11/2012]

  • Institute for Research in Cognitive Science, Philadelphia, United States
    B.A.E.F. and Fulbright Postdoc Fellow at University of Pennsylvania

  • [10/2007 - 09/2011]

  • Aspirant F.N.R.S., Louvain-la-Neuve, Belgium
    Ph.D. Student at UCLouvain

  • [09/2006 - 06/2007]

  • Fukuoka University, Fukuoka, Japan
    French as a second language assistant

  • [01/2007 - 06/2007]

  • Institut Franco-japonais du Kyushu, Fukuoka, Japan
    French and Spanish as second languages teacher

  • [10/2005 - 02/2006]

  • Académie des Langues, Marche-en-Famenne, Belgium
    Spanish as a second language teacher

  • [11/2005]

  • Institut Saint-Laurent (High School), Marche-en-Famenne, Belgium
    French as a first language teacher

  • [08/2005 and 07/2006]

  • Bishopric of Hassake, Syria
    French as a second language teacher

  • [07/2005]

  • ASBL Roeland, Virton
    French as a second language teacher

    Scientific activities

    • [2018] : Member of the Programme Committee of the conferences ACL 2018, Coling 2018, NAACL-SRW 2018, TALN 2018, RJC 2018, BEA Workshop 2018 (NAACL 2018), ATA-2018, NLP4CALL 2018.

    • [2018] : Guest reviewer for the LRE journal (Language Resources and Evaluation), Computational Linguistics (CL), TAL journal, Languages, the Journal of Artificial Intelligence Research (JAIR).

    • [2017-2019] : Member of the Editorial Board of ITL (International Journal of Applied Linguistics).

    • [2017] : Reviewer for the French ANR programme and the Swedish Riksbankens Jubileumsfond.

    • [2017] : Member of the Programme Committee of the conferences TALN 2017, RECITAL 2017, AIST'2017, EACL Student Workshop 2017, BEA Workshop 2017 (EMNLP), (Dis)Fluency2017, the MUMTTT 2017 workshop.

    • [2017] : Guest reviewer for ACM - Transactions on Accessible Computing, International Journal of Applied Linguistics (ITL), DoRiF journal.

    • [2016] : Organisation of the Workshop CL4LC 2016 (COLING).

    • [2016] : Member of the Programme Committee of TALN 2016, RECITAL 2016, NAACL 2016 Student Research workshop, BEA Workshop 2016 (NAACL), CORIA 2016, AIST' 2016, CL4LC Workshop, Joint 5th NLP4CALL and 1st NLP4LA, and the CBL day.

    • [2016] : Guest reviewer for various journals or series: International Journal of Applied Linguistics (ITL), Digital Scholarship in the Humanities, IEEE Transactions on Audio, Speech and Language Processing, and Morgan and Claypool (Synthesis Lectures on Human Technologies)

    • [2015] : Member of the Programme Committee of NAACL Student Workshop 2015, RECITAL 2015, NLP4CALL 2015, AIST'2015, the MUMTTT 2015 workshop, and the 10th BEA Workshop 2015 (NAACL 2015)

    • [2015] : Guest reviewer for various journals or series: TAL, TRANEL, NPSS, Morgan and Claypool (Synthesis Lectures on Human Technologies)

    • [2014] : Guest reviewer for the TAL journal in 2014

    • [2014] : Member of the Jury for the Prix de la thèse ATALA 2014 and the Jury for the Best paper Award at RECITAL 2014

    • [2014] : Member of the Programme Committee of the ACL Student Workshop 2014, NLP4CALL 2014, RECITAL 2014, the BEA Workshop 2014 (ACL), the SLPAT Workshop 2014 (ACL), AIST'2014, CEDIL 2014 (Colloque International des Etudiants Chercheurs), the ATS-MA Workshop 2014 (COLING), and PITR 2014 (Workshop EACL)

    • [2014] : Guest Editor of a special issue of the ITL journal on readability and text simplification

    • [2013] : Co-Director of the Prix de la thèse ATALA 2013

    • [2013] : Guest reviewer for the Discours journal and for a chapter of the Cambridge Handbook of Learner Corpus Research.

    • [2013] : Member of the Programme Committee for PITR 2013 (Workshop ACL)

    • [2012] : Member of the Programme Committee for one special issue of Linguisticae Investigationes

    Awards and Fellowships

    Teaching

    Teaching (UCLouvain)

    • LROM 2670 : Les textes économiques et commerciaux en français : genres de discours et questions de terminologie (2018-2019)

    • LROM 2680 : Exercices oraux spécialisés en français (2018-2019)

    • LROM 2660 : Stratégies de communication orale dans l'entreprise : français (2017-2019)

    • LCLIG 2240 : Statistiques linguistiques (2014-2019)

    • LFIAL 2260 : Statistics for Linguistics (2016-2017)

    • LROM 1112 : Introduction aux études de langues et littératures françaises et romanes : courants, concepts et méthodes (2014-2015)

    • LFLTR 1530 : Introduction aux sciences du langage (2013-2014)

    • LCLIG 2250 : Méthodologie de l'analyse de corpus en linguistique (2013-2014)

    • LROM 1221 : Linguistique française I: analyses du français contemporain (2013-2014)

    Involvement in the following lectures (UCLouvain)

    • FLTR 2620 : Traitement automatique du langage naturel (2007-2011; 2012-2013)

    • CLIG 2140 : Séminaire de linguistique computationnelle (2010)

    • CLIG 2240 : Statistiques linguistiques (2008 and 2009)

    Supervision of Ph.D. theses

    • Adeline MÜLLER (2018-2022) in NLP and communication : "Vers une simplification automatique des textes de spécialité".

    • Julien ZHAKIA DOUEIHI (2018-2022) in FFL : "L'impact de la L1 sur l'apprentissage de la L2: une approche explicite pour l'apprentissage des verbes pronominaux français à l'université au Japon".

    • Anaïs TACK (2016-2020) in NLP and CALL: "Personalized modeling of non-native speaker's lexical-semantic knowledge".

    • Serge BIBAUW (2014-2019) in iCALL: "Effectiveness of dialog systems for language learning".

    • Laetitia BROUWERS (2013-2016) in automatic text simplification (ATS).

    Supervision of Master's theses (completed)

    • Jérémy CHELALA (2018). La compression de phrases pour le résumé.

    • Emile BEGUIN (2017). Exploration des segments en périphérie droite en français oral informel.

    • Adeline MÜLLER (2016). Classification automatique de dictées selon leur niveau de difficulté de compréhension et orthographique.

    • Anaïs TACK (2015). Modèles adaptatifs pour évaluer automatiquement la connaissance lexicale d'un apprenant de FLE.

    • Tanguy CHARLIER (2014). L'apprentissage semi-supervisé : une solution au problème du manque de données en lisibilité du français langue étrangère.

    • Thibault ANDRÉ (2014). Génération automatique de distracteurs dans le cadre de questionnaires à choix multiples.

    • Laetitia BROUWERS (2012). Simplification syntaxique de phrases pour le français langue étrangère.

    Contact information

    Thomas François
    Researcher at Cental

    Room C116
    Tel.: +32 (010) 47 37 36
    Fax.: +32 (010) 47 26 06
    thomas d0t francois At uclouvain d0t be

    Center for Natural Language Processing
    University of Louvain
    Place Blaise Pascal, 1, bte L3.03.12
    1348 Louvain-la-Neuve