Update: now working at Vlingo in Cambridge, MA
I am a research associate at the Cambridge University Machine Intelligence Lab, in the Dialogue Systems Group headed by Prof. Steve Young.
I am currently working on the EU FP7 CLASSiC project (Computational Learning in Adaptive Systems for Spoken Conversation), which focuses on statistical methods for data-driven semantic parsing, dialogue management and natural language generation.
I completed my Ph.D. thesis in 2008 under the supervision of Prof. Marilyn Walker, at the Computer Science Department of the University of Sheffield, United Kingdom. I obtained a Master of Engineering and Computer Science in 2004 from the Université Catholique de Louvain in Belgium.
I have been working on statistical methods for natural language understanding, natural language generation and opinion mining. These problems require learning structured prediction models from a large amount of annotated data. I have been especially interested in crowdsourcing for collecting data, in order to model the wide range of speaking styles found in natural language.
Research Interests:
- Opinion mining from text and spoken utterances
- Learning to generate natural language from data
- Robust semantic parsing for spoken utterances
- Paraphrase acquisition from corpora
- Expressive language generation and text-to-speech synthesis
- Learning to detect mood, emotion and personality for user modelling
Journal articles (Google Scholar):
- François Mairesse and Marilyn Walker. Controlling User Perceptions of Linguistic Style: Trainable Generation of Personality Traits [PDF] [Talk]. Computational Linguistics, 37(3), 2011.
- Kai Yu, Heiga Zen, François Mairesse and Steve Young. Context Adaptive Training with
Factorized Decision Trees for HMM based Statistical Parametric
Speech Synthesis [PDF]. Speech Communication, 53(6), pages
914-923, 2011.
- François Mairesse and Marilyn Walker. Towards Personality-Based User Adaptation:
Psychologically Informed Stylistic Language Generation [PDF] [Talk].
User Modeling and User-Adapted Interaction, 20(3), pages 227-278, 2010.
- Steve Young, Milica Gasic, Simon Keizer, François Mairesse, Jost Schatzmann, Blaise Thomson and Kai Yu. The Hidden Information State Model: a practical framework for POMDP-based spoken dialogue management [PDF] [PS]. Computer Speech and Language, 24(2), pages 150-174, 2010.
- François Mairesse, Marilyn Walker, Matthias Mehl and Roger Moore. Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text [PDF] [PS] [BibTeX] [Talk]. Journal of Artificial Intelligence Research (JAIR), 30, pages 457-500, 2007.
- Marilyn Walker, Amanda Stent, François Mairesse and Rashmi Prasad. Individual and Domain Adaptation
in Sentence Planning for Dialogue [PDF] [PS] [BibTeX]. Journal of
Artificial Intelligence Research (JAIR), 30, pages 413-456, 2007.
Peer-reviewed publications at international conferences:
- F. Mairesse, J. Polifroni and G. Di Fabbrizio. Can Prosody Inform Sentiment
Analysis? Experiments on Short Spoken Reviews. In Proceedings
of ICASSP, Kyoto, March 2012.
- J. Polifroni and F. Mairesse. Using Latent Topic Features
for Named Entity Extraction in Search Queries. In Proceedings
of Interspeech, Florence, August 2011.
- F. Jurcicek, S. Keizer, M. Gasic, F. Mairesse, B. Thomson,
K. Yu, and S. Young. Real User
Evaluation of Spoken Dialogue Systems using Amazon Mechanical
Turk. In Proceedings of Interspeech, Florence, August 2011.
- F. Mairesse, M. Gasic, F. Jurcicek, S. Keizer, B. Thomson,
K. Yu and S. Young. Phrase-based Statistical
Language Generation using Graphical Models and Active Learning. In Proceedings of the 48th Annual Meeting of the Association for
Computational Linguistics (ACL), Uppsala, July 2010.
- F. Lefevre, F. Mairesse and
S. Young. Cross-Lingual Spoken Language Understanding from Unaligned Data
using Discriminative Classification Models and Machine
Translation. In Proceedings of Interspeech, Makuhari, September 2010.
- K. Yu, H. Zen, F. Mairesse and S. Young. Context adaptive
training with factorized decision trees for HMM-based speech
synthesis (best paper). In Proceedings of Interspeech, Makuhari, September 2010.
- K. Yu, F. Mairesse and S. Young. Word-level
Emphasis Modelling in HMM-based Speech Synthesis. In Proceedings of
ICASSP, Dallas, 2010.
- F. Mairesse, M. Gasic, F. Jurcicek, S. Keizer, B. Thomson, K. Yu and
S. Young. Spoken Language Understanding from Unaligned Data using
Discriminative Classification Models. In Proceedings of ICASSP, Taipei, 2009.
- S. Keizer, M. Gasic, F. Mairesse, B. Thomson, K. Yu and
S. Young. Modelling user behaviour in the
HIS-POMDP dialogue manager. In Proceedings of SLT, Goa, 2008.
- François Mairesse and Marilyn Walker. Trainable Generation of Big-Five Personality Styles
through Data-driven Parameter Estimation [PDF] [PS]
[BibTeX] [Talk]. In
Proceedings of the 46th Annual Meeting of the Association for Computational
Linguistics (ACL), Columbus, June 2008.
- M. Gasic, S. Keizer, F. Mairesse, J. Schatzmann, B. Thomson, K. Yu and S. Young. Training and Evaluation of the HIS POMDP Dialogue System in Noise. In Proceedings of SIGDial, Columbus, 2008.
- B. Thomson, K. Yu, M. Gasic, S. Keizer, F. Mairesse, J. Schatzmann and
S. Young. Evaluating semantic-level confidence scores with multiple
hypotheses. In Proceedings of Interspeech, Brisbane, 2008.
- B. Thomson, M. Gasic, S. Keizer, F. Mairesse, J. Schatzmann, K. Yu and S. Young. User study of the Bayesian Update of Dialogue State approach to dialogue management. In Proceedings of Interspeech, Brisbane, 2008.
- François Mairesse and Marilyn Walker. A Personality-based Framework for Utterance
Generation in Dialogue Applications [PDF] [PS] [BibTeX]. In Proceedings of the AAAI Spring Symposium on Emotion, Personality,
and Social Behavior, Palo Alto, March 2008.
- François Mairesse and Marilyn Walker. PERSONAGE: Personality Generation for Dialogue [PDF] [PS] [BibTeX] [Talk]. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague, June 2007.
- François Mairesse and Marilyn Walker. Words Mark the Nerds: Computational Models
of Personality Recognition through Language [PDF] [PS] [BibTeX] [Talk]. In
Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006), pages 543-548,
Vancouver, July 2006.
- François Mairesse and Marilyn Walker. Automatic Recognition of Personality in Conversation [PDF] [PS] [BibTeX] [Talk]. In Proceedings of HLT-NAACL 2006, New York City, June 2006.
- Emma Barker, Ryuichiro Higashinaka, François Mairesse, Robert Gaizauskas, Marilyn Walker and
Jonathan Foster. Simulating Cub Reporter Dialogues: The collection of naturalistic
human-human dialogues for information access to text archives [PDF] [PS] [BibTeX]. In Proceedings of the
International Conference on Language Resources and Evaluation (LREC 2006), Genoa, May 2006.
- François Mairesse and Marilyn Walker. Learning to Personalize Spoken Generation for Dialogue Systems [PDF] [PS] [BibTeX]. In Proceedings of Interspeech'2005 - Eurospeech: 9th European Conference on Speech Communication and Technology, pages 1881-1884, Lisbon, September 2005.
- François Mairesse and Marilyn Walker. Generating Individualized
Utterances for Dialogue Systems [PDF] [PS] [BibTeX] [Talk]. In Proceedings of the Symposium on Dialogue Modelling and
Generation (as part of the Annual Meeting of the Society for Text & Discourse), Amsterdam, July 2005.
Theses:
Online demos:
- CamInfo: The Cambridge Tourist Information Dialogue System (requires a microphone)
This Java applet is an interface to our group's live dialogue system, which provides information about most places in Cambridge, including pubs, restaurants, colleges and museums. The system can also be reached by phone on +44 1223 852 453. It implements the HIS (Hidden Information State) framework, i.e. it relies on partially observable Markov decision processes (POMDPs) to reason over multiple hypotheses about the user input, which are provided by the ATK speech recogniser. Some Personage features are used for language generation (e.g., syntactic aggregation, WordNet synonym selection). The speech synthesiser is an HTS voice trained on emphasis-dependent context features using the two-pass context clustering method.
- Personage: Language Generation with Personality
The Personage generator can produce personality-rich utterances for presenting information
in the restaurant domain. You can use the interactive interface to observe how
each utterance varies along the extraversion
dimension. Personage is based on models of the generation parameters computed from human
personality ratings, detailed in this paper. An online
demo is available, and the Java stand-alone generator
can be downloaded here.
- Automatic personality recognition
What does your language reveal about you? The personality recognition models estimate your scores along the five main personality dimensions (the Big Five) based on your input text. The models are detailed in this paper.
Data and software:
Here are various human-annotated datasets and freely available software. Feel free to use and modify them for non-commercial purposes.
- BAGEL training and evaluation data
This contains the 404 semantically aligned utterances used for training
and evaluating the BAGEL statistical language generator, together
with the naturalness and informativeness ratings of 1616 utterances
generated using different learning configurations,
i.e. using active learning and random sampling. More details
in this paper.
- Emphasis-annotated ARCTIC database for speaker AWB
This corpus contains word-level emphasis annotations for the first 597 utterances (set A) of the
ARCTIC speech database, i.e. the words or phrases perceived as the focus of speaker
AWB's utterances.
- The Personage Language Generator is now maintained by the Natural
Language and Dialogue Systems Group at UCSC.
- Personage dataset: a personality-annotated corpus
This dataset contains 580 utterances annotated with personality/stylistic ratings from human judges, for each Big Five trait. The data also includes the generation decisions made for each utterance, as well as the intermediate content plan tree, sentence plan tree and syntactic structures. Naturalness ratings are also included. This data was used for evaluating the Personage generator, as well as for training parameter estimation models (Mairesse & Walker, 2007, 2008). More details in the Personage dataset readme file.
- Personality Recognizer v1.02 (new version 06/06/2007)
This Java command-line application extracts psycholinguistic features from multiple text files and runs the included models to compute personality scores for all Big Five traits. An online demo is also available.
- jMRC - MRC Psycholinguistic Database Java Interface v0.9
This Java interface allows you to query the MRC Psycholinguistic Database from your Java programs, providing psycholinguistic features for over 150,000 words.
Talks:
- The Benefits of Statistical Language Understanding and Generation in Dialogue
  - CSAIL Seminar, MIT, 26/09/2011.
- Crowdsourcing a Statistical Language Generator using Phrase-based Factored Language Models
  - TALC Seminar at LORIA, Nancy, 23/11/2010.
  - 48th Annual Meeting of the Association for Computational Linguistics (ACL), Uppsala, Sweden, 14/07/2010.
- Trainable Generation of Personality through Data-driven Parameter Estimation
  - NLIP Seminar at the Computer Laboratory, Cambridge University, 21/11/2008.
  - 46th Annual Meeting of the Association for Computational Linguistics (ACL), Columbus OH, 16/06/2008.
- Generating Language with Personality
  - SRI International's Artificial Intelligence Center, Menlo Park, 03/04/2008.
  - AAAI Spring Symposium on Emotion, Personality and Social Behavior, Stanford University, 26/03/2008.
  - Psychology Department, University of Texas at Austin, 14/11/2007.
  - Machine Intelligence Lab Seminar, Department of Engineering, Cambridge University, 22/10/2007.
  - Computing Department Seminar at the Open University, Milton Keynes, 20/09/2007.
  - 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague, 26/06/2007.
  - NLP Group Talk, Sheffield, 20/03/2007.
- Computational Models of Personality Recognition through Language
  - 28th Annual Conference of the Cognitive Science Society (CogSci 2006), Vancouver, 29/07/2006.
  - HLT-NAACL 2006 Conference, New York City, 05/06/2006.
  - NLP Group Talk, Sheffield, 16/05/2006.
- Learning Individual Adaptation in Dialogue Systems
  - Symposium on Dialogue Modelling and Generation, Amsterdam, 07/07/2005.
  - NLP Group Talk, Sheffield, 10/05/2005.
Teaching:
François Mairesse, 2011