Evaluation, Best Practices and Collaboration for Multilingual Information Access


Daniela Alderuccio (daniela.alderuccio at currently works in the ICT domain at ENEA (Italian National Agency for New Technologies, Energy and the Environment) in the Research Centre of Casaccia (Rome). She graduated cum laude in Modern Foreign Languages and Literatures at the University of Rome “La Sapienza”, with an interdisciplinary ENEA thesis on semantic information retrieval in multilingual digital libraries (harmonizing Humanities Computing, AI and Computational Linguistics). Her research activities focus on Multilingual Text Processing for Intelligent Access to Multilingual e-Content, using linguistic software packages for content analysis. Her research interests include: Multilingualism and Knowledge Representation (Ontologies) - Cross Language Information Retrieval - Web Archiving - e-Humanities. Application Fields: Digital Libraries - Semantic Web - GRID
Aliaksandr Autayeu (Aliaksandr.Autayeu at is a PhD student of the Doctorate School in Information and Communication Technologies of the University of Trento. His current research interests concern the application of natural language processing techniques to metadata, such as web directory classification labels, business directory classes, picture titles, and others. Aliaksandr Autayeu graduated from the Mechanics and Mathematics Faculty of Belarusian State University with a specialization in Computer Mathematics. His past research interests included applications of wavelets to content-based image retrieval.
Stefano Baccianella (stefano.baccianella at I currently work as a researcher at ISTI-CNR, Pisa, and am a PhD student in Computer Engineering at Università di Pisa. I have a master's degree in Computer Science, and my research topics concern all aspects of the classification of multimedia documents. In particular, my focus is on the classification of textual documents and the analysis of the sentiment they carry. Main research areas include the study of: Text Mining, Text Classification, Sentiment Analysis and Opinion Mining.
Alexandra Balahur-Dobrescu (abalahur at I am a graduate in Computer Science from the "Al. I. Cuza" University in Iasi, Romania. My graduate thesis was on Textual Entailment, on which I worked in 2007 and 2008. In parallel, I started working on multilingual Question Answering and Answer Validation. I am currently a second-year PhD student at the University of Alicante, Spain, where my main research area is sentiment analysis. At the moment (April 2009 - September 2009) I am doing a traineeship with the European Commission's Joint Research Centre in Ispra, Italy, within the Institute for the Protection and Security of the Citizen, where I am applying my doctoral research in sentiment analysis to newswire texts. I speak Romanian, English, German, Spanish, French, and some Portuguese, and I am currently learning Italian.
Guillaume Bernard (gbernard at I am from France and I am currently a second-year PhD student. I am working at the LIMSI-CNRS laboratory, in Orsay, on computational linguistics, in the field of question-answering systems. Although I study all kinds of linguistic phenomena, I specialize in textual implication: in the context of a question-answering system (Ritel), I am trying to identify cases where the answer to a question asked by a user is not stated explicitly in our documents.
Annalina Caputo (acaputo at I am a second-year PhD student in Computer Science at the University of Bari (Italy). My research activity is principally focused on studying new hybrid Information Retrieval models and applying new methods to index document content. My main research area is Information Retrieval (IR). In particular, the ultimate goal of my PhD is to investigate performance improvements in Information Retrieval systems due to the use of Natural Language Processing techniques that are capable of extracting semantic information from documents. Currently, my research concerns the integration of Named Entities in IR systems.
Balint Daroczy (daroczyb at As a PhD student at the Eötvös Loránd University, Budapest, I am currently working on my dissertation in the field of image and text search on the web. Over the last two years I have been working on an image segmentation method, based on three types of image growing methods, for an image ranking system combined with a wide range of classifiers and a cross-media clustering method (bi-clustering). I am interested in the fields of image processing, classification techniques and large-scale image retrieval systems.
Le Thanh Dinh (thanh190882 at I am in the second year of the master's in the European Language and Communication Technology program (LCT). My first year was at the Free University of Bozen-Bolzano, and this second year I am studying at the Charles University of Prague. My current interests are mainly Machine Learning and statistical methods, Information Retrieval, and multilinguality. I am writing my master's thesis on question classification and topic-shift classification using machine learning methods.

Marco Dussin (dussinma at Since February 2006 I have been a member of the research team of the Department of Information Engineering of the University of Padua, where I study. I deal with the study and design of Graphical User Interfaces (GUIs) – in particular Web UIs – for prototypes made by the Information Management Systems research group (IMS). My latest research activity is focused on the interface of the Distributed Information Retrieval Evaluation Campaign Tool (DIRECT), a prototype of a Digital Library System (DLS) able to support the course of an evaluation initiative, and to manage, curate and enrich the scientific data produced during it. DIRECT has been adopted and tested in the Cross-Language Evaluation Forum (CLEF) since 2005. I'm interested in human-computer interaction, web design and information architecture.

Mahmoud El-Haj (melhaj at
Position: Computer Science PhD student.
Organization: University of Essex, UK.
Previous Studies:
MSc, Information Systems, University of Jordan
BSc, Computer Information Systems, University of Jordan
Fields of Interest:
• Arabic Natural Language Processing
• Automatic Text Summarization
• Cross-Lingual Information Retrieval
Research activities:
Member of the Language and Computation (LAC) group at Essex University

Ingrid Falk (ingrid.falk at Current position: Research assistant at INRIA Nancy-Grand Est, Nancy, France and Ph. D. student at Université Nancy 2.
Description: After a degree in Mathematics I worked as a programmer with several academic teams in Nancy involved in language and document engineering. I was principally concerned with modeling, creating and maintaining morpho-syntactic, syntactic and semantic lexical resources for French.
In 2008 I received a master's degree in Computational Linguistics, and since fall 2008 I have worked as a research assistant and Ph.D. student at the INRIA research center in Nancy. My current research interests lie in exploring the interface between (multilingual) lexical information and ontology elements.

Pamela Forner (forner at I graduated in Foreign Languages and Literatures at the University of Trento in 2000. From 2001 to 2006 I worked at ITC-irst doing research in the field of contrastive linguistics, computational lexicography (Italian MultiWordNet), and multilingual corpora. From early 2006 to the present, I have been working at CELCT on the organization of evaluation campaigns, and annotations at different levels.
M. Rami Ghorab (ghorabm at
Current: PhD Candidate at School of Computer Science & Statistics, Trinity College Dublin, Ireland.
Received his MSc in IT from the School of CSIT, University of Nottingham, UK (2003).
Received a diploma in IT from the Java Department, Information Technology Institute, Egypt (2002).
Received his BSc in Computer Science from the CS Department, Modern Academy in Maadi, Egypt (2001).
Held the position of Head of Java Department at the Information Technology Institute, Egypt (2007-2008).
Teaching Assistant at Information Technology Institute, Egypt (2003-2008).
Teaching Assistant at the Modern Academy in Maadi, Egypt (2001-2002).
Research interests: Multilingual Information Retrieval, Adaptive Hypermedia, and User Modeling.
Other research interests include: Peer-to-Peer Networks, Artificial Intelligence, and Data Mining.
Eniko Héja (eheja at I am a junior researcher at the Dept. of Language Technology, Research Institute for Linguistics (HAS) and also a PhD student at the Dept. of Theoretical Linguistics, University of Budapest. I hold MA degrees in philosophy and theoretical linguistics. I am currently working on the automatic generation of bilingual dictionaries. The topic of my thesis is the semantic representation of (Hungarian) verbs from a word sense disambiguation point of view. I am particularly interested in methods that combine symbolic and statistical approaches.

Charlotte Lecluze (charlotte.lecluze at
After completing a Master's degree in Linguistics at the University of Caen (UCBN, France), I started a professional PhD in Computer Science a year ago at Pertimm, Information Management Experts. I'm also a member of the Interaction Semiotics: Language, Diagrams Group (ISLanD) at UCBN. My current research projects are centered on the implementation of a manual methodology for semantic alignment of subsentential segments, in order to improve cross-lingual information retrieval.

Vanessa Lopez Garcia (v.lopez at Vanessa Lopez is a research fellow at the Open University’s Knowledge Media Institute, where she is also a part-time PhD student. Her research interests are in natural-language front ends to query the Semantic Web. Lopez received her MSc in computer engineering from the Technical University of Madrid.
Yashar Mehdad (mehdad at is a PhD student in the ICT (Information and Communication Technologies) International Doctorate School at the University of Trento. His main research activity is Textual Entailment, on which he works with Bernardo Magnini in the HLT (Human Language Technology) group at the Bruno Kessler Foundation (FBK). After his degree in Engineering (Iran, 2000), he studied Information Technology at the University of Malaya in Malaysia, where he earned his master's degree with distinction in 2006. Pursuing his research interests, he obtained a second-level master's in Human Language Technology at the University of Trento in 2008. His main interests, besides Textual Entailment, are Statistical Natural Language Processing, Computational Semantics, Machine Learning, Named Entity Recognition, Wikipedia-based NLP, Multilingual Information Access and Computational Linguistics for the Persian (Farsi) language.

Jinming Min (jmin at ) is a PhD student in the School of Computing at Dublin City University. His current research interests are multilingual query translation with the IBM translation model and expansion of short documents using external resources such as Wikipedia. Jinming Min graduated from the Chinese Academy of Sciences, majoring in Computer Science. His past research interests included cross-language information retrieval, especially the translation of out-of-vocabulary words and translation disambiguation.

Soto Montalvo (soto.montalvo at Assistant Professor in the Computer Science Department at Universidad Rey Juan Carlos (Madrid, Spain) and PhD student at Universidad Rey Juan Carlos.
Previous studies: M. S. in Computer Science and Information Engineering (Universidad Rey Juan Carlos)
B. S. in Computer Science and Information Engineering (Universidad Politécnica, Madrid, Spain)
Primary Research interests: Automatic Multilingual News Clustering, Cognate Identification and Information Extraction

Sergio Navarro (snavarro at After 6 years of experience working in a private software engineering company, two years ago I started my PhD in the Natural Language Processing and Information Systems Group at the University of Alicante, Spain. There my research has focused on multimedia information retrieval, specifically on finding the most suitable ways of fusing multimodal systems and sources in order to achieve systems with better precision and better diversity handling. This summer school is a great opportunity for me to complete my training by expanding my view of the related state-of-the-art technologies.

Alexandre Patry (patryale at I am completing my PhD on the topic of statistical machine translation at the Université de Montréal in Canada. I currently work on a better integration of context in phrase-based machine translation and on parallel document retrieval from bilingual corpora.

Mari-Sanna Paukkeri (mari-sanna.paukkeri at I am a post-graduate researcher at the Adaptive Informatics Research Centre at the Helsinki University of Technology. At the moment I am visiting the School of Informatics at the University of Edinburgh. I work on unsupervised language-independent methods, especially text mining and clustering. I am also interested in social networks and subjective language use. I have conducted studies for eleven European languages.

Alberto Pérez García-Plaza (alpgarcia at I am a Ph.D. candidate and teaching assistant at the Department of Computer Systems and Languages and a member of the UNED group in Natural Language Processing and Information Retrieval. My main research interests are Web Page Clustering, HTML Document Representation, Fuzzy Logic and Self-organizing maps. Currently I am interested in testing my methodology with more than one language. For more detailed information you can take a look at my home page.

Ginevra Peruginelli (peruginelli at is a researcher of the Italian National Research Council. She has a degree in Law (1999) and holds an MA/MSc Diploma in Information Science awarded by the University of Northumbria, Newcastle, UK (2005) and a Ph.D. in Telematics and Information Society from the University of Florence (2009). Since 2000 she has been working at the Institute of Legal Information Theory and Techniques (ITTIG), in Florence. Her main research areas involve: techniques and methods for accessing legal documentation; cross-language legal information retrieval; open access to law. Since 2004 she has been a contract professor of Legal Informatics at the Faculty of Law, University of Perugia. In 2003, she was admitted to the Bar of the Court of Florence as a lawyer. She has been involved in various European projects on implementing multilingual semantic tools (ontology, thesauri) for accessing European legal information (legislation, case law, legal literature). She has published several academic articles and delivered national and international conference contributions on law and legal language documentation, legal standards and knowledge extraction. Recently she has published two independent volumes on multilingual access to law.
Fabien Poulard (fabien.poulard at I'm doing a PhD at Nantes University (France) on content reuse and plagiarism detection. For my master's thesis, I worked on detecting quotations in journalistic articles.
I'm interested in this summer school to extend my knowledge of multilingual approaches. Information retrieval techniques can be particularly useful for filtering content reuse candidates.

Valeria Quochi (vquochi at I currently work as a junior researcher at the CNR-ILC in Pisa. I hold a PhD in Linguistics and a Master's degree in Foreign Languages and Literature. I have worked on the representation and acquisition of lexical semantic information, especially of (semi-)transparent multiword expressions or collocations (complex nominals and light verb constructions). At ILC I have also been involved in projects related to the standardization of lexical resources. Currently, I’m interested in automatic acquisition / extraction / annotation of lexical information and semantic relations, in a cross-lingual perspective, for possible applications in MT.

Alejandro Revuelta Martínez (Alejandro.Revuelta at I studied computer science at Castilla-La Mancha University in Spain. Currently I am a PhD student and part of a research project at the same university.
I have studied statistical machine translation and automatic speech recognition systems, but I am interested in the whole field of natural language processing.
Megan Richardson (megan.richardson at
LIMSI - CNRS, France
Dávid Siklósi (sdavid at
MTA SZTAKI, Budapest, Hungary

Adrian Smales (a.smales at Head of ICT, Natural History Museum, London.
Previous Studies: BEng(Hons) in Electronic and Computer Engineering, Masters in Business Administration, Edinburgh University,
Dissertation: Food Information Logistics
Current Activities:
• Developing IT Technology and Business Strategy for BHL Europe, Chief Technical Architect (BHL WP3), Harmonising data between 28 European institutions.
• Architecting and raising sponsorship funding for “World’s Greenest Data centre” via Dell, IBM, 3PAR, EMC, Intel, Cisco and others. Approximate value: £7M - £10M.
• Developing sustainability business model for BHL Europe, incorporating Storage, Scanning/Digitisation and Data Centre.

Diana Irina Tanase (tanasedi at Diana Irina Tanase is a PhD student at University of Westminster, London, UK. Her research is focused on integrating user context into cross-language information retrieval. Her other projects include development work on the Computational Science Education Reference Desk (an NSDL pathway), and a number of collaboration web tools for Design Interaction, Royal College of Art. Her initial training was received at Ovidius University, Romania (2001), followed by a Master of Science at University of Northern Iowa, USA (2003).

Marco Turchi (marco.turchi at I'm a Research Assistant at the University of Bristol, Department of Engineering Mathematics, Pattern Analysis and Intelligent Systems group. I received my PhD from the University of Siena, Italy, in "Computer Engineering, adaptive information processing". During my PhD studies, I was a visiting student at the University of California Davis, Statistical Department, and an intern at Yahoo Research Lab. Before coming to Bristol two years ago, I had a temporary research position at Xerox Research Centre Europe (XRCE). My current research is centered on the European project SMART, applying machine learning techniques to statistical machine translation problems, and I am also involved in a media analysis project aimed at modelling the mediasphere based on text mining and cross-language analysis techniques.

Ville Turunen (ville.t.turunen at I am a PhD student in Computer Science at Helsinki University of Technology. My main research topic is speech retrieval and I am currently working on an approach for Finnish speech retrieval based on statistically discovered morphemes. I am especially interested in language independent approaches. My previous studies include language technology and machine learning.
Evgenia Vassilakaki (EVGENIA.VASSILAKAKI at is a second-year PhD student in the Dept. of Information & Communications at Manchester Metropolitan University, studying under a three-year, full-time studentship offered by The Information Research Institute. Evgenia's research interests lie in users' image seeking behaviour in multilingual environments. In particular, she is focusing on users' trust and confidence when searching for images across languages. Evgenia is also working as a part-time junior researcher on some of the projects in the research center, CERLIM.
Rita Zaharah Wan-Chik (rita.zaharah at I am originally from Malaysia and started my PhD in Information Studies at the University of Sheffield early this year. I am currently on study leave from teaching at the Universiti Kuala Lumpur, Malaysia. My research interest leans towards multi-language and cross-language web search, particularly information seeking related to faith in general and to Islam and Al Quran in particular.
Taras Zagibalov (taras8055 at I am currently a PhD student at the University of Sussex, UK. I've been doing research in unsupervised sentiment and subjectivity extraction, trying to create an approach that could be domain and language independent. I've been experimenting with English, Chinese and Japanese data, and am also planning to include texts in Russian. I'm quite fluent in English, Chinese (Mandarin) and Russian.
Veronika Zenz (v.zenz at I'm an Information Retrieval Scientist at Matrixware and a member of the Matrixware Research Group. I studied computer science at the Technical University of Vienna, where I received my master's degree in Software Engineering & Internet Computing in 2007. I wrote my master's thesis on music information retrieval, more precisely on chord detection in polyphonic audio. I'm currently helping with the organization of the CLEF-IP track and am also involved in the creation of a digital library for Matrixware and the IRF.
Arkaitz Zubiaga (azubiaga at I'm a PhD student at the Languages and Computing Systems Department of the National University of Distance Education (UNED). I'm a member of the Natural Language Processing and Information Retrieval Group at UNED. My main research interests revolve around social tagging and the knowledge offered by this kind of metadata, especially how to exploit it for data mining and information retrieval tasks. To this end, I would like to learn interesting approaches for working with social networks, which commonly have a multilingual nature.