List of our seminars

(reverse chronological order)


DAPA seminar, 8 October 2015, 10:00 a.m.

Contributions to complex data mining: imprecise geographic data and/or massive data

Cyril de Runz (Université de Reims Champagne-Ardenne (Groupe Signal, Image et Connaissance - SIC))


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

The objective is to take veracity into account in the mining, visual or otherwise, of imprecise geographic data and/or massive data. Our work addresses, first, the representativeness of a fuzzy data item within a set; second, the nuancing of classification decisions through the use of color; and third, the large-scale collaboration of approaches to improve the validity of decisions. Our methods rely in particular on the definition of fuzzy temporal indices, graphs, ranks, representativeness measures, data-driven coloring, and so on. To this end, we have also relied, in part, on tools and methods compatible with MapReduce, such as Kohonen maps, and on the distribution of operators in the cloud.
These contributions support a better understanding of complex data (imprecise, massive, spatial, temporal) in the context of their mining and visualization.
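To make the MapReduce remark concrete, here is a minimal, hypothetical Python sketch (not taken from the speaker's work) of how one batch iteration of a Kohonen self-organizing map can be split into a map step that emits per-unit partial sums and a reduce step that aggregates them; all function names and the toy data are invented for illustration.

```python
import numpy as np

def som_map_step(chunk, weights):
    """'Map' phase: for each sample in the chunk, find its best matching unit (BMU)
    and accumulate per-unit partial sums that the reduce phase can aggregate."""
    n_units, dim = weights.shape
    sums = np.zeros((n_units, dim))
    counts = np.zeros(n_units)
    for x in chunk:
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # closest prototype
        sums[bmu] += x
        counts[bmu] += 1
    return sums, counts

def som_reduce_step(partials, neighborhood):
    """'Reduce' phase: aggregate the partial sums from all chunks and recompute each
    prototype as a neighborhood-weighted mean (one batch-SOM iteration)."""
    sums = sum(p[0] for p in partials)
    counts = sum(p[1] for p in partials)
    num = neighborhood @ sums                    # neighborhood[i, j] = influence of unit j on unit i
    den = (neighborhood @ counts)[:, None]       # strictly positive for a Gaussian neighborhood
    return num / den

# Toy usage: two data chunks (the parallelizable part), a 4-unit map laid out on a line.
rng = np.random.default_rng(0)
chunks = [rng.normal(size=(100, 2)), rng.normal(loc=3.0, size=(100, 2))]
weights = rng.normal(size=(4, 2))
grid = np.arange(4)[:, None]
neighborhood = np.exp(-(grid - grid.T) ** 2 / 2.0)
for _ in range(10):
    partials = [som_map_step(c, weights) for c in chunks]   # "map"
    weights = som_reduce_step(partials, neighborhood)       # "reduce"
```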

Bio

After obtaining a research master's degree in artificial intelligence from Université Paul Sabatier (Toulouse, France) in 2005, I defended a PhD in computer science at the Université de Reims Champagne-Ardenne in 2008. Since September 2009 I have been an Associate Professor (Maître de Conférences) at the Université de Reims Champagne-Ardenne, where I carry out my research at CReSTIC. For the 2015-2016 academic year I am on a CNRS research leave (délégation) at LIP6 (LFI team).

My work deals with the management of imperfect spatiotemporal information. In this context, my interests include complex data mining, the analysis and processing of imperfect spatio-temporal data, geographic information systems, visual data exploration and, for the past two years, Big Data issues.

More information about Cyril de Runz: https://sites.google.com/site/cyrilderunz/


DAPA seminar, 28 May 2015, 11:15 a.m.

A Fuzzy Rule-Based Approach to Single Frame Super Resolution

Nikhil Pal (Indian Statistical Institute, Calcutta)


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

High-quality image zooming is an important problem, and the literature is rich with methods for it. Some methods use multiple low resolution (LR) images of the same scene with different sub-pixel shifts as input to generate the high resolution (HR) image, while others use just one LR image to obtain the HR image. In this talk we shall discuss a novel fuzzy rule-based single frame super resolution scheme. This is a patch-based method, where for zooming each LR patch is replaced by an HR patch generated by a Takagi-Sugeno type fuzzy rule-based system. We shall discuss in detail the generation of the training data, the initial generation of the fuzzy rules, the refinement of the rules, and how to use such rules for the generation of super-resolved (SR) images. In this context we shall also discuss a Gaussian Mixture Regression (GMR) model for the same problem. To demonstrate the effectiveness and superiority of the proposed fuzzy rule-based system, we shall compare its performance with that of six methods, including the GMR method, in terms of multiple quality criteria.
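For readers less familiar with Takagi-Sugeno systems, the sketch below (hypothetical Python, not the speaker's code) shows the generic inference step such a rule base performs when mapping a flattened low-resolution patch to a high-resolution one: Gaussian antecedents give each rule a firing strength, and the output is the firing-strength-weighted average of linear consequents. The rule parameters, patch sizes and random initialization are placeholders; learning them from training patches is the actual subject of the talk.

```python
import numpy as np

def takagi_sugeno_predict(x, rules):
    """Generic Takagi-Sugeno inference: each rule has Gaussian antecedents over the
    input features and a linear consequent; the output is the firing-strength-weighted
    average of the rule consequents."""
    strengths, outputs = [], []
    for centers, sigmas, A, b in rules:
        w = np.prod(np.exp(-0.5 * ((x - centers) / sigmas) ** 2))  # firing strength
        strengths.append(w)
        outputs.append(A @ x + b)                                  # consequent: an HR-patch-sized vector
    strengths, outputs = np.asarray(strengths), np.asarray(outputs)
    return strengths @ outputs / (strengths.sum() + 1e-12)

# Toy usage: map a flattened 3x3 LR patch to a flattened 6x6 HR patch with two rules.
rng = np.random.default_rng(1)
lr_patch = rng.random(9)
rules = [(rng.random(9), np.full(9, 0.5), rng.random((36, 9)), rng.random(36))
         for _ in range(2)]
hr_patch = takagi_sugeno_predict(lr_patch, rules).reshape(6, 6)
print(hr_patch.shape)
```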

Bio

Nikhil R. Pal is a Professor in the Electronics and Communication Sciences Unit of the Indian Statistical Institute. His current research interest includes bioinformatics, brain science, fuzzy logic, pattern analysis, neural networks, and evolutionary computation.
He is currently the Vice President for Publications of the IEEE CIS.
He is a Fellow of the National Academy of Sciences, India, a Fellow of the Indian National Academy of Engineering, a Fellow of the Indian National Science Academy, a Fellow of the International Fuzzy Systems Association (IFSA), and a Fellow of the IEEE, USA.
Nikhil R. Pal is an invited professor at UPMC from May 22 to June 20, 2015.

More information about Nikhil Pal: http://www.isical.ac.in/~nikhil/


DAPA seminar, 28 May 2015, 10:00 a.m.

Does it all add up? A study of fuzzy protoform linguistic summarization of time series

Jim Keller (University of Missouri (USA))


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

Producing linguistic summaries of large databases or temporal sequences of measurements is an endeavor that is receiving increased attention. These summaries can be used in a continuous monitoring situation, like eldercare, where it is important to ascertain whether the current summaries represent an abnormal condition. Primarily a human, such as a caregiver in the eldercare example, is the recipient of the set of summaries describing a time range, for example, last night's activities. However, as the number of sensors and monitored conditions grows, sorting through a fairly large number of summaries can be a burden for the person, i.e., the summaries stop being information and become yet one more pile of data. It is therefore necessary to automatically process sets of summaries to condense the data into more manageable chunks.
The first step towards automatically comparing sets of digests is to determine similarity. For fuzzy protoform-based summaries, we developed a natural similarity and proved that the associated dissimilarity is a metric over the space of protoforms. Utilizing that distance measure, we defined and examined several fuzzy set methods to compute dissimilarity between sets of summaries, and most recently utilized these measures to define prototypical behavior over a large number of normal time periods.

In this talk, I will cover the definition of fuzzy protoforms, define our (dis)similarity, outline the proof that it is a metric, discuss the fuzzy aggregation methods for sets of summaries, and show how prototypes are formed and can be used to detect abnormal nights. The talk will be loaded with actual examples from our eldercare research. There is much work to be done and, hopefully, more questions than answers will result from the discussion.
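As a small aid to the definitions, here is a minimal Python sketch of the classical truth evaluation of a simple protoform "Q of the y's are P", using an average of memberships passed through a fuzzy quantifier. The membership functions and the eldercare-style variable are invented for illustration, and the (dis)similarity between summaries discussed in the talk is not reproduced.

```python
import numpy as np

def protoform_truth(values, summarizer, quantifier):
    """Truth value of a protoform 'Q of the y's are P': average the fuzzy membership
    of each observation in the summarizer P, then pass that proportion through the
    fuzzy quantifier Q."""
    memberships = np.array([summarizer(v) for v in values])
    return quantifier(memberships.mean())

# Toy example: "On most nights, restlessness was low." (hypothetical data)
low = lambda v: float(np.clip((10.0 - v) / 10.0, 0.0, 1.0))    # membership of "low"
most = lambda p: float(np.clip((p - 0.3) / 0.5, 0.0, 1.0))      # fuzzy quantifier "most"
restlessness_events_per_night = [2, 4, 1, 12, 3, 5, 2]
print(protoform_truth(restlessness_events_per_night, low, most))
```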

Sponsored by the Computational Intelligence Society under its Distinguished Lecturer Program

Bio

James M. Keller is a Curators Professor in the Electrical and Computer Engineering and Computer Science departments at the University of Missouri as well as R. L. Tatum Professor for the college. Keller’s research interests are in computational intelligence with current applications to eldercare technology, bioinformatics, geospatial intelligence and landmine detection.

James M. Keller is a CIS Distinguished Lecturer.

More information about Jim Keller: http://engineering.missouri.edu/person/kellerj/


DAPA seminar, 12 March 2015, 2:00 p.m.

A bivariate grid for change detection in a labeled stream

Vincent Lemaire (Orange Labs)


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

This talk will present:

  1. The context of concept drift detection in a labeled stream: in predictive analytics and machine learning, concept drift occurs when the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because the predictions become less accurate as time passes. Concept drift is one of the constraints of data stream mining.

  2. An online method for detecting concept change in a labeled stream: it is based on a supervised bivariate criterion that identifies whether or not the data in two windows come from the same distribution. Our method has the advantage of making no assumption about the data distribution or about the type of change, and it can detect changes of different natures (a change in the mean, in the variance, ...). Experiments show that our method is more accurate and robust than the state-of-the-art methods tested. (A generic two-window illustration is sketched below.)
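The sketch below is a deliberately naive two-window illustration in Python (it is not the bivariate supervised criterion of the talk): it simply compares the label distributions of a reference window and of the current window with a chi-square-like statistic.

```python
from collections import Counter

def chi2_label_drift(reference, current):
    """Naive two-window check: compare label frequencies in a reference window and
    in the current window with a chi-square-like statistic (larger = more drift)."""
    labels = set(reference) | set(current)
    ref_counts, cur_counts = Counter(reference), Counter(current)
    n_ref, n_cur = len(reference), len(current)
    stat = 0.0
    for y in labels:
        expected = (ref_counts[y] + cur_counts[y]) * n_cur / (n_ref + n_cur)
        if expected > 0:
            stat += (cur_counts[y] - expected) ** 2 / expected
    return stat

# Toy stream: the class priors change between the two windows.
reference = ["a"] * 80 + ["b"] * 20
current = ["a"] * 45 + ["b"] * 55
print(chi2_label_drift(reference, current))   # a large value flags a possible drift
```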

Bio

Vincent Lemaire is a senior expert in data mining. His research interests are the application of machine learning in various areas for telecommunication companies, with a current main application in data mining for business intelligence. He has developed exploratory data analysis and classification interpretation tools.

More information about Vincent Lemaire: http://www.vincentlemaire-labs.fr/


DAPA seminar, 15 January 2015, 2:00 p.m.

Tensor factorization for multi-relational learning

Raphael Bailly (Heudiasyc, Université de Technologie de Compiègne, France)


Location: room 101, corridor 25-26, 4 place Jussieu, 75005 Paris

Learning relational data has been of growing interest in fields as diverse as the modeling of social networks, the semantic web, or bioinformatics. To some extent, a network can be seen as multi-relational data, where a particular relation represents a particular type of link between entities. It can be modeled as a three-way tensor.

Tensor factorization has proven to be a very efficient way to learn such data. It can be done either in a 3-way factorization style (trigram, e.g. RESCAL) or as a sum of 2-way factorizations (bigram, e.g. TransE). These methods usually achieve state-of-the-art accuracy on benchmarks. However, all these learning methods suffer from regularization schemes that are not always adequate.

We show that both 2-way and 3-way factorization of a relational tensor can be formulated as a simple matrix factorization problem. This class of problems can naturally be relaxed in a convex way. We show that this new method outperforms RESCAL on two benchmarks.
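As a point of reference for the two families mentioned above, here is a small Python sketch of their scoring functions with random parameters: a RESCAL-like trigram score and a TransE-like bigram score. The convex relaxation proposed in the talk is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, dim = 100, 5, 16

# Entity embeddings shared by both models.
E = rng.normal(size=(n_entities, dim))

# 3-way ("trigram") factorization, RESCAL-like: one dense matrix per relation.
W = rng.normal(size=(n_relations, dim, dim))
def score_rescal(s, r, o):
    return E[s] @ W[r] @ E[o]

# 2-way ("bigram") factorization, TransE-like: one translation vector per relation.
R = rng.normal(size=(n_relations, dim))
def score_transe(s, r, o):
    return -np.linalg.norm(E[s] + R[r] - E[o])

# A triple (subject, relation, object) is ranked by its score against corrupted triples.
print(score_rescal(3, 1, 7), score_transe(3, 1, 7))
```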

Bio

R. Bailly has been a postdoc at Heudiasyc, Compiègne, since March 2014. He works with Antoine Bordes and Nicolas Usunier on multi-relational learning and word embeddings. He was previously a postdoc in Barcelona with Xavier Carreras, with whom he worked on spectral methods applied to unsupervised settings.

More information about Raphael Bailly: https://www.hds.utc.fr/~baillyra/


DAPA seminar, 27 November 2014, 10:00 a.m.

The Frank-Wolfe Algorithm: Recent Results and Applications to High-Dimensional Similarity Learning and Distributed Optimization

Aurélien Bellet (Télécom ParisTech)


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

The topic of this talk is the Frank-Wolfe (FW) algorithm, a greedy procedure for minimizing a convex and differentiable function over a compact convex set. FW finds its roots in the 1950's but has recently regained a lot of interest in machine learning and related communities. In the first part of the talk, I will introduce the FW algorithm and review some recent results that motivate its appeal in the context of large-scale learning problems. In the second part, I will describe two applications of FW in my own work: (i) learning a similarity/distance function for sparse high-dimensional data, and (ii) learning sparse combinations of elements that are distributed over a network.
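For illustration, here is a minimal Python sketch of the basic Frank-Wolfe iteration on the probability simplex (where the linear minimization oracle is just a coordinate selection), applied to a toy projection problem; it is a generic textbook version, not the variants used in the applications above.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, n_iters=200):
    """Frank-Wolfe on the probability simplex: at each step the linear minimization
    oracle returns the vertex (coordinate) with the smallest gradient entry, and we
    move towards it with step size 2 / (t + 2). Iterates stay sparse: at most one
    new coordinate is activated per iteration."""
    x = x0.copy()
    for t in range(n_iters):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0          # simplex vertex minimizing <g, s>
        gamma = 2.0 / (t + 2.0)
        x = (1 - gamma) * x + gamma * s
    return x

# Toy problem: project a point b onto the simplex, i.e. minimize 0.5 * ||x - b||^2.
b = np.array([0.1, 0.7, -0.2, 0.5])
grad = lambda x: x - b
x0 = np.full(4, 0.25)
print(frank_wolfe_simplex(grad, x0))
```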

Bio

Aurélien Bellet is currently a postdoc at Télécom ParisTech. Previously, he worked as a postdoc at the University of Southern California and received his Ph.D. from the University of Saint-Etienne in 2012. His main research topic is statistical machine learning, with particular interests in metric/similarity learning and large-scale/distributed learning.

More information about Aurélien Bellet: http://perso.telecom-paristech.fr/~abellet/


DAPA seminar, 13 November 2014, 10:00 a.m.

Computer-Aided Breast Tumor Diagnosis in DCE-MRI Images

Baishali Chaudhury (Department of Computer Science and Engineering, University of South Florida)


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

The overall goal of our project is to quantify tumor heterogeneity with advanced image analysis, in order to provide useful information about tumor biology and unique, valuable insight into patient treatment strategies and prognosis. We introduced a CAD (computer-aided diagnosis) system to characterize breast cancer heterogeneity through spatially-explicit maps using DCE-MRI images. Through quantitative image analysis, we examined the presence of differing tumor habitats defined by initial and delayed contrast patterns within the tumor. The heterogeneity within each habitat was quantified through textural kinetic features at different scales and quantization levels. The functionality of this CAD system was then evaluated by applying it in a multi-objective framework. We will also discuss common problems in breast DCE-MRI analysis (such as an extremely small dataset relative to the number of extracted texture features, and highly imbalanced data) and the data mining techniques applied in our project to deal with them.
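As a hint of what a single texture descriptor looks like, the Python sketch below computes a gray-level co-occurrence contrast for one patch in plain NumPy; it is a generic illustration only, not the textural kinetic features, scales or quantization levels used in the project.

```python
import numpy as np

def glcm_contrast(patch, levels=8, offset=(0, 1)):
    """One classical texture feature: quantize the patch to a few gray levels, build a
    gray-level co-occurrence matrix for a single pixel offset, and return its contrast."""
    scale = patch.max() + 1e-12
    q = np.minimum((patch / scale * levels).astype(int), levels - 1)
    dr, dc = offset
    glcm = np.zeros((levels, levels))
    rows, cols = patch.shape
    for r in range(rows - dr):
        for c in range(cols - dc):
            glcm[q[r, c], q[r + dr, c + dc]] += 1
    glcm /= glcm.sum()
    i, j = np.indices((levels, levels))
    return float(((i - j) ** 2 * glcm).sum())

rng = np.random.default_rng(2)
print(glcm_contrast(rng.random((32, 32))))   # toy patch; real inputs would be DCE-MRI maps
```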

Bio

Baishali Chaudhury is a fourth-year PhD candidate at the University of South Florida, Tampa, USA, currently working on the analysis of DCE-MRI breast tumor images for stratifying patient prognosis. Her broader research interests include computer vision, data mining and machine learning, and sparse data representation.

More information about Baishali Chaudhury: http://baishalichaudhury.wix.com/baishali


DAPA seminar, 30 October 2014, 11:00 a.m.

WaterFowl: a Compact, Self-Indexed RDF Store based on Succinct Data Structures

Olivier Curé (Laboratoire d'informatique Gaspard-Monge, Université Marne-la-Vallée)


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

This talk will start with an introduction to the main strategies for storing and indexing RDF data sets, considering solutions based on a native RDF approach as well as approaches using a relational or NoSQL storage backend. Then, I will present the main features of ongoing work that aims to distribute highly compressed structures adapted to the storage and querying of RDF triples. The compactness of the represented data is supported by an architecture based on Succinct Data Structures (SDS), which makes it possible to store large datasets in main memory. A special form of entity encoding enables inferences in the RDFS entailment regime.
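The toy Python class below illustrates only the two most basic ingredients such stores build on, dictionary-encoding terms as integers and keeping triples in a sorted (S, P, O) order for fast lookups; WaterFowl's actual succinct data structures (bitmaps, wavelet trees) and its inference support are not reproduced here.

```python
from bisect import bisect_left

class TinyTripleStore:
    """Naive illustration: dictionary-encode IRIs/literals as integers and keep the
    triples as one sorted (S, P, O) list, so lookups by subject (or subject+predicate)
    become binary searches."""
    def __init__(self, triples):
        terms = sorted({t for triple in triples for t in triple})
        self.dictionary = {term: i for i, term in enumerate(terms)}
        self.terms = terms
        self.spo = sorted((self.dictionary[s], self.dictionary[p], self.dictionary[o])
                          for s, p, o in triples)

    def objects(self, s, p):
        s_id, p_id = self.dictionary[s], self.dictionary[p]
        i = bisect_left(self.spo, (s_id, p_id, -1))
        out = []
        while i < len(self.spo) and self.spo[i][:2] == (s_id, p_id):
            out.append(self.terms[self.spo[i][2]])
            i += 1
        return out

store = TinyTripleStore([
    (":alice", "foaf:knows", ":bob"),
    (":alice", "foaf:knows", ":carol"),
    (":bob", "rdf:type", "foaf:Person"),
])
print(store.objects(":alice", "foaf:knows"))
```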

DAPA seminar, 2 October 2014, 10:00 a.m.

Subgoal Discovery and Language Learning in Reinforcement Learning Agents

Marie desJardins (Department of Computer Science and Electrical Engineering at the University of Maryland, USA)


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

As intelligent agents and robots become more commonly used, methods to make interaction with the agents more accessible will become increasingly important. In this talk, I will present a system for intelligent agents to learn task descriptions from linguistically annotated demonstrations, using a reinforcement learning framework based on object-oriented Markov decision processes (OO-MDPs). Our framework learns how to ground natural language commands into reward functions, using as input demonstrations of different tasks being carried out in the environment. Because language is grounded to reward functions, rather than being directly tied to the actions that the agent can perform, commands can be high-level and can be carried out autonomously in novel environments. Our approach has been empirically validated in a simulated environment with both expert-created natural language commands and commands gathered from a user study.

I will also describe a related, ongoing project to develop novel option discovery methods for OO-MDP domains. These methods permit agents to identify new subgoals in complex environments that can be transferred to new tasks. We have developed a framework called Portable Multi-policy Option Discovery for Automated Learning (P-MODAL), an approach that extends the PolicyBlocks option discovery approach to OO-MDPs.
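For readers unfamiliar with the options framework, the hypothetical Python sketch below shows the three components an option bundles (initiation set, internal policy, termination condition); the grid-world subgoal is invented and is not an example from P-MODAL.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

State = Dict[str, Any]   # a flattened OO-MDP-style state: object attributes by name

@dataclass
class Option:
    """An option in the Sutton/Precup/Singh sense: where it may start, how it acts,
    and when it terminates. Subgoal discovery amounts to constructing such objects."""
    initiation: Callable[[State], bool]      # I: states where the option can be invoked
    policy: Callable[[State], str]           # pi: action to take while the option runs
    termination: Callable[[State], float]    # beta: probability of stopping in a state

# Hypothetical subgoal "open the door" in a toy grid-like domain.
open_door = Option(
    initiation=lambda s: not s["door_open"],
    policy=lambda s: "move_north" if s["agent_y"] < s["door_y"] else "open_door",
    termination=lambda s: 1.0 if s["door_open"] else 0.0,
)
print(open_door.policy({"agent_y": 1, "door_y": 3, "door_open": False}))
```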

This work is collaborative research with Dr. Michael Littman and Dr. James MacGlashan of Brown University, and Dr. Smaranda Muresan of Columbia University. A number of UMBC students have contributed to the project: Shawn Squire, Nicholay Topin, Nick Haltemeyer, Tenji Tembo, Michael Bishoff, Rose Carignan, and Nathaniel Lam.

Bio


Dr. Marie desJardins is a Professor in the Department of Computer Science and Electrical Engineering at the University of Maryland, Baltimore County, where she has been a member of the faculty since 2001. She is a 2013-14 American Council of Education Fellow, the 2014-17 UMBC Presidential Teaching Professor, and an inaugural Hrabowski Academic Innovation Fellow. Her research is in artificial intelligence, focusing on the areas of machine learning, multi-agent systems, planning, interactive AI techniques, information management, reasoning with uncertainty, and decision theory. Current research projects include learning in the context of planning and decision making, analyzing and visualizing uncertainty in machine learning, trust modeling in multiagent systems, and computer science education.

Dr. desJardins has published over 120 scientific papers in journals, conferences, and workshops. She is an Associate Editor of the Journal of Artificial Intelligence Research, is a member of the editorial board of AI Magazine, and was the Program Cochair for AAAI-13. She has previously served as AAAI Liaison to the Board of Directors of the Computing Research Association, Vice-Chair of ACM's SIGART, and AAAI Councillor. She is an ACM Distinguished Member, is a AAAI Senior Member, holds an appointment at the University of Maryland Institute for Advanced Studies, is a member and former chair of UMBC's Honors College Advisory Board, is the former chair of UMBC's Faculty Affairs Committee, and serves on the advisory board of UMBC's Center for Women in Technology.

More information about Marie desJardins: http://www.csee.umbc.edu/~mariedj/


DAPA seminar, 11 September 2014, 2:00 p.m.

Clustering-based Models from Model-based Clustering

Mika Sato-Ilic (Faculty of Engineering, Information and Systems, University of Tsukuba)


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

Recent advances in information science have enabled the collection of multi-source and complex data in vast amounts, and data analysis is increasingly tasked with dealing with such data. Clustering is one type of data analysis used to detect and characterize the latent structure of data by classifying objects based on similarities among them. Model-based clustering is a framework of clustering methods whose main idea is to assume a model for the data and, by fitting the model to the data, estimate an adjusted partition. Although this approach has the benefit of obtaining a clear solution based on mathematical theory, we cannot avoid the risk that the assumed model does not fit the latent classification structure of the data. We therefore propose a framework called clustering-based models, in which we exploit the obtained clustering result as a scale of the latent structure of the data, apply it to the observed data, and then feed the modified data to a model in order to obtain a more accurate result. In this talk, several methods in this framework will be introduced together with several applications.
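One very simple instantiation of this "clustering result as a scale" idea, not taken from the talk, is sketched below in Python: soft membership-like weights derived from k-means distances are appended to the data before fitting an ordinary regression model.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=300)

# Step 1: obtain a clustering result and turn it into a "scale" for the data,
# here soft membership-like weights derived from distances to the k-means centers.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
d = km.transform(X)                          # distances to the 3 centers
u = (1.0 / (d + 1e-9)) ** 2
u /= u.sum(axis=1, keepdims=True)            # rows sum to 1

# Step 2: feed the cluster-informed representation to an ordinary model.
X_scaled = np.hstack([X, u])
model = LinearRegression().fit(X_scaled, y)
print(model.score(X_scaled, y))
```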

DAPA seminar, 3 July 2014, 10:00 a.m.

Clustering of temporal data, with an application to the analysis of social media data

Julien Velcin


ERIC laboratory, Université Lyon 2

Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

Graphical models have become very popular for tackling automatic classification problems. In this talk, I will present work recently carried out at the ERIC laboratory on two different unsupervised classification problems. The first problem we tackled draws on probabilistic topic models to jointly capture the evolution of the topics and of the opinions expressed in a text corpus. The second problem consists in adapting mixture models in order to capture the dynamics of categories. The models presented will be illustrated on real data from social media. I will also give an overview of their application, within the ImagiWeb project, which consists in extracting and tracking the image of entities on the Web.

DAPA seminar, 1 July 2014, 5:00 p.m.

PLUIE (Probability and Logic Unified for Information Extraction): Interim Report

Stuart Russell (University of California, Berkeley)
Ole Torp Lassen (LIP6, UPMC)
Wei Wang (LIP6, UPMC)


Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

The goal of the PLUIE project is to investigate an old approach to understanding language: the idea that declarative text expresses information about the world. This idea is captured in the form of a probability model that describes how sentences are generated from worlds. A very simple model of this kind exhibits a number of interesting properties including robust bootstrap inferences and relation discovery. The talk will summarize the approach and cover two specific subproblems: efficient split-merge MCMC inference in an entity-mention model and flexible mention grammars for named entities.

S. Russell is supported by, and this talk is given under the auspices of, the Blaise Pascal International Research Chair, funded by the French State and the Région Île-de-France and administered by the Fondation de l'École Normale Supérieure.


DAPA seminar, 5 June 2014, 2:00 p.m.

Overlapping clustering with revisited k-means

Guillaume Cleuziou


IUT Informatique d'Orléans

Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

Overlapping clustering consists in letting classes of similar individuals emerge from a dataset while allowing each individual to belong fully to several classes. In this talk we will show why this rather unconventional type of structure is essential in many application domains, and why overlapping clustering is a research problem in its own right. The well-known k-means partitioning algorithm will then serve as a basis to introduce different ways of modeling cluster overlaps, the associated strategies for exploring the solution space, and finally possible extensions towards kernel methods.
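As a naive illustration of what an overlapping assignment looks like (this is not the OKM-style model discussed in the talk), the Python sketch below runs standard k-means and then lets each point belong to every cluster whose center is almost as close as its nearest one.

```python
import numpy as np
from sklearn.cluster import KMeans

def overlapping_assignments(X, n_clusters=3, slack=1.3, random_state=0):
    """Naive overlapping variant of the k-means assignment step: run standard k-means,
    then let each point belong to every cluster whose center lies within `slack` times
    the distance to its closest center (so a point may receive several labels)."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit(X)
    d = km.transform(X)                          # distances to each center
    closest = d.min(axis=1, keepdims=True)
    return d <= slack * closest                  # boolean membership matrix

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (50, 2)),
               rng.normal(3, 1, (50, 2)),
               rng.normal((0, 3), 1, (50, 2))])
memberships = overlapping_assignments(X)
print(memberships.sum(axis=1).mean())            # average number of clusters per point
```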

DAPA seminar, 22 May 2014, 10:00 a.m.

Collaborative activity in learning situations: forms and processes

Michael Baker


CNRS

Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

Diverse research fields are concerned with modelling collaborative activity, from artificial intelligence and evolutionary anthropology, to several branches of psychology, notably organisational psychology, social psychology and educational psychology. Particular visions of cooperation and collaboration have been elaborated in the field of (computer-supported) collaborative learning (e.g. Dillenbourg, Baker, Blaye & O’Malley, 1996), where models of collaboration are required for interpreting experimental results in terms of how the students interacted together, and for design of technologies for collaboration. In this context, a general distinction between cooperation and collaboration (Roschelle & Teasley, 1995) is now generally accepted: collaboration involves the mostly synchronous joint attempt to elaborate a shared representation of the problem to be solved, whereas cooperation tends towards less synchronous work, with division of sub-task responsibilities between participants. The main questions raised by this definition are: what is the nature of such “shared representations”, and what are the forms and processes by which they are co-elaborated? This paper deepens and extends these definitions in three main ways. Firstly, a definition of what “shared representation” means is proposed, as mutual acceptance, distinguished from belief (Cohen, 1992). Secondly, forms of cooperative activity are defined in terms of combinations of three gradual dimensions: (a)symmetry of interactive roles, (dis)agreement, and alignment (or coordination) on several levels (problem-solving stage, language, discursive representations). Finally, the discursive operations that constitute collaboration are described, in terms of four broad classes: extensional, cumulative, foundational and reformulative. The set of forms of collaboration associated with the specific case of argumentation dialogue will be described in particular detail, with elements of the model being illustrated with examples taken from several corpora of interactions between students.

DAPA seminar, 15 May 2014, 10:00 a.m.

A normal hierarchical model for random intervals / The silhouette index - an extension to fuzzy clustering and applications to feature selection

Dan Ralescu / Anca Ralescu


Department of Mathematical Sciences, University of Cincinnati, USA / EE & CS Department, College of Engineering, University of Cincinnati, USA

Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

(10:30-11:20) Dan Ralescu, Professor, Department of Mathematical Sciences, University of Cincinnati, USA
A normal hierarchical model for random intervals

Many statistical data are imprecise due to factors such as measurement errors, computation errors, and lack of information. In such cases, data are better represented by intervals rather than by single numbers.
Existing methods for analyzing interval-valued data include regressions in the metric space of intervals and symbolic data analysis, the latter being proposed in a more general setting. However, there has been a lack of literature on the parametric modeling and distribution-based inferences for interval-valued data.

(11:20-12:10) Anca Ralescu, Professor, EE & CS Department, College of Engineering, University of Cincinnati, USA
The silhouette index - an extension to fuzzy clustering and applications to feature selection

Introduced in 1986 by Peter J. Rousseeuw, as a visualization tool for the results of a clustering algorithm, the silhouette index has found more applications as a clustering validity index. Various features of this index recommend it. In this talk I will discuss two topics: (1) recent work on the extension of the silhouette index to fuzzy clustering, and (2) applications of the silhouette index to feature selection for a classifier.
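As background, the Python sketch below computes the classical per-point silhouette with scikit-learn and then a crude confidence-weighted aggregate; the weighting is only an illustration and is not the fuzzy extension presented in the talk.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 1, (60, 2)), rng.normal(4, 1, (60, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
s = silhouette_samples(X, km.labels_)            # classical per-point silhouette
print("crisp silhouette:", s.mean())

# Crude "fuzzified" aggregate: weight each point's silhouette by how confidently it is
# assigned, using membership-like weights derived from distances to the centers.
d = km.transform(X)
u = (1.0 / (d + 1e-9)) ** 2
u /= u.sum(axis=1, keepdims=True)
confidence = u.max(axis=1)
print("confidence-weighted silhouette:", np.average(s, weights=confidence))
```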

DAPA seminar, 20 February 2014, 10:00 a.m.

Robust recommendations and their explanation in multi-criteria decision aiding

Christophe Labreuche


Thales Group, France

Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

Multi-Criteria Decision Aid (MCDA) aims at helping an individual to make choices among alternatives described by several attributes, from a (small) set of learning data representing her preferences. MCDA has a wide range of applications in smart cities, engineering, recommender systems and so on. Among the variety of available decision models, one can cite the weighted majority, additive utility, weighted sum or the Choquet integral.

Once the expression of the decision model has been chosen, the generation of choices among alternatives is classically done as follows. In a constraint-based approach, starting from a set of learning data (representing, for instance, comparisons of alternatives), one looks for a value of the model parameters that is compatible with the learning data and maximizes some functional, e.g. an entropy or a separation variable on the learning data. The comparisons among alternatives are then obtained by applying the model with the previously constructed parameters. The major difficulty the decision maker faces is that there usually does not exist a unique value of the parameters compatible with the learning data. Hence this approach introduces much arbitrariness, since the generated preferences are much stronger than the learning data.

Robust preference relations have been recently introduced in MCDA to overcome this difficulty. An alternative is said to be necessarily preferred to another one if the first one dominates the second for any value of the parameters compatible with the learning data. In Artificial Intelligence, this operator is often called entailment. It is actually a closure operator. This necessity preference relation is usually incomplete, unless the model is completely specified from the preferential information of the decision maker.

The introduction of robust preference relation brings many new challenges:

  • algorithmic aspects: how to design efficient algorithms to construct it?
  • explanation: how to explain to the decision maker the recommended robust preferences? In other words, how are the recommendations derived from the learning data?

We will address these points in the talk.
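For the simplest model family mentioned above (the weighted sum), the robust or necessary preference test reduces to a small linear program: a is necessarily preferred to b when the minimum of w.(a - b), over all weight vectors compatible with the learning data, is non-negative. The Python sketch below illustrates this; the criteria values and the learning pair are invented, and richer models such as the Choquet integral need a larger parameter polytope.

```python
import numpy as np
from scipy.optimize import linprog

# Weighted-sum model with 3 criteria; scores are already normalized in [0, 1].
# Learning data: the decision maker stated that alternative p is at least as good as q.
p = np.array([0.9, 0.2, 0.6])
q = np.array([0.4, 0.7, 0.5])

def necessarily_preferred(a, b, learning_pairs):
    """a is necessarily preferred to b iff w.(a - b) >= 0 for EVERY weight vector w
    on the simplex satisfying the learning constraints w.(x - y) >= 0; equivalently,
    the minimum of w.(a - b) over that polytope is >= 0 (a linear program)."""
    A_ub = [-(x - y) for x, y in learning_pairs]          # encodes w.(x - y) >= 0
    b_ub = [0.0] * len(learning_pairs)
    A_eq, b_eq = [np.ones_like(a)], [1.0]                 # weights sum to 1
    res = linprog(c=a - b, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, 1)] * len(a))
    return res.status == 0 and res.fun >= -1e-9

# x does not dominate y, yet every weight vector compatible with (p >= q) ranks x above y.
x = np.array([0.9, 0.5, 0.4])
y = np.array([0.3, 0.9, 0.2])
print(necessarily_preferred(x, y, [(p, q)]))   # True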


DAPA seminar, 6 February 2014, 10:00 a.m.

Active learning in evidential clustering under constraints

Violaine Antoine


ISIMA Limos

Location: room 101, corridor 25-26, 4 place Jussieu, 75005 Paris

Evidential unsupervised clustering is characterized by the use of belief functions, and in particular by the notion of credal partition. This notion generalizes the concepts of hard, fuzzy, probabilistic and possibilistic partitions. It thus makes it possible to measure precisely the uncertainty about the assignment of an object to a class.

Constrained clustering, also called semi-supervised clustering, is an approach that introduces prior knowledge in the form of constraints on the partition being sought. Here we focus on instance-level constraints: a must-link constraint specifies that two objects must belong to the same class, whereas a cannot-link constraint indicates that two objects lie in different classes. Adding constraints leads to a noticeable improvement of clustering results. However, in real applications it is sometimes difficult to obtain an informative set of constraints. Active learning therefore aims at acquiring this information at the lowest possible cost.

In this talk, we propose two new constrained clustering algorithms based on the theoretical framework of belief functions. Thanks to the credal partition they return, we can precisely identify the objects that are problematic for the clustering. A new active learning algorithm is then proposed in order to reduce the clustering error.
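As a generic illustration of the active-learning step (not the evidential criterion of the talk), the Python sketch below picks, from a soft partition, the object whose two largest membership degrees are closest, i.e. the most ambiguous object, as the one about which to ask a must-link or cannot-link question.

```python
import numpy as np

def most_ambiguous_object(membership):
    """Pick the object whose soft assignment is the most ambiguous, i.e. the one with
    the smallest margin between its two largest membership degrees; this is the object
    for which asking a must-link / cannot-link question is expected to help most."""
    top2 = np.sort(membership, axis=1)[:, -2:]
    margins = top2[:, 1] - top2[:, 0]
    return int(np.argmin(margins))

# Toy soft partition over 4 objects and 3 clusters (rows sum to 1).
u = np.array([
    [0.90, 0.05, 0.05],
    [0.45, 0.40, 0.15],   # ambiguous between clusters 0 and 1
    [0.10, 0.10, 0.80],
    [0.70, 0.20, 0.10],
])
print(most_ambiguous_object(u))   # -> 1; query a constraint involving this object
```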


DAPA seminar, 19 December 2013, 10:00 a.m.

The rise of graph databases/dataspaces and their relations with Linked Data and Ontologies

André Santanchè


Universidade Estadual de Campinas, Brazil

Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

The graph as a data model to represent, store and link data has been receiving increasing attention. Social networks, Linked Data and ontologies share common challenges, which foster research on topics like graph databases and dataspaces. This talk will present an overview of this scenario, emphasizing the following topics: exploiting latent semantics in "social content"; link-driven integration, Linked Data, dataspaces and "pay-as-you-go" integration; topology-aware, IR-inspired metrics for declarative graph querying; and the path from graphs to ontologies. We will present examples from the biology domain.

DAPA seminar, 16 December 2013, 4:00 p.m.

Extended Logic Programming and Intelligent System Development

Atsushi Inoue


University of Cincinnati

Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

A long-term effort toward a general application framework for intelligent systems is introduced. Many intelligent systems adopt a knowledge-based system architecture, and their development thus differs from other application development. Expressing knowledge as rules shifts one's perspective from data manipulation to relation investigation. We focus on our recent progress on two components: Extended Logic Programming (ELP), the keystone of this framework, and a multi-view visualization scheme designed to effectively and efficiently visualize the reasoning processes of ELP. A few representative applications will be showcased as time allows.

Reference:

K. Springer, M. Henry, A. Inoue, "A General Application Framework for Intelligent Systems," The 20th Midwest Artificial Intelligence and Cognitive Science Conference (MAICS 2009), Fort Wayne, IN, pp. 188-195, 2009.


DAPA seminar, 5 December 2013, 10:00 a.m.

Granular Models for Time Series Forecasting

Rosangela Ballini


Institute of Economics, University of Campinas, Brazil

Location: room 105, corridor 25-26, 4 place Jussieu, 75005 Paris

Granular models based on fuzzy clustering are presented as an approach for time series forecasting. These models are constructed in two phases. The first phase uses clustering algorithms to find group structures in a historical database. Two different approaches are discussed: fuzzy c-means clustering and participatory learning algorithms. Fuzzy c-means clustering, which is a supervised clustering algorithm, is used to explore similar data characteristics, such as trend or cyclical components. Participatory learning induces unsupervised dynamic fuzzy clustering algorithms and provides an effective alternative to construct adaptive fuzzy systems.

In the second phase, two cases are considered. In the first case, a regression model is adjusted for each cluster and forecasts are produced by a weighted combination of the local regression models. In the second case, prediction data are classified according to the group structure found in the database; forecasts are then produced using the cluster centers weighted by the degree to which the prediction data match the groups. The weighted combination of local models constitutes a forecasting approach called granular functional forecasting modeling, and the approach based on the weighted combination of cluster centers comprises granular relational forecasting modeling. The effectiveness of the granular forecasting approaches is verified using three different applications: average streamflow forecasting, pricing option estimation, and modeling of regime changes in Brazilian nominal interest rates.
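A minimal Python sketch of the "granular functional forecasting" idea is given below: lag vectors are grouped (here with k-means centers and the standard fuzzy c-means membership formula standing in for the clustering algorithms of the talk), one local regression is fitted per group with membership weights, and the forecast is the membership-weighted combination of the local predictions. All modeling choices here are illustrative assumptions, not the speaker's implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

def fuzzy_memberships(X, centers, m=2.0):
    """Standard fuzzy c-means membership formula for given cluster centers."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Toy series and a lag-vector representation with 2 lags.
rng = np.random.default_rng(6)
y = np.sin(np.arange(300) * 0.2) + 0.1 * rng.normal(size=300)
X = np.column_stack([y[1:-1], y[:-2]])     # [y_{t-1}, y_{t-2}]
t = y[2:]                                  # target y_t

# Phase 1: group structure in the historical data (centers + fuzzy memberships).
centers = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X).cluster_centers_
U = fuzzy_memberships(X, centers)

# Phase 2 (functional variant): one local regression per cluster, fitted with
# membership weights; the forecast is the membership-weighted combination.
models = [LinearRegression().fit(X, t, sample_weight=U[:, j]) for j in range(3)]
x_new = X[-1:]                             # last available lag vector
u_new = fuzzy_memberships(x_new, centers)
forecast = sum(u_new[0, j] * models[j].predict(x_new)[0] for j in range(3))
print(forecast)
```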