TELECOM ParisTech Seminars on Perception, Indexing and Learning

07/05/09

 

Pascal Matsakis
Department of Computing and Information Science, University of Guelph (Ontario, Canada).

Time/Location: 07/05/09, 2:00 pm, Amphi Emeraude, TELECOM ParisTech. Organizers: Isabelle Bloch and Hichem Sahbi

Title : Understanding the Spatial Organization of Image Regions by Means of F-Histograms and F-Templates: A Guided Tour,

Abstract: Space plays a fundamental role in human cognition. In everyday situations, it is often viewed as a construct induced by spatial relationships, rather than as a container that exists independently of the objects located in it. Spatial relationships, therefore, have been thoroughly investigated in many disciplines, including cognitive science, psychology, linguistics, geography and artificial intelligence. In computer vision and related fields, understanding the spatial organization of regions in images is an important task. The modeling of spatial relationships raises two fundamental questions: How to identify the spatial relationships between two given objects? How to identify the object that best satisfies a given relationship to a reference object? F-histograms and F-templates are tools designed to answer these questions. In this talk, we will present them, and compare them with other existing tools. We will reflect on their duality, describe their characteristics and properties, discuss their strengths and weaknesses. We will review the different algorithms that have been developed for F-histogram and F-template calculation. We will also review the current and potential applications in various domains, such as scene description, human-robot communication, object classification and retrieval.




09/04/09

 

Nikos Paragios
Ecole Centrale and GALEN team, INRIA Saclay (Paris, France).

Time/Location: 09/04/09, 2:00 pm, Amphi Emeraude, TELECOM ParisTech. Organizer: Hichem Sahbi

Title : (to come),

Abstract: (to come)




26/03/09

 

Stéphanie Allassonnière
CMAP, Ecole Polytechnique (Paris, France).

Time/Location: 26/03/09, 2:00 pm, Amphi Emeraude, TELECOM ParisTech. Organizer: Hichem Sahbi

Title : (to come),

Abstract: (to come)




12/02/09

 

Tamy Boubekeur
LTCI, Telecom ParisTech (Paris, France).

Time/Location: 12/02/09, 2:00 pm, Amphi Emeraude, TELECOM ParisTech. Organizer: Hichem Sahbi

Title : GPU: Principles and Applications,

Abstract: During this presentation, I will review the history of GPUs and their current architecture. I will then detail the various technologies available for programming them, before presenting a range of 3D and general-purpose applications. I will conclude with the future of graphics APIs and their dedicated hardware.




31/01/09

 

Department Strategy Workshop
Maison de La Recherche (Paris, France).

Time/Location: 30/01/09, 9:00 am to 6:00 pm. Organizers: Stephan Clemencon, Yves Grenier and Hichem Sahbi

Speakers : Gael Richard, Isabelle Bloch, Beatrice Pesquet, Francois Roueff, Cedric Fevotte, Tamy Boubekeur, Catherine Pelachaud and Bianchi






05/02/09

 

Julien Mille
Ceremade / Université Paris Dauphine (Paris, France).

Time/Location: 05/02/09, 2:00 pm, Salle C48, TELECOM ParisTech. Organizers: Isabelle Bloch and Hichem Sahbi

Title : Local Region Energies for Deformable Contours and Surfaces,

Abstract: Active contours and surfaces are deformable geometric models for 2D and 3D image segmentation. They usually evolve by minimizing an energy functional, made up mainly of a regularizing term and an image term. Hence, segmentation with such models is a trade-off between the regularity of the model and fidelity to the data. Region-based active contours segment images according to global statistical features computed over the inside and outside of the curve. For instance, the data term of the Chan-Vese model uses intensity variances to penalize a curve that splits the image into heterogeneous regions. By nature, this approach is dedicated to the segmentation of uniform objects and backgrounds. Such an ideal case is rarely encountered in real images, as the complement of the region of interest usually contains many other structures that differ in intensity (or color, texture, ...). In order to handle objects in non-uniform backgrounds, we rely on the narrow band principle to define new region terms. Indeed, many objects can be segmented according to a homogeneity criterion computed only in the vicinity of their boundaries. In this talk, we present two local region terms that handle, respectively, uniform and piecewise-uniform backgrounds over a narrow band. We also present an extension of this banded formulation: the 2D deformable generalized cylinder, a geometrically constrained model dedicated to the segmentation of tubular objects.
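
For readers unfamiliar with the Chan-Vese model mentioned in the abstract, its standard energy can be written as below. The notation (image I, curve C, mean intensities c_1 and c_2, weights \mu, \lambda_1, \lambda_2) is the usual textbook one and is not taken from the talk. The second formula is only an illustrative assumption of what restricting the region terms to a narrow band B around the curve could look like; it is not the speaker's formulation.

```latex
% Standard Chan-Vese energy: a length (regularizing) term plus two region (data) terms
E_{\mathrm{CV}}(C, c_1, c_2)
  = \mu \,\mathrm{Length}(C)
  + \lambda_1 \int_{\mathrm{inside}(C)}  \bigl(I(x) - c_1\bigr)^2 \, dx
  + \lambda_2 \int_{\mathrm{outside}(C)} \bigl(I(x) - c_2\bigr)^2 \, dx

% Illustrative assumption: the same data terms evaluated only inside a band B around C,
% with c_1 and c_2 now the mean intensities over the inner and outer parts of the band
E_{\mathrm{band}}(C, c_1, c_2)
  = \mu \,\mathrm{Length}(C)
  + \lambda_1 \int_{B \,\cap\, \mathrm{inside}(C)}  \bigl(I(x) - c_1\bigr)^2 \, dx
  + \lambda_2 \int_{B \,\cap\, \mathrm{outside}(C)} \bigl(I(x) - c_2\bigr)^2 \, dx
```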




22/01/09

 

Francis Bach
ENS-ULM / INRIA (Paris, France).

Time/Location: 22/01/09, 2:00 pm, Amphi Emeraude, TELECOM ParisTech. Organizer: Hichem Sahbi

Title : Machine Learning and Kernel Methods for Computer Vision,

Abstract: Kernel methods are a new theoretical and algorithmic framework for machine learning. By representing data through well-defined dot products, referred to as kernels, they allow classical linear supervised machine learning algorithms to be applied to nonlinear settings and to non-vectorial data. A major issue when applying these methods to image processing or computer vision is the choice of the kernel. I will present recent advances in the design of kernels for images that take into account the natural structure of images.
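
As a minimal sketch of the idea that a kernel is a dot product in some feature space, the following Python snippet compares a degree-2 polynomial kernel with the dot product of an explicit feature map. The function names and the choice of kernel are assumptions made for illustration; they are not taken from the talk, which concerns kernels designed for images.

```python
import math

def dot(u, v):
    """Ordinary Euclidean dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2

def poly2_features(x):
    """Explicit feature map for 2-D inputs (hypothetical helper).

    Chosen so that dot(poly2_features(x), poly2_features(y))
    equals poly2_kernel(x, y).
    """
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

x, y = (1.0, 2.0), (3.0, 4.0)
k_implicit = poly2_kernel(x, y)                          # kernel evaluated directly
k_explicit = dot(poly2_features(x), poly2_features(y))   # dot product in feature space
```

Both values agree up to floating-point rounding, so a linear algorithm that only needs dot products can operate in the feature space without ever building the feature vectors.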




08/01/09

 

Grégoire Malandain
ASCLEPIOS team, INRIA Sophia Antipolis (Nice, France).

Time/Location: 08/01/09, 2:30 pm, Amphi C48, TELECOM ParisTech. Organizers: Elsa Angelini and Hichem Sahbi

Title : Construction of Anatomical Atlases for Radiotherapy,

Abstract: (to come)




27/11/08

 

Alain Bretto
GREYC-CNRS, Université de Caen (Caen, France).

Time/Location: 27/11/08, 2:00 pm, Amphi B312, TELECOM ParisTech. Organizers: Sofiane Rital and Hichem Sahbi

Title : Geometric Compression,

Abstract: Hypergraphs have been used in low-level image analysis. In this presentation we develop a new application of this model to compression. An image can be modeled by a hypergraph whose vertices are the pixels (voxels) and whose hyperedges are the rectangles of pixels (voxels) that share a homogeneity criterion and are maximal for that criterion. With this model, one can build a compression algorithm for 2D or 3D images with O(n*n) complexity.




13/11/08

 

Tamy Boubekeur
LTCI, Telecom ParisTech (Paris, France).

Time/Location: 13/11/08, 2:00 pm, Amphi B312, TELECOM ParisTech. Organizer: Hichem Sahbi

Title : 3D Acquisition, Processing, Editing and Synthesis,

Abstract: During this presentation, I will give an overview of the research I have carried out in recent years. This work falls within the general field of computer graphics, and more specifically the geometric modeling of 3D surfaces and real-time rendering. In particular, I will cover 3D scanning, digital surface processing, interactive editing of very large 3D data sets, and real-time synthesis of geometry and then images on the GPU. I will conclude by outlining the first directions of my research project.




23/10/08

 

Jean-Philippe Vert
Cancer computational genomics and bioinformatics, Mines ParisTech - Institut Curie - INSERM U900 (Paris, France).

Time/Location: 23/10/08, 11:00 am, Amphi B312, TELECOM ParisTech. Organizer: Hichem Sahbi

Title : Inference of missing edges in biological networks,

Abstract: The elucidation of large-scale biological networks is currently a major challenge in systems biology. These networks include, for example, protein-protein interaction, metabolic, or regulatory networks, which describe various aspects of how life is organized at the molecular level. While these large-scale networks are slowly being discovered using various technologies, in silico methods to expand our current knowledge and predict missing edges are needed. In this talk I will present some approaches we developed in the last few years to infer missing edges using supervised machine learning techniques, in particular based on support vector machines and kernel methods, and illustrate their behaviour on several examples.

Title 2 (a quick survey): Collaborative filtering with attributes,

Abstract: Collaborative Filtering (CF) refers to the task of learning preferences of customers for products, such as books or movies, from a set of known preferences. More formally, this can be seen as the task of filling missing entries in a matrix where some entries are known. A standard approach to CF is to find a low-rank approximation to the matrix. This problem is computationally difficult, and some authors have recently proposed to search instead for a matrix with low trace norm, which results in a convex optimization problem. We generalize this approach to the estimation of a compact operator, of which matrix estimation is a special case. We develop a notion of spectral regularization which captures both rank constraints and trace norm regularization, as well as many others. The major advantage of this approach is that it provides a natural method of utilizing side information, such as age and gender, about the customers (or objects) in question, which was a challenging limitation of the low-rank approach. We provide a number of algorithms, and test them on a standard CF dataset with promising results. This is joint work with Jacob Abernethy (UC Berkeley), Francis Bach (INRIA), and Theodoros Evgeniou (INSEAD).
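
The two formulations contrasted in this abstract can be sketched as follows. The notation is an assumption made for illustration and is not taken from the talk: M is the partially observed preference matrix, \Omega the set of known entries, X the estimate, r a target rank, \lambda a regularization weight, and \sigma_k(X) the singular values of X. A squared loss is used here for concreteness; other losses are possible.

```latex
% Low-rank formulation (non-convex because of the rank constraint)
\min_{X} \sum_{(i,j) \in \Omega} \bigl(X_{ij} - M_{ij}\bigr)^2
\quad \text{subject to} \quad \operatorname{rank}(X) \le r

% Trace-norm relaxation (convex)
\min_{X} \sum_{(i,j) \in \Omega} \bigl(X_{ij} - M_{ij}\bigr)^2 + \lambda \,\|X\|_{*},
\qquad \|X\|_{*} = \sum_{k} \sigma_k(X)
```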



02/10/08

 

Josef Sivic
INRIA - Willow Project, Département d'Informatique, Ecole Normale Supérieure (Paris, France).

Time/Location: 02/10/08, 2:00pm, Amphi Emeraude-TELECOM ParisTech. Organizer: Hichem Sahbi

Title : From Locations to Scenes: Searching and Browsing Large Photo Collections,

Abstract: How can we retrieve images of specific objects and places with the speed, ease and accuracy with which web-search engines retrieve web-pages containing specific words? Can we browse a large collection of photographs taken at many different locations as if navigating in a single virtual 3D world? In the first part of the talk, I will describe our work on visual retrieval of particular objects and buildings in large image collections where the query is specified by an image of the object/building. Images are represented by a set of local viewpoint invariant descriptors so that matching can proceed despite changes in viewpoint, lighting and partial occlusion. I will also discuss issues arising from scaling up the search to a collection of more than 1 million Flickr images. In the second part of the talk I will present a system for exploring large collections of photos in a virtual 3D space. Here we do not assume the photographs are of a single real 3D location, nor that they were taken at the same time. Instead, we organize the photos according to scene types, such as city streets or skylines, which we call themes, and let users navigate within each theme using intuitive 3D controls that include pan, zoom and rotate, creating a "being there" impression, as if the images were of a particular 3D location. We present results on a collection of several million images downloaded from Flickr. Joint work with Ondrej Chum, Michael Isard, James Philbin, Andrew Zisserman, Biliana Kaneva, Antonio Torralba, Shai Avidan and Bill Freeman.



22/05/08

 

Donald Geman
Department of Applied Mathematics and Statistics, Johns Hopkins University (USA) and CMLA, ENS-Cachan (France).

Time/Location: 22/05/08, 2:00pm, Amphi B310-Telecom ParisTech. Organizer: Hichem Sahbi

Title : Stationary Features and Cat Detection,

Abstract: Semantic scene interpretation is one of the most challenging problems in computer vision. Most algorithms for detecting and describing instances from object categories consist of looping over a partition of a "pose space" with dedicated binary classifiers. This strategy is inefficient for a complex pose: fragmenting the training data severely reduces accuracy, and the computational cost is prohibitive due to visiting a massive pose partition. To overcome data-fragmentation I will discuss a novel framework centered on pose-indexed features, which allows for efficient, one-shot learning of pose-specific classifiers. Such features assign a response to a pair consisting of an image and a pose, and are designed so that the probability distribution of the response is invariant if an object is actually present. To avoid expensive scene processing, the classifiers are arranged in a hierarchy based on nested partitions of the pose, which allows for efficient search. The hierarchy is then "folded" for training: all the classifiers at each level are derived from one base predictor learned from all the data. The hierarchy is "unfolded" for testing. I will illustrate these ideas by detecting and localizing cats in highly cluttered greyscale scenes. This is joint work with Francois Fleuret.



24/04/08

 

Andrés Almansa
CNRS, LTCI, Telecom ParisTech (France).

Time/Location: 24/04/08, 2:00pm, Amphi Emeraude-Telecom ParisTech. Organizer: Hichem Sahbi

Title : From Image Restoration to Very-High-Precision Stereo Vision. Applications to the Design of New Satellite Imaging Systems.

Abstract: 3D reconstruction from stereo pairs usually relies on fairly large baseline-to-height ratios in order to minimize the effect of localization errors in matching. Although this is well founded for simultaneous binocular systems, this choice introduces new sources of error in Earth observation systems, because the same satellite must capture both viewpoints at different times along its orbit. In such a situation, if one wants to avoid changes in illumination and in the scene, one has to shorten the baseline as much as possible, by up to two orders of magnitude in practice. When the baseline-to-height ratio is that small, new opportunities open up: the problem of determining a fairly dense, error-free disparity map, fully automatically and without priors, starts to become tractable. On the other hand, a small b/h also poses new challenges: the disparities between the two images of a pair must be determined with unprecedented precision (on the order of a tenth or even a hundredth of a pixel) to be useful in building digital elevation models. Such levels of precision demand a meticulous and exhaustive revision of the entire image processing chain. In this talk I will focus on our work on a few stages of this chain, which show how statistical learning and models of visual perception interact with Fourier analysis and convex optimization to provide precise estimates of the disparity map.
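
The trade-off between the baseline-to-height ratio (b/h) and the required disparity precision can be illustrated with a rough first-order model in which the elevation error equals the disparity error, expressed in ground units, divided by b/h. This model, the function name and all numeric values below are assumptions made for illustration; they are not taken from the talk.

```python
def height_error(disparity_error_px, ground_sample_m, b_over_h):
    """Approximate elevation error in metres under a first-order model.

    disparity_error_px: matching error in pixels
    ground_sample_m:    ground size of one pixel in metres (hypothetical value)
    b_over_h:           baseline-to-height ratio of the stereo pair
    """
    if b_over_h <= 0:
        raise ValueError("b_over_h must be positive")
    return disparity_error_px * ground_sample_m / b_over_h

# Wide baseline, half-pixel matching error (illustrative numbers only)
wide = height_error(0.5, 0.5, 1.0)
# Baseline shortened by two orders of magnitude, same matching error
narrow_coarse = height_error(0.5, 0.5, 0.01)
# Same short baseline, matching refined to a hundredth of a pixel
narrow_fine = height_error(0.01, 0.5, 0.01)
```

Under this model, dividing b/h by 100 multiplies the elevation error by 100 for a fixed matching error, which is why sub-pixel disparity precision becomes necessary.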



17/04/08

 

Alexei (Alyosha) Efros
Carnegie Mellon University (USA).

Time/Location: 17/04/08, 2:00pm, Amphi Emeraude-Telecom ParisTech. Organizer: Hichem Sahbi

Title : From Images to Scenes: using lots of data to infer geometric, photometric and semantic scene properties from a single image

Abstract: Reasoning about a scene from a photograph is an inherently ambiguous task. This is because a single image in itself does not carry enough information to disambiguate the world that it's depicting. Of course, humans have no problems understanding photographs because of all the prior visual experience they can bring to bear on the task. How can we help computers do the same? We propose to "brute force" the problem by using massive amounts of visual data, both labeled and unlabeled, as a way of capturing the statistics of the natural world. In this talk, I will present some of our recent results on inferring geometric, photometric, and semantic scene properties from a single image. I will first briefly describe our system for estimating the rough geometric surface layout of a scene as well as the camera viewpoint. I will show how this information, in turn, can be useful for modeling objects in the scene. Next, I will describe a very simple way of using the surface layout information as a way of estimating a rough illumination map for the scene. Finally, I will describe a new system that uses millions of unlabeled photographs from Flickr to capture some implicit semantic scene structure of an image. Applications of our methods to computer graphics will be shown.



20/03/08

 

Gabriel Peyré
CNRS, CEREMADE, Université Paris-Dauphine (France).

Time/Location: 20/03/08, 2:00pm, Amphi Emeraude-Telecom ParisTech. Organizer: Hichem Sahbi

Title : Texture Analysis and Synthesis with Grouplets

Abstract: Natural textures contain complex, often turbulent structures. Classical tools for image approximation (total variation, wavelets, etc.) are not efficient enough to compress these strongly anisotropic regularities. Obtaining representations adapted to geometric textures is crucial for processing this type of image efficiently, as well as for synthesizing new images with the same characteristics. In this talk, I will present a new grouplet transform able to exploit geometric information. Redundant grouplet families use a multiscale association field that makes it possible to follow the fine structures of a texture. These grouplets can be used to modify the geometry of textures as well as to perform image inpainting.



03/03/08

 

Alberto Del Bimbo
Department of Systems and Computer Science, Università degli Studi di Firenze (Italy).

Time/Location: 03/03/08, 3:30pm, C48-Telecom ParisTech. Organizer: Yann Gousseau

Title: Overview of research in the MICC center in Florence

Abstract: I will give a short overview of the activities that our team develops at MICC (Media Integration and Communication Center) in Florence. They include CBR (3D face retrieval and ontology-based semantic video retrieval), Video Surveillance (PTZ camera-based surveillance) and HCI (tabletop computer vision-based natural interaction solutions).



07/02/08

 

Jean Ponce
Département d'Informatique, Ecole Normale Supérieure; Head, ENS/INRIA project-team Willow (France).

Time/Location: 07/02/08, 2:30pm, Amphi Estaunie-Telecom ParisTech. Organizer: Hichem Sahbi

Title: Overview of Research in Willow

Abstract: Willow is a joint effort between Ecole Normale Supérieure and INRIA, focussing on the representational issues involved in visual scene understanding. Concretely, our objective is to develop geometric, physical, and statistical models for all components of the image interpretation process, including illumination, materials, objects, scenes, and human activities. These models will be used to tackle fundamental scientific challenges such as three-dimensional (3D) object and scene modeling, analysis, and retrieval; human activity capture and classification; and category-level object and scene recognition. They will also support applications with high scientific, societal, and/or economic impact in domains such as quantitative image analysis in science and engineering; film post-production and special effects; and video annotation, interpretation, and retrieval. Machine learning is a key part of our effort, with a balance of practical work in support of computer vision applications, methodological research aimed at developing effective algorithms and architectures, and foundational work in learning theory. In this talk I will give an overview of Willow's research activities and present several recent results in 3D photography and markerless motion capture, category-level object recognition, video interpretation, and machine learning. I will conclude with a brief discussion of our ongoing and new projects and partnerships.