Documents OpenAIRE (Open Access Infrastructure for Research in Europe)
http://hdl.handle.net/10230/8581
Thu, 23 May 2019 03:06:55 GMT
Derivatives and inverse of cascaded linear + nonlinear neural models
http://hdl.handle.net/10230/37265
Martínez García, Marina; Cyriac, Praveen; Batard, Thomas; Bertalmío, Marcelo; Malo, Jesús
In vision science, cascades of Linear+Nonlinear transforms are very successful in modeling
a number of perceptual experiences. However, the conventional literature is usually too
focused on only describing the forward input-output transform. Instead, in this work we present
the mathematics of such cascades beyond the forward transform, namely the Jacobian
matrices and the inverse. The fundamental reason for this analytical treatment is that it
offers useful analytical insight into the psychophysics, the physiology, and the function of
the visual system. For instance, we show how the trends of the sensitivity (volume of the
discrimination regions) and the adaptation of the receptive fields can be identified in the
expression of the Jacobian w.r.t. the stimulus. This matrix also tells us which regions of the
stimulus space are encoded more efficiently in multi-information terms. The Jacobian w.r.t.
the parameters shows which aspects of the model have a bigger impact on the response, and
hence their relative relevance. The analytic inverse implies conditions for the response and
model parameters to ensure appropriate decoding. From the experimental and applied perspective,
(a) the Jacobian w.r.t. the stimulus is necessary in new experimental methods
based on the synthesis of visual stimuli with interesting geometrical properties, (b) the Jacobian
matrices w.r.t. the parameters are convenient to learn the model from classical experiments
or alternative goal optimization, and (c) the inverse is a promising model-based
alternative to blind machine-learning methods for neural decoding that do not include meaningful
biological information. The theory is checked by building and testing a vision model
that actually follows a modular Linear+Nonlinear program. Our illustrative derivable and
invertible model consists of a cascade of modules that account for brightness, contrast,
energy masking, and wavelet masking. To stress the generality of this modular setting we
show examples where some of the canonical Divisive Normalization modules are substituted
by equivalent modules such as the Wilson-Cowan interaction model (at the V1 cortex)
or a tone-mapping model (at the retina).
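
As a rough illustration of the three objects the abstract names (forward transform, Jacobian w.r.t. the stimulus, and inverse), here is a minimal sketch for a single Divisive Normalization stage. It assumes the simplified form y_i = e_i / (b_i + sum_j H_ij e_j) with positive inputs e, positive semisaturation b, and a nonnegative interaction kernel H. The function names and the example values are hypothetical and not taken from the paper. The inverse is computed by iterating e = D_b y + D_y H e, which is the series expansion of (I - D_y H)^(-1) D_b y and converges under the stated positivity assumptions.

```python
def _denominator(e, b, H):
    """Normalization pool: b_i + sum_j H_ij * e_j."""
    n = len(e)
    return [b[i] + sum(H[i][j] * e[j] for j in range(n)) for i in range(n)]


def dn_forward(e, b, H):
    """Forward transform: y_i = e_i / (b_i + sum_j H_ij * e_j)."""
    d = _denominator(e, b, H)
    return [e[i] / d[i] for i in range(len(e))]


def dn_jacobian(e, b, H):
    """Jacobian w.r.t. the input: dy_i/de_j = delta_ij / d_i - e_i * H_ij / d_i**2."""
    n = len(e)
    d = _denominator(e, b, H)
    return [
        [(1.0 if i == j else 0.0) / d[i] - e[i] * H[i][j] / d[i] ** 2 for j in range(n)]
        for i in range(n)
    ]


def dn_inverse(y, b, H, iters=200):
    """Recover e from y by iterating e <- b*y + y*(H e) (assumes the series converges)."""
    n = len(y)
    e = [b[i] * y[i] for i in range(n)]
    for _ in range(iters):
        e = [b[i] * y[i] + y[i] * sum(H[i][j] * e[j] for j in range(n)) for i in range(n)]
    return e


# Hypothetical example values (not from the paper).
e_example = [1.0, 0.5, 2.0]
b_example = [0.5, 0.5, 0.5]
H_example = [[0.5, 0.2, 0.1], [0.2, 0.5, 0.2], [0.1, 0.2, 0.5]]

y_example = dn_forward(e_example, b_example, H_example)
J_example = dn_jacobian(e_example, b_example, H_example)
e_recovered = dn_inverse(y_example, b_example, H_example)
```

In this sketch the iteration only converges when the spectral radius of D_y H is below one, which is one concrete way of reading the abstract's remark that the inverse imposes conditions on the response and the model parameters.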
Mon, 01 Jan 2018 00:00:00 GMT
A geometric model of brightness perception and its application to color images correction
http://hdl.handle.net/10230/37264
Batard, Thomas; Bertalmío, Marcelo
Human perception involves many features, such as contours, shapes, textures, and colors, to name a few. Whereas several geometric models for contour,
shape, and texture perception have been proposed, the geometry of color perception has received very little attention, possibly because our perception of colors is still not fully understood. Nonetheless, there exists a class of mathematical models, gathered under the name Retinex, that aim at modeling the color perception of an image, that are inspired by psychophysical/physiological knowledge about color perception, and that can geometrically be viewed as the averaging of perceptual distances between image pixels.
Some of the Retinex models turn out to be associated with an efficient image processing technique for the correction of camera output images. The aim of this paper is to show that this image processing technique can be improved by including more properties of the human visual system. To that end, we first present a generalization of the perceptual distance between image pixels by considering the parallel transport map associated with a covariant derivative
on a vector bundle, from which a new image processing model for color image correction can be derived.
Then, we show that the family of covariant derivatives constructed in [T. Batard and N. Sochen, J. Math. Imaging Vision, 48(3) (2014), pp. 517-543] can model some color appearance phenomena related to brightness perception. Finally, we conduct experiments in which we show that the image processing techniques induced by these covariant derivatives outperform the original approach.
Mon, 01 Jan 2018 00:00:00 GMT
Statistics of natural images as a function of dynamic range
http://hdl.handle.net/10230/37263
Grimaldi, Antoine Vincent; Kane, David; Bertalmío, Marcelo
The statistics of real world images have been extensively
investigated, but in virtually all cases using only low
dynamic range image databases. The few studies that
have considered high dynamic range (HDR) images have
performed statistical analyses categorizing images as
HDR according to their creation technique, and not to
the actual dynamic range of the underlying scene. In this
study we demonstrate, using a recent HDR dataset of
natural images, that the statistics of the image as
received at the camera sensor change dramatically with
dynamic range, with particularly strong correlations with
dynamic range being observed for the median, standard
deviation, skewness, and kurtosis, while the one over
frequency relationship for the power spectrum breaks
down for images with a very high dynamic range, in
practice making HDR images not scale invariant. Effects
are also noted in the derivative statistics, the single pixel
histograms, and the Haar wavelet analysis. However, we
also show that after some basic early transforms
occurring within the eye (light scatter, nonlinear
photoreceptor response, center-surround modulation)
the statistics of the resulting images become virtually
independent from the dynamic range, which would
allow them to be processed more efficiently by the
human visual system.
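
The summary statistics listed in the abstract (median, standard deviation, skewness, kurtosis) can be computed alongside a dynamic range estimate with a few lines of code. The sketch below is only illustrative: it assumes strictly positive luminance samples, measures dynamic range as log2(max/min) in stops, and uses population moments with non-excess kurtosis. These definitions and the function names are assumptions made here, and the study may define them differently.

```python
import math
import statistics


def dynamic_range_stops(luminance):
    """Dynamic range in stops, assumed here to be log2(max / min) of positive samples."""
    return math.log2(max(luminance) / min(luminance))


def summary_statistics(luminance):
    """Median, standard deviation, skewness, and (non-excess) kurtosis of the samples."""
    n = len(luminance)
    mean = sum(luminance) / n
    # Central moments of order 2, 3, and 4 (population versions).
    m2 = sum((v - mean) ** 2 for v in luminance) / n
    m3 = sum((v - mean) ** 3 for v in luminance) / n
    m4 = sum((v - mean) ** 4 for v in luminance) / n
    std = math.sqrt(m2)
    return {
        "median": statistics.median(luminance),
        "std": std,
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / m2 ** 2,
    }
```

Applying both functions to each image in a dataset and correlating each statistic with the dynamic range would give a rough analogue of the analysis described above.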
Tue, 01 Jan 2019 00:00:00 GMT
In praise of artifice reloaded: caution with natural image databases in modeling vision
http://hdl.handle.net/10230/37262
Martínez García, Marina; Bertalmío, Marcelo; Malo, Jesús
Subjective image quality databases are a major source of raw data on how the visual
system works in naturalistic environments. These databases describe the sensitivity of
many observers to a wide range of distortions of different nature and intensity seen
on top of a variety of natural images. Data of this kind seems to open a number of
possibilities for the vision scientist to check the models in realistic scenarios. However,
while these natural databases are great benchmarks for models developed in some other
way (e.g., by using the well-controlled artificial stimuli of traditional psychophysics), they
should be carefully used when trying to fit vision models. Given the high dimensionality
of the image space, it is very likely that some basic phenomena are under-represented
in the database. Therefore, a model fitted on these large-scale natural databases will
not reproduce these under-represented basic phenomena that could otherwise be easily
illustrated with well selected artificial stimuli. In this work we study a specific example of
the above statement. A standard cortical model using wavelets and divisive normalization
tuned to reproduce subjective opinion on a large image quality dataset fails to reproduce
basic cross-masking. Here we outline a solution for this problem by using artificial stimuli
and by proposing a modification that makes the model easier to tune. Then, we show
that the modified model is still competitive in the large-scale database. Our simulations
with these artificial stimuli show that when using steerable wavelets, the conventional unit
norm Gaussian kernels in divisive normalization should be multiplied by high-pass filters
to reproduce basic trends in masking. Basic visual phenomena may be misrepresented
in large natural image datasets but this can be solved with model-interpretable stimuli.
This is an additional argument in praise of artifice in line with Rust and Movshon (2005).
Tue, 01 Jan 2019 00:00:00 GMT