Promoting vision research and its applications
NONLINEAR VISION - Abstracts from meeting on 16 December 1998
Do we need a special nonlinear channel to see second-order motion? A Johnston, C P Benton, P W McOwan# (Department of Psychology, UCL, Gower Street, London WC1E 6BT; #Department of Mathematical and Computing Sciences, Goldsmiths College, New Cross, London SE14 6NW; fax: +44 171 436 4276; e-mail: a.johnston@ucl.ac.uk) Over the last ten years there has been considerable interest in the mechanisms by which we see second-order motion. Second-order motion stimuli are non-rigid motion sequences in which movement is defined by translation of some property of the image which does not result in a change in the expected mean luminance of the pattern. The prevailing view is that some nonlinear operation is required to distort the signal such that spatial variations in the level of the transformed signal can be generated. The motion of this transformed signal can then be analysed by mechanisms similar to those proposed to account for the motion of patterns defined by luminance variation. This can be accomplished by introducing a special channel for motion processing, which includes an explicit nonlinear processing stage. Alternatively, identifiable second-order features may be tracked to allow judgements of direction of motion. Previously we have demonstrated that the motion of some second-order patterns, such as contrast modulations of sine gratings, can be predicted by spatio-temporal gradient techniques (Johnston & Clifford, 1995 Vision Research 35 1771). The model accomplishes this without the aid of an explicit nonlinear preprocessing stage. Here we show that, using a two-dimensional version of the model, we can extend the class of second-order patterns which can be detected to include contrast modulations of static and dynamic binary noise.
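Illustrative note: the "prevailing view" described in this abstract can be sketched in a few lines of Python. This is not the authors' gradient model; it only shows why a contrast modulation of binary noise is invisible to a linear measurement at the envelope frequency and becomes visible after an explicit nonlinearity (here, full-wave rectification). The carrier, the envelope period, the modulation depth and all function names are assumptions chosen for the example.

```python
import math
import random

def contrast_modulated_noise(n=4000, period=400, depth=0.8, seed=1):
    """Binary noise (+1/-1 about a mean of 0) scaled by a sinusoidal contrast envelope."""
    rng = random.Random(seed)
    return [
        (1.0 + depth * math.sin(2 * math.pi * x / period)) * rng.choice((-1.0, 1.0))
        for x in range(n)
    ]

def envelope_response(signal, period=400):
    """Mean product of the signal with a sinusoid at the envelope frequency (a linear measurement)."""
    n = len(signal)
    return sum(v * math.sin(2 * math.pi * x / period) for x, v in enumerate(signal)) / n

raw = contrast_modulated_noise()
rectified = [abs(v) for v in raw]  # full-wave rectification: the explicit nonlinear stage

print(round(envelope_response(raw), 3))        # close to 0: no linear signal at the envelope frequency
print(round(envelope_response(rectified), 3))  # about depth / 2: the envelope is now a luminance-like signal
```

After rectification the envelope could be passed to any mechanism designed for luminance-defined patterns, which is the two-stage scheme the abstract argues is not strictly necessary.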
Discriminating the motion of contrast-modulated patterns: is there a third way? A Derrington (Psychology Dept, University Park, Nottingham, NG7 2RD, UK; fax: +44 115 951 5324, email: Andrew.Derrington@nottingham.ac.uk) In principle there are three ways that the visual system might analyse the motion of contrast-modulated patterns. It could use the same system of motion detectors that it uses for analysing the motion of luminance modulations; it could compute their motion by tracking the observed changes in their position over time, or it could have a special purpose motion detector designed for contrast-modulated patterns. I shall present a variety of evidence that suggests that when contrast is high, contrast-modulated patterns generate distortion products within the visual system that, in many respects, are processed in the same way as luminance modulations with comparable spatial periods. I shall also show data that suggest that when their contrast is low, the motion of contrast-modulated patterns depends on feature tracking. I shall argue that there is no need to look for a third way to analyse the motion of contrast-modulated gratings.
Nonlinear mechanisms in the geometric illusions M J Morgan (Institute of Ophthalmology, 11-43 Bath Street, London, EC1V 9EL; fax +44 171 608 6850; e-mail: m.j.morgan@ucl.ac.uk) Many of the classical geometric illusions can be seen as resulting from coarse positional coding. In the Poggendorff illusion, for example, the position of the line intersections seems to be miscoded, with a consequent illusion of non-collinearity. A shift in the location of the intersection, corresponding to a local maximum in filter output, is seen in a coarse-filtered version of the Poggendorff figure. Low-pass filtering of the Poggendorff figure enhances the illusion. However, filtering in the luminance domain cannot be invoked to account for the illusion, because the extent of the illusion is not altered in a luminance-balanced figure. The same is true of the Müller-Lyer figure, and the Zöllner figure. The Poggendorff effect is also seen in a version where the parallel lines are replaced by a patch of grating. The orientation of the grating has some effect on the magnitude of the illusion, but not on its direction, which argues against the cross-orientation inhibition theory. These findings suggest that the illusions arise in coarse-scale "collector units" (Morgan and Hotopf, 1989 Vision Research 29 1005-1015; Morgan et al., 1990 Vision Research 30 1793-1810), which are preceded by a rectifying nonlinearity (Morgan, M.J. "The Poggendorff illusion: a bias in the estimation of the orientation of virtual lines by second-stage filters". Vision Research, in press).
Adaptive Spatial Filtering Prior to Edge Analysis: A Nonlinear Image Processing Model for the Perception of Stationary Plaids T S Meese (Vision Sciences, Aston University, Aston Triangle, Birmingham B4 7ET, UK; fax: +44 121 333 4220; e-mail: t.s.meese@aston.ac.uk) Much evidence suggests that early vision contains multiple spatial filters with spatial-frequency (f) and orientation (θ) full-bandwidths around 1.6 octaves and 40°. Clues to the ways these filters are combined have come from the perception of structure in stationary plaids. For example, a pair of superimposed sine-wave gratings can look like either a blurred-checkerboard (combination across orientation) or a pair of overlapping gratings (no filter combination across orientation), depending upon: contrast, angle, remote adaptation and the addition of low-contrast harmonics (see Georgeson & Meese, 1997 Vision Research 37 3255-3271). These results have driven the design of the following image processing model. Multiple basis-filters (see above) are arranged in triplets: a pair of linear even- and odd-filters (F[f, θ, even], F[f, θ, odd]), and a nonlinear 'complex' filter made from their Pythagorean sum (F[f, θ, energy]). At each position in the image, the number (N) and identities (i) of triplets having local response maxima in F[f, θ, energy] across (f, θ) are determined. The response of the complex filter in each triplet is used to normalise the response of similar filters at a three times higher f_i, to give r_3i, which approaches zero in the absence of broad-band image structure at θ_i. N synthetic filters are created by using r_3i to control the range of orientations around θ_i for which summation of even-filters is performed such that isotropy is approached as r_3i approaches zero. The output of each synthetic filter is quantised to 2 grey levels and superimposed to produce an edge map.
In contrast with inflexible filter models, good qualitative agreement is achieved between model output and spatial structure perceived in plaids. An implementation incorporating noise and mediation of filter-combination through a chain of inter-filter links provides a good quantitative account of contrast, angle and adaptation effects.
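Illustrative note: the filter triplet in this abstract (even filter, odd filter, and their Pythagorean sum) can be sketched in one dimension as follows. This is a minimal sketch, not the author's implementation: Gabor filters are assumed as the basis-filters, and the frequency, window width and function names are hypothetical. It shows the property that makes the 'complex' filter useful, namely that its response to a grating barely changes with the grating's phase, whereas the even and odd responses trade off against each other.

```python
import math

def gabor_pair(freq, sigma, half_width):
    """Even (cosine-phase) and odd (sine-phase) filters sharing one Gaussian window."""
    xs = range(-half_width, half_width + 1)
    window = [math.exp(-x * x / (2 * sigma ** 2)) for x in xs]
    even = [w * math.cos(2 * math.pi * freq * x) for w, x in zip(window, xs)]
    odd = [w * math.sin(2 * math.pi * freq * x) for w, x in zip(window, xs)]
    return even, odd

def triplet_response(phase, freq=0.05, sigma=20.0, half_width=80):
    """Responses (even, odd, energy) to a grating at the filters' preferred frequency."""
    even, odd = gabor_pair(freq, sigma, half_width)
    xs = range(-half_width, half_width + 1)
    grating = [math.cos(2 * math.pi * freq * x + phase) for x in xs]
    e = sum(w * g for w, g in zip(even, grating))
    o = sum(w * g for w, g in zip(odd, grating))
    return e, o, math.hypot(e, o)  # energy = Pythagorean sum of the linear pair

for phase in (0.0, math.pi / 4, math.pi / 2):
    e, o, energy = triplet_response(phase)
    print(f"phase={phase:.2f}  even={e:7.2f}  odd={o:7.2f}  energy={energy:6.2f}")
```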
Contour integration and scale integration processes in visual edge detection S C Dakin & R F Hess (McGill Vision Research, Department of Ophthalmology, 687 Pine Avenue West, H4-14, Montréal, Québec, H3A 1A1, Canada; e-mail: scdakin@vision.mcgill.ca) If early visual processing is based on a local Fourier description then the detection of extended visual structure, such as contours, necessitates linking processes between local spatial filters. Such contour integration processes combine the outputs of filters across the visual field; filters whose positions and orientations are mutually consistent with the presence of a contour. However, contour information in the natural visual environment derives mainly from edges, which are defined both by orientation structure and by local, spatial-frequency structure. Specifically, the response of a filter bank (tuned to a range of spatial frequencies) to an edge will contain local phase-alignment (i.e. zero-crossings which coincide at the edge boundary). Thus the representation of edges requires not only contour integration, but scale integration (local combination of filters across spatial frequency). In order to determine how these two types of combination fit together, we measured the detectability of contours composed of broad-band edge elements, alternating with narrow-band Gabor elements. A contour integration system operating independently at a number of spatial scales should be able to ignore the distracting influence of edge structure in such patterns. However, subjects cannot ignore edge structure indicating that local phase-alignment across spatial scale is coded prior to, or concurrent with, contour integration.
Popout, feature conjunctions, and asymmetries in visual search in a model of intracortical interactions in V1. Z Li (Gatsby Computational Neuroscience Unit, University College London, 17 Queen Square, London, WC1N 3AR, U.K.; fax: +44 171 391 1173; email: zhaoping@gatsby.ucl.ac.uk) Visual search for a target among distractors is associated with segmentation. It can be fast, e.g., a vertical line pops out or is instantaneously detectable among horizontal ones, or slower, e.g., a red ‘X’ is hard to spot amongst green ‘X’s and red ‘O’s where the target is defined by a conjunction of redness and ‘X’ness (Treisman and Gelade, 1980 Cognitive Psychology 12 97-136). The ease of search can even be asymmetric with respect to the switches between targets and distractors, e.g., it is easier to spot a long line among shorter ones than vice versa (Treisman and Gormican, 1988 Psychol. Rev. 95 15-48). It has been unclear which neural mechanisms or cortical areas control the ease of search, and no physiological correlates have been found for search asymmetry. I show in a V1 model (Li, 1998 Perception 27 suppl. 45) that intracortical interactions in V1 between nearby layer 2-3 cells play a significant role. The intracortical interactions alter the neural responses (interpreted as saliencies) to targets and distractors according to their own features as well as those of the contextual distractors or targets. Hence, the relative saliencies of targets and distractors, which are assumed to determine the ease of search, depend on the particular target-distractor pair involved. The pattern of the intracortical connections largely determines the minimum feature (e.g., orientation) difference between the target and distractors for pop out (Foster and Ward, 1991 Proc. Roy. Soc. 243 83-86), as well as which feature conjunctions are less salient. Asymmetry is a natural consequence of contextual influences.
The temporal properties of first- and second-order vision A Schofield, M A Georgeson (School of Psychology, The University of Birmingham, Edgbaston, Birmingham, B15 2TT; fax: +44 121 414 4897; e-mail: a.j.schofield@bham.ac.uk) The temporal properties of first- and second-order visual detection mechanisms were derived from studies of temporal integration and two-pulse summation. Second-order stimuli were defined as sinusoidal (2 c/deg) variations in the contrast of a dynamic white noise carrier. First-order stimuli comprised similar sinusoidal variations of luminance either added to the noise carrier or presented alone. Detection thresholds were measured in a 2IFC design where the subjects had to indicate which of two intervals contained the modulation. In the temporal integration experiment thresholds were measured for a single pulse of modulation whose duration varied from 18ms to 1152ms. In the two-pulse summation experiment, sensitivity was measured against the onset asynchrony between two 18ms pulses. Asynchrony varied from 18ms to 288ms and the pulses were either in phase or out of phase. Data from the two-pulse experiment were used to derive temporal impulse response functions for the three types of modulation. These functions were then used to predict performance in the temporal integration experiment. Detection of luminance signals without noise was characterised by a bi-phasic impulse response that implies transient responding to onset and offset of longer pulses of modulation. Luminance modulations in noise produced a mono-phasic impulse response corresponding to a sustained response to longer pulses. Contrast modulations of noise also produced a mono-phasic (sustained) impulse response.
The full-widths at half height of the first- and second-order impulse responses were about 40ms and 50ms respectively (although the latter had a longer tail), indicating that the second-order system is only slightly slower than the first-order system in the presence of noise.
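Illustrative note: the link between impulse-response shape and sustained or transient responding can be sketched as below. The impulse responses here are hypothetical Gaussian lobes, not the functions derived in the study; the lobe width of 17 ms is an assumption chosen so that the mono-phasic response has a full width at half height of about 40 ms. Only the 1152 ms pulse duration is taken from the abstract.

```python
import math

def lobe(t_ms, centre_ms, sigma_ms=17.0):
    """One Gaussian lobe of a hypothetical impulse response."""
    return math.exp(-((t_ms - centre_ms) ** 2) / (2 * sigma_ms ** 2))

def mono_phasic(t_ms):
    return lobe(t_ms, 60.0)

def bi_phasic(t_ms):
    # A positive lobe followed by an equal-area negative lobe.
    return lobe(t_ms, 60.0) - lobe(t_ms, 110.0)

def sample(fn, duration_ms=400):
    """Sample an impulse response at 1 ms steps."""
    return [fn(float(t)) for t in range(duration_ms)]

def full_width_half_height(samples):
    """Width (ms) of the main peak at half its height, with linear interpolation.
    Assumes the response drops below half height on both sides within the window."""
    peak = max(samples)
    half = peak / 2.0
    left = right = samples.index(peak)
    while samples[left] > half:
        left -= 1
    while samples[right] > half:
        right += 1
    t_left = left + (half - samples[left]) / (samples[left + 1] - samples[left])
    t_right = right - (half - samples[right]) / (samples[right - 1] - samples[right])
    return t_right - t_left

def pulse_response(impulse, pulse_ms):
    """Convolve the impulse response with a rectangular pulse of unit height."""
    out = []
    for t in range(len(impulse) + pulse_ms - 1):
        lo = max(0, t - pulse_ms + 1)
        hi = min(t, len(impulse) - 1)
        out.append(sum(impulse[lo:hi + 1]))
    return out

mono = sample(mono_phasic)
bi = sample(bi_phasic)
print(round(full_width_half_height(mono), 1))   # about 40 ms for a 17 ms lobe
sustained = pulse_response(mono, 1152)
transient = pulse_response(bi, 1152)
print(round(sustained[600], 1), round(transient[600], 1))  # mid-pulse: large vs near zero
```

In this sketch the bi-phasic response to the long pulse is confined to a positive burst at onset and a negative burst at offset, while the mono-phasic response stays high for as long as the pulse lasts.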
The relationship between the flicker motion aftereffect (FMAE) and the velocity aftereffect (VAE). M J Wright (Department of Human Sciences, Brunel University, Uxbridge, UB8 3PH, U.K.; fax +44 1895 237573; e-mail: Michael.Wright@brunel.ac.uk) Flicker motion aftereffects (FMAE) are typically seen when a counterphase grating is used as a test stimulus and a drifting grating as an adapting stimulus. FMAEs differ in their properties from the MAE obtained with a stationary test stimulus, and it has been proposed that FMAEs are sensitive to second-order as well as first-order motion. FMAEs differ from MAEs in being velocity sensitive rather than temporal frequency sensitive, but in this respect they resemble velocity aftereffects (VAE). A counterphase grating may be regarded as the sum of two drifting gratings of equal and opposite velocity. Might FMAE be understood simply as the sum of the VAEs of its components? Counterphase test gratings were used to measure FMAE, using a single-interval forced-choice judgement of net direction. Either the relative contrast or the relative velocities of the component drifting gratings was varied to generate a set of counterphasing test stimuli with different values of directional bias. The effect of the same adapting stimulus could then be measured on the counterphase grating (superimposed components) or on the drifting gratings of which it was composed (separated components). The task for the subject in either case was simply to report the predominant direction of motion in the test stimulus. The results indicate that FMAE is larger than the sum of its component VAEs, that is, in the superimposed condition, a greater imbalance in the test stimulus components is required to null the aftereffect than in the separated condition.
Moreover, after adaptation to a plaid, VAE is maximum in the direction perpendicular to the plaid’s components, whereas FMAE tuning is related to pattern direction over a broad range of component directions. It is concluded that FMAE is influenced by later stages of motion analysis than VAE.
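Illustrative note: the decomposition used in this abstract (a counterphase grating as the sum of two drifting gratings of equal and opposite velocity) can be checked numerically with the sketch below. The spatial and temporal frequencies, the contrast values and the `directional_bias` index are assumptions made for the example, not the study's stimulus parameters or its definition of bias.

```python
import math

def drifting_grating(x, t, spatial_freq, temporal_freq, direction, contrast):
    """Sinusoidal grating drifting rightward (direction=+1) or leftward (direction=-1)."""
    return contrast * math.cos(2 * math.pi * (spatial_freq * x - direction * temporal_freq * t))

def component_sum(x, t, c_right, c_left, spatial_freq=1.0, temporal_freq=4.0):
    """Test stimulus built from two superimposed, oppositely drifting components."""
    return (drifting_grating(x, t, spatial_freq, temporal_freq, +1, c_right)
            + drifting_grating(x, t, spatial_freq, temporal_freq, -1, c_left))

def counterphase_grating(x, t, contrast, spatial_freq=1.0, temporal_freq=4.0):
    """Standing wave: a static grating whose contrast reverses sinusoidally in time."""
    return contrast * math.cos(2 * math.pi * spatial_freq * x) * math.cos(2 * math.pi * temporal_freq * t)

def directional_bias(c_right, c_left):
    """Hypothetical bias index: 0 for a balanced stimulus, +1 for pure rightward motion."""
    return (c_right - c_left) / (c_right + c_left)

# With equal component contrasts the sum is exactly a counterphase grating.
for x, t in [(0.0, 0.0), (0.13, 0.02), (0.4, 0.11)]:
    print(round(component_sum(x, t, 0.5, 0.5), 6), round(counterphase_grating(x, t, 1.0), 6))
```

Unequal values of `c_right` and `c_left` give a test stimulus with a net directional bias, which is the manipulation the abstract uses to null the aftereffect.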
Detecting deviations from linear flow in colour space D H Foster, S M C Nascimento#, K A Amano (Vision Sciences, Aston University, Birmingham B4 7ET, UK; #Department of Physics, University of Minho, 4709 Braga Codex, Portugal; fax: +44 121 333 4220; e-mail: d.h.foster@aston.ac.uk) The colour of the light reflected from a uniform surface depends on the illuminant. When the illuminant changes, the colour of the reflected light changes, and this change can be plotted as a vector in some appropriate colour space, that is, as a directed line segment from the initial to final colour. If there are many surfaces rather than one, then a family of vectors is obtained, which locally have similar lengths and fall almost parallel to each other, defining a locally linear "flow" in the colour space. There are, however, some naturally occurring surfaces and illuminant changes for which the magnitude and direction of these vectors differ from those of the majority. Can human observers detect such deviations? Data are reported from a colour-matching experiment suggesting that observers can distinguish between flows with and without deviations. Other data are considered showing that observers tend to attribute flows to illuminant changes and deviations to surface-reflectance changes, even when the deviations are actually due to illuminant changes.
Psychological colour diagram L D Griffin (Vision Sciences, Aston University, Birmingham, B4 7ET, UK; fax: +44 121 333 4220; e-mail: l.d.griffin@aston.ac.uk) Wittgenstein (1950) asserted that "Red is more akin to yellow than to blue". To investigate the universality of such judgements, subjects answered questions of the form "which pair is more similar A & B or C & D?"; where A, B, C and D were drawn from the eleven basic colour terms (Berlin & Kay 1969). This investigation is psychological as compared to previous experiments which, by their use of coloured stimuli, should be termed psychophysical. A total of 47557 answers were collected from 194 subjects. The questions elicited varying levels of agreement. For example, 38 vs. 0 chose Brown & Black as more similar than Pink & Green; whereas subjects split 17 vs. 17 over Green & Red vs. Brown & White. A model with deterministic and stochastic components was fitted. The deterministic component was a distance structure over the eleven colours. The stochastic component was a function predicting the response rate to questions based on the distance structure. The function was constrained to be consistent with the metric. A χ² test showed the fit to be statistically significant. It was found that the fit of the model could be maintained if direct connections between certain pairs of colours (e.g. black and white) were removed. The distance between a pair of colours without a direct link was the shortest distance via other colours. Of the 55 pairs of colours, it was possible to remove direct links between 22 pairs and still achieve a fit. The selection of links removed accorded well with expectation (e.g. yellow/black and pink/green) as did the shortest indirect routes found by the model (e.g. yellow/brown/black and pink/grey/green). The model can be well represented in a single diagram.
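Illustrative note: the rule that "the distance between a pair of colours without a direct link was the shortest distance via other colours" is an all-pairs shortest-path computation. The sketch below uses the Floyd-Warshall algorithm as one standard way to do this; the abstract does not say which method was used, and the link lengths are invented for the example.

```python
import math
from itertools import combinations

def shortest_distances(colours, direct_links):
    """All-pairs shortest distances over an undirected graph of colour terms.
    direct_links maps (colour_a, colour_b) to the length of a retained direct link;
    pairs that are absent have no direct link and must be reached via other colours."""
    dist = {a: {b: (0.0 if a == b else math.inf) for b in colours} for a in colours}
    for (a, b), length in direct_links.items():
        dist[a][b] = dist[b][a] = length
    for via in colours:  # Floyd-Warshall relaxation
        for a in colours:
            for b in colours:
                if dist[a][via] + dist[via][b] < dist[a][b]:
                    dist[a][b] = dist[a][via] + dist[via][b]
    return dist

# Hypothetical link lengths; the yellow/black direct link has been removed.
colours = ["yellow", "brown", "black"]
links = {("yellow", "brown"): 1.0, ("brown", "black"): 1.5}
dist = shortest_distances(colours, links)
print(dist["yellow"]["black"])  # 2.5, via the route yellow/brown/black

# Eleven basic colour terms give 55 unordered pairs, as stated in the abstract.
print(len(list(combinations(range(11), 2))))  # 55
```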
Coding chromatic and achromatic nonlinearities in natural scenes M G A Thomson (Vision Sciences, Aston University, Birmingham B4 7ET; fax: +44 121 333 4220; e-mail: m.g.a.thomson@aston.ac.uk) It has been argued that real-world visual scenes are, to a certain extent, members of an ensemble with common statistics, and that visual neural representations may thus be perceptually matched to those ensemble statistics. Natural images do display first- and second-order statistical regularities which are consistent from image to image; these regularities are usually defined such that they can be quantified by the images’ power spectra, and so must arise as a result of globally linear stochastic processes. Only higher-order statistics are capable of quantifying globally nonlinear stochastic processes, and although it has proved difficult to demonstrate higher-order consistencies in natural images, such an approach is motivated by the observation that image phase spectra (whose structure depends on higher-order image statistics only) appear to convey much more visual information than image power spectra. The present study defines some simple higher-order measures and describes their application to the achromatic, red-green and blue-yellow channels of a number of coloured natural images. The higher-order statistics---in particular, measures of image sparseness---measured in the achromatic channel are shown to be quite different in nature from those measured in the chromatic channels. This finding is related to existing psychophysical results regarding the processing of colour information in natural scenes.
Is the "collapse" of stereopsis in isoluminant random-dot stereograms due to a failure of second-order mechanisms? D R Simmons, F A A Kingdom#, and Stéphane Rainville# (Department of Vision Sciences, Glasgow Caledonian University, City Campus, Cowcaddens Road, Glasgow G4 0BA, Scotland; #McGill Vision Research, Department of Ophthalmology, 687 Pine Avenue West H4-14, Montréal, Québec, Canada H3A 1A1; fax +44 141 331 3387; e-mail: drsi@gcal.ac.uk) One potential mechanism underlying the so-called "collapse" of stereopsis in isoluminant random-dot stereograms (RDSs) is the failure of second-order stereopsis. Simmons and Kingdom (1995 JOSA A 12 2094-2104) showed that stereopsis at isoluminance was particularly impaired when the disparities were large relative to the spatial scales of the carriers of Gabor stimuli. It is generally thought that stereopsis in this disparity range is subserved by second-order stereopsis mechanisms. Kovács and Fehér (1997 Vision Research 37 1167-1175) have suggested that depth perception at large disparities with bandpass filtered RDSs is also subserved by a second-order mechanism. We compared contrast thresholds for depth identification (front/back) for isoluminant red-green and isochromatic yellow-black low-pass filtered RDSs with those obtained using concordant figural stimuli (rectangular bars), at a range of disparities. It was found that (1) at optimal disparities for the task, depth judgements were no more impaired at isoluminance with random-dot stereograms than when the stimulus was figural; (2) at large disparities depth judgements with RDSs were particularly impaired at isoluminance and (3) when the judgements were form based (i.e. is the disparate region a vertical or horizontal rectangle?) then judgements in the optimum disparity range were, relatively, much more impaired at isoluminance.
These results are consistent with the idea that, while second-order stereopsis is impaired at isoluminance, this is not the cause of the oft-observed demonstrations of stereoscopic "collapse". We suggest instead that the culprit is a combination of the chromatic stereopsis mechanism's relatively poor contrast sensitivity and a particular deficit in its processing of stereoscopic form. [Supported by a grant from MRC (Canada) no. 11554 to FK and an NSERC (Canada) PGS B Fellowship to SR]
Classical colour constancy: an improved measure K A Amano, D H Foster (Vision Sciences, Aston University, Birmingham B4 7ET, UK; fax: +44 121 333 4220; e-mail: k.amano@aston.ac.uk) Colour constancy refers to the invariant perception of surface colour under changes in illuminant. It is commonly assessed by observers making asymmetric colour matches between patches embedded in separate Mondrian-like patterns presented simultaneously under individual illuminants. The typical result of such measurements is that if the eye is not allowed to become adapted to the separate illuminants, the extent of the colour constancy is variable and often limited. The aim of this study was to test the hypothesis that colour-matching measures of colour constancy can be improved if more use is made of transient cues which are thought to be associated with the computation of spatial ratios of cone excitations. These cues should be enhanced if patterns are presented sequentially, in the same location, rather than simultaneously, side-by-side. As predicted, colour constancy averaged over six observers was found to be significantly better with sequential rather than simultaneous presentation.
Edge detection: how much calculus does early vision know? G S A Barbieri, M A Georgeson (School of Psychology, University of Birmingham, Birmingham B15 2TT; fax: +44 121 414 4897; e-mail: G.S.A.Barbieri@bham.ac.uk) Finding the location of luminance edges is a key process in early human and machine vision. In machine vision, the usual definition of an edge is a point at which intensity changes most steeply across space - a peak in the (smoothed) gradient magnitude or first derivative. Such edge points can also be found as zero-crossings (ZCs) in the second derivative. Our previous psychophysical work (Barbieri, G., 1996 M.Sc. Thesis, University of Birmingham) supports the use of local derivatives of luminance (rather than local energy) in human vision, but does not distinguish between the use of 1st and 2nd derivatives in locating edges. We have now designed stimuli that differ in their gradient profiles but have identical 2nd (and higher) derivatives. The basic idea is borrowed from luminance increment detection, translated into the gradient domain. Stationary 1-D stimuli consisted of a uniform luminance gradient with a localised increment or decrement of gradient - the 'blip'. Thus the blip always gave a ZC, but it was either a peak or trough of gradient magnitude. The data suggest that the detectability of the gradient blip was determined by its absolute height (the change in gradient) irrespective of the magnitude of the background luminance ramp. Observers could reliably distinguish between increments and decrements, and only the gradient peaks were classified as single edges. Thus, contrary to several models, analysis of spatial structure cannot be based solely on the 2nd and higher spatial derivatives.
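Illustrative note: the stimulus logic of this abstract can be sketched numerically. A luminance ramp is built whose gradient carries a localised increment or decrement (the 'blip'); the Gaussian blip shape, its size and the function names are assumptions for the example, not the stimuli used in the study. The sketch shows that the second derivative has a single zero-crossing at the blip in both cases and is unchanged by the steepness of the background ramp, whereas only the increment produces a peak in the gradient.

```python
import math

def luminance_profile(blip, base=1.0, sigma=5.0, half_width=50):
    """Luminance ramp whose gradient is `base` plus a Gaussian blip of height `blip`."""
    gradient = [base + blip * math.exp(-x * x / (2 * sigma ** 2))
                for x in range(-half_width, half_width + 1)]
    luminance = [0.0]
    for g in gradient:  # integrate the gradient to obtain luminance
        luminance.append(luminance[-1] + g)
    return luminance

def derivative(values):
    """First difference, used as a discrete spatial derivative."""
    return [b - a for a, b in zip(values, values[1:])]

def zero_crossings(values, eps=1e-9):
    """Indices where neighbouring samples change sign (ignoring rounding noise below eps)."""
    return [i for i, (a, b) in enumerate(zip(values, values[1:]))
            if a * b < 0 and min(abs(a), abs(b)) > eps]

centre = 50  # index of the blip centre in the first derivative
for label, blip in (("increment", 0.5), ("decrement", -0.5)):
    first = derivative(luminance_profile(blip))
    second = derivative(first)
    is_peak = first[centre] == max(first)
    is_trough = first[centre] == min(first)
    print(label, "ZCs:", zero_crossings(second), "gradient peak:", is_peak, "trough:", is_trough)
```

A mechanism that reads only the second derivative would treat the two stimuli alike apart from the polarity of the zero-crossing, which is why the abstract's finding that only gradient peaks are seen as single edges bears on models based solely on the 2nd and higher derivatives.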
Integration of oriented contours across space S J Guest, M A Georgeson (School of Psychology, University of Birmingham, Birmingham B15 2TT; fax: +44 121 414 4897; email: m.a.georgeson@bham.ac.uk) Recent evidence suggests that luminance-modulated (LM) and contrast-modulated (CM) gratings are analysed in human vision by separate mechanisms, at least for detection tasks. We have asked whether LM and CM information is later combined to improve performance on perceptual tasks, such as orientation discrimination. When LM and CM were spatially superimposed we found little or no improvement compared with LM or CM alone - hence no perceptual integration of the two cues. Here we consider the integration of contour information across space, using spatially separated patches of grating. The stimuli consisted of either one or two small patches of grating (2 c/deg, 0.75 deg diameter patch) presented in a static binary noise field, in separate locations above and below the central fixation point. The gratings were modulated either in contrast (CM) or luminance (LM). Orientation discrimination thresholds were measured using a 2-AFC task and a staircase procedure for single (LM or CM) patches, and for pairs of patches (LM/LM, CM/CM, LM/CM), all presented at 4 times detection threshold. Results so far reveal very marked improvements in orientation discrimination for two patches over a single patch only when both patches are of the same type (i.e. LM/LM or CM/CM) and the bars are globally aligned. With pairs of unlike patches or misaligned bars, discrimination thresholds show only modest improvements over a single patch. These data suggest that integration of collinear sub-units by higher-order 'collator' units occurs only for sub-units of the same type: hence collator units are also specific for LM and CM information.
Is kurtosis a useful measure of the effectiveness of visual coding models? B Willmore, P A Watters#, J S Lauritzen, D J Tolhurst (Department of Physiology, Downing Street, Cambridge CB2 3EG, UK; fax: +44 1223 333 840; e-mail: bw200@cam.ac.uk; #Department of Computing, Macquarie University NSW 2109, Australia) It is widely assumed that the receptive fields of neurons in area V1 perform sparse coding, which implies that the response distributions of the neurons should be leptokurtic. However, it has been suggested (Baddeley, 1996 Network 7 409-421) that kurtosis is an ‘uninteresting’ property of response distributions – strongly leptokurtic distributions can be expected from any zero-D.C. filters exposed to images with uneven local intensity variance. We have been investigating the response distributions of the basis functions of several models of processing in V1. Three prominent models – Principal Components Analysis, Independent Components Analysis and the Olshausen-Field generative model (Olshausen and Field, 1997 Vision Research 37 3311-3325) – produce basis functions giving strongly leptokurtic distributions on convolution with natural scenes. However, two extremely simple models – Gabor wavelets, localised Fourier analysis (sinusoidal grating basis functions) – also produce very similar leptokurtic distributions. Moreover, for Gabor wavelets, the kurtosis is significantly reduced by adding the non-linear but physiologically-plausible step of local mean-luminance normalisation (a model of light adaptation in the retina). We expect the same effect from other spatially-localised basis functions. These findings tend to support Baddeley’s suggestion that kurtosis alone is not a sufficient measure of the effectiveness of coding.
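Illustrative note: Baddeley's argument, as summarised in this abstract, can be sketched with simulated data. The filter responses below are not computed from images; they are assumed to be zero-mean Gaussian draws whose standard deviation is either the same everywhere or differs between hypothetical image regions. The region count, the standard deviations and the function names are all invented for the example.

```python
import random

def excess_kurtosis(values):
    """Fourth standardised moment minus 3, so a Gaussian distribution scores about 0."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0

def simulated_responses(local_sds, per_region=2000, seed=0):
    """Zero-mean responses drawn region by region, each region with its own local SD."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sd) for sd in local_sds for _ in range(per_region)]

even_variance = simulated_responses([1.0] * 10)         # same local variance everywhere
uneven_variance = simulated_responses([0.2, 2.0] * 5)   # low- and high-variance regions mixed

print(round(excess_kurtosis(even_variance), 2))    # near 0: not leptokurtic
print(round(excess_kurtosis(uneven_variance), 2))  # clearly positive: leptokurtic
```

The pooled distribution becomes leptokurtic purely because the local variance is uneven, without any property of the filter contributing, which is the sense in which high kurtosis alone says little about how effective a code is.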
The AVA is a registered charity (No: 1049146)