Promoting vision research and its applications
AVA 2004 - AVA Annual Meeting - Visual Perception and Encoding
31 March 2004
The Geoffrey J. Burton memorial lecture was given by Dr Anya Hurlbert, University of Newcastle: "How our brain encodes colours of real objects"
Abstracts of Meeting
PAPER PRESENTATIONS
Manipulating contour smoothness: Evidence that the association-field model underlies contour integration in the periphery. P. George Lovell Field et al. (1993) proposed that an association field model (AFM) underlies performance in path-paradigm (PP) tasks. The AFM dynamically integrates the outputs of filters with different orientation preferences. In the current study, simulations examined whether PP tasks could be solved by a simple-filter model (SFM). The SFM posits that 2AFC decisions are based upon the maximum length of zero-bounded regions after convolution of stimuli with elongated filters. For the SFM, integration only occurs between the outputs of co-oriented filters. In contrast to Hess and Dakin (1999), initial simulations found that manipulations of Gabor patch phase were an inadequate control for the contribution of the SFM towards PP performance. In a further simulation, the angular difference between neighbouring elements was held constant, while the global smoothness of contours was varied. The SFM favoured jagged contours and was relatively impaired in the detection of smoother contours. Conversely, human observers favoured smoother contours in the fovea and parafovea (13°). Whilst the SFM could account for the detection of jagged and randomly structured contours, it is inadequate as an account of the detection of smooth contours. Consequently, the AFM may provide a parsimonious account of contour integration across the whole visual field. Field, D.J., Hayes, A., & Hess, R.F. (1993). Contour Integration by the Human Visual-system: Evidence for a Local "Association Field". Vision Research, 33(2), 173-193.
Transfer of tilt-after-effects between second-order cues Alice G. Cruickshank and Andrew J. Schofield Background: Second-order cues are visual stimuli which are detectable by human observers, while not eliciting a peak in Fourier energy which corresponds to their perceptual properties. The most commonly studied exemplars of second-order cues are those defined by modulation of local contrast (CM). It is widely accepted that such cues are initially detected separately from first-order, luminance-modulated (LM) cues. However, after-effects have been shown to transfer between first-order LM cues and second-order CM cues (Georgeson & Schofield, 2002, Spatial Vision, 16, 59-76). This suggests the existence of a late link in the mechanisms which subserve their processing. Methods: To extend the investigation of the mechanisms for processing second-order cues we consider cues defined by modulations in local orientation (OM), using a tilt-after-effect (TAE) paradigm. Results: We found partial transfer of adaptation between LM and OM cues, confirming the presence of a link between first- and second-order cues. Further, between OM and CM cues we found a partial transfer of TAE. Conclusions: These results suggest that, at or before the site of adaptation, information from all visual cues is combined. However, as transfer of adaptation is below 100% in all cases, this is only a partial integration of information.
Spatial variation in the statistics of binocular disparity Paul B. Hibbard* and Julie M. Harris+ *School of Psychology, University of St Andrews, St Andrews, KY16 9JP, UK. There has been much interest in recent years in the statistical properties of natural images. We present some predictions for the expected statistics of binocular disparity. This is a particularly interesting case to consider since binocular disparity will depend on both the structure of the environment and the viewing geometry, and will vary from one image location to another. The current study focussed on this expected spatial variation in relative horizontal binocular disparity, and how it is influenced by the structure of the environment. A simple three-dimensional collage model was employed, in which disparity distributions were obtained for an environment that was filled with randomly positioned, opaque spheres (Hibbard, 2004, Visual Cognition). Spatial variations in disparity were assessed by calculating (i) the Fourier amplitude spectrum of horizontal binocular disparity, (ii) the variance of disparity in local image neighbourhoods, and its relationship to object distance, and (iii) the effects of simple physical constraints, such as the presence of a horizontal ground plane or table-top. Amplitude spectra showed a typical 1/f fall-off with spatial frequency. The variance of disparity in a local neighbourhood showed a clear inverse relationship with the distance to objects in that neighbourhood, showing that relative disparities provide information that, in principle, could be used to provide estimates of absolute object distance (Glennerster et al, 1998, Perception, 27, 1357). Finally, introduction of a horizontal ground plane or table-top results in a clear gradient in the expected disparity.
In sum, these results show that the statistics of binocular disparity are predicted to be spatially non-stationary, and that spatial variations can be related directly to the three-dimensional structure of the environment.
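The inverse relationship between local disparity variance and object distance reported above can be illustrated with a small simulation. This is only a sketch and not the authors' collage model: it assumes the small-angle approximation disparity ≈ I(1/d − 1/F) for interocular separation I, point distance d and fixation distance F, and the interocular value, depth spread and function names are illustrative assumptions.

```python
import numpy as np

INTEROCULAR_M = 0.065  # assumed interocular separation in metres


def disparity(distance_m, fixation_m):
    """Small-angle approximation to horizontal disparity (radians).

    Positive values correspond to points nearer than fixation.
    """
    return INTEROCULAR_M * (1.0 / distance_m - 1.0 / fixation_m)


def local_disparity_sd(mean_distance_m, depth_spread_m=0.05, n=10000, seed=0):
    """SD of disparity for points scattered in depth around one distance."""
    rng = np.random.default_rng(seed)
    # Points uniformly scattered in depth around the neighbourhood's distance
    d = mean_distance_m + rng.uniform(-depth_spread_m, depth_spread_m, n)
    return float(np.std(disparity(d, fixation_m=mean_distance_m)))


# The same physical depth spread yields a smaller disparity spread further away
sd_near = local_disparity_sd(1.0)
sd_far = local_disparity_sd(2.0)
ratio = sd_near / sd_far
```

Under this approximation the disparity spread for a fixed depth spread falls off roughly with the square of distance, so doubling the distance should give a ratio of roughly 4.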
Lower visual field advantage during motion segmentation Louise Lakha* and Glyn Humphreys+ *Department of Human Sciences, Brunel University, Uxbridge, Middlesex, UB8 3PH. A series of visual enumeration tasks were conducted investigating the role of the dorsal visual stream in motion segmentation. Enumeration places greater stress on segmentation processes compared with visual search as it involves selecting several items, all of which must compete successfully with distracters for limited attentional resources (Watson & Humphreys, 1999). We looked for differences in processing displays in the upper versus lower visual field. The lower visual field has greater connections with the parietal cortex and should therefore show an advantage for processes driven by the dorsal stream (Previc, 1990). In a baseline condition, random configurations of moving and static items were presented briefly (200ms) to the upper or lower visual field. Fast and efficient enumeration, known as subitization, took place both for moving targets and for static targets. In a separate task requiring motion segmentation, moving targets were presented with static distracters and static targets were presented with moving distracters. Performance was poor for enumerating static targets among moving distracters. For the moving displays, a lower visual field advantage was found when the inclusion of static distracters demanded segmentation by motion. This effect disappeared when the targets were presented in canonical patterns and appeared not to result from easier task conditions since there was no lower field advantage even for the worse performers. The results are consistent with a dependence of motion segmentation processes on dorsal regions of the visual cortex that show greater sensitivity to the lower visual field and to magnocellular-based input.
Deduction of Feature Categories from Colour Images Lewis D Griffin Imaging Sciences, King's College London, London. In Marr's Primal Sketch model, qualitative descriptions (edge, bar, etc.) of local image structure are computed from the quantitative measurements of co-localized linear visual neurons. He leaves open what the complete list of feature categories is, and how the neuron joint activity space is partitioned into sub-regions each of which corresponds to a particular feature type. Our concern is how to rate candidate systems of feature categories. Various criteria could be used to rate proposed systems of feature categories. The one studied in this work is that categories are preferred that make features stable to modifications of little importance (e.g. one might like the pattern of edges to change little with scene lighting). Initial experiments using ten RGB images as a source of data relevant to this criterion have been carried out. To make use of RGB images, the colour bands were treated as if they were distinct grey-level images of the same scene but acquired under different coloured illuminants. Gaussian derivative filters of 1st and 2nd order (5 filters) were applied to all images. From the five measurements per pixel, three parameters were calculated: Koenderink's 2nd order shape index, a term capturing the degree to which 1st or 2nd order structure dominates, and a combined 1st and 2nd order amplitude term. Partitions of this three-parameter space into a five-category system of features (featureless, edge, bright bar, dark bar and hyperbolic) were evaluated. The class of partitions considered was parameterized by three thresholds: one on the amplitude term separated 'featureless' from the other features; one on the order term separated 'edges' from the 'bar' and 'hyperbolic' features; and one on the shape term separated the two types of 'bar' from the 'hyperbolic' feature.
For each triple of partition-defining thresholds, the degree to which the features computed for different colour bands (R, G & B) agreed was scored. R vs. G, R vs. B, and G vs. B comparison scores were summed. Scores for the ten test images were summed. The scoring formula included a correction for the degree of agreement that would occur by chance. The values of the three thresholds that maximized feature agreement were determined. Although it has not yet been subjected to objective assessment, the feature categories that were derived correspond well with the categories that are obtained by manually adjusting the thresholds so that the 'features look right'. This supports the hypothesis that feature categories can be derived by considering the separate channels of RGB images.
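The abstract does not state the scoring formula. As a hedged illustration of what a chance-corrected agreement score between colour bands could look like, the sketch below uses Cohen's kappa, which is a stand-in chosen here and not necessarily the author's measure. The function names and the integer coding of the five categories (0 to 4) are hypothetical.

```python
import numpy as np


def chance_corrected_agreement(labels_a, labels_b, n_categories=5):
    """Cohen's kappa between two per-pixel feature-label maps (integer codes)."""
    a = np.asarray(labels_a).ravel()
    b = np.asarray(labels_b).ravel()
    observed = float(np.mean(a == b))
    # Agreement expected by chance from the two marginal label frequencies
    pa = np.bincount(a, minlength=n_categories) / a.size
    pb = np.bincount(b, minlength=n_categories) / b.size
    expected = float(np.sum(pa * pb))
    if expected >= 1.0:
        return 0.0  # degenerate case: both maps use a single shared category
    return (observed - expected) / (1.0 - expected)


def band_agreement(labels_r, labels_g, labels_b, n_categories=5):
    """Sum of the R vs. G, R vs. B and G vs. B scores for one image."""
    return (chance_corrected_agreement(labels_r, labels_g, n_categories)
            + chance_corrected_agreement(labels_r, labels_b, n_categories)
            + chance_corrected_agreement(labels_g, labels_b, n_categories))
```

In a threshold search of the kind described, a score like `band_agreement` would be summed over the test images for each candidate triple of thresholds, and the triple with the highest total retained.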
Temporal Adaptability and the Inverse Relationship to Sensitivity: A System Identification Model K. Langley† and T.J. Atherton* †Department of Psychology, University College London, London. *FixerLabs Ltd, Watford. Following a prolonged period of visual adaptation to a temporally modulated sinusoidal luminance pattern, the threshold contrast of a similar pattern is elevated. The threshold contrast effect: is selective for spatial frequency; may saturate at low adaptor contrasts; and increases as a function of the spatio-temporal frequency of the adapting signal. A model for signal extraction capable of explaining the threshold contrast effect is proposed. The model accounts for the threshold contrast effect by the estimation of the parameters of an environmental model of the visual signal at a stage of spatial bandpass filtering. The adaptability of threshold contrast as a function of signal frequency is explained by the competing goals of maximal noise suppression and minimal signal bias at the stage of bandpass filtering. The proposed model supports the hypothesis that the adaptability of threshold contrast is governed by non-predictable signal variations present in the visual signal, and represents an internal adjustment that takes into account unpredictable signal variations given the possibility for signal corruption by additive noise.
Energy efficiency in early sensory coding B. Vincent and R. Baddeley Department of Experimental Psychology, 8 Woodland Road, University of Bristol, Bristol BS8 1TN, UK. In many biological systems, evolution has found solutions that balance function and structure with metabolic expense. This certainly seems to be the case in the energetically expensive locomotor system, and so maybe similar efficiency optimisations exist in the central nervous system that are also energy expensive. This notion is tested against three sensory coding systems which have been well characterised, these are monochromatic and chromatic sensitive neurons in the early visual system and sound sensitive neurons of the auditory system. Simple linear models are constructed to make predictions of the optimal receptive fields that balance information coding with energy efficiency. More specifically, synaptic energy efficiency is examined and is found to predict many aspects of luminance and spatio-chromatic as well as auditory coding. These results support previous work that claims efficiency of firing rates to be a factor in cortical coding of visual images. B. Vincent & R. Baddeley, (2003). Vision Research, 43, 1283–1290
Orientation-texture-defined form: A computational model T.J. Atherton* and K. Langley† *FixerLabs Ltd, Watford. We propose a computational model of spatial frequency channels for orientation-texture-defined (OTD) form. The model accounts for second stage OTD form where foreground objects are differentiated from their background by virtue of their orientation. The model extends the first stage energy and phase orientation model proposed by Atherton (2001). It processes energy orientation outputs of the first stage mechanisms with second stage processes each identical to the first stage. The second stage processing provides energy and phase orientations; as such, it marks OTD edges and OTD bars. It naturally extends to higher-order second-stage orientation symmetries. The model is consistent with much of the current understanding of early processing in mammalian visual cortex. We illustrate the model with results of OTD form processing on example images, some of which have hitherto been a challenge to computational models. The processing may have wide implications for psychophysics, for an understanding of the variety of “complex-cell” properties found in visual cortex, and for subsequent processing.
Encoding the Distributions of Local Motion Signals Johannes M. Zanker Department of Psychology, Royal Holloway University of London, Surrey TW20 0EX, England. For both biological and artificial vision systems, distributions of motion signals are a rich source of information because they provide essential cues about the movement of the observer and the three-dimensional structure and texture of the environment. To understand the early stages of encoding motion information in the visual systems, a biologically motivated model of motion detection can be used as a starting point for computer simulations.
The information available to visual systems under various stimulus conditions can be studied by assembling simple, ‘elementary’, motion detectors in extended two-dimensional arrays, which generate signal maps reflecting the distribution of motion in the sensor input. The value of this model can be demonstrated for a wide range of scenarios from psychophysics and visually controlled behaviour in virtual and natural environments. Two examples are the focus of this presentation. (i) How can small involuntary eye movements lead to motion illusions when an observer is looking at simple geometrical patterns, such as those frequently found in Op Art images? (ii) What kind of structure in real-life flowfields can be extracted from image sequences that were generated by moving a video capture system through natural environments?
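The abstract does not specify the detector. A minimal sketch of an array of elementary motion detectors is given below, assuming a correlation-type (Reichardt-style) detector with a simple frame delay in place of a temporal filter and a one-dimensional array in place of the two-dimensional arrays described. The function names, the delay and the sampling distance are illustrative assumptions.

```python
import numpy as np


def emd_response(frames, delay=1, spacing=1):
    """Opponent output of a 1-D array of correlation-type motion detectors.

    frames: array of shape (time, x). Positive output signals rightward motion.
    """
    frames = np.asarray(frames, dtype=float)
    left = frames[:, :-spacing]    # input at position x
    right = frames[:, spacing:]    # input at position x + spacing
    # Each half-detector multiplies a delayed input with its undelayed neighbour
    rightward = left[:-delay] * right[delay:]
    leftward = right[:-delay] * left[delay:]
    return rightward - leftward


def drifting_grating(n_frames=40, width=64, step=1, period=16):
    """Sine grating shifted by `step` pixels per frame (positive = rightward)."""
    t = np.arange(n_frames)[:, None]
    x = np.arange(width)[None, :]
    return np.sin(2 * np.pi * (x - step * t) / period)


# Signal maps for the two motion directions
map_right = emd_response(drifting_grating(step=1))
map_left = emd_response(drifting_grating(step=-1))
```

The sign of the averaged map indicates the direction of motion: positive for the rightward grating and negative for the leftward one.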
Lateral masks are more effective interocularly Joshua A Solomon and Michael Morgan Dept. of Optometry & Visual Science, City University, Northampton Square, London EC1V 0HB. Monocular blur can be suppressed when the other eye's image is in focus. (This is the Monovision phenomenon.) We wondered whether a monocular target would be just as hard to see when the in-focus mask was displayed to the same eye. Adapting Simpson's method for measuring interocular suppression of monocular blur, we measured threshold contrast for detecting a monocularly viewed Gabor pattern surrounded by a blurry frame (also viewed monocularly). Adding an unblurred frame to either eye's image had little effect. When the blurry frame was removed, the unblurred frame still had no effect when added to the target's eye, but nearly doubled threshold when added to the other eye. These results echo contrast-matching results discussed by Meese at last year's meeting. Although its effect on detection is small when a blurry frame appears in the target's eye, the unblurred frame produces a rightward and (slightly) upward shift of the dipper-shaped function mapping pedestal contrast to increment threshold, when presented to the other eye. Model fits suggest the variance of visual signals elicited by the Gabor pattern increases with the intensity of the interocularly presented frame at the same rate as it increases with the intensity of the Gabor pattern.
Salience predicts change detection in pictures of natural scenes Michael J Wright Department of Human Sciences, Brunel University, Uxbridge, Middlesex UB8 3PH. Salience measurements were obtained for a set of stimuli that had previously been used in a “Change Blindness” study (Wright, Shah & Alston, AVA Xmas meeting, 2002) in which it had been reported that changes in objects were more detectable than changes in shadows. Observers indicated with a mouse click the five most interesting points in the image. In agreement with Parkhurst (Vision Sciences Society, Sarasota, Florida, 2003), selected points were highly clustered, and the nearest neighbour distance was smallest for the first choice points and increased for later choices. Changes in first-choice locations were more easily detected and they did not usually result in change blindness. In order to obtain ratings of salience of image regions that gave rise to significant change detection errors it was necessary to include all 5 choices. When this was done it was found that salience and change detection were significantly correlated. Using multiple regression, the salience measurement was the most significant predictor of the percentage correct change detection for a particular picture (F=9.0; df=1,28; p<0.005), and the influence of the type of change (object change versus shadow change versus surface colour change) was thereby reduced. It can therefore be argued that shadow changes are less detectable than object changes primarily because objects are more salient in an image than shadows. It was confirmed that shadows are rarely chosen as salient even when of relatively high contrast. Whereas it is argued that first-choice salient locations are often predictable from image properties by bottom-up salience models, the later choices may be influenced by structural interpretations of a scene.
'Shinethrough' in simultaneous displays: a case of low spatial frequency masking? Michael Morgan and Joshua A Solomon Dept. of Optometry & Visual Science, City University, Northampton Square, London EC1V 0HB. 'Shinethrough' is an intriguing phenomenon described by Herzog and Fahle. A two-line vernier target is masked by two lateral, parallel flanks on either side of the target; but the addition of further lateral masks (up to 15) reduces this masking effect. Originally the effect was described with a temporal delay between target and mask, and was attributed to some higher-level temporal integration rules. However, we report here that 'shinethrough' is also found with simultaneous mask and target. We conjecture that masking results from the edges of the mask, as seen through a low spatial frequency filter that fails to resolve the individual bars. In confirmation of this idea, we find that adding a further masking line in the position of the vernier target paradoxically improves performance.
POSTER PRESENTATIONS
An investigation of visual masking using real and illusory contours S R Abdelaal, B T Barrett, P V McGraw and D McKeefry Department of Optometry, University of Bradford, Bradford, BD7 1DP. Purpose: In a previous experiment we investigated the standing wave of invisibility illusion using luminance-defined stimuli. We now extend this work to examine the influence of illusory masking contours on stimulus visibility. Methods: The stimulus consisted of a target bar flanked by adjacent masking bars, with no spatial or temporal overlap. In one condition both target and mask were luminance-defined. In a second condition, the masks were defined by illusory contours. Illusory contours were produced using either increments or decrements in luminance, relative to background. In addition, a control condition was investigated where the elements did not produce an illusory masking contour. In any single trial, the target was preceded and followed by the masking stimuli. This sequence of mask-target-mask appeared on one side of fixation, while a reference bar, against which the contrast of the target bar was judged, appeared on the opposite side. The reference and target could appear randomly on either side of fixation. The perceived contrast of the target was taken as a measure of masking strength. Results: Luminance-defined masks resulted in a masking amplitude of ~15%. Illusory masks, on the other hand, produced a much lower masking amplitude (~5%). There was little or no masking in the non-illusory condition. Illusory contours defined by luminance increments produced greater masking than those defined by decrements. Conclusion: It is widely accepted that illusory contours are encoded by neural structures in cortical area V2. Our demonstration of masking by such illusory contours supports a role for extra-striate cortical areas in visual masking.
Stereoscopic correspondence for ambiguous targets is affected by eccentricity and fixation distance Samira Bouzit and Paul B. Hibbard School of Psychology, University of St Andrews, St Andrews, KY16 9JP, Scotland, UK. When presented with stimuli in which binocular correspondence is ambiguous, observers show a small disparity preference, tending to choose those correspondences that minimise binocular disparity (McKee & Mitchison, 1988, Vision Research 28, 1001). This can be understood if it is assumed that small disparities are more probable than large disparities (Prince & Eagle, 2000, Vision Research, 40, 1143). A consideration of binocular viewing geometry and the structure of the natural environment suggests that the most likely disparity will not necessarily be zero, and will depend on both eccentricity and the fixation distance. Specifically, crossed disparities would be expected to be more probable both when fixating a distant target (thus increasing the likelihood that other objects are closer than fixation) and in the lower half of the image (due to the influence of the horizontal ground plane). The effects of eccentricity and fixation distance on binocular correspondence were therefore investigated. Observers fixated a central cross, and were presented with a dichoptic squarewave pattern above or below fixation. The left and right eyes were presented with 8 and 9 half-cycles of the squarewave, respectively. Images were presented either with a disparity of one half-cycle, in order to produce an ambiguous correspondence, or with an additional crossed or uncrossed disparity, to overcome any bias in the interpretation of the ambiguous stimuli. Stimuli were presented at 5 viewing distances ranging between 1m and 5m. A clear bias was observed for the ambiguous stimuli, with those presented below fixation tending to appear closer than fixation.
For these stimuli, there was an additional effect of fixation distance, with the tendency to report stimuli as closer than fixation increasing with increasing fixation distance. No biases, or effects of fixation distance, were observed for ambiguous stimuli presented above fixation. These results show a clear influence of eccentricity and fixation distance on binocular correspondence, consistent with the expected spatial distribution of disparities in natural images.
Perceptual distortions of three-dimensional shape are consistent between stationary and moving objects Peter Scarfe and Paul B. Hibbard School of Psychology, University of St Andrews, St Andrews, KY16 9JP, Scotland, UK. A Euclidean representation of three-dimensional shape cannot be obtained from horizontal binocular disparity without knowledge of the viewing distance. Errors in three-dimensional shape judgment tasks have been attributed to misestimates of viewing distance (Johnston, 1991 Vision Research 31 1351), and it has been suggested that as a consequence perceptual space is non-Euclidean (Todd and Norman 2003 Perception and Psychophysics 65 31). Others, however, have attributed these results to a simple ‘contraction bias’, an experimental artifact due to uncertainty in the experimental task, rather than a true perceptual bias (Mon Williams et al, 2000 Experimental Brain Research, 133 407). Even if visual space is distorted with respect to physical space, performance in some shape tasks such as depth matching could remain accurate through the use of perceptual heuristics, or ‘rules of thumb’ (Glennerster et al, 1996 Vision Research 36 3441-3456). Here we investigated shape constancy for disparity-defined shapes that moved in depth. Observers performed three tasks: (i) a standard “apparently circular cylinder” task in which the three-dimensional shape of a half-cylinder was adjusted so as to appear circular, (ii) a shape matching task in which the shape of a half-cylinder was adjusted to match that of another, presented at a different distance, and (iii) a shape constancy task, in which observers judged whether a cylinder moving in depth increased or decreased in depth extent. Observers overestimated depth at close distances and underestimated depth at far distances consistently in all three tasks.
This is surprising given that accurate performance could have been achieved in the depth matching task through a simple perceptual heuristic (Glennerster et al, op cit) and in the shape constancy task via the combination of disparity and motion information (Richards, 1982; JOSA, A2, 343). The results also question any explanation in terms of a simple ‘contraction bias’, and have implications for determining the circumstances under which perceptual heuristics are used in the interpretation of three-dimensional shape.
The effect of flicker on the Ebbinghaus illusion N.E.Scott-Samuel & R.A.Bowman Experimental Psychology, University of Bristol, Bristol. Following research that suggests that the dorsal stream is not deceived by visual illusions, the effects of presenting a flickering Ebbinghaus illusion were investigated. Flickering stimuli are thought to be processed by the magnocellular pathway, which is associated with the dorsal stream. Stimuli were target circles, either surrounded by flankers or not; the entire stimulus either flickered or did not. Two target circles (one of fixed size, one variable size) were presented simultaneously and participants were required to decide which one they perceived as appearing larger. It was found that presence of small flankers made the central circle appear larger than a circle surrounded by large flankers, as expected. It was also found that flickering the stimuli had no effect on this illusory effect. If flickering stimuli preferentially activate the dorsal stream, the results suggest that this pathway is still deceived by visual illusions.
Lightness Constancy under Variations of Scene Geometry Caterina Ripamonti*§, David H Brainard§ and Marina Bloj‡ * Department of Physiology, University of Cambridge, UK. § Department of Psychology, University of Pennsylvania, Philadelphia, USA. ‡ Department of Optometry, University of Bradford, UK. The perceived lightness of a surface varies with how it is oriented with respect to a directional light source (Ripamonti et al. VSS 03). We have investigated to what extent observers compensate for such variation. The degree to which observers do so is considered to be the degree of lightness constancy. We report data from a lightness-matching task that assesses lightness constancy with respect to changes in the slant of a standard object. On each trial of the experiments, observers viewed an achromatic uniformly-painted flat card and indicated the best match from a palette of 36 greyscale samples. Observers’ matches were measured as a function of test card slant for two different light source positions. We found that observers are neither lightness constant nor luminance matchers and there is considerable individual variation in performance. We present a parametric model (Equivalent Illuminant Model) that accounts for how observers’ lightness matches vary as a function of surface slant. The model is based on an inverse optics process that could achieve lightness constancy. The model has two parameters: the position of a directional light source and the relative intensity of directional and ambient illumination in the chamber. These parameters, together with a model of image formation, predict how lightness should vary as a function of test card slant. We found that (i) the model describes observers’ individual data well, (ii) the parameters of the model vary from observer to observer, and (iii) the parameters of the model vary sensibly when the physical light source position is changed. [Supported by: NIH Grant #EY10016.]
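The two-parameter structure of the model described above can be sketched as follows. This is a hedged illustration and not the authors' implementation: it assumes a Lambertian card, expresses the light source position as an azimuth in degrees, and expresses the relative intensity as the fraction of illumination that is directional. All function and parameter names are hypothetical.

```python
import numpy as np


def illumination(slant_deg, light_azimuth_deg, directional_fraction):
    """Relative illumination on a flat card: ambient plus a cosine-law term."""
    angle = np.radians(np.asarray(slant_deg, dtype=float) - light_azimuth_deg)
    directional = np.clip(np.cos(angle), 0.0, None)  # no light from behind
    return (1.0 - directional_fraction) + directional_fraction * directional


def predicted_match(reflectance, slant_deg, physical, equivalent):
    """Reflectance an observer would match, given their equivalent illuminant.

    physical, equivalent: (light_azimuth_deg, directional_fraction) tuples.
    """
    # Image formation with the physical lighting
    luminance = reflectance * illumination(slant_deg, *physical)
    # Inverse optics with the observer's own lighting estimate
    return luminance / illumination(slant_deg, *equivalent)


slants = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])
physical = (25.0, 0.7)
constant = predicted_match(0.5, slants, physical, equivalent=physical)
luminance_match = predicted_match(0.5, slants, physical, equivalent=(0.0, 0.0))
```

When the equivalent illuminant equals the physical one, the predicted match is the same at every slant (lightness constancy). When the equivalent illuminant is purely ambient, the match simply follows luminance. Intermediate parameter values give the partial constancy the abstract reports.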
Spatial summation regions for the detection of contrast-modulated blobs are larger than for luminance-modulated blobs at the fovea and in the periphery. Subash Sukumar and Sarah J. Waugh. Anglia Polytechnic University, Cambridge. Evidence from masking studies suggests that the detection of luminance-defined and contrast-defined spatial stimuli under foveal viewing conditions is mediated by independent mechanisms (Schofield and Georgeson, 1999). In this study we characterized spatial summation regions for the detection of luminance-defined and contrast-defined stimuli at the fovea and in the periphery. Gaussian blobs were modulated by dynamic noise to create luminance-modulated (LM) and contrast-modulated (CM) stimuli. Detection thresholds were measured at a 1m viewing distance for different blob sizes (sigma = 0.03 to 5 deg) at the fovea and in the inferior visual field (2.5, 5, 10 deg). In addition, thresholds were measured for scaled working distances so that the noise density was progressively lowered with increasing eccentricity. Spatial summation areas calculated from threshold versus size data were found to be approximately 3 times larger for CM than LM blob stimuli at the fovea and in the periphery, qualitatively similar to physiological findings of larger receptive field sizes found in V2 than in V1. Detection thresholds for LM and CM stimuli were differently affected by increasing eccentricity. These differences appear to be related to differences in the detection responses to the internal dot density of the blobs for the two types of stimuli. Together these findings lend further support to the notion that at the detection level, separate processing of LM and CM targets occurs both at the fovea and in the peripheral visual field.
The AVA is a registered charity (No: 1049146)