Promoting vision research and its applications
HUMAN VISION - WHEN IT WORKS AND WHEN IT FAILS

CALL FOR PAPERS

The seventh Applied Vision Association Christmas Meeting was held in the Vision Sciences building at Aston University on Wednesday 18th December 2002.

Invited talks were given by:
1) Pete Bex (University College London)
2) John Harris (University of Reading)
3) David Rose (University of Surrey)

The Seventh AVA Christmas Meeting

10.15 am
Session 1 (Spatial Vision) Chair: Mark Georgeson
11.15 am
11.30 am
11.45 am
12.00 noon (Invited talk)
12.30 - 1.30 pm
Session 2 (Time & Changes) Chair: Tom Troscianko
1.30 pm (Invited talk)
2.00 pm
2.15 pm
2.30 pm
2.45 pm
3.00 - 3.30 pm
Session 3 (Dynamic Vision and Noise) Chair: Andrew Schofield
3.30 pm (Invited talk)
4.00 pm
4.15 pm
4.30 pm
4.45 pm
5.00 onwards
The visual control of braking.
Trade stands (all day) TrackSys

A model for motion sharpening: contrast gain control precedes compressive non-linearity. Blurred edges appear sharper in motion than when they are stationary. We have previously shown how such distortions in perceived edge blur may be explained by a model which assumes that luminance contrast is encoded by a local contrast transducer whose response becomes progressively more compressive as speed increases. To test this model further, we measured the sharpening of drifting, periodic patterns over a large range of contrasts, blur widths and speeds (0-32 deg/s). The results indicate that while sharpening increased with speed it was practically invariant with contrast. This contrast invariance cannot be explained by a fixed compressive non-linearity since that predicts almost no sharpening at low contrasts. We show by computational modelling of spatio-temporal responses that if a dynamic contrast gain control precedes the static non-linear transducer then motion sharpening, its speed dependence, and its invariance with contrast, can be predicted with reasonable accuracy.

Perceiving edge blur: linear filtering and a rectifying non-linearity. We studied the visual mechanisms that encode edge blur in images. Our previous work suggested that the visual system spatially differentiates the luminance profile twice to create the 'signature' of the edge, and then evaluates the spatial scale of this signature profile by applying Gaussian derivative templates of different sizes. The scale of the best-fitting template indicates the blur of the edge. In blur-matching experiments, a staircase procedure adjusted the blur of a comparison edge (40% contrast, 0.3 s duration) until it appeared to match the blur of test edges at different contrasts (5-40%) and blurs (6-32 min arc). Results showed that lower contrast edges looked progressively sharper. We also added a linear luminance gradient to blurred test edges.
When the added gradient was of opposite polarity to the edge gradient, it made the edge look progressively sharper. Both effects can be explained quantitatively by the action of a half-wave rectifying nonlinearity that sits between the first and second (linear) differentiating stages. This rectifier was introduced to account for a range of other effects on perceived blur (Barbieri-Hesse & Georgeson, 2002, Perception 31, suppl., 54), but it readily predicts the influence of the negative ramp. The effect of contrast arises because the rectifier has a threshold: it suppresses not only negative values but also small positive values. At low contrasts, more of the gradient profile falls below threshold and its effective spatial scale shrinks, leading to perceived sharpening.

Orientation-masking: suppression and mechanism bandwidths. ‘Within-channel’ models of masking suppose that mask and test excite a common detecting mechanism whose output is followed by a compressive nonlinearity at moderate contrasts and above. From this, Phillips & Wilson (1984; JOSA A, 1, 226-) estimated that orientation half-bandwidths (h) for low spatial frequency (SF) mechanisms are about 30°. More recent ‘cross-channel’ models (e.g. Foley, 1994; JOSA A, 11, 1710-) include suppressive contributions from an inhibitory pool. In these models, a mask component can elevate detection threshold without exciting the detecting mechanism. If it elevates the entire orientation-masking function then the estimate of its half-width increases and, if not accounted for, h is overestimated. We performed masking experiments similar to those of Phillips and Wilson but extended the range of mask and test orientation differences from 0-45° to 0-90°. For transient presentation (one cycle of 15 Hz) and low SFs (1 - 3 c/deg), masking functions were orientation tuned from 0° to about 30° and then declined gently to 90°, where thresholds were raised by up to a factor of 4.
We also measured contrast-masking functions and found facilitation at low mask contrasts for parallel but not orthogonal masks. We fitted a model of cross-orientation suppression to our data and those of Phillips and Wilson and found h to be narrower than many previous estimates (as low as 18° at 1.0 c/deg). We found little or no evidence for cross-orientation suppression at high spatial frequencies or for sustained presentation. These and other results suggest that (phase-insensitive) psychophysical mechanisms with properties similar to those ascribed to magnocellular channels have tight orientation tuning and receive substantial suppression from a much more broadly tuned source.
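The cross-channel idea in the abstract above, that a mask can raise detection threshold purely through a divisive suppressive pool, can be illustrated with a minimal sketch. It is loosely in the spirit of models such as Foley (1994) but is not the model the authors fitted: the function names, the exponents `P` and `Q`, the constant `Z`, the criterion `K` and the pool weight are all illustrative assumptions, and the mask is assumed to be far enough in orientation from the test that it contributes no excitation at all.

```python
# Illustrative sketch only: every parameter value below is an assumption,
# not an estimate from the experiments described above.
P = 2.4   # excitatory exponent (assumed)
Q = 2.0   # suppressive exponent (assumed)
Z = 1.0   # semi-saturation constant (assumed)
K = 0.5   # response needed for detection (assumed)


def response(test_contrast, mask_contrast, pool_weight=1.0):
    """Response of the detecting mechanism to a test plus a remote mask.

    The mask adds nothing to the numerator (no excitation); it only
    enlarges the divisive suppressive pool in the denominator.
    """
    excitation = test_contrast ** P
    suppression = Z + (test_contrast + pool_weight * mask_contrast) ** Q
    return excitation / suppression


def detection_threshold(mask_contrast, pool_weight=1.0, max_contrast=100.0):
    """Smallest test contrast (in %) whose response reaches K, by bisection.

    Bisection is valid here because, with P > Q, the response increases
    monotonically with test contrast.
    """
    low, high = 0.0, max_contrast
    for _ in range(60):
        mid = 0.5 * (low + high)
        if response(mid, mask_contrast, pool_weight) >= K:
            high = mid
        else:
            low = mid
    return high


if __name__ == "__main__":
    for mask in (0.0, 5.0, 10.0, 20.0):
        print(f"mask {mask:5.1f}%  threshold {detection_threshold(mask):.2f}%")
```

In this toy version the threshold rises with mask contrast even though the mask never excites the detector. The sketch makes no attempt to reproduce the full orientation-masking function or the facilitation reported for parallel masks.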
Border ownership and holes. Marco Bertamini (Psychology Dept, University of Liverpool, Liverpool L69 7ZA, UK; E-mail: m.bertamini@liverpool.ac.uk) Unilateral border ownership implies that a hole is a background region and therefore shapeless, yet people recognise the shape of holes as well as the shape of objects. If what people perceive is the surrounding object then its shape has a reversed curvature polarity (i.e. a changed sign of curvature) compared to the hole region perceived as an object. We previously found that observers are faster at judging the position of convex regions; we therefore predicted that a manipulation of figure/ground should produce a crossover interaction (i.e., a reversal of the relative speeds when the same regions were perceived as holes instead of objects). The interaction means that, independently of the shape used, the response is always faster when the vertex is perceived as convex. Moreover, with stereoscopic information (random dot stereograms) depth stratification was made unambiguous, and with these stimuli we found an even stronger interaction. We conclude that a change from figure to hole always reverses the encoding of curvature polarity. In turn, polarity obligatorily affects the processing of position.
Distortions in the visual perception of size in Parkinson's disease. John Harris (School of Psychology, University of Reading, Whiteknights, Reading RG6 6AL UK; E-mail: j.p.harris@reading.ac.uk; fax: +44 118 9316715) Although Parkinson's disease (PD) is often regarded clinically as a motor disorder, much research over the past 25 years has demonstrated changes in perception and attention in the illness, some of which may contribute to the well-known problems of motor control. Patients with worse left-sided motor symptoms misbisect horizontal lines too far towards the right, and vertical lines too far down. They also judge rectangles in left visual space to be narrower than rectangles in right space, and rectangles in upper space to be shorter than rectangles in lower space. Patients with worse right-sided motor symptoms do not differ from controls on these tasks. The pattern of data suggests a compression of at least the left and upper regions of visual space in PD (a compression which is greater in patients with worse left-sided motor symptoms, and so presumably worse right hemisphere damage). Some contrasts with the literature on perceptual distortions after stroke, and some speculations about the underlying neural mechanisms, will be put forward.

Time and the Observer Revisited. The serial processing model of vision still guides some aspects of our research, but has been criticised from a number of directions. Neuroscientific and psychological evidence for the alternative models (parallel processing, recurrent processing) is well known. This evidence has recently been joined by conceptual critiques, which aim particularly at the timing of visual processes, such as decision making and awareness (e.g. "Time and the Observer", by Dennett and Kinsbourne, Behavioral and Brain Sciences, 1992). In this talk I compare and (attempt to) integrate the philosophical and empirical approaches.
I conclude that Dennett and Kinsbourne's "multiple drafts" model, while vague, is consistent with much current evidence as to how visual processing works.

Distorting time. Derek H. Arnold [1], Colin W.G. Clifford [2] & Alan Johnston [1] ([1] Dept. of Psychology, University College London; [2] Dept. of Psychology, The University of Sydney). It has been suggested that the time course of perceptual experience is not determined by the time course of neural activity, but by an interpretive process that corrects for the inherent temporal ambiguities of sensory processing (Eagleman & Sejnowski 2000 Science 287 2036-2038). We examined the possibility that the time course of perceptual experience could be distorted by manipulating a low-level stimulus attribute designed to influence sensory processing in a characteristic fashion. When successively presented opponent directions of motion were contrasted, the second interval of motion needed to be longer than the first to be perceived as being of the same duration. This asymmetry was reversed when the angular difference between the successive directions was reduced to 90°. This indicates that the inherent dynamics of sensory processing can perturb our sense of timing. Our results therefore suggest that any interpretive analysis that is causally involved in the production of perceptual experience may be subject to the temporal limitations of the sensory processing on which it is based.
The ability of observers to detect different types of changes in complex scenes was measured, using pairs of colour slides presented across an inter-stimulus interval (ISI). A localisation method was used. In keeping with “change blindness” findings, only 34.4% of changes were correctly located on a single presentation, but this showed wide variation with different pictures (0%-90.6%). Multiple regression was used to test different variables as predictors of change localisation. The only significant predictor was the rated salience of the pre-cued picture difference. Measures derived from the 2D Fourier amplitude spectrum of the difference image did not reach significance as predictors, despite evidence that such measures are effective in explaining the discrimination of natural images at threshold. Other non-significant predictors were rated complexity and presentation order. Analysis of the unstandardised residuals from the regression showed significant effects of the type of visual information. It was found that object additions and deletions were more detectable than predicted, whereas shadow additions and deletions were less detectable than predicted. Deletions were more detectable than additions for objects but not for shadows. It is argued that when we view a scene, we build an incomplete representation of that scene, in which objects are more strongly represented than shadows and surface colour changes.
Neural correlates of change detection and change blindness. Event related potentials (ERPs) were recorded during a change detection task with two-frame stimulus presentation. The stimuli were pictures of faces and places presented centrally two at a time and concurrently with a peripherally presented letter search. Using a divided attention task ensured a failure to detect changes on a proportion of trials. Consistent with previous reports of attentional modulation in extrastriate cortex, when change detection was successful there was some evidence of increased amplitude of occipital P1 and N1 on presentation of the first stimulus frame. The clearest effect of hit versus miss trials was evident in terms of an amplification of P300 and N400 following the second stimulus. The largest evoked potentials were localised in right occipital-parietal areas, although differences between amplitudes were maximal in right parietal areas for the place stimuli and medial frontal areas for face stimuli. Place stimuli also produced differences in activation of the medial frontal areas but these effects were weaker than for face stimuli and may have reflected lower detection rates. Face stimuli would be predicted to produce activation in the fusiform gyrus - however this would not be readily recorded using EEG. The P300 can be evoked by presentation of old (recognised) objects, whereas the N400 is produced by novel stimuli. The medial frontal stimulation in addition to the more occipital-parietal / temporal activation is consistent with visual working memory processes. This was coupled with increased attention to the pictures during change detection evident in terms of greater P1 and N1 components of ERP.
Attentional capture by new objects and attentional loss by old objects. It is well known that new objects onsetting in a visual display tend to capture attention at the expense of already present “old” objects. Observers are more efficient at detecting a target that is a new object than a target that is an old object. We report a series of experiments investigating how this effect varies with the age of the old objects. Performance was studied for two tasks, one in which a target was defined by identity (E or P amongst other letters) and one in which a target was defined by a sensory feature (dark square amongst lighter squares). In circular mixed displays of new and old objects there was an advantage for new targets at even very short SOAs for both tasks. This difference increased with SOA when displays contained more than a single new object, though not otherwise. For the letter identity task, the increase largely reflected improving performance for new objects, with only a modest decrement in detection of old targets. For the feature target task, by contrast, detection of new targets increased only to a small degree with SOA whereas detection of old targets became extremely inefficient. These findings suggest a differential loss of access to information about objects as they age. Access to low level (featural) information about an object appears to diminish rapidly after onset whereas access to higher level (identity) information is much less affected. This was confirmed in two studies with displays containing objects all of the same age. Performance on the feature task fell dramatically within 500 ms of object onset whereas performance on the letter identification task was much less affected by object age.
Spatial interference in dynamic stimuli. P.J. Bex, A. J. Simmers & S. C. Dakin (Institute of Ophthalmology, UCL, London, UK; E-mail: p.bex@ucl.ac.uk) Contrast sensitivity, visual acuity, crowding, reading, and oculomotor stability are worse for peripherally-viewed stimuli, but sensitivity to dynamic stimuli remains relatively unchanged. To establish which, if any, of these visual factors underlies the improvement in reading rates afforded by rapid serial visual presentation (RSVP), we measured acuity and crowding effects with drifting gratings and letters. Observers reported either the identity of a Sloan letter or the orientation of a T target that moved along an annulus of constant eccentricity, flanked by up to four similar elements. Size thresholds and crowding increased with eccentricity, in line with previous studies. However, while size thresholds increased with speed, spatial interference zones (the area around the target within which crowding occurred) remained constant. Single elements that were more peripheral or moved ahead of the target crowded more than those that were more foveal or trailed behind it. Observers also identified the direction (up/down/left/right) of a drifting Gabor patch flanked by four Gabors whose directions covaried to form meaningful (rotation, radiation, translation) or random global patterns of movement. Crowding was greatest when flanking elements all moved in different directions, regardless of the global form they created, and when flanking and target elements were of the same spatial frequency, at any temporal frequency tested. Thus letter resolution and crowding are not improved by temporal modulation under conditions of steady fixation and do not account for the effectiveness of RSVP. Explanations of crowding that are based on compulsory texture integration vs segmentation cannot easily explain why crowder directions that average to zero are the most effective, whether in random or meaningful configurations.
These findings are consistent with explanations in which local structure is encoded independently, but with positional uncertainty within the spatial interference zone.

Males are ‘noisy females’ when it comes to reporting the psychological structure of the basic colours. Lewis D. Griffin (Imaging Science, School of Medicine, King’s College London; E-mail: L.D.Griffin@kcl.ac.uk) The assumption that reported gender differences in colour language use are due to cultural effects has been called into question by two discoveries: (i) ‘extra’ photopigment genes are more common in women than men [Neitz et al. 1998, Vision Res, 38, 3221-3225], and (ii) a performance difference in a colour task correlated with the number of genes [Jameson et al. 2001, Psychonomic B Review, 8, 244-261]. With this context in mind, using data collected for another purpose [Griffin 2001, Color Res Appl, 26, 151-157], I investigated gender differences in the psychological structure of the basic colour terms. 82 subjects of each gender answered a total of 32497 similarity questions (e.g. “which are more similar: brown & blue or white & orange?”). 24 subjects of each gender answered a total of 2612 lightness questions (e.g. “which is lighter: yellow or pink?”). Analysis showed: (i) no gender difference for lightness questions; (ii) a significant (p<0.05) gender difference for similarity questions; (iii) that the gender difference is noise-like, i.e. devoid of significant structure; (iv) a model of females as ‘noisy males’ is rejected (p<0.05); (v) a model of males as ‘noisy females’ is accepted. As corroborative evidence for the model, I note that both individual and pooled male similarity judgements are less transitive than those of females (both tests p<0.05). This argues against the gender difference being due to greater between-male than between-female variation; rather, it suggests that males are individually noisy.
I conclude that: the psychological structure of the basic colours is the same in males and females, but males are slightly more careless reporters of this structure.
Understanding cone distributions from saccadic dynamics. Is information rate maximised? Although the retinal cone distribution is known for all retinal locations, there is not yet a satisfactory explanation as to why cones are so distributed. Intuitively, the cone density is highest at the fovea since eye movements bring objects of interest there. However, there has been little progress beyond this qualitative suggestion. Another proposal is that a decreasing sampling density with eccentricity facilitates scale invariant recognition of objects centred at the fovea. However, this cannot account for the radial asymmetry of cone distributions. We use information theory to establish a quantitative relationship between the retinal cone distribution and the dynamics of eye movements. The rate of information transfer is maximised if the receptor density at any location is proportional to the average rate at which information is received on that part of the retina, or the probability that the image of an object should be at that location. We build a model to compute this probability from experimental data on saccade dynamics. The resulting probability increases as (1/eccentricity) for small eccentricities, and is approximately constant in the periphery, in agreement with the general form of the retinal cone distribution. Using this approach we can also address the radial asymmetry of the receptor distribution. Our model provides a valuable link between two so far unrelated bodies of experimental data, and makes experimentally testable predictions, e.g., on how receptor distributions in some animals could be affected by eye movement dysfunction.
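The proportionality claim in the abstract above can be illustrated with a toy calculation. The abstract does not state the objective function used, so the sketch below rests on loudly labelled assumptions: a hypothetical one-dimensional set of eccentricity bins, a hypothetical probability profile (`image_probability`) that falls as 1/eccentricity and then flattens beyond an arbitrary knee, and an assumed objective in which the information gained at a location grows with the logarithm of its receptor density. Under that assumed objective, and with a fixed receptor budget, the allocation proportional to probability scores at least as well as any other allocation tried.

```python
import math
import random


def image_probability(eccentricity_deg, knee_deg=10.0):
    """Hypothetical profile: ~1/eccentricity near the fovea, flat beyond the knee."""
    return 1.0 / min(eccentricity_deg, knee_deg)


def proportional_density(probabilities, total_receptors):
    """Share a fixed receptor budget in proportion to the probabilities."""
    scale = total_receptors / sum(probabilities)
    return [scale * p for p in probabilities]


def information_rate(densities, probabilities):
    """ASSUMED objective: probability-weighted log2 of receptor density."""
    norm = sum(probabilities)
    return sum((p / norm) * math.log2(n)
               for p, n in zip(probabilities, densities))


def random_density(n_locations, total_receptors, rng):
    """An arbitrary positive allocation that uses the same receptor budget."""
    weights = [rng.random() + 1e-6 for _ in range(n_locations)]
    scale = total_receptors / sum(weights)
    return [scale * w for w in weights]


# Hypothetical eccentricity bins from 0.5 to 40 deg and a fixed budget.
ECCENTRICITIES = [0.5 * k for k in range(1, 81)]
PROBABILITIES = [image_probability(e) for e in ECCENTRICITIES]
TOTAL_RECEPTORS = 10_000.0

if __name__ == "__main__":
    best = proportional_density(PROBABILITIES, TOTAL_RECEPTORS)
    uniform = [TOTAL_RECEPTORS / len(PROBABILITIES)] * len(PROBABILITIES)
    print("proportional:", information_rate(best, PROBABILITIES))
    print("uniform:     ", information_rate(uniform, PROBABILITIES))
```

The bins ignore retinal area and radial asymmetry, so this is a check of the proportionality argument under the stated assumptions rather than a model of the real cone mosaic.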
Ambiguity and biological motion. Ian M. Thornton (Max Planck Institute for Biological Cybernetics, Tuebingen, Germany; E-mail: ian.thornton@tuebingen.mpg.de). Ambiguity - and the "errors" it creates - has long been used as a probe into visual processing. Here I describe a new form of dynamic ambiguous stimulus - the chimeric point-light walker - which is created by superimposing the profile views of a left and right facing figure. When viewed in isolation, this figure - which is ambiguous as it simultaneously suggests motion in both directions - does not appear to walk, but rather to be performing some complex novel action. However, when the figure is presented in a mask of additional moving dots, observers consistently fail to notice anything odd about the walker, reporting instead that they are watching an unambiguous figure moving either to the left or right. Some observers report that the initial percept fluctuates, moving first to the left, then to the right, or vice versa; others always perceive a constant direction. All observers, when briefly shown the unmasked ambiguous figure, have no difficulty in perceiving the novel motion pattern once the mask is returned. These two findings, the initial report of unambiguous motion and the subsequent "primed" perception of the ambiguity, are both consistent with an important role for top-down processing in biological motion. I will discuss several domains within the realm of biological motion processing where this simple stimulus may have an application.
Analysing optic flow generated by locomotion through a natural environment. Some 50 years have passed since Gibson drew attention to the characteristic field of velocity vectors generated on the retina when an observer is moving through the three-dimensional world. Although many theoretical, psychophysical, and physiological studies have demonstrated the use of such optic flow for a number of navigational tasks under laboratory conditions, we still know little about the actual flowfield structure under natural operating conditions. To study what motion information is available to the visual system in the real world, we moved a panoramic imaging device on accurately defined paths in outdoor environments and captured image sequences under a variety of conditions. These image sequences were used as input to a biologically inspired motion detector network which allows us to analyse the distribution of motion signals generated by such locomotion. We found that motion signals are sparsely distributed in space and that local directions can be ambiguous and noisy, thus giving rise to motion signal maps in which local direction and strength can vary considerably. Spatial or temporal integration would be required to retrieve reliable information on local motion vectors. On the other hand, the overall structure of the flowfield, with distinct centres of expansion and contraction, is obvious even in sparse and noisy motion signal maps, and a surprisingly simple algorithm can be used to retrieve rather accurately the direction of heading, demonstrating the richness of information gathered with a panoramic field of view. Our approach is a first step towards assessing the role of specific behavioural, environmental and computational constraints on natural optic flow processing.

The visual control of braking.
Lee (1976, Percep., 5, 437-459) proposed that drivers control their braking towards a static target by maintaining the rate of change of time-to-contact (tau-dot) at a margin value between 0 and -0.5. The work reported here tests this proposal using an interactive, computer-based task in which participants use a foot-brake to control their simulated approach towards a visual target. Results are broadly consistent with an implementation of Lee’s hypothesis (Yilmaz & Warren, 1995, JEP: HPP, 21, 996-1014) in which both the direction and magnitude of brake adjustments are determined by the discrepancy between the instantaneous value of tau-dot and the margin value. We also report a new analysis demonstrating that different starting conditions converge upon similar tau-dot values, as would be expected if participants adopt a consistent strategy based upon tau-dot. Currently, we are investigating whether or not physical constraints imposed upon the values of tau-dot that are achievable during braking could produce results that are suggestive of a tau-dot based strategy from random behaviour alone. Alternative strategies for the control of braking were reviewed by Yilmaz & Warren. These include the use of 3D environmental structure to compute required deceleration from spatial variables – a strategy which cannot as yet be discounted.
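The tau-dot strategy described above can be sketched as a small simulation. With time-to-contact defined as distance divided by closing speed, tau-dot works out to -1 + distance × deceleration / speed², which equals -0.5 for the constant deceleration that stops exactly at the target. The controller below is a deliberately simplified, hypothetical one in which the brake is adjusted in proportion to the discrepancy between tau-dot and the margin value. It is not the implementation of Yilmaz & Warren or of the study reported here, and the gain, time step, starting conditions and deceleration limit are all arbitrary assumptions.

```python
def tau_dot(distance, speed, deceleration):
    """Rate of change of time-to-contact, tau = distance / speed.

    With d(distance)/dt = -speed and d(speed)/dt = -deceleration,
    tau_dot = -1 + distance * deceleration / speed**2.
    """
    return -1.0 + distance * deceleration / speed ** 2


def simulate_braking(distance=50.0, speed=20.0, margin=-0.5, gain=100.0,
                     max_deceleration=10.0, dt=0.001, max_steps=100_000):
    """Approach a static target, adjusting the brake from the tau-dot error.

    All default values are illustrative assumptions (metres, m/s, m/s^2, s).
    Returns the final (distance, speed) when the vehicle has effectively
    stopped or has reached the target.
    """
    deceleration = 0.0
    for _ in range(max_steps):
        if speed <= 0.01 or distance <= 0.0:
            break
        # Tau-dot below the margin means braking is too soft, so brake
        # harder; tau-dot above the margin means the brake can be eased.
        error = margin - tau_dot(distance, speed, deceleration)
        deceleration += gain * error * dt
        deceleration = min(max(deceleration, 0.0), max_deceleration)
        speed = max(speed - deceleration * dt, 0.0)
        distance -= speed * dt
    return distance, speed


if __name__ == "__main__":
    for start_speed in (15.0, 20.0, 25.0):
        final_distance, final_speed = simulate_braking(speed=start_speed)
        print(f"start {start_speed:4.1f} m/s -> "
              f"stops {final_distance:6.3f} m from target "
              f"at {final_speed:5.3f} m/s")
```

Changing `margin` to other values between 0 and -0.5, or changing the starting distance and speed, gives a rough feel for how such a rule behaves, but the numbers should not be read as predictions about human drivers.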
The AVA is a registered charity (No: 1049146)