Promoting vision research and its applications
VISIONS OF VISION
The sixth Applied Vision Association Christmas Meeting will be held in the Vision Sciences building at Aston University on Monday 17th December 2001.
Invited talks will be given by: Bob Snowden
Programme
Vision Sciences, Aston University
10.15 am
10.55 am
Session 1: 11.30 am, 11.45 am, 12.00 noon, 12.15 pm, 12.30 pm
12.45 - 1.30 pm
Session 2: 2.00 pm, 2.15 pm, 2.30 pm, 2.45 pm, 3.00 pm
3.15 - 3.45 pm
Session 3: 4.15 pm, 4.30 pm, 4.45 pm, 5.00 pm
5.15 onwards
Speed, accuracy and performance in visual search
Dynamic visual processes in normal reading: Implications for developmental dyslexia?
How do task demands influence human gaze shifts in a 3-D scene?
Variations in perceptual changes viewing an ambiguous stimulus: methodological difficulties
Variations in perceptual changes viewing an ambiguous stimulus: differences between naive and experienced observers
Pattern-contingent colour aftereffects are formed at a subconscious level
Color sensitivity function and specific visual adaptation
Trade stands (all day) CRS
Meeting Abstracts
Complex scenes, simple neurons, and complex applications
A great deal is known about the behaviour of the human visual system from both psychophysical and physiological studies using simple stimuli such as gratings. However, the visual environment consists of complex scenes and often elicits complex actions. Can we use information gained about the behaviour of simple units in the visual pathway to say something about how we perceive complex scenes? If so, what novel applications exist that can make use of this knowledge?
Local image structure, metamerism, norms and natural image statistics
Boundary extension in a virtual world
Boundary Extension (BE; Intraub & Richardson, 1989, JEP:LMC, 15, 179-187) refers to a memory distortion in which observers appear to remember a greater expanse of a scene than was actually shown. For instance, if they are shown a close-up photograph of a child sitting on the stairs, they will later remember a wider-angle scene. Intraub and her colleagues suggest that BE is mediated by perceptual schemas that anticipate the probable contents of future views. The majority of BE studies have used photographs or line drawings. Here we used virtual reality (VR) to present 3D objects either in isolation (NOSCENE condition) or as the centre-piece of a virtual living room (SCENE condition). Observers were shown a 1 sec view of each object from a particular viewing distance and orientation relative to the object's vertical axis. After a 5 sec blank retention interval, the same object/scene appeared but the viewing distance and orientation were randomized. Observers actively recreated the original viewpoint by updating their virtual position using a joystick-like device. For the SCENE condition a robust BE effect was observed, the magnitude of this error dropping sharply as initial viewing distance increased. In the NOSCENE condition observers underestimated their initial distance, a tendency that increased with viewing distance. Contrasting explanations based on either layout expansion or misjudged size/distance were explored in additional experiments. Discussion also focuses on the use of VR with its ability to quickly and easily manipulate the presence/absence of both scene and object within and across trials.
Pre-attentive segmentation and correspondence in stereo
Traditional stereo grouping models (e.g., Marr & Poggio 1976, Science, 15, 283-7) have focused on the stereo correspondence problem: the matching of the corresponding monocular inputs to obtain 3D depth. Correct stereo correspondence is responsible for, e.g., disparity capture (the propagation of depth information from the boundaries to the centre of a depth plane to break, e.g., the wall-paper illusion), and (depth) transparency. V2 cells were recently observed to exhibit disparity capture via contextual influences (Bakin et al, J. Neurosci, 20, 8188-8198). Recent physiological data, however, revealed additional unexpected stereo grouping behaviour. Some V2 cells increase their responses to stimuli of their preferred depth when the stimuli within their receptive fields are at or near the boundary of a depth surface (von der Heydt et al, 2000, Vis. Res. 40, 1955-1967). Such highlights to depth edges are seemingly not required computationally merely to solve the correspondence problem. Computationally, these highlights make the boundaries of a depth surface more salient, serving pre-attentive segmentation and attracting visual attention. In special cases, they enable the psychophysically observed perceptual pop-out of a target from a background of visually identical distractors at a different depth. To achieve the highlights, mutual inhibition between disparity selective cells tuned to the same or similar depths is required. However, such mutual inhibition should impede the computation for the correspondence problem, which requires mutual excitation instead between the same cells. In this work, I introduce the first computational model to address both stereo correspondence and pre-attentive stereo segmentation. The computational mechanisms in the model are based on intracortical interactions in V2.
I will demonstrate that the model captures the following physiological and psychophysical phenomena: 1) depth edge highlighting, 2) disparity capture, 3) pop-out, and 4) transparency.
Seeing edge blur: receptive fields as multi-scale neural templates
Edge blur is an important perceptual cue, but how does the visual system encode the degree of blur at edges? Blur could be measured by the width of the luminance gradient profile, peak-trough separation in the 2nd derivative profile, or the ratio of 1st-to-3rd derivative magnitudes. In template models, the system would store a set of templates of different sizes and find which one best fits the 'signature' of the edge. The signature could be the luminance profile itself, or one of its spatial derivatives. I tested these possibilities in blur matching experiments. In a 2AFC staircase procedure, observers adjusted the blur of Gaussian edges (30% contrast) to match the perceived blur of various non-Gaussian test edges. In Expt 1, test stimuli were mixtures of 2 Gaussian edges (e.g. 10' and 30' blur) at the same location, while in Expt 2, test stimuli were formed from a blurred edge sharpened to different extents by a compressive transformation. Predictions of the various models were tested against the blur-matching data, but only one model was strongly supported. This was the template model in which the input signature is the 2nd derivative of the luminance profile, and the templates are applied to this signature at the zero-crossings. The templates are Gaussian derivative receptive fields that co-vary in width and length to form a self-similar set (i.e. same shape, different sizes). This naturally predicts that shorter edges should look sharper. As edge length gets shorter, responses of longer templates drop more than shorter ones, and so the response distribution shifts towards shorter (smaller) templates, signalling a sharper edge. The data confirmed this, including the scale-invariance implied by self-similarity, and a good fit was obtained from templates with a length-width ratio of about 1.
The simultaneous analysis of edge blur and edge location may offer a new solution to the multi-scale problem in edge detection.
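Two of the candidate blur measures listed in this abstract can be checked numerically for a Gaussian-blurred edge. The sketch below is illustrative only and is not the template model tested in the experiments. It assumes an edge whose luminance profile is a cumulative Gaussian of scale sigma, and the helper names (`gaussian_edge`, `derivative`, `blur_measures`) are hypothetical. For such an edge the peak-trough separation of the 2nd derivative is 2 x sigma, and the ratio of 1st-to-3rd derivative magnitudes at the edge centre is sigma squared.

```python
import math

def gaussian_edge(x, sigma):
    """Luminance of a unit-contrast edge blurred by a Gaussian of scale sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def derivative(values, step):
    """Central finite difference; the result is two samples shorter than the input."""
    return [(values[i + 1] - values[i - 1]) / (2.0 * step)
            for i in range(1, len(values) - 1)]

def blur_measures(sigma, half_width_sigmas=6.0, step=0.01):
    """Return two blur estimates for a Gaussian edge (assumed stimulus):
    the peak-trough separation of the 2nd derivative, and the square root
    of the 1st-to-3rd derivative magnitude ratio at the edge centre."""
    n = int(round(half_width_sigmas * sigma / step))
    xs = [i * step for i in range(-n, n + 1)]
    profile = [gaussian_edge(x, sigma) for x in xs]
    d1 = derivative(profile, step)
    d2 = derivative(d1, step)
    d3 = derivative(d2, step)
    # Each differentiation trims one sample from both ends, so d2 aligns with xs[2:-2].
    x2 = xs[2:-2]
    peak = x2[d2.index(max(d2))]
    trough = x2[d2.index(min(d2))]
    separation = abs(trough - peak)
    # The middle sample of d1 and d3 sits at the edge centre (x = 0).
    centre1 = d1[len(d1) // 2]
    centre3 = d3[len(d3) // 2]
    ratio_scale = math.sqrt(abs(centre1 / centre3))
    return separation, ratio_scale
```

Calling `blur_measures(1.0)` should return values close to 2.0 and 1.0, so for a purely Gaussian edge both measures recover the same blur scale. The measures can only disagree for non-Gaussian edges, such as the mixtures and sharpened edges used as test stimuli in the abstract.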
Rod contribution to colour appearance
A research programme at the Colour & Imaging Institute is investigating the appearance of colours projected onto a screen, in both cinema viewing environments and conventional room presentation conditions. This colour appearance data set will be used to derive or modify a colour appearance model such as CIECAM97s. However, most of the present colour appearance models assume that the viewing environment is photopic, even though in many cases viewing conditions for displays in darkened rooms are actually mesopic. Therefore, in this study, we aim to understand better the rod contribution to colour appearance.
Revealing perception and action pathways in normal vision: clutching at straws?
Peter Thompson & Andrew Dunn (Department of Psychology, University of York, York YO10 5DD, U.K.; E-mail: pt2@york.ac.uk, akd100@york.ac.uk)
Since Milner and Goodale's 'perception' and 'action' streams supplanted 'parvo-' and 'magno-' as the thinking man's visual system dichotomy of choice, the race has been on to reveal these streams in the normal visual system.
The early front runner has been the proposal that visual illusions affect the 'perceptual', world-based ventral system but not the 'action', ego-based, dorsal system. We have probed this claim in a series of experiments utilising pointing accuracy towards the end-points and mid-point (marked or unmarked) of the Judd illusion. Further we have investigated the effects of interposing a delay between stimulus presentation and the required response. Dorsal stream representations are short lived and visually guided actions must switch to world based (perceptual) frames of reference after a short delay, allegedly. Thus pointing performance should become equivalent to perceptual performance after a delay.
Stereomotion speed discrimination at multiple disparity pedestals
When motion-in-depth is simulated in a random dot stereogram (RDS), the changing disparity (CD) is accompanied by a concomitant inter-ocular velocity difference (IOVD), the combination of lateral monocular motion signals at different velocities in each eye. Dynamic random dot stereograms (DRDSs), however, feature a new random array of dots in each frame and therefore isolate the CD cue. In a 2IFC experiment, the relative contribution of CD and IOVD cues was assessed by measuring speed discrimination thresholds for RDS and DRDS stimuli for a range of mean disparity pedestals. Using ferro-electric shutter glasses and a high-speed fast-phosphor monitor (120 Hz per eye), 4 observers (3 naïve) compared the perceived speed of foveally presented pairs of RDS or DRDS stimuli at disparity pedestals of -0.3, 0 or +0.3 deg. Stimuli measured 7.3 x 1.3 deg, receded with a median speed of 0.62 deg/s, and were presented for 600 ms. An ever-present background pattern of static random dots allowed us to avoid visibility issues, while monocular half-occlusion artifacts were minimised by employing horizontally extended stimuli. For each of 3 observers, thresholds for DRDSs were significantly higher than those for RDSs across the range of disparity pedestals tested (ANOVA, p < 0.004). The mean thresholds for these observers were 27, 23 and 23% for the RDS and 43, 41 and 45% for the DRDS stimuli, at the 3 pedestals respectively. The remaining observer also showed higher DRDS thresholds, except at the near pedestal. In a control experiment, two observers showed no significant effect of varying stimulus duration (500-700 ms), suggesting that they are able to respond specifically to the speed, while ignoring initial/final disparity or total disparity displacement. In concert with other recent studies, we conclude that both disparity change and monocular motion cues influence stereomotion speed discrimination.
Detection of 3-D motion is predicted from probability summation of mechanisms sensitive to lateral motion and motion in depth
When an object moves in 3-D, its motion can be considered as a combination of two orthogonal components, one parallel to the plane of the eyes (lateral motion) and one perpendicular to it (motion in depth). For any 3-D motion the lateral motion component is the same in both eyes, but the motion in depth component is roughly equal and opposite. How are such 3-D motions detected by the visual system?
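The decomposition described in this abstract, and the probability-summation rule named in its title, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model. The function names are hypothetical, and the summation rule is the standard form for two independent detectors, which the abstract does not spell out.

```python
def decompose_binocular_motion(v_left, v_right):
    """Split the two eyes' horizontal image velocities into a component that is
    the same in both eyes (lateral motion) and a component that is equal and
    opposite in the two eyes (motion in depth)."""
    lateral = (v_left + v_right) / 2.0
    depth = (v_left - v_right) / 2.0
    return lateral, depth

def probability_summation(p_lateral, p_depth):
    """Detection probability when either of two independent mechanisms can
    detect the stimulus (assumed rule: 1 minus the chance that both miss)."""
    return 1.0 - (1.0 - p_lateral) * (1.0 - p_depth)
```

For example, image velocities of 1.5 and 0.5 deg/s in the left and right eyes decompose into a lateral component of 1.0 deg/s and a depth component of 0.5 deg/s. Two mechanisms that would each detect a stimulus half the time give a combined detection probability of 0.75 under this rule.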
A non-orthogonal basis-set for orthogonal components of complex motion
Within certain constraints, the complex motions in optic flow can be decomposed into orthogonal two-dimensional vector fields of expansion/contraction, rotation, and two directions of deformation. It might be useful for vision to perform a decomposition of this kind because very different information is provided by the different components (e.g. rate of expansion informs about time-to-contact and deformation informs about surface pose). Psychophysical experiments suggest that human vision does contain specialised mechanisms for complex motion, though the details of the basis-set remain to be elucidated. Here, random-dot coherence thresholds were measured using a sub-threshold summation technique to test whether vision contains mechanisms that form an orthogonal basis-set. In stimulus pairings in which motion components were orthogonal both locally and globally, the components were detected independently (Meese & Harris, 2001, Perception, 30, 1189-1202). However, for a pairing of deformation and rotation, where motions were orthogonal only globally, substantial summation was found indicating non-independent detection. This result is consistent with a model containing detecting mechanisms with direction templates matched to the stimulus components but implies that some of those mechanisms (e.g. rotation) are not antagonised by motion in their anti-preferred directions (Meese & Harris, 2001, Vision Research, 41, 1901-1914). In a second experiment, linear summation over space and direction was found when four cardinal directions of local motion were arranged to approximate rotation, but not when the arrangement approximated deformation. This suggests that vision does not contain mechanisms with two-dimensional motion templates matched to deformation. In general, the results imply a visual system containing multiple mechanisms for complex motion, but not those from an orthogonal basis-set.
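For a locally linear flow field, the decomposition into expansion, rotation and two deformation components mentioned in this abstract can be written out directly. The sketch assumes a first-order flow u = a*x + b*y, v = c*x + d*y. The function names, and the sign and scaling conventions, are illustrative choices and are not taken from the abstract.

```python
def decompose_flow(a, b, c, d):
    """Decompose the linear flow u = a*x + b*y, v = c*x + d*y into four
    orthogonal components (assumed convention: each is half a sum or difference)."""
    expansion = (a + d) / 2.0       # positive: expansion, negative: contraction
    rotation = (c - b) / 2.0        # positive: anticlockwise
    deformation_1 = (a - d) / 2.0   # stretch along x, compression along y
    deformation_2 = (b + c) / 2.0   # stretch and compression along the diagonals
    return expansion, rotation, deformation_1, deformation_2

def recompose_flow(expansion, rotation, deformation_1, deformation_2):
    """Inverse of decompose_flow: rebuild the four flow coefficients."""
    a = expansion + deformation_1
    b = deformation_2 - rotation
    c = deformation_2 + rotation
    d = expansion - deformation_1
    return a, b, c, d
```

A pure rotation (u = -y, v = x) gives `decompose_flow(0, -1, 1, 0) == (0.0, 1.0, 0.0, 0.0)`, and any linear flow is recovered exactly by `recompose_flow`. That round trip is the sense in which the four components form a complete basis for first-order flow.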
Global motion mediated by a red-green mechanism
The interaction of colour and motion cues for global motion integration across space has only recently been studied (Edwards & Badcock, 1996, Vision Research, 36, 2423-2431). By using random dot kinematograms with 300 coloured Gaussian blobs (0.22 deg, 1 deg/s, 5.1 deg x 4 deg) we assessed the chromatic selectivity of the global motion mechanism. Observers had to distinguish between an interval with random motion and an interval with 40% of the blobs moving either left or right (2IFC).
Fragmenting the barber pole illusion
In a Barber Pole stimulus, different local motion signals (directions) arise from boundary regions and the central region of the aperture, which need to be combined in order to produce a coherent motion percept. Changing the stimulus geometry affects the tendency to perceive motion along the major axis of an elongated aperture (the Barber Pole illusion). Subjects were asked to report perceived direction and the strength of their percept, and the orientation of gratings moving behind a rectangular aperture and the aspect ratio of the aperture were varied independently. Perceived motion direction is closest to perpendicular to grating orientation when aspect ratio approaches unity, i.e. square-shaped apertures, and when grating orientation is close to parallel to the shorter aperture boundary. The pattern of results indicates an interaction between the cycle ratio, which is the sinewave grating equivalent of the terminator ratio for line stimuli, and grating orientation that is effective in the central region. This suggests that a simple cycle (or terminator) ratio explanation cannot fully account for the properties of the Barber Pole Illusion, and generates the prediction that the illusion should be stronger if the overall length of the boundaries is increased while keeping overall stimulus area and aperture shape constant. The prediction was tested experimentally by fragmenting the aperture into a set of smaller apertures of identical shape and constant cumulative area, and measuring perceived direction. The results of this experiment indicate that the strength of the illusion increases with the number of aperture fragments, i.e. the ratio between circumference and overall area, or the relative contributions from the boundary regions.
Interactions between visual stimuli across the visual field
The increase in the phenomenon of 'crowding' (the loss of legibility of letters when surrounded by other letters) in the peripheral visual field is well documented; however, its explanation is not yet clear. As letters contain energy at many spatial scales, it is possible that the increase in crowding in the periphery may simply reflect the increasing spatial scale of the peripheral retina. On the other hand, if crowding were still to increase in the peripheral field when the stimuli contained only a single spatial scale, this would suggest that the peripheral field is not merely a coarser version of the foveal field. We have measured detection thresholds for a target stimulus that was flanked by two 'masking' stimuli. All stimuli were small patches of sinewave grating so as to limit the spatial scale of the stimuli. We found evidence for increasing interactions between the stimuli as one moves from the fovea to the periphery, though whether this reflects simply stronger interactions or interactions over a greater distance is not yet clear. However, interactions as assessed by summation thresholds do not change across the visual field.
Integration of spatial frequency signals in visual search
How do we locate and discriminate targets in multiple arrays? In localisation experiments, subjects indicated the position of an oddball spatial frequency (SF) Gabor target stimulus amongst uniform SF Gabor distracter stimuli. In discrimination experiments, they indicated whether the target was higher or lower in SF than the distracters. Spatial frequency difference thresholds were measured by a forced choice method in which the constant-stimuli were sets of SF differences (df). In both localisation and discrimination experiments, the stimuli were 150 msec single frame presentations, usually of 4 Gabor targets. We used a procedure whereby a proportion (k<1) of the SF difference signal (df) is added to the distracters (Baldassi & Burr, 2000, Vision Res, 40, 1293-1300). We refer to this fraction as the bias. Thus with zero bias, if the SF of the distracters was f cycles/deg, that of the target was f+df cycles/deg. For a nonzero bias the SF of the target would still be f+df but the distracters would be f+k·df. Positive values of bias (k) weakened localisation but enhanced discrimination whereas negative bias enhanced localisation and weakened discrimination. We found that the SF sensitivity (1/threshold) was a linear function of the bias. The slope of this function divided by the sensitivity at zero bias (S0) is a variable (m) interpretable in terms of the way signals are combined across target and distracters. For localisation, m was close to -1, despite stimulus changes (contrast, eccentricity, set size) and task changes (unidirectional versus bidirectional SF differences) that influenced threshold S0. For discrimination, m was close to +1 for parafoveal stimuli and increased with eccentricity but was generally smaller than predicted from "compulsory averaging" of target and distracters.
Thus the extent of integration of target and distracter stimuli in visual search depends on task demands as well as visual field eccentricity.
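The bias procedure and the normalised slope m described in this abstract can be sketched as follows. The function names are hypothetical. The default of four items (one target, three distracters) is an assumption based on the displays described, and the line fit uses ordinary least squares, which the abstract does not specify.

```python
def search_frequencies(f, df, k, n_items=4):
    """Spatial frequencies (cycles/deg) for one display with bias k:
    the target is at f + df and every distracter is at f + k*df."""
    target = f + df
    distracters = [f + k * df] * (n_items - 1)
    return target, distracters

def normalised_slope(biases, sensitivities):
    """Fit sensitivity = S0 + slope * bias by least squares and return
    m = slope / S0, where S0 is the fitted sensitivity at zero bias."""
    n = len(biases)
    mean_k = sum(biases) / n
    mean_s = sum(sensitivities) / n
    sxx = sum((k - mean_k) ** 2 for k in biases)
    sxy = sum((k - mean_k) * (s - mean_s)
              for k, s in zip(biases, sensitivities))
    slope = sxy / sxx
    s0 = mean_s - slope * mean_k
    return slope / s0
```

As a worked check: if localisation depended only on the target-distracter difference, which is (1 - k)·df under this procedure, the threshold df would scale as 1/(1 - k) and sensitivity as (1 - k). Feeding sensitivities proportional to (1 - k) into `normalised_slope` returns m = -1, the value the abstract reports for localisation.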
The eyes can search large displays more effectively than small ones: an oculomotor paradox?
Several experiments were carried out to examine the use of spatial frequency information in the accurate programming of saccades. Subjects were asked to search for a Gabor patch which had a predefined spatial frequency content. In the first experiment, a target (of a predefined spatial frequency content) was presented on the horizontal meridian at either 3 or 6 degrees from the centre while a distractor (which had a different spatial frequency content) was shown at the other eccentricity. Both patches were shown on the left or right hand side of the screen. In a second experiment, sixteen vertically oriented Gabor patches were presented in two annuli with 8 stimuli on each. One target was shown along with 15 distractors. Subjects' eye movements were recorded on a DPI eyetracker. Subjects could not discount the presence of the distractor from the saccade programming when a distractor was placed between the target and the central fixation point. However, they were able to accurately direct first saccades on the basis of a difference in spatial frequency when the target was presented in the circular layout even when a distractor was placed between the fixation spot and the target as in the first experiment. The results suggest the paradoxical conclusion that the greater the number of distractor elements the easier it is to localise the target. That this search paradox was carried out on the basis of grouping the distracting elements (Bravo & Nakayama, 1992, Perception & Psychophysics, 51, 465-472; Duncan & Humphreys, 1989, Psychological Review, 96, 433-458) is discounted.
The shape of orientation pop-out
An orientation singularity is rapidly detected in a display of iso-oriented elements, but its location may be coded imprecisely (Solomon & Morgan, 2001, Journal of Vision, 1, 9-17). We describe the exact shape of such pop-out at different positions in the visual field. The figure shows a trial sequence. Stimulus arrays were 9x9 (as shown) or 5x5. There were 3840 target trials for each array, and up to 160 target-absent trials to estimate response bias. Nearly all errors were toward distracters near the target. Their distribution over the visual field was inhomogeneous, with most errors for targets at larger angles of visual eccentricity, above and below fixation. Results show that localisation was more accurate in the dense array, where there were more distracters, and more potential target locations. This finding is supportive of a role for contextual influences in orientation pop-out, and precludes an explanation in terms of signal detection among independent orientation samples. Learning and attention had considerable effects on performance in this task, and their contribution will be discussed.
Motion vs. position in the perception of head-centred movement
Observers compensate for the retinal motion created by an eye movement by adding sensed retinal motion to the felt movement of the eye. One technique used to investigate the relationship between retinal and extra-retinal motion signals asks observers to pursue a target and adjust the velocity of the background pattern until it appears stationary. Typically, the background must move in the same direction as the eye to achieve the null. This Filehne illusion suggests that extra-retinal, eye-velocity signals are smaller than their retinal counterpart, a conclusion that underwrites much thinking in the literature. Like the motion after-effect, however, the Filehne illusion is not accompanied by any compelling change in perceived position yet motion and position are confounded when using the traditional technique. We devised a new technique, based on global motion stimuli, that degrades the influence of familiar position cues. Stimuli consisted of signal and noise dots that were displayed as observers pursued a moving target. All dots moved at the same retinal speed. Observers adjusted the percentage of signal dots until the stimulus appeared stationary with respect to the head. We found that as base retinal speed increased, less signal was needed to achieve the null. One consequence is that the different signal and noise mixtures at the null point should appear to move at the same retinal speed. A second experiment confirmed this idea and also showed that the matched retinal speed equalled that obtained using the traditional nulling technique. Positional information appears to have little influence on the Filehne illusion.
Speed, accuracy and performance in visual search
In visual searches through random displays in which target contrast (c) and distractor number (set size, N) are varied, we model behavior by d' = signal/noise = c.T/(N.T.Ve+T.Vi). There are two free parameters: Ve, the external noise variance due to each distractor, and Vi, the internal noise. The total noise is the sum, assuming independence of each noise source. The time T required for processing the display is the mean correct RT less the simple RT (the sensory + motor 'residual' latency, estimated from the time to respond to a target presented with no distractors). Signal strength is target contrast (c) multiplied by observation interval (T) assuming a constant rate of information extraction. In our experiments, c was manipulated by varying the Euclidean distance in [u', v'] color space between the grey field and an equiluminous colored target. Similarly for Ve and distractor contrast.
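The model stated in this abstract can be evaluated directly. The sketch below transcribes the expression literally as printed, d' = c.T/(N.T.Ve + T.Vi), and makes no claim about how the authors fitted it. The function and argument names are hypothetical, and the example values are invented for illustration only.

```python
def processing_time(mean_correct_rt, simple_rt):
    """T: mean correct RT less the simple ('residual') RT, in the same units."""
    return mean_correct_rt - simple_rt

def d_prime(c, n_distractors, t, v_external, v_internal):
    """d' = signal / noise, transcribed from the printed expression:
    signal = c*T, noise = N*T*Ve + T*Vi (sum of independent noise sources)."""
    signal = c * t
    noise = n_distractors * t * v_external + t * v_internal
    return signal / noise
```

With illustrative values c = 0.2, N = 4, T = 0.5 s and Ve = Vi = 0.01, the signal is 0.1, the noise term is 0.025, and `d_prime` returns approximately 4. Holding the other terms fixed, increasing the set size N increases the external noise term and so lowers d'.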
Dynamic visual processes in normal reading: Implications for developmental dyslexia?
Data from two studies relating visual task performance to contextual reading are presented. The first study investigated the relationship between contextual reading and: a) relative spatial encoding for symbol arrays, as well as b) central versus peripheral sensitivity to the frequency doubling illusion. In the first study, thirty unselected school children were measured on their ability to solve a foveally-presented spatial encoding task, as well as their sensitivity to the frequency doubling illusion across the retina. Their performance in the frequency doubling and spatial encoding tasks was uncorrelated, suggesting that these tasks tap independent visual processes. Peripheral (but not central) sensitivity to frequency doubling, as well as spatial encoding, predicted statistically significant, independent proportions of variance in contextual reading (Neale Analysis of Reading Ability). These effects persisted even when variance due to age, IQ, phonological skill and short-term memory was statistically accounted for. The data suggest that successful reading requires not only information about letter identity, but also at least two additional sources of information, probably related to spatial processing of words. The first is a central mechanism that may define the relative spatial location of letters within words, and the second is a peripheral mechanism that we speculate may be related to the attentional processes involved in coarse-scale localisation within a body of text. Consistent with this speculation, we found in the second study that reading accuracy for dyslexic readers was most impaired relative to chronological- and age-matched controls when contextual material was presented in whole paragraphs, rather than line-at-a-time or word-at-a-time reading conditions.
How do task demands influence human gaze shifts in a 3-D scene?
We move our eyes 3-5 times every second to obtain information about our visual surroundings. In everyday situations this information is embedded within a highly complex 3-D scene. How do different task demands influence the dynamics of these gaze shifts? We have begun to study this in a 3-D scene containing real world objects.
Variations in perceptual changes viewing an ambiguous stimulus: differences between naïve and experienced observers
The perceptual changes (PCs) associated with viewing the Necker Cube (NC) occur in two phases. For 2-3 min the rate increases before entering a stationary phase wherein the rate remains steady and the data are amenable to time series analysis (Brown, 1955, Amer. J. Psychol. 68, 358-371; Borsellino et al, 1972, Kybernetik, 10, 139-144). However, our experienced observers only seem to exhibit the stationary phase. We wanted to know why the initial phase was absent.
Variations in perceptual changes viewing an ambiguous stimulus: methodological difficulties
Observers find it easy whilst viewing a Necker cube (NC) to indicate the perceptual changes (PCs) that appear to occur between the “cube-up” and “cube-down” perceptual alternatives. Moreover, the frequency distributions associated with either the “cube-up” or “cube-down” percepts, when plotted as a function of percept duration, are adequately modelled by the gamma distribution (Borsellino et al, 1972, Kybernetik, 10, 139-144). However, we failed to replicate this finding when our four experienced observers simply recorded PCs.
REGISTRATION FEES
Students
Other
Payments (cheques made payable to 'Applied Vision Association') and notes of intention to attend the meeting should be sent to Vicky Heath (v.e.heath@aston.ac.uk)
FURTHER DETAILS:
For further information, payment of registration fees, or requests to be included on the mailing list, please contact: Dr Tim Meese (AVA)
webdesign by ablen
The AVA is a registered charity (No: 1049146)