Concept Paper
Peer-Review Record

Attention as a Unitary Concept

by Adam Reeves
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 24 July 2020 / Revised: 26 October 2020 / Accepted: 2 November 2020 / Published: 9 November 2020

Round 1

Reviewer 1 Report

The author makes an ambitious attempt to review and unite attention as a selection phenomenon, where, by selection, he means a process that improves performance, woefully still our only criterion for the presence of attention.

 

I like the review aspect of the paper; it is well thought out, broad, and inclusive. The modeling part is interesting, as is the extension to some classic paradigms — but it is hard to fit this level of modeling into a few pages in a comprehensible fashion. I will come back to it later.

 

My problem is that I am not sure what the overall message is, and I think that the author is unsure as well.

 

One core point is the performance advantage given by attention: “Intuition suggests that paying attention should enhance performance.” But then in the final paragraph, he lists two papers that challenge the attention-performance link — Fine and Reeves and Bahcall and Kowler. He then concludes that attention need not improve performance. It must be operationalized by instruction or task and not by outcome.

 

This is a pretty weak ending because it is not clear how you would know that your instructions were appropriate. Perhaps I tell you to pay attention to the cued location but performance shows no cuing advantage. I would conclude that the cue was ineffective (too brief, visual field too cluttered, response instructions too complicated) before I would pass on to the next interpretation – attention was there but the signal was impervious to attention’s powers.

 

In other words, the author’s conclusion about performance could just as easily be applied to his operationalization of attention through instructions and task. Specifically, the appropriate instructions and task could normally ensure the engagement of attention but need not do so.

 

I suggest that the author could back up the failures of attention with a deeper analysis and hold to improvement as the mark of attention. It certainly intends to improve performance, but it can be thwarted. This would be a more defensible position than giving up and holding on to good instructions as the definition of attention.

 

Yes, there are cases where the participant’s brain has engaged the proper mechanisms to favor processing of the target, but this incurs costs as well. Bahcall and Kowler found that when two targets are close, they interfere with each other’s readout – this is a simple crowding effect. Attention has inhibitory surrounds (in space, Jeffrey Mounts; in time, Yaffa Yeshurun; and in feature space, Tsotsos, and Störmer & Alvarez), so attending to one location, moment, or feature has costs nearby. Perhaps in the Fine and Reeves case, the attentional load for refining the focus of attention outweighed the baseline performance of attending to all locations.

 

So yes, practically speaking, the consequences of attending may include a loss of performance. Such cases can then be examined and lead to a better understanding of the architecture of attentional selection. They should not lead to a suggestion that the performance criterion be abandoned, especially not as the final and most important conclusion of the paper.

 

The author should also point out the other factors that can improve performance that we should not consider as attention. First, of course, learning. Please differentiate learning and attention. Second, short-term memory. Well, this is really learning too, but on a shorter time scale. Most would say that attention, or at least awareness, is necessary for items to enter explicit memory, and certainly we could think of attention as memory at time zero. The author has avoided the issue of attention as a gateway to awareness (and so to memory), and that is a good thing. There is a lot of debate there, but it does not really lead anywhere. Anyway, how do we distinguish between performance advantages that arise from memory and those that arise from attention? I believe the author could make this distinction clear in a sentence or two. What about other low-level effects like aftereffects – orientation adaptation raises threshold but also improves orientation discrimination around the adapted orientation. These are small points, but if the author were to stick to performance as the key diagnostic of attention, then he needs to dismiss other factors that boost performance.

 

Now what about all this modeling, which makes up the bulk of the paper.

 

The prescriptive model is impressively explicit in giving labels to factors but assumes some kind of error minimization. That is problematic because it requires that the goal be known – by the system. That sounds a lot like supervised learning. How does the system know the goal? (Or could the goal be a high signal-to-noise ratio? But how would it know that?) I feel that attention is unsupervised – it does not know what will be at location X, but it will give it all possible advantages for recognition. It is like the volume control on the stereo system. It doesn’t need to know what is being processed or what the content goal might be; it just boosts what is coming in.

 

The descriptive model is also quite weighty but describes overall performance in a task involving decisions, goals, weighting, and attention – where it is not clear what parameters are controlled by attention, these “need specifying.” This may be too broad to lead to insights about attention.

 

The AGM model, if you will permit me the redundancy, is more targeted on attention and the attentional window. This could be broadened to include inhibitory surrounds (see papers by Mounts and Tsotsos) and feature space (Tsotsos again, and Störmer and Alvarez). This identifies some of the architecture of attention itself.

 

The application to visual search again strays from the infrastructure of attention and models the overall cognitive process. It is not clear where attention plays its role here. There is the “A” in the M equation, but as before we do not know (unlike the AGM model) which parameters are controlled by attention, so we cannot know how quirks of attention would be visible in the performance. Also, “A” has changed from being all parameters controlled by attention into the attention window. Perhaps if the model presentation began with that, it would be more cohesive. What is confusing is that the different sections appear at first to be about different, perhaps competing models. But in the end, they are a cumulative growth to a model “M” of everything. If I have understood correctly, this should be made clear at the start. Or if indeed these are separate models that happen to borrow bits of each other, then make that clear as well.

 

In the section on the derivation of d’ we definitely have a problem: we cannot know what part of d’ is due to attention and what part is due to learning. It is not clear here how to disentangle the two. The factor “A(..)” in the model doesn’t help much.

 

For the attention repulsion effect, the model M is expanded to somehow allow shifts in location. I am not sure how this happens nor what we should conclude from it. Does boosting the signal increase space as well as signal? It is interesting as it is the first section dealing with the ability of attention to affect stimulus properties other than signal strength. Carrasco’s studies of attention’s effect on contrast might be another example, although controversial, and that too is just signal strength.

 

It is interesting that the author has gone through all these modeling exercises without mentioning Claus Bundesen, who has done pretty much the same thing. Or Posner’s ANT approach, which tries to separate elements of attention in addition to selection. They at least deserve mention.

 

OK, overall, the model is very descriptive – here are some equations that have symbols for pretty much everything that is happening in a task. One of them is labeled “A” and so encapsulates all that attention is and does. Fine, but I don’t feel that gets us anywhere. Only the section on the attention window gave some specific properties that are known for attention’s architecture. More of that please. One way to expand on it is to include not just time, space, and feature space in the window (with inhibitory surrounds) but also content. Attention is famously limited in bandwidth. That has not been discussed but could be an explicit part of “A(..)”.

 

The modeling is enclosed in a discussion of whether attention can be defined by performance improvement. That needs sharpening and also needs something to distinguish attention-driven improvement from the host of other factors that can improve performance.

 

Author Response

Reviewer1.doc, August 14, 2020

The author makes an ambitious attempt to review and unite attention as a selection phenomenon, where, by selection, he means a process that improves performance, woefully still our only criterion for the presence of attention.

I like the review aspect of the paper; it is well thought out, broad, and inclusive. The modeling part is interesting, as is the extension to some classic paradigms — but it is hard to fit this level of modeling into a few pages in a comprehensible fashion. I will come back to it later.

My problem is that I am not sure what the overall message is, and I think that the author is unsure as well.

 The point of the paper is to present an account of what ‘selection’ means; this is made clear both by the prescriptive section at the start, and the (brief) comparison with other forms of attention at the end.

The ending of the paper misled the reviewer into thinking that the core message was that attention must be operationalized by task rather than outcome. I agree that this formed a weak ending, especially as only Fine & Reeves really nailed this, and in a review article like this one, a single research result cannot be taken as definitive. I nevertheless believe it. I therefore moved that paragraph up and introduced it as a caveat to the notion of ‘selection’ as necessarily improving performance.

 One core point is the performance advantage given by attention: “Intuition suggests that paying attention should enhance performance.” But then in the final paragraph, he lists two papers that challenge the attention-performance link — Fine and Reeves and Bahcall and Kowler. He then concludes that attention need not improve performance. It must be operationalized by instruction or task and not by outcome.

This is a pretty weak ending because it is not clear how you would know that your instructions were appropriate. Perhaps I tell you to pay attention to the cued location but performance shows no cuing advantage. I would conclude that the cue was ineffective (too brief, visual field too cluttered, response instructions too complicated) before I would pass on to the next interpretation – attention was there but the signal was impervious to attention’s powers.

In other words, the author’s conclusion about performance could just as easily be applied to his operationalization of attention through instructions and task. Specifically, the appropriate instructions and task could normally ensure the engagement of attention but need not do so.

I now point out that focusing did improve performance on optotypes, just not on letters, and direct the interested reader to Reeves (2019) which explains why; in a word, ‘noise’ in the letter task increases with focusing by more than does the signal, but in the optotype task by less than the signal. (The reviewer appears to think that focusing was ineffective, but with letters, it made things significantly worse.)

I suggest that the author could back up the failures of attention with a deeper analysis and hold to improvement as the mark of attention. It certainly intends to improve performance, but it can be thwarted. This would be a more defensible position than giving up and holding on to good instructions as the definition of attention.

I agree that attention normally improves performance but in some cases, it does not - and ‘thwarts’ expresses this well. But I conclude from this that defining selective attention by an improvement in performance is contrary to fact, as increasing attention (by focusing on 2 of 4 possible well-separated positions in the visual field, an easy task) reduced d’ in our study. It does so when increasing attention increases the ‘noise’ more than the ‘signal’, as modelled by Reeves (2019).  I regard this explanation as ‘deeper’ than the one offered by the reviewer. However I acknowledge that only if other labs come up with further examples in which increasing attention increases the noise more than the signal, will this idea gain currency.

 Suppose that focusing attention increases the signal and noise equally, as would occur if both are encoded by the same filter; then the S/N ratio will not change even as attention is increased. An example is listening for a tone in narrow-band noise, both being in the same critical band, or selecting a color from a patch of surrounding colors, when the total array falls inside the radius of the attention field. This would be another, I think fairly common, way in which performance and attention may be uncoupled. Concentrating attention doesn’t always improve performance, and when other factors are controlled for, thinking about the S/N ratio may explain why.
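To make this concrete, here is a minimal numerical sketch (illustrative values only, assuming a standard equal-variance signal-detection setup in which attention multiplies signal and noise by the same gain):

```python
# Sketch: if attention amplifies signal and noise equally, d' is unchanged.
# Equal-variance Gaussian signal-detection assumptions; numbers are illustrative.
signal = 1.5      # mean internal response added by the target
noise_sd = 1.0    # standard deviation of the internal noise

def d_prime(sig, sd):
    return sig / sd

for gain in (1.0, 2.0, 4.0):   # attentional gain applied to both signal and noise
    print(gain, d_prime(gain * signal, gain * noise_sd))   # always 1.5
```

Only when the gains on the signal and on the noise differ does the ratio, and hence performance, move.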

Yes, there are cases where the participant’s brain has engaged the proper mechanisms to favor processing of the target, but this incurs costs as well. Bahcall and Kowler found that when two targets are close, they interfere with each other’s readout – this is a simple crowding effect. Attention has inhibitory surrounds (in space, Jeffrey Mounts; in time, Yaffa Yeshurun; and in feature space, Tsotsos, and Störmer & Alvarez), so attending to one location, moment, or feature has costs nearby. Perhaps in the Fine and Reeves case, the attentional load for refining the focus of attention outweighed the baseline performance of attending to all locations.

There are two ideas here. The first is attentional inhibition, which is compatible with more attention decreasing performance, and so does not alter my case. Crowding may be attentional or due to low-level confusion of features inside the same receptive field – there are examples of both – but this is complicated and would take me too far off track to discuss here. The other idea is ‘load’, which is an important concept and might explain why increasing attention has no further effect – the maximum attention has already been given – but it does not apply to the optotype/letter contrast in Fine & Reeves’ study, as I now explain.

The author should also point out the other factors that can improve performance that we should not consider as attention. First, of course, learning. Please differentiate learning and attention. Second, short-term memory. Well, this is really learning too, but on a shorter time scale. Most would say that attention, or at least awareness, is necessary for items to enter explicit memory, and certainly we could think of attention as memory at time zero. The author has avoided the issue of attention as a gateway to awareness (and so to memory), and that is a good thing. There is a lot of debate there, but it does not really lead anywhere. Anyway, how do we distinguish between performance advantages that arise from memory and those that arise from attention? I believe the author could make this distinction clear in a sentence or two. What about other low-level effects like aftereffects – orientation adaptation raises threshold but also improves orientation discrimination around the adapted orientation. These are small points, but if the author were to stick to performance as the key diagnostic of attention, then he needs to dismiss other factors that boost performance.

I have not taken this advice. Each of these other factors has its own role to play in performance and discussing them would turn this paper into a textbook. A simple example is learning. Most research on selective attention uses well-practiced subjects who have either learnt the task over several sessions, or use a task that is so elementary that only a short period of practice is needed.  That is, the factor of ‘learning’ is eliminated as far as possible. From the point of view of learning theory, one may ask if learning presupposes attention or not – can one ‘learn without awareness’? I see this question as basic to learning but one that can be answered either way, without altering the meaning of ‘selective’ attention as specified in the model I am presenting. I just don’t think it necessary to list all the other (well-known) factors that can influence performance, in a paper which specifically argues that selective attention is defined by pre-chosen weights and by biases, and is not defined by changes in performance.

 Now what about all this modeling, which makes up the bulk of the paper.

The prescriptive model is impressively explicit in giving labels to factors but assumes some kind of error minimization. That is problematic because it requires that the goal be known – by the system. That sounds a lot like supervised learning. How does the system know the goal? (Or could the goal be a high signal-to-noise ratio? But how would it know that?) I feel that attention is unsupervised – it does not know what will be at location X, but it will give it all possible advantages for recognition. It is like the volume control on the stereo system. It doesn’t need to know what is being processed or what the content goal might be; it just boosts what is coming in.

An excellent point. I was using the term ‘error’ based on my understanding of saccade models, in which the ‘error’ is a difference between a goal and a current value (here, y − y’), not an ‘error’ in the sense of a discrepancy with reality or logic (as in ‘2 + 2 = 5’). I have now made this plain. Thank you.

The descriptive model is also quite weighty but describes overall performance in a task involving decisions, goals, weighting, and attention – where it is not clear what parameters are controlled by attention, these “need specifying.” This may be too broad to lead to insights about attention.

I made the model (Eqs. 1 and 2) as generic as possible so that any function of time and space could be inserted into the term for attention (A), as I was trying to lay down its preconditions, such as having defined goals, biases, and so on. So the generic model is maximally ‘broad’ about selective attention itself, that is, about how weights are to be applied to inputs. (The AGM is just an illustrative example of an attention model that can be fit into the generic model, up to the point at which short-term memory is included, when it leaves the domain of ‘selection’ and becomes concerned with retention.) Whether the generic model will turn out to be useful remains to be seen; I hope it will.
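For concreteness, here is a hypothetical sketch of how such a generic weighted-selection scheme could be specialized, with a simple Gaussian spatial window standing in for A (an illustration of the idea only, not the paper’s actual Eq. 1):

```python
import math

def attention_window(s, center=0.0, sigma=1.0):
    # Hypothetical attention term A(s): a Gaussian spatial window whose
    # weights fall off with distance from the attended location.
    return math.exp(-(s - center) ** 2 / (2 * sigma ** 2))

def select(inputs, positions, bias=0.0, center=0.0, sigma=1.0):
    # Apply window-determined weights to the inputs and sum, plus a bias term.
    weights = [attention_window(s, center, sigma) for s in positions]
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# Equal inputs at 0, 1, and 3 degrees; attention centered on 0 degrees,
# so the nearest input dominates the weighted output.
print(select(inputs=[1.0, 1.0, 1.0], positions=[0.0, 1.0, 3.0]))
```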

The AGM model, if you will permit me the redundancy, is more targeted on attention and the attentional window. This could be broadened to include inhibitory surrounds (see papers by Mounts and Tsotsos) and feature space (Tsotsos again, and Störmer and Alvarez). This identifies some of the architecture of attention itself.

 This is a very interesting point. In simple terms, it means that some items receive not zero, but negative weights. I have now stated this in presenting the definition of w. I thank the reviewer for making this point. I have not expanded on this any further, as the paper already includes a good number of examples and I am not claiming to be exhaustive. But the reviewer thinks that the ‘architecture’ of attention, which I take it means more than just including negative weights, is involved.  This may well be so, in which case the generic model would have to be modified.
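One way to picture negative weights is a difference-of-Gaussians profile, excitatory at the attended location and inhibitory in the surround (a hypothetical sketch with arbitrary parameters, not a claim about the manuscript’s w):

```python
import math

def dog_weight(s, sigma_center=1.0, sigma_surround=3.0, surround_gain=0.6):
    # Difference of Gaussians: positive near the attended location (s = 0),
    # negative (inhibitory) a few degrees out, returning toward zero far away.
    center = math.exp(-s ** 2 / (2 * sigma_center ** 2))
    surround = surround_gain * math.exp(-s ** 2 / (2 * sigma_surround ** 2))
    return center - surround

for s in (0.0, 1.0, 2.0, 4.0):
    print(s, round(dog_weight(s), 3))   # weights dip below zero beyond s ≈ 2
```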

The application to visual search again strays from the infrastructure of attention and models the overall cognitive process. It is not clear where attention plays its role here. There is the “A” in the M equation, but as before we do not know (unlike the AGM model) which parameters are controlled by attention, so we cannot know how quirks of attention would be visible in the performance. Also, “A” has changed from being all parameters controlled by attention into the attention window. Perhaps if the model presentation began with that, it would be more cohesive. What is confusing is that the different sections appear at first to be about different, perhaps competing models. But in the end, they are a cumulative growth to a model “M” of everything. If I have understood correctly, this should be made clear at the start. Or if indeed these are separate models that happen to borrow bits of each other, then make that clear as well.

Yes, they are all meant to be examples of how ‘M’ can be specialized. I now clarify this in the Introduction and also at the start of the relevant section.

In the section on the derivation of d’ we definitely have a problem: we cannot know what part of d’ is due to attention and what part is due to learning. It is not clear here how to disentangle the two. The factor “A(..)” in the model doesn’t help much.

The model (and factor A) assumes stationarity, i.e., that ‘learning’ has ceased and ‘performance’ will not change during the course of an experiment, so any ‘disentangling’ has to be done experimentally. I thank the reviewer for bringing up this point,  and I now make stationarity explicit.

 

For the attention repulsion effect, the model M is expanded to somehow allow shifts in location. I am not sure how this happens nor what we should conclude from it.  Does boosting signal increase space as well as signal? It is interesting as it is the first section dealing with the ability of attention to affect stimulus properties other than signal strength. Carrasco’s studies of attention’s effect on contrast might be another example, although controversial, and also just signal strength.

Boosting the signal can increase the effective space parameter if a fixed filter with a gaussian shape, fixed sigma, and variable amplitude, is embedded in constant internal neural noise. In this case, the tails of the gaussian, initially indistinguishable from noise, may be boosted enough above the noise level to capture a wider range of signals. The model as presented does not include this property, which has been observed in neurons but, as far as I know, not behaviorally. Another interesting question, but I have not tried to address it here.
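A back-of-the-envelope version of that argument, assuming a Gaussian gain profile of fixed sigma riding on a constant noise floor: the region in which the profile exceeds the floor widens as the amplitude is boosted.

```python
import math

SIGMA = 1.0        # fixed width of the Gaussian filter (arbitrary units)
NOISE_FLOOR = 0.2  # constant internal noise level (arbitrary units)

def effective_halfwidth(amplitude, sigma=SIGMA, floor=NOISE_FLOOR):
    # Half-width of the region where amplitude * exp(-s^2 / (2 sigma^2))
    # stays above the noise floor.
    if amplitude <= floor:
        return 0.0
    return sigma * math.sqrt(2 * math.log(amplitude / floor))

for amp in (0.3, 1.0, 3.0):
    print(amp, round(effective_halfwidth(amp), 2))   # widens with amplitude
```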

It is interesting that the author has gone through all these modeling exercises without mentioning Claus Bundesen, who has done pretty much the same thing. Or Posner’s ANT approach, which tries to separate elements of attention in addition to selection. They at least deserve mention.

Yes. I had thought that Bundesen’s TVA was primarily a model of short-term visual memory, and I set it aside, but on reading the ‘extended’ TVA model of 2015, I see real commonalities with Eq. 1, which I now refer to. Thank you for this illuminating comment.

I now reference Posner’s (2012) paper, especially as he identifies possible brain mechanisms underlying selection (or ‘orientation’) versus alerting and executive control, at the end of the paper where I list other meanings of ‘attention’. Thanks for this useful pointer.

OK, overall, the model is very descriptive – here are some equations that have symbols for pretty much everything that is happening in a task. One of them is labeled “A” and so encapsulates all that attention is and does. Fine, but I don’t feel that gets us anywhere. Only the section on the attention window gave some specific properties that are known for attention’s architecture. More of that please. One way to expand on it is to include not just time, space, and feature space in the window (with inhibitory surrounds) but also content. Attention is famously limited in bandwidth. That has not been discussed but could be an explicit part of “A(..)”.

I now include a definition of attentional bandwidth as the integral of the weights determined by the attention window multiplied by the selected inputs. This is a nice way of incorporating an important concept. I thank the reviewer for this hint.
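Read discretely, that definition might be sketched as a weighted sum (a hypothetical rendering; the manuscript’s exact formulation may differ):

```python
def attentional_bandwidth(window_weights, selected_inputs):
    # Discrete stand-in for "the integral of the window weights multiplied
    # by the selected inputs"; names and form are illustrative only.
    return sum(w * x for w, x in zip(window_weights, selected_inputs))

# A narrow window (weights fall to zero quickly) yields a smaller bandwidth
# than a broad one over the same inputs.
print(attentional_bandwidth([1.0, 0.5, 0.0], [0.8, 0.8, 0.8]))   # ~1.2
print(attentional_bandwidth([1.0, 1.0, 1.0], [0.8, 0.8, 0.8]))   # ~2.4
```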

  A different meaning of ‘bandwidth’ is capacity limitation. Given that items are attended, can they all be processed further on, recalled, etc.? For example, not more than 4 items in visual short-term memory. I treat this as a STM limitation, but it implies that the attention window must be limited in time so as not to overwhelm STM, which I now state. I thank the reviewer for this very helpful point.

As for leaving ‘A’ undefined, except for examples, I believe this is the correct procedure, given that attention does not have a universal architecture but a set of particular architectures, because the various sensory and perceptual systems solve different problems and need selection to occur in ways that are tailored to each system.

The modeling is enclosed in a discussion of whether attention can be defined by performance improvement. That needs sharpening and also needs something to distinguish attention-driven improvement from the host of other factors that can improve performance.

So yes, practically speaking, the consequences of attending may include a loss of performance. Such cases can then be examined and lead to a better understanding of the architecture of attentional selection. They should not lead to a suggestion that the performance criterion be abandoned, especially not as the final and most important conclusion of the paper.

I agree; it is not the main conclusion of the paper. I have moved this paragraph up and presented our result as a caveat concerning the main topic, which is how selection can be defined. 

Reviewer 2 Report

Attention as a unitary concept

Page 2 line 58 italicize a to designate it as a parameter rather than the indefinite article.

I am not sure about the assumption that

“If w is graded, more highly-weighted inputs impose proportionally greater costs.”

If I am waiting for my friend off the train, then I am, in a sense, primed to see my friend, hence lower computational costs? Broadbent refers to this as “pigeon-holing”. The relation between weights and costs may be the reverse of what is stated in the quote.

I am very favourably inclined towards more definite theoretical explication; however, on reflection, the notions of “error”, “goals”, and “cost” are introduced but are in need of further explanation.

Let’s assume that I am trying to hear what the person sat behind me on the train is saying without turning round to watch. My goal is to comprehend what they are saying and error could be defined relative to recovering a sensible message. However, there is no guarantee that what I think was said and what was actually said are the same. In this case, what I recover may be in error but without knowing what was actually said there is no way to quantify this error. I am therefore left puzzling as to how it is that error is to be quantified for the system to be able to operate in the way discussed.

Now it may be that I am misconstruing what is meant by "error" (and reading on this is quite possible) but still, this important topic needs clarification. I think what is being discussed is “prediction error” but if so this does need spelling out and its relation to attention needs further discussion.

Line 193. “convers”?

I was unable to get hold of a copy of Fine and Reeves (2018) and this does seem to be a critical reference.

L. 363-364. This is the first time I have consulted the work of Bahcall and Kowler, and this summary does not jibe with what their account implies (for me at least). In their first model (difference-of-Gaussians model) the dip in performance at close separations arises because the positive region of one DOG overlaps with the negative region of the adjacent DOG, hence overall activation is reduced. Surely the more relevant literature is on visual crowding, but these links remain unexplored.

In sum, I enjoyed reading this brief report and I am very favourably inclined to theoretical definiteness. However, as I have noted above, I take issue with some of what is written. In addition, I feel that the paper lacks a clear narrative and seems to end with undue haste. It comprises an episodic summary of a handful of studies, and I would be hard pressed to repeat what the take-home message actually is. In fact, having read the paper, I re-read the abstract and was slightly surprised to find that what is written there as the case being made did not (for me at least) emerge from the text itself. So, lots of interesting content, but currently the lack of a clear story makes it hard to recover what of critical import is being said and why.

Author Response

Reviewer 2

Attention as a unitary concept

Page 2 line 58 italicize a to designate it as a parameter rather than the indefinite article.

I used the Greek α in the original article to avoid ‘a’ for this reason. I hope the version the reviewer gets to read permits this font.

I am not sure about the assumption that

“If w is graded, more highly-weighted inputs impose proportionally greater costs.”

If I am waiting for my friend off the train, then I am, in a sense, primed to see my friend, hence lower computational costs? Broadbent refers to this as “pigeon-holing”. The relation between weights and costs may be the reverse of what is stated in the quote.

Yes, I accept this point. Items that are selected for will typically get more processing later on – you will notice details about your friend that you would not notice about a stranger – and this will take time and impose some form of cost. But priming will reduce the computational burden if the prime is close enough to the target, and distant enough from possible competitors, to make it easier to extract the features which define the input. I now mention that priming, like the word frequency effect or other influences on the efficiency of encoding, can modulate the input, x.

I am very favourably inclined towards more definite theoretical explication; however, on reflection, the notions of “error”, “goals”, and “cost” are introduced but are in need of further explanation.

Yes, I agree. These terms need better definitions. About ‘error’: I hope the new text clarifies my meaning; error is just the discrepancy between the output and the goal, (y − y’), not a discrepancy with external reality, and y’ may be subjective. However, if y and y’ are objective, as in visual search, then hits and false alarms can be derived. Taking this on board made me re-write the section on deriving d’ from M, as I had not stated, but now state, that the goal, y’, must be objective to permit this application. Otherwise I have left ‘goal’ generic, except to give examples (like walking, or searching for a target among distractors) where the goal is obvious, as any goal is specific to each task. I have also gone into the concept of ‘cost’ in more detail.
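For the objective case, the standard signal-detection computation applies; a minimal sketch (the usual equal-variance formula, not specific to this paper’s model):

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    # Standard equal-variance formula: d' = z(hits) - z(false alarms).
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

print(round(d_prime(0.85, 0.20), 2))   # roughly 1.88 for these example rates
```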

Let’s assume that I am trying to hear what the person sat behind me on the train is saying without turning round to watch. My goal is to comprehend what they are saying and error could be defined relative to recovering a sensible message. However, there is no guarantee that what I think was said and what was actually said are the same. In this case, what I recover may be in error but without knowing what was actually said there is no way to quantify this error. I am therefore left puzzling as to how it is that error is to be quantified for the system to be able to operate in the way discussed.

Now it may be that I am misconstruing what is meant by "error" (and reading on this is quite possible) but still, this important topic needs clarification. I think what is being discussed is “prediction error” but if so this does need spelling out and its relation to attention needs further discussion.

See above

Line 193. “convers”?

Fixed – should be covers

I was unable to get hold of a copy of Fine and Reeves (2018) and this does seem to be a critical reference.

Reeves (2019) covers the same ground and includes the model which explains how attentional focusing can reduce performance. I now state so, for the interested reader.

L. 363-364. This is the first time I have consulted the work of Bahcall and Kowler, and this summary does not jibe with what their account implies (for me at least). In their first model (difference-of-Gaussians model) the dip in performance at close separations arises because the positive region of one DOG overlaps with the negative region of the adjacent DOG, hence overall activation is reduced. Surely the more relevant literature is on visual crowding, but these links remain unexplored.

On re-consideration, I agree with the reviewer, and I have moved this reference to an earlier spot to make the general point that focusing on a location can inhibit adjacent locations.

The reviewer is correct. Visual crowding indeed depends on selective attention, especially on the movement of attention; as the focus of attention shifts, Bouma’s bound expands up to one half the visual field before shrinking back again when the focus stabilizes. Jeff Nador and I plan to publish this work in the future. 

In sum, I enjoyed reading this brief report and I am very favourably inclined to theoretical definiteness. However, as I have noted above, I take issue with some of what is written. In addition, I feel that the paper lacks a clear narrative and seems to end with undue haste. It comprises an episodic summary of a handful of studies, and I would be hard pressed to repeat what the take-home message actually is. In fact, having read the paper, I re-read the abstract and was slightly surprised to find that what is written there as the case being made did not (for me at least) emerge from the text itself. So, lots of interesting content, but currently the lack of a clear story makes it hard to recover what of critical import is being said and why.

I hope that the prescriptive model will turn out to be useful for other researchers, as (I believe) it clarifies the prerequisites for ‘selection’ to occur. Examples are given. This was the aim of the paper. The concluding section discusses how this compares to other meanings of attention. I hope this structure is now more apparent.

Reviewer 3 Report

Overall, I found this a helpful presentation of the effort to examine models that provide explanation for the role of visual attention. The title, however, is puzzling because the article does not deal with the issue of whether there is a unified concept of attention. The article deals only with visual selection, although the author admits that there are many other forms of attention (e.g., auditory, bodily, and memory, to name some), and while at the very end (line 305) he says his intuition is that the concept is unitary, he does not provide any evidence for this. However, it seems likely that illuminating models of selection in vision would provide important constraints for selection in other forms of attention.

The bulk of the paper is concerned with models of selection so that a title such as “Integrating Models of  Visual Selective Attention” would be a more accurate description of the contents of the paper.

In addition to the title there are a number of places that could use more clarification. I have detailed these below by line.

line 18. The likely readers of this journal probably think of attention as a field of study. I am not sure what laymen think, but for them a single concept might fit.

line 98. There might be an optimal speed at which to move, such that accuracy would decline for both faster and slower movements.

line 109. An example of a speed-accuracy tradeoff might be helpful here, for example Fitts’ law.

line 188. Recently attention has been thought to be oscillatory (see reference below) in the work of Kastner; how would this fit with the idea of individual rate changes?

Fiebelkorn, I. C., & Kastner, S. (2019). Trends in Cognitive Sciences, 23(2), 87–101.

line 199. Could you clarify the following sentence? “Possibly the blocking of numeral rate allowed λ to change with rate, which saves Ma, but this was not proven.”

line 234. What would be the consequences of this idea for the experiment Wolfe conducted, or for other studies?

line 244. “derive” should be “derived”.

line 275. According to the abstract of Bonnel et al., on every trial the subject judged both pairs, and results showed that d′ increased from 0.77 with 20% allocation to 1.69 with 80%, indicating that sensitivity is modulated by attentional instructions.

This seems the opposite of what is being said here.

Line 277. The signal detection model postulates an early sensitivity effect based on the sensory evidence and a late decision-criterion effect. However, this two-stage model seems out of touch with the nervous system’s top-down interaction with sensory evidence. Also, the spatial-frequency change by attention as found by Carrasco does not fit with a criterion change.

Line 305. This intuition is the only link between the paper and its title.

Line 350. I am not sure anyone does assume attention always enhances performance if it is directed to the wrong location or stimulus.

 

Author Response

Overall, I found this a helpful presentation of the effort to examine models that provide explanation for the role of visual attention. The title, however, is puzzling because the article does not deal with the issue of whether there is a unified concept of attention. The article deals only with visual selection, although the author admits that there are many other forms of attention (e.g., auditory, bodily, and memory, to name some), and while at the very end (line 305) he says his intuition is that the concept is unitary, he does not provide any evidence for this. However, it seems likely that illuminating models of selection in vision would provide important constraints for selection in other forms of attention.

The bulk of the paper is concerned with models of selection so that a title such as “Integrating Models of  Visual Selective Attention” would be a more accurate description of the contents of the paper.

Correct. Title changed.

In addition to the title there are a number of places that could use more clarification. I have detailed these below by line.

line 18. The likely readers of this journal probably think of attention as a field of study. I am not sure what laymen think, but for them a single concept might fit.

Nice point. I now specify ‘lay’.

line 98. There might be an optimal speed at which to move, such that accuracy would decline for both faster and slower movements.

An interesting point. This could involve some function of velocity added to A, as in A(s, t, h(s/t)). An experiment in which costs, inputs, and λ are fixed, so that sensitivity could be monitored, could help pin down the function h(.).

line 109. An example of a speed-accuracy tradeoff might be helpful here, for example Fitts’ law.

I have added examples from visual search (Carrasco; Santhi & Reeves) and recognition memory (Reed).

line 188. Recently attention has been thought to be oscillatory (see reference below) in the work of Kastner; how would this fit with the idea of individual rate changes?

Fiebelkorn, I. C., & Kastner, S. (2019). Trends in Cognitive Sciences, 23(2), 87–101.

Oscillation of attention was raised by Wundt and remains an important idea. It could be incorporated by adding an oscillatory function to A(s, t), which could flatten or sharpen the weights, w, on different trials. However, the data considered here are all averaged over trials, so any oscillations are averaged out. This is an interesting reference and I thank the reviewer for it, but I have not pursued it.
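If one did want to fold oscillation in, the simplest reading would be to modulate the amplitude of the window over time (a purely hypothetical sketch, since the data modeled here are averaged over trials; the 8 Hz rate and modulation depth are arbitrary):

```python
import math

def oscillating_window(s, t, freq_hz=8.0, depth=0.3, sigma=1.0):
    # Hypothetical A(s, t): a Gaussian spatial window whose overall amplitude
    # waxes and wanes sinusoidally over time within a trial.
    gain = 1.0 + depth * math.sin(2 * math.pi * freq_hz * t)
    return gain * math.exp(-s ** 2 / (2 * sigma ** 2))

for t in (0.00, 0.03, 0.06):               # times in seconds within a trial
    print(t, round(oscillating_window(0.0, t), 3))
```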

line 199. Could you clarify the following sentence? “Possibly the blocking of numeral rate allowed λ to change with rate, which saves Ma, but this was not proven.”

Clarified.

line 234. What would be the consequences of this idea for the experiment Wolfe conducted, or for other studies?

What is called ‘parallel’ search may not be, as now explained in the text.

line 244. “derive” should be “derived”. Done.

line 275. According to the abstract of Bonnel et al., on every trial the subject judged both pairs, and results showed that d′ increased from 0.77 with 20% allocation to 1.69 with 80%, indicating that sensitivity is modulated by attentional instructions.

This seems the opposite of what is being said here.

 This is correct about line length; the reference was to luminance increments, which is now stated. I thank the reviewer profusely for spotting this elision on my part.

Line 277. The signal detection model postulates an early sensitivity effect based on the sensory evidence and a late decision-criterion effect. However, this two-stage model seems out of touch with the nervous system’s top-down interaction with sensory evidence. Also, the spatial-frequency change by attention as found by Carrasco does not fit with a criterion change.

Yes, excellent point. Moreover, the signal detection model also assumes stationarity (M. Treisman has a complex model which does not assume this, but it goes well beyond the presentation here). Delayed feedback based on an initial feed-forward decision could adjust the weights, and thus alter the final decision, and this might be compatible with selection as defined here and with Ahissar’s neural model (now referenced), but purely behavioral responses don’t address this, so I have (reluctantly) mentioned this topic only to drop it.

Line 305. This intuition is the only link between the paper and its title.

Title changed.

Line 350. I am not sure anyone does assume attention always enhances performance if it is directed to the wrong location or stimulus.

 I did state that validity was critical. Carrasco reported that attention to the wrong frequency band can reduce performance, and Posner that attention to the wrong location can do so, as already stated. The text seems clear to me.

 

Round 2

Reviewer 1 Report

This is fine.

Author Response

Thank you for your positive evaluation.

 

Reviewer 2 Report

As I noted in my previous review I am generally inclined towards more well-specified models than simply textual accounts, but the current version of the paper gave me pause.

I take it that Equation 1 provides a general quasi-mathematical ‘model’ of information processing constrained by attentional processes.

However, I now wonder about its general usefulness if its particularities have to be fleshed out on a case-by-case basis. Radically different formal characterisations of the parameters are discussed in the paper, and in places I did wonder just how elastic the framework really is. Moreover, the rather cautionary passage in Section 5 did make me feel a little uncomfortable.

From a personal point of view, I thought the paper just came to a halt and I urge the author to think carefully about providing a clear Conclusions section in which he spells out the main point(s) and conclusion (the take home message). I can’t see this taking more than a paragraph, but currently the reader is left wondering what all the fuss is about.

 

Minor comments

Figure 1 is missing from my version of the paper.

There are also several infelicities that need working on - e.g., L. 423 ‘conforms the standard’

L. 454 ‘optotypes’: I had to look this up and felt that I shouldn’t have to. Also, the findings of Fine and Reeves are counter-intuitive and not well enough described here for the reader to get a proper understanding of what the actual effects are. I was unable to get hold of the primary reference in time for this review. I would be particularly impressed if a review section were provided of such ‘negative attentional effects’ – just to make sure that they do generalise to other paradigms and other situations.

 

 

Author Response

Reviewer 2:

As I noted in my previous review I am generally inclined towards more well-specified models than simply textual accounts, but the current version of the paper gave me pause.

I take it that Equation 1 provides a general quasi-mathematical ‘model’ of information processing constrained by attentional processes.

Equ. 1 symbolizes all the factors that I claim are needed to define ‘selection’ in ‘selective attention’ applied to discrete inputs (and discrete outputs). It is NOT a model, but a prescriptive framework for any model in which selection of discrete elements is claimed to occur. I have added the following text to clarify this motivation:

‘Equ. 1 presents a framework for possible models of selective attention in which inputs and outputs can be described in discrete terms. As a framework, it specifies prerequisites and definitions for “selection” to have taken place, rather than being a model in itself. Eq. 1 is sufficiently general that it can be specialized to particular models in several quite different areas, as illustrated in the body of the paper.’

However, I now wonder about its general usefulness if its particularities have to be fleshed out on a case-by-case basis. Radically different formal characterisations of the parameters are discussed in the paper, and in places I did wonder just how elastic the framework really is. Moreover, the rather cautionary passage in Section 5 did make me feel a little uncomfortable.

Ideally, it is just elastic enough that no sensible use of the term ‘selective attention’ will violate the concepts and definitions employed, and there are no redundant or unnecessary concepts either. It isn’t easy to formulate a prescriptive model like this, especially for a field as diverse as attention, and I may have erred one way or the other, but that is my aim.

From a personal point of view, I thought the paper just came to a halt and I urge the author to think carefully about providing a clear Conclusions section in which he spells out the main point(s) and conclusion (the take home message). I can’t see this taking more than a paragraph, but currently the reader is left wondering what all the fuss is about.

This is a useful remark, and it makes me think of how to tie up the main points in a succinct manner. The main point is that, if the paper is to prove useful, it provides a rubric for defining ‘selective attention’ and for experiments devoted to testing it, and other concepts of attention that do not fit the rubric will require definitions other than ‘selective’. I have added a summary paragraph and have made this point explicit (although it is implied by the term ‘prescriptive’).

 

Minor comments

Figure 1 is missing from my version of the paper.

 I am concerned that the reviewer missed the main point of the AGM as a result. I have no idea why the figure was not included, as it is in the Word document. A very unfortunate happenstance.

There are also several infelicities that need working on - e.g., L. 423 ‘conforms the standard’

L. 454 ‘optotypes’: I had to look this up and felt that I shouldn’t have to.

 Now defined.

Also, the findings of Fine and Reeves are counter-intuitive and not well enough described here for the reader to get a proper understanding of what the actual effects are. I was unable to get hold of the primary reference in time for this review.

The journal ‘Information’ (though not the book chapter) is widely available and I don’t want to duplicate already published material. Plus, I do not regard this as a central topic; it is more or less a plea to consider how easy it is for attention to have a negative effect, or indeed a null effect if the signal and noise are amplified equally by attention.

I would be particularly impressed if a review section were provided of such ‘negative attentional effects’ – just to make sure that they do generalize to other paradigms and other situations.

A review of such evidence could be informative, but would stray from the point of this paper.

NOTE:

As this is an ‘ideas’ piece, I have also added an anecdotal point as a footnote. I have spent many hundreds of hours in psychophysical experiments, and I have discussed methods with many colleagues, mostly in the field of thresholds for color sensations but also for weak tones in auditory experiments. We all agree that paying attention too carefully raises threshold; the ideal is a relaxed state of ‘attentiveness’ rather than detailed scrutiny. My explanation is that at threshold, the test stimulus appears briefly against a background, and attending too intensely amplifies the background (the ‘noise’) more than the (weak) stimulus (the ‘signal’), lowering the S/N ratio.

Round 3

Reviewer 2 Report

I have nothing of substance to note.
