© David Sponder, L.E.P., BCBA, Floortime C2
Executive Director, Sponderworks Children’s Services

Referencing and Reference Points in Sensory Modalities

Relevant Points of Information that Help us Manage Uncertainty in all of the Senses


Visual Forms of Referencing

For instance, if I engage in interactions with you, I might look for the following bits of information as we go along:

When I’m in the lead…

I might look to see if you’re still there, or whether you’re listening or paying attention

I might look to see if you’re looking at the thing I’m talking about

I might look to see if you understand what I’m saying

I might look to see if you feel the same way I do; if you agree, disagree; if I’ve made you happy, angry, embarrassed, etc. (emotional referencing/intersubjectivity)

I might look to see if there is a need for a momentary ‘repair;’ or a change or adaptation needed at the moment to keep the emotional and social interaction going. 

When someone else is in the lead…

I might look to stay in sync with this person emotionally or to stay in coordination with our ongoing transactions

I might look to get the additional information and context that I get [visually] from the way you talk or act, as I can see from your nonverbal communication

I might look to know what it looks like is important to you, what is on your mind, or what you might do next by following your eyes or attention along with appraising the patterns of your behavior in the past and up until now

Referencing is not the same thing as looking though.  Looking is merely one very common form of referencing the information needed to keep inter-relating going.

However, the visual processes involved in visuo-emotional and visuo-social referencing are very important in our species.  As ground-dwelling primates, we need good vision to obtain sensory information from a distance.  We evolved out in the open on the African savannah, so we needed to learn to spot both threat and opportunity from a distance, and to be able to do it quickly.

As creatures with the potential for amazingly complex emotional and social interaction, we have to pay attention to and read visual patterns that help us to make reasonable and continually updated appraisals of the intentions of others.  We have to make rapid appraisals, sometimes – or much of the time – at a speed that only the lower (reflexive, emotional, procedural) parts of our brain can accomplish.  The frontal or executive parts of our brain run more densely wired, multiple and simultaneous processes and manage so much information that they make too slow a system to do the work of rapid adaptation.

As cooperative breeders and hunters, we use visual information as an important component of rapid communication.  Our emotional brains recognize visual patterns much faster than our “mental executives” do.

Referencing “Action Predispositions” in the Visual Domain

With experience and the shaping effect the environment has on us, visual referencing for certain patterns allows us to make rough appraisals rather rapidly of the “action predispositions” of others.

Our brains aren’t as fast as they may appear to be in so-called spontaneous, fluid and ongoing interaction.  To keep up, not only do we need to be able to make rapid appraisals of others’ behavioral predispositions, we have to have a motor plan ready (an idea executed in thought or action) in order to respond in a timely manner.  Recognition of any pattern helps one to expect how things are to unfold and to pre-load responses from the emotional brain into the mental “hopper” to be released by the [conscious and emotional] brain.

For instance, a warm smile sends the message that approaching is safe and welcomed or ‘keep up what you’re doing,’ and a sneer or grimace sends the opposite message – ‘back off;’ ‘stop what you’re doing’…  We use the tentative term “action predisposition” because in complex interaction, predispositions carry only a likelihood of a certain class of behavioral responses. Visual signs of emotion can be spotted from a distance. Hostile affects can warn you from a distance and you can retreat.  Warm affects invite you in.  You want to “come in” to warm invitations, because being around others gives you multiple learning opportunities.

Skinner downplayed the connection between affects and behavior.  He didn’t deny that emotions predispose behavioral responses – he just maintained that they didn’t cause them.  He is right.  People choose their actions.  They operate on their environments by selecting behaviors from their existing repertoires or inventing new ones.

Behavioral responses bubble up from the emotional brain, but must pass through cerebral filters and gating mechanisms in order to be released. Anger doesn’t cause hitting. Hitting is a choice of how to deal with anger – but it is not the only choice.  That was Skinner’s point.  Mature individuals have considerable control over the choices they make under various emotional predisposition states.  Never let a kid tell you he hit someone because he was angry.

Looking therefore is an important means of tuning in to the visual information available, and vision allows communicative referencing at greater distances than hearing, smell, taste or touch can.  For survival purposes, it is better to be able to detect a pattern that signals others’ threatening predisposition from a distance, rather than having to get closer to smell or touch it.  This applies to discovering opportunities as well.  Scanning the environment on a ‘regular-enough’ basis can provide rich and necessary information.

Being able to read emotional information such as affective displays helps us cooperate with each other in discovering salient cues in the environment and joining our attention on them.  I might point out to you that those footprints mean that a lion is close by.  Or, you might notice that I suddenly stopped and crouched.  You would then follow my eyes – almost without thinking about it.  Your ability to recognize that visual pattern enhances your chances of survival.

Men and women, especially women, give off subtle cues that signal to the opposite sex whether or not to approach.  This enhances our ability to find mates and ultimately to pass our genes on to another generation.

We don’t only look at faces; we look at a lot of things in order to keep up with what’s going on.  But faces carry a great amount of social/emotional information in a relatively small amount of space, and it is part of our nature as a species to regard what is called “facial/affective” communication or “facial/affective displays.”

For people with autism, faces can be confusing, overloading and overwhelming.  First of all, facial/affective displays (facial expressions) are gestalts.  The component parts involve the “emotional triangle,” the upside-down triangle starting at the top with the forehead, and moving down, the eyebrows, eyes, cheeks, nose and chin.  Genetically, we have the potential to coordinate dozens of small facial muscles to form thousands of possible configurations.  Some of these we should recognize innately (joy, sadness, disgust, surprise, fear, anger); the rest, the more culturally defined and blended forms of these basic emotional displays, we learn from watching and participating in emotional/social interaction.  Those with innate difficulties with visual gestalt thinking may avoid complex stimuli such as faces, with corresponding downstream effects of delaying or arresting emotional and social development.

Visually, we reference other patterns of the body (posture; speed and direction of movement; intensity or style of movement, etc.) when those elements are the most salient information.

Visuo-Spatial References

We visually reference seating arrangements, how close people are to each other, the direction or spatial configuration of groups and other forms of visuo-spatial information with social or emotional meaning.  For instance, we walk through a hallway and when we come out we see bleachers, a green court with a net in the middle and two people standing on either side of the net, facing each other, with tennis rackets in their hands.  If we’re familiar with the game and the situation we would immediately appraise “tennis match” and, unless we were one of the participants, we wouldn’t walk onto the court.  If the participants all of a sudden started to play football, we would be surprised, and it would catch our attention.

Auditory Forms of Referencing

Vision isn’t the only distal sense we favor for referencing salient information in the environment.  We also depend a great deal upon sound information.  We reference the location, directionality, tone and intensity of others’ voices, not to mention their words.  This ‘extraverbal’ information often reveals the true meaning of what someone is saying, even when the utterance implies the opposite – as in sarcasm.

Same word | Communication Function/Inflection | Example meaning
“Really?” | Questioning/Rising inflection | “Did that really happen?”
“Really.” | Imperative: Answering/Neutral inflection | “Yes it really did.”
“Really!” | Declarative: Subjective opinion-sharing/Descending inflection | “Believe it or not, it really did.”

Spoken conversations sound like this more often than not.  People tend to speak in fragments and abbreviated communications.  This is why the instruction of language cannot be truly successful without first establishing strong emotional and nonverbal substrates for total communication, along with words or signs.  Much if not most of the meaning of a verbal or written utterance can be found in its tone or its “manner.”

Written expression has a tone implied by other contextual clues in the text, and our own active matching of the text to our own experience and the sounds and inflections we’ve associated before.  But if you’ve ever misunderstood what someone else meant in an email, or someone misunderstood your intent – then you know the limitations inherent in not being able to hear the prosodic melody of speech.  That’s why they invented “emoticons.”  Some of them even move now – showing our need to add affect and movement to written language.

Audition (hearing) and vision are considered “distal senses” because they can perceive stimuli from far away.  Olfaction (smell) can also be considered a distal sense if the odor (the amount and chemical nature of particles in the air) is strong enough, but humans don’t have a well-developed sense of smell, so we generally have to be proximal to the source in order to detect and discriminate between odors.

Proximal Sensory References

We reference and join attentional frames with others in all available senses, including the proximal senses of touch, proprioception and vestibulation, smell, taste, and interoception (the perception of one’s own body states).  A hard slap and a caress convey different messages.  A firm squeeze can be calming and a light touch can be creepy.

I was taught that a man’s firm handshake conveyed character and a limp handshake femininity.  When a man grasps my hand too hard – I think he has a problem. Those are examples of cultural meanings attached to touch and proprioception.

Strong body odor can convey a lack of hygiene.  That could be a sign of poverty, mental illness, depression, or social cluelessness.  Smelly clothing is worse.

Even though as primates we have devolved our sense of smell over evolutionary time and concentrated more on developing our distal senses, olfactory messaging still counts.  Olfactory messaging accounts for why women’s menstrual cycles tend to sync up if they live or work together – and they tend to sync up to the dominant female.  Isn’t that interesting?  Don’t ask me why that happens or what for.  The men at your worksite will usually be completely oblivious to this olfactory messaging – it is not a reference point for us.

It is easy to overlook taste as a social reference point.  But our choices of food can send many types of messages.  In our field, we deal with picky eaters all the time. Some of this has to do with oral-motor or neuromuscular issues, but most of the time it does not.  In most cases, picky eating is a social and learned phenomenon that has its basis in genetic wiring.  An adult who pours ketchup on his Lobster Thermidor becomes a reference point for the opinions of others.  So does one who orders pork at a kosher restaurant.

It is interesting to note that we do not have voluntary recall in our proximal senses, but we do in our distal senses.  Close your eyes and try to taste or smell apple pie.  You can’t.  But you can see it in your mind’s eye.  You can also recall sound patterns – language and music – easily.  This isn’t all that important – but it highlights the relative importance of distal reference points in complex social interaction.  When you observe one of Miller’s children with Type A system-forming disorder – you can really see it – and the way these children explore their worlds becomes a primary and pivotal target for teaching.

Episodic Memory: The Hidden Reference

“Episodic memory” refers to one’s memory and “take-away” from the happenings in one’s life.  Very importantly, it also involves what we learn and remember from the observations of what happens to others – either from personal witnessing or from media.  The ability to learn from, remember, and share social episodes is what accounts for the development of the ability for thinking in symbols and the “cognitive explosion” in our species that we think is only about 40,000 to 60,000 years old.  I’m not saying episodic memory is that young – animals have it.  What is so recent is our ability to share it with symbols. Animals cannot share their memories – we can.  We can also make meaning out of complex patterns and compact multiple integrated elements into packets of sensory, emotional and motor memory much better than other animals can.

They say that when a dog sniffs the ground or sniffs another dog’s butt, it’s equivalent to her reading the newspaper.  Maybe – but it is doubtful that smell creates visual representations in a dog, because their brains aren’t as integrated as ours.  Smells are heavily connected to their emotions, as they are with us.

Fig.  Proximal Sense Referencing

Sensory System | Means/Reference | Variations in Intensity | Possible Meanings | Examples

Touch | Skin to skin contact | Light | Approachable | Affection; Affiliative
 | | Rough | Warning | Aggression
Reference Words: “Did you feel that?”  “That felt nice.”  “This is soft.”  “Ouch!”

Proprioception | Pressure, stretch, impact | Light | Approachable | Affection; Affiliative
 | | Rough | Warning | Aggression
Reference Words: “Is that too heavy?”  “Did you feel that bump?”  “I want a real hug.”

Smell | Chemical/Airborne | Loud | Proximity to source; high intensity of source | Pleasant (approach)
 | | Soft | Distance from source; low intensity of source | Toxic (Avoid)
Reference Words: “Do you smell gas?”  “Reminds me of Thanksgiving.”

Taste | Chemical/Tongue | Strong | Too much | Avoid, reduce
 | | Weak | Too little | Approach; increase
Reference Words: “Can you taste the cinnamon in this pie?”  “Is it sweet enough?”  “Yuk!”

Interoception | Internal somatic signals | Strong | Urgency; Danger | Do something
 | | Weak | OK | No need to do anything
Reference Words: “Did that yogurt make you sick?”  “That noise will give us a headache.”  “I have to go to the bathroom.”

Episodic memory is really the memory of change.  In order to observe the meaning of a sequence of causes and effects, we have to be able to conserve the original and changed versions.  Episodic memory is about narrative – how events unfold and influence each other in forward and backward directions.  This is why behavioral lessons that teach children to identify the emotions depicted in still shots of individuals are a waste of time.  We’ve all seen the emotion flashcards and that chart with all the kids’ faces on it.  Naming those pictures does nothing to improve one’s ability to function in dynamic social, emotional interaction.  In fact, because the lesson teaches the cognitive brain, it actually impedes one’s ability to function.  People’s faces move all the time. It is the movement from one expression to another that conveys the meaning – not the frozen expression depicted in the emotion flashcards.

Jamal was happy swinging on the swings.  Jenna pushed him off.  He started crying.  Jose saw the whole thing.  The real social and emotional lesson – the episode – the narrative – must include all of those events.  A neurotypical child of 4 or 5 years old would easily learn a “lesson” from this episode, whereas an individual with autism may see them as three unconnected events and learn nothing.  All three of those children know what happened and why Jamal reacted in the way he did.

The absence of real “tracking” deprives one of the ability to learn and share narrative with others.  This is the vast majority of social learning.  Importantly, it points out why noticing and following stimuli through a series of changes is one of the most important nonverbal behaviors we can teach.

Jamal was happily swinging on the swings the next day.  Jenna approached. Jose yelled, “Watch out, Jenna’s coming.”  The Playground Supervisor overheard this – especially the alarm in Jose’s voice (which is what caught her attention above the regular din of the playground). Jenna backed off.  Later on, Jenna tripped Jamal as he took his lunch tray to his table, and then she turned to Jose and stuck her tongue out at him.

The richness of the episodic and overt/concrete reference points creates a context that is quite remarkable when examined closely.  It also points out how social learning works and is around us all the time.  There are hundreds of opportunities to learn from these little episodes, which accounts for the equally remarkable social learning curve characteristic of neurotypicals.  It also points up how many events like this a person with autism can miss and the flatter social learning curve that is a consequence of the deficit.