No one is ever going to write a passionate ode to Canary Wharf, a corporate zone that celebrates the impact of grey lines and characterless office blocks. It’s a difficult place to make a connection with, the poster child for skyline autism. I have never worked in Canary Wharf and have never wanted to. Yet still a rebellious glimmer of emotion shines through as I glance at the cluster of skyscrapers through the window, across the water. The fog has lifted today.
It’s the second day of Code-Ken 2011, a conference for software developers. Typically, developer conferences are about learning something new to help with the day job, but that kind of model is akin to personalised Google search: you’re never confronted with ideas outside of your immediate interest. But Code-Ken 2011 is the reincarnation of a failed conference, StackOverflow DevDays 2011, and its aim is to broaden the audience’s horizons rather than serve up more of the same.
We don’t need coffee this morning, because the first presentation of the day is fascinating. Tom Wright, from the Human-Centred Technology (HCT) Group at the University of Sussex, is talking about his research into sensory substitution. This is the science of replacing a lost sense with another, related to – although distinct from – sensory augmentation. We’re not quite discussing Deus Ex: Human Revolution here.
Synthetic Synaesthesia
Wright touches on haptic alternatives to sight, using the torso or tongue as a “screen” where pressure pads “display” the picture on the user’s skin, but more interesting is the subject of Wright’s own research: the use of sound to replace sight.
The vOICe system, initially developed in 1991, downscales an image, strips it of colour and maps the result to a short burst of sound. Each pixel is converted to a tone – the higher the pixel sits in the image, the higher its frequency – with brightness used as the volume of the pixel’s sound. Each column is then played in sequence, left to right, over a second.
For example, a horizontal line is heard as a pure tone while a scatter of dots is converted into a sequence of beeps. It’s obvious that a real image would generate a chaotic sound. How useful could this be?
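For anyone curious to tinker, here is a rough Python sketch of that kind of mapping as I understand it – the image size, frequency range and one-second sweep below are placeholder numbers of my own, not the vOICe’s actual parameters.

```python
# A toy image-to-sound sweep in the spirit of the scheme described above.
# NOTE: frequency range, image size and sweep length are illustrative guesses,
# not the vOICe's real settings.

import numpy as np

SAMPLE_RATE = 44100                    # audio samples per second
SWEEP_SECONDS = 1.0                    # whole image played over roughly one second
FREQ_LOW, FREQ_HIGH = 200.0, 4000.0    # assumed pitch range (guess)

def image_to_sweep(image: np.ndarray) -> np.ndarray:
    """Map a greyscale image (rows x cols, values 0..1, row 0 = top) to audio.

    Each column becomes a slice of the sweep; each pixel in the column adds a
    sine tone whose pitch rises with height and whose volume follows brightness.
    """
    rows, cols = image.shape
    freqs = np.linspace(FREQ_HIGH, FREQ_LOW, rows)        # top rows = high pitch
    samples_per_col = int(SAMPLE_RATE * SWEEP_SECONDS / cols)
    t = np.arange(samples_per_col) / SAMPLE_RATE

    slices = []
    for col in range(cols):                               # columns left to right
        brightness = image[:, col]                        # volume per pixel
        tones = np.sin(2 * np.pi * np.outer(freqs, t))    # one sine per row
        slices.append(brightness @ tones)                 # brightness-weighted mix

    audio = np.concatenate(slices)
    peak = np.max(np.abs(audio))
    return audio / peak if peak > 0 else audio            # normalise to [-1, 1]

# A horizontal line halfway up the image comes out as a steady pure tone.
demo = np.zeros((16, 16))
demo[8, :] = 1.0
sweep = image_to_sweep(demo)
```

Feed it a frame containing a single bright horizontal line and the output is the steady pure tone described above; anything busier quickly turns into the predicted chaos.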
I already see where this is going. The real world would sound like noise to the inexperienced, but the human ear already performs complex audio processing (some liken it to a Fourier transform, maths dudes) and is able to distinguish overlapping sounds with ease. Wright says that given four to six weeks of practice, users are able to interpret the real world. But the really astounding thing is that veteran users use the system more for the rich “visual” experience – appreciating what is around them – than for accomplishing tasks, transcending its functional purpose.
Interestingly, the congenitally blind are not as interested in the technology as those who have lost their sight. Those who do have a go report a problem with occlusion: the blind develop a spatial understanding through touch that is more three-dimensional than sight affords, and are confused that the technology cannot convey the rear detail of an object.
Play The Sound
It’s a great presentation which Wright delivers in a conversational, informal manner, the opposite of what might be expected from an academic speaker (speaking as an ex-academic). But Wright goes one further and challenges the audience.
Two victims volunteer themselves and agree to a head-to-head contest. They listen to sounds and draw what they think they’ve heard – the audience gets to vote on who was closest to the actual image. But this game isn’t about the two at the front, it’s about the whole room. We are all trying to decrypt the sound-compressed imagery.
The latest buzzblurb doing the rounds is gamification – but far too much of it feels like a thinly-veiled Skinner box in which you earn achievements or score in return for being a good Pavlovian dog. The sad thing is that the act of play – rather than competition overdosed on metrics – can be enough in itself to confer benefits. I’m sure Jane McGonigal has covered this in great detail, but I’m frightened of her book Reality Is Broken because it contains the phrase “epic win”, which gives my brain linguistic indigestion.
It’s in small moments like Wright’s audience challenge that I can see the good in gamification. Speakers have been trying to make their talks entertaining since the year dot because if they don’t… the audience will vote with their eyelids. It’s just as Badger Commander put it in one of his previous comments here: “pretty much everything we do in life is some way a game.” Audience participation is just a type of game that helps us remember what the speaker said.
I glance at the cluster of skyscrapers through the window, across the water. The fog has lifted today.
Ground Control To Major Tom
Okay, short coda. Tom Wright needs you. Test subjects who have lost their sight simply do not grow on trees and, if they did, it would likely pose a freaky ethical problem that might have been examined on an episode of Ally McBeal. Once someone has acquired the skill of interpreting images over audio, it’s difficult to use that individual in another trial. To mitigate this, Wright often uses fully-sighted subjects to help steer the research prior to proper trials.
If you have some time and want to help, you can submit yourself for online testing to assist Wright’s research. Just contact Tom Wright via e-mail to get the ball rolling. I’m already signed up. It’s like a blood drive without the blood. Or driving.
I wonder if color is left out for the system’s sake or the user’s. If images could be frozen/paused then there could be an interesting alternative to photography (or even memory).
That is really, really cool. I am now imagining a community of people who have learned to use this technology to see producing sound art that can only be appreciated by each other.
@BeamSplashX: From what I understand, the “bandwidth” of aural input is far less than that of sight; there’s only so much information that can be packed in. The resolution is also quite coarse – the vOICe uses 176 by 144. There’s also the question of how exactly you’d encode colour. One of the complexities with this approach is that there’s no obvious mapping between image and sound.
@Switchbreak: That’s a mighty interesting thought. I’ll have to ask Tom if there is anything like that.
I remember reading somewhere that newborn babies get their senses muddled up and can interpret all the stimuli through their other sense organs. Might have been a load of dribble I made up in my head, but that’s always a risk I suppose.
Nice scenery.
“the poster child for skyline autism.”
Best sentence ever.
The research sounds interesting, a bit like – as you note – synaesthesia, in which concepts are associated with sensory inputs. I’ll be very interested to see where Dr Wright’s experiments go. It could be a wonderful new step toward restoration of sight.
@BAshment: I went looking for some info to back up what you think you read but instead came upon this frightener of a line: “Failure to stimulate his senses can have disastrous effects on his physical and psychological growth and development. Happily, you, as loving parents, know how to provide the right kind of sensory input for your baby.” Jesus, talk about trying to scare the shit out of parents. I stopped looking for information after this point.
@Steerpike: Oh cool, did I best “the Americans could have done anything to me while I was asleep”?? I don’t think Tom is a Dr. yet – but no doubt soon. A lot of people have been using vOICe for a while – it’s cheap, it uses existing components and you can run it off mobile devices – and what Tom has been trying to do is improve on the initial implementation: find the optimal resolution, timing, sounds and so on. There was talk that a hybrid model is probably where these devices are headed, i.e. haptic + audio input at the same time. Tom did bring up DXHR after the presentation – in reference to the DXHR Eyeborg doc.