"MIT Researchers Use Image Dataset to Improve AI's Peripheral Vision Abilities"

Peripheral vision allows us to perceive shapes and objects that are not directly in our line of sight, although with less detail. This skill expands our visual range and can be beneficial in various situations, such as detecting a car approaching from the side.

In contrast to humans, artificial intelligence (AI) lacks peripheral vision. Equipping computer vision models with this capability could help them detect potential hazards, or predict whether a human driver would notice an oncoming object.

To move in this direction, researchers at MIT have created an image dataset that enables them to simulate peripheral vision in machine learning models. They discovered that training models with this dataset enhanced their ability to identify objects in the visual periphery, although still not as well as humans.

Their findings also showed that neither the size of objects nor the level of visual complexity in a scene significantly affected the AI's performance, unlike with humans.

Understanding this disparity may aid in the development of machine learning models that perceive the world more similarly to humans. Aside from enhancing driver safety, such models could also be used to design displays that are easier for people to view.

Furthermore, a deeper understanding of peripheral vision in AI models could assist researchers in predicting human behavior, according to lead author Anne Harrington MEng '23. Co-authors include graduate student Mark Hamilton, postdoc Ayush Tewari, research manager Simon Stent from the Toyota Research Institute, and senior authors William T. Freeman, the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science, and Ruth Rosenholtz, principal research scientist in the Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL).

The team will present their research at the International Conference on Learning Representations (ICLR 2024).

To understand peripheral vision, extend your arm in front of you and raise your thumb—the area around your thumbnail is visible to your fovea, a small depression in the center of your retina that allows for the sharpest vision. Everything else in your visual range is in your peripheral vision. As the point of focus moves farther away from the fovea, the visual cortex represents the scene with less detail and accuracy.

Many existing approaches to incorporating peripheral vision in AI models blur the edges of images to simulate this loss of detail, but the loss of information that occurs in the optic nerve and visual cortex is more complex.
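As a rough illustration of the blur-based approaches mentioned above (not the researchers' texture tiling model), the sketch below blends a sharp and a blurred copy of an image, with the blur weight growing with distance from a fixation point. The function names, the box-blur stand-in for a proper Gaussian blur, and the linear weighting are all illustrative assumptions, not details from the paper.

```python
import numpy as np

def box_blur(img, k):
    # Simple separable box blur with odd window size k,
    # used here as a cheap stand-in for a Gaussian blur.
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    kernel = np.ones(k) / k
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    blurred = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, blurred)
    return blurred

def peripheral_blur(img, fixation, max_blur=9):
    # Blend sharp and blurred copies of a grayscale image, weighting
    # the blurred copy more heavily as eccentricity (distance from
    # the fixation point) increases.
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    fy, fx = fixation
    ecc = np.hypot(ys - fy, xs - fx)
    weight = np.clip(ecc / ecc.max(), 0.0, 1.0)  # 0 at fixation, 1 at far edge
    blurred = box_blur(img, max_blur)
    return (1 - weight) * img + weight * blurred
```

A key limitation of this kind of model, as the article notes, is that it only removes high-frequency detail uniformly at each eccentricity, whereas the texture tiling model transforms the image to mimic the more complex information loss in the visual system.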

To address this issue, the MIT researchers began with a method used to simulate peripheral vision in humans, known as the texture tiling model. This technique transforms images to represent the loss of visual information in humans.

To make this model more accurate, the researchers modified it so it could transform images in a more flexible manner, without prior knowledge of where the person or AI would direct their gaze.

Using this modified technique, the team generated a vast dataset of transformed images that appear more textural in specific areas, simulating the loss of detail experienced when a human looks further into the periphery.

They then trained multiple computer vision models on this dataset and compared their performance to that of humans on an object detection task. Participants were shown pairs of images that were identical except that one contained a target object in the periphery, and each participant was asked to select the image with the target.
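This two-alternative task can be scored for a model in a straightforward way: for each image pair, whichever image receives the higher target-detection score is treated as the model's choice. A minimal sketch, with hypothetical score arrays standing in for real model outputs:

```python
import numpy as np

def two_afc_accuracy(scores_with_target, scores_without_target):
    # Fraction of trials in which the model assigns a higher detection
    # score to the image that actually contains the target
    # (ties count as misses).
    with_t = np.asarray(scores_with_target, dtype=float)
    without_t = np.asarray(scores_without_target, dtype=float)
    return float(np.mean(with_t > without_t))

# Example: the model picks correctly on 2 of 3 hypothetical trials.
acc = two_afc_accuracy([0.9, 0.2, 0.7], [0.1, 0.5, 0.6])  # 2/3
```

The same per-trial scoring can be applied to human responses, which is what makes a direct model-versus-human comparison possible.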

The researchers discovered that training models from scratch with their dataset produced the largest performance improvements, substantially boosting the models' ability to detect and recognize objects. Fine-tuning a model with their dataset, a process that involves tweaking a pre-trained model to perform a new task, resulted in smaller gains.

However, in all cases, the machines were not as skilled as humans, and their performance did not follow the same patterns as humans.

The researchers intend to continue exploring these differences, with the goal of developing a model that can predict human performance in the visual periphery. This could result in AI systems that warn drivers of potential hazards they might not see. They also hope that their publicly available dataset will inspire other researchers to conduct additional studies on computer vision.

Steven Russell
Steven Russell is a technology writer with a master's degree in computer science and technology. At his previous firm, he was involved in building and administering computational systems. He has been with Industry News USA for the past two years, where his command of the technology field quickly earned him the position of head of the Technology section. The latest gadgets are what interest Steven most.