Researchers at Carnegie Mellon University’s Robotics Institute have developed a tool called FRIDA, a robotic arm with a paintbrush attached that leverages artificial intelligence (AI) to collaborate with humans on art projects.
The team is set to present the research titled “FRIDA: A Collaborative Robot Painter With a Differentiable, Real2Sim2Real Planning Environment” at the 2023 IEEE International Conference on Robotics and Automation in May.
Peter Schaldenbrand, a Ph.D. student in the Robotics Institute at the School of Computer Science, works with FRIDA and explores the intersection of AI and creativity.
“There’s this one painting of a frog ballerina that I think turned out really nicely,” he said. “It is really silly and fun, and I think the surprise of what FRIDA generated based on my input was really fun to see.”
FRIDA, an acronym for Framework and Robotics Initiative for Developing Arts, is named for the artist Frida Kahlo.
The research was led by Schaldenbrand along with RI faculty members Jean Oh and Jim McCann, and it has attracted students and researchers from across CMU.
Collaborative Tool, Not Artist
Users can guide FRIDA by inputting a text description, submitting other works of art to inspire its style, or uploading a photograph and asking it to paint a representation of it. The team is also testing other inputs, such as audio.
“FRIDA is a robotic painting system, but FRIDA is not an artist,” Schaldenbrand continued. “FRIDA is not generating the ideas to communicate. FRIDA is a system that an artist could collaborate with. The artist can specify high-level goals for FRIDA and then FRIDA can execute them.”
To paint an image, the robot uses AI models that are comparable to those powering OpenAI’s ChatGPT and DALL-E 2, which produce text or an image in response to a prompt. FRIDA simulates how it would paint an image with brush strokes and utilizes machine learning to assess its progress as it works.
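The simulate-and-assess loop described above can be sketched in miniature. The stroke representation, toy renderer, loss function, and greedy random search below are illustrative assumptions for a one-dimensional "canvas," not FRIDA's actual planner, which operates on full images with learned models.

```python
import random

def simulate_stroke(canvas, stroke):
    # Toy "renderer": a stroke deposits paint of a given value at a
    # position with a width; later strokes paint over earlier ones.
    pos, width, value = stroke
    for x in range(max(0, pos - width), min(len(canvas), pos + width + 1)):
        canvas[x] = value
    return canvas

def loss(canvas, target):
    # Machine-learning systems like FRIDA score progress with a loss;
    # here it is just squared distance to the goal image.
    return sum((c - t) ** 2 for c, t in zip(canvas, target))

def plan_strokes(target, n_strokes=20, n_candidates=50, seed=0):
    # Greedy planning: at each step, simulate candidate strokes and
    # keep the one that most reduces the loss against the target.
    rng = random.Random(seed)
    canvas = [0.0] * len(target)
    plan = []
    for _ in range(n_strokes):
        best = None
        best_loss = loss(canvas, target)
        for _ in range(n_candidates):
            stroke = (rng.randrange(len(target)), rng.randrange(1, 4), rng.random())
            trial = simulate_stroke(list(canvas), stroke)
            trial_loss = loss(trial, target)
            if trial_loss < best_loss:
                best, best_loss = stroke, trial_loss
        if best is not None:
            canvas = simulate_stroke(canvas, best)
            plan.append(best)
    return plan, canvas

# A toy "image": blank left half, dark right half.
target = [0.0] * 10 + [1.0] * 10
plan, result = plan_strokes(target)
```

Because each stroke is only accepted if the simulated loss drops, the plan steadily pulls the simulated canvas toward the target before the robot touches real paint.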
FRIDA’s finished paintings are whimsical and impressionistic, with bold brushstrokes that lack the precision frequently sought in robotic endeavors.
“FRIDA is a project exploring the intersection of human and robotic creativity,” McCann added. “FRIDA is using the kind of AI models that have been developed to do things like caption images and understand scene content, and applying it to this artistic generative problem.”
FRIDA uses AI and machine learning several times during its art-making process. First, it spends an hour or more learning how to use its paintbrush. Then, it employs vision-language models that have been trained on huge datasets pairing text and images scraped from the internet, such as OpenAI’s Contrastive Language-Image Pre-Training (CLIP), to understand the input.
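The way a model like CLIP judges how well an image matches a text prompt can be illustrated with a toy example. The hand-built vectors below stand in for CLIP's learned image and text encoders, which are deep networks trained on huge image-text datasets; only the scoring step, cosine similarity between embeddings, is shown here.

```python
import math

def cosine_similarity(a, b):
    # CLIP scores an image-text pair by the cosine similarity of their
    # embedding vectors; a higher score means a better match.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Stand-ins for encoder outputs (made up for illustration):
text_embedding = [0.9, 0.1, 0.4]    # e.g. the prompt "a frog ballerina"
matching_image = [0.8, 0.2, 0.5]    # a painting close to the prompt
unrelated_image = [0.1, 0.9, 0.0]   # a painting of something else

score_match = cosine_similarity(text_embedding, matching_image)
score_other = cosine_similarity(text_embedding, unrelated_image)
```

A planner can use such a score as feedback: strokes that raise the similarity between the painting-in-progress and the user's prompt are kept, and the rest are discarded.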
One of the most significant technical challenges in producing a physical image is reducing the simulation-to-real gap, which is the disparity between what FRIDA creates in simulation and what it paints on the canvas. FRIDA uses an idea known as real2sim2real, where the robot’s actual brush strokes are used to train the simulator to reflect and mimic the physical capabilities of the robot and painting materials.
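The real2sim2real idea, calibrating the simulator against strokes the robot actually painted, can be sketched with a deliberately simple model. The linear pressure-to-width relationship, the least-squares fit, and the sample measurements below are all illustrative assumptions; FRIDA's actual stroke simulator is learned from real brush data and is far richer than this.

```python
def fit_stroke_model(pressures, widths):
    # Calibrate a linear simulator, width = a * pressure + b, by
    # ordinary least squares on strokes the robot actually painted.
    n = len(pressures)
    mean_p = sum(pressures) / n
    mean_w = sum(widths) / n
    cov = sum((p - mean_p) * (w - mean_w) for p, w in zip(pressures, widths))
    var = sum((p - mean_p) ** 2 for p in pressures)
    a = cov / var
    b = mean_w - a * mean_p
    return a, b

def simulate_width(pressure, a, b):
    # The calibrated simulator predicts stroke width before painting,
    # narrowing the gap between simulated and real brushwork.
    return a * pressure + b

# Pressure commands and the stroke widths they produced on canvas
# (values are made up for illustration).
pressures = [0.2, 0.4, 0.6, 0.8]
widths = [1.1, 2.0, 2.9, 4.1]

a, b = fit_stroke_model(pressures, widths)
```

Once fitted from real strokes, the simulator's predictions reflect the physical brush and paint, so plans made in simulation transfer back to the canvas, the "sim2real" half of the loop.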
FRIDA’s team now aims to address some of the limitations of current large vision-language models by continually refining the ones they use. They fed the models headlines from news articles to give them a sense of what was happening in the world, and further trained them on images and text more representative of diverse cultures to mitigate an American or Western bias.