The Allen Institute for Artificial Intelligence (AI2) recently made two significant announcements, revealing Ali Farhadi as its new CEO and showcasing the groundbreaking Phone2Proc robot training approach. With an emphasis on open and transparent AI research, AI2 aims to drive fundamental advancements in various fields through the power of artificial intelligence.
AI2 proudly welcomed Ali Farhadi, an esteemed AI researcher, executive, and Forbes Top 5 AI entrepreneur, as its new CEO. Farhadi’s impressive background includes being a professor at the University of Washington and founding the AI2 Computer Vision team. He co-founded Xnor.ai, a deep-learning startup that was acquired by Apple Inc. in 2020. Having returned to AI2 from Apple, Farhadi is determined to propel the institute’s mission forward and establish a new era of open and trusted AI development.
Phone2Proc to simplify robot training
At the Conference on Computer Vision and Pattern Recognition (CVPR), AI2 presented Phone2Proc, an innovative robot training approach. This method involves scanning a physical space, generating a simulated environment, and procedurally creating thousands of scenes to train robots for a wide range of scenarios. Phone2Proc allows for faster and more effective training, enabling robots to perform in diverse settings, such as rooms with rearranged furniture, cluttered workspaces, or floors covered in toys.
Kiana Ehsani, a dedicated researcher at AI2, is at the forefront of the Phone2Proc effort. Trained under Farhadi at the Paul G. Allen School of Computer Science, Ehsani specializes in computer vision and embodied AI for training agents to interact with their surroundings. In an interview, she shared insights into the advancements Phone2Proc brings to the field of robotics.
Enhancing intelligent behavior in robots
Ehsani acknowledges that while AI has made significant progress, building robust robots that can work in any environment remains a challenge. To address this, she emphasizes the importance of training models across many environments and of pretraining to achieve better generalization. By sampling trajectories and generating numerous scenes through Phone2Proc, AI2 aims to improve training and help agents adapt to a wide range of real-world scenarios.
Ehsani envisions a simplified approach to robot training with Phone2Proc. The idea is to use a smartphone to capture a video of a room, which is then used to automatically generate an environmental template. Through procedural generation, the template encodes specific characteristics of the house and environment, enabling safer, cheaper, and faster training. AI2’s methods have demonstrated significant improvements in training robots to handle practical challenges such as lighting changes, furniture movement, and the presence of people.
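The scan-to-scenes pipeline described above can be sketched in miniature. This is an illustrative mock, not AI2’s code: `scan_to_template` stands in for the phone-scan step, and the template fields and randomization ranges are assumptions chosen only to show the idea of generating thousands of varied scenes from one capture.

```python
import random

def scan_to_template(video_frames):
    """Hypothetical stand-in for the phone-scan step: Phone2Proc derives
    a room layout (walls plus object bounding boxes) from a phone video."""
    return {
        "walls": [(0, 0, 6, 4)],  # (x0, y0, x1, y1) in meters
        "objects": [{"type": "sofa", "bbox": (1.0, 1.0, 3.0, 2.0)},
                    {"type": "table", "bbox": (4.0, 2.0, 5.0, 3.0)}],
    }

def generate_scene(template, rng):
    """Procedurally vary one training scene from the fixed template."""
    scene = {"walls": template["walls"],
             "lighting": rng.uniform(0.3, 1.0),  # randomized light intensity
             "objects": []}
    for obj in template["objects"]:
        x0, y0, x1, y1 = obj["bbox"]
        dx, dy = rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)  # jitter placement
        scene["objects"].append({
            "type": obj["type"],
            "bbox": (x0 + dx, y0 + dy, x1 + dx, y1 + dy),
            "texture": rng.choice(["wood", "fabric", "metal"]),
        })
    return scene

# One scan yields thousands of procedurally varied training scenes.
template = scan_to_template(video_frames=None)
rng = random.Random(0)
scenes = [generate_scene(template, rng) for _ in range(1000)]
print(len(scenes))
```

Because the walls stay fixed while lighting, textures, and placements vary, an agent trained on these scenes sees the same house under many plausible conditions.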
Transitioning from simulation to reality
Considering the rise of the metaverse concept, Ehsani addresses the issue of accounting for random objects in simulations. Phone2Proc solves this challenge by utilizing scans of physical spaces to identify walls and objects’ 3D bounding boxes. This data enables the creation of templates with randomized lighting, textures, and object placements, all achieved without human intervention. Moreover, the library of assets within AI2’s Objaverse provides annotations for semantic understanding and object affordances, facilitating realistic training scenarios.
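The role of asset annotations can be illustrated with a toy library. The entries and affordance names below are hypothetical, in the spirit of the Objaverse annotations the article mentions; they show how semantic metadata lets a scenario generator query for objects that support a given interaction.

```python
# Hypothetical asset library: each entry carries semantic tags and
# affordances that a training-scenario generator can query.
ASSETS = {
    "mug_01":  {"category": "mug",  "affordances": ["pickupable", "fillable"]},
    "sofa_02": {"category": "sofa", "affordances": ["sittable"]},
    "door_03": {"category": "door", "affordances": ["openable"]},
    "book_04": {"category": "book", "affordances": ["pickupable", "openable"]},
}

def assets_with(affordance):
    """Return all assets that support the given interaction, sorted by name."""
    return sorted(name for name, meta in ASSETS.items()
                  if affordance in meta["affordances"])

print(assets_with("pickupable"))  # ['book_04', 'mug_01']
print(assets_with("openable"))    # ['book_04', 'door_03']
```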
While AI2’s current focus does not include simulated people or dynamic figures, Ehsani highlights the importance of training robots to adapt to changes in their environment. By introducing different objects during each run of an experiment, agents learn to update their path planning and maneuver around obstacles. The aim is to enable robots to navigate corridors or other areas with multiple possible routes, enhancing their ability to operate effectively in dynamic settings.
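The replanning behavior Ehsani describes can be sketched with a simple grid search. This is a minimal illustration, not AI2’s planner: breadth-first search stands in for whatever path planner the agent uses, and the grid is an assumed toy corridor with two possible routes.

```python
from collections import deque

def shortest_path(grid, start, goal):
    """BFS on a 4-connected grid; cells marked 1 are obstacles."""
    rows, cols = len(grid), len(grid[0])
    queue, seen = deque([(start, [start])]), {start}
    while queue:
        (r, c), path = queue.popleft()
        if (r, c) == goal:
            return path
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), path + [(nr, nc)]))
    return None

# A corridor with two possible routes around a central obstacle.
corridor = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
plan_a = shortest_path(corridor, (0, 0), (2, 2))

corridor[1][0] = 1  # a new obstacle appears mid-run, blocking the first route
plan_b = shortest_path(corridor, (0, 0), (2, 2))
print(plan_a != plan_b)  # True: the agent found a different route
```

Randomizing which cells are blocked on each training run, as Phone2Proc does with object placements, forces the agent to learn this replan-and-reroute behavior rather than memorize one path.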
Expanding robot training capabilities
AI2’s efforts in training robots extend to open-source LoCoBots designed for navigation. From locating specific objects within a household to mobile manipulation tasks involving object movement and rearrangement, AI2 explores various use cases in simulated environments. The integration of 3D models into the Unity development platform, alongside a bridging Python package, allows for seamless training in both simulated and real-world scenarios. The physics engine within Unity accurately simulates object interactions, promoting safe and effective learning.
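The simulator-bridge pattern described above can be sketched as follows. The article does not name the Python package, so this mock is purely illustrative: a Python-side loop sends discrete actions and receives observations, the way a bridge to a Unity scene (or to a real LoCoBot) would.

```python
# Hypothetical sketch of the Python-to-simulator bridge pattern.
# MockSimulator stands in for a Unity scene; all names are illustrative.
class MockSimulator:
    """Accepts discrete actions and returns an observation per step."""
    def __init__(self, start=(0.0, 0.0)):
        self.position = start

    def step(self, action):
        moves = {"MoveAhead": (0.25, 0.0), "MoveRight": (0.0, 0.25)}
        if action not in moves:
            return {"success": False, "position": self.position}
        dx, dy = moves[action]
        self.position = (self.position[0] + dx, self.position[1] + dy)
        return {"success": True, "position": self.position}

# The same action loop can drive either the simulation or a physical robot,
# which is what makes policies trained in simulation transferable.
sim = MockSimulator()
for action in ["MoveAhead", "MoveAhead", "MoveRight"]:
    event = sim.step(action)
print(event["position"])  # (0.5, 0.25)
```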
As AI2 continues to advance its research, Ehsani acknowledges the challenge of scaling object libraries. While the institute has developed a diverse range of object types for training purposes, it recognizes the need for further expansion. Collaboration with the Objaverse project, which offers a vast library of annotated objects, ensures that AI2 can leverage that work to enhance visual diversity and scalability.
The Allen Institute for Artificial Intelligence, under the leadership of CEO Ali Farhadi, is committed to advancing open and transparent AI research. The introduction of Phone2Proc revolutionizes robot training by utilizing simulated environments to prepare agents for real-world challenges. With a focus on improving generalization and adapting to dynamic environments, AI2’s research efforts are driving the development of robust and intelligent robotics.