DeepMind at NeurIPS ’23

SeniorTechInfo

Towards more multimodal, robust, and general AI systems

Next week marks the start of the 37th annual conference on Neural Information Processing Systems (NeurIPS), the largest artificial intelligence (AI) conference in the world. NeurIPS 2023 will be taking place December 10-16 in New Orleans, USA. Teams from across Google DeepMind are presenting more than 180 papers at the main conference and workshops. We’ll be showcasing demos of our cutting-edge AI models for global weather forecasting, materials discovery, and watermarking AI-generated content. There will also be an opportunity to hear from the team behind Gemini, our largest and most capable AI model. Here’s a look at some of our research highlights:

Multimodality: language, video, action

UniSim is a universal simulator of real-world interactions.

Generative AI models can create paintings, compose music, and write stories. But however capable these models may be in one medium, most struggle to transfer those skills to another. We explore how generative abilities could help models learn across modalities. In a spotlight presentation, we show that diffusion models can be used to classify images with no additional training required. Diffusion models like Imagen classify images in a more human-like way than other models, relying on shapes rather than textures. What’s more, we show how just predicting captions from images can improve computer-vision learning. Our approach surpassed current methods on vision and language tasks, and showed more potential to scale.

More multimodal models could pave the way for more useful digital and robot assistants that help people in their everyday lives. In a spotlight poster, we create agents that could interact with the digital world as humans do: through screenshots, and keyboard and mouse actions. Separately, we show that by leveraging video generation, including subtitles and closed captioning, models can transfer knowledge by predicting video plans for real robot actions.

One of the next milestones could be to generate realistic experience in response to actions carried out by humans, robots, and other types of interactive agents. We’ll be showcasing a demo of UniSim, our universal simulator of real-world interactions. This type of technology could have applications across industries, from video games and film to training agents for the real world.

Building safe and understandable AI

An artist’s illustration of artificial intelligence (AI). This image depicts AI safety research. It was created by artist Khyati Trehan as part of the Visualising AI project launched by Google DeepMind.

When developing and deploying large models, privacy needs to be embedded at every step of the way. In a paper recognized with the NeurIPS best paper award, our researchers demonstrate how to evaluate privacy-preserving training with a technique that is efficient enough for real-world use. For training, our teams are studying how to measure whether language models are memorizing data, in order to protect private and sensitive material. In another oral presentation, our scientists investigate the limitations of training through “student” and “teacher” models that have different levels of access and vulnerability if attacked.

Large language models (LLMs) can generate impressive answers, but are prone to “hallucinations”: text that seems correct but is made up. Our researchers ask whether a method for finding where a fact is stored (localization) can enable editing that fact. Surprisingly, they found that locating a fact and then editing that location does not edit the fact, hinting at the complexity of understanding and controlling stored information in LLMs.

With Tracr, we propose a novel way of evaluating interpretability methods by translating human-readable programs into transformer models. We’ve open-sourced a version of Tracr to serve as a ground truth for evaluating interpretability methods.
