Amazon Lab126

Amazon Lab126 at the Science Hub consists of teams of researchers working together to design and engineer AI hardware for consumer electronic devices.

Projects coming soon

Combining Virtual and Real Imagery through Deep Inverse Rendering

We seek to combine real video imagery with synthetic imagery, a capability with critical applications in visual effects, product placement, and augmented reality. Examples include seamlessly inserting additional objects into video frames, replacing one object with another, and editing materials. Traditional methods for achieving seamless results rely on heavy pipelines and tedious work by visual effects artists (match moving, light probes, rotoscoping, 3D modeling, etc.). In this project, we seek to dramatically simplify the process and even to automate it fully. Funded by Prime Video.

This work is a collaboration between MIT faculty member Fredo Durand and Amazon researchers Ahmed Saad, Maxim Arap, and Mohamed Omar.

Cross-Modal Representation Learning in Videos

In this work, we propose a novel framework based on cross-modal contrastive pretraining that efficiently learns a unified model for zero-shot video, audio, and text understanding. In particular, we will continue to develop, in the context of video understanding, a progressive self-distillation method that has shown impressive gains in the efficiency and robustness of image-based vision-language models. A model that learns to align corresponding data across these three input modalities constitutes a “Swiss Army knife” of discriminative perceptual classifiers, with the potential to emerge as the “foundation” model backbone powering downstream approaches to video analytics, scene detection, and content moderation for regulatory compliance. Funded by Prime Video.

This work is a collaboration between MIT researcher Aude Oliva and graduate student Alex Andonian, and Amazon researchers Raffay Hamid, Natalie Strobach, Mohamed Omar, Maxim Arap and Ahmed Saad.
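
For readers unfamiliar with the term, the cross-modal contrastive objective mentioned in the project description can be illustrated with a minimal sketch. This is not the project's method and does not include progressive self-distillation; it is a generic symmetric contrastive (InfoNCE-style) loss of the kind used in vision-language pretraining, extended here to three modalities by averaging over pairs. The function names, the temperature value of 0.07, and the toy embeddings are all assumptions made for illustration.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is nonzero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    # Numerically stable log-softmax over one row of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def contrastive_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric InfoNCE loss for two batches of paired embeddings.

    Item i of emb_a and item i of emb_b are assumed to come from the
    same clip (a positive pair); all other combinations are negatives.
    """
    a = [l2_normalize(v) for v in emb_a]
    b = [l2_normalize(v) for v in emb_b]
    n = len(a)
    # logits[i][j]: scaled cosine similarity between a[i] and b[j].
    logits = [[sum(x * y for x, y in zip(a[i], b[j])) / temperature
               for j in range(n)] for i in range(n)]
    # Direction A -> B: each row should assign high probability to its diagonal.
    loss_ab = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Direction B -> A: the same, over columns.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_ba = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_ab + loss_ba)

def three_way_loss(video, audio, text, temperature=0.07):
    # Hypothetical extension: average the pairwise losses over all three modality pairs.
    return (contrastive_loss(video, audio, temperature)
            + contrastive_loss(video, text, temperature)
            + contrastive_loss(audio, text, temperature)) / 3.0

# Toy embeddings for two clips; real ones would come from learned encoders.
video = [[1.0, 0.0], [0.0, 1.0]]
audio = [[0.9, 0.1], [0.1, 0.9]]
text_aligned = [[1.0, 0.1], [0.0, 1.0]]
text_shuffled = [[0.0, 1.0], [1.0, 0.1]]

aligned = three_way_loss(video, audio, text_aligned)
shuffled = three_way_loss(video, audio, text_shuffled)
```

In this toy setup the loss is lower when the text embeddings line up with their matching video and audio embeddings than when they are swapped, which is the behavior pretraining exploits to align the modalities.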
