According to Cointelegraph, social media giant Meta has introduced two artificial intelligence (AI) models for content editing and generation. The first, Emu Video, generates video clips from text and image inputs, building on Meta's earlier Emu model. The second, Emu Edit, focuses on image manipulation and promises greater precision in image editing. Although both are still in the research stage, Meta believes they have potential use cases for creators, artists, and animators.
Emu Video was trained using a factorized approach that splits generation into two steps, allowing the model to respond to different kinds of inputs: it first generates an image from a text prompt, then generates video conditioned on both the text and that image. Using only two diffusion models, Emu Video can animate images from a text prompt, producing 512x512-pixel, four-second videos at 16 frames per second (a rough sketch of this pipeline appears below).

Emu Edit, on the other hand, lets users perform a range of image manipulation tasks, such as removing or adding backgrounds, color and geometry transformations, and local and global edits. Meta trained Emu Edit on a dataset of 10 million synthesized samples, each pairing an input image and a description of the editing task with the targeted output image.
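To make the factorized approach concrete, the following is a minimal Python sketch of the two-step pipeline: one diffusion model produces an image from the text prompt, and a second produces video frames conditioned on both the prompt and that image. The function names, signatures, and placeholder arrays are illustrative assumptions, not Meta's actual implementation or API.

```python
import numpy as np

# Hypothetical stand-ins for the two diffusion models; Meta's actual
# architectures are not public in this form.
def text_to_image(prompt: str, size: int = 512) -> np.ndarray:
    """Step 1: sample an image conditioned on the text prompt."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.random((size, size, 3), dtype=np.float32)  # placeholder RGB

def image_text_to_video(image: np.ndarray, prompt: str,
                        seconds: int = 4, fps: int = 16) -> np.ndarray:
    """Step 2: sample video frames conditioned on the image and prompt."""
    frames = seconds * fps  # 64 frames for a four-second clip at 16 fps
    rng = np.random.default_rng(abs(hash(prompt + "video")) % (2**32))
    # Placeholder dynamics: lightly perturb the conditioning image per frame.
    noise = rng.random((frames, *image.shape), dtype=np.float32) * 0.01
    return np.clip(image[None, ...] + noise, 0.0, 1.0)

def generate_video(prompt: str) -> np.ndarray:
    """Factorized generation: text -> image, then (image, text) -> video."""
    image = text_to_image(prompt)               # first diffusion model
    return image_text_to_video(image, prompt)   # second diffusion model

clip = generate_video("a dog surfing a wave at sunset")
print(clip.shape)  # (64, 512, 512, 3): 4 s of 512x512 frames at 16 fps
```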
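Similarly, the shape of Emu Edit's training data can be illustrated with a hypothetical record type. The field names and task taxonomy below are assumptions drawn only from the categories the article mentions, not from Meta's released dataset format.

```python
from dataclasses import dataclass
from enum import Enum, auto

class EditTask(Enum):
    # Task categories named in the announcement; this grouping is
    # illustrative, not Meta's actual taxonomy.
    BACKGROUND_REMOVAL = auto()
    BACKGROUND_ADDITION = auto()
    COLOR_TRANSFORM = auto()
    GEOMETRY_TRANSFORM = auto()
    LOCAL_EDIT = auto()
    GLOBAL_EDIT = auto()

@dataclass
class EditSample:
    """One synthesized training example: an input image and a task
    description paired with the targeted output image."""
    input_image_path: str    # source image
    instruction: str         # natural-language description of the edit
    task: EditTask           # which manipulation category it exercises
    target_image_path: str   # the edited result the model should produce

sample = EditSample(
    input_image_path="inputs/000001.png",
    instruction="remove the background behind the dog",
    task=EditTask.BACKGROUND_REMOVAL,
    target_image_path="targets/000001.png",
)
print(sample.instruction, "->", sample.task.name)
```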
Meta's Emu model was trained on 1.1 billion pieces of data, including photos and captions shared by users on Facebook and Instagram. However, regulators are closely scrutinizing Meta's AI-based tools, prompting the company to take a cautious approach to deployment. Meta recently announced that it would bar political campaigns and advertisers from using its AI tools to create ads on Facebook and Instagram, although the platform's general advertising rules do not specifically address AI.