Published Nov 20, 2023

Visual Generative AI Ecosystem Challenges with Richard Zhang - 656

Richard Zhang, a senior research scientist at Adobe Research, discusses open challenges in the visual generative AI ecosystem: deepfake detection, data attribution, model customization, and aligning AI outputs with human perception. He argues that new tools and metrics are needed to make generative AI technologies more adaptable and effective.

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Episode Highlights

  • User Customization

    Richard Zhang discusses the importance of enhancing user interactions with AI systems, emphasizing the need for dynamic and personalized customization. He highlights the limitations of current text-based interfaces and suggests a spectrum of interaction methods, such as style transfer and personal object integration, to provide users with more control over AI-generated content [1]. Zhang explains that while text-to-image models like DALL-E 2 offer foundational capabilities, they lack the detailed control creators need [2].

    We want to have some sort of permanent state that allows you to iterate with it kind of meaningfully.

    ---

    He envisions a future where creators can iteratively refine AI outputs, making the process more intuitive and efficient.

       

  • Model Tools

    Zhang introduces model customization tools like custom diffusion, which allow users to modify AI models by integrating personal content or removing specific concepts. This method, described as a form of network surgery, enables targeted changes without compromising the model's overall integrity [3]. He also discusses the challenges of generalizing detection tools across different generative models, emphasizing the need for adaptable solutions to keep pace with evolving AI technologies [4].

    We want to do the removal, but we also want to be careful. Like, we don't want to blow out, like, all the other types of painting styles that are in the model.

    ---

    These advancements aim to empower users with greater control and flexibility in their interactions with AI systems.
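
The "network surgery" idea of making targeted changes while leaving the rest of the model alone can be pictured as choosing a small subset of parameters to update and freezing everything else. The following is a minimal sketch, not the method as described in the episode: it assumes the edit is restricted to the key and value projections of cross-attention layers, and every parameter name in it (including the `attn2` tag for cross-attention) is a hypothetical placeholder chosen for illustration.

```python
# Hypothetical parameter names, loosely modeled on a diffusion U-Net layout.
# "attn1" is assumed to mean self-attention and "attn2" cross-attention;
# none of these names come from the episode.
PARAM_NAMES = [
    "down.0.resnet.conv1.weight",
    "down.0.attn1.to_q.weight",
    "down.0.attn1.to_k.weight",
    "down.0.attn2.to_q.weight",
    "down.0.attn2.to_k.weight",
    "down.0.attn2.to_v.weight",
    "up.0.attn2.to_k.weight",
    "up.0.attn2.to_v.weight",
    "up.0.resnet.conv2.weight",
]


def select_trainable(names, layer_tag="attn2",
                     suffixes=("to_k.weight", "to_v.weight")):
    """Return the parameter names that would be updated; all others stay frozen."""
    return [
        n for n in names
        if layer_tag in n.split(".") and n.endswith(suffixes)
    ]


trainable = select_trainable(PARAM_NAMES)
frozen = [n for n in PARAM_NAMES if n not in trainable]

print("update:", trainable)
print("freeze:", frozen)
```

Keeping the updated set this small is one plausible way to add or remove a concept without, in Zhang's words, blowing out everything else the model has learned; the exact choice of layers here is an assumption.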
