Published Mar 1, 2024

762: Gemini 1.5 Pro, the Million-Token-Context LLM — with Jon Krohn (@JonKrohnLearns)

Jon Krohn examines Google's Gemini 1.5 Pro, a multimodal large language model with a million-token context window, covering the limitations he ran into and possible ways to improve its performance.

Episode Highlights


Contextual Challenges

Jon Krohn explores the contextual challenges faced by Gemini 1.5 Pro, highlighting where it struggles to process information accurately. He shares an instance where the model failed to provide correct timestamps for a video and produced fabricated content instead. The cause is that the model does not process audio from uploaded videos, so any output that depends on the audio track is hallucinated. Krohn notes, "It turns out everything that Gemini 1.5 pro output was hallucinated, completely made up and done very confidently."

Solution Approaches

To address these limitations, Krohn considers combining video and audio analysis: identifying visual cues, such as smiling, and cross-referencing them with the audio track to improve accuracy. He adds that the model performs well within its limits: "The algorithm does work very well, as long as you, you're not expecting to get audio results from the video alone."
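The cross-referencing idea can be illustrated with a minimal sketch. It assumes the visual cues and a separately produced, timestamped transcript are already available as plain data. The names `VisualCue`, `TranscriptSegment`, `cross_reference`, and the two-second window are hypothetical and chosen for illustration. They are not taken from the episode or from any Gemini API.

```python
from dataclasses import dataclass


@dataclass
class VisualCue:
    time_s: float  # when the cue was detected in the video, in seconds
    label: str     # e.g. "smile"


@dataclass
class TranscriptSegment:
    start_s: float  # segment start in the audio track, in seconds
    end_s: float    # segment end, in seconds
    text: str       # transcribed speech (from a separate audio pipeline)


def cross_reference(cues, segments, window_s=2.0):
    """Pair each visual cue with the transcript segments near it in time.

    A segment counts as "near" if it overlaps the interval
    [cue.time_s - window_s, cue.time_s + window_s].
    """
    matches = []
    for cue in cues:
        lo, hi = cue.time_s - window_s, cue.time_s + window_s
        nearby = [s for s in segments if s.start_s <= hi and s.end_s >= lo]
        matches.append((cue, nearby))
    return matches


if __name__ == "__main__":
    # Hypothetical example data, not taken from the episode.
    cues = [VisualCue(12.0, "smile"), VisualCue(95.5, "smile")]
    segments = [
        TranscriptSegment(10.0, 14.0, "That was a great joke."),
        TranscriptSegment(40.0, 45.0, "Let's look at the data."),
    ]
    for cue, nearby in cross_reference(cues, segments):
        print(cue.time_s, cue.label, [s.text for s in nearby])
```

A cue with no nearby transcript segment comes back with an empty list, which flags moments where the visual signal has no audio evidence to support it.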

Related Episodes

666: GPT-4 — with Jon Krohn (@JonKrohnLearns)
670: LLaMA: GPT-3 performance, 10x smaller — with Jon Krohn (@JonKrohnLearns)
761: Gemini Ultra: How to Release an AI Product for Billions of Users — with Google's Lisa Cohen
704: Jon’s “Generative A.I. with LLMs” Hands-on Training — with Jon Krohn (@JonKrohnLearns)
756: AlphaGeometry: AI is Suddenly as Capable as the Brightest Math Minds — with @JonKrohnLearns
788: Multi-Agent Systems: How Teams of LLMs Excel at Complex Tasks — with @JonKrohnLearns
728: Use Contrastive Search to get Human-Quality LLM Outputs — with Jon Krohn (@JonKrohnLearns)
SDS 438: Artificial General Intelligence — with Jon Krohn
684: Get More Language Context out of your LLM — with Jon Krohn (@JonKrohnLearns)
806: Llama 3.1 405B: The First Open-Source Frontier LLM — with Jon Krohn (@JonKrohnLearns)
748: The Five Levels of AGI — with Jon Krohn (@JonKrohnLearns)
772: In Case You Missed It in March 2024 — with Jon Krohn (@JonKrohnLearns)
702: Llama 2 — It's Time to Upgrade your Open-Source LLM — with Jon Krohn (@JonKrohnLearns)
768: Is Claude 3 Better than GPT-4? — with Jon Krohn (@JonKrohnLearns)
854: The Six Epochs of Intelligence Evolution — with Jon Krohn (@JonKrohnLearns)

Algorithmic Challenges

Jon Krohn examines the challenges and potential solutions for Google's Gemini 1.5 Pro, focusing on its context handling and multimodal capabilities. He highlights the model's current limitations and suggests approaches to improve its performance.

AI Industry Impact

Jon Krohn explores the potential impact of Google's Gemini 1.5 Pro, an LLM with a million-token context window. He presents its extensive context window and multimodal functionality as a major advance for data science.

Gemini Pro Capabilities

Jon Krohn explores the capabilities of Google's Gemini 1.5 Pro, a language model with a million-token context window. He describes its long context window and multimodal capabilities as a significant leap for data science.