Model Deployment Challenges

Chip argues that monitoring deployed systems is an underrated part of putting models into production, and points to slow inference as a major bottleneck, especially for large models like GPT-2. Reducing inference time could be crucial for companies trying to break even in the current economy.
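
As a rough illustration of the monitoring and inference-time points above, the sketch below times repeated calls to a model and summarizes the latencies. It is a minimal example, not anything described in the discussion: `fake_model`, `measure_latency`, and `summarize` are hypothetical names, and `fake_model` is a stand-in that simply sleeps for about 2 ms where a real model call would go.

```python
import statistics
import time


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; sleeps to simulate work."""
    time.sleep(0.002)
    return prompt[::-1]


def measure_latency(predict, inputs, warmup=2):
    """Return one latency in milliseconds per input, after a few warm-up calls."""
    for x in inputs[:warmup]:
        predict(x)  # warm-up calls are not timed
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Summarize latencies with mean, median, and nearest-rank 95th percentile."""
    ordered = sorted(latencies)
    # Nearest-rank percentile: ceil(0.95 * n) as a 1-based rank, in integer math.
    p95_index = max(0, (95 * len(ordered) + 99) // 100 - 1)
    return {
        "mean_ms": statistics.mean(ordered),
        "p50_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
    }


if __name__ == "__main__":
    stats = summarize(measure_latency(fake_model, ["hello world"] * 20))
    print(stats)
```

Tracking a tail percentile alongside the mean is a common choice for monitoring, since a small share of slow requests can go unnoticed in an average.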