Document Fusion Models

Patrick discusses how document fusion models predict the next token in a document by fusing queries and documents early in the model. Keith explores the efficiency and complexity trade-offs of different model architectures. The Fusion-in-Decoder approach encodes each retrieved document separately and lets the decoder cross-attend over all of them at once, which improves language model performance.
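To make the efficiency point concrete, here is a rough back-of-the-envelope sketch comparing how many attention score pairs two designs compute. It is an illustration under stated assumptions, not the speakers' own analysis: the function names are hypothetical, `doc_len` is assumed to count the query tokens prepended to each document, and hidden size, layer count, and attention heads are ignored.

```python
def early_fusion_encoder_pairs(n_docs: int, doc_len: int) -> int:
    """Attention pairs when all documents are concatenated into one
    sequence and encoded together (full self-attention)."""
    total_len = n_docs * doc_len
    return total_len * total_len  # quadratic in the total length


def fid_encoder_pairs(n_docs: int, doc_len: int) -> int:
    """Attention pairs when each (query + document) sequence is encoded
    independently, as in a Fusion-in-Decoder style encoder."""
    return n_docs * doc_len * doc_len  # linear in the number of documents


def fid_cross_attention_pairs(n_docs: int, doc_len: int, out_len: int) -> int:
    """Cross-attention pairs when the decoder attends over the
    concatenated encoder states of all documents."""
    return out_len * n_docs * doc_len


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    n_docs, doc_len, out_len = 10, 100, 20
    print("joint encoding:   ", early_fusion_encoder_pairs(n_docs, doc_len))
    print("per-doc encoding: ", fid_encoder_pairs(n_docs, doc_len))
    print("decoder cross-att:", fid_cross_attention_pairs(n_docs, doc_len, out_len))
```

Under these assumptions, encoding documents separately grows linearly with the number of documents, while encoding them jointly grows quadratically; the fusion itself happens in the decoder's cross-attention, whose cost is linear in the total encoded length.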