Published Dec 19, 2023

Open source, on-disk vector search with LanceDB

Dive into the revolutionary world of LanceDB with Chang She as he explores its impact on generative AI, practical industry applications, and unique technology, featuring its open-source, on-disk vector search capabilities that redefine semantic search and data management.
Episode Highlights
Practical AI logo

  • Technical Features

    LanceDB stands out in the crowded field of vector databases with its technical features. Chang She explains that LanceDB is embedded, running in-process in both Python and JavaScript, and that it has a new storage layer built on its columnar format. This enables data management features and scalability by separating compute from storage, which is crucial for handling large-scale data efficiently [1]. He also shares that LanceDB's origins were not in vector databases but in serving companies building computer vision infrastructure, which highlights its evolution and adaptability [2].

    We have a totally new storage layer through LanceDB columnar format. What this allows us to do is add data management features on top of the index.

    ---

    These features make LanceDB a cost-effective and scalable solution for developers and companies working with large datasets.
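
    To make "embedded" and "separating compute from storage" concrete, here is a minimal sketch in plain Python. It is not LanceDB's API: the class `TinyEmbeddedTable`, its methods, and the JSON-lines file layout are all hypothetical, chosen only to show a table that runs inside the calling process and keeps its data in a directory on disk.

```python
import json
import tempfile
from pathlib import Path


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


class TinyEmbeddedTable:
    """Hypothetical stand-in for an embedded vector table (not LanceDB's API)."""

    def __init__(self, directory):
        # All state lives in this directory; there is no server process.
        self.path = Path(directory) / "rows.jsonl"
        self.path.parent.mkdir(parents=True, exist_ok=True)

    def add(self, rows):
        # Append rows to the on-disk file, one JSON object per line.
        with self.path.open("a", encoding="utf-8") as f:
            for row in rows:
                f.write(json.dumps(row) + "\n")

    def search(self, query, k=1):
        # Brute-force scan, streaming rows from disk one line at a time.
        scored = []
        with self.path.open(encoding="utf-8") as f:
            for line in f:
                row = json.loads(line)
                scored.append((squared_distance(query, row["vector"]), row))
        scored.sort(key=lambda pair: pair[0])
        return [row for _, row in scored[:k]]


data_dir = tempfile.mkdtemp()
writer = TinyEmbeddedTable(data_dir)
writer.add([
    {"id": "cat", "vector": [0.9, 0.1]},
    {"id": "car", "vector": [0.1, 0.9]},
])

# A second handle on the same directory sees the same rows:
# the data outlives whichever process (compute) wrote it.
reader = TinyEmbeddedTable(data_dir)
nearest = reader.search([1.0, 0.0], k=1)
```

    The point of the design is that any process able to read the directory can query it, so compute can be added or removed without moving the data.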

       

  • Storage & Indexing

    LanceDB's columnar format and disk-based vector indices are key to its scalability and efficiency. Chang She describes how the columnar format allows selective data access: a query fetches only the columns it needs, which improves performance [3]. Paired with disk-based indices, this structure separates compute from storage and allows efficient data processing even on modest hardware [4].

    It's all about the separation of compute and storage. And that's only possible if you have the right underlying data architecture for storing vectors and the data itself.

    ---

    This architecture supports large-scale data operations, making LanceDB suitable for diverse applications across various programming environments.
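
    The idea of selective column access over on-disk vectors can be sketched in a few lines of plain Python. This is a toy layout, not the Lance file format: the file names, the fixed dimension `DIM`, and the helper functions are assumptions made for illustration, and a brute-force scan stands in for a real disk-based index.

```python
import json
import math
import tempfile
from array import array
from pathlib import Path

DIM = 2  # assumed fixed vector dimension for this toy layout


def write_columns(directory, ids, captions, vectors):
    """Store each column in its own file so readers can fetch them independently."""
    root = Path(directory)
    (root / "id.json").write_text(json.dumps(ids), encoding="utf-8")
    (root / "caption.json").write_text(json.dumps(captions), encoding="utf-8")
    flat = array("f", [x for vec in vectors for x in vec])  # float32 values
    with (root / "vector.f32").open("wb") as f:
        flat.tofile(f)


def read_column(directory, name):
    """Read one JSON-encoded column without touching any other column file."""
    return json.loads((Path(directory) / f"{name}.json").read_text(encoding="utf-8"))


def nearest_row(directory, query):
    """Scan the vector column from disk, holding only one vector in memory at a time."""
    row_bytes = DIM * array("f").itemsize
    best_row, best_dist = -1, math.inf
    with (Path(directory) / "vector.f32").open("rb") as f:
        row = 0
        while True:
            chunk = f.read(row_bytes)
            if len(chunk) < row_bytes:
                break
            vec = array("f")
            vec.frombytes(chunk)
            dist = sum((a - b) ** 2 for a, b in zip(query, vec))
            if dist < best_dist:
                best_row, best_dist = row, dist
            row += 1
    return best_row


data_dir = tempfile.mkdtemp()
write_columns(
    data_dir,
    ids=[10, 11, 12],
    captions=["cat", "car", "dog"],
    vectors=[[0.9, 0.1], [0.1, 0.9], [0.8, 0.3]],
)

# The search reads only the vector file; the answer reads only the caption file.
row = nearest_row(data_dir, [1.0, 0.0])
caption = read_column(data_dir, "caption")[row]
```

    Here the `id` column is never read, which is the benefit a columnar layout offers: the cost of a query tracks the columns it uses rather than the full width of the table.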
