Published Feb 14, 2023

653: Efficiently Glean-ing Insights from Vast Data Warehouses — with Carlos Aguilar

Explore the insights of Glean's founder Carlos Aguilar as he reveals how their innovative data platform fills gaps in business intelligence, leverages cutting-edge technologies, and thrives on a diverse team culture, while diving into the fascinating applications of genetic algorithms.

Super Data Science: ML & AI Podcast with Jon Krohn

Episode Highlights

  • Data Tools

    Carlos Aguilar, founder and CEO of Glean, explains how the platform builds on existing data warehousing tools to provide analytics. With warehouses such as Snowflake and Google BigQuery, Glean runs computations and data profiling live inside the warehouse instead of extracting the data, which keeps performance fast [1]. DuckDB, an in-process columnar database, covers isolated testing and lightweight demos that should not rely on external dependencies [2]. Because the warehouse does the heavy computation, Glean can keep its own transformation library lightweight while still delivering robust analytics.

    We're just relying on that data warehouse.
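
    The idea of profiling data where it lives can be sketched in a few lines. This is a minimal illustration, not Glean's implementation: the standard-library sqlite3 module stands in for a warehouse such as Snowflake or BigQuery, and the `profile_column` helper and the `orders` table are hypothetical names invented for the example. The point is that a single aggregate query runs inside the database and only the summary row is returned to the client.

```python
import sqlite3


def profile_column(conn, table, column):
    """Profile one column with a single aggregate query run inside the database.

    Hypothetical helper for illustration; only the summary row leaves the database.
    """
    # Identifiers cannot be bound as query parameters, so quote them defensively.
    t = '"' + table.replace('"', '""') + '"'
    c = '"' + column.replace('"', '""') + '"'
    sql = (
        f"SELECT COUNT(*), COUNT({c}), COUNT(DISTINCT {c}), MIN({c}), MAX({c}) "
        f"FROM {t}"
    )
    total, non_null, distinct, lowest, highest = conn.execute(sql).fetchone()
    return {
        "rows": total,
        "nulls": total - non_null,
        "distinct": distinct,
        "min": lowest,
        "max": highest,
    }


# Assumption: an in-memory SQLite database plays the role of the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 20.0), (2, 35.5), (3, None), (4, 20.0)],
)

profile = profile_column(conn, "orders", "amount")
print(profile)
```

    The same query shape would be sent to a real warehouse through its own client library. Nothing about the pattern depends on SQLite beyond its use here as a dependency-free stand-in.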

    ---

       

  • Visualization

    Glean's visualization tooling, which includes D3.js, lets users build sophisticated visual representations of their data. Carlos highlights the importance of scalability and of compatibility with the advanced analytics functions in warehouses like BigQuery and Snowflake [3]. The platform's lightweight semantic layer allows for quick setup and easy integration, making it accessible to both technical and non-technical users [4]. As a result, users can explore and visualize data without extensive configuration.

    Glean tries to have that sort of lightweight semantic layer.

    ---

       

  • Integration

    Glean's APIs and integration strategies are designed to support data exploration and insight extraction. The platform provides tools for both technical and non-technical stakeholders so that they can collaborate on data insights [5]. DuckDB's integration with the Apache Arrow APIs adds fast data serialization and computation [2]. Together, these tools let Glean serve diverse user needs while maintaining high performance.

    We try to create incredible tools for each of those types of stakeholders.

    ---
