Machine learning in your database

Episode Highlights
Capabilities
PostgresML integrates machine learning capabilities directly into the Postgres database. One of the founders explains that users can train and deploy models using SQL, which simplifies the process for those familiar with Postgres but not necessarily with machine learning frameworks like TensorFlow or PyTorch [1]. The platform supports vector operations, which are crucial for tasks like NLP, so models can be applied directly within the database [2]. The importance of these features is underscored by the observation that data scientists often deal with complex data transformations before model training [3].
You don't really need to know what the difference is between a support vector machine and a gradient boosted tree model. You pick the one with the best score and you move on with whatever your business is.
---
Because training and prediction run inside the database, this integration reduces the need for extensive data movement and external processing.
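The "train with SQL, pick the best score" workflow described above can be sketched roughly as follows. This is a minimal illustration, not the founders' own code: it only composes SQL text in Python and does not connect to a database. The `pgml.train` function and its named arguments are based on PostgresML's public documentation and should be checked against the installed version; the project, table, and column names and the scores are hypothetical.

```python
# Sketch only: build SQL statements that would ask PostgresML to train
# one model per algorithm. All names below are hypothetical examples.


def quote_literal(value):
    """Wrap a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def train_statement(project, task, relation, target, algorithm):
    """Return SQL text that calls pgml.train with named arguments (assumed API)."""
    return (
        "SELECT * FROM pgml.train("
        f"project_name => {quote_literal(project)}, "
        f"task => {quote_literal(task)}, "
        f"relation_name => {quote_literal(relation)}, "
        f"y_column_name => {quote_literal(target)}, "
        f"algorithm => {quote_literal(algorithm)});"
    )


def best_algorithm(scores):
    """Return the algorithm name with the highest score."""
    return max(scores, key=scores.get)


# One training statement per candidate algorithm.
statements = [
    train_statement("churn", "classification", "public.customers", "churned", algo)
    for algo in ("svm", "xgboost")
]

# Made-up scores for illustration only; real scores would come from training.
example_scores = {"svm": 0.81, "xgboost": 0.88}
chosen = best_algorithm(example_scores)
```

In real use the statements would be sent through a Postgres client, ideally with parameterized queries rather than hand-built strings; the string building here is only meant to show the shape of the calls.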
Origins
PostgresML grew out of the founders' experiences at Instacart, where they faced challenges scaling machine learning infrastructure. One founder shares how his journey began with moving Instacart's data systems to more scalable architectures, which laid the groundwork for PostgresML [4]. The other recounts his role in building these systems, emphasizing the need for efficient data handling and processing [5]. Their collaboration produced innovations that simplified complex data workflows and ultimately inspired the development of PostgresML [6].
We were getting large enough that we needed to move out of a monolithic Rails app into more of a distributed architecture that would be horizontally scalable.
---
This evolution highlights the importance of adaptable data solutions in rapidly growing tech environments.
Challenges
Integrating machine learning with databases presents unique challenges, which PostgresML aims to address. One of the founders describes the difficulties faced during the pandemic, when existing systems struggled under increased load, prompting a shift to Postgres-based solutions [7]. This transition revealed inefficiencies in traditional data pipelines and highlighted the need for more robust and scalable infrastructure [8]. The discussion also notes that PostgresML facilitates essential data transformations, allowing users to clean and prepare data directly within the database [9].
You got to have some kind of like, you know, mathematical operations on your data. Like you have to like, be able to transform things.
---
These capabilities streamline the integration process, making machine learning more accessible and efficient.
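As a rough illustration of the kind of cleaning and mathematical transformation mentioned above, the sketch below drops missing values and rescales a numeric column to zero mean and unit variance. It is written in plain Python for clarity and is an assumption about what such a step might look like, not PostgresML's in-database implementation; the function name and sample values are hypothetical.

```python
def standardize(values):
    """Drop missing entries, then rescale to zero mean and unit variance."""
    # Cleaning step: discard missing values (represented here as None).
    cleaned = [v for v in values if v is not None]
    if not cleaned:
        return []

    mean = sum(cleaned) / len(cleaned)
    # Population variance: average squared distance from the mean.
    variance = sum((v - mean) ** 2 for v in cleaned) / len(cleaned)
    std = variance ** 0.5

    # A constant column has zero spread; map it to zeros to avoid dividing by zero.
    if std == 0:
        return [0.0 for _ in cleaned]
    return [(v - mean) / std for v in cleaned]


# Hypothetical column with one missing value.
scaled = standardize([1.0, None, 3.0])
```

Doing this kind of preparation next to the data is the point made in the episode: the transformation happens where the rows already live instead of in a separate pipeline.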
Vision
Looking ahead, the creators of PostgresML envision a future where machine learning workflows are simpler and more accessible. Both founders emphasize the importance of reducing complexity in ML processes so that smaller teams can maintain high-quality production standards [10]. They aim to create a system where machine learning engineers can focus on their core tasks without being bogged down by infrastructure concerns [11].
I want machine learning engineers to do machine learning that they actually enjoy, as opposed to figuring out how to, like, how to load balance the service.
---
This vision underscores their commitment to enhancing the usability and efficiency of machine learning tools.
Related Episodes

scikit-learn & data science you own
Machine learning at small organizations
Testing ML systems
Data science for intuitive user experiences
Operationalizing ML/AI with MemSQL
Applied NLP solutions & AI education
Artificial intelligence at NVIDIA
The path towards trustworthy AI
Killer developer tools for machine learning
Generative models: exploration to deployment
So you have an AI model, now what?
From notebooks to Netflix scale with Metaflow
Answering recent AI questions from Quora
AI code that facilitates good science
Putting AI in a box at MachineBox