Episode 193: Apache Mahout

Episode Highlights
Fraud Detection
Machine learning has become a cornerstone of fraud detection and analytics, giving businesses a practical tool for combating fraudulent activity. The guest explains that the growth of machine learning is driven by the availability of vast amounts of data and the computational power to process it, which lets companies better understand user behavior and refine their systems to prevent fraud. He notes, "Fraud analytics that's been using machine learning techniques for quite a number of years...generally speaking, they do pretty well there" [1]. Adaptability is crucial: models must be periodically reevaluated to stay effective against evolving threats [2].
Recommendations
Recommendation systems use machine learning to improve the user experience by suggesting relevant content or products. The guest describes collaborative filtering as a key technique, which bases recommendations on either user-to-user or item-to-item similarity. He explains, "Collaborative filtering is more of just simply mechanism...if everybody is buying some particular book and you're similar...then we should recommend that book to you" [3]. These systems are flexible about how likeness is measured: Euclidean distance or cosine similarity, among other measures, can be used to decide how alike two users are [4].
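To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. It is not Mahout's API and not code from the episode: the ratings, the function names, and the scoring rule (a similarity-weighted sum of other users' ratings) are all assumptions chosen for illustration. It shows how the similarity measure can be swapped between cosine similarity and a Euclidean-distance-based score.

```python
import math

# Hypothetical toy ratings (user -> {item: rating}); not data from the episode.
RATINGS = {
    "alice": {"book_a": 5.0, "book_b": 4.0, "book_c": 1.0},
    "bob": {"book_a": 5.0, "book_b": 5.0, "book_d": 5.0},
    "carol": {"book_a": 1.0, "book_c": 5.0, "book_d": 1.0, "book_e": 4.0},
}


def cosine_similarity(u, v):
    """Cosine of the angle between two users' ratings over co-rated items."""
    shared = u.keys() & v.keys()
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def euclidean_similarity(u, v):
    """Map Euclidean distance over co-rated items into (0, 1]; closer is higher."""
    shared = u.keys() & v.keys()
    if not shared:
        return 0.0
    distance = math.sqrt(sum((u[i] - v[i]) ** 2 for i in shared))
    return 1.0 / (1.0 + distance)


def recommend(user, ratings, similarity):
    """Rank items the user has not rated by a similarity-weighted sum of ratings."""
    own = ratings[user]
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = similarity(own, other_ratings)
        if sim <= 0.0:
            continue  # ignore users with no measurable likeness
        for item, rating in other_ratings.items():
            if item not in own:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    print(recommend("alice", RATINGS, cosine_similarity))
    print(recommend("alice", RATINGS, euclidean_similarity))
```

In this toy data, "bob" rates the shared books much as "alice" does, so his ratings carry more weight than "carol's" under either measure. An item-based variant would compare items by the users who rated them instead of comparing users by the items they rated.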
NLP
Natural language processing (NLP) is another area where machine learning excels, particularly in text categorization and understanding. The guest highlights the use of open-source tools such as Mahout and Lucene to solve NLP problems. He mentions his book, "Taming Text," which serves as an engineer's introduction to NLP and machine learning [5]. Machine learning in NLP involves organizing data into consumable forms, for example by classifying news articles into categories such as sports or politics [6].
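As an illustration of text categorization, the sketch below trains a tiny multinomial Naive Bayes classifier in plain Python and assigns short headlines to "sports" or "politics". The choice of Naive Bayes, the training sentences, and all function names are assumptions made for this example; the summary does not say which algorithm or data the guest had in mind, and this is not Mahout or Lucene code.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training headlines; invented for illustration only.
TRAINING = [
    ("the team won the match in the final minute", "sports"),
    ("the striker scored two goals this season", "sports"),
    ("parliament passed the budget vote", "politics"),
    ("the senator announced a new election campaign", "politics"),
]


def tokenize(text):
    """Lowercase and split on whitespace; a real system would do far more."""
    return text.lower().split()


def train(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log-probability under Naive Bayes."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # skip words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TRAINING)
    print(classify("the team scored in the match", model))
    print(classify("the senator won the election vote", model))
```

With only four training sentences the result is fragile; the point is the shape of the pipeline (tokenize, count, score each category), not the accuracy.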
Related Episodes
Episode 479: Luis Ceze on the Apache TVM Machine Learning Compiler
Episode 157: Hadoop with Philip Zeyliger
Episode 286: Katie Malone, Intro to Machine Learning
Episode 115: Architecture Analysis
Episode 493: Ram Sriharsha on Vectors in Machine Learning
Episode 191: Massively Open Online Courses
Episode 206: Ken Collier on Agile Analytics
Episode 398: Apache Kudu with Adar Leiber Dembo
Episode 188: Requirements in Agile Projects
Episode 395: Katharine Jarmul on Security and Privacy in Machine Learning
Episode 22: Feedback
Episode 549: William Falcon, Optimizing Deep Learning Models
Episode 436: Apache Samza with Yi Pan
Episode 127: Usability with Joachim Machate
Episode 116: The Semantic Web with Jim Hendler