Kirill Eremenko is a Data Science coach and lifestyle entrepreneur. The goal of the Super Data Science podcast is to bring you the most inspiring Data Scientists and Analysts from around the world to help you build your successful career in Data Science. Data is growing exponentially and so are salaries of those who work in analytics. This podcast can help you learn how to skyrocket your analytics career. Big Data, visualization, predictive modeling, forecasting, analysis, business processes, statistics, R, Python, SQL programming, Tableau, machine learning, Hadoop, databases, data science MBAs, and all the analytics tools and skills that will help you better understand how to crush it in Data Science.
760: Humans Love A.I.-Crafted Beer
AI-crafted beer, machine learning for passion projects, and self-taught data science: Jon Krohn and Beau Warren’s hotly anticipated, data-driven, punny lager Krohn&Borg is finally given a taste test in this week’s Five-Minute Friday. Heading to the Species X brewery in Columbus, Ohio, Jon Krohn and Beau Warren launched the beer that had been predicted, optimized and developed by a machine-learning model.
Additional materials: www.superdatascience.com/760
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
2/23/2024 • 6 minutes, 31 seconds
759: Full Encoder-Decoder Transformers Fully Explained, with Kirill Eremenko
Encoders, cross attention and masking for LLMs: SuperDataScience Founder Kirill Eremenko returns to the SuperDataScience podcast, where he speaks with Jon Krohn about transformer architectures and why they are a new frontier for generative AI. If you’re interested in applying LLMs to your business portfolio, you’ll want to pay close attention to this episode!
This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/), by Oracle NetSuite business software (netsuite.com/superdata), and by Intel and HPE Ezmeral Software Solutions (http://hpe.com/ezmeral/chatbots). Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• How decoder-only transformers work [15:51]
• How cross-attention works in transformers [41:05]
• How encoders and decoders work together (an example) [52:46]
• How encoder-only architectures excel at understanding natural language [1:20:34]
• The importance of masking during self-attention [1:27:08]
Additional materials: www.superdatascience.com/759
2/20/2024 • 1 hour, 43 minutes, 13 seconds
758: The Mamba Architecture: Superior to Transformers in LLMs
Explore the groundbreaking Mamba model, a potential game-changer in AI that promises to outpace the traditional Transformer architecture with its efficient, linear-time sequence modeling.
Additional materials: www.superdatascience.com/758
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
2/16/2024 • 8 minutes, 12 seconds
757: How to Speak so You Blow Listeners' Minds, with Cole Nussbaumer Knaflic
Explore mind-blowing storytelling with Cole Nussbaumer Knaflic in this episode. Audience favorite and author of "Storytelling with You," Cole returns to share essential tips for crafting impactful presentations, emphasizing narrative construction and audience engagement. Learn how to effectively communicate data and stories, enhancing your presentations with insights from a leading expert in the field.
This episode is brought to you by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• How to become a confident communicator [11:59]
• How to get rid of filler words [26:32]
• How facts alone can't make a strong impact [41:44]
• Cole's overview of her book Storytelling with You [55:19]
• How to craft an effective presentation [1:00:24]
• Common mistakes in virtual presentations [1:09:48]
• Cole's virtual presentation setup [1:15:33]
• Cole's next book Daphne Draws Data [1:20:23]
Additional materials: www.superdatascience.com/757
2/13/2024 • 1 hour, 29 minutes, 3 seconds
756: AlphaGeometry: AI is Suddenly as Capable as the Brightest Math Minds
AlphaGeometry, intuitive AI, and geometric deduction: In this week’s Five-Minute Friday, Super Data Science host Jon Krohn looks into developments from DeepMind, Google’s ground-breaking AI lab, and explores how this is a critical step towards a future of broadly accessible AI solutions across scientific disciplines.
Additional materials: www.superdatascience.com/756
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
2/9/2024 • 8 minutes, 45 seconds
755: Brewing Beer with A.I., with Beau Warren
ChatGPT applications and data-driven beer: Beer brewer and Super Data Science regular listener Beau Warren talks to Jon Krohn about the wonders of “sweaty ales”, how to brew beer with data, and how to get started on creative machine learning projects even without a degree in data science.
This episode is brought to you by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• About Species X [06:31]
• How to become a certified beer taster [12:37]
• How Beau checks the quality of his beer [25:01]
• Beau and Jon’s machine learning project [38:02]
• About genetic algorithms [52:35]
• How to get creativity out of LLMs [1:24:46]
Additional materials: www.superdatascience.com/755
2/6/2024 • 1 hour, 35 minutes, 43 seconds
754: A Code-Specialized LLM Will Realize AGI, with Jason Warner
Explore the future of coding with poolside co-founder and CEO Jason Warner as he discusses the potential of code-specialized LLMs and their revolutionary impact on the developer's role. Tune in for insights on the shift towards an AI-led development paradigm.
Additional materials: www.superdatascience.com/754
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
2/2/2024 • 37 minutes, 1 second
753: Blend Any Programming Languages in Your ML Workflows, with Dr. Greg Michaelson
Explore the future of collaborative ML workflows in this engaging episode with Dr. Greg Michaelson, Co-Founder of Zerve. Dr. Michaelson introduces the groundbreaking Zerve IDE and Pypelines project, addressing the critical gap in AutoML for commercial use and pinpointing why many A.I. projects don't meet their objectives. Gain insights into steering AI initiatives towards success and enhancing project communication, all in this insightful session.
This episode is brought to you by Oracle NetSuite business software (https://netsuite.com/superdata), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• Why Zerve IDE is so sorely needed [04:50]
• Pypelines: open-source AutoML in Python [30:00]
• Why most commercial A.I. projects fail and how to ensure they succeed [47:45]
• How AutoML will impact the role of the data scientist [53:21]
• Greg's background as a pastor and working at DataRobot [1:03:40]
• How to develop impressive communication and storytelling skills [1:16:16]
Additional materials: www.superdatascience.com/753
1/30/2024 • 1 hour, 26 minutes, 20 seconds
752: AI is Disadvantaging Job Applicants, But You Can Fight Back
Jon Krohn interviews Hilke Schellmann about the ethics of recruitment algorithms, the field’s current state of play, and what can be improved about AI used in recruiting.
Additional materials: www.superdatascience.com/752
Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.
1/26/2024 • 50 minutes, 56 seconds
751: How to Found and Fund Your Own A.I. Startup, with Dr. Rasmus Rothe
Venture capital and AI, and how to succeed with an AI company in 2024: Rasmus Rothe, Cofounder of Merantix, speaks to Jon Krohn about the Merantix campus in Berlin, how a venture capitalist identifies the best AI startups, the surefire ways for AI company founders to raise venture capital, and the jobs that are most and least vulnerable to disruption by automation.
This episode is brought to you by Oracle NetSuite business software (netsuite.com/superdata), by QuickChat customized AI assistants (https://quickchat.ai), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• How Merantix started [05:17]
• How Merantix works and how to apply for funding [08:19]
• How to secure AI funding [21:02]
• How AI companies can prove competitiveness [33:46]
• Ensuring AI regulation [41:17]
• How AI will change the future of work [56:56]
Additional materials: www.superdatascience.com/751
1/23/2024 • 1 hour, 18 minutes, 29 seconds
750: How A.I. is Transforming Science
Explore the transformative power of AI in science. Jon Krohn reviews the groundbreaking AI-driven discoveries at MIT and beyond, showcasing how AI is reshaping various scientific fields, from pharmaceuticals to climate science, and pondering the balance between AI's capabilities and human ingenuity.
Additional materials: www.superdatascience.com/750
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
1/19/2024 • 9 minutes, 41 seconds
749: Data Science for Clean Energy, with Emily Pastewka
Data science for clean energy takes center stage as Emily Pastewka from Palmetto joins Jon Krohn this week, exploring innovative paths to a sustainable future. This episode covers the impact of AI on smart energy choices, the creation of a smart grid, and the wide array of professionals required to bring cleantech data solutions to life.
This episode is brought to you by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• Emily on her Master's in Deep Learning [08:20]
• Using AI to solve clean energy challenges at Palmetto [17:22]
• The different roles needed to solve cleantech problems [27:33]
• How econometrics impacts consumer decision-making [38:56]
• How Emily manages high-performing teams [56:30]
• The tools and technologies that drive small teams [1:06:58]
Additional materials: www.superdatascience.com/749
1/16/2024 • 1 hour, 16 minutes, 54 seconds
748: The Five Levels of AGI
Artificial General Intelligence gets a new definition: This episode introduces Google DeepMind’s paper, “Levels of AGI: Operationalizing Progress on the Path to AGI”. Hear how its authors have organized narrow and general AI into hierarchical categories defined by human capability, from Level 0 (no AI) and Level 1 (equal to or somewhat better than an unskilled human) to Level 5 (able to outperform 100% of humans). A scary thought? Or a vision of a better future? Host Jon Krohn details the strengths of this research in this Five-Minute Friday.
Additional materials: www.superdatascience.com/748
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
1/12/2024 • 11 minutes, 20 seconds
747: Technical Intro to Transformers and LLMs, with Kirill Eremenko
Attention and transformers in LLMs, the five stages of data processing, and a brand-new Large Language Models A-Z course: Kirill Eremenko joins host Jon Krohn to explore what goes into well-crafted LLMs, what makes Transformers so powerful, and how to succeed as a data scientist in this new age of generative AI.
This episode is brought to you by Intel and HPE Ezmeral Software Solutions (https://hpe.com/ezmeral/chatwithyourdata), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• Supply and demand in AI recruitment [08:30]
• Kirill and Hadelin's new course on LLMs, “Large Language Models (LLMs), Transformers & GPT A-Z” [15:37]
• The learning difficulty in understanding LLMs [19:46]
• The basics of LLMs [22:00]
• The five building blocks of transformer architecture [36:29]
- 1: Input embedding [44:10]
- 2: Positional encoding [50:46]
- 3: Attention mechanism [54:04]
- 4: Feedforward neural network [1:16:17]
- 5: Linear transformation and softmax [1:19:16]
• Inference vs training time [1:29:12]
• Why transformers are so powerful [1:49:22]
Additional materials: www.superdatascience.com/747
1/9/2024 • 2 hours, 6 minutes, 31 seconds
746: A Continuous Calendar for 2024
Jon’s continuous calendar for 2024 is here! Now in an updated format, learn about its unique layout and benefits, and how it can revolutionize your planning for the new year.
Additional materials: www.superdatascience.com/746
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
1/5/2024 • 2 minutes, 49 seconds
745: 2024 Data Science Trend Predictions
2024 data science trends take the spotlight in this special episode, where Jon joins Sadie St. Lawrence to analyze last year's predictions and delve into the emerging technologies reshaping the field. From AI hardware accelerators to the transformative role of large language models, this episode is a treasure trove of insights for anyone interested in the future of data science.
This episode is brought to you by CloudWolf (www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• Reviewing predictions for 2023 [05:56]
• Sadie's trend predictions for 2024 [20:49]
- 1: Hardware evolution [21:17]
- 2: LLMOS [35:30]
- 3: Slow-thinking model [48:18]
- 4: Tool consolidation [54:46]
- 5: Workforce upheaval [58:06]
• Jon's predictions [1:06:26]
- 1: AI bubble bursting [1:08:11]
- 2: Breakthroughs in Edge AI [1:12:22]
• Sadie on her productivity planner [1:17:50]
Additional materials: www.superdatascience.com/745
1/2/2024 • 1 hour, 30 minutes, 9 seconds
744: To a Peaceful 2024
2023: A year of great movement and change. Technological developments have rocketed generative AI’s capabilities into the stratosphere of possibilities for future approaches to work, health, and play. Host Jon Krohn recognizes the benefits we have seen over the past year, discusses the important role we all have in ensuring ethics remains at the core of AI development and use, and he ends the year with a musical surprise for his listeners!
Additional materials: www.superdatascience.com/744
Interested in sponsoring a SuperDataScience Podcast episode? Visit http://passionfroot.me/superdatascience for sponsorship information.
12/29/2023 • 6 minutes, 13 seconds
743: How to Integrate Generative A.I. Into Your Business, with Piotr Grudzień
Chatbots, large language models and generative AI: Founder of Quickchat AI Piotr Grudzień believes the key to any successful AI platform is to ensure it can be tailored to a company’s specific needs. He speaks to host Jon Krohn about helping clients generate realistic and satisfying conversations that help their customer base find what they need quickly.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit http://passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• About Quickchat AI and how it works [02:46]
• How to successfully set up a conversational AI [23:58]
• What “temperature” is in the context of AI [38:38]
• How the LLM landscape has changed in recent years [40:24]
• The future of generative AI [57:43]
• The advantages of an AI accelerator [1:09:38]
Additional materials: www.superdatascience.com/743
12/26/2023 • 1 hour, 19 minutes, 20 seconds
742: Happy Holidays from All of Us
Join us on a brief journey through the AI world in 2023. A year ago, GPT-3.5 crafting our holiday message was a marvel, but now, with GPT-4's arrival, we're seeing an even more astounding evolution in AI. As we wave goodbye to the trend of generative AI, the Super Data Science Podcast team is bringing a personal touch back. Tune in for our heartfelt Happy Holidays message and a big thank you to all our listeners for your unwavering support.
Additional materials: www.superdatascience.com/742
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
12/22/2023 • 2 minutes, 24 seconds
741: How to Visualize Data Effectively, with Prof. Alberto Cairo
Data visualization remains at the forefront as Dr. Alberto Cairo from the University of Miami guides us beyond numerical figures, exploring the art of weaving compelling narratives through data. In his book, "The Art of Insight," he reveals the varied motivations driving visualization experts and highlights the serene, meditative process inherent in crafting visualizations. Emphasizing the fusion of scientific principles and personal style for effective data communication, Dr. Cairo also discusses with Jon the impending impact of AI on both interactive and static graphics.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, by HPE Ezmeral Software Solutions (https://hpe.com/ezmeral/chatwithyourdata), and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Alberto's book, The Art of Insight [04:07]
• How to transform data into engaging visuals [07:06]
• What it takes to enter a meditation-like flow state when creating visualizations [11:21]
• How to balance the science of visualization with one’s personal style [29:29]
• The importance of Smart Brevity for great data visualizations [37:32]
• How data visualization can drive social change [42:31]
• How diversity in designers enriches the field [52:07]
• The future of data visualizations [59:10]
Additional materials: www.superdatascience.com/741
12/19/2023 • 1 hour, 18 minutes, 12 seconds
740: Q*: OpenAI's Rumored AGI Breakthrough
Sam Altman’s exit and rehiring, AGI, and OpenAI’s Q*: In this week’s Five-Minute Friday, Jon Krohn peeks behind the curtains of OpenAI, where development of the world’s first model that can solve complex, nonlinear logical problems, Q*, might be well underway. This episode casts light on the rumors behind OpenAI’s Q*, what its emergence could mean for the future of AI, and the controversies already surrounding an agent that has not yet reached the market.
Additional materials: www.superdatascience.com/740
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
12/15/2023 • 11 minutes, 15 seconds
739: AI is Eating Biology and Chemistry, with Dr. Ingmar Schuster
AI protein design, machine learning and cancer care, and pharmaceuticals: At Exazyme, CEO and Co-Founder Ingmar Schuster uses AI to design proteins. He speaks with Jon Krohn about their wider applications in pharmaceuticals and chemistry, how kernel methods make the design of synthetic biological catalysts more efficient, and when to use shallow machine learning over deep learning.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• On designing proteins with AI [03:14]
• Designing proteins at Exazyme [08:22]
• About kernel methods [18:10]
• The importance of human-led approaches in protein research [35:44]
• Europe’s focus on AI regulation [43:45]
• Deep vs shallow in AI [59:35]
• How a background in academia helps with entrepreneurship [1:09:17]
Additional materials: www.superdatascience.com/739
12/12/2023 • 1 hour, 21 minutes, 42 seconds
738: Engineering Biomaterials with Generative AI, with Dr. Pierre Salvy
Bioengineering and Generative AI converge under the visionary leadership of Dr. Pierre Salvy at Cambrium GmbH, propelling material science into uncharted territories. He sits down with Jon Krohn live at Merantix A.I. Campus in Berlin to discuss how he's transforming material design, exemplified by his swift development of NovaColl, a vegan collagen crafted within two years.
Additional materials: www.superdatascience.com/738
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
12/8/2023 • 22 minutes, 43 seconds
737: scikit-learn's Past, Present and Future, with scikit-learn co-founder Dr. Gaël Varoquaux
scikit-learn co-founder Gaël Varoquaux and Jon Krohn are live at the historic Sorbonne in Paris, where they discuss the evolution of scikit-learn. From its origins as a memory-efficient Python implementation of support vector machines to its present-day status as a pivotal resource in machine learning, Gaël paints a vivid picture of its remarkable growth. Join us for a glimpse into scikit-learn's evolution, the realm of open-source collaboration, and the transformative power of data-driven insights in today's dynamic data landscape.
This episode is brought to you by Gurobi (gurobi.com/sds), the Decision Intelligence Leader, by Data Universe (https://datauniverse2024.com), the out-of-this-world data conference, and by CloudWolf (www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The early beginnings and growth of scikit-learn [05:34]
• Development principles of scikit-learn [18:05]
• How to apply scikit-learn to your ML problem [21:16]
• Resource-efficiency and scikit-learn development [25:32]
• How to contribute to an open-source project like scikit-learn yourself [38:21]
• The future of scikit-learn [51:13]
• Gaël on the social-impact data projects in his Soda lab [1:02:33]
• Why domain expertise and statistical rigor are more important than ever [1:11:24]
Additional materials: www.superdatascience.com/737
12/5/2023 • 1 hour, 30 minutes, 4 seconds
736: How to Officially Certify your AI Model, with Jan Zawadzki
AI certification and EU regulation: Jan Zawadzki, CTO and Co-Managing Director of Certif.ai, talks to Jon Krohn about the future of certification for AI startups and keeping within rigorous international regulations.
Additional materials: www.superdatascience.com/736
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
12/1/2023 • 14 minutes, 43 seconds
735: A.I. Product Management, with Google DeepMind's Head of Product, Mehdi Ghissassi
Artificial General Intelligence, AlphaGo, and Google DeepMind: Jon Krohn speaks to Mehdi Ghissassi, Director of Product Management at Google DeepMind, about the ethics and social impact of AI, keeping up with AI releases with safety in mind, and other pressing AI problems that keep him awake at night. In this episode, Mehdi and Jon also take a broader look at the current AI landscape, the opportunities for AI investors and startups, and what AI product managers need to get ahead.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• How DeepMind seeks to ‘solve intelligence’ [05:14]
• The impact of AGI’s capabilities on medicine [16:37]
• How the general public might come to apply future AI systems [28:09]
• How working on product development for Africa has shaped Mehdi’s perspective on AI’s potential and challenges [37:17]
• How to stay on top of rapid changes in AI [39:17]
• What investors look for in AI startups [59:16]
• Tips for product managers [1:03:34]
Additional materials: www.superdatascience.com/735
11/28/2023 • 1 hour, 21 minutes, 44 seconds
734: Humanoid Robot Soccer, with the Dutch RoboCup Team
Robot Soccer takes center stage as Jon Krohn and Dário Catarrinho, Secretary of the Dutch Nao Team and an AI student at the University of Amsterdam, discuss the intricate machine learning that enables robots to navigate the field, make decisions in real-time, respond to sound, and compete against each other in a gripping display of skill and strategy.
Additional materials: www.superdatascience.com/734
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
11/24/2023 • 17 minutes, 7 seconds
733: OpenAssistant: The Open-Source ChatGPT Alternative, with Dr. Yannic Kilcher
Yannic Kilcher, a leading ML YouTuber and DeepJudge CTO, teams up with Jon Krohn this week to delve into the open-source ML community, the technology powering Yannic’s Swiss-based startup, and the significant implications of adversarial examples in ML. Tune in as they also unpack Yannic's approach to tracking ML research, future AI prospects and his startup challenges.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• About the OpenAssistant project [03:39]
• Alignment issues in open-source vs closed-source [08:36]
• Alternative formulas vital for crafting superior LLMs [20:29]
• Strategies to foster open-source LLM ecosystems [27:07]
• Yannic's pioneering work in legal document processing at DeepJudge [31:31]
• Comprehensive overview of adversarial examples [1:04:02]
• The future AI landscape [1:18:08]
• Startup challenges [1:25:35]
Additional materials: www.superdatascience.com/733
11/21/2023 • 1 hour, 40 minutes, 19 seconds
732: Data Science for Astronomy, with Dr. Daniela Huppenkothen
It’s cloudy with a chance of machine learning models at the University of Amsterdam’s astronomy department, as Jon Krohn meets guest Daniela Huppenkothen for a wide-ranging discussion about building instrumentation for telescopes, collecting data from outer space and how to sort astronomy’s problem of enormous amounts of data.
Additional materials: www.superdatascience.com/732
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
11/17/2023 • 44 minutes, 27 seconds
731: A.I. Agents Will Develop Their Own Distinct Culture, with Nell Watson
Ethics and machine intelligence pioneer Nell Watson speaks to host Jon Krohn about the differences between AI ethics and AI safety, how crying wolf may result in future complications for AI development and the importance of ensuring IEEE standards to mitigate and regulate AI risks. She also touches on what she considers a “second Enlightenment”, in which we may start to form intimate relationships with AI—to both parties’ benefit.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• AI ethics and AI safety [05:30]
• How "moving fast" could break the world [18:07]
• The shifting relationship between humans and machines [29:54]
• International ethics standards, and their review process [52:10]
• Current and future ethical standards [1:05:31]
• Building a universal basic income with AI [1:19:23]
Additional materials: www.superdatascience.com/731
11/14/2023 • 1 hour, 28 minutes, 54 seconds
730: How GitHub Operationalizes AI for Teamwide Collaboration and Productivity
In this episode, Kyle Daigle, COO of GitHub, joins Jon Krohn to discuss the transformative impact of generative AI tools like GitHub Copilot. Learn how these tools streamline software development, enhance collaboration, and accelerate code reviews. Discover innovative approaches to collaboration and innersourcing, reshaping the future of teamwork in the digital age.
Additional materials: www.superdatascience.com/730
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
11/10/2023 • 18 minutes, 59 seconds
729: Universal Principles of Intelligence (Across Humans and Machines), with Prof. Blake Richards
Dr. Blake Richards discusses the world of AI and human cognition this week. Learn about the essence of intelligence, the ways AI research informs our understanding of the human brain, and discover the potential future scenarios where AI and humanity might intersect.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Blake's research and his take on intelligence [09:56]
• How we can evaluate progress in artificial general intelligence [15:54]
• Blake's thoughts on biomimicry [20:57]
• Why Blake thinks the fears regarding AI are overdone [25:38]
• The most effective strategies to mitigate AI fears without hindering innovation [35:31]
• What steps can we take to ensure that AI supports human flourishing [45:23]
• The importance of interpreting neuroscience data through the lens of ML [55:08]
• Backpropagation, gradient descent and the brain [1:17:32]
Additional materials: www.superdatascience.com/729
11/7/2023 • 1 hour, 46 minutes, 11 seconds
728: Use Contrastive Search to get Human-Quality LLM Outputs
Learn how to achieve human-like outputs from LLMs in this week’s Five-Minute Friday with Jon Krohn. Understand the various current methods available to decode and generate text, as well as the differences between them. Find out about greedy search, beam search, sampling, and contrastive search, and how you can use them to create incredibly useful LLMs.
Additional materials: www.superdatascience.com/728
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
11/3/2023 • 5 minutes, 44 seconds
727: Unmasking A.I. Injustice, with Dr. Joy Buolamwini
Coded bias, intersectionality in AI, and computer vision: Founder of the Algorithmic Justice League Joy Buolamwini talks to host Jon Krohn about the impact of exclusion and inclusion in datasets, the need to address intersectionality when identifying racial, age, or gender-based prejudice in machine learning tools, protections for artists and creative practitioners against AI, and the role that AI may have in combating systemic racism.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What coded bias is [06:49]
• The problem with bias in machine learning datasets [18:41]
• The Incoding Movement [42:08]
• About the Pilot Parliaments Benchmark [52:07]
• Ethics and the future of AI [1:20:10]
• The potential for AI to end systemic racism [1:32:59]
Additional materials: www.superdatascience.com/727
10/31/2023 • 1 hour, 45 minutes, 21 seconds
726: Seven Factors for Successful Data Leadership
Ben Jones, CEO of Data Literacy, discusses the seven crucial components of effective data leadership. From ethics to technology and fostering a data-centric culture, Jones provides actionable insights and practical examples. Tune in to empower your organization with purposeful and ethical data strategies from day one.
Additional materials: www.superdatascience.com/726
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
10/27/2023 • 32 minutes, 55 seconds
725: Neuroscience + Machine Learning, with Google DeepMind's Dr. Kim Stachenfeld
Dr. Kim Stachenfeld, Research Scientist at Google DeepMind and Affiliate Professor at Columbia University, delves into the realms of AI and neuroscience as she discusses computer-based simulations of the human brain, the efficiency of language in compression, and the neuroscience theories shaping the future of artificial intelligence. Discover the secrets behind memory formation, cognitive enhancement, and the potential of Artificial General Intelligence (AGI) in this thought-provoking episode.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, and by ODSC (https://odsc.com), the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The importance of simulations in the context of human intelligence [05:44]
• The basic approach to simulating human intelligence or physical systems [09:30]
• Will simulations help us realize AGI? [37:21]
• The cross-disciplinary potential of LLMs [40:20]
• The special role of our brain’s hippocampus in memory formation [1:05:15]
• Kim's research on reinforcement learning and neural representation [1:15:02]
• Compression in representation learning [1:38:51]
• What skills should an aspiring computational neuroscientist hone [1:50:30]
Additional materials: www.superdatascience.com/725
10/24/2023 • 1 hour, 58 minutes, 56 seconds
724: Decoding Speech from Raw Brain Activity, with Dr. David Moses
In this Friday episode, host Jon Krohn talks to UCSF’s David Moses about BRAVO (Brain-Computer Interface Restoration of Arm and Voice), a study led by Edward Chang and Karunesh Ganguly that helps patients who have lost the ability to speak to communicate once again via a speech neuroprosthesis. Postdoctoral engineer David Moses, who is a part of BRAVO, reveals the data and machine learning models that help BRAVO predict the words and facial expressions that a paralyzed patient is trying to form via their brain activity, crucially helping patients to communicate with medical practitioners and loved ones.
Additional materials: www.superdatascience.com/724
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
10/20/2023 • 42 minutes, 30 seconds
723: Mathematical Optimization, with Jerry Yurchisin
Mathematical optimization should be known to every data scientist: Jon Krohn speaks to Jerry Yurchisin, Data Science Strategist at Gurobi, the decision-making technology and best-kept secret of 80% of America’s leading enterprises.
This episode is brought to you by the Zerve data science dev environment (https://zerve.ai), by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, by ODSC (https://odsc.com), the Open Data Science Conference, and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What mathematical optimization is [04:27]
• How Gurobi solver works [29:01]
• How to use Gurobi with Python [36:08]
• Coding and algebra resources [41:14]
• When to use mathematical optimization and machine learning together [54:23]
• Using mathematical optimization in natural language processing [1:01:00]
Additional materials: www.superdatascience.com/723
10/17/2023 • 1 hour, 37 minutes, 33 seconds
722: AI Emits Far Less Carbon Than Humans (Doing the Same Task)
This episode delves into an intriguing research paper from top institutions like UC Irvine and MIT, analyzing the carbon emissions of AI-driven writing and illustrating versus traditional human methods. The findings might surprise you. Is AI the more eco-friendly option? Listen now to explore this compelling intersection of technology and sustainability.
Additional materials: www.superdatascience.com/722
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
10/13/2023 • 7 minutes, 45 seconds
721: Quantum Machine Learning, with Dr. Amira Abbas
Dr. Amira Abbas, Quantum Computing Researcher at the University of Amsterdam, explores the captivating world of Quantum Machine Learning. Learn about the distinct characteristics of qubits and the vital processes of Quantum ML. For those keen on exploring further, Amira suggests noteworthy ML tools to kickstart your journey in Quantum Computing.
This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, by ODSC (https://odsc.com), the Open Data Science Conference, and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Quantum computing vs classical computing [03:42]
• What is quantum entanglement [11:45]
• What is a qubit [15:07]
• The best problems for quantum ML [30:08]
• Three distinct steps in quantum ML and its potential [39:06]
• Quantum neural networks [49:03]
• What Amira's working on at the moment [1:10:20]
• How to get started in quantum ML [1:21:06]
• Amira's recommended ML tools for quantum computing [1:30:39]
Additional materials: www.superdatascience.com/721
10/10/2023 • 1 hour, 42 minutes, 47 seconds
720: OpenAI’s DALL-E 3, Image Chat and Web Search
DALL-E may no longer be playing second fiddle to Midjourney, thanks to OpenAI’s latest model for generative AI art, DALL-E 3. Host Jon Krohn breaks down how the newest model goes beyond producing incredible artistic images and follows your written brief to the letter.
Additional materials: www.superdatascience.com/720
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
10/6/2023 • 12 minutes, 5 seconds
719: Computational Mathematics and Fluid Dynamics, with Prof. Margot Gerritsen
In this episode, Margot Gerritsen and Jon Krohn discuss the fundamentals of computational mathematics and its application in studying fluid dynamics. Margot also talks about how her synesthesia led to a lifelong interest in math, using computational mathematics to predict airflow, and why it is so important that underrepresented groups in data science become more visible through organizations like Women in Data Science.
This episode is brought to you by the Zerve data science dev environment (https://zerve.ai), by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, and by ODSC (https://odsc.com), the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• About computational mathematics and its relation to data science [03:19]
• Margot’s current research into emissions simulation [15:05]
• Computational Mathematics: Real-World Applications [33:18]
• The importance of wind tunnels in testing designs [47:54]
• The beauty of linear algebra [1:05:59]
• Synesthesia: Seeing Numbers as Colors [1:16:33]
• About Women in Data Science [1:24:59]
Additional materials: www.superdatascience.com/719
10/3/2023 • 1 hour, 47 minutes, 47 seconds
718: ChatGPT Custom Instructions: A Major, Easy Hack for Data Scientists
Elevate your ChatGPT game with a useful custom instruction. Tune in to hear Jon’s trick for maximizing ChatGPT’s potential.
Additional materials: www.superdatascience.com/718
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
9/29/2023 • 5 minutes, 2 seconds
717: Overcoming Adversaries with A.I. for Cybersecurity, with Dr. Dan Shiebler
Dr. Dan Shiebler, Head of ML at Abnormal Security, joins Jon Krohn this week and unveils the intricacies of cybercrime detection and email protection, and the role of AI in future challenges.
This episode is brought to you by Grafbase (https://grafbase.com), the unified data layer, by ODSC (https://odsc.com/), the Open Data Science Conference, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The heuristic and “intermediate” ML models that they develop at Abnormal Security [07:08]
• How Dan uses LLMs at Abnormal Security [15:46]
• How false negatives are individually the biggest classification error to avoid in cybersecurity [20:49]
• How head-to-head competitor analysis helps refine models [34:34]
• Resilient ML in cybersecurity [38:36]
• Abnormal Security’s routine for updating their models [52:37]
• AI's impact on the urban world [1:09:57]
• How to stay updated in data science and AI [1:13:46]
Additional materials: www.superdatascience.com/717
9/26/2023 • 1 hour, 20 minutes, 48 seconds
716: Happiness and Life-Fulfillment Hacks
Jon Krohn's 94-year-old grandmother, Annie, who's bursting with life and wisdom, shares her recipe for lifelong happiness and explains how relationships and daily intentions play an integral role. Annie also shares her curious take on modern technology. Get inspired by her infectious joy and perspective on life.
Additional materials: www.superdatascience.com/716
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
9/22/2023 • 13 minutes, 52 seconds
715: Make Better Decisions with Data, with Dr. Allen Downey
Join us as Dr. Allen Downey, renowned author and professor, shares insights from his upcoming book 'Probably Overthinking It,' breaking down underused techniques like Survival Analysis, explaining common paradoxes, and discussing the dynamic Overton Window.
This episode is brought to you by the Zerve data science dev environment (https://zerve.ai), by Modelbit (https://modelbit.com), for deploying models in seconds, and by Grafbase (https://grafbase.com), the unified data layer. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Why interpreting data is not always easy [06:21]
• What is Survival Analysis [15:32]
• Preston's Paradox [22:09]
• Are you Normal? [36:52]
• How to better prepare for rare “Black Swan” events [42:48]
• What is an Overton Window? [53:06]
• What is the base rate fallacy? [1:23:31]
• How to protect yourself from biased samples [1:33:39]
• Simpson’s Paradox [1:42:43]
Additional materials: www.superdatascience.com/715
9/19/2023 • 1 hour, 55 minutes, 46 seconds
714: Using A.I. to Overcome Blindness and Thrive as a Data Scientist
In this Friday episode, guest Tim Albiges explores with host Jon Krohn how people with blindness can have a lucrative and fulfilling career in data science, how Tim’s PhD thesis applied machine learning to help diagnose chronic respiratory diseases, and the communication tools that blind people can use to live a full and independent life.
Additional materials: www.superdatascience.com/714
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
9/15/2023 • 36 minutes, 49 seconds
713: Llama 2, Toolformer and BLOOM: Open-Source LLMs with Meta's Dr. Thomas Scialom
Artificial General Intelligence, RLHF’s application in AI, and how entrepreneurs can enter the AI industry: Meta’s AI Research Scientist Thomas Scialom gives us behind-the-scenes insights into developing Llama 2 and what’s in the works for Llama 3. With host Jon Krohn, he discusses the future of Artificial General Intelligence, why the Galactica science-focused LLM was taken down, and what he learned from it.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au), by Grafbase (https://grafbase.com), the unified data layer, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Llama 2: Behind the Scenes of Today’s Top Open-Source LLM [05:04]
• Responsible use of Llama 2 [15:26]
• Toolformer: LLM That Learns How to Use External Tools [24:57]
• Galactica: The Science-Specific LLM and Why It Was Brought Down [36:57]
• Is AGI Around the Corner? [57:03]
• Advice for AI entrepreneurs [1:05:46]
• How Thomas develops and manages large-scale AI projects [1:14:42]
Additional materials: www.superdatascience.com/713
9/12/2023 • 1 hour, 25 minutes, 35 seconds
712: Code Llama
Code Llama might just be starting a revolution in how data scientists code. In this Five-Minute Friday, host Jon Krohn investigates the suite of models under the free-to-use Code Llama and how to find the best fit for your project’s needs.
Additional materials: www.superdatascience.com/712
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
9/8/2023 • 6 minutes, 48 seconds
711: Image, Video and 3D-Model Generation from Natural Language, with Dr. Ajay Jain
In this episode, host Jon Krohn explores with his guest Ajay Jain, Co-Founder of Genmo.ai, how creative general intelligence could take the video industry by storm. They also discuss the models that got Genmo to this point, the applications of NeRF, and how understanding human psychology is so essential to developing models that output high-fidelity video.
This episode is brought to you by the Zerve data science dev environment (https://zerve.ai), by Grafbase (https://grafbase.com), the unified data layer, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• About Genmo.ai and the term “creative general intelligence” [03:47]
• Why Ajay started Genmo.ai [09:26]
• The increased performance of multimodal models [21:12]
• All about Denoising Diffusion Probabilistic Models (DDPMs) [31:03]
• The application of Neural Radiance Fields (NeRF) [55:26]
• Predicting pedestrian behavior at Uber [1:01:50]
• How to save money in the process of training models [1:12:42]
Additional materials: www.superdatascience.com/711
9/5/2023 • 1 hour, 26 minutes, 3 seconds
710: LangChain: Create LLM Applications Easily in Python
Discover the power of Large Language Models with Kris Ograbek as he unravels the intricacies of LangChain and showcases a chatbot in action, all while putting our host Jon Krohn in the hot seat!
Additional materials: www.superdatascience.com/710
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
9/1/2023 • 1 hour, 3 minutes, 13 seconds
709: Big A.I. R&D Risks Reap Big Societal Rewards, with Meta's Dr. Laurens van der Maaten
Meta's Senior Research Director, Dr. Laurens van der Maaten, takes center stage to unravel the captivating realm of AI innovation. Learn about his groundbreaking contributions, including pioneering the t-SNE dimensionality reduction technique and harnessing AI for novel protein synthesis, climate change mitigation, and wearable materials simulation. Join us to explore the transformative power of AI across diverse domains and gain a glimpse into its future societal implications.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au), by Modelbit (https://modelbit.com), for deploying models in seconds, and by Grafbase (https://grafbase.com), the unified data layer. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Large-scale learning of image recognition models on web data [05:05]
• Evolutionary Scale Modeling protein models [16:45]
• Fighting climate change by building an A.I. model [29:49]
• The CrypTen privacy-preserving ML framework [38:36]
• Concerns about adversarial examples [53:25]
• Laurens’ t-SNE algorithm [58:56]
• How to make a big impact [1:07:25]
Additional materials: www.superdatascience.com/709
8/29/2023 • 1 hour, 20 minutes, 39 seconds
708: ChatGPT Code Interpreter: 5 Hacks for Data Scientists
On this week’s Five-Minute Friday, host Jon Krohn gives five reasons why he is so excited about ChatGPT’s Code Interpreter and walks listeners through its capabilities with a practical example.
Additional materials: www.superdatascience.com/708
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
8/25/2023 • 22 minutes, 45 seconds
707: Vicuña, Gorilla, Chatbot Arena and Socially Beneficial LLMs, with Prof. Joey Gonzalez
LLM Vicuña, Chatbot Arena, and the race to increase LLM context windows: This episode’s guest Joey Gonzalez talks to Jon Krohn about developing models and platforms that leverage and improve LLMs, as well as the future of AI development and access.
This episode is brought to you by the AWS Insiders Podcast (https://pod.link/1608453414), by Modelbit (https://modelbit.com), for deploying models in seconds, and by Grafbase (https://grafbase.com), the unified data layer. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Vicuña: How the revolutionary LLM came to be [03:35]
• Chatbot Arena: The leading LLM leaderboard [09:47]
• Trusting LLM results [17:54]
• Gorilla: The open-source ChatGPT plugin alternative [32:13]
• About LMSYS and long context windows [47:48]
• Open- vs closed-source LLMs: Which is better? [1:01:39]
• Aqueduct [1:16:49]
• Founding GraphLab [1:27:02]
• How AI will positively impact society in the coming decades [1:33:23]
Additional materials: www.superdatascience.com/707
8/22/2023 • 1 hour, 47 minutes, 15 seconds
706: Large Language Model Leaderboards and Benchmarks
In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena.
Additional materials: www.superdatascience.com/706
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
8/18/2023 • 33 minutes, 27 seconds
705: Feeding the World with ML-Powered Precision Agriculture
Join Jon Krohn as he chats with Syngenta Group's Feroz Sheikh, Jeremy Groeteke, and Thomas Jung about the digital revolution in agriculture. Learn how data science is transforming farming, from precision techniques to global food solutions. A compelling blend of tech and nature.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is precision agriculture? [09:43]
• What is computational agronomy? [12:30]
• How Syngenta helps growers optimize yields [21:37]
• How to bridge the gap between R&D and the real world [33:58]
• What is generative chemistry? [37:52]
• How generative chemistry accelerates the discovery of new compounds [41:55]
• How you could make a big social impact in agriculture with data science [56:22]
• How to go about designing ML models for agriculture [1:00:27]
Additional materials: www.superdatascience.com/705
8/15/2023 • 1 hour, 29 minutes, 11 seconds
704: Jon’s “Generative A.I. with LLMs” Hands-on Training
Take on the world of GPT and learn to develop your own, commercially successful Large Language Models (LLMs) with Jon Krohn’s comprehensive, guided training video for generative AI. Get to grips with the technology, learn which tools to use, and find out how to get an eye for business-viable models with Jon’s (ad-)free educational video.
Additional materials: www.superdatascience.com/704
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
8/11/2023 • 4 minutes, 54 seconds
703: How Data Happened: A History, with Columbia Prof. Chris Wiggins
Statistics history, interdisciplinarity, and data and society: Chris Wiggins talks with Jon Krohn about the power dynamics of data, the transformation of the field of biology through data-driven approaches to genetic sequencing, and the New York Times’ data science team’s cutting-edge approach to accommodating its tech stack.
This episode is brought to you by the AWS Insiders Podcast (https://pod.link/1608453414) and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The importance of the humanities in data science [09:18]
• How data science “rearranges” power [17:19]
• An overview of How Data Happened [20:36]
• The controversial nature of Bayes theorem [29:16]
• Why we need to consider data ethics [34:00]
• How biology came to adopt data science into its field [45:44]
• The data science tech stack at the New York Times [49:18]
Additional materials: www.superdatascience.com/703
8/8/2023 • 1 hour, 9 minutes, 20 seconds
702: Llama 2 — It's Time to Upgrade your Open-Source LLM
This week, Jon Krohn is examining Meta's newly released open-source large language model, Llama 2, highlighting its commercial prospects, immense capacity, model variety, and unique 'time awareness' feature. He also discusses its innovative two-stage RLHF approach that enhances its performance.
Additional materials: www.superdatascience.com/702
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
8/4/2023 • 10 minutes, 56 seconds
701: Generative A.I. without the Privacy Risks (with Prof. Raluca Ada Popa)
Dr. Raluca Ada Popa, renowned computer scientist, entrepreneur, and President of Opaque Systems, joins Jon Krohn to share her insights on securely interacting with AI APIs like OpenAI's GPT-4, the pros and cons of open vs. closed-source AI development, and the seamless operation of compute pipelines across multiple clouds.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is a confidential computing platform? [04:31]
• How to get started with confidential computing [12:10]
• The challenges of confidential computing and LLMs [21:11]
• How to safeguard your data while using commercial LLMs like GPT-4 [38:00]
• Open-source vs closed-source [52:28]
• Raluca's PreVail cybersecurity company [1:01:50]
• Combining entrepreneurship and academic career [1:04:03]
• DARE Program [1:10:39]
Additional materials: www.superdatascience.com/701
8/1/2023 • 1 hour, 21 minutes, 27 seconds
700: "The Dream of Life" by Alan Watts
Yoga and Hindu mythology: This special episode continues the thread of our centenary episodes, SDS 500: Yoga Nidra with Jes Allen and SDS 600: Yoga Nidra Practice with Steve Fazzari, which talked through guided meditation techniques to help improve posture and sleep and to expand consciousness. Inspired by these sessions, host Jon Krohn explores Hindu mythology via Alan Watts’ “The Dream of Life”.
Additional materials: www.superdatascience.com/700
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
7/28/2023 • 4 minutes, 31 seconds
699: The Modern Data Stack, with Harry Glaser
Model deployment, data warehouse options for running models, and how to best leverage BI tools: Harry Glaser and Jon Krohn discuss Modelbit’s capabilities for automatically turning ML models in notebooks into production-ready models, reducing the time and effort spent ‘translating’ information from one mode to another. Harry’s conversation with host Jon Krohn expands on the importance of automating this task, and on how developments in ML modeling have given entire teams access to data analysis, whatever their level of expertise.
This episode is brought to you by the AWS Insiders Podcast (https://pod.link/1608453414). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What the modern data stack is [03:28]
• Version control for data scientists [13:30]
• CI/CD, load balancing and logging [20:38]
• Snowflake vs. Redshift [30:10]
• How tools like Looker and Tableau help monitor models [35:26]
Additional materials: www.superdatascience.com/699
7/25/2023 • 50 minutes, 46 seconds
698: How Firms Can Actually Adopt A.I., with Rehgan Avon
Company-wide AI adoption can take a lot of persuasion. Rehgan Avon talks to host Jon Krohn about why AI has become necessary for forward-thinking businesses and the steps to implement AI in an institution so that everyone benefits.
Additional materials: www.superdatascience.com/698
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
7/21/2023 • 27 minutes, 42 seconds
697: The (Short) Path to Artificial General Intelligence, with Dr. Ben Goertzel
AI visionary and CEO of SingularityNET Dr. Ben Goertzel provides a deep dive into the possible realization of Artificial General Intelligence (AGI) within 3-7 years. Explore the intriguing connections between self-awareness, consciousness, and the future of Artificial Super Intelligence (ASI) and discover the transformative societal changes that could arise.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au), by the AWS Insiders Podcast (https://pod.link/1608453414), and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Decentralized and benevolent AGI [03:13]
• The SingularityNET ecosystem [13:10]
• Dr. Goertzel's vision for realizing AGI - combining DL with neuro-symbolic systems, genetic algorithms and knowledge graphs [25:50]
• How reaching AGI will trigger Artificial Super Intelligence [38:51]
• Dr. Goertzel's approach to AGI using OpenCog Hyperon [42:34]
• Why Dr. Goertzel believes AGI will be positive for humankind [53:07]
• How to ensure the AGI is benevolent [1:06:43]
• How AGI or ASI may act ethically [1:13:50]
Additional materials: www.superdatascience.com/697
7/18/2023 • 1 hour, 27 minutes, 12 seconds
696: Brain-Computer Interfaces and Neural Decoding, with Prof. Bob Knight
Jon Krohn welcomes Professor Dr. Bob Knight to explore human intelligence, the prefrontal cortex, and the transformative potential of brain implants for data collection. Discover the pivotal role of machine learning in treating Parkinson's and delve into exciting future advancements.
Additional materials: www.superdatascience.com/696
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
7/14/2023 • 1 hour, 2 minutes, 45 seconds
695: NLP with Transformers, feat. Hugging Face's Lewis Tunstall
What are transformers in AI, and how do they help developers to run LLMs efficiently and accurately? This is a key question in this week’s episode, where Hugging Face’s ML Engineer Lewis Tunstall sits down with host Jon Krohn to discuss encoders and decoders, and the importance of continuing to foster democratic environments like GitHub for creating open-source models.
This episode is brought to you by the AWS Insiders Podcast (https://pod.link/1608453414), by https://WithFeeling.ai, the company bringing humanity into AI, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What a transformer is, and why it is so important for NLP [04:34]
• Different types of transformers and how they vary [11:39]
• Why it’s necessary to know how a transformer works [31:52]
• Hugging Face’s role in the application of transformers [57:10]
• Lewis Tunstall’s experience of working at Hugging Face [1:02:08]
• How and where to start with Hugging Face libraries [1:18:27]
• The necessity to democratize ML models in the future [1:25:25]
Additional materials: www.superdatascience.com/695
7/11/2023 • 1 hour, 38 minutes, 4 seconds
694: CatBoost: Powerful, efficient ML for large tabular datasets
Modeling tabular data and spreadsheets doesn’t have to be tedious with CatBoost’s open-source tree-boosting algorithm. CatBoost does what it says on the tin, blending categories with boosting, which allows you to train your models faster and handle large datasets for ML tasks across multiple GPUs. In this week’s Five-Minute Friday, host Jon Krohn gets to grips with the technical components of CatBoost that give it the speed and accuracy so acclaimed by its users.
Additional materials: www.superdatascience.com/694
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
7/7/2023 • 7 minutes, 59 seconds
693: YOLO-NAS: The State of the Art in Machine Vision, with Harpreet Sahota
Harpreet Sahota, a data science expert and deep learning developer at Deci AI, joins Jon Krohn to explore the fascinating realm of object detection and the revolutionary YOLO-NAS model architecture. Discover how machine vision models have evolved and the techniques driving compute-efficient edge device applications.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au), by https://WithFeeling.ai, the company bringing humanity into AI, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is machine vision? [07:02]
• Object detection and YOLO architectures [13:00]
• Deci's YOLO-NAS: Optimal object detection model architecture [23:39]
• Developer Relations [1:00:16]
• Harpreet's 'top-down' approach to learning Deep Learning [1:06:50]
Additional materials: www.superdatascience.com/693
7/4/2023 • 1 hour, 20 minutes, 15 seconds
692: Lossless LLM Weight Compression: Run Huge Models on a Single GPU
Join Jon as he navigates listeners through the innovative SpQR approach—a cutting-edge, lossless LLM weight compression technique that harnesses the power of quantization. Tune in as Jon delves into the four steps behind this groundbreaking method in this week's episode.
Additional materials: www.superdatascience.com/692
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
6/30/2023 • 7 minutes, 39 seconds
691: A.I. Accelerators: Hardware Specialized for Deep Learning
GPUs vs CPUs, chip design and the importance of chips in AI research: This highly technical episode is for anyone who wants to learn what goes into chip development and how to get into the competitive industry of accelerator design. With advice from expert guest Ron Diamant, Senior Principal Engineer at AWS, you’ll get a breakdown of the need-to-know technical terms, what chip engineers need to think about during the design phase and what the future holds for processing hardware.
This episode is brought to you by Posit, the open-source data science company (https://posit.co), by the AWS Insiders Podcast (https://pod.link/1608453414), and by https://WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What CPUs and GPUs are [05:29]
• The differences between accelerators used for deep learning [14:31]
• Trainium and Inferentia: AWS's A.I. Accelerators [22:10]
• If model optimizations will lead to lower demand for hardware to process them [43:14]
• How a chip designer goes about production [48:34]
• Breaking down the technical terminology for chips (accelerator interconnect, dynamic execution, collective communications) [55:29]
• The importance of AWS Neuron, a software development kit [1:15:42]
• How Ron got his foot in the door with chip design [1:26:40]
Additional materials: www.superdatascience.com/691
6/27/2023 • 1 hour, 34 minutes, 34 seconds
690: How to Catch and Fix Harmful Generative A.I. Outputs
Krishna Gade, the founder and CEO of Fiddler.AI, discusses the challenges faced by Large Language Models (LLMs) in Generative AI, including inaccuracies, biases, and privacy risks. He emphasizes the importance of monitoring to build trust in AI and highlights Fiddler's explainability algorithms and pre-built bias detection tools as vital solutions.
Additional materials: www.superdatascience.com/690
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
6/23/2023 • 26 minutes, 14 seconds
689: Observing LLMs in Production to Automatically Catch Issues
Arize's Amber Roberts and Xander Song join Jon Krohn this week, sharing invaluable insights into ML Observability, drift detection, retraining strategies, and the crucial task of ensuring fairness and ethical considerations in AI development.
This episode is brought to you by Posit, the open-source data science company (https://posit.co), by AWS Inferentia (go.aws/3zWS0au), and by Anaconda, the world's most popular Python distribution (https://superdatascience.com/anaconda). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is ML Observability [05:07]
• What is Drift [08:18]
• The different kinds of model drift [15:31]
• How frequently production models should be retrained [25:15]
• Arize's open-source product, Phoenix [30:49]
• How ML Observability relates to discovering model biases [50:30]
• Arize case studies [57:13]
• What is a developer advocate [1:04:51]
Additional materials: www.superdatascience.com/689
6/20/2023 • 1 hour, 18 minutes, 1 second
688: Six Reasons Why Building LLM Products Is Tricky
Prompt injection, prompt engineering, context windows, and more: In this week’s Five-Minute Friday, Jon explains why anyone looking to build their own product leveraging LLMs should stop to consider these and three more issues before jumping in. Phillip Carter first outlined these six issues in his article “All the Hard Stuff Nobody Talks About when Building Products with LLMs”.
Additional materials: www.superdatascience.com/688
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
6/16/2023 • 14 minutes, 10 seconds
687: Generative Deep Learning, with David Foster
Autoencoders, transformers, latent space: Learn the elements of generative AI and hear what data scientist David Foster has to say about the potential for generative AI in music, as well as the role that world models play in blending generative AI with reinforcement learning.
This episode is brought to you by Posit, the open-source data science company (https://posit.co), by Anaconda, the world's most popular Python distribution (superdatascience.com/anaconda), and by https://WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Generative modeling vs discriminative modeling [04:21]
• Generative AI for Music [13:12]
• On the threats of AI [23:15]
• Autoencoders Explained [38:36]
• Noise in Generative AI [48:11]
• What CLIP models are (Contrastive Language-Image Pre-training) [54:07]
• What World Models are [1:00:40]
• What a Transformer is [1:11:14]
• How to use transformers for music generation [1:19:50]
Additional materials: www.superdatascience.com/687
6/13/2023 • 1 hour, 46 minutes, 33 seconds
686: Open-Source "Responsible A.I." Tools, with Ruth Yakubu
Microsoft’s Ruth Yakubu joins Jon Krohn to discuss Responsible AI principles and the open-source Responsible AI Toolbox, allowing users to assess their models for fairness, inclusiveness, privacy, explainability, accountability, and reliability before deployment.
Additional materials: www.superdatascience.com/686
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
6/9/2023 • 29 minutes, 58 seconds
685: Tools for Building Real-Time Machine Learning Applications, with Richmond Alake
Richmond Alake, a Machine Learning Architect at Slalom Build, sits down with Jon to share real-time ML insights, tools and career experiences for a high-energy, high-impact episode. From his work at Slalom Build to his two AI startups, discover the software choices, ML tools, and front-end development techniques used by a leader in the field.
This episode is brought to you by Posit, the open-source data science company (https://posit.co), by AWS Inferentia (go.aws/3zWS0au), and by https://WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is a Machine Learning Architect? [03:09]
• Richmond's startups [12:07]
• Why Richmond started a podcast [29:51]
• Richmond's new course on feature stores [38:05]
• Why Richmond produces data science content [43:25]
• Why All Data Scientists Should Write [51:30]
Additional materials: www.superdatascience.com/685
6/6/2023 • 1 hour, 6 minutes, 19 seconds
684: Get More Language Context out of your LLM
Open-source LLMs, FlashAttention and generative AI terminology: Host Jon Krohn gives us the lift we need to explore the next big steps in generative AI. Listen to the specific way in which Stanford University’s “exact attention” algorithm, FlashAttention, could become a competitor for GPT-4’s capabilities.
Additional materials: www.superdatascience.com/684
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
6/2/2023 • 5 minutes, 49 seconds
683: Contextual A.I. for Adapting to Adversaries, with Dr. Matar Haller
Monitoring malicious, user-generated content; contextual AI; adapting to novel evasion attempts: Matar Haller speaks to Jon Krohn about the challenges of identifying, analyzing and flagging malicious information online. In this episode, Matar explains how contextual AI and a “database of evil” can help resolve the multiple challenges of blocking dangerous content across a range of media, even those that are live-streamed.
This episode is brought to you by Posit, the open-source data science company (posit.co), by Anaconda, the world's most popular Python distribution (superdatascience.com/anaconda), and by https://WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• How ActiveFence helps its customers to moderate platform content [05:36]
• How ActiveFence finds extreme social media users trying to evade detection [16:32]
• How to monitor live-streaming content and analyze it for dangerous material [29:13]
• The technologies ActiveFence uses to run its platform [35:54]
• Matar’s experience of the Insight Fellows Program (Data Science Fellowship) [40:28]
• Leadership opportunities for women in STEM [1:00:41]
• Israel’s R&D edge for AI [1:13:19]
Additional materials: www.superdatascience.com/683
5/30/2023 • 1 hour, 20 minutes, 35 seconds
682: Business Intelligence Tools, with Mico Yuk
In this week's episode, Mico Yuk, host of 'Analytics on Fire', joins Jon Krohn to share her effective business intelligence and analytics framework, BIDS, for persuading key decision makers. She crowns one "power" tool as the analytics king and discusses emerging tools that could challenge its dominance. Tune in for unapologetic insights on future and current BI trends and happenings from the world of BI and analytics.
Additional materials: www.superdatascience.com/682
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
5/26/2023 • 27 minutes, 36 seconds
681: XGBoost: The Ultimate Classifier, with Matt Harrison
Unlock the power of XGBoost by learning how to fine-tune its hyperparameters and discover its optimal modeling situations. This and more, when best-selling author and leading Python consultant Matt Harrison teams up with Jon Krohn for yet another jam-packed technical episode! Are you ready to upgrade your data science toolkit in just one hour? Tune in now!
This episode is brought to you by Pathway, the reactive data processing framework (pathway.com/?from=superdatascience), by Posit, the open-source data science company (posit.co), and by Anaconda, the world's most popular Python distribution (superdatascience.com/anaconda). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Matt's book ‘Effective XGBoost’ [07:05]
• What is XGBoost [09:09]
• XGBoost's key model hyperparameters [19:01]
• XGBoost's secret sauce [29:57]
• When to use XGBoost [34:45]
• When not to use XGBoost [41:42]
• Matt’s recommended Python libraries [47:36]
• Matt's production tips [57:57]
Additional materials: www.superdatascience.com/681
5/23/2023 • 1 hour, 12 minutes, 1 second
680: Automating Industrial Machines with Data Science and the Internet of Things (IoT)
Industrial machinery’s dependence on data science, tech stacks to build IoT platforms, and transitioning from data science to product: This week’s Friday episode with Allegra Alessi explores the minutiae of product ownership for the Internet of Things at packaging company Bobst. Join host Jon Krohn and his guest as they unpack how the IoT is leading factory production.
Additional materials: www.superdatascience.com/680
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
5/19/2023 • 30 minutes, 25 seconds
679: The A.I. and Machine Learning Landscape, with investor George Mathew
Generative AI, MLOps, and making smart investments in AI: This week’s episode is critical listening for AI investors and generative AI creators. AI investor George Mathew talks with host Jon Krohn about the emerging generative AI stack, the critical elements of MLOps to ensure a scalable model, and the tools developers can use for a saleable product.
This episode is brought to you by Posit, the open-source data science company (posit.co), by AWS Inferentia (https://go.aws/3zWS0au), and by Anaconda, the world's most popular Python distribution (superdatascience.com/anaconda). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Venture capital’s role in the technology startup ecosystem [05:59]
• How RLHF helps UI become more intuitive [12:53]
• The four layers of the generative AI stack [34:16]
• The risks for generative AI business founders and investors [46:50]
• How MLOps drive best practices and help implementation [56:33]
• The importance of PLG (Product-Led Growth) [1:04:15]
• How generative AI tools will impact the labor market [1:17:34]
Additional materials: www.superdatascience.com/679
5/16/2023 • 1 hour, 34 minutes, 14 seconds
678: StableLM: Open-source "ChatGPT"-like LLMs you can fit on one GPU
StableLM, the new family of open-source language models from the brilliant minds behind Stable Diffusion, is out! Small but mighty, these models have been trained on an unprecedented amount of data for single-GPU LLMs. This week, Jon breaks down the mechanics of this model–see you there!
Additional materials: www.superdatascience.com/678
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
5/12/2023 • 11 minutes, 39 seconds
677: Digital Analytics with Avinash Kaushik
How does one use marketing analytics to drive business success? Avinash Kaushik, Chief Strategy Officer at Croud and former Sr. Director of Global Strategic Analytics at Google, joins Jon Krohn live for an exciting episode that covers the transformative power of AI, his 'four clusters of intent' framework and the value of hands-on data tools.
This episode is brought to you by Pathway, the reactive data processing framework (https://pathway.com/?from=superdatascience), by Posit, the open-source data science company (https://posit.co), and by Anaconda, the world's most popular Python distribution (https://superdatascience.com/anaconda). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is a chief strategy officer? [03:55]
• Brand vs performance analytics [07:23]
• Incrementality-centric marketing [32:53]
• Avinash's time at Google [37:54]
• How to maintain human-touch with AI [48:58]
• Four clusters of intent framework [1:11:28]
• Avinash's most significant career challenges [1:17:18]
Additional materials: www.superdatascience.com/677
5/9/2023 • 1 hour, 27 minutes, 54 seconds
676: The Chinchilla Scaling Laws
Chinchilla AI, and fine-tuning proprietary tasks with large language models: On this week’s Five-Minute Friday, host Jon Krohn outlines the principles of the Chinchilla Scaling Laws, the incredible power of models such as Cerebras-GPT based on these laws, and the impact of scaling on the number of viable applications and commercial use cases.
Additional materials: www.superdatascience.com/676
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
5/5/2023 • 13 minutes, 27 seconds
675: Pandas for Data Analysis and Visualization
Wrangling data in Pandas, when to use Pandas, Matplotlib or Seaborn, and why you should learn to create Python packages: Jon Krohn speaks with guest Stefanie Molin, author of Hands-On Data Analysis with Pandas.
This episode is brought to you by Posit, the open-source data science company (https://posit.co), and by AWS Inferentia (https://go.aws/3zWS0au). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The advantages of using pandas over other libraries [07:55]
• Why data wrangling in pandas is so helpful [12:05]
• Stefanie’s Data Morph library [24:27]
• When to use pandas, matplotlib, or seaborn [33:45]
• Understanding the ticker module in matplotlib [36:48]
• Where data analysts should start their learning journey [40:08]
• What it’s like being a software engineer at Bloomberg [51:19]
Additional materials: www.superdatascience.com/675
5/2/2023 • 1 hour, 8 minutes, 40 seconds
674: Parameter-Efficient Fine-Tuning of LLMs using LoRA (Low-Rank Adaptation)
Models like Alpaca, Vicuña, GPT4All-J and Dolly 2.0 have relatively small model architectures, but they're prohibitively expensive to train even on a small amount of your own data. The standard model-training protocol can also lead to catastrophic forgetting. In this week's episode, Jon explores a solution to these problems, introducing listeners to Parameter-Efficient Fine-Tuning (PEFT) and the leading approach: Low-Rank Adaptation (LoRA).
Additional materials: www.superdatascience.com/674
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
4/28/2023 • 5 minutes, 27 seconds
673: Taipy, the open-source Python application builder
Vincent Gosselin, CEO and co-founder of Taipy, an open-source Python library, joins Jon Krohn to discuss how to accelerate productivity in Python and build scalable, reusable, and maintainable data pipelines. Gosselin shares his breadth of wisdom honed over his decades-long AI career.
This episode is brought to you by Pathway, the reactive data processing framework (https://pathway.com/?from=superdatascience), and by Posit, the open-source data science company (https://posit.co/academy). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The Taipy library functionality [02:59]
• The future of data pipelines [21:40]
• Common trends of companies that are successful at adopting data pipelines [28:31]
• How no-code and low-code trends impact the data science lifecycle [33:00]
• How Vincent chose the programming languages that underpin Taipy [41:40]
• Common trends on how companies manage their data to learn from it [45:06]
• Vincent's perspective on AI winters [51:03]
Additional materials: www.superdatascience.com/673
4/25/2023 • 1 hour, 12 minutes, 1 second
672: Open-source "ChatGPT": Alpaca, Vicuña, GPT4All-J, and Dolly 2.0
Get started with language models: Learn about the commercial-use options available for your business in this week’s Five-Minute Friday, where host Jon Krohn discusses four models that have many of the capabilities of ChatGPT and can run at a fraction of the cost.
Additional materials: www.superdatascience.com/672
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
4/21/2023 • 16 minutes, 50 seconds
671: Cloud Machine Learning
Get to grips with AWS, Azure, and Google Cloud Platform on this week’s episode. Host Jon Krohn speaks with Kirill Eremenko and Hadelin de Ponteves about CloudWolf, a cloud computing educational platform that prepares students for certification in AWS (Amazon Web Services). Find out why an accreditation in cloud computing could be the safest investment for your data science career.
This episode is brought to you by Posit, the open-source data science company (https://posit.co/academy), and by AWS Inferentia (https://aws.amazon.com/ec2/instance-types/inf2/?trk=bbd10c3f-c200-4629-bca8-adf6ad324c9e&sc_channel=el). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• About CloudWolf [07:04]
• Why learning the cloud is important for data scientists [09:12]
• Is learning cloud computing complex? [22:30]
• Essential AWS services [28:31]
• Database options on AWS [33:47]
• How to run analytics on AWS [40:58]
• Why an AWS certification is so helpful [56:35]
Additional materials: www.superdatascience.com/671
4/18/2023 • 1 hour, 3 minutes, 13 seconds
670: LLaMA: GPT-3 performance, 10x smaller
How does Meta AI's natural language model, LLaMA, compare to the rest? Based on the Chinchilla scaling laws, LLaMA is designed to be smaller but more performant. But how exactly does it achieve this feat? It's all done by training a small model for a longer period of time. Discover how LLaMA compares to its competition, including GPT-3, in this week's episode.
Additional materials: www.superdatascience.com/670
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode, Jon Krohn welcomes Adrian Kosowski, Co-Founder and Chief Product Officer at Pathway, who shares insights on streaming data processing and reactive data processing, and how they're shaping the future of machine learning. Tune in now for an unforgettable episode.
This episode is brought to you by Posit, the open-source data science company (https://posit.co/academy), and by AWS Inferentia (https://aws.amazon.com/ec2/instance-types/inf2/?trk=bbd10c3f-c200-4629-bca8-adf6ad324c9e&sc_channel=el). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• About Pathway's reactive data processing framework [04:45]
• Reactive data processing use cases [17:08]
• What is the difference between batch and streaming processing [33:18]
• Transformers in data engineering and data streaming [53:44]
• The benefits of Adrian's technical background as a CPO [1:04:17]
• Adrian's responsibilities and favorite tools as a CPO [1:15:25]
• Emerging ML approaches and tools for startups [1:28:49]
Additional materials: www.superdatascience.com/669
4/11/2023 • 1 hour, 40 minutes, 59 seconds
668: GPT-4: Apocalyptic stepping stone?
AI risks, RLHF, and inner alignment: GPT stands to give the business world a major boost. But with everyone racing either to develop products that incorporate GPT or use it to carry out critical tasks, what dangers could lie ahead in working with a tool that applies essentially unknowable means (inner alignments) to reach its goals? This week’s guest Jérémie Harris speaks with Jon Krohn about the essential need for anyone working with GPT to understand the impact of a system comprising inner alignments that cannot – and may never – be fully understood.
Additional materials: www.superdatascience.com/668
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
4/7/2023 • 55 minutes, 47 seconds
667: Harnessing GPT-4 for your Commercial Advantage
GPT-4, augmenting human tasks with AI, and using GPT-4 commercially: Vin Vashishta speaks to host Jon Krohn about how to leverage GPT-4 and outperform your competitors in both speed and value. Learn how GPT-4 has outmatched its predecessors – and many skilled workers – in this latest iteration of large language models.
This episode is brought to you by Pathway, the reactive data processing framework (https://pathway.com/?from=superdatascience), by Posit, the open-source data science company (https://posit.co/academy), and by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Using GPT-4 to screen for jobs [06:26]
• A framework for improving systems with GPT [13:32]
• Teaming, tooling and collaborating with GPT-4 [29:58]
• How to accelerate data science with generative A.I. [45:36]
• How to prepare for opportunities with GPT-4 [52:09]
Additional materials: www.superdatascience.com/667
4/4/2023 • 1 hour, 4 minutes, 30 seconds
666: GPT-4
GPT-4 has landed! But how well does it compare to GPT-3.5? Tune in to hear Jon stack its performance against its predecessor–the results might just blow your mind.
Additional materials: www.superdatascience.com/666
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
3/31/2023 • 11 minutes, 51 seconds
665: How to be both socially impactful and financially successful in your data career
Angel investor and data science consultant Josh Wills sits down with Jon Krohn to discuss his former roles (Google, Slack, and Cloudera) and the essential skills for engineering scalable machine learning projects.
This episode is brought to you by Pathway, the reactive data processing framework (www.pathway.com/?from=superdatascience), and by epic LinkedIn Learning instructor Keith McCormick (https://linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Josh's 'Data Engineering for Machine Learning' course [06:50]
• Contextual bandits [10:52]
• Data quality and monitoring [16:45]
• The “infinite loop of sadness” in data product development [25:12]
• Josh’s definition of a data scientist [30:02]
• Josh's role at WeaveGrid [37:36]
• Management-Track vs Individual Contributor [48:47]
• Josh's work on the Covid pandemic [1:06:46]
• Josh’s favorite tech stack [1:11:13]
Additional materials: www.superdatascience.com/665
3/28/2023 • 1 hour, 27 minutes, 44 seconds
664: MIT Study: ChatGPT Dramatically Increases Productivity
Can ChatGPT make us better and faster in our work, and is it the future or just another fad? In this episode, Jon Krohn delves into a new study from MIT about the tool’s potential productivity for white-collar tasks.
Additional materials: www.superdatascience.com/664
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
3/24/2023 • 5 minutes, 15 seconds
663: Astonishing CICERO negotiates and builds trust with humans using natural language
NLP, transformer architectures, and machines beating humans at their own game: Jon Krohn talks to Alexander H. Miller about his work in building a machine that can outsmart humans in the game of Diplomacy by engineering powers of persuasion and collusion to its own advantage.
This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Training a natural language model to interact with Diplomacy players [05:07]
• Processing speeds for a Diplomacy bot [29:32]
• Using transformer architectures [37:25]
• How Diplomacy AI actually works [43:25]
• CICERO's potential real-world applications [55:28]
• How to R&D an AI project [59:27]
• How to become an AI Research Manager [1:06:12]
Additional materials: www.superdatascience.com/663
3/21/2023 • 1 hour, 17 minutes, 29 seconds
662: The Most Popular SuperDataScience Podcast Episodes of 2022
Our list of the top 10 SuperDataScience podcast episodes for 2022 is here. From Pandas to causality, AI breakthroughs and data storytelling, these were your most popular episodes of the year gone by.
Additional materials: www.superdatascience.com/662
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
3/17/2023 • 7 minutes, 50 seconds
661: Designing Machine Learning Systems
Chip Huyen, co-founder of Claypot AI and author of O'Reilly's best-selling "Designing Machine Learning Systems", is here to share her expertise on designing production-ready machine learning applications, the importance of iteration in real-world deployment, and the critical role of real-time machine learning in various applications. Technical listeners like data scientists and machine learning engineers will definitely enjoy this one!
This episode is brought to you by Pathway, the reactive data processing framework (https://www.pathway.com/?from=superdatascience), and by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Why Chip wrote 'Designing Machine Learning Systems' [08:58]
• How Chip ended up teaching at Stanford [13:18]
• About Chip's book 'Designing Machine Learning Systems' [21:12]
• What makes ML feel like magic [30:53]
• How to align business intent, context, and metrics with ML [37:55]
• The lessons Chip learned about training data [42:03]
• Chip's secrets to engineering good features [53:19]
• How Chip optimizes her productivity [1:07:48]
Additional materials: www.superdatascience.com/661
3/14/2023 • 1 hour, 16 minutes, 42 seconds
660: Five Ways to Use ChatGPT for Data Science
ChatGPT is well-known for its potential to disrupt the writing industry, but in what other, perhaps less explored, ways can we use the tool? In this episode, Jon Krohn outlines five critical ways that ChatGPT can augment a data scientist’s work. From generating code to acting as a translation tool for programming languages, listen in to hear why ChatGPT could become a vital part of every data scientist’s toolkit.
Additional materials: www.superdatascience.com/660
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
3/10/2023 • 3 minutes, 53 seconds
659: Open-Source Tools for Natural Language Processing
NLP practitioners: this episode is for you. From the awareness of linguistic elements and annotation to getting the necessary people in the room, Vincent Warmerdam presents to Jon Krohn a recipe for a successful project and the open-source NLP tools to get there.
This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (https://linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• How Vincent came to work with De Speld [08:57]
• Vincent’s role at Explosion [18:59]
• How users can apply spaCy [21:46]
• Prodigy: Annotate training data more efficiently with scripts [26:28]
• How to manage “skill anxiety” with Calmcode [32:32]
• How Vincent fixed bad labels [42:47]
• The value of understanding linguistics for NLP [54:42]
• How to constrain artificial stupidity [1:02:38]
Additional materials: www.superdatascience.com/659
3/7/2023 • 1 hour, 20 minutes, 57 seconds
658: How to Build Data and ML Products Users Love
What makes data products popular? Brian T. O'Neill, Founder and Principal of Designing for Analytics, returns to the podcast to help us crack the code on building data products that people love.
Additional materials: www.superdatascience.com/658
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
3/3/2023 • 35 minutes, 42 seconds
657: How to Learn Data Engineering
Data engineering educator Andreas Kretz joins Jon Krohn for a 1-hour primer that covers everything you need to know about the most in-demand role in data. From skills to tools, problem-solving processes and more, growing your knowledge of data engineering only improves your marketability, so tune in today if you're ready to future-proof your data career.
This episode is brought to you by Glean (https://glean.io), the platform for data insights, fast, and by epic LinkedIn Learning instructor Keith McCormick (https://linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Why learn data engineering? [06:55]
• What is data engineering? [08:08]
• What sets Senior Data Engineers apart from junior ones? [13:57]
• The must-know data-engineering tools [20:26]
• The right path to learn data engineering [44:24]
• Are certifications worth it? [51:46]
• The future of data engineering [55:24]
• Andreas's career challenges [58:48]
Additional materials: www.superdatascience.com/657
2/28/2023 • 1 hour, 9 minutes, 33 seconds
656: A.I. Talent and the Red-Hot A.I. Skills
How to attract an AI recruiter’s attention: In this episode, Jon Krohn and Tribe AI CEO Jaclyn Rice Nelson break down the key ingredients needed to make a Tribe AI recruiter say “yes!” Get Jaclyn’s top tips for forward-thinking AI talent, the skills you need to learn, and the in-demand roles on Tribe’s list of clients.
Additional materials: www.superdatascience.com/656
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2/24/2023 • 41 minutes, 51 seconds
655: AI ROI: How to get a profitable return on an AI-project investment
Transparent data science, profitable AI, and what’s missing from a data science education: Pandata’s Data Scientist in Residence Keith McCormick and Jon Krohn discuss how “insights” can never be the end product of a data science project, how to ensure you have a specific goal at the start of a project that is related to revenue, and why there is so much miscommunication between data scientists and their clients. Exclude the C-suite at your peril!
This episode is brought to you by Glean (https://glean.io), the platform for data insights, fast. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What an Executive Data Scientist in Residence is [05:27]
• What A.I. transparency is and how it relates to the field of Explainable A.I. (XAI) [17:34]
• How companies can ensure they profit from AI projects [36:47]
• Possible organization structures for data science teams to be profitable [1:02:41]
• The current gaps in data science education [1:09:58]
Additional materials: www.superdatascience.com/655
2/21/2023 • 1 hour, 43 minutes, 22 seconds
654: Mike Wimmer: The 14-Year-Old A.I. Entrepreneur
14-year-old AI prodigy Mike Wimmer joins Jon Krohn to discuss his latest projects. Whether he's using AI to help conserve the world's coral reefs or launching his new IoT-based company, Mike is an endless source of inspiration in the field of AI.
Additional materials: www.superdatascience.com/654
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2/17/2023 • 45 minutes, 25 seconds
653: Efficiently Glean-ing Insights from Vast Data Warehouses
Carlos Aguilar, the founder and CEO of Glean, a data exploration and visualization platform, knows a thing or two about starting and growing a tech startup. After recently raising a $7 million seed round, he sits down with Jon Krohn to dive into the makings of his platform and shares tips for building a great founding team and how to delight early adopters.
In this episode you will learn:
• How Glean extracts actionable insights from its clients' data warehouses [06:48]
• What sets Glean apart from other platforms [12:43]
• Glean's software stack [14:43]
• Glean's recent fundraising journey [24:56]
• The essential characteristics of a founding team [30:53]
• How Carlos founded Glean [36:56]
• Carlos's former role at Flatiron Health [40:49]
• How Carlos created a robotic painter [48:57]
Additional materials: www.superdatascience.com/653
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2/14/2023 • 57 minutes, 25 seconds
652: A.I. Speech for the Speechless
MedTech, communications technology and computer vision: In this Five-Minute Friday, Jon Krohn investigates the technology that allows patients who have lost the ability to speak because of medical ventilation to communicate clearly.
Additional materials: www.superdatascience.com/652
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2/10/2023 • 6 minutes, 17 seconds
651: The Intentional Use of Color in Data Communication
Data visualizations, color theories and color inclusivity: In this episode, Kate Strachnyi and host Jon Krohn discuss how color can make or break your data visuals, ways to make your charts and graphs more inclusive through color, and how Kate developed the tools and techniques to nail color for your data stories in her latest book, ColorWise: A Data Storyteller’s Guide to the Intentional Use of Color.
In this episode you will learn:
• What a “data storyteller” is [11:01]
• Why color use should always be intentional [12:52]
• Is color always necessary in data visualization? [29:41]
• Color selection tips for your data visuals [31:19]
• Three-color scales [34:54]
• How to respect individual cultures in the color choices you make [38:25]
• Best tools for data visualization [54:35]
Additional materials: www.superdatascience.com/651
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2/7/2023 • 1 hour, 16 minutes, 54 seconds
650: SparseGPT: Remove 100 Billion Parameters but Retain 100% Accuracy
SparseGPT is a noteworthy one-shot pruning technique that can halve the size of large language models like GPT-3 without adversely affecting accuracy. In this episode, Jon Krohn provides an overview of this development and explains its commercial and environmental implications.
Additional materials: www.superdatascience.com/650
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2/3/2023 • 7 minutes, 47 seconds
649: Introduction to Machine Learning
Looking for a short primer on Machine Learning concepts? SDS Founder Kirill Eremenko and AI expert Hadelin de Ponteves are back, joining Jon Krohn to review essential ML concepts, from classification errors to logistic regression, feature scaling, the elbow method and more. The popular data science instructors also introduce their latest course: Machine Learning in Python: Level 1.
In this episode you will learn:
• Kirill and Hadelin's new course [17:34]
• Supervised vs unsupervised learning [26:23]
• False positives and false negatives [31:21]
• Logistic regression [43:00]
• Holding out a set of test data [46:39]
• Feature scaling [52:45]
• The Adjusted R-Squared metric [59:44]
• The five assumptions of linear regression [1:05:12]
• The Elbow Method [1:11:41]
Additional materials: www.superdatascience.com/649
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
1/31/2023 • 1 hour, 22 minutes, 25 seconds
648: VALL-E: Uncannily Realistic Voice Imitation from a 3-Second Clip
Text-to-speech gets a groundbreaking update with Microsoft’s VALL-E. On this Five-Minute Friday, Jon Krohn investigates how the Microsoft team modeled their tool to replicate natural human speech using just three seconds of a person’s voice.
Additional materials: www.superdatascience.com/648
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
1/27/2023 • 9 minutes, 51 seconds
647: Is Data Science Still Sexy?
Knowledge management, trust of AI, and job automation: Tom Davenport speaks with Jon Krohn about the organizational obstacles to adopting AI, and why the C-suite also needs to learn how to handle data.
This episode is brought to you by Kolena (https://kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Cognitive bias in understanding AI [14:13]
• How AI will augment rather than replace human workers [24:27]
• OpenAI and regulatory action [35:13]
• Jobs that might be at risk of being automated [39:57]
• The potential of citizen science in accumulating and analyzing data [1:02:18]
• How AI will change the game for the C-suite [1:15:17]
Additional materials: www.superdatascience.com/647
1/24/2023 • 1 hour, 36 minutes, 44 seconds
646: ChatGPT: How to Extract Commercial Value Today
Are you still wondering how to get the most out of ChatGPT's game-changing technology? In this week's Five-Minute Friday guest episode, Jon Krohn sits down with longtime friend and e-commerce entrepreneur Zack Weinberg to discuss the practical applications of this incredible AI tool.
Additional materials: www.superdatascience.com/646
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
1/20/2023 • 34 minutes, 3 seconds
645: Machine Learning for Video Games
Machine learning, security and Call of Duty collide this week as Jon Krohn sits down with Carly Taylor, Lead Machine Learning Engineer for Activision's COD franchise, to discuss the importance of low latency, the future of gaming and her favorite software packages.
This episode is brought to you by Kolena (https://kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The relationship between data science and cyber security [4:49]
• The importance of low latency for an optimal gaming experience [9:15]
• The future of gaming [18:13]
• Carly's thoughts on the Metaverse [25:43]
• Carly’s favorite operating systems, software packages, and keyboards [30:27]
• How to transition from a quantitative academic background into data science [45:28]
• Why Carly is called the “Rebel Data Scientist” [53:27]
• How to file a patent [57:21]
Additional materials: www.superdatascience.com/645
1/17/2023 • 1 hour, 15 minutes, 13 seconds
644: A Framework for Big Life Decisions
Love and money matter in this week’s Five-Minute Friday, as Stanford University’s Myra Strober sits down with Jon Krohn to talk about her latest book, Money and Love, coauthored with Abby Davisson. In this unorthodox take on thinking with your head versus your heart, Myra and Abby address the life-changing impact that money and love have on each other and how to rethink this relationship to make better decisions.
Additional materials: www.superdatascience.com/644
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
1/13/2023 • 16 minutes, 31 seconds
643: A.I. for Medicine
AI prediction tools for antibodies and using statistics to prepare healthcare systems for pandemics: host Jon Krohn speaks with Chief Scientist of Biologics AI for Exscientia Charlotte Deane about the variety of potential partnerships between medicine and machine learning.
This episode is brought to you by Kolena (https://kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What does Biologics AI mean? [03:48]
• How to use AI to predict protein structures [07:37]
• What antibodies are [14:00]
• Personalized Medicine is slow but A.I. can speed it up [24:36]
• The future of predicting 4D protein structures [44:30]
• Applications of machine learning during the pandemic [53:27]
Additional materials: www.superdatascience.com/643
1/10/2023 • 1 hour, 20 minutes, 33 seconds
642: Continuous Calendar for 2023
Looking to shake up your data science productivity in 2023? Switching to a continuous calendar can make all the difference. Jon Krohn shares his new calendar with those taking their yearly, monthly and daily planning to the next level.
Additional materials: www.superdatascience.com/642
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
1/6/2023 • 2 minutes, 55 seconds
641: Data Science Trends for 2023
The top data science trends of 2023 are here. Sadie St. Lawrence joins Jon Krohn to share annual predictions on the future of AI. From the data mesh to multimodal models like ChatGPT, tune in to discover what's next.
This episode is brought to you by Kolena (https://kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• A recap of 2022 predictions [5:22]
• Our data science trend predictions for 2023:
- Data as a product [23:36]
- Multimodal A.I. models [32:26]
- The data mesh [42:49]
- Privacy & AI Trust [50:54]
- Environmental Sustainability [54:37]
• Sadie's goals for 2023 [1:16:04]
Additional materials: www.superdatascience.com/641
1/3/2023 • 1 hour, 31 minutes, 39 seconds
640: What I Learned in 2022
From AI trends to rediscovering how fun it is to work with colleagues ‘in person’, host Jon Krohn wraps up the year’s best SuperDataScience content and looks ahead to another year of interviews with the data science community’s brightest stars.
Additional materials: www.superdatascience.com/640
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
12/30/2022 • 37 minutes
639: Simplifying Machine Learning
Learning Python for beginners is made fun on Mariya Sha’s YouTube and Discord channels, on which she posts hacks, breakdowns and tutorials on everything to do with the world’s most important programming language. If you’re continually frustrated by the high base level at which many ML and Python courses seem to begin, this episode is a great jumping-off point for you.
This episode is brought to you by Kolena (https://kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Why Mariya was first interested in learning Python [04:44]
• The positive potential for future AI applications [12:02]
• Useful broadcasting software [23:09]
• The importance of productivity hacking in data science [34:13]
• The ethical problems of web scraping [38:45]
• Mariya’s favorite Python libraries [53:48]
• What excites Mariya about the future of NLP [1:13:53]
• Mariya’s favorite software tools [1:15:23]
Additional materials: www.superdatascience.com/639
12/27/2022 • 1 hour, 41 minutes, 4 seconds
638: ChatGPT Holiday Greeting
OpenAI's ChatGPT helps us generate a special holiday greeting this week. Tune in to hear the festive message that this impressive natural language generating algorithm churned out as we close out the year.
Additional materials: www.superdatascience.com/638
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
12/23/2022 • 3 minutes, 37 seconds
637: How to Influence Others with Your Data
It's all about data visualization this week as Jon Krohn welcomes Ann K. Emery, data visualization designer and owner of Depict Data Studio, to the show. If you want to learn data viz best practices, tips and tricks and reporting how-tos, make some time to tune in today!
This episode is brought to you by Kolena (https://kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What data storytelling is [03:40]
• Pinpoints of data visualization [10:38]
• Best practices for data visualization [23:41]
• Surprising spreadsheet tricks [30:51]
• When static dashboards are more effective than interactive ones [43:30]
• Ann's top tips for presenting data in a slideshow [48:07]
Additional materials: www.superdatascience.com/637
12/20/2022 • 1 hour, 7 minutes, 58 seconds
636: The Equality Machine
Digital literacy and data bias: Can one reduce or even eradicate the other? Law professor Orly Lobel speaks with SDS host Jon Krohn about Orly’s latest book, The Equality Machine, which offers an optimistic look into the future of AI and data mining.
Additional materials: www.superdatascience.com/636
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
12/16/2022 • 22 minutes, 13 seconds
635: The Perils of Manually Labeling Data for Machine Learning Models
Hand labeling data and information bias: Jon Krohn speaks with Watchful CEO Shayan Mohanty about the pitfalls of data analysis when bias comes into the equation (spoiler alert: it always does), the role of the Chomsky hierarchy in data management, and the importance of simulation engines for returning real-time results to users.
This episode is brought to you by Iterative (https://iterative.ai), your mission control center for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Why bias in general is good [04:06]
• The arguments against hand labeling [09:47]
• How Shayan solves the problem of labeling at his company [24:26]
• Misconceptions concerning hand-labeled data [43:25]
• What the Chomsky hierarchy is [52:38]
• Watchful’s high-performance simulation engine [1:04:51]
• What Shayan looks for in his new hires [1:08:15]
Additional materials: www.superdatascience.com/635
12/13/2022 • 1 hour, 18 minutes, 31 seconds
634: Model Error Analysis
Data scientist and author Serg Masís joins Jon Krohn for a Five-Minute Friday episode that touches on model error analysis. Learn how this process can improve your models and discover a helpful tool that expedites this critical process.
Additional materials: www.superdatascience.com/634
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
12/9/2022 • 6 minutes, 56 seconds
633: Responsible Decentralized Intelligence
This week's episode is all about Responsible Decentralized Intelligence as award-winning professor and tech entrepreneur, Dawn Song, joins Jon Krohn to help us explore this exciting topic in-depth.
This episode is brought to you by Iterative (https://iterative.ai), your mission control center for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is decentralized intelligence? [3:46]
• Dawn’s Responsible Data Economy collaboration with Meta AI [11:31]
• How homomorphic encryption, differential privacy, and multi-party computation can work together [16:22]
• How PrivateSQL makes differential privacy easy to use [22:54]
• The relationship between deep learning and federated learning [37:55]
• What is a responsible data economy? [42:13]
Additional materials: www.superdatascience.com/633
12/6/2022 • 53 minutes, 56 seconds
632: Liquid Neural Networks
Liquid neural networks are a type of bio-inspired machine learning set to make a huge impact in the field of data analytics. On this week’s Five-Minute Friday, Jon Krohn speaks with Pathway.com Co-Founder Dr. Adrian Kosowski about the development of this new type of network and what this means for the future of data.
Additional materials: www.superdatascience.com/632
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
12/2/2022 • 10 minutes, 46 seconds
631: Data Analytics Career Orientation
Interview success, funny memes about data, and stakeholder management: Jon Krohn speaks with Luke Barousse, a full-time YouTuber who produces content to help aspiring data scientists. First, Jon and his guest go underwater to find out how data science can help you while working on a submarine before they emerge onto Luke’s YouTube channel. There, he discloses all the helpful hacks for data science beginners—with a generous helping of humor! As founder of MacroFit, a data-driven company that helps with meal planning, Luke is no stranger to portion sizes…
This episode is brought to you by Iterative (https://iterative.ai), your mission control center for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Where Luke gets his inspiration for making YouTube videos [04:46]
• How Luke got into creating comedy skits [08:21]
• Luke’s favorite Python libraries for web scraping [14:41]
• Incorrect assumptions that aspiring data scientists make [15:54]
• The best time to use Power BI [19:15]
• The biggest mistakes Luke made in his data science career [22:17]
• Luke’s experience as a submariner and how it helped him in his data analyst career [38:13]
• The must-have skills for entry-level data analyst roles [43:46]
Additional materials: www.superdatascience.com/631
11/29/2022 • 58 minutes, 52 seconds
630: Resilient Machine Learning
Jon Krohn sits down with Dr. Dan Shiebler at the Open Data Science Conference (ODSC) to dive into the critical components of building resilient machine learning.
Additional materials: www.superdatascience.com/630
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
11/25/2022 • 6 minutes, 12 seconds
629: Software for Efficient Data Science
Has the term developer advocacy ever left you scratching your head? This week data science developer advocate for JetBrains, Dr. Jodie Burchell, joins Jon Krohn to shed light on her responsibilities and why it's a role you might want to consider. Jodie also dives into building reproducible data science workflows and the keys to working effectively with real-world data.
This episode is brought to you by Iterative (https://iterative.ai), the open-source company behind DVC. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Jodie’s background in psychology [2:19]
• Jodie's tips for real-world data preparation [6:52]
• Tour JetBrains' developer tools: PyCharm, DataSpell and Datalore [10:38]
• What is a data science developer advocate? [38:44]
• The books that Jodie's co-authored [46:15]
• Jodie's favorite Python libraries [58:30]
• How to have reproducible data science workflows [1:01:33]
Additional materials: www.superdatascience.com/629
11/22/2022 • 1 hour, 11 minutes, 16 seconds
628: The Critical Human Element of Successful A.I. Deployments
On this episode of Five-Minute Friday, Jon Krohn speaks from the Open Data Science Conference (ODSC). There, he sits down with author and data scientist Keith McCormick to discuss the conference’s key trend: learning the importance of trust in the relationship between humans and algorithms.
Additional materials: www.superdatascience.com/628
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
11/18/2022 • 5 minutes, 6 seconds
627: AutoML: Automated Machine Learning
Jon Krohn speaks with Erin LeDell, H2O.ai’s Chief Machine Learning Scientist. They investigate how AutoML supercharges the data science process, the importance of admissible machine learning for an equitable data-driven future, and what Erin’s group Women in Machine Learning & Data Science is doing to increase inclusivity and representation in the field.
This episode is brought to you by Datalore (https://datalore.online/SDS), the collaborative data science platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The H2O AutoML platform Erin developed [07:43]
• How genetic algorithms work [19:17]
• Why you should consider using AutoML [28:15]
• The “No Free Lunch Theorem” [33:45]
• What Admissible Machine Learning is [37:59]
• What motivated Erin to found R-Ladies Global and Women in Machine Learning and Data Science [47:00]
• How to address bias in datasets [57:03]
Additional materials: www.superdatascience.com/627
11/15/2022 • 1 hour, 30 minutes, 57 seconds
626: Subword Tokenization with Byte-Pair Encoding
Word tokenization, character tokenization and subword tokenization go head-to-head this week as Jon Krohn delivers a mini-bootcamp on the NLP-related process.
Additional materials: www.superdatascience.com/626
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
11/11/2022 • 6 minutes, 42 seconds
625: Analyzing Blockchain Data and Cryptocurrencies
Chainalysis' Director of Research, Kim Grauer, joins Jon Krohn to explore the state of economic-data analysis on the blockchain.
This episode is brought to you by Datalore (https://datalore.online/SDS), the collaborative data science platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Kim's role as Director of Research [5:02]
• The unique real-time economic-data analytics of the blockchain [13:07]
• How ML can predict patterns of criminal activity on the blockchain [18:56]
• Interesting use cases of ML for crime investigation [29:37]
• The tools and approaches Kim uses daily [47:44]
• The future of crypto, blockchains, and data science [50:54]
• Why a data science bootcamp helps people break into data science [53:42]
Additional materials: www.superdatascience.com/625
On this week’s Five-Minute Friday, Jon Krohn investigates Imagen Video, Google’s latest model for making video art out of text prompts. Recently published, this text-to-video model competes against established players on the scene like DALL-E 2. Unlike DALL-E 2, it returns moving images or time-based media. Tune in to hear Jon explain the technology that made Imagen Video the tech giant’s shiniest new tool to date.
Additional materials: www.superdatascience.com/624
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
11/4/2022 • 7 minutes, 27 seconds
623: Data Analyst, Data Scientist, and Data Engineer Career Paths
Jon Krohn speaks with Shashank Kalanithi, the man who makes a sport out of YouTube and data analytics out of sports. Listen in as he talks about how he got started producing YouTube videos on data science, the essential differences between data science roles, and how data could shape the future of the sports industry.
This episode is brought to you by Datalore (https://datalore.online/SDS), the collaborative data science platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What motivated Shashank to start his YouTube channel [04:31]
• The must-have technical skills for every data scientist [16:59]
• The soft skills needed for data science [20:52]
• The differences between data analyst, data scientist and data engineer [24:26]
• How data are currently being applied in the sports industry [38:38]
• The “needs” divide between digital native and traditional companies [45:34]
Additional materials: www.superdatascience.com/623
11/1/2022 • 1 hour, 11 minutes, 17 seconds
622: Burnout: Causes and Solutions
Is burnout on the horizon for you and your team? Christina Maslach, author of the new book "The Burnout Challenge," joins Jon Krohn to help us identify the common signs of looming burnout while steering us in a healthier direction.
Additional materials: www.superdatascience.com/622
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
10/28/2022 • 24 minutes
621: Blockchains and Cryptocurrencies: Analytics and Data Applications
Cryptocurrency and blockchain take center stage this week as we welcome Chief Economist at Chainalysis, Philip Gradwell, to discuss the data science applications in this exciting field.
This episode is brought to you by Datalore (https://datalore.online/SDS), the collaborative data science platform, by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts, and by Bunch (superdatascience.com/bunch), the AI driven leadership coach. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What the role of a chief economist entails [5:50]
• What are blockchains and cryptocurrency? [8:23]
• How analyzing cryptocurrencies differs from established fiat currencies [12:48]
• Philip's work at Chainalysis [26:07]
• Philip's crypto data analytics pipeline [34:48]
• How Philip develops data products for a wide range of users [46:18]
• How the blockchain facilitates innovative computing and machine learning technologies [51:52]
• What Philip looks for in the data scientists he hires [1:04:59]
Additional materials: www.superdatascience.com/621
What’s your secret to superb audio recognition? Whisper it. We mean that literally—Whisper is the latest in OpenAI’s growing suite of models aimed to benefit humanity. On this episode of Five-Minute Friday, host Jon Krohn reviews OpenAI’s latest model, Whisper. This tool will vastly improve the way human speech is recognized and converted to text. Jon gets under the hood to show how the team managed to get such a powerfully accurate recognition model. Listen to the episode and find out how you can try it yourself, for free!
Additional materials: www.superdatascience.com/620
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
10/21/2022 • 6 minutes, 34 seconds
619: Tools for Deploying Data Models into Production
Jon Krohn speaks with Erik Bernhardsson, the man who invented Spotify’s original music recommendation system. They address the different ways to interview a data science candidate, how to deploy a data model into the cloud, and the approach he took that made Spotify go from a digital music startup to an AI-driven streaming giant.
This episode is brought to you by Datalore (https://datalore.online/SDS), the collaborative data science platform, by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts, and by Bunch (superdatascience.com/bunch), the AI driven leadership coach. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The data problem that Erik’s company Modal Labs solves [04:32]
• Erik’s prolific blogging career [09:15]
• Opportunities for making data teams more efficient and productive [14:42]
• Erik’s views on interviewing data scientists and software developers [20:18]
• Erik’s tips and tricks for data science interviewees [31:35]
• How Erik built Spotify’s original music recommendation system [38:58]
• Applying vectors to other tools, and opportunities for working with vectors [47:45]
• Using Annoy to search across vectors [50:57]
• Building Python module Luigi for Spotify [55:20]
• The tools that Erik loves to work with [1:06:23]
Additional materials: www.superdatascience.com/619
10/18/2022 • 1 hour, 20 minutes, 33 seconds
618: The Joy of Atelic Activities
Telic and atelic activities take center stage this week as Jon Krohn contemplates how our daily actions contribute to our overall sense of fulfillment.
Additional materials: www.superdatascience.com/618
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
10/14/2022 • 3 minutes, 45 seconds
617: Causal Modeling and Sequence Data
Dr. Sean Taylor, Co-Founder and Chief Scientist of Motif Analytics, joins Jon Krohn this week for yet another perspective on causal modeling. Tune in for a great conversation that covers large-scale causal experimentation, Information Systems, Bayesian parameter searches, and more.
This episode is brought to you by Datalore (https://datalore.online/SDS), the collaborative data science platform, and by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Sean on his new venture, Motif Analytics [4:23]
• The relationship between causality and sequence analytics [15:26]
• Sean's data science work at Lyft [22:21]
• The key investments for large-scale causal experimentation [27:25]
• Why and when causal modeling is helpful [32:34]
• Causal modeling tools and recommendations [36:52]
• Facebook's Prophet automation tool for forecasting [40:02]
• What Sean looks for in data science hires [50:57]
• Sean on his PhD in Information Systems [53:34]
Additional materials: www.superdatascience.com/617
10/11/2022 • 1 hour, 10 minutes, 33 seconds
616: The Four Requirements for Expertise (beyond the 10,000 Hours)
10,000 hours of study: Will it make you an expert? On this episode of Five-Minute Friday, host Jon Krohn explores whether increasing your skills is just a numbers game or if there is more to becoming proficient in your area of interest, whether that’s flute playing or data wrangling.
Additional materials: www.superdatascience.com/616
10/7/2022 • 5 minutes, 58 seconds
615: How to Ace Your Data Science Interview
“Being a great data scientist” and “being great at a data science interview” are not one and the same. Jon Krohn speaks with Nick Singh about how to strengthen your interviewee skills, and how you can even beat out more senior competition to land a coveted data science role.
This episode is brought to you by Datalore (https://datalore.online/SDS), the collaborative data science platform, and by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Nick’s inspiration for writing his bestselling book, Ace the Data Science Interview [06:21]
• Why Nick believes in being a work generalist [12:37]
• How DataLemur supports emerging data scientists for free [15:43]
• Why Nick started DataLemur off the back of his book [21:31]
• Portfolio essentials for any data scientist [22:36]
• The three most common things data scientists get wrong at the interview [24:33]
• How data science introverts can shift their mindset about self-promotion [37:58]
• Great responses to end your data science interview on the right foot [42:21]
Additional materials: www.superdatascience.com/615
10/4/2022 • 54 minutes, 47 seconds
614: Thriving on Information Overload
World-leading futurist, author, and entrepreneur Ross Dawson joins us for the first of our extended Five-Minute Friday episodes. As information overwhelm becomes increasingly unavoidable, Dawson shares the five powers from his new book 'Thriving on Overload' to help us move from overwhelm to abundance.
Additional materials: www.superdatascience.com/614
9/30/2022 • 33 minutes, 47 seconds
613: Causal Machine Learning
Dr. Emre Kiciman, Senior Principal Researcher at Microsoft Research, joins the podcast to share his world-leading knowledge of causal machine learning.
This episode is brought to you by Datalore (https://datalore.online/SDS), the collaborative data science platform, and by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is causal machine learning? [5:52]
• Causal machine learning vs correlational machine learning [10:10]
• Emre’s DoWhy open-source library [16:17]
• The four key steps of causal inference [21:24]
• How and why Emre’s key steps of causal inference will impact ML [26:36]
• Emre's thoughts on the future of causal inference and AGI [34:09]
• How Emre leverages social media data to solve social problems [38:36]
• What's next for Emre's research [46:02]
• The software tools Emre highly recommends [55:16]
• What he looks for in the data science researchers he hires [58:45]
Additional materials: www.superdatascience.com/613
9/27/2022 • 1 hour, 11 minutes, 54 seconds
612: More Guests on Fridays
Some exciting changes are coming to our popular Five-Minute Friday series! From longer episodes to new guests, tune in to hear what's next.
Additional materials: www.superdatascience.com/612
9/23/2022 • 3 minutes, 19 seconds
611: Open-Ended A.I.: Practical Applications for Humans and Machines
Dr. Ken Stanley, a world-leading expert on Open-Ended AI and author of the genre-bending book "Why Greatness Cannot be Planned," joins Jon Krohn for a discussion that has the potential to shift your entire view on life. Tune in now to learn more about the complex topics of genetic ML algorithms, the Objective Paradox, Novelty Search, and so much more.
This episode is brought to you by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Ken on his book "Why Greatness Cannot Be Planned" and the Objective Paradox [4:15]
• The Novelty Search approach [24:14]
• How open-ended algorithms like Novelty Search can be stopped from doing something potentially dangerous [1:00:00]
• The future of open-ended AI and its intimate relationship with Artificial General Intelligence [1:07:34]
• Ken's new company [1:13:34]
• How AI could transform life for humans in the coming decades [1:18:29]
Additional materials: www.superdatascience.com/611
9/20/2022 • 1 hour, 30 minutes, 58 seconds
610: Who Dares Wins
On this episode of Five-Minute Friday, host Jon Krohn shares his life motto, “Who dares, wins”, and the sentiment behind it: that to get anywhere in life, it is first necessary to try. Jon believes that “daring”, in this instance, simply means taking action when we have a good idea or when a new opportunity becomes available. Listen to the end for constructive advice on how to be daring in your own life right now.
Additional materials: www.superdatascience.com/610
9/16/2022 • 5 minutes, 46 seconds
SDS 609: Data Mesh
Jon Krohn speaks with Zhamak Dehghani, the empathetic technologist who coined the term “data mesh”. They explore what a data mesh is, and how its approach toward secure interconnectivity will help solve a roster of data-led business problems.
This episode is brought to you by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The importance of data meshes [3:29]
• How standardizing database interfaces helps tech giants like Amazon [6:40]
• Current challenges with data meshes [9:33]
• How data meshes give users the freedom to work with data [17:09]
• The missing piece of the puzzle for data meshes [22:11]
• How data meshes connect with the metaverse and Web3 [33:18]
• The times when data meshes aren’t fit for purpose [42:24]
Additional materials: www.superdatascience.com/609
9/13/2022 • 50 minutes, 41 seconds
SDS 607: Inferring Causality
Dr. Jennifer Hill, Professor of Applied Statistics at New York University, joins Jon this week for a discussion that covers causality, correlation, and inference in data science.
This episode is brought to you by Pachyderm, the leader in data versioning and MLOps pipelines, and by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts.
In this episode you will learn:
• How causality is central to all applications of data science [4:32]
• How correlation does not imply causation [11:12]
• What a counterfactual is and how to design research to confidently infer causality from the results [21:18]
• Jennifer’s favorite Bayesian and ML tools for making causal inferences within code [29:14]
• Jennifer’s new graphical user interface for making causal inferences without the need to write code [38:41]
• Tips on learning more about causal inference [43:27]
• Why multilevel models are useful [49:21]
Additional materials: www.superdatascience.com/607
9/9/2022 • 1 hour, 13 minutes, 12 seconds
SDS 608: Daily Habit #11: Assigning Deliverables
Company meetings should be held to solve problems. So, why do we often feel like the weekly stand-ups and check-ins are a waste of everyone’s time? On this episode of Five-Minute Friday, host Jon Krohn brings his habit-making practices into the dreaded meeting room. Make every meeting productive and positive with his five-step method for assigning deliverables.
Additional materials: www.superdatascience.com/608
9/8/2022 • 3 minutes, 53 seconds
SDS 606: Four Thousand Weeks
Four thousand weeks equate to roughly 80 years—a lifetime for those of us lucky enough to get there. What do we choose to do with this time? How can we stop ourselves from feeling like time in general is slipping away? In this episode, host Jon Krohn reviews the book Four Thousand Weeks: Time Management for Mortals by journalist Oliver Burkeman. He outlines how he has personally benefited from this essential reflection on our thirst for productivity and efficiency.
Additional materials: www.superdatascience.com/606
9/2/2022 • 6 minutes, 1 second
SDS 605: Upskilling in Data Science and Machine Learning
Kian Katanforoosh, CEO of Workera and Lecturer at Stanford University, joins Jon Krohn to reveal the tools, frameworks, and machine learning models that power his platform and remote team.
In this episode you will learn:
• What a skills intelligence platform is [3:11]
• How mentorship can be life-changing [7:45]
• Four ways that ML drives Kian’s skills intelligence platform [10:57]
• Kian's day-to-day responsibilities as the CEO of Workera [21:00]
• What frameworks and software languages Kian and his team selected for building their platform and why [24:20]
• What Kian looks for in the data scientists and software engineers he hires [31:48]
• Kian’s Stanford Deep Learning class and mentors [34:58]
• How Kian’s passion for EdTech began [42:47]
Additional materials: www.superdatascience.com/605
8/30/2022 • 58 minutes, 43 seconds
SDS 604: Ignition: A Landmark Nuclear Fusion Milestone is Achieved
In this week's Five-Minute Friday episode, Jon explores a recent groundbreaking development in nuclear fusion, ignition, and what it signals for the future.
Additional materials: www.superdatascience.com/604
8/26/2022 • 5 minutes, 49 seconds
SDS 603: Geospatial Data and Unconventional Routes into Data Careers
Christina Stathopoulos, Analytical Lead for Waze and Adjunct Professor at IE Business School, joins the podcast to shed light on her work with geospatial data and how she nurtured an entire data career while abroad in Spain.
In this episode you will learn:
• Christina's tips on navigating an unconventional path into a data career [3:05]
• Geospatial data and open-source packages for working with it [10:08]
• Guidance to help women and other underrepresented groups to thrive in tech [22:28]
• The hard and soft skills most essential to success in a data role today [39:26]
• Christina’s #bookaweekchallenge and the top data-centric book recommendations [43:28]
Additional materials: www.superdatascience.com/603
8/23/2022 • 56 minutes, 18 seconds
SDS 602: We Are Living in Ancient Times
Inspired by a quote from science fiction writer Teresa Nielsen Hayden, Jon Krohn reflects on the notion of living in ancient times and the machine learning-related implications that arise from this perspective.
Additional materials: www.superdatascience.com/602
8/19/2022 • 3 minutes, 29 seconds
SDS 601: Venture Capital for Data Science
This week, Sarah Catanzaro, General Partner at Amplify Partners, joins Jon for an episode that dives into the venture capital side of data science. Learn how to fund your data science business idea, take note of what start-ups can do to survive or raise capital in the current economic climate, and discover how to break into the field of venture capital yourself.
In this episode you will learn:
• Angel vs. venture capital vs. private equity investment [7:27]
• How early-stage investment is made prior to a firm having product-market fit [14:33]
• How to pick winners in early-stage investments [28:08]
• Tricks for accelerating from a data science idea to obtaining funding [36:21]
• Observational causal inference [44:01]
• How to get involved in venture capital [47:37]
Additional materials: www.superdatascience.com/601
8/16/2022 • 56 minutes, 28 seconds
SDS 600: Yoga Nidra Practice with Steve Fazzari
Rest and relaxation await as Steve Fazzari joins us this week for a special edition of the podcast! Tune in for a rejuvenating session of Yoga Nidra led beautifully by the expert.
Additional materials: www.superdatascience.com/600
8/12/2022 • 34 minutes, 33 seconds
SDS 599: MLOps: Machine Learning Operations
This week, Mikiko Bazeley, Senior Software Engineer at Mailchimp, joins the podcast to share her in-depth knowledge of MLOps: Machine Learning Operations. Tune in to hear her discuss what it entails, why it's so critical for the efficiency of any data science team, and the most important tools you need to master for career success in this field.
In this episode you will learn:
• What MLOps is [11:40]
• Mikiko’s role at Mailchimp and why MLOps is critical for the efficiency of any data science team [27:11]
• The three most important MLOps tools [32:15]
• The six most essential MLOps skills for data scientists [47:01]
• The key factors Mikiko looks for when hiring engineers [1:07:31]
• Mikiko’s productivity tricks for balancing software engineering, content creation, and her athletic pursuits [1:13:20]
Additional materials: www.superdatascience.com/599
8/9/2022 • 1 hour, 21 minutes, 24 seconds
SDS 598: Getting Kids Excited about STEM Subjects
Ben Taylor makes a fourth appearance on Five-Minute Friday to discuss the best ways to introduce STEM to children. Tune in to hear the many ways in which he thinks STEM education will evolve in the future.
Additional materials: www.superdatascience.com/598
8/5/2022 • 11 minutes, 46 seconds
SDS 597: A.I. Policy at OpenAI
Dr. Miles Brundage, Head of Policy Research at OpenAI, joins Jon Krohn this week to discuss AI model production, policy, safety, and alignment. Tune in to hear him speak on GPT-3, DALL-E, Codex, and CLIP as well.
In this episode you will learn:
• Miles’ role as Head of Policy Research at OpenAI [4:35]
• OpenAI's DALL-E model [7:20]
• OpenAI's natural language model GPT-3 [30:43]
• OpenAI's automated software-writing model Codex [36:57]
• OpenAI’s CLIP model [44:01]
• What sets AI policy, AI safety, and AI alignment apart from each other [1:07:03]
• How A.I. will likely augment more professions than it displaces [1:12:06]
Additional materials: www.superdatascience.com/597
8/2/2022 • 1 hour, 23 minutes, 17 seconds
SDS 596: The A.I. Platforms of the Future
Ben Taylor returns for a third Five-Minute Friday episode! This week, he looks ahead and digs into what we can expect from the A.I. platforms of the future.
Additional materials: www.superdatascience.com/596
7/29/2022 • 7 minutes, 28 seconds
SDS 595: Data Engineering 101
Tune in as Joe Reis and Matt Housley, co-founders of Ternary Data and co-authors of the book “Fundamentals of Data Engineering”, join Jon Krohn to discuss major undercurrents across the data engineering lifecycle, and their top tools and techniques.
In this episode you will learn:
• What is data engineering? [3:55]
• Why Joe and Matt identify as “recovering data scientists” [6:12]
• What kinds of people tend to become data scientists vs. data engineers [10:38]
• Key components of Joe and Matt’s book [26:31]
• Major undercurrents across the data engineering lifecycle [28:26]
• The most under-utilized tool in a data engineer's toolbox [34:39]
• How there are tradeoffs in any data pipeline latency considerations, but faster is typically the default assumption [38:55]
• Joe and Matt’s favorite data engineering tools and techniques [43:39]
Additional materials: www.superdatascience.com/595
7/26/2022 • 1 hour, 19 minutes, 29 seconds
SDS 594: Why CEOs Care About A.I. More than Other Technologies
This week, Jon Krohn and A.I. industry veteran Ben Taylor discuss the driving factors that push CEOs to prioritize A.I. over other technologies.
Additional materials: www.superdatascience.com/594
7/22/2022 • 5 minutes, 20 seconds
SDS 593: The Real-World Impact of Cross-Disciplinary Data Science Collaboration
Jon welcomes Professor Philip Bourne, Founding Dean of the School of Data Science at the University of Virginia, to discuss his biomedical data science research, the importance of open-source and open-access within the industry, and the data science skills you need to succeed today.
In this episode you will learn:
• Why Philip founded a School of Data Science [6:08]
• How computing and data science have evolved across academic departments [15:55]
• The improvements needed in higher education [26:44]
• The most important data science skills for academia and industry and the 4+1 model [36:49]
• Philip’s biomedical data science research and its fascinating practical applications [43:24]
• The essential roles of open-source code and open-access publishing in data science [1:01:27]
Additional materials: www.superdatascience.com/593
7/19/2022 • 1 hour, 21 minutes, 38 seconds
SDS 592: How to Sell a Multimillion Dollar A.I. Contract
In this episode, Jon Krohn welcomes A.I. industry veteran Ben Taylor to discuss how to sell multimillion dollar A.I. contracts. Tune in to hear why trust and proof of value are some of the critical steps in his sales process.
Additional materials: www.superdatascience.com/592
7/15/2022 • 3 minutes, 23 seconds
SDS 591: Simulations and Synthetic Data for Machine Learning
Mars Buttfield-Addison, PhD Candidate at the University of Tasmania, joins Jon Krohn for a high-energy episode covering everything from Machine Learning simulations to Swift, space junk, and more!
In this episode you will learn:
• What simulations and synthetic data are, and why they can be invaluable for real-life applications [5:47]
• How simulated bots can solve any problem [9:07]
• Practical uses of simulated data [21:49]
• Why the mobile operating system language Swift is interesting for A.I. [25:46]
• Why it's critical to track the amount of junk in space [35:47]
• Whether programming or statistical skills are more important in data science [47:05]
• What it’s like creating video games in a "secret" games lab [56:45]
• Why you might want to do a data science internship in industry before pursuing academia [1:01:54]
Additional materials: www.superdatascience.com/591
7/12/2022 • 1 hour, 14 minutes, 56 seconds
SDS 590: Artificial General Intelligence is Not Nigh (Part 2 of 2)
In this episode, Jon continues his two-part series on artificial general intelligence (AGI) and why we are unlikely to realize it anytime soon. Listen in as Jon reviews Meta's Yann LeCun's seven-part perspective on the topic.
Additional materials: www.superdatascience.com/590
7/8/2022 • 5 minutes, 56 seconds
SDS 589: Narrative A.I. with Hilary Mason
Hilary Mason, Co-Founder and CEO of Hidden Door, joins Jon Krohn for a live discussion that explores narrative A.I., emerging ML techniques, and how her OSEMN data science process developed.
In this episode you will learn:
• How narrative A.I. can assist creativity [5:14]
• How to build ML products that have no quantitative error function to optimize [10:31]
• How to ensure creative A.I. systems do not output nonsense or explicit content [16:58]
• Hilary's OSEMN data science process [21:05]
• The emerging ML technique she’s most excited about [24:58]
• What it takes to be successful as CEO of an early-stage A.I. company [27:20]
• What she looks for in engineering hires [32:28]
• How she’s hopeful A.I. will transform our lives for the better in the decades to come [38:48]
Additional materials: www.superdatascience.com/589
7/5/2022 • 56 minutes, 28 seconds
SDS 588: Artificial General Intelligence is Not Nigh
In this episode, Jon kicks off a two-part series that sees him explore the popular topic of artificial general intelligence and why it might, or might not, be only a few years away. Listen in as Jon explains the several reasons why he doesn't believe that AGI is nigh.
Additional materials: www.superdatascience.com/588
7/1/2022 • 5 minutes, 52 seconds
SDS 587: Data Engineering for Data Scientists
Mark Freeman, Senior Data Scientist at Humu, joins Jon Krohn to talk about all things data engineering and offers listeners some critical tips for their data science career journey, from what it takes to get promoted to his number one tip for getting hired at a fast-growing venture capital-backed startup.
In this episode you will learn:
• How Humu leverages data and machine learning to improve workplace behaviors [10:38]
• What is data engineering? [14:21]
• What it takes to get promoted into more senior data science roles [20:55]
• The differences between junior, senior, and staff data scientists [30:21]
• Mark’s top tools for data extraction, modeling, and pipeline engineering [37:08]
• Mark’s number one tip for getting hired at a fast-growing venture capital-backed startup [53:10]
• Why all data scientists should be interested in Web3 [1:11:53]
Additional materials: www.superdatascience.com/587
6/28/2022 • 1 hour, 25 minutes, 9 seconds
SDS 586: Daily Habit #10: Limit Social Media Use
In this episode, Jon dives into the popular topic of social media and its impact on his productivity. Tune in to hear how minimizing the use of social media can positively impact your days, mental health and work.
Additional materials: www.superdatascience.com/586
6/24/2022 • 4 minutes, 59 seconds
SDS 585: PyMC for Bayesian Statistics in Python
In this episode, Dr. Thomas Wiecki, Core Developer of the PyMC Library and CEO of PyMC Labs, joins Jon for a masterclass in Bayesian statistics. Tune in to hear about PyMC, and discover why Bayesian statistics can be more powerful and interpretable than any other data modeling approach.
In this episode you will learn:
• What Bayesian statistics is [7:30]
• Why Bayesian statistics can be more powerful and interpretable than any other data modeling approach [17:20]
• How PyMC was developed [20:41]
• Commercial applications of Bayesian stats [43:07]
• How to build a successful company culture [1:03:14]
• What Thomas looks for when hiring [1:11:13]
• Thomas’s top resources for learning Bayesian stats yourself [1:13:57]
Additional materials: www.superdatascience.com/585
6/21/2022 • 1 hour, 26 minutes, 22 seconds
SDS 584: OpenAI Codex
In this episode, Jon reviews the remarkable natural language model Codex by OpenAI. Learn why it has amassed a waitlist and how you can leverage its practical applications in your work.
Additional materials: www.superdatascience.com/584
6/17/2022 • 4 minutes, 1 second
SDS 583: The State of Natural Language Processing
In this episode, natural language processing (NLP) expert and Lead Data Scientist at CB Insights, Rongyao Huang, joins Jon Krohn to discuss NLP. Listen in for a thorough review of the field over the past decade and how the coming iron age of NLP will help us overcome the limitations of today's approaches.
In this episode you will learn:
• The evolution of NLP techniques over the past decade [4:14]
• What's next in the coming iron age of NLP [35:33]
• Rongyao’s Bauhaus-inspired model for effective data science [43:12]
• Rongyao's long-term career pathfinding framework [51:50]
• Rongyao’s top tips for staying sane while juggling career and family [1:00:30]
Additional materials: www.superdatascience.com/583
6/14/2022 • 1 hour, 14 minutes, 57 seconds
SDS 582: Model Speed vs Model Accuracy
In this episode, Jon wraps up his three-part series on business value and machine learning. Listen in as he explains why starting with simple models is best, and why speed is likely more important to your users than accuracy.
Additional materials: www.superdatascience.com/582
6/10/2022 • 3 minutes, 20 seconds
SDS 581: Bayesian, Frequentist, and Fiducial Statistics in Data Science
In this episode, the founding Editor-in-Chief of the Harvard Data Science Review and Professor of Statistics at Harvard University, Prof. Xiao-Li Meng, joins Jon Krohn to dive into the data trade-offs that abound, and shares his view on the paradoxical downside of having lots of data.
In this episode you will learn:
• What the Harvard Data Science Review is and why Xiao-Li founded it [5:31]
• The difference between data science and statistics [17:56]
• The concept of 'data minding' [22:27]
• The concept of 'data confession' [30:31]
• Why there’s no “free lunch” with data, and the tricky trade-offs that abound [35:20]
• The surprising paradoxical downside of having lots of data [43:23]
• What the Bayesian, Frequentist, and Fiducial schools of statistics are, and when each of them is most useful in data science [55:47]
Additional materials: www.superdatascience.com/581
6/7/2022 • 1 hour, 24 minutes, 30 seconds
SDS 580: Collecting Valuable Data
In this episode, Jon resumes his series on strategies for getting business value from machine learning. Part one saw him review several ways to identify a commercial problem before starting data collection or ML model development. And now, in part two, Jon digs into the data collection process.
Additional materials: www.superdatascience.com/580
6/3/2022 • 5 minutes, 37 seconds
SDS 579: Transforming Dentistry with A.I.
In this episode, the CEO of Overjet, Dr. Wardah Inam, joins Jon Krohn to discuss the classification and quantification of dental diagnoses with computer vision, her data labeling challenges, and tips for building a successful A.I. business.
In this episode you will learn:
• How Overjet leverages computer vision to qualify and quantify dental diagnoses [5:11]
• How A.I. solutions reduce the under-diagnosis of common diseases like periodontal disease [8:15]
• Overjet's particular ML challenges within the dental industry [15:45]
• Wardah's experience in introducing A.I. to the dental industry [20:12]
• Wardah's tips for building a successful A.I. business [23:34]
• What she looks for in the data scientists and software engineers she hires [39:36]
Additional materials: www.superdatascience.com/579
5/31/2022 • 47 minutes, 10 seconds
SDS 578: Identifying Commercial ML Problems
In this episode, Jon kicks off a new Five-Minute Friday series that explores the strategies for getting business value from machine learning. Part one sees him review several ways to identify a commercial problem before starting data collection or ML model development.
Additional materials: www.superdatascience.com/578
5/27/2022 • 3 minutes, 51 seconds
SDS 577: Scaling A.I. Startups Globally
In this episode, Husayn Kassai, the former CEO and co-founder of Onfido, an AI-based ID verification company, joins Jon Krohn to discuss his path to start-up success. Tune in to hear his valuable insights.
In this episode you will learn:
• How Husayn's start-up journey began [5:55]
• How Husayn determined that his challenge could be solved by machine vision [11:18]
• Onfido's initial seed stages [18:23]
• Launching and scaling your start-up in the U.S. market [22:00]
• The most important component in building the best product [26:30]
• Husayn's latest start-up [28:52]
• Husayn’s startup project decision-making process [37:49]
• Choosing your co-founding team [44:04]
Additional materials: www.superdatascience.com/577
5/24/2022 • 55 minutes, 15 seconds
SDS 576: Tech Startup Dramas
Hollywood has officially fallen for the drama of tech startups! Tune in to hear Jon Krohn review the small-screen adaptations of WeWork (WeCrashed), Uber (Super Pumped), and Theranos (The Dropout).
Additional materials: www.superdatascience.com/576
5/20/2022 • 3 minutes, 26 seconds
SDS 575: Optimizing Computer Hardware with Deep Learning
In this episode, the Director of Architecture at NVIDIA, Dr. Magnus Ekman, joins Jon Krohn to discuss how machine learning, including deep learning, can optimize computer hardware design. The pair also review his exceptional book 'Learning Deep Learning.'
In this episode you will learn:
• What hardware architects do [10:15]
• How ML can optimize hardware speed [13:19]
• Magnus’s Deep Learning Book [21:14]
• Is understanding how ML models work important? [36:16]
• Algorithms inspired by biological evolution [41:25]
• How artificial general intelligence won’t be obtained by increasing model parameters alone [51:24]
• Why there will always be a place for CNNs and RNNs [54:51]
• How people can "transition" realistically into ML [1:09:15]
Additional materials: www.superdatascience.com/575
5/17/2022 • 1 hour, 23 minutes, 34 seconds
SDS 574: Music for Deep Work
In this episode, Jon shares how the right music can power your productivity. It's no secret that he's a big fan of 'deep work,' but this week, he opens up about the artists, sites, and playlists that propel his productivity to new levels.
Additional materials: www.superdatascience.com/574
5/13/2022 • 3 minutes, 52 seconds
SDS 573: Automating ML Model Deployment
In this episode, co-founder and CEO of Linea, Dr. Doris Xin, joins Jon Krohn to discuss how automating ML model deployment delivers groundbreaking change to data science productivity, and shares what it's like being the CEO of an exciting, early-stage tech start-up.
In this episode you will learn:
• How Linea reduces ML model deployment down to a couple of lines of Python code [5:14]
• Linea use cases [11:30]
• How DAGs can 10x production workflow efficiency [22:12]
• ML model graphlets and reducing wasted computation [24:14]
• What future Doris envisions for autoML [35:23]
• Doris’s day-to-day life as a CEO of an early-stage start-up [42:43]
• What Doris looks for in the engineers and data scientists that she hires [52:21]
• The future of Data Science and how to prepare best for it [53:58]
Additional materials: www.superdatascience.com/573
5/10/2022 • 1 hour, 6 minutes, 34 seconds
SDS 572: Daily Habit #9: Avoiding Messages Until a Set Time Each Day
In this episode, Jon shares his habit of blocking out two hours in his mornings that are free from email and social media distractions. Tune in to learn how this habit helps him deeply focus on his most delightful tasks of the day.
Additional materials: www.superdatascience.com/572
5/6/2022 • 3 minutes, 25 seconds
SDS 571: Collaborative, No-Code Machine Learning
Einblick co-founder and associate professor at MIT, Tim Kraska, joins Jon Krohn to discuss no-code collaboration tools for data science and uncovers the clever database and machine learning tricks under the hood of the visual data computing platform.
In this episode you will learn:
• The inspiration behind Einblick [2:45]
• Einblick's progressive approximation engine [6:43]
• How no-code tools impact productivity [17:18]
• The critical steps to become more data-driven as an organization [24:30]
• How research universities like MIT support high-risk, long-term research [38:37]
• How ML applied to databases enables them to be faster and more efficient [42:03]
• How real-time collaboration environments like Google Docs are likely to become more widespread for data science tasks [49:24]
Additional materials: www.superdatascience.com/571
5/3/2022 • 57 minutes, 39 seconds
SDS 570: DALL-E 2: Stunning Photorealism from Any Text Prompt
In this episode, Jon is back with another A.I. model breakthrough! He updates listeners on OpenAI's outstanding DALL-E 2 model. The new natural language processing model churns out staggering visual examples of whatever text your mind can dream up.
Additional materials: www.superdatascience.com/570
4/29/2022 • 5 minutes, 36 seconds
SDS 569: A.I. For Crushing Humans at Poker and Board Games
Research Scientist at Meta AI, Dr. Noam Brown, joins Jon Krohn to discuss his award-winning no-limit poker-playing algorithms and the real-world implications of his game-playing A.I. breakthroughs.
In this episode you will learn:
• What Meta A.I. is and how it fits into Meta, the company [3:01]
• Noam's award-winning no-limit poker-playing algorithms, Libratus and Pluribus [4:33]
• What game theory is and how Noam integrates it into his models [8:45]
• The real-world implications of Noam’s game-playing A.I. breakthroughs [25:24]
• Why Noam elected to become a researcher at a big tech firm instead of in academia [27:06]
• The main barriers to getting AI game theory techniques beyond games to self-driving cars [30:16]
• Recommendations for people who want to break into poker AI [37:45]
Additional materials: www.superdatascience.com/569
4/26/2022 • 44 minutes, 35 seconds
SDS 568: PaLM: Google's Breakthrough Natural Language Model
In this episode, Jon updates listeners on one of the industry's biggest breakthroughs to date: Google's new natural language processing model, PaLM. The key innovation with PaLM is scaling up Google's Pathways modeling approach to half a trillion parameters, many-fold more parameters than had previously been trained using this approach.
Additional materials: www.superdatascience.com/568
4/22/2022 • 5 minutes, 1 second
SDS 567: Open-Access Publishing
In this episode, the MIT Press Director and Publisher, Dr. Amy Brand, joins Jon Krohn to discuss open-access publishing in data science and how to address the inequalities that exist for women and minorities in STEM.
In this episode you will learn:
• What it’s like to run the prestigious MIT Press [4:34]
• How open access makes scholarly work more impactful [6:34]
• How publishing outstanding STEM books for broader audiences, including for children, can help address STEM biases [19:28]
• Amy's award-winning documentary Picture A Scientist [25:28]
• What it's like to executive produce a documentary [37:24]
• What can be done to change STEM to make it more welcoming to minorities [48:44]
• The best open-source model going forward [58:26]
• What fascinates Amy about natural language processing [1:01:30]
• How author metadata in standardized taxonomies can help authors receive the credit they deserve [1:04:50]
Additional materials: www.superdatascience.com/567
4/19/2022 • 1 hour, 17 minutes, 46 seconds
SDS 566: The Best Time to Plant a Tree
In this episode, Jon reflects on the Chinese proverb: "The best time to plant a tree was 20 years ago. The second best time is now." He also challenges listeners to reflect on their long-term goals that have gone unfulfilled.
Additional materials: www.superdatascience.com/566
4/15/2022 • 3 minutes, 46 seconds
SDS 565: AGI: The Apocalypse Machine
In this episode, Jeremie Harris dives into the stirring topic of AI Safety and the existential risks that Artificial General Intelligence poses to humankind.
In this episode you will learn:
• Why mentorship is crucial in data science career development [15:45]
• Canadian vs American start-up ecosystems [24:18]
• What is Artificial General Intelligence (AGI)? [38:50]
• How Artificial Superintelligence could destroy the world [1:04:00]
• How AGI could prove to be a panacea for humankind and life on the planet [1:27:31]
• How to become an AI safety expert [1:30:07]
• Jeremie's day-to-day work life at Mercurius [1:35:39]
Additional materials: www.superdatascience.com/565
4/12/2022 • 2 hours, 5 minutes, 20 seconds
SDS 564: Clem Delangue on Hugging Face and Transformers
In this episode, Jon speaks with the CEO of Hugging Face, Clem Delangue, about open-source machine learning and transformer architectures, while attending the ScaleUp:AI Conference in New York.
Additional materials: www.superdatascience.com/564
4/8/2022 • 19 minutes, 21 seconds
SDS 563: How to Rock at Data Science — with Tina Huang
In this episode, superstar data science YouTuber Tina Huang joins us to discuss what it's like to work at one of the world's largest tech companies, her strategies for efficient learning, and how best to prepare for a career in data science from scratch.
In this episode you will learn:
• The key areas to focus on when getting started in data science [6:01]
• Tina’s five steps to consistently doing anything [11:55]
• Tina's day-to-day life as a data scientist at one of the world’s largest tech companies [20:02]
• How Tina's computer science background helps her work [26:20]
• Traditional banking culture vs big tech [32:12]
• How Tina's background in pharmacology impacts her work in data science [36:15]
• The software languages that Tina uses daily in her work [45:30]
• How Tina’s SQL course practically prepares you for data science interviews [47:24]
Additional materials: www.superdatascience.com/563
4/5/2022 • 1 hour, 4 minutes, 33 seconds
SDS 562: Daily Habit #8: Math or Computer Science Exercise
In this episode, Jon shares his daily technical exercise, which is part of an extensive habit tracking system that allows him to achieve more, create more structure within his day, and cut out bad habits. By completing a mathematics, computer science, or programming exercise daily, Jon is able to hone his technical skills in a limitlessly broad field and open new professional opportunities in the long run.
Additional materials: www.superdatascience.com/562
4/1/2022 • 5 minutes, 33 seconds
SDS 561: Engineering Data APIs
In this episode, Ribbon Health CTO Nate Fox joins us to discuss the ins and outs of APIs. Tune in to hear him share how he and his team build out APIs from scratch, how they ensure the uptime and reliability of APIs, and how they leverage machine learning to improve the quality of healthcare delivery and maximize their social impact.
In this episode you will learn:
• What are APIs? [13:20]
• How Ribbon Health’s data API leverages ML models to improve the quality of healthcare delivery [16:08]
• How to design a data API from scratch [20:00]
• How to ensure the uptime and reliability of APIs [25:28]
• How Ribbon uses knowledge graphs, manually labeled data samples, and an XGBoost model with hundreds of inputs to assign a confidence score [27:14]
• Nate’s favorite tool for easily scaling up the impact of data science [37:40]
• What is Nate’s day-to-day like? [34:34]
• The qualities Nate looks for when hiring data scientists [39:50]
• How scientists and engineers can make a big social impact in health technology [42:50]
Additional materials: www.superdatascience.com/561
3/29/2022 • 53 minutes, 54 seconds
SDS 560: Daily Habit #7: Read Two Pages
In this episode, Jon shares his daily habit of reading two pages and explains how it has transformed his productivity.
Additional materials: www.superdatascience.com/560
3/25/2022 • 4 minutes, 19 seconds
SDS 559: GPT-3 for Natural Language Processing
Natural language processing expert and PhD student Melanie Subbiah sits down with Jon Krohn to discuss GPT-3, its strengths and weaknesses, and the future of NLP.
In this episode you will learn:
• What is GPT-3? [6:24]
• The strengths and weaknesses of GPT-3 [14:38]
• What is autoregression? [18:03]
• GPT-3's new fine-tuning abilities [20:02]
• Bias issues with GPT-3 [22:47]
• The future of natural language processing models [27:54]
• How Melanie ended up working at OpenAI [38:13]
• Melanie’s self-study process [42:19]
• Melanie's work on OpenAI API [45:45]
• How to address the climate change and bias issues that cloud discussions of large natural language models [49:40]
• Why Melanie chose to do a PhD at Columbia University [1:01:17]
• The machine learning tools Melanie’s most excited about [1:08:09]
Additional materials: www.superdatascience.com/559
3/22/2022 • 1 hour, 28 minutes, 18 seconds
SDS 558: Jon's Answers to Questions on Machine Learning
In this episode, Jon shares the key topics he recently discussed with the Open Data Science Conference. From the approach behind his extensive machine learning and deep learning content library to revealing the key tools and software he uses daily, get to know Jon and his process a little better.
Additional materials: www.superdatascience.com/558
3/18/2022 • 6 minutes, 55 seconds
SDS 557: Effective Pandas
Pandas expert Matt Harrison sits down with Jon Krohn to discuss tips, tricks and best practices for Pandas learning and mastery.
In this episode you will learn:
• Pros and cons of self-publishing and working with a publisher [5:05]
• Matt's six tips for using Pandas [17:13]
• The best way for corporate teams to level up their skills [40:04]
• How to learn anything effectively [47:14]
• Matt’s tricks for staying motivated [50:00]
• Matt’s recommendations for using Git and the Unix command line [1:00:14]
• Matt’s recommended software libraries for working with tabular data [1:19:45]
Additional materials: www.superdatascience.com/557
3/15/2022 • 1 hour, 30 minutes, 56 seconds
SDS 556: Jon's Machine Learning Courses
Discover Jon’s extensive library of machine learning content and learn why Jon's Machine Learning House forms the knowledge structure of an outstanding data scientist or ML engineer.
Additional materials: www.superdatascience.com/556
3/11/2022 • 7 minutes, 7 seconds
SDS 555: Sports Analytics and 66 Days of Data with Ken Jee
Data scientist and YouTuber Ken Jee joins Jon Krohn for a deep dive into the world of sports analytics and takes us behind the scenes of his large online data science community.
In this episode you will learn:
• The inspiration behind Ken’s YouTube videos [18:03]
• Ken’s four steps for getting started in data science [24:18]
• How sports analytics is transforming sports like golf [33:32]
• Ken’s favorite tools for software scripting as well as for production code development [41:10]
• How the #66DaysofData hashtag can supercharge your capacity as a data scientist [42:51]
• Ken’s data science podcast Ken’s Nearest Neighbors [54:11]
• LinkedIn Q&A [1:00:32]
Additional materials: www.superdatascience.com/555
3/8/2022 • 1 hour, 13 minutes, 40 seconds
SDS 554: Jon's Deep Learning Courses
In this episode, Jon shares where you can find his extensive deep learning video content and courses. Tune in to learn more about his deep learning curriculum and where you can learn for free.
Additional materials: www.superdatascience.com/554
3/4/2022 • 5 minutes, 20 seconds
SDS 553: The Statistics and Machine Learning Quests of Dr. Josh Starmer
In this episode, Dr. Josh Starmer, the creative, musical genius behind the wildly popular YouTube channel StatQuest, joins the podcast to discuss statistics, learning and communication secrets, and how he grew his YouTube channel to over 650,000 subscribers.
In this episode you will learn:
• The inspiration behind Josh’s YouTube channel [18:39]
• Josh's simple approach to learning something new [34:25]
• Josh's secret tool for creating YouTube videos with over a million views [51:01]
• The StatQuest Illustrated Guide to Machine Learning [53:34]
• How and when Josh uses R vs. Python [1:07:53]
• How to cluster any type of data using the R randomForest package [1:11:24]
• Why Josh left his academic career [1:14:24]
• The two stats concepts Josh thinks everyone should know [1:38:50]
Additional materials: www.superdatascience.com/553
3/1/2022 • 1 hour, 48 minutes, 55 seconds
SDS 552: The Most Popular SuperDataScience Episodes of 2021
In this episode of Five-Minute Friday, Jon recaps the most popular SuperDataScience podcast episodes from 2021. See what you might have missed and catch up today!
Additional materials: www.superdatascience.com/552
2/25/2022 • 4 minutes, 37 seconds
SDS 551: Deep Reinforcement Learning — with Wah Loon Keng
In this episode, gifted author and software engineer Wah Loon Keng joins the podcast to dive deep into reinforcement learning. From its history and limitations to modern industrial applications and future developments, there's no better expert to learn from if you want to know more about this complex topic.
In this episode you will learn:
• What is reinforcement learning? [4:50]
• Deep reinforcement learning vs reinforcement learning [13:17]
• A timeline of reinforcement learning breakthroughs [16:17]
• The limitations of deep RL today [39:53]
• Deep RL applications [53:10]
• Keng's open-source SLM-Lab framework [57:51]
• Keng’s responsibilities as an AI engineer [1:02:17]
• What is the future of RL? [1:08:05]
Additional materials: www.superdatascience.com/551
2/22/2022 • 1 hour, 21 minutes, 4 seconds
SDS 550: Daily Habit #6: Write Morning Pages
Jon is back with another Five-Minute Friday habit-tracking episode! Listen in as he explains how writing morning pages has helped his data science work flourish with creativity. Inspired by Julia Cameron's book The Artist's Way, he details his morning pages routine and how it kickstarted a new chapter in his career.
Additional materials: www.superdatascience.com/550
2/18/2022 • 4 minutes, 7 seconds
SDS 549: Engineering Natural Language Models — with Lauren Zhu
In this episode, Glean software engineer and Stanford graduate Lauren Zhu joins us to discuss her role at a fast-growing startup, working on natural language processing projects, and how she remains inspired by pursuing her side passions.
In this episode you will learn:
• Lauren's experience as a course assistant [5:53]
• Stanford's Hacking the Coronavirus Course [11:53]
• How to empower minority groups in AI [19:45]
• Lauren on zero-shot multilingual neural machine translation [23:25]
• Lauren's work at Glean [27:58]
• The Contrary Talent Network [34:30]
• The tools Lauren uses at Glean [43:39]
• The most important skills to possess as a data scientist [47:29]
Additional materials: www.superdatascience.com/549
2/15/2022 • 1 hour, 6 minutes, 8 seconds
SDS 548: Daily Habit #5: Meditate
Our Five-Minute Friday series on habit tracking returns with a look at one of Jon's daily mindfulness habits: meditation. Learn how to keep this habit going for the long run and discover which tools help Jon stay on track.
Additional materials: www.superdatascience.com/548
2/11/2022 • 3 minutes, 40 seconds
SDS 547: How Genes Influence Behavior — with Prof. Jonathan Flint
In this episode, Dr. Jonathan Flint, Professor of Psychiatry and Biobehavioral Sciences at the University of California Los Angeles, joins us to discuss how he uses data science and machine learning to explore the link between genetics and depression.
In this episode you will learn:
• Jonathan's background [2:53]
• How we know that genetics plays a role in complex human behaviors including psychiatric disorders like anxiety, depression, and schizophrenia [8:00]
• The role that data science and ML play in modern genetics research [15:08]
• Jonathan's book "How Genes Influence Behavior" [19:45]
• The day-to-day life of a world-class medical sciences researcher [32:24]
• The open-source software libraries that Jonathan uses for data modeling [40:33]
• A single question you can ask to prevent a severely depressed person from committing suicide [52:00]
• LinkedIn Q&A [54:41]
• The future of psychiatric treatments [1:05:35]
Additional materials: www.superdatascience.com/547
Our Five-Minute Friday habit-tracking series continues! Learn more about alternate-nostril breathing, the mindfulness technique that is scientifically proven to lower blood pressure and regulate the stress response.
Additional materials: www.superdatascience.com/546
2/4/2022 • 4 minutes, 50 seconds
SDS 545: Scaling Data-Intensive Real-Time Applications — with Matthew Russell
Data scientist and entrepreneur Matthew Russell joins Jon Krohn to discuss the intersection of machine learning and fitness and dive deep into the strategies he and his team at Strongest AI use to scale data-intensive real-time applications.
In this episode you will learn:
• About Strongest's event platform and iOS app [6:06]
• How Strongest scaled to serve millions [8:14]
• Strongest's unique approach to building a fitness app [17:50]
• How to rapidly test ML models for deployment [29:01]
• The three critical traits Matthew looks for in anyone he hires [33:11]
• Mining the Social Web [41:14]
• The values instilled in Matthew by pursuing a military education [53:30]
• The key skills Matthew wishes he’d learned earlier in his career [1:03:51]
Additional materials: www.superdatascience.com/545
2/1/2022 • 1 hour, 16 minutes, 24 seconds
SDS 544: Daily Habit #3: Make Your Bed
Our habit-tracking series continues with a look at how making your bed can jumpstart your mornings, prevent you from taking part in negative habits and help you become happier.
Additional materials: www.superdatascience.com/544
1/28/2022 • 2 minutes, 48 seconds
SDS 543: Sparking A.I. Innovation — with Nicole Büttner
Nicole Büttner (Founder and CEO of Merantix Labs) joins the podcast to discuss driving A.I. innovation, automation, and transformation and building the ideal A.I. start-up founding team.
In this episode you will learn:
• The three factors that spark A.I. innovation [12:48]
• How to make great use of unlabelled, unbalanced data sets [18:54]
• How to engineer reusable data and software components [25:09]
• Merantix's A.I. Canvas framework for successful innovation [29:59]
• How to be a part of Merantix's program as a founder [45:23]
Additional materials: www.superdatascience.com/543
1/25/2022 • 55 minutes, 4 seconds
SDS 542: Continuous Calendar for 2022
Revisit the much-underrated continuous calendar and get started with this uncommon planning method thanks to Jon's 2022 template.
Additional materials: www.superdatascience.com/542
1/21/2022 • 2 minutes, 46 seconds
SDS 541: Data Observability — with Dr. Kevin Hu
In this episode, Kevin Hu joins the podcast to talk about founding and growing the data observability startup, Metaplane. Listen in to hear about his time in academia at MIT, his experience with Y Combinator, and his current routine as a technical founder.
In this episode you will learn:
• What is data observability? [4:35]
• How to identify data quality issues [8:56]
• Kevin's PhD research on automating data science systems using machine learning [16:18]
• Why Kevin launched Metaplane [28:50]
• The pros and cons of an academic career relative to the start-up hustle [31:57]
• Kevin's experience in the Y Combinator accelerator [39:50]
• The software tools he uses daily as a CEO [50:54]
• What Kevin looks for in data engineer hires [56:13]
Additional materials: www.superdatascience.com/541
1/18/2022 • 1 hour, 8 minutes, 3 seconds
SDS 540: Daily Habit #2: Start the Day with a Glass of Water
In this episode, Jon opens up about starting his day with a glass of water – his first morning habit that sets his day off on a healthy and successful note.
Additional materials: www.superdatascience.com/540
1/14/2022 • 3 minutes, 50 seconds
SDS 539: Interpretable Machine Learning — with Serg Masís
In this episode, Serg Masís joins the podcast to share his in-depth technical knowledge of Interpretable Machine Learning. Together they discuss why this field matters, how it’s evolving, and so much more.
In this episode you will learn:
• What is interpretable machine learning? [8:41]
• The social and financial ramifications of interpreting models incorrectly [10:23]
• The challenges involved in interpretable ML [16:00]
• The most important interpretable ML concepts to master [19:54]
• The future of Interpretable ML [32:41]
• What it’s like to be a Climate & Agronomic Data Scientist [42:28]
• Serg’s day-to-day tools [49:05]
• Serg's productivity tips [50:25]
• Why Serg pursued a Master's in Data Science [52:25]
Additional materials: www.superdatascience.com/539
1/11/2022 • 1 hour, 1 minute, 36 seconds
SDS 538: Daily Habit #1: Track Your Habits
In this episode, Jon shares his "life-changing" habit tracking system that has allowed him to achieve more, create more structure within his day and cut out bad habits.
Additional materials: www.superdatascience.com/538
1/7/2022 • 7 minutes, 4 seconds
SDS 537: Data Science Trends for 2022
Sadie St. Lawrence returns to discuss the biggest data science trends that are set to take over the industry in 2022.
In this episode you will learn:
• A look back at data science trends for 2021 [4:03]
• Micro and macro data science trends for 2022 [12:30]
• AutoML tools [15:20]
• The social implications of deepfakes [21:21]
• Scalable AI [38:40]
• Macro data science trends for 2022 [42:45]
• The impact of the remote-working economy in data science [43:21]
• Blockchain in data science [50:28]
• Data literacy of the global workforce [1:01:07]
Additional materials: www.superdatascience.com/537
1/4/2022 • 1 hour, 16 minutes, 9 seconds
SDS 536: What I Learned in 2021
Jon goes over his five biggest learnings from 2021 and what he hopes to work on in 2022.
Additional materials: www.superdatascience.com/536
12/31/2021 • 13 minutes, 28 seconds
SDS 535: How to Found, Grow, and Sell a Data Science Start-up
Prolific data science entrepreneur and Y Combinator alum Austin Ogilvie (Laika, Yhat) joins Jon Krohn for a revealing look into his journey of starting, growing, and selling a data science startup. From liberal arts graduate to twice successful technical founder, take a seat and learn from the best.
In this episode you will learn:
• The story behind the naming of Yhat and its early beginnings [5:10]
• Austin and Yhat's experience at Y Combinator [19:00]
• The benefits of being a technical founder [25:00]
• From arts degree graduate to successful tech entrepreneur [27:00]
• Austin's latest venture, Laika [39:30]
• The tools that Austin uses day-to-day [47:30]
• Unity gaming environment [49:58]
• What makes a great data scientist [56:23]
Additional materials: www.superdatascience.com/535
12/28/2021 • 1 hour, 9 minutes, 42 seconds
SDS 534: A Holiday Greeting
Jon sends a holiday greeting to all listeners.
Additional materials: www.superdatascience.com/534
12/24/2021 • 1 minute, 50 seconds
SDS 533: Fusion Energy, Cancer Proteomics, and Massive-Scale Machine Vision — with Dr. Brett Tully
Dr. Brett Tully joins us on the podcast to discuss his work as Director of AI Output Systems at Nearmap and his previous research in biomedical topics and nuclear fusion.
In this episode you will learn:
• What is Nearmap? [5:22]
• What is a Director of AI Output Systems? [7:51]
• A case study [20:35]
• MLOps at Nearmap [26:37]
• Brett’s day-to-day and what he looks for in hires [40:19]
• Brett’s academic and research history [53:30]
• Brett’s work in nuclear fusion and predictions for the technology [1:04:48]
• The tools Brett used in his research [1:26:34]
• ProCan project [1:34:27]
• Brett’s prediction for future AI applications [1:48:30]
Additional materials: www.superdatascience.com/533
12/21/2021 • 1 hour, 59 minutes, 3 seconds
SDS 532: Mutable vs Immutable Conditions
Jon discusses one helpful framework when it comes to problem-solving and how data scientists are uniquely positioned to employ this technique.
Additional materials: www.superdatascience.com/532
12/17/2021 • 4 minutes, 57 seconds
SDS 531: Data Science at the Command Line
Jeroen Janssens joins us on the podcast to discuss his book on utilizing the command line for data science and the importance of polyglot data science work.
In this episode you will learn:
• The genesis of Jeroen’s book [3:24]
• Data Science at the Command Line [8:55]
• Creating your own command line tools [22:07]
• Polyglot data scientist [24:29]
• Data Science Workshops [27:01]
• Jeroen’s PhD research [30:38]
Additional materials: www.superdatascience.com/531
12/14/2021 • 50 minutes, 30 seconds
SDS 530: Ten A.I. Thought Leaders to Follow (on Twitter)
Jon details his top ten AI thought leaders, hoping that his suggestions prove valuable to you in your data science journey.
Additional materials: www.superdatascience.com/530
12/10/2021 • 5 minutes, 23 seconds
SDS 529: A.I. Robotics at Home
Dave Niewinski joins us to discuss his prolific work in robotics both as a consultant and a popular YouTuber.
In this episode you will learn:
• Dave’s Armoury [4:44]
• Robotic cornhole tournament [12:33]
• Dave’s many robots [14:25]
• Dave’s idea process [28:51]
• Future robots [31:43]
• Dave’s consulting business [33:27]
• Tools Dave likes to use [37:05]
• How did Dave get started in this line of work? [38:50]
• Dave’s advice to people who want to get into robotics [41:18]
• What is Dave excited about in the future? [45:38]
Additional materials: www.superdatascience.com/529
12/7/2021 • 53 minutes, 17 seconds
SDS 528: The Normal Anxiety of Content Creation
Jon explores his personal anxieties as a content creator to encourage fellow creators to keep sharing their knowledge.
Additional materials: www.superdatascience.com/528
12/3/2021 • 3 minutes, 50 seconds
SDS 527: Automating Data Analytics
Peter Bailis joins the podcast to discuss the work of his company that solves complex commercial problems through automated data analysis.
In this episode you will learn:
• Meaning of the name Sisu [3:08]
• What Sisu does [4:45]
• Sisu and the data science stack [17:00]
• Going from academia to startups [22:37]
• What Sisu looks for when hiring [28:57]
• Peter’s favorite tools [32:40]
• Peter’s academic research [45:02]
Additional materials: www.superdatascience.com/527
11/30/2021 • 1 hour, 1 minute, 15 seconds
SDS 526: The Highest-Paying Data Frameworks
I finish up our three-part series on the results of the O’Reilly Survey, looking at the highest-paying data frameworks.
Additional materials: www.superdatascience.com/526
11/26/2021 • 6 minutes, 9 seconds
SDS 525: Hurdling Over Data Career Obstacles
Karen Jean-Francois joins us to discuss how she wants to empower her team members and a wider audience of data scientists battling imposter syndrome.
In this episode you will learn:
• Karen’s background as a hurdler [4:42]
• Women in Data Podcast [10:32]
• Cardlytics [19:04]
• Karen’s background and current career [22:55]
• Karen’s favorite tools [31:29]
• Karen’s balance of fitness and work [34:45]
• The biggest challenge of Karen’s career [47:09]
• Advancement in data [54:13]
• What is Karen most excited about? [59:40]
Additional materials: www.superdatascience.com/525
11/23/2021 • 1 hour, 8 minutes, 59 seconds
SDS 524: The Highest-Paying Data Tools
In this episode, I go over the highest-paying data tools based on the O’Reilly survey.
Additional materials: www.superdatascience.com/524
Wes McKinney joins us to discuss the history and philosophy of pandas and Apache Arrow as well as his continued work in open source tools.
In this episode you will learn:
• History of pandas [7:29]
• The trends of R and Python [23:33]
• Python for Data Analysis [25:58]
• pandas updates and community [30:10]
• Apache Arrow [41:50]
• Voltron Data [55:10]
• Origin of Wes’s project names [1:08:14]
• Wes’s favorite tools [1:09:46]
• Audience Q&A [1:15:34]
Additional materials: www.superdatascience.com/523
11/16/2021 • 1 hour, 27 minutes, 35 seconds
SDS 522: Data Tools vs. Data Platforms
I provide you with some quick definitions of data tools vs data platforms to prep us for deep dives in future episodes.
Additional materials: www.superdatascience.com/522
11/12/2021 • 3 minutes, 25 seconds
SDS 521: Skyrocket Your Career by Sharing Your Writing
Khuyen Tran joins us to discuss her work as a prolific technical writer and undergraduate data science student.
In this episode you will learn:
• Khuyen’s online writing [4:00]
• Book writing [8:50]
• How you can increase your engagement [13:49]
• Khuyen’s work with Towards Data Science and NVIDIA [19:01]
• Ocelot Consulting [24:08]
• Khuyen’s undergrad work [32:12]
• Audience questions [47:00]
Additional materials: www.superdatascience.com/521
11/9/2021 • 1 hour, 1 minute, 52 seconds
SDS 520: The Highest-Paying Programming Languages for Data Scientists
I take a look at the results of O’Reilly’s survey on salaries for data scientists in 2021.
Additional materials: www.superdatascience.com/520
11/5/2021 • 5 minutes, 23 seconds
SDS 519: A.I. for Good
James Hodson joins us to discuss his philosophy and work at A.I. For Good and how they aim to promote sustainability and A.I. use for social issues.
In this episode you will learn:
• AI for Good [5:17]
• Founding of AI for Good [8:50]
• Case studies [14:58]
• How you can get involved [46:29]
• Skills James looks for in hires [50:39]
Additional materials: www.superdatascience.com/519
11/2/2021 • 1 hour, 8 minutes, 12 seconds
SDS 518: Fail More
This week, I provide a short but important bit of advice on failure.
Additional materials: www.superdatascience.com/518
10/29/2021 • 2 minutes, 18 seconds
SDS 517: Courses in Data Science and Machine Learning
Sadie St. Lawrence talks in-depth about her extensive work as a data science educator through both online and collegiate courses as well as her organization for diversifying data science careers.
In this episode you will learn:
• Sadie’s education work in SQL [4:13]
• The popularity of Sadie’s course [13:32]
• Sadie’s forthcoming machine learning certificate course [16:29]
• Women in Data [25:32]
• Sadie’s non-technical background [36:17]
• NFTs and VR [46:41]
Additional materials: www.superdatascience.com/517
10/26/2021 • 55 minutes, 39 seconds
SDS 516: Does Caffeine Hurt Productivity? (Part 3: Scientific Literature)
In this episode, I finish up my saga into the effects of caffeine on productivity.
Additional materials: www.superdatascience.com/516
10/22/2021 • 7 minutes, 24 seconds
SDS 515: Accelerating Impact through Community — with Chrys Wu
Chrys Wu joins us to discuss her community organizations, her tips, and her recommended resources for building data science communities for impact.
In this episode you will learn:
• The world of K-Pop [4:07]
• Chrys’s talk at the R Conference [8:56]
• Write/Speak/Code [14:05]
• Hacks/Hackers [21:58]
• Tips on developing data communities [27:22]
Additional materials: www.superdatascience.com/515
10/19/2021 • 38 minutes, 54 seconds
SDS 514: Does Caffeine Hurt Productivity? (Part 2: Experimental Results)
In this episode, I dive into the nuts and bolts of the data from my experiment on caffeine and productivity.
Additional materials: www.superdatascience.com/514
10/15/2021 • 8 minutes, 25 seconds
SDS 513: Transformers for Natural Language Processing
Denis Rothman joins us to discuss his writing work in natural language processing, explainable AI, and more!
In this episode you will learn:
• What are transformers and their applications? [7:54]
• Denis’s book on explainable AI [25:08]
• AI by Example [35:53]
• LinkedIn audience questions [42:00]
Additional materials: www.superdatascience.com/513
10/12/2021 • 54 minutes, 8 seconds
SDS 512: Does Caffeine Hurt Productivity? (Part 1)
I dive into a personal experiment to test my productivity relative to my coffee intake and whether caffeine is actually hurting it.
Additional materials: www.superdatascience.com/512
10/8/2021 • 5 minutes, 56 seconds
SDS 511: Data Science for Private Investing — LIVE with Drew Conway
Drew Conway joins us on the first live podcast to discuss his work in private investing and how data science figures into and improves his work.
In this episode you will learn:
• The R Conference and NYHackR [6:33]
• Machine Learning for Hackers [20:17]
• Two Sigma and Drew’s work [28:27]
• Drew’s team structure at Two Sigma [35:12]
• Audience Q&A [46:27]
Additional materials: www.superdatascience.com/511
10/5/2021 • 1 hour, 9 minutes, 6 seconds
SDS 510: Deep Reinforcement Learning
In this episode, I dive into the world of reinforcement learning and deep reinforcement learning and the benefits of both.
Additional materials: www.superdatascience.com/510
10/1/2021 • 7 minutes, 13 seconds
SDS 509: Accelerating Start-up Growth with A.I. Specialists
Parinaz Sobhani joins us to discuss the cutting-edge work of Georgian, a collaborative company that helps start-ups implement and scale machine learning and AI.
In this episode you will learn:
• Parinaz’s work at Georgian [5:35]
• Use cases of Georgian’s work [14:35]
• Tools and approaches Parinaz uses [32:27]
• Environmental concerns of machine learning [42:52]
• Hiring at Georgian and what Parinaz looks for [48:18]
• How did Parinaz become interested in this? [56:19]
• Fairness in AI [1:09:01]
Additional materials: www.superdatascience.com/509
9/28/2021 • 1 hour, 21 minutes, 11 seconds
SDS 508: Building Your Ant Hill
In this episode, I share my grandmother’s interesting view on the process of working and going through life.
Additional materials: www.superdatascience.com/508
9/24/2021 • 3 minutes, 29 seconds
SDS 507: Bayesian Statistics
Rob Trangucci joins us to discuss his work and study in Bayesian statistics and how he applies it to real-world problems.
In this episode you will learn:
• Getting Rob on the show [8:12]
• Stan [9:34]
• Gradients [18:15]
• What is Bayesian statistics? [23:05]
• Multi-modal deep learning [45:20]
• Stan package [53:46]
• Applications of Bayesian stats [1:09:47]
• The day-to-day of a PhD in stats [1:21:56]
• What does the future hold? [1:42:37]
Additional materials: www.superdatascience.com/507
9/21/2021 • 1 hour, 55 minutes, 3 seconds
SDS 506: Supervised vs Unsupervised Learning
In this episode, I continue with last week’s theme and discuss the differences between supervised and unsupervised learning.
Additional materials: www.superdatascience.com/506
9/17/2021 • 9 minutes, 16 seconds
SDS 505: From Data Science to Cinema
Hadelin de Ponteves joins us to discuss his latest educational work and how his skills as a data science educator helped him start his career in acting.
In this episode you will learn:
• What has Hadelin been up to? [4:27]
• Hadelin’s cinema career and data science crossover [16:02]
• Sleep for productivity [27:27]
• How did Hadelin decide to undertake this? [32:26]
• Bollywood vs Hollywood [37:26]
Additional materials: www.superdatascience.com/505
9/14/2021 • 46 minutes, 9 seconds
SDS 504: Classification vs Regression
In this episode, I give a quick introduction to subcategories of supervised learning problems.
Additional materials: www.superdatascience.com/504
9/10/2021 • 5 minutes, 44 seconds
SDS 503: Deep Reinforcement Learning for Robotics
Pieter Abbeel joins us to discuss his work as an academic and entrepreneur in the field of AI robotics and what the future of the industry holds.
In this episode you will learn:
• How does Pieter do it all? [5:45]
• Pieter’s exciting areas of research [12:30]
• Research application at Covariant [32:27]
• Getting into AI robotics [42:18]
• Traits of good AI robotics apprentices [49:38]
• Valuable skills [56:40]
• What Pieter hopes to look back on [1:04:30]
• LinkedIn Q&A [1:06:51]
Additional materials: www.superdatascience.com/503
9/7/2021 • 1 hour, 18 minutes, 6 seconds
SDS 502: Managing Imposter Syndrome
In this episode, I explore a common issue plaguing people across fields: imposter syndrome.
Additional materials: www.superdatascience.com/502
9/3/2021 • 4 minutes, 50 seconds
SDS 501: Statistical Programming with Friends
Jared Lander joins us to discuss his work as an R meetup organizer, the upcoming virtual R Conference, and his work as a consultant for a variety of companies from metal workers to professional football teams.
In this episode you will learn:
• Jared’s R meetups and our professional history [3:27]
• NYHackR [6:42]
• The R Conference [13:25]
• R for Everyone [18:55]
• Lander Analytics [22:10]
• Job openings at Lander Analytics [25:04]
• R vs. Python [29:15]
• The importance of pizza in Jared’s life [32:19]
Additional materials: www.superdatascience.com/501
8/31/2021 • 41 minutes, 20 seconds
SDS 499: Data Meshes and Data Reliability
Barr Moses joins us to discuss the importance of data reliability for pipelines and how companies can achieve data mesh.
In this episode you will learn:
• Data meshes [4:25]
• Self-serve data reliability [15:36]
• How Monte Carlo helps data uptime [21:13]
• How to build an effective data science team [26:50]
• LinkedIn Q&A [31:50]
Additional materials: www.superdatascience.com/499
8/24/2021 • 53 minutes, 51 seconds
SDS 500: Yoga Nidra with Jes Allen
In this very special episode, we delve into a live Yoga Nidra practice with Jes Allen and go over how you can open up to consciousness through yoga practice.
In this episode you will learn:
• What Yoga means [3:40]
• Jes’s current work as a yoga practitioner [10:00]
• How to find Jes online [22:31]
• The Yoga Nidra practice [27:09]
• Coming out of the practice [54:50]
Additional materials: www.superdatascience.com/500
8/24/2021 • 1 hour, 29 seconds
SDS 498: How Only Beginners Know Everything
In this episode, I dive into a recurring pattern I’ve noticed where beginners, myself included, think they’re more skilled and experienced than they really are.
Additional materials: www.superdatascience.com/498
8/20/2021 • 5 minutes, 53 seconds
SDS 497: Maximizing the Global Impact of Your Career
Benjamin Todd joins us to discuss his work helping professionals maximize their career capital, the top skills to learn across professions, and more.
In this episode you will learn:
• How Benjamin helped me become a data scientist [6:56]
• How did 80,000 Hours come about? [9:39]
• The impact of 80,000 Hours [14:46]
• Funding [17:23]
• Where does the name come from? [23:32]
• What kind of advice does Benjamin give to people? [25:21]
• How data scientists can make an impact [42:04]
• How can someone strategize about their career? [1:02:53]
• Top skills that everyone should learn [1:05:49]
Additional materials: www.superdatascience.com/497
8/17/2021 • 1 hour, 19 minutes, 5 seconds
SDS 496: 2040: A Brain-Computer Interface Story
In this episode, you’ll enjoy a fictional narrative I’ve titled “2040: A Brain-Computer Interface Story”.
Additional materials: www.superdatascience.com/496
8/13/2021 • 3 minutes, 56 seconds
SDS 495: Successful AI Projects and AI Startups
Greg Coquillo joins us to discuss his work on ROI for startups and the best ways to make the most of your company’s AI investment.
In this episode you will learn:
• Our connection through Harpreet’s happy hours and DSGO [4:48]
• Greg’s content on LinkedIn [6:40]
• The scope of Greg’s work [9:25]
• Making the most out of AI [16:05]
• LinkedIn Q&A [20:00]
• Quantum machine learning [32:06]
Additional materials: www.superdatascience.com/495
8/10/2021 • 49 minutes, 42 seconds
SDS 494: How to Instantly Appreciate Being Alive
In this episode, I talk about an interesting thought experiment that helps you appreciate your existence.
Additional materials: www.superdatascience.com/494
8/6/2021 • 2 minutes, 50 seconds
SDS 493: Bringing Data to the People
Anjali Shrivastava joins us to discuss her data science degree and her content creation efforts to bring data science to the people.
In this episode you will learn:
• Anjali’s studies [2:00]
• Anjali’s YouTube channel [11:57]
• The content creation process [17:58]
• Yoga during the pandemic [21:34]
• Anjali as a writer [24:38]
• Anjali’s dual degrees [31:28]
• Anjali’s previous data science roles [43:04]
• Anjali’s first full-time data job [51:12]
• Anjali’s hopes for the future [55:29]
Additional materials: www.superdatascience.com/493
8/3/2021 • 1 hour, 2 minutes, 41 seconds
SDS 492: The World is Awful (and it's Never Been Better)
In this episode, I discuss the changing child mortality rate as evidence of how much better the world is and how much better it could be.
Additional materials: www.superdatascience.com/492
7/30/2021 • 5 minutes, 48 seconds
SDS 491: R in Production
Veerle van Leemput joins us to make the case for why you should be using R for production.
In this episode you will learn:
• Our shared powerlifting passion [2:47]
• The stigma of using R [12:02]
• What does Analytic Health do? [13:55]
• How Analytic Health uses R [19:08]
• Tidyverse [34:44]
• Tools for API creation [37:09]
Additional materials: www.superdatascience.com/491
7/27/2021 • 43 minutes, 33 seconds
SDS 490: Say No to Pie Charts
In this episode, I discuss why you should avoid the visually pleasing but flawed pie chart.
Additional materials: www.superdatascience.com/490
7/23/2021 • 1 minute, 59 seconds
SDS 489: Monetizing Machine Learning
Vin Vashishta joins us to discuss his AI consulting work and his philosophy on AI strategy for monetization.
In this episode you will learn:
• V-Squared [4:59]
• Vin’s online content [17:18]
• Low-code/no-code in data science [25:33]
• Top five gap skills [35:19]
• Data sets for insights on consumers and targeting [40:26]
• Are there socially beneficial data science and machine learning applications? [43:16]
• The most difficult data science problem Vin ever faced [50:39]
Additional materials: www.superdatascience.com/489
7/20/2021 • 1 hour, 13 seconds
SDS 488: The Price of Your Attention
In this episode, I discuss the simple and cheap ways you can buy yourself more time during the day.
Additional materials: www.superdatascience.com/488
7/16/2021 • 3 minutes, 36 seconds
SDS 487: Fixing Dirty Data
Susan Walsh joins us to discuss the importance of data cleaning and normalization and how clean procurement data can save companies money.
In this episode you will learn:
• Susan’s “COAT” system [7:16]
• The Classification Guru [15:39]
• Case studies [22:46]
• Susan’s book [30:26]
Additional materials: www.superdatascience.com/487
7/13/2021 • 43 minutes, 12 seconds
SDS 486: The History of Calculus
In this episode, I go over the world history of calculus and how we still use these techniques today.
Additional materials: www.superdatascience.com/486
7/9/2021 • 6 minutes, 20 seconds
SDS 485: Financial Data Engineering
Doug Eisenstein joins us for a great and in-depth conversation on data engineering in the financial sector.
In this episode you will learn:
• The founding of Advanti [4:37]
• Aristos and solution products [16:45]
• The kinds of financial industries and how Doug helps [26:25]
• Entity Extraction [34:27]
• Temporality data [44:27]
• How to work with Doug [58:19]
Additional materials: www.superdatascience.com/485
7/6/2021 • 1 hour, 5 minutes, 33 seconds
SDS 484: Algorithm Aversion
In this episode, I discuss interesting research on why humans are so quick to lose faith in algorithms.
Additional materials: www.superdatascience.com/484
7/2/2021 • 2 minutes, 55 seconds
SDS 483: Setting Yourself Apart in Data Science Interviews
Andrew Jones joins us to discuss data science interviews and how you can maximize your chances through your interview performance, your resume, and more!
In this episode you will learn:
• Data Science Infinity [5:40]
• “The Essential AI and Data Science Handbook for Recruitment” [17:40]
• How can aspiring data scientists set themselves apart? [21:30]
• What skillset should data scientists have? [34:36]
• Should data scientists be trying to be data engineers? [41:14]
• How can organizations ensure data science projects are a success? [50:50]
Additional materials: www.superdatascience.com/483
6/29/2021 • 1 hour, 4 minutes, 27 seconds
SDS 482: The Continuous Calendar
In this episode, I talk about the advantages of using a continuous calendar.
Additional materials: www.superdatascience.com/482
6/25/2021 • 4 minutes, 48 seconds
SDS 481: Performance Marketing Analytics
Kris Tait joins us to discuss the vast world of digital performance marketing and how automation, data, and optimization play an important role.
In this episode you will learn:
• What is performance marketing? [3:29]
• How can advertisers take advantage of these tactics? [13:04]
• The importance of quality data in performance marketing [20:19]
• Human value in performance marketing [25:30]
• How does Croud optimize? [29:05]
• What are the best KPIs in this industry? [34:02]
• Roles available at Croud now [39:11]
• Typical tools at Croud [42:43]
• What clients work best for Croud? [48:56]
Additional materials: www.superdatascience.com/481
6/22/2021 • 58 minutes, 48 seconds
SDS 480: Top Five Resume Tips
In this episode, I go over my top 5 tips for refining your perfect data science resume.
Additional materials: www.superdatascience.com/480
6/18/2021 • 8 minutes
SDS 479: Knowledge Graphs
Maureen Teyssier joins us to discuss the cutting-edge work Reonomy is doing in commercial property real estate and her views and tips on building a great data science team.
In this episode you will learn:
• Maureen’s work with Reonomy [5:40]
• Knowledge graphs and use cases [7:35]
• Other tools Reonomy uses [18:58]
• What Maureen looks for in potential hires, soft skills and hard skills [26:28]
• Hiring at Reonomy [41:40]
• Maureen’s tips for growing a data science team [48:55]
• Tools to transition from academia to industry [52:45]
Additional materials: www.superdatascience.com/479
6/15/2021 • 1 hour, 12 minutes, 38 seconds
SDS 478: Five Keys to Success
In this episode, I go over my 5 keys to success to tackle any goal.
Additional materials: www.superdatascience.com/478
6/11/2021 • 5 minutes, 33 seconds
SDS 477: How to Thrive as an Early-Career Data Scientist
Sidney Arcidiacono joins us to discuss her studies and work at Make School and her interest in utilizing AI for healthcare, as well as her tips and strategies for becoming a successful early-career data scientist.
In this episode you will learn:
• What is Make School? [5:00]
• Sidney’s interest in AI and computer science [10:56]
• Graph theory and graph convolutional neural networks [19:53]
• What tools does Sidney use for her work? [31:16]
• Sidney’s internship [36:52]
• How other beginners can get involved in data science [38:12]
• Sidney’s goals [41:57]
Additional materials: www.superdatascience.com/477
6/8/2021 • 50 minutes, 28 seconds
SDS 476: Peer-Driven Learning
In this episode, I discuss the amazing benefits of implementing peer-driven learning in your professional life.
Additional materials: www.superdatascience.com/476
6/4/2021 • 2 minutes, 58 seconds
SDS 475: The 20% of Analytics Driving 80% of ROI
David Langer joins us to discuss his work as a data analytics educator and his beliefs in the use of Excel, SQL and R in analytics work.
In this episode you will learn:
• Intro to Dave on Data [6:50]
• The 20% of analytics that drives 80% of ROI [11:04]
• The benefits of SQL [19:15]
• The uses of R [24:50]
• Machine learning [34:15]
Additional materials: www.superdatascience.com/475
6/1/2021 • 44 minutes, 39 seconds
SDS 474: The Machine Learning House
In this episode, I discuss the architecture of a “machine learning house”, representing the skills and learnings you can use as foundations to build your data science career.
Additional materials: www.superdatascience.com/474
5/28/2021 • 5 minutes, 44 seconds
SDS 473: Machine Learning at NVIDIA
Anima Anandkumar joins us to discuss her work as a researcher in machine learning at NVIDIA and a professor at CalTech, and how they often go hand-in-hand and inform each other.
In this episode you will learn:
• Anima’s recent discovery of yoga [5:20]
• How does Anima balance her work? [12:25]
• Applications of Anima’s work [14:45]
• Tensors [22:55]
• Anima’s favorite NVIDIA projects [35:35]
• What tools does NVIDIA use? [41:55]
• CalTech interdisciplinary science [47:41]
• The path to generalized artificial intelligence [57:19]
• The skills to have to get into this field [1:00:27]
• LinkedIn questions for Anima [1:07:03]
Additional materials: www.superdatascience.com/473
5/25/2021 • 1 hour, 13 minutes, 10 seconds
SDS 472: The Learning Never Stops (so Relax)
In this episode, I share a note I received from a student who expressed his thoughts on the learning that never stops as he goes through his data science career.
Additional materials: www.superdatascience.com/472
5/21/2021 • 3 minutes, 23 seconds
SDS 471: 99 Days to Your First Data Science Job
Kirill Eremenko returns to the SDS podcast as a guest to debunk common myths you may believe about getting a data science job.
In this episode you will learn:
• What has Kirill been up to? [3:48]
• The genesis of the 99-days challenge [5:27]
• 5 myths about pursuing a data science career [15:49]
• First data science jobs [1:00:53]
• 5 components for success [1:08:19]
Additional materials: www.superdatascience.com/471
5/18/2021 • 1 hour, 40 minutes, 45 seconds
SDS 470: My Favorite Books
In this episode, I follow up on the popular book recommendation portion of the podcast with my own list of favorite books.
Additional materials: www.superdatascience.com/470
5/14/2021 • 6 minutes, 4 seconds
SDS 469: Learning Deep Learning Together
Konrad Körding joins us to discuss his work in educating the next generation in deep learning and his views on the importance of causality in deep learning research.
In this episode you will learn:
• Konrad’s academic background [3:54]
• Neuromatch Academy [5:23]
• Artificial general intelligence [35:02]
• Defining deep learning [41:24]
• Symbol representation [44:12]
• Konrad’s career journey [47:25]
• What other skills should you develop for the future? [52:46]
• What is the future of intelligence in our timeline? [56:37]
Additional materials: www.superdatascience.com/469
5/11/2021 • 1 hour, 11 minutes, 53 seconds
SDS 468: The History of Data
In this episode, I tackle another historical topic: the history of data.
Additional materials: www.superdatascience.com/468
5/7/2021 • 7 minutes, 46 seconds
SDS 467: High-Impact Data Science Made Easy
Noah Gift joins us to discuss how he believes data science urgency and the end of hierarchies will change the world for the better.
In this episode you will learn:
• Catch up with Noah [2:50]
• Educational options to pursue in data science [13:09]
• Outside university education [24:06]
• Noah as a prolific author [28:15]
• Urgent applications of technology [37:34]
• Noah’s income streams color code [48:38]
• How to harness our free time to solve big problems [54:13]
• Noah’s Coursera course [1:09:12]
Additional materials: www.superdatascience.com/467
5/4/2021 • 1 hour, 16 minutes, 48 seconds
SDS 466: Good vs. Great Data Scientists
In this episode, I go over what separates a good data scientist from a great one in skills, practices, and approach.
Additional materials: www.superdatascience.com/466
4/30/2021 • 7 minutes, 55 seconds
SDS 465: Analytics for Commercial and Personal Success
Konrad Kopczynski joins us to discuss how data, tracking, analytics, and key performance indicators can help your professional and personal development.
In this episode you will learn:
• What does Konrad do? [3:40]
• Tools and techniques used in Impakt Advisors [10:35]
• Impakt’s unique hiring model [18:53]
• How does Impakt manage remote work? [21:36]
• Konrad’s professional history and daily structure [28:42]
• Konrad’s Ironman triathlon [44:11]
• Konrad’s years-long project on presidential biographies [47:46]
Additional materials: www.superdatascience.com/465
4/27/2021 • 59 minutes, 4 seconds
SDS 464: A.I. vs Machine Learning vs Deep Learning
In this episode, I tackle three often conflated terms - AI, machine learning, and deep learning - to shine some light on what exactly they are.
Additional materials: www.superdatascience.com/464
4/23/2021 • 7 minutes, 14 seconds
SDS 463: Time Series Analysis
Matt Dancho joins us to discuss his various packages for time series analysis and his courses on the topic through his company Business Science.
In this episode you will learn:
• How Matt got into time series library development [4:22]
• Business Science [7:00]
• R Shiny [9:36]
• Matt’s 6 time series models [14:11]
• Timetk [15:02]
• Modeltime [29:32]
• Gluon package [36:04]
• Modeltime Ensemble [43:12]
• Modeltime H2O [45:22]
• Modeltime Resample [48:10]
Additional materials: www.superdatascience.com/463
4/20/2021 • 55 minutes, 51 seconds
SDS 462: It Could Be Even Better
In this episode, I discuss taking a positive approach to the good things that happen in life, rather than focusing on potential negative outcomes.
Additional materials: www.superdatascience.com/462
4/16/2021 • 4 minutes, 25 seconds
SDS 461: MLOps for Renewable Energy
Sam Hinton joins us to discuss what he has been doing since his work on COVID-19 data pipelines: he now works in renewable energy, applying ML and MLOps to the industry.
In this episode you will learn:
• Catching up with Sam [3:05]
• Updates on the COVID-19 data pipelines [7:07]
• Sam’s current work at Arenko [10:41]
• Sam’s stint on Survivor, PhD, and his software engineering background [16:32]
• Machine learning in renewable energy [35:23]
• Sam’s day-to-day tools [49:33]
• How can listeners utilize MLOps? [53:08]
• Sam’s forthcoming novel [59:05]
Additional materials: www.superdatascience.com/461
4/14/2021 • 1 hour, 10 minutes, 25 seconds
SDS 460: The History of Algebra
In this episode, I talk about the ancient history of algebra, an important component of data science today.
Additional materials: www.superdatascience.com/460
4/9/2021 • 11 minutes, 4 seconds
SDS 459: Tackling Climate Change with ML
Vince Petaccio joins us to discuss how he sees data science, ML, and AI making positive impacts in the fight against climate change.
In this episode you will learn:
• Where in the world is Vince? [2:08]
• Vince’s interest in climate science [4:33]
• The Citizens’ Climate Lobby [9:12]
• Where data science comes in [13:28]
• Risks of relying on tools [31:54]
• How can you make an impact? [37:28]
Additional materials: www.superdatascience.com/459
4/7/2021 • 46 minutes, 17 seconds
SDS 458: Behind the Scenes
In this week’s episode, I take you behind the scenes of our video tutorial productions to see what goes into making our tutorials.
Additional materials: www.superdatascience.com/458
4/2/2021 • 4 minutes, 1 second
SDS 457: Landing Your Data Science Dream Job
Harpreet Sahota joins us to discuss his data science mentorship work outside his day job and how you can land your dream job.
In this episode you will learn:
• Harpreet’s current life and location [2:25]
• Data Community Content Creator Awards [8:37]
• The Artists of Data Science Podcast [14:46]
• Data Science Dream Job [24:18]
• Harpreet’s day job at Price Industries [30:48]
• Coming into data science from a non-data background [40:55]
• Tools and skills to know [47:57]
Additional materials: www.superdatascience.com/457
4/1/2021 • 1 hour, 1 minute, 23 seconds
SDS 456: The Pomodoro Technique
In this week’s episode, I talk about one of my favorite time management techniques: the Pomodoro technique.
Additional materials: www.superdatascience.com/456
3/26/2021 • 6 minutes, 51 seconds
SDS 455: Legal Tech, Powered by Machine Learning
Horace Wu joins us to discuss his work on Syntheia, a unique product that helps sift through massive amounts of legal data to augment the capacities and function of law firms.
In this episode you will learn:
• Horace’s life and work in New York City [5:00]
• Syntheia and Horace’s role there [6:25]
• Horace’s background [12:07]
• Nearmap [16:35]
• Syntheia NLP use cases [21:46]
• Design, coding, and the team [34:19]
• What skills does one need for this field? [41:41]
• What would Horace do differently and what is he excited for? [46:15]
Additional materials: www.superdatascience.com/455
3/24/2021 • 58 minutes, 21 seconds
SDS 454: The Staggering Pace of Progress Part 2
In this episode, I continue my discussion about the quick-paced growth of technology and how it impacts different fields.
Additional materials: www.superdatascience.com/454
3/19/2021 • 6 minutes, 56 seconds
SDS 453: Big Global Problems Worth Solving with Machine Learning
Stephen Welch joins us to go over his year-end 2020 list of 10 important questions and pain points that machine learning can help address.
In this episode you will learn:
• Welch Labs on YouTube [4:54]
• What Stephen’s been up to [7:56]
• Stephen’s 2020 year-end blog post [10:11]
• Stephen’s reflections on 10 areas worth focusing on [16:25]
Additional materials: www.superdatascience.com/453
3/17/2021 • 1 hour, 21 minutes, 55 seconds
SDS 452: The Staggering Pace of Progress
In this week’s episode, I discuss how technology propelled the recruitment industry forward and continues to do so today.
Additional materials: www.superdatascience.com/452
3/12/2021 • 5 minutes, 51 seconds
SDS 451: Translating PhD Research into ML Applications
Dan Shiebler joins us to discuss his category theory Ph.D. program, his full-time job at Twitter, and how the two cross over and combine in his overall data work.
In this episode you will learn:
• Dan’s neuroscience undergrad and MATLAB [4:12]
• Dan’s Ph.D. timeline and research [14:01]
• How to start a Ph.D. while working full time [22:45]
• Dan’s work at TrueMotion and label data [30:39]
• Dan’s title and role at Twitter [39:15]
• Specific projects at Twitter [44:09]
• What skills someone should bring to a Twitter job interview [52:06]
• What machine learning approaches will be important in the future? [1:00:38]
Additional materials: www.superdatascience.com/451
3/11/2021 • 1 hour, 16 minutes, 13 seconds
SDS 450: Yoga Nidra
This week, Jon talks with Steve Fazzari about the physical and emotional benefits of practicing Yoga Nidra.
Additional materials: www.superdatascience.com/450
3/5/2021 • 30 minutes, 5 seconds
SDS 449: Fairness in A.I.
Ayodele Odubela joins us to discuss fairness in AI and how we can work towards a more equitable and transparent world of data science and machine learning.
In this episode you will learn:
• Comet ML [3:22]
• What is a data science evangelist? [7:08]
• FullyConnected [12:04]
• Imposter Syndrome and Ayodele’s book [15:57]
• What Ayodele wishes she had learned in grad school [20:25]
• Uncovering Bias in Machine Learning [27:00]
• Where can we effect this positive change in fairness? [31:08]
• The potential for a rosy future [49:20]
• Ayodele’s LinkedIn Learning course [52:24]
Additional materials: www.superdatascience.com/449
3/4/2021 • 59 minutes, 33 seconds
SDS 448: How to be a Data Science Leader
This week, I answer your questions about how to take yourself from data science practitioner to data science leader.
Additional materials: www.superdatascience.com/448
2/26/2021 • 5 minutes, 21 seconds
SDS 447: Commercial ML Opportunities Lie Everywhere
Michael Segala joins us to discuss how machine learning can provide creative and novel solutions to longstanding problems in both the private and public sectors.
In this episode you will learn:
• SFL Scientific [4:20]
• SFL’s example work [10:55]
• Public sector vs private sector work [20:28]
• Michael’s day-to-day [30:18]
• What is Michael looking for in the people he hires? [33:38]
• Michael’s career journey [41:39]
• What is Michael excited about for the future? [48:38]
Additional materials: www.superdatascience.com/447
2/25/2021 • 58 minutes, 15 seconds
SDS 446: Getting Started in Machine Learning
This week I answer your questions about machine learning and how to educate yourself further in the field.
Additional materials: www.superdatascience.com/446
2/19/2021 • 6 minutes, 36 seconds
SDS 445: Conversational A.I.
Sinan Ozdemir joins us to share his work in conversational AI and what it takes to keep chatbots up to date and functional in an ever-changing world.
In this episode you will learn:
• Kylie.ai under Directly [4:51]
• Sinan’s day-to-day work and tools [10:45]
• Use cases [18:27]
• AutoML’s role in these processes [21:55]
• What hard or soft skills are needed for this work? [29:32]
• Sinan’s background in teaching [34:58]
• Sinan’s history in pure math and applied math [39:44]
• Sinan’s math tattoos [43:48]
Additional materials: www.superdatascience.com/445
2/18/2021 • 54 minutes, 42 seconds
SDS 444: Future-Proofing Your Career
In today’s episode, I answer your questions on how to best future-proof your data science career in AI, AutoML, and model interpretability.
Additional materials: www.superdatascience.com/444
2/12/2021 • 5 minutes, 41 seconds
SDS 443: The End of Jobs
Jeff Wald joins us to discuss his book and the research he has done into the data and trends around the job market, the decline of the 9-5 office job, and more.
In this episode you will learn:
• The Birthday Rules [3:51]
• A history of work [7:41]
• The myth of the lifetime contract [12:15]
• What the data says about now [21:02]
• On-demand labor market [25:34]
• Remote work [32:09]
• What role will automation play? [46:27]
• Future of employment from the study lens [48:30]
Additional materials: www.superdatascience.com/443
2/11/2021 • 1 hour, 7 minutes, 7 seconds
SDS 442: Data Science as an Atomic Habit
In today’s episode, I discuss how focusing on process and habit building can provide more for you and your professional progress than simply chasing a goal.
Additional materials: www.superdatascience.com/442
2/5/2021 • 7 minutes, 3 seconds
SDS 441: Communicating Data Effectively
Kate Strachnyi joins us to discuss her work in data visualization education from conferences to published books as well as her tips for visualization best practices.
In this episode you will learn:
• What does Kate do (from her children’s perspective) [1:56]
• What kind of tools does Kate employ? [5:19]
• Kate’s day-to-day [13:03]
• DATAcated Conference [16:03]
• How do you amass a big LinkedIn following? [20:39]
• Kate’s four published books [29:55]
• The guidelines to follow to succeed in this field [37:00]
• What’s next for Kate? [41:24]
Additional materials: www.superdatascience.com/441
2/4/2021 • 55 minutes, 30 seconds
SDS 440: MuZero: Learning Without Rules
In this episode, I continue my discussion on the leaps we’re making towards AGI, by looking at MuZero.
Additional materials: www.superdatascience.com/440
1/29/2021 • 5 minutes, 17 seconds
SDS 439: Deep Learning for Machine Vision
Deblina Bhattacharjee joins us to talk about her amazing work in computer vision and give advice for getting into and excelling in the field.
In this episode you will learn:
• Deblina’s master’s program work [4:03]
• Deblina’s computer vision research and Ph.D. [11:46]
• Deblina’s drumming hobby [20:18]
• The daily work [24:40]
• What key skills do you need as a data scientist? [33:21]
• How can a data scientist prepare for the future? [37:03]
• How does Deblina tackle time management? [40:24]
Additional materials: www.superdatascience.com/439
1/28/2021 • 48 minutes, 32 seconds
SDS 438: Artificial General Intelligence
In this episode, I discuss DeepMind’s latest breakthrough towards AGI and the stepping stones that got them there.
Additional materials: www.superdatascience.com/438
1/22/2021 • 6 minutes, 15 seconds
SDS 437: Data Science at a World-Leading Hedge Fund
Claudia Perlich joins us to discuss her work at one of the world’s largest hedge funds and how she got to work there, as well as her history of winning data science competitions.
In this episode you will learn:
• Life and work during the pandemic [2:23]
• Claudia’s history with horses and riding [8:28]
• Claudia’s work at Two Sigma [12:00]
• Claudia’s role on a daily basis [20:51]
• Tools of the trade [30:27]
• What Claudia looks for when hiring [36:37]
• What skills do future hires need? [40:32]
• Claudia’s history with data science competitions [48:22]
• Why work in finance and at Two Sigma? [1:00:19]
Additional materials: www.superdatascience.com/437
1/20/2021 • 1 hour, 13 minutes, 59 seconds
SDS 436: Attention Sharpening Tools Part 2
In this episode, I continue my discussion on daily mindfulness practice and how to form a growing habit in it.
Additional materials: www.superdatascience.com/436
1/15/2021 • 7 minutes, 6 seconds
SDS 435: Scaling Up Machine Learning
Erica Greene joins us to discuss her work as a machine learning manager at Etsy, how they tackle problem-solving, how they implement ML scaling, and more.
In this episode you will learn:
• Erica’s role at Etsy and problem solving between platforms [2:28]
• Interesting failures Erica has navigated [25:40]
• How does Erica’s team select problems to solve? [33:07]
• Engineering at scale [40:15]
• What does Erica’s working day look like? [46:30]
• Etsy is hiring [53:00]
• Diversity in hiring [57:12]
• Do data scientists need PhDs? [1:01:26]
Additional materials: www.superdatascience.com/435
1/14/2021 • 1 hour, 9 minutes, 58 seconds
SDS 434: Attention Sharpening Tools Part 1
In this episode, I discuss my use of mindfulness and attention sharpening tools to boost my productivity throughout the day.
Additional materials: www.superdatascience.com/434
1/8/2021 • 6 minutes, 17 seconds
SDS 433: Data Science Trends for 2021
Ben Taylor joins us for the fourth time to discuss the upcoming 2021 trends in the world of data science as well as the post-COVID world.
In this episode you will learn:
• Ben’s passion for AI [9:41]
• Delivering results and KPIs [12:43]
• DataRobot and AutoML [20:38]
• Transparent storytelling [24:29]
• Federated learning [31:37]
• ML productionization [37:01]
• AI ethics [46:01]
• Emerging software packages/tools [54:39]
• Remote work [1:02:44]
Additional materials: www.superdatascience.com/433
1/7/2021 • 1 hour, 17 minutes, 35 seconds
SDS 432: Hello from Jon and Welcome to 2021
In this episode, I introduce myself, Jon Krohn, as the new host of the SuperDataScience podcast and give you a taste of what to look forward to in 2021!
Additional materials: www.superdatascience.com/432
1/1/2021 • 4 minutes, 25 seconds
SDS 431: One-on-one with Kirill: What I learned in 2020
In this final episode featuring Kirill as the host, he examines and presents his top 7 learnings from this unprecedented year.
In this episode you will learn:
• Back pain and standing desks [5:41]
• The internal conflict model [14:33]
• What acceptance really means [38:32]
• Intellect and Intelligence [58:10]
• Needs vs. wants/desires/wishes [1:08:00]
• Intention vs. effect [1:25:51]
• Do not take things personally [1:46:12]
Additional materials: www.superdatascience.com/431
12/31/2020 • 1 hour, 59 minutes, 20 seconds
SDS 430: Intellect and Intelligence
In this episode, I talk about the reasoning behind my decision to step down as the host of the SDS podcast.
Additional materials: www.superdatascience.com/430
12/25/2020 • 15 minutes, 23 seconds
SDS 429: 2020's Biggest Data Science Breakthroughs
Jon Krohn joins us for a year-end episode about 2020’s biggest data science breakthroughs and for a big podcast announcement for 2021.
In this episode you will learn:
• Global warming [4:37]
• Our big podcast announcement [6:57]
• Who is Jon Krohn? [12:14]
• Top 3 technological breakthroughs of the year [21:28]
• AlphaFold [23:33]
• GPUs [45:51]
• GPT-3 [1:00:26]
• Wrap up [1:26:40]
Additional materials: www.superdatascience.com/429
12/24/2020 • 1 hour, 31 minutes, 11 seconds
SDS 428: The Internal Conflict Model
In this episode, I talk about a very interesting concept around expectations and reality, and how the gap between the two might be affecting us.
Additional materials: www.superdatascience.com/428
12/18/2020 • 32 minutes, 56 seconds
SDS 427: Impacting Through Technology
Syafri Bahar joins us for a great conversation about his work at GOJEK, a decacorn super app bringing services to Indonesia, and his philosophy of empowered data science teams.
In this episode you will learn:
• Syafri’s day job at GOJEK [11:26]
• What is a super app? [14:50]
• The data science department at GOJEK [19:47]
• High-performance data science team [31:17]
• Syafri’s career journey and love of math [39:49]
• Apply to work at GOJEK [55:42]
• Working for the benefit of others [1:00:21]
Additional materials: www.superdatascience.com/427
12/17/2020 • 1 hour, 12 minutes, 6 seconds
SDS 426: The Shift: From Ambition to Meaning
In this episode, I talk about something profoundly important for me this year in shifting away from ego-driven ambition towards non-materialistic meaning in your life and work.
Additional materials: www.superdatascience.com/426
12/11/2020 • 17 minutes, 14 seconds
SDS 425: The Past, Present, and Future of AI Services
Rama Akkiraju joins us to discuss the past, present, and future of AI services and how companies and data scientists can best prepare themselves to become AI consumers.
In this episode you will learn:
• 23 years at IBM, before and after data science [6:11]
• IBM Watson and AI services [12:25]
• Skills to utilize AI services [25:02]
• How to achieve significant ROI on AI deployment [41:31]
• What does the AI future look like to Rama? [52:41]
• Ethics and the benefits of AI [1:04:37]
Additional materials: www.superdatascience.com/425
12/10/2020 • 1 hour, 14 minutes, 16 seconds
SDS 424: A Symbiotic Relationship With AI
In this episode, we talk about how businesses can maximize their relationship with AI to ensure visible ROI and progress of industries.
Additional materials: www.superdatascience.com/424
12/4/2020 • 9 minutes, 17 seconds
SDS 423: The Growth and Future of STEM in Africa
Amanda Obidike joined us for a great discussion about her work in Nigeria and the African continent in empowering and enabling STEM education and job placement.
In this episode you will learn:
• Life in Lagos, Nigeria [5:22]
• Amanda’s journey to data science [7:28]
• Case studies and example projects [13:00]
• STEM skills and the start of STEMi [19:41]
• What are the issues STEMi is addressing? [24:48]
• Get involved in STEMi’s mentoring project [30:12]
• STEMi’s results so far [36:02]
• Amanda’s best tips for landing jobs [39:04]
• Work in promoting education and literacy [45:19]
• The progress of STEM in Africa [47:34]
Additional materials: www.superdatascience.com/423
12/3/2020 • 1 hour, 14 seconds
SDS 422: Pain Vs. Suffering
In this episode, I talk about the difference between pain and suffering and the importance of becoming aware of it.
Additional materials: www.superdatascience.com/422
11/27/2020 • 10 minutes, 24 seconds
SDS 421: Real-World Applications of Digital Twins
Theunis Barnard joins us for a great conversation about digital twins and how data scientists can learn about the technology and get involved with its applications.
In this episode you will learn:
• Data science in South Africa [6:08]
• Theunis’s current companies [11:32]
• Industry 4.0 [13:59]
• Digital twins [22:37]
• Theunis’s day-to-day [38:54]
• Further examples of digital twins [42:26]
• Future of digital twins [48:02]
• Theunis’s advice for data science newcomers [57:17]
• Process digital twins vs. system digital twins [59:42]
Additional materials: www.superdatascience.com/421
11/26/2020 • 1 hour, 6 minutes, 23 seconds
SDS 420: Wheel of Life
In this episode, we do an exercise using the wheel of life to examine your time management and understand how balanced your life currently is.
Additional materials: www.superdatascience.com/420
11/20/2020 • 9 minutes, 4 seconds
SDS 419: Unlocking the Architecture of Innovation
Juval Löwy joins us for an exceptional episode that condenses much of his masterclass teachings into a powerful hour of information about the right approach to designing systems as well as projects.
In this episode you will learn:
• Career planning [7:24]
• Consequences of designing against requirements [8:57]
• The framework of a good system design [30:32]
• The right approach to project design [44:00]
• Juval’s book [1:02:31]
• The progress and future [1:03:48]
Additional materials: www.superdatascience.com/419
11/19/2020 • 1 hour, 12 minutes, 1 second
SDS 418: Play With Feeling
In this episode, I discuss a very interesting quote by Beethoven about the importance of giving space to feelings, even if that means making a mistake.
Additional materials: www.superdatascience.com/418
11/13/2020 • 6 minutes, 29 seconds
SDS 417: Data Engineering and Product Development
Arthur Shectman joins us to discuss the data engineering and data product development work done at Elephant Ventures and the importance of capturing value through data.
In this episode you will learn:
• What is Elephant Ventures? [8:11]
• Data quality engineering [21:00]
• The importance of focusing on business value [39:58]
• Methodology for understanding the company’s business value [46:05]
• What is data engineering? [49:28]
• What is data product development? [51:34]
• What are the technical skills needed for these jobs? [56:02]
• What is the future bringing for data science? [59:23]
Additional materials: www.superdatascience.com/417
11/12/2020 • 1 hour, 6 minutes, 26 seconds
SDS 416: My Advice for Career Success
In this episode, I talk about the three key ingredients for a successful, happy career in data science.
Additional materials: www.superdatascience.com/416
11/6/2020 • 16 minutes, 24 seconds
SDS 415: Developing and Maintaining Your Technical and Soft Skills
Asieh Ahani joins us to discuss her rapid career progress, the unique work she does at MassMutual, and how she maintains her technical skills while working in a leading position.
In this episode you will learn:
• Asieh’s background [5:09]
• Machine learning techniques for processing biosignals [14:24]
• Signal processing [22:22]
• Asieh’s career and move from academia to industry [27:19]
• Maintaining technical skills as a manager [41:48]
• MassMutual is hiring [47:02]
• Leading a remote data science team and work/life balance [49:55]
• Asieh’s words for other women in data science [55:28]
• Future of data science [1:00:30]
Additional materials: www.superdatascience.com/415
11/5/2020 • 1 hour, 8 minutes, 44 seconds
SDS 414: Needs vs. Wants
Today I talked about the importance of understanding the balance between acting selfishly and acting with self-neglect and how the awareness of our needs and wants can help with that.
Additional materials: www.superdatascience.com/414
10/30/2020 • 13 minutes, 58 seconds
SDS 413: Changing The World With Data
Emmanuel Letouzé discussed in-depth his work in global data science literacy and how he hopes data science will benefit the world in various societal and socio-economic challenges.
In this episode you will learn:
• Parenting and its effects on Emmanuel’s life and work [3:14]
• Why did Data-Pop Alliance come to life? [8:42]
• Working with Harvard and MIT [13:04]
• Examples of projects and areas of focus [18:16]
• Data as lenses and data as lever [29:43]
• Sustainable Development Goals indicators [38:21]
• How can we use data as a lever? [43:41]
• How can data help with disaster resilience? [57:09]
• The future of data science [1:04:09]
Additional materials: www.superdatascience.com/413
10/28/2020 • 1 hour, 16 minutes, 14 seconds
SDS 412: Stand More - Sit Less
Today I talked with a chiropractor about how to best treat your back while working during the day.
Additional materials: www.superdatascience.com/412
10/23/2020 • 12 minutes, 16 seconds
SDS 411: Succeeding in Analytics by Thinking Outside the Data
Jennifer Cooper talked with us about her role as a strategic analyst and how others can get involved with similar positions around analytics and hybrid roles.
In this episode you will learn:
• Jennifer’s start in data science [6:04]
• What is the analytics support function? [16:01]
• Keys to success in analytics roles [21:09]
• How do you find these roles? [42:42]
• DataScienceGO Virtual #2 [50:45]
• Common questions Jennifer gets [1:00:52]
Additional materials: www.superdatascience.com/411
10/21/2020 • 1 hour, 16 minutes, 32 seconds
SDS 410: Communicate Your Needs
Today I talk about something important, which I recently had to reteach myself, about personal needs and communication.
Additional materials: www.superdatascience.com/410
10/16/2020 • 7 minutes, 13 seconds
SDS 409: Succeeding & Networking In The Virtual Space
Steve Nouri talks with us about the importance of managing your personal brand, participating in hackathons, and being active in the conversations around AI as you begin your career.
In this episode you will learn:
• Steve’s work in the Australian Computer Society [4:32]
• River City Labs [12:22]
• Hackathons during the pandemic [16:21]
• Choosing a path in AI [26:09]
• The AI bubble and its implications [31:09]
• Strategic data acquisition [38:04]
• Explainable AI [43:50]
• Creating a personal brand [51:35]
Additional materials: www.superdatascience.com/409
10/14/2020 • 1 hour, 9 minutes, 6 seconds
SDS 408: Meaning is Everything
Today I talk about an interesting concept that can often be the cause of conflicts in professional and personal relationships.
Additional materials: www.superdatascience.com/408
10/9/2020 • 12 minutes, 39 seconds
SDS 407: How to Encourage Diversity in Data Science
Margot Gerritsen joins us for a great discussion that was both technical and inspiring, on the topics of principal component analysis and linear algebra, as well as the importance of women in data science.
In this episode you will learn:
• Margot’s travels and background [7:29]
• Margot’s position and work at Stanford [13:38]
• What is linear algebra? [18:00]
• Principal component analysis [23:02]
• WIDS, Women in Data Science [32:08]
• Margot’s diversity call to action [58:12]
• How can men support their female colleagues? [1:05:55]
Additional materials: www.superdatascience.com/407
10/7/2020 • 1 hour, 23 minutes, 45 seconds
SDS 406: Abandon Hope
Today we discussed the Buddhist concept “abandon hope” as a way to avoid falling victim to negative emotions and fear.
Additional materials: www.superdatascience.com/406
10/2/2020 • 7 minutes, 45 seconds
SDS 405: The Work of Quants and Data Scientists in the Financial Space
Thomas Obrist joins us to give an advanced talk on the work he does in the financial and energy space as a quant and how it overlaps with data science.
In this episode you will learn:
• Thomas’s background and studies [5:04]
• Long and short in financial markets [8:33]
• Thomas’s current role at Axpo [14:55]
• Quant vs. data scientist vs. data analyst [18:55]
• The Monte Carlo method [26:26]
• Thomas’s day-to-day [30:06]
• Grid loss use case [35:25]
• Thomas’s hackathon success [53:22]
• Thomas’s recommendation for those interested in the space [1:01:39]
Additional materials: www.superdatascience.com/405
9/30/2020 • 1 hour, 9 minutes
SDS 404: The Narrative Arc in Storytelling
Today we dissect the building blocks of storytelling to help you become a better presenter of your data science insights.
Additional materials: www.superdatascience.com/404
9/25/2020 • 15 minutes, 50 seconds
SDS 403: Gamifying Your Data Science Work and Education
Juan Gabriel Gomila Salas joins for an exciting discussion about his work in the game industry and how gamification can boost data science impact across industries.
In this episode you will learn:
• Juan Gabriel’s work before and during COVID-19 [3:37]
• Juan Gabriel’s unique career path [10:36]
• Video game monetization case study [25:44]
• How can data scientists utilize gamification in their daily jobs? [36:28]
• Juan Gabriel’s work as a professor [42:46]
• Is online education the future? [47:40]
• Data science in the English-speaking world vs. the Spanish-speaking world [52:30]
• Where is data science headed? [59:45]
Additional materials: www.superdatascience.com/403
9/23/2020 • 1 hour, 16 minutes, 31 seconds
SDS 402: Face Your Demons
In this episode, I discuss an interesting metaphor I’ve recently utilized to help myself face and overcome toxic or negative feelings.
Additional materials: www.superdatascience.com/402
9/18/2020 • 6 minutes, 21 seconds
SDS 401: From Data Science Student to Professional
Michael Galarnyk joins to tackle your questions on data science job hunting and data science education.
In this episode you will learn:
• Who is Michael Galarnyk? [3:48]
• Tools and skills to know [11:52]
• Building and sharing a portfolio [26:21]
• Advantages of online and in-person education [37:42]
• Teaching data science to younger students [43:55]
• Necessary soft skills to be a successful data scientist [51:31]
Additional materials: www.superdatascience.com/401
9/16/2020 • 1 hour, 5 minutes, 27 seconds
SDS 400: Think Bigger
In this anniversary episode, we discuss the importance of knowing why you do data science and how your skills may one day impact the world as challenges arise.
Additional materials: www.superdatascience.com/400
9/11/2020 • 6 minutes, 2 seconds
SDS 399: Contributing to the Community of Data Scientists
Monica Royal joins us to discuss her journey from consumer to contributor in the data science community and how sharing your work and exploring networking can help you on your journey.
In this episode you will learn:
• Monica’s activity in the data science community [5:17]
• The biggest takeaways from Monica’s 100 Days of Learnings [11:00]
• Techniques for productivity and continued learning [16:03]
• Monica’s interest in the SDS podcast and keeping up to date in data science [33:01]
• The DataScienceGO Virtual experience [35:51]
• Strategic thinking [38:38]
• Monica’s parting inspirational thoughts [41:01]
Additional materials: www.superdatascience.com/399
9/9/2020 • 47 minutes, 20 seconds
SDS 398: Emotional Burnout
In this episode, I discuss a very important topic on the stages and symptoms of burnout and how to tackle them at each point to avoid irreparable damage.
Additional materials: www.superdatascience.com/398
9/4/2020 • 21 minutes, 43 seconds
SDS 397: The Importance of Data Science Literacy
We chatted with data science influencer, educator, and principal data scientist Kirk Borne about his philosophy and work in spreading data science literacy across fields and industries through his frameworks.
In this episode you will learn:
• Live vs. virtual events [4:20]
• Who is Kirk Borne? [7:13]
• Big data’s evolution and the emergence of small data [11:17]
• The fourth industrial revolution and the future [22:00]
• How has the data science education space changed in 14 years? [33:44]
• Four types of data discovery [38:00]
• The broad categories of AI you should pursue [50:44]
• 5 dimensions of analytics implementation [53:50]
• LinkedIn Q&A [1:05:00]
• Hiring at Booz Allen [1:15:18]
Additional materials: www.superdatascience.com/397
9/2/2020 • 1 hour, 22 minutes, 51 seconds
SDS 396: Five Job Hunting Tips
In this episode, I share a series of great tips, plus a bonus tip, for getting your application further along in the hiring process and getting the job.
Additional materials: www.superdatascience.com/396
8/28/2020 • 22 minutes, 19 seconds
SDS 395: How to Tell Stories with Data
Cole Nussbaumer Knaflic talks about her influential book Storytelling with Data and shares some best practices for conveying meaning from your visualizations.
In this episode you will learn:
• Cole’s business Storytelling With Data [4:04]
• How did Cole get into this space? [7:24]
• When did Cole start writing the book? [15:33]
• Top 3 tips from the book [22:44]
• How to structure a good story [35:17]
• Communicating in-person vs. virtually [41:37]
• Cole’s upcoming workshops [43:50]
• LinkedIn Q&A [48:57]
• Cole’s advice on preparing for the future in the field [1:05:22]
Additional materials: www.superdatascience.com/395
8/26/2020 • 1 hour, 13 minutes, 39 seconds
SDS 394: Teach It
In this episode, I discuss the power of teaching what you learn to help you retain the highest amount of the information you are learning.
Additional materials: www.superdatascience.com/394
8/21/2020 • 9 minutes, 38 seconds
SDS 393: The Importance of Keeping Science in Data Science
John Peach joins to discuss his passion for bringing more scientific approaches to the data science field, making it smarter and more efficient.
In this episode you will learn:
• John’s move from Canada to the US [3:37]
• John’s new position at Oracle [8:31]
• Data Science Workflows [9:34]
• John’s solution to data science workflow exploration [12:06]
• John’s data science design thinking framework [21:20]
• Case study [34:21]
• Literate statistical programming [43:12]
• R or Python? [51:55]
• Data unit testing [53:28]
• What drives John? [1:00:56]
Additional materials: www.superdatascience.com/393
8/19/2020 • 1 hour, 8 minutes, 12 seconds
SDS 392: Start Your Own Morning Ritual
In this episode, I describe my morning ritual and discuss the importance of setting up a morning ritual for yourself.
Additional materials: www.superdatascience.com/392
8/14/2020 • 13 minutes, 20 seconds
SDS 391: Data Science Campfire Tales with John Elder
John Elder joins for an amazing podcast to share his data science "campfire tales" spanning over 20 years of his career in the industry. Incorporating some of these principles will definitely help you in your work.
In this episode you will learn:
• John’s first bungee jump [4:01]
• Calculus vs. resampling [14:01]
• Elder Research [21:11]
• Domain knowledge advice [25:26]
• The importance of instincts [41:52]
• Ensembles and simplicity [59:33]
• John’s opinions on neural nets [1:10:49]
• Target shuffling method and the crisis in science [1:17:27]
• What does the future of data science hold? [1:39:53]
Additional materials: www.superdatascience.com/391
8/12/2020 • 1 hour, 50 minutes, 52 seconds
SDS 390: Perception vs. Emotion
In this episode, I share a tip I came across this week about avoiding conflict in interpersonal relationships.
Additional materials: www.superdatascience.com/390
8/7/2020 • 10 minutes, 57 seconds
SDS 389: Becoming Good Enough: Jumpstarting Your Data Science Career
Josh Hortaleza discusses how he’s become a juggernaut of an aspiring data scientist and powered through networking and internships to reach his goals in the field.
In this episode you will learn:
• How did Kirill and Josh meet? [8:26]
• Who is Josh? [12:42]
• Josh’s first internships [17:07]
• Being “good enough” and the luck factor [34:51]
• Josh’s goal [40:55]
• Genuine networking [43:08]
Additional materials: www.superdatascience.com/389
8/5/2020 • 1 hour, 4 minutes, 41 seconds
SDS 388: Get a Headhunter
In this episode, I share an awesome tip for anyone at any level around recruitment and headhunters.
Additional materials: www.superdatascience.com/388
7/31/2020 • 7 minutes, 5 seconds
SDS 387: Becoming a Data Science Leader
Lillian Pierson discusses her work on data leadership and how any data scientist can become a data leader in their organization or community.
In this episode you will learn:
• Who is Lillian Pierson? [3:27]
• Winning With Data [6:08]
• Four superpowers of great data leaders [11:53]
• Benefits of developing these skills [17:27]
• Examples of quick win challenges in Winning With Data [19:34]
• Impact of COVID-19 [22:23]
• Where is the industry going? [28:26]
Additional materials: www.superdatascience.com/387
7/29/2020 • 33 minutes, 34 seconds
SDS 386: Cohort Analysis
Today, I explain cohort analysis and how this can be used for conversion metrics and tracking the customer journey.
Additional materials: www.superdatascience.com/386
7/24/2020 • 9 minutes, 53 seconds
SDS 385: Advanced Data Topics and People-Centered Data Science
T. Scott Clendaniel joins to discuss advanced topics in data science and his forecasts for the future in this field. He also talks about the importance of soft skills for data scientists.
In this episode you will learn:
• Who is Scott Clendaniel? [6:57]
• Scott’s role at Franklin Templeton [10:24]
• LinkedIn advanced Q&A [13:29]
• Tools that Scott uses the most [26:57]
• Target mean encoding technique [30:35]
• LinkedIn Q&A on models [33:11]
• LinkedIn Q&A on soft skills [54:04]
• LinkedIn Q&A on forecasts for the future [1:00:19]
• Hub and spoke model in Data Science Management [1:08:32]
• Scott’s advice for advanced data scientists [1:10:12]
Additional materials: www.superdatascience.com/385
7/22/2020 • 1 hour, 17 minutes, 50 seconds
SDS 384: 10 Tips to Become a Master Presenter
Today, I discuss best practices for data visualization and how to build on what we learned about cognitive load.
Additional materials: www.superdatascience.com/384
7/17/2020 • 19 minutes, 49 seconds
SDS 383: You're Not an Imposter, You're Learning: Data Science Journeys
Sean Casey joins to discuss his data science journey and how he’s used online courses, secondary resources, and the wider network to help him become a data visualization professional.
In this episode you will learn:
• How Sean and Kirill met at DSGO Virtual [4:25]
• Sean’s experience at the virtual event [7:32]
• Sean’s journey [10:06]
• Do you need the credibility of a degree? [22:01]
• Sean’s supplemental readings [24:33]
• What can others do to replicate Sean’s success? [39:18]
• Sean’s advice for others just starting [50:15]
Additional materials: www.superdatascience.com/383
7/15/2020 • 59 minutes, 20 seconds
SDS 382: Manage Cognitive Load in Data Science
Today, I discussed the types of cognitive load and how to best utilize them when imparting information through data.
Additional materials: www.superdatascience.com/382
7/10/2020 • 9 minutes, 7 seconds
SDS 381: How to Avoid Failing at Digital Transformation
Tony Saldanha joins the podcast to discuss the realities of digital transformation and the steps companies must take to successfully transform in this fourth industrial revolution.
In this episode you will learn:
• Tony’s book on digital transformation [2:51]
• What is digital transformation? [8:30]
• Five-stage framework for going through digital transformation [11:13]
• Case studies through the stages [16:43]
• Why do digital transformations fail? [27:26]
• VC portfolio approach [31:21]
• Tony’s consulting work and top tips [40:11]
• Change management and COVID-19 [44:44]
• Disruption vs. digital transformation [51:14]
• What does the future hold? [53:50]
Additional materials: www.superdatascience.com/381
7/8/2020 • 1 hour, 8 seconds
SDS 380: Data Analyst vs. Data Scientist
Today, I discuss the difference between a data analyst and data scientist and how you can join our team as a potential data analyst.
Additional materials: www.superdatascience.com/380
7/3/2020 • 10 minutes, 13 seconds
SDS 379: Maelstrom, Chaos, and Mayhem: Guiding Your Data Science Career Path
Christopher Bishop speaks on the importance of career tactics in data science and how to prepare and move through the career path you want.
In this episode you will learn:
• Who is Christopher Bishop? [5:18]
• How Christopher developed his advising framework [9:17]
• Why data scientists? [12:07]
• What is the Future Career Toolkit? [15:54]
• How to connect with people as an unknown data scientist [34:09]
• What's the intended outcome of the framework? [43:53]
Additional materials: www.superdatascience.com/379
7/1/2020 • 55 minutes, 38 seconds
SDS 378: Use Your Unconscious Mind
In this episode, I talk about the importance of the unconscious mind in decision making and how logic and reasoning may sometimes hinder you.
Additional materials: www.superdatascience.com/378
6/26/2020 • 9 minutes, 58 seconds
SDS 377: The Power of Women in STEM
Deborah Berebichez joins us to discuss her experience as a woman in STEM, her work with upcoming generations of women in STEM, and how she helps facilitate data science trainings.
In this episode you will learn:
• Deborah's origins [4:21]
• Pursuing physics as a Jewish-Mexican woman [9:43]
• Deborah's work in helping women in STEM [23:10]
• How can companies also aid women in STEM? [28:10]
• How can individual data scientists work on creative thinking? [44:31]
• Deborah's work at Metis [48:26]
• Data literacy done the right way [1:00:33]
• The future of data science [1:04:55]
Additional materials: www.superdatascience.com/377
6/24/2020 • 1 hour, 18 minutes, 2 seconds
SDS 376: Expose Yourself to New Ideas Regularly
In this FiveMinuteFriday, I talk about the need to widen your horizons, expose yourself to more varied disciplines and thought processes, and the benefits you can get in your work from doing this.
Additional materials: www.superdatascience.com/376
6/19/2020 • 9 minutes, 40 seconds
SDS 375: Utilizing Oracle Cloud as an Enterprise, Small Business, or Developer
Greg Pavlik joins me for a great talk about the current state of the cloud and how single practitioners and small businesses can take advantage of cloud services.
In this episode you will learn:
• Will we have cloud-based solutions for VR and working from home? [8:15]
• Greg’s career journey [11:50]
• From Hadoop to Cloud [23:35]
• The cloud element in data science [30:17]
• Data science and AI in Oracle [33:00]
• Is Oracle more suited for larger companies only? [37:35]
• Fundamental differences between Oracle Cloud, Amazon, and Azure [42:12]
• Trends in data science and data management [45:14]
• Why should someone choose Oracle over any other open source? [52:50]
• What does the future of data management look like? [56:00]
• 5G and edge computing [1:01:36]
• Greg’s recommendation to data scientists [1:04:28]
Additional materials: www.superdatascience.com/375
6/17/2020 • 1 hour, 12 minutes, 47 seconds
SDS 374: Remember to Wind Down
In this episode, I talk about an issue I’ve been having when it comes to phasing my mind out of work and into post-work activities, a concept called “attention residue”.
Additional materials: www.superdatascience.com/374
6/12/2020 • 7 minutes, 42 seconds
SDS 373: TensorFlow and AI Learnings for Developers
Laurence Moroney sits down to talk about TensorFlow, its community, and his work educating developers in AI and machine learning. We talk about the explosive growth of the community and the great chance for career advancement for all developers, regardless of educational background.
In this episode you will learn:
• Who is Laurence Moroney? [4:14]
• The importance of developers' focus on AI [8:21]
• What is TensorFlow and how can it help in AI? [15:53]
• Differences in TensorFlow editions [26:26]
• Careers and overcoming the fear of AI [31:14]
• TensorFlow community [48:46]
• What does the future look like? [54:40]
Additional materials: www.superdatascience.com/373
6/10/2020 • 59 minutes, 51 seconds
SDS 372: Understanding the P-Value
Today, I talk about the p-value and proper hypothesis testing as well as the importance of statistical significance.
Additional materials: www.superdatascience.com/372
6/5/2020 • 16 minutes, 57 seconds
SDS 371: The Power of Memory For Productivity
Anthony Metivier joins us again for an in-depth discussion about how memory and presence can boost productivity for people in their professional and personal lives.
In this episode you will learn:
• Anthony’s technique for memorizing names [12:04]
• Anthony’s new book and concept of memory [15:45]
• Memory and productivity insights [31:44]
• Memory palace construction methods [37:30]
• How can memory techniques help a data scientist? [1:01:30]
• Challenge frustration curve [1:07:28]
• Further advice and learnings [1:11:59]
Additional materials: www.superdatascience.com/371
6/3/2020 • 1 hour, 23 minutes, 18 seconds
SDS 370: What is Support Vector Regression (SVR)?
In today’s FiveMinuteFriday episode, I wanted to experiment with explaining support vector regression without the use of any visual aids.
Additional materials: www.superdatascience.com/370
5/29/2020 • 9 minutes, 53 seconds
SDS 369: Real Data Analytics for Economics, HR, and COVID-19
John Johnson joins me for a thoughtful discussion about the importance of data in the world of economics and business analytics. We discuss his academic and professional history up to his current work and how his company is sifting through economic data during the COVID-19 pandemic.
In this episode you will learn:
• Living and working in Washington D.C. [4:11]
• John’s initial jobs before Edgeworth [8:41]
• Edgeworth's core values [12:01]
• Edgeworth Economics and Edgeworth Analytics case studies [16:57]
• Data analytics vs. data science [29:50]
• Parachuting into industries [36:06]
• Real analytics vs. “lip service” [42:11]
• HR business analytics [51:13]
• How much, as a business owner, should you rely on a consultant? [56:26]
• John’s advice to worried business owners [59:24]
Additional materials: www.superdatascience.com/369
5/27/2020 • 1 hour, 5 minutes, 43 seconds
SDS 368: Future-Proof Your Career
Today, I discuss the best ways to future-proof your career for the great restructuring of the workforce that technological advancements have already brought and will continue to bring.
Additional materials: www.superdatascience.com/368
5/22/2020 • 10 minutes, 5 seconds
SDS 367: Building Data Pipelines for COVID-19 Modeling
Samuel Hinton joins us again for an important and timely discussion on data pipelines and the work he’s doing to aid research on COVID-19 with the COVID-19 Critical Care Consortium. We also talk about his new online courses and his continued research into dark matter.
In this episode you will learn:
• Sam’s current work and COVID-19 Critical Care Consortium [4:22]
• The COVID data science pipeline and workflow [12:50]
• Sam’s second online course [36:22]
• Bayesian inference [43:06]
• Sam at DSGO Virtual [53:30]
• Sam’s work on dark matter [1:01:25]
• What is Sam reading right now? [1:09:14]
Additional materials: www.superdatascience.com/367
5/20/2020 • 1 hour, 17 minutes, 53 seconds
SDS 366: Define Your Own Success
Today, I discuss a profound conversation we had with our team this month on success and how you can define your own success.
Additional materials: www.superdatascience.com/366
5/15/2020 • 7 minutes, 11 seconds
SDS 365: Deep Learning Models For Recruitment
Jon Krohn joins me to discuss his work at untapt in designing models for HR purposes. We also discuss the power of data science across fields of medicine and epidemiology, as well as the future of deep learning.
In this episode you will learn:
• Coronavirus update in New York City [2:36]
• What brought Jon to New York? [5:38]
• Data science and coronavirus [12:50]
• Jon’s work at untapt [18:09]
• Techniques used to design models in untapt [22:02]
• untapt’s approach to explainability and bias [30:19]
• Jon’s other contributions to data science [38:10]
• Jon’s book and visual teaching styles [44:32]
• LinkedIn Q&A [52:05]
• Jon’s recommendation for becoming best at deep learning [1:13:09]
Additional materials: www.superdatascience.com/365
5/13/2020 • 1 hour, 21 minutes, 25 seconds
SDS 364: Depression and Suicidal Thoughts
Today, I’m talking with Anthony Metivier about practices to help your brain and body work through the stress of the pandemic.
Additional materials: www.superdatascience.com/364
5/8/2020 • 9 minutes, 29 seconds
SDS 363: Intuition, Frameworks, and Unlocking the Power of Data
Piyanka Jain goes in-depth about the true power of data that can be unlocked when you combine intuition with data science practices and follow a hypothesis-driven framework to reach your project goals.
Items mentioned in this podcast:
• The power of data plus intuition [5:29]
• BADIR framework for data science [12:36]
• What can students pick up from Aryng’s courses? [24:58]
• SWAT data science teams [34:16]
• The rate of successful projects [39:38]
• Four D’s of Data Culture [45:27]
• Decision science vs data science [49:17]
• Piyanka’s inspiration for her book [51:23]
Additional materials: www.superdatascience.com/363
5/6/2020 • 58 minutes, 8 seconds
SDS 362: Hybrid AI
Today, I’m talking about an interesting topic I found in our own Data Science newsletter about the need for hybrid AI models in the future.
Additional materials: www.superdatascience.com/362
5/1/2020 • 6 minutes, 24 seconds
SDS 361: How To Succeed As An Analytics Consultant
John David Ariansen joins me for an episode on the best practices for getting into data science consulting, the importance of understanding data science and analytics, and how you can network, even during a pandemic.
In this episode you will learn:
• Coronavirus and how it will affect the way we work [3:06]
• John David’s consulting work [8:19]
• Why did John David get into consulting? [13:17]
• Does John David’s age affect his clients? [25:03]
• John David’s podcast [34:59]
• The difference between data science and analytics [40:26]
• Creating space for opportunities [49:54]
• 3 top tips for getting a job in data science [54:28]
Additional materials: www.superdatascience.com/361
4/29/2020 • 1 hour, 15 minutes, 51 seconds
SDS 360: Importance of Sleep
In this episode, I’m exploring some topics in proper sleep habits to help you keep good sleep schedules.
Additional materials: www.superdatascience.com/360
4/24/2020 • 13 minutes, 37 seconds
SDS 359: Tackling Data Science Job Hunting, Interviews & Negotiations
Emily Robinson breaks down her new book “Build a Career in Data Science” by sharing what skills she focuses on exploring, who the data science field is for, and how to tackle interviews and negotiations.
In this episode you will learn:
• Long-distance networking [5:58]
• Emily’s book [9:38]
• Who is the field of data science for? [14:02]
• Should newcomers use Python or R? [23:34]
• Five company archetypes [28:36]
• Approaching data science interviews and negotiating [31:25]
• How do you actually get an interview? [48:08]
• Emily at DSGO 2020 [57:07]
• Emily’s final take-home message [58:52]
• Where to buy Emily’s book and SDS discount code [1:04:26]
Additional materials: www.superdatascience.com/359
4/22/2020 • 1 hour, 10 minutes, 1 second
SDS 358: Racism and Discrimination
In this episode, I’m discussing my personal experience with discrimination during a trip at the start of the pandemic and how it elevated my understanding of racism and discrimination beyond just a cognitive level.
Additional materials: www.superdatascience.com/358
4/17/2020 • 12 minutes, 28 seconds
SDS 357: Emotions, Relationships, and Being Kind During the Pandemic
Tracy Crossley, a Behavioral Relationship Expert, talks about how you can explore yourself during this difficult time. We also explored how different relationship dynamics can be tested during a forced lockdown and how to avoid dangerous emotional pitfalls.
In this episode you will learn:
• What work does Tracy do? [5:50]
• Tracy’s training [8:20]
• Tracy’s view on the consequences of the pandemic [12:55]
• Ways to tackle emotions during lockdown [17:14]
• Final advice to those struggling during lockdown [1:00:11]
Additional materials: www.superdatascience.com/357
4/15/2020 • 1 hour, 6 minutes, 34 seconds
SDS 356: Working Remotely
Today, I’m helping you explore working remotely. Whether you’ve started doing this during the pandemic or you've recently been exploring remote-based jobs, I outline three advantages and three disadvantages to consider.
Additional materials: www.superdatascience.com/356
4/10/2020 • 15 minutes, 43 seconds
SDS 355: DJ Patil on Harnessing the Power of Data Science Community
DJ Patil talks about ethics in data science and the importance of data science communities working together to make sure data science is an accelerant of solutions for our children and our children’s children.
In this episode you will learn:
• How does it feel to be the person who created data science as we know it now? [3:17]
• What data science is not [6:01]
• Ethics and data science development in different countries [10:00]
• What is the “biorevolution”? [16:02]
• The importance of data sharing [20:10]
• The current state of the Chief Data Scientist of the USA [24:07]
• LinkedIn Q&A [26:03]
• What to think about when you think about data science [44:08]
Additional materials: www.superdatascience.com/355
4/8/2020 • 49 minutes, 26 seconds
SDS 354: Negative Coefficients
Today I discuss a negative coefficient as a philosophical concept in problem-solving in your life. Do you make things worse by ignoring a problem or doing the wrong things to fix it?
Additional materials: www.superdatascience.com/354
4/3/2020 • 13 minutes, 33 seconds
SDS 353: How to Practice Human-Centric Data Science
Brian T. O’Neill joins me for an insightful dive into how you can implement human-centric practices into your data work, whether you’re a consultant or individual contributor. There are ways and steps to workshop best practices in conversations with stakeholders.
In this episode you will learn:
• Brian’s two lives [7:28]
• Brian’s human-first focal point [10:05]
• The process of Brian’s consulting work [17:07]
• How can an individual contributor be better at design thinking? [40:43]
• Walkthrough of Brian’s course and seminar [54:37]
Additional materials: www.superdatascience.com/353
4/1/2020 • 1 hour, 15 minutes, 40 seconds
SDS 352: History of Data Science - Part 5
Today, we’re diving into our fifth and final part of our history of data science series by looking into data science’s future through the eyes of five of the most influential people in our space and how they see the next few decades.
Additional materials: www.superdatascience.com/352
3/27/2020 • 14 minutes, 9 seconds
SDS 351: Self-Starting In Data Science
Stratos Hadjioannou is a freshly hired data scientist who is self-taught and made the jump to visit DSGO. He talks about his learnings, putting himself in a data science ecosystem, and how to tackle interviews with little experience.
In this episode you will learn:
• Where did Stratos start? [6:16]
• How to keep the momentum for learning [12:20]
• Stratos’s goals [19:35]
• Planning the steps to getting a data science job [23:01]
• Triad for successful interviews [32:47]
• Application process [34:53]
• Experiences from the first data science job [45:51]
Additional materials: www.superdatascience.com/351
3/25/2020 • 1 hour, 1 minute
SDS 350: Coronavirus
Today, we take some time to discuss the real mental and emotional toll social distancing can take during the coronavirus. How can we effectively tackle each other's needs during this period?
Additional materials: www.superdatascience.com/350
3/20/2020 • 5 minutes, 50 seconds
SDS 349: Human-in-the-Loop Algorithms in Retail
Brad Klingenberg talks about the unique way Stitch Fix uses algorithms and human-in-the-loop AI to generate excellent customer experiences and pull ahead of other retailers in the space.
In this episode you will learn:
• Working in Stitch Fix [5:18]
• How does Stitch Fix work? [11:29]
• Stitch Fix algorithms tour [20:14]
• Open positions in Stitch Fix [36:25]
• Stitch Fix takeaways for other companies [39:07]
• Humans + machines [44:19]
• Stitch Fix global expansion [47:34]
• Future of personalization [50:16]
• Brad’s advice to data scientists [55:33]
Additional materials: www.superdatascience.com/349
3/19/2020 • 1 hour, 2 minutes, 13 seconds
SDS 348: History of Data Science - Part 4
In the penultimate episode of our history of data science series, we look at 2015 onward and watch as data science goes from being about hard skills and coding to being about ethics and progress.
Additional materials: www.superdatascience.com/348
3/13/2020 • 19 minutes, 42 seconds
SDS 347: How To Tell Your Story For Career Success
Kerri Twigg talks with me about her work in helping professionals talk about themselves and tell stories about their passions and professional work to land ideal jobs and propel their career trajectory.
In this episode you will learn:
• Kerri at DSGO 2019 [6:09]
• Who is Kerri Twigg? [8:00]
• A case study from DSGO 2019 [9:51]
• How do you build a career story? [18:30]
• Kerri’s book and practices [32:35]
• How to prepare for interviews [43:22]
• 3 parts of a career story [56:53]
Additional materials: www.superdatascience.com/347
3/12/2020 • 1 hour, 8 minutes, 33 seconds
SDS 346: My Top 5 Productivity Hacks
In this FiveMinuteFriday we take a break from our series on the history of data science to discuss productivity and my top 5 hacks for getting more hours out of your day and week.
Additional materials: www.superdatascience.com/346
3/6/2020 • 14 minutes, 38 seconds
SDS 345: Machine Learning At Twitter
I speak with Dan Shiebler, who works as a machine learning engineer at Twitter Cortex and, at the same time, is doing a Ph.D. on applying category theory to machine learning. We discuss his work at Twitter, the importance of academics, and the future of machine learning.
In this episode you will learn:
• What is great about Twitter [5:31]
• Dan’s Ph.D. program [9:25]
• Dan’s work at Twitter [18:07]
• Dan at DSGO 2020 [35:16]
• LinkedIn Q&A [40:25]
• Dan’s advice [1:03:58]
Additional materials: www.superdatascience.com/345
3/5/2020 • 1 hour, 12 minutes, 8 seconds
SDS 344: History of Data Science - Part 3
In the third of five episodes in this series, I journey through 2010 into 2015 to look at the boom of self-driving cars, the growth of data science as a profession, and the beginning of educational paths for future data scientists.
Additional materials: www.superdatascience.com/344
2/28/2020 • 8 minutes, 24 seconds
SDS 343: Career Jumpstarts through Data Science Retreat
I speak with Jose Quesada, founder and CEO of Data Science Retreat, about the purpose of his program: helping data scientists learn and find jobs through a three-month retreat and portfolio project.
In this episode you will learn:
• Overview of Jose’s current projects [5:55]
• "What if I don’t have a tech background?" [9:58]
• How does it work? [11:51]
• Program structure [21:24]
• Tips for picking a portfolio project [26:45]
• The program’s next intake [1:03:06]
Additional materials: www.superdatascience.com/343
2/27/2020 • 1 hour, 10 minutes
SDS 342: History of Data Science - Part 2
In the second of five episodes in this series, I take a step into the early 2000s and the true boom of data science as a profession and philosophy of study, as well as look at some of science fiction’s failed hopes for data science by this time.
Additional materials: www.superdatascience.com/342
2/21/2020 • 8 minutes, 55 seconds
SDS 341: Talking Robotics with Brandon Rohrer
Brandon Rohrer joins me in this special episode about robotics, machine learning, and the merging of software and hardware to create innovative technology for homes around the world.
In this episode you will learn:
• Brandon at MIT [7:41]
• iRobot [15:14]
• Moving from Facebook to iRobot [17:14]
• Brandon’s work in iRobot [20:18]
• Brandon as a data science influencer [30:08]
• Q&A [40:40]
Additional materials: www.superdatascience.com/341
2/20/2020 • 1 hour, 16 minutes, 4 seconds
SDS 340: History of Data Science - Part 1
In this five-episode series, I dive into the history of data science from the beginning of mathematics to today. In this first episode, we start by looking at the 1950s and go up to the dawn of the 2000s.
Additional materials: www.superdatascience.com/340
2/14/2020 • 20 minutes, 18 seconds
SDS 339: The Power of Coaching
I sat down with my coach Ivor Lok to discuss the power and importance of coaching and how everyone can use it in their personal and professional lives to become happier.
In this episode you will learn:
• Managing expectations [9:21]
• Personal beliefs & parenting [17:42]
• Value of having a coach [25:33]
• Mindset over skillset [37:24]
• Dream lists [51:06]
• Ivor’s new projects [1:03:20]
Additional materials: www.superdatascience.com/339
2/13/2020 • 1 hour, 25 minutes, 35 seconds
SDS 338: Too Many Photos
I discuss an observation I had recently about how many photos we take, and how much we miss out on by focusing on capturing a moment rather than living it.
Additional materials: www.superdatascience.com/338
2/7/2020 • 5 minutes, 33 seconds
SDS 337: Hadley Wickham Talks Integration and Future of R and Python
Hadley Wickham, a huge presence in data science, sits down to talk about R, Python, and the future of potential integrations, as well as some Q&A with our listeners through LinkedIn about programming languages and how to make data science accessible for all.
In this episode you will learn:
• Hadley’s R packages [8:26]
• Better integrations between R and Python [20:11]
• LinkedIn Q&A [33:34]
• useR Conference vs. RStudio Conference [50:46]
• LinkedIn Q&A: Career-related questions [1:01:06]
• LinkedIn Q&A: Future-related questions [1:08:01]
Additional materials: www.superdatascience.com/337
2/6/2020 • 1 hour, 14 minutes, 35 seconds
SDS 336: Better Than Perfect
I discuss something that popped up for me recently: is it better to have something finished or to have something be perfect? I explore the answer and what it can mean for you in your life.
Additional materials: www.superdatascience.com/336
1/31/2020 • 9 minutes, 50 seconds
SDS 335: Many Ways to Fail & Five Ways to Succeed in Startups
Rico Meinl failed when he tried to make a successful startup. He learned a lot from it and shared his story and learnings for nearly 2 hours in one of our longest and most insightful podcasts to date.
In this episode you will learn:
• Rico at DSGO [8:50]
• Dresswell [17:10]
• B2B vs B2C in startups [34:03]
• Rico's 5 learnings [53:25]
• Learning no. 1 [53:54]
• Learning no. 2 [56:33]
• Learning no. 3 [1:10:43]
• Learning no. 4 [1:24:08]
• Learning no. 5 [1:34:02]
• Rico’s next steps [1:45:35]
Additional materials: www.superdatascience.com/335
1/30/2020 • 1 hour, 54 minutes, 35 seconds
SDS 334: No Coaching
I return to the concept of no coaching in more detail and discuss how I recently had a good conversation with a friend without giving advice but offering empathy.
Additional materials: www.superdatascience.com/334
1/24/2020 • 18 minutes, 14 seconds
SDS 333: BERT and NLP in 2020 and Beyond
Sinan Ozdemir is back again, this time talking about his work since his company Kylie.ai was acquired by Directly. We discuss his work, the way he is creating human and AI synergy and the future of NLP as it continues to progress.
In this episode you will learn:
• Sinan’s company acquired [7:29]
• Explainable deep learning models [16:13]
• Airbnb case study [19:42]
• Microsoft case study [22:25]
• Sinan’s role at Directly [25:57]
• Work with Sinan [32:57]
• Preview of Sinan at DSGO [38:38]
• BERT [43:18]
• Sinan’s prediction for NLP in 2020 [53:17]
Additional materials: www.superdatascience.com/333
1/23/2020 • 1 hour, 4 minutes, 52 seconds
SDS 332: Go through the Motions
I discuss the concept of putting yourself on autopilot and powering through getting work done when you feel like giving up.
Additional materials: www.superdatascience.com/332
1/17/2020 • 9 minutes, 4 seconds
SDS 331: Hacking Data Science Interviews for Graduates
Harshal Sanap talks about how he took himself from a data science student and graduate to a full time professional in data science and shares mistakes to avoid to get started in your career.
In this episode you will learn:
• Harshal at DSGO [8:12]
• Harshal’s first data science job [16:01]
• The process of getting your first job [21:25]
• 3 steps to data science job search [23:37]
• 4 tips on how to apply for jobs [36:59]
• 5 tips on how to prepare for an interview [53:21]
• 5 mistakes to avoid [1:10:31]
Additional materials: www.superdatascience.com/331
1/16/2020 • 1 hour, 31 minutes, 8 seconds
SDS 330: Good!
I discuss finding the good in something that is objectively not so good and how you can take setbacks as a learning experience and challenge.
Additional materials: www.superdatascience.com/330
1/10/2020 • 8 minutes, 31 seconds
SDS 329: Telling a Story Right with Data
Isaac Reyes talks about his approach to data visualization. We dive into the science behind it, the psychology, and the needs in businesses for proper and informed data storytelling.
In this episode you will learn:
• Catching up with Isaac [6:37]
• StoryIQ's office in Manila [10:04]
• What is data storytelling? [12:29]
• The keys to data storytelling [15:47]
• Second key to data storytelling [18:36]
• Third key to data storytelling [21:35]
• Elementary Perceptual Tasks Scale [24:10]
• Gestalt principles [38:56]
• How does StoryIQ teach this? [48:35]
• Fourth key to data storytelling [49:20]
Additional materials: www.superdatascience.com/329
1/9/2020 • 1 hour, 4 minutes, 38 seconds
SDS 328: Look for the Horse
In this week’s FiveMinuteFriday, I wish you all a happy New Year with an interesting story about having the choice to see the best in situations or see the worst in them.
Additional materials: www.superdatascience.com/328
1/3/2020 • 4 minutes, 29 seconds
SDS 327: Data Science Trends for 2020
Hadelin and I outlined our top 5 trends in Data Science for 2020. We discussed why they’re hot topics and how companies can utilize them to drive profit and efficiency in the coming year.
In this episode you will learn:
• The decade in review [1:45]
• A decade preview [5:20]
• 2020 trends webinar [7:30]
• Robotic process automation [9:00]
• Natural language processing [18:28]
• Reinforcement learning [26:35]
• Edge computing [37:25]
• Open source AI frameworks [52:02]
Additional materials: www.superdatascience.com/327
1/2/2020 • 1 hour, 8 minutes, 8 seconds
SDS 326: Who Inspires You?
This week’s FiveMinuteFriday and final episode of 2019 is about who inspires you and how it may be those closest to you without you even realizing it.
Additional materials: www.superdatascience.com/326
12/27/2019 • 5 minutes, 9 seconds
SDS 325: What I Learned in 2019
I went over the 7 top learnings I took from this exciting year of ups, downs, and incredible adventures and explorations.
In this episode you will learn:
• Dichotomies [6:18]
• F*ck FOMO [19:00]
• Full circle stress [27:23]
• Letting doors close [38:57]
• Managing my energy as an introvert [45:00]
• No coaching [56:15]
• Feelings [1:03:38]
Additional materials: www.superdatascience.com/325
12/26/2019 • 1 hour, 14 minutes, 26 seconds
SDS 324: Proximity is Power #2
In this week’s FiveMinuteFriday, Vitaly and I talked more about a familiar topic: proximity is power. We discussed the importance of connection, how to not saturate, and how to decide with whom you spend your time.
Additional materials: www.superdatascience.com/324
12/20/2019 • 29 minutes, 59 seconds
SDS 323: Data Science as a Freelance Career
I chatted with top Upwork freelancer Wesley Engers, who has completed over 150 jobs in data science. He’s worked in a variety of industries and shared a few of his most interesting jobs and offered advice for those considering diving into freelance data science work.
In this episode you will learn:
• Wesley on Upwork [9:11]
• Wesley’s background [16:20]
• How Wesley onboards a client [26:32]
• Good clients vs. bad clients [31:12]
• Tools [37:23]
• Wesley’s best projects [45:09]
• Freelance vs. full-time work [59:26]
• Tips about getting into Upwork [1:06:48]
Additional materials: www.superdatascience.com/323
12/19/2019 • 1 hour, 14 minutes, 41 seconds
SDS 322: Diets
In this week’s FiveMinuteFriday, Vitaly and Hadelin join me again to discuss our diets and how we stay healthy and feel good through what we eat.
Additional materials: www.superdatascience.com/322
12/13/2019 • 23 minutes, 45 seconds
SDS 321: The Life of One Advanced Data Scientist
I sat down with Morgan Mendis, whom I met at DSGO this year. He is one of the most advanced data scientists I’ve met, and he’s been using his skills and experience to give back to his community. We discuss his career, his dreams, his ideology, and his hunt for a VP of Data Science at his former company.
In this episode you will hear:
• Catch up since DSGO 2019 [8:04]
• VP of Data Science at Inspire [12:00]
• Morgan’s career dreams [22:04]
• Morgan’s experience [30:50]
• Tools & solutions [1:01:33]
• How you can get involved [1:12:45]
Additional materials: www.superdatascience.com/321
12/12/2019 • 1 hour, 20 minutes, 45 seconds
SDS 320: Mentorship
In this week’s FiveMinuteFriday I sat down with Vitaly and Hadelin to discuss the concept of mentorship and how we work through our professional and personal hurdles with mentors.
Additional materials: www.superdatascience.com/320
12/6/2019 • 37 minutes, 21 seconds
SDS 319: The Path to Data Visualization
I sat down with Jonathan and Ogo, two DataScienceGO attendees, who are experts in the field of data visualization. Their methods and backgrounds differ but ultimately they believe in the same goal: telling a meaningful story.
Additional materials: www.superdatascience.com/319
12/5/2019 • 1 hour, 18 minutes, 43 seconds
SDS 318: Amazing
In this week’s FiveMinuteFriday I discuss the concept of “fake it until you become it” and use of the word “amazing” when thinking about your current state and when people ask how you are.
Additional materials: www.superdatascience.com/318
11/29/2019 • 9 minutes, 20 seconds
SDS 317: A Deep Dive Into Neural Nets
An incredible young guest joins me in this episode after attending DSGO. Edis is a 15-year-old building his own neural networks. We discussed his background, his process of building neural networks from scratch, Kaggle competitions, and the benefit of online data science education.
Additional materials: www.superdatascience.com/317
11/28/2019 • 1 hour, 2 minutes, 36 seconds
SDS 316: Make It About Yourself
In this week’s FiveMinuteFriday I discuss how best to handle disagreements by keeping your focus on yourself and your own actions.
Additional materials: www.superdatascience.com/316
11/22/2019 • 10 minutes, 35 seconds
SDS 315: Making Data Accessible
Back by popular demand is Gabriela de Queiroz to discuss various data accessibility issues and how her work, talks, and organizations are working to make data science and AI more available across the board.
Additional materials: www.superdatascience.com/315
11/21/2019 • 1 hour, 13 minutes, 17 seconds
SDS 314: Meet the Team
I asked the team what was one wish they had for our students on their data science journey. The answers are inspirational and encouraging for students at all levels.
Additional materials: www.superdatascience.com/314
11/15/2019 • 4 minutes, 51 seconds
SDS 313: The Power of Online Data Education
Marco Caviezel’s journey from research-based psychology into a career as a data analyst is really fascinating. He did his entire data education online and managed not only to teach himself machine learning and data visualization but also to get a job as a data analyst through his own work.
Additional materials: www.superdatascience.com/313
11/14/2019 • 1 hour, 4 minutes, 2 seconds
SDS 312: Contemplation
Kirill and Mitja share some thoughts about one of the workshops at the SuperDataScience offsite retreat. They explore the practice of contemplation as a way to get a deeper understanding and insights.
Additional materials: www.superdatascience.com/312
11/8/2019 • 12 minutes, 59 seconds
SDS 311: Using Data Right In Smart Cities
This episode with Daniel Obodovski explores smart cities and the importance of problem-solving from city to city by using data correctly. But solutions aren’t always obvious, privacy continues to be a huge issue for citizens, and not every city prioritizes problems the same way. It’s a fascinating topic.
Additional materials: www.superdatascience.com/311
11/7/2019 • 1 hour, 14 minutes, 2 seconds
SDS 310: Trial by Fire
Kirill and Mitja share thoughts on purposeful “trials by fire” in your life and how you can force yourself to grow through intended adversity.
Additional materials: www.superdatascience.com/310
11/1/2019 • 6 minutes, 55 seconds
SDS 309: Learning Through Competition
A conversation between rival online educators in the data science community about the challenges of creating a worldwide community with millions of students, the trends in data science, and how education can keep up to date.
Additional materials: www.superdatascience.com/309
10/30/2019 • 54 minutes, 59 seconds
SDS 308: Your Tribe
A FiveMinuteFriday about the importance of belonging and how a connection to the larger community in the work that you do can be incredibly beneficial and meaningful for both your career and personal happiness.
Additional materials: www.superdatascience.com/308
10/25/2019 • 10 minutes, 46 seconds
SDS 307: Problem Solving Through Better Thinking
Kirill and Marc have a conversation that started as a quick FiveMinuteFriday discussion on thoughtfulness but turned into a full podcast's worth of content on the power of thought, mindfulness, practice, and how even data scientists need to look past facts and information and follow their intuition.
Additional materials: www.superdatascience.com/307
10/23/2019 • 46 minutes, 55 seconds
SDS 306: Pura Vida
The Costa Rican phrase "Pura Vida" is something very important to think about because it is incredibly beautiful, filled with emotion and it is so powerful. What meaning would this phrase have for you, in your life?
Additional materials: www.superdatascience.com/306
10/18/2019 • 6 minutes, 54 seconds
SDS 305: Using Data Visualization Tools
Jean-Pierre Labuschagne's career journey started in South Africa and moved to Europe, where he is bringing massive value with the power of data visualization. He is also teaching successful courses online after spending 2 years as a student of online courses himself.
Additional materials: www.superdatascience.com/305
10/16/2019 • 1 hour, 6 minutes, 15 seconds
SDS 304: The Law of Attraction
Can you think of examples when the law of attraction worked in your life?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/304
10/11/2019 • 10 minutes, 35 seconds
SDS 303: Proper Hypothesis Testing For Every Field
In this episode of the SuperDataScience Podcast, I chat with Astrophysicist and Online Data Science Instructor, Sam Hinton. You will hear about the Lindau Nobel Laureates meeting, where he met Nobel Prize winners, and about his appearance on the Survivor TV show. You will learn about quantum mechanics and about the course he launched on Python for Statistical Analysis, and we go in-depth on hypothesis testing. You will also hear about Python versus R, statistical significance, why a p-value of 0.5 is bad, Bayesian statistics, and the difference between frequentist and Bayesian approaches.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/303
10/9/2019 • 1 hour, 10 minutes, 4 seconds
SDS 302: What is Data Science to you?
What is Data Science to you?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/302
10/4/2019 • 5 minutes, 32 seconds
SDS 301: Finding Your Edge
In this episode of the SuperDataScience Podcast, I chat with Data Scientist at TD Bank, Ayobami Ayodeji. You will hear Ayobami's valuable insights about the takeaways from DataScienceGO 2019, including productization of data science products, the 3 types of data science teams, and building character and resilience. You will also learn about Ayobami's career journey from project manager to data scientist and the sacrifices he made on that journey.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/301
10/2/2019 • 1 hour, 8 minutes, 41 seconds
SDS 300: Legacy
What are you leaving for the next generation on this planet?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/300
9/27/2019 • 9 minutes, 35 seconds
SDS 299: Becoming Seasoned At Failure
In this episode of the SuperDataScience Podcast, I chat with Head of Data Science and Machine Learning, Michelle Keim. You will hear what working remotely is all about in data science. You will learn about the importance of failure, and why everyone should lose their job at least once. You will hear about churn and segmentation, what they meant 10 years ago and what they mean now. You will also learn about imposter syndrome and what to do when you feel like an imposter while applying for a role. You will hear about moving from centralized data science teams to integrated experts within the business, and about the three key learnings on leading people that Michelle has taken away from her experience as a leader.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/299
9/25/2019 • 1 hour, 9 minutes, 33 seconds
SDS 298: The Six Months Rule
What would you change about the things you do in your life if you thought you only had 6 months to live?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/298
9/20/2019 • 5 minutes, 8 seconds
SDS 297: Fortitude & Passion in the Data Science Journey
In this episode of the SuperDataScience Podcast, I chat with data scientist Ayodele Odubela. You will hear how and why she chose to do a Masters in Data Science and supplemented that with online education. You will also hear about self-discovery, fortitude and passion, and how she got one of her data science jobs through Twitter. You will learn about some of Ayodele's projects, like using SVM for detecting poisonous vs. edible mushrooms, using random forests and decision trees for ranking wines based on their chemical contents, and using Naive Bayes to detect spam. You will learn about the real-world project that she's worked on, bullet-stopping flying drones. You will find out what role machine learning played in that project and how the drones are going to be applied in society once they get rolled out.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/297
9/18/2019 • 1 hour, 5 minutes, 50 seconds
SDS 296: Who You Become
Do you take time to reflect on who you became or actions you took while on a path to achieving a goal?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/296
9/13/2019 • 11 minutes, 4 seconds
SDS 295: A Deep Conversation About Tech & Life
In this episode of the SuperDataScience Podcast, I chat with my friend and business partner, Hadelin de Ponteves. You will hear what new exciting things are happening in Hadelin's life now. You will hear a preview of his upcoming presentation at DataScienceGO 2019, which will cover NLP, especially the BERT model, which raised NLP to a whole new level. You will also learn about reinforcement learning and Hadelin's new course on Twin-Delayed DDPG.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/295
9/11/2019 • 55 minutes, 27 seconds
SDS 294: Perception of AI in Big Companies
What about AI worries you in the professional world?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/294
9/6/2019 • 9 minutes, 44 seconds
SDS 293: True Personalization Through Reinforcement Learning
In this episode of the SuperDataScience Podcast, I chat with Data Scientist, Peyman Hesami. You will find out what reinforcement learning is and how it works on an intuitive level. You will hear about the differences between reinforcement learning and classification or other supervised learning methods, and how it's used for personalization specifically. You will learn about six distinct advantages of reinforcement learning, and what role reinforcement learning is going to play in the future of machine learning and why. Also, you will find out how and why Peyman made a career transition to work for a startup, how he's using reinforcement learning, and the biggest mistake he has made with reinforcement learning.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/293
9/4/2019 • 1 hour, 59 seconds
SDS 292: Introverts and Extroverts
How can you find a way to balance your energy through recharging in the way that works best for you?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/292
8/30/2019 • 9 minutes, 11 seconds
SDS 291: Changing the World With Theory & Data
In this episode of the SuperDataScience Podcast, I chat with founder and CEO at Daisy Intelligence, Gary Saarenvirta. You'll learn about dangerous implicit assumptions, the power of theory and theory versus data. You'll also learn about two types of decisions, the spatial interaction model, the traffic flow model, and the concept of dividing the world in two: what humans should be doing and what artificial intelligence should be doing. You will hear about the difference between artificial intelligence that leverages just data versus artificial intelligence that leverages theory and data, and what advantages that creates.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/291
8/28/2019 • 1 hour, 5 minutes, 19 seconds
SDS 290: The Passion Paradox
How does your inner voice compare to your passions?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/290
8/23/2019 • 10 minutes, 52 seconds
SDS 289: AI, Deepfakes and Call of Duty
In this episode of the SuperDataScience Podcast, I chat with top AI influencer, Ben Taylor. You will learn some very cool concepts about artificial intelligence such as active adverse impact mitigation, what that means and how that can help train on your dataset without bias. You will hear about AI ethics, deepfakes and Ben's current passion project, building an artificial intelligence that plays Call of Duty, which he will actually demonstrate at DataScienceGO this year at the end of September.
If you enjoyed this episode, check out the video, show notes, resources, and more at www.superdatascience.com/289
8/21/2019 • 1 hour, 4 minutes, 50 seconds
SDS 288: Love Yourself
Can you pick one activity to implement for yourself this week to engage in loving yourself?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/288
8/16/2019 • 11 minutes, 53 seconds
SDS 287: How To Be Social About Data Science
In this episode of the SuperDataScience Podcast, I chat with a data scientist, public speaker and the mastermind behind the Let’s Go Data Science meetups, Ashwin Chirag. You will learn why it is very important to attend meetups and what benefits and advantages you get from them. You will hear some great stories on how in-person connections with data scientists can take your career to the next level. You will also hear about Ashwin’s experience in standup comedy and what it has taught him. And finally, you will hear how attending meetups and conferences such as DataScienceGO has changed the trajectory of Ashwin’s life.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/287
8/14/2019 • 1 hour, 5 minutes, 35 seconds
SDS 286: Solitude Deprivation
Can you give yourself an hour this weekend to physically separate from your phone?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/286
8/9/2019 • 11 minutes, 17 seconds
SDS 285: Bringing Dev & Diverse Communities Into Data Science
In this episode of the SuperDataScience Podcast, I chat with the top contributor on Stack Overflow, Jon Skeet. You will learn what versioning is, and how it affects developers and data scientists. You will hear about compiled versus interpreted languages, what the silver bullet in code diagnostics is, what kind of problems you want to diagnose and the 'divide and conquer' principle. You will also hear about the importance of community, what it means to be part of a community and how communities grow, and what you can do as a data scientist to make our community more inclusive and more welcoming, and to help it prosper further.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/285
8/7/2019 • 1 hour, 16 minutes, 25 seconds
SDS 284: Proximity is Power
Who in your life can you get more inspiration and learning from by increasing your proximity?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/284
8/2/2019 • 10 minutes, 10 seconds
SDS 283: Getting The Most Out of Data With Gradient Boosting
In this episode of the SuperDataScience Podcast, I chat with one of the key people behind the Python package scikit-learn, Andreas Mueller. You will learn about gradient boosting algorithms, XGBoost, LightGBM and HistGradientBoosting. You will hear Andreas's approach to solving problems, what machine learning algorithms he prefers to apply to a given data science challenge, in which order and why. You will also hear about problems with Kaggle competitions. You will find out the four key questions that Andreas recommends asking when you have a data challenge in front of you. You will learn about his 95% rule for creating models, and about creating success in business enterprises with the help of machine learning. And, finally, you will also learn about the Data Science Institute at Columbia University.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/283
7/31/2019 • 1 hour, 2 minutes, 25 seconds
SDS 282: Learning Something New
What can you spend time learning that’s new to you and how can it help your lifestyle and career?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/282
7/26/2019 • 11 minutes, 3 seconds
SDS 281: Futureproofing Your Digital Marketing Tactics
In this episode of the SuperDataScience Podcast, I chat with the Founder and Director of Digital Strategy at Webfor, Kevin Getch. You will learn what digital assistants are and where they're going with the help of people like Ray Kurzweil at Google. You will hear Kevin's philosophy on 'what gets measured gets managed' and what it means for marketing and data science. You will also learn why websites are less and less important, how segmentation is slowly transitioning to personalization, and about creating amazing customer experiences, DISC profiles, natural language processing, and computer vision and their role in the future of marketing.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/281
7/24/2019 • 1 hour, 6 minutes, 11 seconds
SDS 280: Gap Year
Is there a period in your life you can look at and feel grateful for your freedom and experience?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/280
7/19/2019 • 10 minutes, 23 seconds
SDS 279: Embedding Data Science in Business
In this episode of the SuperDataScience Podcast, I chat with Head of Data Science at Scribd, Kevin Perko. You will learn what it's like to be a data science manager, or a data science leader, and what it's like to manage a team, and more so two teams, in two different locations, and how that is different to actually doing the technical work. Also, you'll learn about the Book Genome Project at Scribd, what it's like when a company sees data science as a product, as opposed to an auxiliary function, and a very valuable concept of decentralized, or embedded teams, versus core data science teams and the advantages and disadvantages of each approach.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/279
7/17/2019 • 1 hour, 8 minutes, 44 seconds
SDS 278: Your Core Strength
What do you do extremely well that no one else around you does quite as well and how can you leverage it?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/278
7/12/2019 • 15 minutes, 43 seconds
SDS 277: The New Age of Reason
In this episode of the SuperDataScience Podcast, I chat with Serial Entrepreneur, Khai Pham. You will learn why a data science mindset is an advantage even for an entrepreneur. You will hear about general artificial intelligence versus superintelligence, what the differences are and why you don't really need general artificial intelligence to get to superintelligence. You'll also learn how questions are more important than answers, and hence the reasoning engine versus a search engine. You'll hear about Khai's experience in becoming a founder of companies. You'll learn what the whole idea of reasoning is and why companies need to move from data-driven and machine learning-driven to reasoning-driven.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/277
7/10/2019 • 1 hour, 4 minutes, 27 seconds
SDS 276: Data Science in Wealth Management
How can you use these processes in your industry?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/276
7/5/2019 • 16 minutes, 20 seconds
SDS 275: Machine Learning Through Reinforcement & Contextual Bandits
In this episode of the SuperDataScience Podcast, I chat with the Machine Learning Research Scientist, John Langford. You will hear about unsupervised learning, supervised learning and reinforcement learning, and the differences between the three. You will learn about applications of contextual bandits and reinforcement learning in general, YOLO-style algorithms versus simulator algorithms, and techniques for avoiding local optima. You will also learn about the balance between exploration and exploitation, learning to search and active learning.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/275
7/3/2019 • 1 hour, 1 minute, 55 seconds
SDS 274: Ask the Right Question
What has happened to you recently that could call for clarity from a mentor or coach?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/274
6/28/2019 • 14 minutes, 24 seconds
SDS 273: Predict, Prevent, Detect: Cyber Security
In this episode of the SuperDataScience Podcast, I chat with Matthew Rosenquist, one of the world's leading experts in the space of cybersecurity. You will learn what balance in cybersecurity means and what the dark web is. You will hear how Matthew's career developed and how he thinks about the strategy of cybersecurity. You will also learn about the valuable role of data science in cybersecurity and the steps you can take to get into this space.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/273
6/26/2019 • 1 hour, 6 minutes, 19 seconds
SDS 272: Data Science in Energy
How can you use data science to help keep the energy industry helpful and ethical towards the planet and future generations?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/272
6/21/2019 • 14 minutes, 25 seconds
SDS 271: Making the Public Graphically Literate
In this episode of the SuperDataScience Podcast, I chat with the legend of visual journalism, Alberto Cairo, who talks about understanding whether your data is measuring what you intended to measure. You will hear about Simpson's paradox and the ecological fallacy. You will learn about the four kinds of literacy, exploratory data analysis versus communicating results, and how to create a narrative structure in your visualization and convey the insights so that people can better understand them. And finally, you will hear about ethics in data visualization.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/271
6/19/2019 • 1 hour, 5 minutes, 35 seconds
SDS 270: The Cold is My Master
How can you leverage this to help you focus on your work in data science?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/270
6/14/2019 • 24 minutes, 44 seconds
SDS 269: Maximizing Advertising Efforts With Data
In this episode of the SuperDataScience Podcast, I chat with Justin Fortier, the principal data scientist at ViralGains. You will hear about ad tech, performing insights, getting insights, and making decisions within milliseconds with data science. You will learn about the business impact, why business impact is ultra-important, and on the other hand, why user experience is an ultra-important factor for a data scientist to consider more and more in today's world. You'll also learn about Justin's path from managing data scientists at large organizations, to being the single data scientist at smaller startups and you will hear some very interesting decisions he made throughout his career.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/269
6/12/2019 • 1 hour, 3 minutes, 45 seconds
SDS 268: Data Science in Insurance
How can you see data science disrupting and progressing the insurance industry in the future?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/268
6/7/2019 • 16 minutes, 18 seconds
SDS 267: Achieving Data Science Maturity
In this episode of the SuperDataScience Podcast, I chat with Manasi Vartak, founder and CEO at Verta.ai. You will learn about model tracking, versioning, and maintenance. You will also learn what data maturity means and what the 3 areas are in which top-tier data science teams are investing. You will hear a great discussion about the boom that will happen with machine learning in the next 3 years and what you can do to prepare your career or your business for the future.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/267
6/5/2019 • 56 minutes, 49 seconds
SDS 266: Exploration vs Exploitation
Where are you overexploiting patterns in your life and where can you be more open to some exploration?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/266
5/31/2019 • 15 minutes, 9 seconds
SDS 265: Data Science in the World of Big Data
In this episode of the SuperDataScience Podcast, I chat with Frank Kane, an expert and top instructor in the field of big data, who also worked at Amazon for 10 years. You will learn how data science and big data used to be different but are now converging into something very intertwined. You will hear about recommender systems such as user-based and item-based collaborative filtering as well as other types of recommender systems and where this space of recommender systems is going. You will also hear about singular value decomposition or SVD model-based methods, deep learning and Amazon DSSTNE. And in the end, you will learn some very valuable tips on how to get hired by big companies like Amazon.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/265
5/29/2019 • 1 hour, 1 minute, 53 seconds
SDS 264: Data Science in Agriculture
How do you see data science continuing to improve the world’s most important industry?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/264
5/24/2019 • 10 minutes, 49 seconds
SDS 263: Communicating Data
In this episode of the SuperDataScience Podcast, I chat with Eoin Murray, the founder of Kyso.io, a platform where you can blog about your data science projects using tools such as Jupyter notebooks. You will learn what the platform means for data scientists and how you can use it to build your online presence and online portfolio. You will hear about startups and how you can jump into creating a startup, what accelerators are, what angel investors are, what venture capital funds are. And you will also hear where data science is going and whether or not data science should be a certified profession.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/263
5/22/2019 • 1 hour, 1 minute, 20 seconds
SDS 262: You Cannot Make Progress Without a Routine
How can you put a routine on a goal you’re trying to achieve?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/262
5/17/2019 • 11 minutes, 16 seconds
SDS 261: Succeeding in Data Science with the Trichotomy of Control
In this episode of the SuperDataScience Podcast, I chat with Andrei Lyskov, a data science writer, who shares not only his experience but also his research and his thoughts and ideas in the space of getting a job in data science. You will learn about the trichotomy of control and the stages involved in job interviews in data science. You will also learn about the importance of referrals and portfolio, and you will hear about learning how to learn.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/261
5/15/2019 • 1 hour, 2 minutes, 47 seconds
SDS 260: Data Science in Real Estate
How can we, as data scientists, scale the existing technologies in real estate?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/260
5/10/2019 • 13 minutes, 25 seconds
SDS 259: Building Machine Autonomy With Neural Networks
In this episode of the SuperDataScience Podcast, I chat with Stephen Welch, a computer vision and neural networks expert who will share a ton of information about the space of self-driving cars. You will learn about self-driving cars starting from the history of neural networks and how that was associated with self-driving cars from the '60s, '70s, '80s and all the way until now. You'll also learn about autonomous driving and the three components in the neural networks related to autonomous driving and what they are and how they work. You will find out about the five different levels of autonomous driving and what to expect in the next 10-20 years. You will hear a case study of how machine learning can be applied to historically older industries and you will also hear some very valuable career advice.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/259
5/8/2019 • 59 minutes, 49 seconds
SDS 258: Eating S.L.O.W.L.Y.
How can you work S.L.O.W.L.Y. into your daily routine?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/258
5/3/2019 • 17 minutes, 58 seconds
SDS 257: AI: How Far We Haven’t Actually Come
In this episode of the SuperDataScience Podcast, I chat with Melanie Mitchell, one of the leading researchers in the field of AI. You will learn about complexity, what it is and how it works, and how it can be seen in different areas of life. You will hear about common sense, meta-cognition, explainable AI, and you will also hear Melanie's ideas and thoughts on the future of AI, which break down into two areas that you'll find out about in this podcast.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/257
5/1/2019 • 1 hour, 2 minutes, 25 seconds
SDS 256: Data Science in Transportation
How do you see data science assisting in the transportation industry in the future?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/256
4/26/2019 • 13 minutes, 55 seconds
SDS 255: Diving Into Computer Vision
In this episode of the SuperDataScience Podcast, I chat with the founder of PyImageSearch.com, Adrian Rosebrock, who gives us a great overview of the space of computer vision. You will learn what computer vision was in the past, what it is now, and most importantly, what it will be in the future and what you need to prepare for if you're interested in computer vision. You will also learn about OpenCV and how to quickly get started with it as it is one of the most popular libraries and tools for computer vision in the world right now.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/255
4/24/2019 • 56 minutes, 12 seconds
SDS 254: Two Wolves
What negative and positive emotions do you find yourself acknowledging most often?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/254
4/19/2019 • 6 minutes, 10 seconds
SDS 253: Solving Problems With Data Science & Uber
In this episode of the SuperDataScience Podcast, I chat with Associate Professor at the University of California San Diego, Bradley Voytek, who was the first data scientist at Uber and the person who kickstarted data science there. You will hear a lot about his past work at Uber and his current work at UCSD. You will learn four very valuable philosophical points about data science being a separate field and you will also learn what data science skills can help you resist automation.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/253
4/17/2019 • 1 hour, 14 minutes, 55 seconds
SDS 252: Data Science In Construction
What other problems in construction can you think of that data science could offer a solution for?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/252
4/12/2019 • 13 minutes, 19 seconds
SDS 251: Transforming the Identity Authentication Space
In this episode of the SuperDataScience Podcast, I chat with the CEO and Data Scientist at TypingDNA. You will hear about a brand new industry which is transforming everything we know about security. You will learn about typing biometrics, how it works, and how machine learning and data science enable this industry to go forward. You will also hear what it is like to run a data science startup and how to go from an idea to research and finally to creating a business.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/251
4/10/2019 • 1 hour, 5 minutes, 25 seconds
SDS 250: Guilt vs Shame
How can you pivot your way of talking about actions and yourself to make negative feelings productive?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/250
4/5/2019 • 13 minutes, 42 seconds
SDS 249: Diving Into Data Science Consulting
In this episode of the SuperDataScience Podcast, I chat with the CEO and Co-Founder of SFL Scientific, Michael Segala, who is joining us for the second time to share his overview of data science consulting and data science projects overall. You will hear some amazing case studies which include healthcare imaging, logistics and supply chain, and the space of energy. You will also learn about the challenge of small data and how to deal with unbalanced data sets.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/249
4/3/2019 • 1 hour, 11 minutes, 2 seconds
SDS 248: Data Science in Government
How has data science implementation in government helped improve your community?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/248
3/29/2019 • 12 minutes, 45 seconds
SDS 247: The Science Fact of Technology, AI, & Social Media
In this episode of the SuperDataScience Podcast, I chat with Pablos Holman, the famous hacker, inventor, and entrepreneur. You will hear how artificial intelligence is impacting the world, what Maslow's hierarchy of needs is, and how that is affected by technology. You will also hear what roles Data Science and machine learning are playing in the future of the world.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/247
3/27/2019 • 59 minutes, 7 seconds
SDS 246: Boost Your Self-Confidence
How will you prepare yourself for imperfection going forward?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/246
3/22/2019 • 16 minutes, 7 seconds
SDS 245: Knowing What You Need to Know With Data Science
In this episode of the SuperDataScience Podcast, I chat with the Seasoned Executive Luis Blanco about his amazing career journey from which you will gain very valuable insights for your career development. Also, you will learn about fact-based decision making cultures, how to create and nurture them and you will also learn about cross-departmental work and sharing models between departments.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/245
3/20/2019 • 1 hour, 1 minute, 20 seconds
SDS 244: Data Science in Entertainment
How do you experience data science in your everyday use of entertainment media?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/244
3/15/2019 • 16 minutes, 16 seconds
SDS 243: Geospatial Analytics: Where Data Science & Actuarial Science Meet
In this episode of the SuperDataScience Podcast, I chat with the Senior Underwriter Dominic Roe about his actuarial work and the pioneering and implementation of a risk assessment method he’s utilized for insurance companies that’s now widespread across Australia. You will hear a very detailed explanation of how he built his model. You will also learn about geodemographic segmentation and hear some very interesting use cases of geodemographic segmentation in data science.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/243
3/13/2019 • 1 hour, 6 minutes, 24 seconds
SDS 242: Meditation
Do you use meditation to more efficiently filter your thoughts during both work and leisure time?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/242
3/8/2019 • 11 minutes, 46 seconds
SDS 241: Pushing the Boundaries in Mental Healthcare with Data Science
In this episode of the SuperDataScience Podcast, I chat with Dr. Guillermo Cecchi about the role of data science in medical research and maybe even the future of artificial intelligence. You will learn how data science and artificial intelligence are pushing the boundaries of mental healthcare. You will hear some very interesting approaches about getting insights from audio samples of patients’ voices and their speech and you will also learn about the development of some fascinating techniques, like transferring intuitive knowledge from professionals in the healthcare field into algorithms.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/241
3/7/2019 • 1 hour, 1 minute, 28 seconds
SDS 240: State of Artificial Intelligence in Business
How would you approach adopting AI into your business?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/240
3/1/2019 • 32 minutes, 28 seconds
SDS 239: From Candidate to Career: Pathways for Data Scientists
In this episode of the SuperDataScience Podcast, I chat with the Data Science Headhunter and Head of Analytics Recruitment at IT Search and Selection, Adrian Clarke. You will learn about the state of the data science industry globally and the different data science roles that exist in the world. You will also learn what to expect in terms of data science salaries and why there is a huge demand for data scientists. You will hear about the concept of the hybrid professional and gain a lot of valuable career insights.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/239
2/28/2019 • 1 hour, 3 minutes, 35 seconds
SDS 238: Data Science in Banking
Which aspect of banking do you see making the best use of data science?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/238
2/22/2019 • 13 minutes, 15 seconds
SDS 237: Data Privacy, GDPR, and You
In this episode of the SuperDataScience Podcast, I chat with the principal attorney, Jessica Merlet, who is an extremely experienced lawyer and gives us an excellent breakdown of what GDPR is.
You will learn all about GDPR, from the requirements for capturing and storing data to processing it under GDPR. You will also learn some important terms such as data controller, data processor, affirmative consent and sensitive information, and you will learn what the 4 pillars of GDPR and the 6 legal bases for capturing and processing data are.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/237
2/21/2019 • 1 hour, 17 minutes, 32 seconds
SDS 236: How to Deal with Negative Emotions
What negative emotion has been difficult for you to control lately?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/236
2/15/2019 • 8 minutes, 42 seconds
SDS 235: Living the Dream With Data Science
In this episode of the SuperDataScience Podcast, I chat with the Data Science Consultant, Nic Ryan, who does a great job of combining the technical and consulting/mentoring parts of data science in his career. You will hear Nic's journey of creating his remote career and a balanced work-family relationship, along with some valuable tips you can apply to your own career. You will also hear about natural language processing and some examples of his work.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/235
2/14/2019 • 1 hour, 3 minutes, 15 seconds
SDS 234: Data Science in Education
What challenges have you seen lately in educational institutions that data science or AI can help with?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/234
2/8/2019 • 13 minutes, 8 seconds
SDS 233: High Octane Data Science Leadership at Red Bull
In this episode of the SuperDataScience Podcast, I chat with the Director of Data Science at Red Bull, Josh Muncke, who gives us some very valuable insights. You will hear a couple of case studies of how Red Bull uses Data Science. You will learn how Data Science Leadership is an important area for businesses in Data Science, and you will also learn about the importance of asking good data questions.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/233
2/7/2019 • 59 minutes, 20 seconds
SDS 232: Sleep on it
How do you approach your roadblocks?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/232
2/1/2019 • 10 minutes, 31 seconds
SDS 231: Data Visualizers: The Storytellers of Data Science
In this episode of the SuperDataScience Podcast, I chat with the professional data visualizer, Mollie Pettit. You will learn about the difference between a data scientist and a data visualizer. You will also learn about the D3.js JavaScript library, when to use it, and how you can benefit from using it. You will hear one of Mollie's case studies and how she participates in projects that use Data Science to contribute to the world.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/231
1/31/2019 • 1 hour, 3 minutes, 32 seconds
SDS 230: SuperDataScience 2.0
How will SuperDataScience 2.0 serve you now and in the future?
SuperDataScience 2.0 is available at: www.superdatascience.com/yes
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/230
1/25/2019 • 10 minutes, 36 seconds
SDS 229: Data-Driven Approach of Doing Business
In this episode of the SuperDataScience Podcast, I chat with the Co-Founder at Cursor, Adam Weinstein. You will hear Adam's journey from working at LinkedIn to founding his own company. You will learn about the concepts of Data Literacy and Citizen Data Scientist, how Cursor can help you on this journey, and what it means for an organization to be Data Literate and Data Driven.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/229
1/24/2019 • 58 minutes, 23 seconds
SDS 228: Data Science in Mining
What more can you add to the list of how the mining industry can benefit from data science?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/228
1/18/2019 • 20 minutes, 38 seconds
SDS 227: Enhancing Your Mobile Gaming Experience With Data Science
In this episode of the SuperDataScience Podcast, I chat with the Data Science Influencer, Sarah Nooravi, who inspires the data science community in many different ways. You will hear about Sarah's background and interests, from culinary chef to nuclear fusion, and how she found her way to data science. You will also hear a specific case study of data science in the marketing and mobile gaming industries and Sarah's role in it. You will learn about diversity in data science and how the community can help inspire data scientists to be successful.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/227
1/17/2019 • 1 hour, 1 minute, 30 seconds
SDS 226: Flat Tyres Happen
What do you consider your ‘flat tyre’ recently and how did you approach it?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/226
1/11/2019 • 3 minutes, 53 seconds
SDS 225: The Benefit of Having a Diverse Skill Set
In this episode of the SuperDataScience Podcast, I chat with the Business Development Specialist at Velocity Group, Anna Foard, who has built an amazing data science career in a very short time while also being a mother of two children. You will hear how she is challenging and conquering the field of Data Science from many different perspectives. You will learn about her strategic approach to her career and how she has managed to meet and work on projects with many of her heroes in Data Science.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/225
1/10/2019 • 57 minutes, 44 seconds
SDS 224: Hacks For Reading More Books
What great books can you suggest for others to read this year?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/224
1/4/2019 • 9 minutes, 50 seconds
SDS 223: Data Science Trends for 2019
In this episode of the SuperDataScience Podcast, Hadelin de Ponteves and I discuss the accuracy rate of the predictions we made for 2018. We also discuss the key AI and technology trends to look out for in 2019, which can help you structure your career and design your path through technology.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/223
1/3/2019 • 1 hour, 53 minutes, 22 seconds
SDS 222: 2018 in Numbers
What key trend do you think will continue in the year 2019?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/222
12/28/2018 • 9 minutes, 23 seconds
SDS 221: 1-on-1 with Kirill: What I learned in 2018
In this episode of the SuperDataScience Podcast, I reflect on 2018 and share the top 7 things I learned this year.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/221
12/27/2018 • 1 hour, 58 minutes
SDS 220: Data Science in Retail
How does data science help you most when you do your shopping?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/220
12/21/2018 • 14 minutes, 7 seconds
SDS 219: How Kaplan uses Data for Education
In this episode of the SuperDataScience Podcast, I chat with the Vice President of Measurement and Evaluation at Kaplan, David Niemi. You will hear how David applies data in the space of education in order to extract insights and understand how the learning journey can be improved. You will learn some very valuable tips about learning and you will also hear how the education industry is booming and is going to keep growing.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/219
12/20/2018 • 1 hour, 4 minutes, 30 seconds
SDS 218: Start A Great Day
What’s the first thing you’ve written on the list of things that you should be grateful for?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/218
12/14/2018 • 8 minutes, 40 seconds
SDS 217: Aerospace Engineers and Data Science
In this episode of the SuperDataScience Podcast, I chat with the aerospace engineer, Carlos Hervás García, who works for Airbus. You will hear about Aerospace and Orbital Mechanics, the International Space Station, and what aerospace engineers actually do. You will learn how Data Science, Machine Learning, Deep Learning, and Artificial Intelligence can be used in Aerospace Engineering and what value these technologies bring to the field of Interplanetary Travel. You will also hear how Carlos combined his passions for Aerospace and Artificial Intelligence in his career journey.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/217
12/13/2018 • 1 hour, 5 minutes, 8 seconds
SDS 216: Data Science In Healthcare
What do you think is the most significant application of data science in the healthcare industry so far?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/216
12/7/2018 • 12 minutes, 54 seconds
SDS 215: Integrating Data Science as a Developer
In this episode of the SuperDataScience Podcast, I chat with the full stack web developer and aspiring data scientist, Brian Dowe. You will hear what it is like to go from developer to data scientist and how to integrate data science in your career as a developer. You will learn about the development, deployment, and maintenance life cycle of models in business, and you will hear some ideas on modeling in general.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/215
12/6/2018 • 1 hour, 6 minutes, 39 seconds
SDS 214: What Is Amazing In Your Life
What in your life is super amazing right now?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/214
11/30/2018 • 4 minutes, 33 seconds
SDS 213: Amazing Tips from Two Legends of Visualization
In this episode of the SuperDataScience Podcast, I chat with the legends of visualization, Andy Kriebel and Eva Murray. You will learn about visualization and why it is important for data scientists and machine learning experts to know how to visualize data. You will also hear some amazing tips from their brand new creation, MakeoverMonday the Book.
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/213
11/29/2018 • 45 minutes, 33 seconds
SDS 212: Model Driven Vs Data Driven
Are you willing to make the digital shift to be a model-driven business?
If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/212