Data Science – Ai Inteliigence

How to Set the Number of Trees in Random Forest

Data Science | May 19, 2025 | 19 Views | 0 Likes | 0 Comments

Scientific publication: T. M. Lange, M. Gültas, A. O. Schmitt & F. Heinrich (2025). optRF: Optimising random forest stability by determining the optimal number of trees. BMC Bioinformatics, 26(1), 95. Random Forest: A Powerful Tool for Anyone Working With Data. What is Random Forest? Have you ever wished you…

Survival Analysis When No One Dies: A Value-Based Approach

Data Science | May 14, 2025 | 28 Views | 0 Likes | 0 Comments

Survival Analysis is a statistical approach used to answer the question: “How long will something last?” That “something” could range from a patient’s lifespan to the durability of a machine component or the duration of a user’s subscription. One of the most widely used tools in this area is the Kaplan-Meier estimator. Born in the…

Clustering Eating Behaviors in Time: A Machine Learning Approach to Preventive Health

Data Science | May 9, 2025 | 28 Views | 0 Likes | 0 Comments

It’s well known that what we eat matters — but what if when and how often we eat matters just as much? In the midst of ongoing scientific debate around the benefits of intermittent fasting, this question becomes even more intriguing. As someone passionate about machine learning and healthy living, I was inspired by a 2017 research paper[1] exploring this intersection. The…

From a Point to L∞

Data Science | May 4, 2025 | 38 Views | 0 Likes | 0 Comments

Why you should read this: As someone who did a Bachelor's in Mathematics, I was first introduced to L¹ and L² as measures of distance… now they seem to be measures of error. Where have we gone wrong? But jokes aside, there seems to be this misconception that L₁ and L₂ serve the same function, and…

The Secret Inner Lives of AI Agents: Understanding How Evolving AI Behavior Impacts Business Risks

Data Science | April 29, 2025 | 35 Views | 0 Likes | 0 Comments

Artificial intelligence (AI) capabilities and autonomy are growing at an accelerating pace in agentic AI, escalating the AI alignment problem. These rapid advancements require new methods to ensure that AI agent behavior is aligned with the intent of its human creators and societal norms. However, developers and data scientists first need an understanding of the…

How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals

Data Science | April 24, 2025 | 85 Views | 0 Likes | 0 Comments

The recent launch of the DeepSeek-R1 model sent ripples across the global AI community. It delivered breakthroughs on par with the reasoning models from Meta and OpenAI, achieving this in a fraction of the time and at a significantly lower cost. Beyond the headlines and online buzz, how can we assess the model’s reasoning abilities…

Load-Testing LLMs Using LLMPerf

Data Science | April 19, 2025 | 51 Views | 0 Likes | 0 Comments

Deploying your Large Language Model (LLM) is not necessarily the final step in productionizing your Generative AI application. An often-forgotten yet crucial part of the MLOps lifecycle is properly load testing your LLM and ensuring it is ready to withstand your expected production traffic. Load testing at a high level is the practice of…

Sesame Speech Model: How This Viral AI Model Generates Human-Like Speech

Data Science | April 14, 2025 | 46 Views | 0 Likes | 0 Comments

Recently, Sesame AI published a demo of their latest Speech-to-Speech model. It is a conversational AI agent that is really good at speaking: it provides relevant answers, speaks with expression, and honestly is just very fun and interactive to play with. Note that a technical paper is not out yet, but they do have a…

A Data Scientist’s Guide to Docker Containers

Data Science | April 9, 2025 | 32 Views | 0 Likes | 0 Comments

For an ML model to be useful, it needs to run somewhere. This somewhere is most likely not your local machine. A not-so-good model that runs in a production environment is better than a perfect model that never leaves your local machine. However, the production machine is usually different from the one you developed the…

Are We Watching More Ads Than Content? Analyzing YouTube Sponsor Data

Data Science | April 4, 2025 | 56 Views | 0 Likes | 0 Comments

I’m definitely not the only person who feels that YouTube sponsor segments have become longer and more frequent recently. Sometimes, I watch videos that seem to be trying to sell me something every couple of seconds. On one hand, it’s great that both small and medium-sized YouTubers are able to make a living from their…