Image by Author | Canva  
 
#  Introduction 
  Traditional debugging with print() or logging works, but it’s slow and clunky for LLM applications. Phoenix provides a timeline view of every step, prompt and response inspection, error detection with retries, visibility into latency and costs, and a complete visual picture of your app. Phoenix by Arize…
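The core idea behind a trace timeline can be shown with a hand-rolled sketch. This is a toy illustration of the concept, not Phoenix’s actual API: each step of the app records a named span with its duration, and you inspect the resulting timeline instead of scattering print() calls.

```python
# Toy tracer: records (name, duration) spans for each step of an LLM app.
# A sketch of the general tracing idea only, NOT Phoenix's real interface.
import time
from contextlib import contextmanager

class Tracer:
    def __init__(self):
        self.spans = []  # list of (step name, elapsed seconds)

    @contextmanager
    def span(self, name: str):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.spans.append((name, time.perf_counter() - start))

tracer = Tracer()
with tracer.span("build_prompt"):
    prompt = "Summarize: ..."
with tracer.span("call_llm"):
    time.sleep(0.01)  # stand-in for a model call

for name, seconds in tracer.spans:
    print(f"{name}: {seconds * 1000:.1f} ms")
```

A real tracing tool adds nesting, prompt/response payloads, and error status to each span, but the timeline-of-spans structure is the same.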
		Vision Language Models (VLMs) accept both text and image inputs, combining language understanding with visual perception. Image resolution, however, is crucial to VLM performance on text- and chart-rich data, and increasing it creates significant challenges. First, pretrained vision encoders often struggle with high-resolution images because pretraining at such resolutions is inefficient. Second, running inference on high-resolution images increases computational cost and latency…
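The compute cost of higher resolution is easy to see with back-of-the-envelope arithmetic. Assuming a ViT-style encoder with 14-pixel patches (an assumption for illustration, not a detail from the excerpt), each patch becomes one token, so doubling the resolution quadruples the token count, and self-attention cost grows roughly with the square of that count.

```python
# Rough sketch: patch-token count for a ViT-style encoder (14px patches assumed).
def vit_token_count(height: int, width: int, patch_size: int = 14) -> int:
    """Number of patch tokens produced for an image of the given size."""
    return (height // patch_size) * (width // patch_size)

for side in (224, 448, 896):
    tokens = vit_token_count(side, side)
    # Self-attention compares every token pair, so cost scales ~ tokens**2.
    print(f"{side}x{side}px -> {tokens} tokens, ~{tokens**2:,} attention pairs")
```

Going from 224px to 896px raises the token count 16x and the attention-pair count 256x, which is why naive high-resolution inference is so expensive.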
		How Deep Think works: extending Gemini’s parallel “thinking time”. Just as people tackle complex problems by taking the time to explore different angles, weigh potential solutions, and refine a final answer, Deep Think pushes the frontier of thinking capabilities by using parallel thinking techniques. This approach lets Gemini generate many ideas at once and consider…
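The general shape of parallel thinking can be sketched as best-of-n selection: sample several candidate answers, score each, and keep the best. This is a generic illustration of the idea, not Gemini Deep Think’s actual implementation; `generate_candidate` and `score` are hypothetical stubs standing in for a model and a verifier.

```python
# Generic best-of-n "parallel thinking" sketch; NOT Gemini's real mechanism.
import random

def generate_candidate(problem: str, rng: random.Random) -> str:
    # Hypothetical stub for an LLM sampling one reasoning path.
    return f"answer-{rng.randint(0, 9)} to {problem}"

def score(candidate: str) -> int:
    # Hypothetical stub for a verifier/reward model; here, reads the digit.
    return int(candidate.split("-")[1][0])

def parallel_think(problem: str, n: int = 8, seed: int = 0) -> str:
    rng = random.Random(seed)
    candidates = [generate_candidate(problem, rng) for _ in range(n)]
    return max(candidates, key=score)  # keep the best-scoring idea

print(parallel_think("2+2"))
```

Real systems refine and cross-pollinate candidates rather than just picking one, but the sample-many-then-select skeleton is the common core.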
		Estimated reading time: 5 minutes
Introduction 
    Embodied AI agents are increasingly being called upon to interpret complex, multimodal instructions and act robustly in dynamic environments. ThinkAct, presented by researchers from Nvidia and National Taiwan University, offers a breakthrough for vision-language-action (VLA) reasoning, introducing reinforced visual latent planning to…
		Image by Author | Ideogram  
 
#  Introduction 
  From your email spam filter to music recommendations, machine learning algorithms power everything. But they don’t have to be the inscrutable black boxes they’re often made out to be. Each algorithm is essentially a different approach to finding patterns in data and making predictions. 
In this article, we'll learn essential machine learning…
		Embedding models act as bridges between different data modalities by encoding diverse multimodal information into a shared dense representation space. There have been advancements in embedding models in recent years, driven by progress in large foundation models. However, existing multimodal embedding models are trained on datasets such as MMEB and M-BEIR, with most focusing only…
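What a “shared dense representation space” buys you can be shown with a small sketch, not tied to MMEB or M-BEIR: once text and images are mapped into one vector space, cross-modal retrieval reduces to nearest-neighbor search, for example by cosine similarity. The embedding vectors below are made-up values for illustration.

```python
# Cross-modal retrieval in a shared embedding space via cosine similarity.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical pre-computed embeddings living in the same space.
text_query = [0.9, 0.1, 0.2]
image_a = [0.8, 0.2, 0.1]   # semantically close to the query
image_b = [0.1, 0.9, 0.4]   # unrelated content

print(cosine(text_query, image_a) > cosine(text_query, image_b))  # True
```

Because both modalities share one space, the same similarity function ranks images against a text query with no modality-specific matching logic.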
		Research · Published 23 July 2025 …
		Micromobility solutions—such as delivery robots, mobility scooters, and electric wheelchairs—are rapidly transforming short-distance urban travel. Despite their growing popularity as flexible, eco-friendly transport alternatives, most micromobility devices still rely heavily on human control. This dependence limits operational efficiency and raises safety concerns, especially in complex, crowded city environments filled with dynamic obstacles like pedestrians and…
		Supply chains are the lifeblood of global commerce, yet they remain plagued by inefficiencies—delays, stockouts, overproduction, and unpredictable disruptions. Enter autonomous AI agents, the silent orchestrators now optimizing logistics with superhuman precision. Unlike traditional software, these agents learn, adapt, and make decisions in real-time, often without human intervention. 
“AI agents don’t just follow rules—they rewrite them. In…
		Image by Author | Canva  
 
#  Introduction 
  This is the second article in my beginner project series. If you haven’t seen the first one on Python, it’s worth checking out: 5 Fun Python Projects for Absolute Beginners. 
So, what’s generative AI or Gen AI? It is all about creating new content like text,…