I'm Taewoon | Taewoon Kim

From REINFORCE to PPO: The Complete On-Policy RL Journey

Understanding the evolution from basic policy gradients to modern LLM fine-tuning algorithms

By Taewoon Kim

Posted on August 7, 2025

Motivation: Why On-Policy RL Matters for Modern AI [Read More]

From VAEs to Diffusion Models: A Step-by-Step Journey

Understanding the evolution of generative models through practical implementations

By Taewoon Kim

Posted on May 28, 2025

Motivation: Why Do We Need Generative Models? [Read More]

Discrete vs. Continuous: A Tale of Two Approaches to Language Modeling

Balancing n-gram models and neural networks for next-token prediction

By Taewoon Kim

Posted on March 9, 2025

Modern Natural Language Processing (NLP) revolves around language modeling—the art of predicting the next token given the previous ones. Formally, if we have a sequence of tokens \(w_1, w_2, \dots, w_{n-1}\), we want to learn: [Read More]

RL vs SL: Understanding Their Roles in Large Language Models

Bridging Reinforcement Learning and Supervised Learning in LLMs

By Taewoon Kim

Posted on February 6, 2025

Large Language Models (LLMs) have brought a new twist to the way we think about training algorithms. While traditional Supervised Learning (SL) and Reinforcement Learning (RL) might seem worlds apart at first glance, their underlying optimization procedures share a striking similarity. In both cases, we aim to maximize an objective... [Read More]

Understanding Maximum Likelihood Estimation

A Deep Dive into MLE, Loss Functions, and Beyond

By Taewoon Kim

Posted on February 5, 2025

Most of us, whether consciously or not, use Maximum Likelihood Estimation (MLE) in our daily machine learning workflows. When you’re training a model to predict labels in a supervised setting, or even when you’re using a self-supervised approach like masked language modeling, you’re often performing MLE under the hood.... [Read More]

Sequential Decision-Making with Transformers

Offline RL, Behavior Cloning, and The Magic of Sequence Modeling

By Taewoon Kim

Posted on December 15, 2024

Sequential decision-making is a fundamental challenge in machine learning and AI. From planning your next vacation itinerary to training a robot to navigate a warehouse, we often face tasks where a series of actions must be taken to achieve a goal, typically with delayed feedback or sparse rewards. [Read More]

From 1+1 in Assembly to LLMs: The Evolution of Computing Abstraction

Tracing the Layers from Machine Code to Natural Language Interfaces

By Taewoon Kim

Posted on November 12, 2024

Computing has come a long way since the early days of punch cards and assembly language. With each new generation of programming paradigms, we’ve added layers of abstraction that make it easier for humans to interact with machines. In this post, we’ll explore how a simple operation like 1 +... [Read More]

The Problems with p-values

Why Frequentist Significance Testing Falls Short

By Taewoon Kim

Posted on October 16, 2024

Statistical significance testing, specifically the use of p-values, has been the cornerstone of hypothesis testing for decades. However, this frequentist approach has critical flaws that can lead to misleading interpretations and false confidence in research results. In this post, we will break down why p-values often fall short and discuss... [Read More]

Can the Transformer be viewed as a special case of a Graph Neural Network (GNN)?

A natural language text can be seen as a knowledge graph

By Taewoon Kim

Posted on October 15, 2024

In recent years, Transformers have dominated the field of natural language processing (NLP), while Graph Neural Networks (GNNs) have proven essential for tasks involving graph-structured data. Interestingly, the Transformer can be seen as a special case of GNNs, particularly as an attention-based GNN. This connection emerges when we treat natural... [Read More]

Is supervised learning a special type of reinforcement learning?

The reason why reinforcement learning is such a hard problem

By Taewoon Kim

Posted on September 22, 2024

Supervised Learning Objective: Maximum Likelihood [Read More]
