I care about building toward AGI through better memory, stronger systems, and practical engineering.
Browse recent posts.
Why I wanted a lightweight, local-first way to use ArcadeDB from Python
I've been spending a lot of time thinking about memory systems for AI agents. A big part of that work is not just the model, but the storage layer underneath it. I want something t...
Learn more
Formal basics, real systems, and why industry favors property graphs while RDF remains important
Motivation: “Knowledge graph” gets used everywhere. This post pins down a minimal formal meaning, relates it to a widely cited definition from Hogan et al. (2020), and compares RDF...
Learn more
Understanding the evolution from basic policy gradients to modern LLM fine-tuning algorithms
Motivation: Why On-Policy RL Matters for Modern AI. If you've been following the latest developments in large language models (LLMs), you've probably heard of GRPO (Group Relative P...
Learn more
Understanding the evolution of generative models through practical implementations
Motivation: Why Do We Need Generative Models? In order to generate data, we need to sample from a distribution. But here, we are not talking about simply sampling from the training...
Learn more
Balancing n-gram models and neural networks for next-token prediction
Modern Natural Language Processing (NLP) revolves around language modeling: the art of predicting the next token given the previous ones. Formally, if we have a sequence of tokens...
Learn more
Bridging Reinforcement Learning and Supervised Learning in LLMs
Large Language Models (LLMs) have brought a new twist to the way we think about training algorithms. While traditional Supervised Learning (SL) and Reinforcement Learning (RL) migh...
Learn more
A Deep Dive into MLE, Loss Functions, and Beyond
Most of us, whether consciously or not, use Maximum Likelihood Estimation (MLE) in our daily machine learning workflows. When you’re training a model to predict labels in a supervi...
Learn more
Offline RL, Behavior Cloning, and The Magic of Sequence Modeling
Sequential decision making is a fundamental challenge in machine learning and AI. From planning your next vacation itinerary to training a robot to navigate a warehouse, we often f...
Learn more
Tracing the Layers from Machine Code to Natural Language Interfaces
Computing has come a long way since the early days of punch cards and assembly language. With each new generation of programming paradigms, we've added layers of abstraction that m...
Learn more
Why Frequentist Significance Testing Falls Short
Statistical significance testing, specifically the use of p-values, has been the cornerstone of hypothesis testing for decades. However, this frequentist approach has critical flaw...
Learn more
A natural language text can be seen as a knowledge graph
In recent years, Transformers have dominated the field of natural language processing (NLP), while Graph Neural Networks (GNNs) have proven essential for tasks involving graph stru...
Learn more
The reason why reinforcement learning is such a hard problem
Supervised Learning Objective: Maximum Likelihood. In supervised learning, the goal is to learn a function that maps an input to a label. This is typically done by maximizing the...
Learn more
4-bit quantization is amazing
Llama 3.1 was released by Meta a month ago, and you can easily access it . I used their quite extensively a few years ago, and haven't used it for some years. It was already an ama...
Learn more
Exploring the challenges and strategies for effective edge classification with graph neural networks (GNNs)
In this post, I'll walk you through the process of training a Graph Convolutional Network (GCN) for edge classification, a common task in graph-based machine learning applications....
Learn more
Can machines think like us?
The project is the core of my PhD work. It was heavily inspired by cognitive science theories, such as the ones from . It's about developing agents equipped with human-like ext...
Learn more
This really can do a lot of things, although it's still biased.
Generating images with text. I've been using it massively for everything lately. It's also integrated into VS Code, which really helps me to write code faster. Below is an example o...
Learn more
My boring old website had to go.
I thought I could make a better website than the one I had. So I made this one! I'll be posting my projects and other stuff here. I hope you like it! :) I was surprised that I c...
Learn more
Unlocking the power of open-source language models
This paper was a result of the Hugging Face BigScience research workshop that I took part in during 2021. The paper can be found at . Abstract: Large language models (LLMs) have been...
Learn more
A simple room classifier made using EfficientNet and PyTorch Lightning
Check out
Learn more
Revolutionizing NLP with a collaborative hub for crafting and sharing prompts
This paper was a result of the Hugging Face BigScience research workshop that I took part in during 2021. The paper can be found at . Abstract: PromptSource is a system for creating,...
Learn more
We won the NeurIPS 2021 competition
Thank you, Hybrid Intelligence team! Check out
Learn more
Achieving superior zero-shot generalization through explicit multitask prompting
This paper was a result of the Hugging Face BigScience research workshop that I took part in during 2021. The paper can be found at . Abstract: Large language models have recently bee...
Learn more
Achieving state-of-the-art emotion recognition in conversations with a simple RoBERTa-based approach
This is a paper that I wrote together with . We were able to achieve SOTA back then by simply training a RoBERTa (a variant of BERT) with speaker tokens! Check out Abstract : We pr...
Learn more
A simple age-gender classification model using ArcFace embeddings and MLPs
I trained a simple age-gender classification model using ArcFace embeddings and MLPs. Check out the paper and the code. Multilayer Perceptrons (MLPs) are foundational neural network...
Learn more