Going beyond recurrent neural networks, the transformer revolutionized the language modeling field and became the core of many state-of-the-art large language models (LLMs) like ChatGPT, Claude, and Llama. In this post, we'll demystify that core: the transformer.
The advent of deep neural networks and GPUs changed the language modeling landscape. In this post, we'll exploit those same deep neural networks for the task of language modeling!
CMake is the most common meta-build system used for building C/C++ libraries and applications. In this post, I'll describe the anatomy of a good-enough C++ library project structure and a good-enough way to build it using CMake.
I'll describe how to organize multiple artificial neurons into neural networks and how to train them using the most fundamental and important algorithm in all of deep learning: backpropagation of errors.
Perceptrons are a useful pedagogical tool but have a number of limitations, particularly when it comes to training them. In this post, I'll address several issues with perceptrons and promote them into more modern artificial neurons.
Over the past decade or so, neural networks have shown amazing success across a wide variety of tasks. In this post, I'll introduce the grandfather of modern neural networks: the perceptron.
I'll introduce the concept of Lie groups and how they can be useful for working with constrained surfaces, like the space of rotations; we'll also apply them to the problem of accurate robotic state estimation.
I'll overview some of the fundamental deep reinforcement learning algorithms that serve as the basis for many of the more advanced techniques used in practice and research.