Technical Deep Dives
In-depth explorations of technical topics, research, and interesting technologies. This is where I document what I’m learning and share detailed analysis of complex subjects.
Intro
This is an attempt to cover what I know about deep learning (DL) to some degree. Some of it is very skippable, and I don’t remember everything I put in here, so there may be a little repetition, but not much.
The Foundations of Deep Learning
To understand how machine learning models work, you have to completely discard the idea that a model “understands” anything. A model does not read text or see images. At the absolute lowest level, a neural network is just a massively complex sequence of mathematical operations executed on silicon. To feed data into that silicon, we must first translate reality into a format that a GPU’s compute cores can process. That translation layer is the tensor. ...
Overview
I have been looking into self-hosting LLMs, and this is my attempt to put everything I’ve learned about the subject in one place (so I can stop forgetting things). I also wanted to describe the setup I use to self-host LLMs on my laptop and the steps I took to build and optimize it. That will come later: I am still changing some things, and this is long enough already, so I moved some of those parts to the next section. ...