LLM with Adjustable Temperature
For my 10-301 Introduction to Machine Learning course, I built a custom language model with an adjustable temperature parameter to explore how sampling temperature affects text generation.
For this project, I implemented a character-level recurrent neural network from scratch in PyTorch, including the RNN cell, forward pass, loss computation, and gradient-based optimization. Trained on a text corpus, the model learns to predict the next character in a sequence, capturing statistical patterns in the language. After training, I added a user-controlled temperature parameter to the softmax sampling step, allowing the model to generate output ranging from conservative (low temperature) to creative and diverse (high temperature).
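The core idea is sketched below. This is a minimal illustration under assumed names (CharRNN, step, sample, hidden_size are hypothetical, not the course code); the essential piece is the single line that divides the logits by the temperature before the softmax.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharRNN(nn.Module):
    """Illustrative character-level RNN: embed -> vanilla RNN cell -> logits."""

    def __init__(self, vocab_size: int, hidden_size: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        # Vanilla RNN cell: h_t = tanh(W_ih x_t + W_hh h_{t-1} + b)
        self.cell = nn.RNNCell(hidden_size, hidden_size)
        self.head = nn.Linear(hidden_size, vocab_size)

    def step(self, idx: torch.Tensor, h: torch.Tensor):
        """One timestep: consume a character index, return logits and new state."""
        h = self.cell(self.embed(idx), h)
        return self.head(h), h

@torch.no_grad()
def sample(model: CharRNN, start_idx: int, length: int, temperature: float = 1.0):
    """Generate `length` characters, scaling logits by 1/temperature before softmax."""
    h = torch.zeros(1, model.cell.hidden_size)
    idx = torch.tensor([start_idx])
    out = [start_idx]
    for _ in range(length):
        logits, h = model.step(idx, h)
        # Temperature scaling: T < 1 sharpens the distribution toward the
        # most likely character; T > 1 flattens it, increasing diversity.
        probs = F.softmax(logits / temperature, dim=-1)
        idx = torch.multinomial(probs, num_samples=1).squeeze(1)
        out.append(idx.item())
    return out
```

At temperature 1 this reduces to standard softmax sampling; as the temperature approaches 0 it approaches greedy (argmax) decoding, which is what makes low-temperature output conservative and high-temperature output diverse.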
This project demonstrated core concepts in sequence modeling, neural network training, and probabilistic text generation, as well as how temperature scaling shapes model behavior.