
Historical Nanochat

Ongoing · ML Research

Time-locked language models trained only on historical texts written before a chosen cutoff date, using Karpathy's nanochat pipeline. The project explores whether small models trained exclusively on period texts can reproduce the linguistic patterns of their era.

  • 65 GB historical text corpus spanning multiple eras
  • Time-locked training methodology (no text from after the cutoff leaks into training)
  • RTX 3090 local training pipeline
  • Parquet-based shard management
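
The time-locking and sharding ideas in the list above might be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Document` record, its `year` field, and the `time_locked` and `plan_shards` helpers are all hypothetical names, and the real corpus schema and shard sizes are not described here.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Document:
    """Hypothetical record; the real corpus schema is an assumption."""
    text: str
    year: int  # assumed publication year of the source text


def time_locked(docs: Iterable[Document], cutoff_year: int) -> Iterator[Document]:
    """Yield only documents published on or before the cutoff year."""
    for doc in docs:
        if doc.year <= cutoff_year:
            yield doc


def plan_shards(docs: Iterable[Document], max_chars: int) -> List[List[Document]]:
    """Greedily pack documents into shards of at most max_chars characters.

    A single document longer than max_chars gets a shard of its own.
    """
    shards: List[List[Document]] = []
    current: List[Document] = []
    size = 0
    for doc in docs:
        # Close the current shard if adding this document would overflow it.
        if current and size + len(doc.text) > max_chars:
            shards.append(current)
            current, size = [], 0
        current.append(doc)
        size += len(doc.text)
    if current:
        shards.append(current)
    return shards


if __name__ == "__main__":
    corpus = [
        Document("a" * 10, 1850),
        Document("b" * 10, 1905),  # after the cutoff, so it is dropped
        Document("c" * 10, 1899),
    ]
    kept = list(time_locked(corpus, cutoff_year=1900))
    for i, shard in enumerate(plan_shards(kept, max_chars=20)):
        print(i, [d.year for d in shard])
```

Each planned shard could then be written to its own Parquet file, for example with `pyarrow.parquet.write_table`, although this page does not say which library the project uses. Filtering before sharding keeps post-cutoff text out of every file the training run reads.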
Python · PyTorch · nanochat
