When people say a model has "7 billion parameters" or "70 billion weights," they're talking about the same thing: the internal numbers that define how the model behaves. These weights are like dials on a vast control panel. During training, the model adjusts them — turning some up, some down — until it gets good at predicting the next token, classifying images, or whatever task it's learning.
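The dial-turning described above can be sketched with a toy example. This is not a language model — just a single "weight" `w` nudged by gradient descent until the model `y = w * x` fits data generated by `y = 2 * x`; the function names and learning rate are illustrative choices, not anything standard:

```python
# Toy illustration: one "weight" w is a dial, and training turns it
# a little at a time to shrink the prediction error.
def train(steps=200, lr=0.1):
    w = 0.0  # the single dial, initialized arbitrarily
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # examples of y = 2 * x
    for _ in range(steps):
        for x, y in data:
            pred = w * x
            grad = 2 * (pred - y) * x  # derivative of squared error w.r.t. w
            w -= lr * grad             # turn the dial against the error
    return w

print(train())  # converges to 2.0 — the pattern hidden in the examples
```

A real model does exactly this, except with billions of dials adjusted simultaneously and a much richer notion of "error."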
Think of it like a recipe that gets refined through practice. A chef doesn't just follow fixed instructions; they learn that a pinch more salt works better for this dish, or that this oven runs hot. Weights are the model's equivalent: they capture countless subtle adjustments learned from billions of examples. The model doesn't store facts as a database would — it encodes patterns in these numbers.
Size matters. More weights generally mean more capacity to learn complex patterns, but also more compute to train and run. A 7B model can run on a high-end laptop, especially with quantization; a 70B model needs serious hardware, typically multiple GPUs. Fine-tuning — teaching a pre-trained model new skills — works by updating a subset of these weights rather than starting from scratch.
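The hardware gap follows from simple arithmetic: every weight occupies a few bytes. A minimal back-of-envelope sketch (the function name is made up for illustration, and real memory use also includes activations, the KV cache, and optimizer state during training):

```python
# Rough memory needed just to hold a model's weights in memory.
def weight_memory_gb(n_params, bytes_per_param=2):
    """2 bytes per parameter assumes fp16/bf16 storage;
    4-bit quantization would halve it again (0.5 bytes)."""
    return n_params * bytes_per_param / 1e9

print(weight_memory_gb(7e9))    # 14.0 GB — borderline for a laptop GPU
print(weight_memory_gb(70e9))   # 140.0 GB — multi-GPU territory
```

This is also why quantization matters in practice: shrinking bytes-per-parameter is often the difference between a model fitting on consumer hardware or not.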
Weights are what you get when you download a model file. They're the "brain" — the trained knowledge — separate from the architecture (the structure that defines how those weights connect). When a model hallucinates or makes mistakes, it's often because the weights have encoded a pattern that doesn't quite fit the situation. Understanding weights helps explain why model size, training data, and fine-tuning all affect behavior.