4 changes: 2 additions & 2 deletions chapters/en/chapter1/8.mdx
@@ -119,8 +119,8 @@ One common challenge with LLMs is their tendency to repeat themselves - much lik

![image](https://huggingface.co/reasoning-course/images/resolve/main/inference/2.png)

These penalties are applied early in the token selection process, adjusting the raw probabilities before other sampling strategies are applied. Think of them as gentle nudges encouraging the model to explore new vocabulary.
```
LOGIT GENERATION --> PENALTY ADJUSTMENT --> SOFTMAX CALCULATION --> SAMPLING (Temperature/Top-P)
```
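The penalty-adjustment step above can be sketched in plain Python. This is a minimal illustration, not a real library API: the function names are hypothetical, and the penalty scheme shown (divide positive logits, multiply negative ones) mirrors the common repetition-penalty heuristic of making already-generated tokens less likely before the softmax.

```python
import math

def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Down-weight the logits of tokens that have already been generated.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so repeated tokens always end up less likely.
    """
    adjusted = list(logits)
    for tok in set(generated_ids):
        if adjusted[tok] > 0:
            adjusted[tok] /= penalty
        else:
            adjusted[tok] *= penalty
    return adjusted

def softmax(logits):
    """Convert adjusted logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Token 0 was already generated, so its probability should drop.
logits = [2.0, 1.0, 0.5]
probs_before = softmax(logits)
probs_after = softmax(apply_repetition_penalty(logits, generated_ids=[0]))
```

Note the ordering: the penalty touches the raw logits first, and only then do softmax and the sampling strategies (temperature, top-p) see the adjusted distribution.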
### Controlling Generation Length: Setting Boundaries

Just as a good story needs proper pacing and length, we need ways to control how much text our LLM generates. This is crucial for practical applications - whether we're generating a tweet-length response or a full blog post.
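The two usual stopping conditions, a hard cap on new tokens and an end-of-sequence token, can be sketched as a toy generation loop. Everything here is illustrative: `step_fn` stands in for a real model's next-token prediction, and the token IDs are made up.

```python
def generate(step_fn, prompt_ids, max_new_tokens=20, eos_token_id=2):
    """Append tokens until hitting the length cap or the EOS token."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        next_id = step_fn(ids)
        ids.append(next_id)
        if next_id == eos_token_id:
            break  # model signalled it is done before the cap
    return ids

# Toy "model": emits token 1 three times, then EOS (2) on the 4th call.
calls = {"n": 0}
def toy_step(ids):
    calls["n"] += 1
    return 2 if calls["n"] == 4 else 1

out = generate(toy_step, [0], max_new_tokens=10)  # stops early at EOS
```

In practice both limits are active at once: the cap keeps a tweet-length reply from running on, while EOS lets the model end a blog post naturally before the cap is reached.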