WeightWatch Visualizer for Qwen-2.5 7B

This visualizer lets you explore the activation patterns that WeightWatch discovers across layers and SVD directions. For each singular direction, it shows 10 minimally and maximally activated samples (at ranks #1, #4, ..., #31), along with interpretations generated by o4-mini. Results may differ slightly from those in the paper because we reran sample collection and truncated samples at their maximally firing tokens.
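
To make the idea of "SVD directions" and "minimally and maximally activated samples" concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the WeightWatch implementation: it assumes the directions are the top left singular vectors of a weight difference between two checkpoints, and that each sample has already been reduced to one scalar score per direction. The function names `top_directions` and `ranked_extremes`, the matrix shapes, and the scoring scheme are all hypothetical.

```python
import numpy as np


def top_directions(w_base, w_tuned, k=20):
    """Return the top-k left singular vectors of the weight difference.

    Assumption: both matrices have shape (d_out, d_in), so each returned
    column is a direction in the layer's output space.
    """
    delta = w_tuned - w_base
    # Singular values come back sorted in descending order.
    u, s, _ = np.linalg.svd(delta, full_matrices=False)
    return u[:, :k], s[:k]


def ranked_extremes(scores, ranks):
    """Return sample indices at the given 1-based ranks from both ends.

    `scores` holds one scalar per sample (hypothetically, the projection
    of that sample's activation onto a single direction).
    """
    order = np.argsort(scores)  # ascending: most negative score first
    lowest = [int(order[r - 1]) for r in ranks]
    highest = [int(order[-r]) for r in ranks]
    return lowest, highest


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_base = rng.normal(size=(64, 32))
    w_tuned = w_base + 0.01 * rng.normal(size=(64, 32))
    directions, _ = top_directions(w_base, w_tuned, k=20)

    # Hypothetical per-sample output activations, scored against direction 0.
    activations = rng.normal(size=(100, 64))
    scores = activations @ directions[:, 0]
    lowest, highest = ranked_extremes(scores, ranks=(1, 4, 7))
    print(lowest, highest)
```

Sampling at spaced ranks rather than taking only the very top examples shows a wider slice of what a direction responds to, which is presumably why the page lists ranks #1, #4, and so on.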

Layers: 0-27

$\Delta O_{\text{proj}}$ directions (U0-U19):

$\Delta W_{\text{down}}$ projections (U0-U19):
