Rhetorical Questions in LLM Representations: A Linear Probing Study
The path
Read. Master the vocabulary. Fire two hot-takes. Then write the pitch and draw the system. End-state: you speak this like it's native.
The brief
This study uses linear probes to investigate how LLMs internally represent rhetorical questions across different social-media datasets. Rhetorical signals emerge early in the model and are most stable in last-token representations, achieving 0.7–0.8 AUROC for binary classification. However, cross-dataset transfer reveals that rhetorical questions are encoded via multiple distinct linear directions rather than a single shared representation, with probes trained on different corpora producing conflicting rankings on the same data.
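To make the layer-wise claim concrete ("signals emerge early, strongest at a late readout"), here is a minimal synthetic sketch of per-layer probing. Everything here is fabricated for illustration: `layer_states` is a stand-in for frozen hidden states whose class signal grows with depth, not the paper's actual extraction pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, d, n_layers = 400, 64, 6

# Binary "rhetorical vs. literal" labels and a fixed signal direction.
labels = rng.integers(0, 2, n)
signal = rng.normal(size=d)

def layer_states(layer):
    # Hypothetical frozen hidden states: the class signal strengthens with depth.
    strength = 0.02 + 0.10 * layer / (n_layers - 1)
    return rng.normal(size=(n, d)) + strength * np.outer(labels * 2 - 1, signal)

aurocs = []
for layer in range(n_layers):
    X = layer_states(layer)
    X_tr, X_te, y_tr, y_te = X[:300], X[300:], labels[:300], labels[300:]
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    aurocs.append(roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1]))
```

On this toy setup, deeper "layers" yield higher AUROC, mirroring the shape of the paper's layer-wise analysis without reproducing its numbers.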
- 01 Transferability without shared representation: a probe generalizes across datasets but does not capture a universal encoding, suggesting modularity comes at the cost of consistency.
- 02 Discourse vs. syntax trade-off: models emphasizing rhetorical stance miss syntax-driven interrogatives, while those optimized for surface form overlook deeper argumentative intent.
- 03 Early vs. late stability: rhetorical signals emerge early but are most stably captured at the final token, meaning intermediate layers may be noisier and less actionable for downstream tasks.
- 04 Single-direction simplicity vs. multi-direction fidelity: assuming a single linear direction fails to capture the full richness of rhetorical encoding, complicating interpretability and control.
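The "multiple distinct linear directions" finding can be checked mechanically: train probes on different corpora and compare their weight vectors. Below is a synthetic sketch of that diagnostic, assuming two fabricated datasets whose rhetorical signal lives along different axes; nothing here uses the paper's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
d, n = 64, 600

def make_dataset(direction):
    # Hypothetical corpus: labels shift the embedding along `direction`.
    y = rng.integers(0, 2, n)
    X = rng.normal(size=(n, d)) + 0.4 * np.outer(y * 2 - 1, direction)
    return X, y

dir_a, dir_b = rng.normal(size=d), rng.normal(size=d)  # two distinct "rhetoric" axes

def probe_direction(X, y):
    # Unit-normalized weight vector of a fitted linear probe.
    w = LogisticRegression(max_iter=1000).fit(X, y).coef_[0]
    return w / np.linalg.norm(w)

w_a = probe_direction(*make_dataset(dir_a))
w_b = probe_direction(*make_dataset(dir_b))
w_a2 = probe_direction(*make_dataset(dir_a))  # second corpus sharing corpus A's axis

cross = abs(w_a @ w_b)    # across distinct axes: near-orthogonal
within = abs(w_a @ w_a2)  # same underlying axis: high cosine similarity
```

High within-axis and low cross-axis cosine similarity is the signature of multiple directions; a single shared representation would make both values high.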
“LLMs encode rhetoric the way humans interpret arguments in different social contexts—the same persuasive move reads differently depending on where you sit in the conversation.”
The system
Vocabulary gym
Linear Probe
A simple linear classifier trained on frozen LLM representations to test whether a semantic property is linearly separable in the embedding space.
Hot-takes
Two hot-takes. One sentence each. No hedging, no lists — just the sharpest answer you can land. The coach replies in seconds with a score and a tighter rewrite.
The paper shows that rhetorical signals stabilize at the final token representation. How would you design a multi-token aggregation strategy—mean pooling, attention-based, or contrastive—and would it improve or degrade the detectability of rhetorical phenomena?
Cross-dataset transfer reaches 0.7–0.8 AUROC but produces conflicting rankings. If you had to deploy this system in production on an unseen corpus, how would you validate that your probe had learned rhetoric rather than spurious social-media artifacts?
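For readers sketching an answer to the aggregation question above, here is a toy comparison of last-token readout vs. mean pooling. The setup is an assumption, not the paper's: per-token states are fabricated so the rhetorical cue marks only the final token, which is the regime where mean pooling dilutes the signal.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n, T, d = 400, 12, 32

# Hypothetical per-token hidden states: the cue lives only at the last position.
y = rng.integers(0, 2, n)
sig = rng.normal(size=d)
states = rng.normal(size=(n, T, d))
states[:, -1] += 0.4 * np.outer(y * 2 - 1, sig)

def probe_auroc(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

auroc_last = probe_auroc(states[:, -1], y)        # last-token readout
auroc_mean = probe_auroc(states.mean(axis=1), y)  # mean pooling spreads the cue over T tokens
```

If the cue were instead distributed across tokens, mean pooling would average noise away and could win; the right aggregation depends on where the signal actually sits.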
The drill
Linear probes reveal that rhetorical questions in LLMs are encoded by multiple distinct linear directions, not a single shared representation. This creates a design dilemma for any system that needs to reliably detect or control rhetorical language: should you (a) train separate specialized probes for each discourse context, accepting the cost of complexity and maintenance; (b) force a single unified representation by regularizing during model training, accepting the loss of nuance and discourse sensitivity; or (c) learn a meta-probe that selects among multiple direction-sets dynamically, trading off computation and latency for adaptability? Defend your choice in light of the paper's finding that overlap between dataset-specific top instances is below 0.2. What downstream task—content moderation, debate analysis, stance detection—would most benefit from one approach, and which would suffer most from it?
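The "overlap below 0.2" statistic in the drill is a simple set comparison: score one pool of instances with two probes and intersect their top-k lists. A sketch with fabricated scores (the `scores_*` arrays stand in for probe outputs; independent scores give an expected overlap of k/n):

```python
import numpy as np

def topk_overlap(scores_a, scores_b, k):
    """Fraction of instances shared between the two probes' top-k rankings."""
    top_a = set(np.argsort(scores_a)[-k:])
    top_b = set(np.argsort(scores_b)[-k:])
    return len(top_a & top_b) / k

rng = np.random.default_rng(4)
n, k = 1000, 100
scores_a = rng.normal(size=n)                       # probe trained on corpus A
scores_b = rng.normal(size=n)                       # unrelated probe: overlap ~ k/n
scores_c = scores_a + 0.3 * rng.normal(size=n)      # probe agreeing with A up to noise

overlap_rand = topk_overlap(scores_a, scores_b, k)  # low, near 0.1 here
overlap_corr = topk_overlap(scores_a, scores_c, k)  # high: rankings mostly agree
```

An observed overlap near the k/n baseline, as the paper reports, is evidence that the probes are ranking by genuinely different criteria rather than noisy copies of one direction.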