Applied Language Systems · Case Study

Metaphor to Measurement · Building a Pain Tagger

Moving from figurative pain descriptions to taxonomy design, reproducible annotations, and a minimal prototype tagger.

By Stella Bullo · Updated: 2026-02-20 · Tags: pain language, taxonomy design, rule-based NLP, annotation QA

Pain language is highly metaphorical. Patients describe pain as if it were a weapon, a wild animal, or fire consuming the body. These descriptions are vivid, but they are also variable: the same sensation can be phrased very differently from one speaker to the next. For research, clinical communication, and NLP prototyping, the challenge is practical: how do you transform figurative language into something measurable without flattening its meaning?

Key idea

The path from narrative to features is design work: taxonomy → annotation → QA → minimal tagger → interpretable output.

From metaphor to taxonomy

The first step is a taxonomy that covers both literal and figurative pain language. In a compact schema, I used five category families; the fifth, figurative expressions, carries an additional layer for metaphor entailments (the conceptual consequences a metaphor implies).

  • Pain qualities
  • Body location
  • Intensity markers
  • Temporal markers
  • Figurative expressions with entailment families (violence, weight/pressure, heat, animal, containment)

In the Language of Endometriosis data, metaphor is not noise. It functions as a precision strategy when literal language feels insufficient. Capturing entailments makes it possible to count and compare what a speaker conveys conceptually (for example, pain as violence versus pain as pressure), not just which words they used.
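One way to make the schema concrete is to write it down as types. The sketch below is an illustration, not the project's actual data model: the names Entailment and PainAnnotation and all field names are assumptions chosen for this example. Only the five category families and the five entailment families come from the list above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Entailment(Enum):
    """The five entailment families named in the taxonomy."""
    VIOLENCE = "violence"
    WEIGHT_PRESSURE = "weight/pressure"
    HEAT = "heat"
    ANIMAL = "animal"
    CONTAINMENT = "containment"


@dataclass
class PainAnnotation:
    """One annotated description (hypothetical record layout)."""
    text: str
    qualities: list[str] = field(default_factory=list)    # pain qualities
    locations: list[str] = field(default_factory=list)    # body location
    intensity: list[str] = field(default_factory=list)    # intensity markers
    temporal: list[str] = field(default_factory=list)     # temporal markers
    # Figurative layer: which entailment families the metaphor evokes.
    entailments: set[Entailment] = field(default_factory=set)


# A record with no figurative content simply leaves the entailment set empty.
empty = PainAnnotation(text="")
```

Using a closed enumeration for entailments means a misspelled or out-of-schema label fails at annotation time instead of surfacing later as a spurious category.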

From taxonomy to annotations

With the categories defined, the next step is consistent annotation and lightweight QA. Below are simplified examples showing how figurative language becomes structured data:

  • “It feels like a knife stabbing my womb” → Quality: stabbing; Location: womb; Entailment: violence
  • “A burning rope around my stomach” → Quality: burning; Location: stomach; Entailment: heat + containment
  • “A heavy rock pressing on me” → Quality: heavy/pressing; Entailment: weight/pressure

The result is a dataset where narrative variation is preserved but represented through comparable category selections.
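The three examples above can be stored as plain records and passed through a small QA check. This is a minimal sketch under stated assumptions: the record keys, the validate and entailment_agreement helpers, and the rule that a location must appear literally in the text are all hypothetical, and the article does not specify which QA checks were actually used.

```python
# Allowed entailment labels, taken from the taxonomy.
ENTAILMENTS = {"violence", "weight/pressure", "heat", "animal", "containment"}

# The three simplified examples, encoded as plain records (assumed layout).
annotations = [
    {"text": "It feels like a knife stabbing my womb",
     "quality": ["stabbing"], "location": ["womb"],
     "entailment": ["violence"]},
    {"text": "A burning rope around my stomach",
     "quality": ["burning"], "location": ["stomach"],
     "entailment": ["heat", "containment"]},
    {"text": "A heavy rock pressing on me",
     "quality": ["heavy", "pressing"], "location": [],
     "entailment": ["weight/pressure"]},
]


def validate(record):
    """Return a list of QA problems for one record (empty list = passes)."""
    problems = []
    unknown = set(record["entailment"]) - ENTAILMENTS
    if unknown:
        problems.append(f"unknown entailment label(s): {sorted(unknown)}")
    # Hypothetical rule: a location tag must occur literally in the text.
    for word in record["location"]:
        if word not in record["text"].lower():
            problems.append(f"location {word!r} not found in text")
    return problems


def entailment_agreement(pass_a, pass_b):
    """Share of items where two annotation passes chose the same entailment set."""
    same = sum(set(a["entailment"]) == set(b["entailment"])
               for a, b in zip(pass_a, pass_b))
    return same / len(pass_a)


# Map each description to its list of problems for quick review.
report = {r["text"]: validate(r) for r in annotations}
```

Records that fail a check are best routed back to an annotator for review, since a flagged item may reveal a gap in the taxonomy and not an annotation mistake. The agreement helper is a simple exact-match rate over two passes of the same items; a chance-corrected measure could replace it.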

From annotations to a prototype tagger

The first prototype is deliberately small and rule-based. Using the annotated data, I built feature lists and mappings that are easy to audit and iterate:

  • Lexicons for pain qualities (burning, stabbing, throbbing)
  • Patterns for intensity (very, unbearable, mild)
  • Mappings from figurative triggers to entailments (knife → violence, fire → heat)

On new input, the tagger scans for these signals and returns a structured pain profile:

  • Detected qualities
  • Location mentions
  • Intensity markers
  • Metaphor entailments
  • Temporal markers

Example input: “It is like fire ripping through my stomach at night.”

  • Quality: burning / ripping
  • Location: stomach
  • Entailment: heat + violence
  • Temporal: at night
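The scanning step can be sketched in a few lines. This is a toy version under stated assumptions, not the prototype itself: the tag function, the dictionary keys, and the lexicon contents are illustrative. Only the entries marked "from the article" come from the text above; "ripping" as a quality and as a violence trigger, the location list, and the temporal phrase are assumptions added so the example input can be tagged.

```python
import re

# Illustrative lexicons; see the comments for what is assumed.
QUALITIES = {"burning", "stabbing", "throbbing",   # from the article
             "ripping"}                             # assumption
INTENSITY = {"very", "unbearable", "mild"}          # from the article
LOCATIONS = {"womb", "stomach"}                     # taken from the examples
TEMPORAL_PHRASES = ["at night"]                     # assumption
ENTAILMENT_TRIGGERS = {
    "knife": "violence",    # from the article
    "fire": "heat",         # from the article
    "ripping": "violence",  # assumption
}


def tag(text):
    """Scan text for lexicon matches and return a structured pain profile."""
    lowered = text.lower()
    tokens = re.findall(r"[a-z']+", lowered)
    return {
        "qualities": [t for t in tokens if t in QUALITIES],
        "locations": [t for t in tokens if t in LOCATIONS],
        "intensity": [t for t in tokens if t in INTENSITY],
        # Deduplicate and sort so the output is stable across runs.
        "entailments": sorted({ENTAILMENT_TRIGGERS[t]
                               for t in tokens if t in ENTAILMENT_TRIGGERS}),
        # Multi-word temporal markers are matched as whole phrases.
        "temporal": [p for p in TEMPORAL_PHRASES
                     if re.search(rf"\b{re.escape(p)}\b", lowered)],
    }


profile = tag("It is like fire ripping through my stomach at night.")
# profile["entailments"] == ["heat", "violence"]
# profile["qualities"] == ["ripping"]
```

This sketch only matches surface words, so it returns "ripping" but not the "burning" quality listed in the profile above, which is inferred from "fire". Reproducing that inference would need an extra mapping from figurative triggers to implied qualities. Keeping every signal in a plain lexicon or mapping is what makes the tagger easy to audit: each output value can be traced to a single entry.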

Why this matters

For annotators, this shows how careful tagging captures conceptual structure, not just surface words. For project leads, it demonstrates that a small, well-scoped annotation effort can power a working prototype. For developers, it provides an explicit bridge from narrative pain descriptions to computational features that can support human-facing tools.

Closing

This pain tagger is a proof of concept: figurative pain descriptions can be transformed into structured data without losing their expressive richness. The corpus provides the language, the taxonomy provides the categories, QA provides consistency, and the prototype demonstrates feasibility.