Metaphor to Measurement · Building a Pain Tagger

~8 min · Updated Sep 2025

Introduction

Pain language is highly metaphorical. Patients talk about pain as if it were a weapon, a wild animal, or a fire consuming their body. These metaphors are vivid, but they are also messy: no two patients use exactly the same words. For researchers, clinicians, or developers, this creates a challenge: how do you take figurative, varied expressions and make them measurable?

This article shows how I moved from metaphorical descriptions in the Language of Endometriosis Project to a compact taxonomy, then to consistent annotations, and finally into a tiny prototype pain tagger. The goal was not to solve everything at once, but to prove that patient language can be captured, structured, and transformed into features that power tools like the Explain My Pain app.

From corpus metaphors to structured features: taxonomy → annotation → prototype → app.

From metaphor to taxonomy

The first step was to define a taxonomy that handles both literal and figurative pain language. As outlined in Data Taxonomy for Pain Language, the schema has five core categories; the fifth, figurative expressions, carries a layer of metaphor entailments.

Pain qualities
Body location
Intensity markers
Temporal markers
Figurative expressions with entailments such as violence, weight, heat, animal, and containment

In the Endometriosis data, metaphor was not noise. Patients reached for metaphors because literal language felt inadequate. Capturing metaphors and their entailments made it possible to quantify what the patient was actually expressing.
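
As a rough illustration, the five categories could be held in a small record type. This is a minimal sketch, not the project's actual schema: the class and field names (`PainAnnotation`, `Entailment`, `qualities`, and so on) are hypothetical, and the enum lists only the entailments named above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Entailment(Enum):
    """Metaphor entailments named in the taxonomy (not an exhaustive set)."""
    VIOLENCE = "violence"
    WEIGHT = "weight"
    HEAT = "heat"
    ANIMAL = "animal"
    CONTAINMENT = "containment"


@dataclass
class PainAnnotation:
    """One annotated pain description; field names are illustrative only."""
    text: str
    qualities: list[str] = field(default_factory=list)
    locations: list[str] = field(default_factory=list)
    intensity: list[str] = field(default_factory=list)
    temporal: list[str] = field(default_factory=list)
    entailments: list[Entailment] = field(default_factory=list)


# A figurative description keeps its entailment alongside the literal slots.
example = PainAnnotation(
    text="It feels like a knife stabbing my womb",
    qualities=["stabbing"],
    locations=["womb"],
    entailments=[Entailment.VIOLENCE],
)
```

Keeping entailments as a closed enum while leaving the other slots as free strings is one possible design choice: it makes metaphor categories countable without constraining how patients name qualities or body parts.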

From taxonomy to annotations

With the categories in place, I annotated the corpus using the lightweight QA strategies in Lightweight QA and Sampling Notes. Below are simplified examples to illustrate the mapping:

“It feels like a knife stabbing my womb” → Quality: stabbing; Location: womb; Entailment: violence
“A burning rope around my stomach” → Quality: burning; Location: stomach; Entailment: heat plus containment
“A heavy rock pressing on me” → Quality: heavy or pressing; Entailment: weight or pressure

This transformed figurative descriptions into structured data points that could be counted, compared, and fed into a prototype.
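
To make "counted and compared" concrete, here is a small sketch that stores the three simplified examples as plain records and tallies them. The dictionary keys and the `count_field` helper are assumptions for illustration, and where the example above allows two readings ("weight or pressure") the sketch simply picks one.

```python
from collections import Counter

# The three simplified examples from above, as plain dicts.
# Key names are hypothetical; "weight" is chosen over "pressure" for brevity.
annotations = [
    {"text": "It feels like a knife stabbing my womb",
     "qualities": ["stabbing"], "locations": ["womb"],
     "entailments": ["violence"]},
    {"text": "A burning rope around my stomach",
     "qualities": ["burning"], "locations": ["stomach"],
     "entailments": ["heat", "containment"]},
    {"text": "A heavy rock pressing on me",
     "qualities": ["heavy", "pressing"], "locations": [],
     "entailments": ["weight"]},
]


def count_field(records, key):
    """Count every value that appears under `key` across all records."""
    return Counter(value for record in records for value in record[key])


entailment_counts = count_field(annotations, "entailments")
location_counts = count_field(annotations, "locations")
```

With a real corpus, the same tallies would show which entailments dominate and which body locations they attach to.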

From annotations to a prototype tagger

The first prototype was deliberately small and rule-based. I used the annotated data to create feature lists and simple mappings:

Keyword lists for pain qualities such as burning, stabbing, throbbing
Patterns for intensity markers such as very, unbearable, mild
Mappings from metaphorical expressions to entailments such as knife → violence, fire → heat
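
The lists and mappings above might look something like the following in code. This is a sketch under assumptions rather than the prototype itself: the constant names and `compile_terms` are hypothetical, the vocabularies contain only the words mentioned above, and the test sentence is invented.

```python
import re

QUALITY_KEYWORDS = ["burning", "stabbing", "throbbing"]
INTENSITY_MARKERS = ["very", "unbearable", "mild"]
METAPHOR_TO_ENTAILMENT = {"knife": "violence", "fire": "heat"}


def compile_terms(terms):
    """Build one case-insensitive pattern that matches whole words only."""
    alternatives = "|".join(re.escape(term) for term in terms)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


QUALITY_RE = compile_terms(QUALITY_KEYWORDS)
INTENSITY_RE = compile_terms(INTENSITY_MARKERS)
METAPHOR_RE = compile_terms(METAPHOR_TO_ENTAILMENT)  # iterates the dict keys

# Invented test sentence, not taken from the corpus.
text = "A very sharp knife, burning and unbearable"
qualities = [m.lower() for m in QUALITY_RE.findall(text)]
intensity = [m.lower() for m in INTENSITY_RE.findall(text)]
entailments = [METAPHOR_TO_ENTAILMENT[m.lower()] for m in METAPHOR_RE.findall(text)]
```

The `\b` word boundaries keep a marker such as "mild" from firing inside "mildly", which matters once the lists grow beyond a handful of words.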

When a new text was entered, the tagger scanned for these features and produced a structured pain profile:

Qualities detected
Location mentions
Intensity markers
Metaphor entailments
Temporal markers

Example input: “It is like fire ripping through my stomach at night.”

Quality: burning or ripping
Location: stomach
Intensity: implied escalation
Entailment: heat (fire) and violence
Temporal: at night
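
A minimal end-to-end sketch of how a rule-based tagger could produce a profile like this one is shown below. The `LEXICON` contents and the `tag` function are assumptions made for illustration (for instance, mapping "fire" to the quality "burning" and "ripping" to the entailment "violence"); they are not the prototype's actual rules.

```python
import re

# Illustrative lexicon: each category maps a surface term to the label it emits.
# Only some entries come from the article; the rest are assumptions.
LEXICON = {
    "qualities": {"burning": "burning", "fire": "burning",
                  "stabbing": "stabbing", "throbbing": "throbbing",
                  "ripping": "ripping"},
    "locations": {"stomach": "stomach", "womb": "womb"},
    "intensity": {"very": "very", "unbearable": "unbearable", "mild": "mild"},
    "temporal": {"at night": "at night"},
    "entailments": {"knife": "violence", "ripping": "violence", "fire": "heat"},
}


def tag(text):
    """Return a pain profile: category -> sorted list of labels found in text."""
    lowered = text.lower()
    profile = {}
    for category, terms in LEXICON.items():
        found = {
            label
            for term, label in terms.items()
            if re.search(rf"\b{re.escape(term)}\b", lowered)
        }
        profile[category] = sorted(found)
    return profile


profile = tag("It is like fire ripping through my stomach at night.")
```

In this sketch the intensity slot comes back empty. The "implied escalation" in the profile above is an interpretive judgment that simple keyword matching does not capture, which is one of the limits of a purely rule-based approach.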

Why this matters

For annotators, this case study shows how careful tagging captures conceptual depth, not just surface words. For recruiters and project leads, it shows that a small, well-managed annotation effort can power a working prototype. For developers, it offers a bridge from narrative descriptions to computational features that can be integrated into the Explain My Pain experience.

Closing

Building the pain tagger was a proof of concept: figurative pain descriptions can be transformed into structured data without losing their richness. The Endometriosis corpus provided the metaphors, the taxonomy gave the categories, QA kept things consistent, and the prototype demonstrated feasibility.

The last article in this series shares the small pieces that make this process repeatable: Annotation Snippets and Prompts I Reuse.
