Applied Language Systems

The Language of Endometriosis

A taxonomy of pain descriptors and a rule-based language system for interpretable, auditable pain communication.

By Stella Bullo · Updated: 2026-02-17 · Tags: pain language, taxonomy, rule-based NLP, health communication

Pain in endometriosis is frequently described through metaphor, evaluation, and embodied imagery. These descriptions are semantically rich yet difficult to translate into structured clinical documentation. This article presents (1) a linguistically grounded taxonomy of endometriosis pain descriptors derived from corpus-based research, and (2) the design of a rule-based language system that operationalises this taxonomy to generate structured patient and clinician summaries. It argues that transparent linguistic modelling offers an interpretable alternative to opaque AI-driven approaches in sensitive health contexts.

Key idea

A controlled taxonomy + explicit rules can translate lived pain language into structured outputs without losing interpretability.

1. The Translation Gap in Pain Communication

Endometriosis pain is persistent, cyclical, context-dependent, and multidimensional. Patients frequently rely on metaphor to make invisible sensations communicable:

  • “Like broken glass”
  • “A tightening band”
  • “Electric shocks”
  • “Like something burning inside”

Clinicians, however, require structured documentation, categorical clarity, and reproducible summaries. Numeric pain scales capture intensity but not mechanism. Free text preserves experience but resists standardisation. The result is a structural translation gap between lived experience and the medical record.

2. Research Foundations

The taxonomy presented here derives from corpus-based analysis of endometriosis pain discourse (≈ 241,000 words). Quantitative analysis showed that the word “pain” occurred 2,131 times (≈ 8.8 per 1,000 words), over 120 times more frequently than in the British National Corpus.
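
The normalised frequency follows directly from the two counts. The sketch below shows the arithmetic only; it treats the approximate corpus size of 241,000 words as exact for illustration, and the function name per_thousand is ours rather than part of the original study.

```python
def per_thousand(occurrences: int, corpus_size: int) -> float:
    """Return the number of occurrences per 1,000 words of running text."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return occurrences / corpus_size * 1000


# Counts reported above; the corpus size is approximate (about 241,000 words).
rate = per_thousand(2131, 241_000)
print(f"{rate:.1f} per 1,000 words")
```

Normalising per 1,000 words is what makes the figure comparable with a reference corpus of a very different size.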

Of these instances, 31% were figurative. The metaphors were not random; they formed recurring semantic patterns structured around mechanisms such as cutting, burning, pressure, invasion, and entrapment.

The taxonomy is grounded in Conceptual Metaphor Theory and discourse-based evaluation analysis.

3. Taxonomy of Pain Descriptors

The taxonomy is curated, finite, and auditable. Each descriptor is mapped to:

  • Semantic domain
  • Metaphor category (if applicable)
  • Normalised clinical heading
  • Interpretive entailment
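
One way to picture a single taxonomy entry is as an immutable record with these four fields. This is a minimal sketch, not the system's actual schema: the class name, field names, the example clinical heading, and the assignment of “like broken glass” to the cutting-tools category are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: an entry cannot be altered after curation
class Descriptor:
    phrase: str                 # the patient's wording
    semantic_domain: str        # e.g. sensory-physical or emotional-evaluative
    clinical_heading: str       # normalised heading for the clinician summary
    entailment: str             # what the descriptor implies about the pain
    metaphor_category: Optional[str] = None  # None for literal descriptors


# Illustrative entry only; the heading and category assignment are assumptions.
BROKEN_GLASS = Descriptor(
    phrase="like broken glass",
    semantic_domain="sensory-physical",
    clinical_heading="Sharp / penetrating pain",
    entailment="penetration / sharp intrusion",
    metaphor_category="cutting tools",
)

print(BROKEN_GLASS)
```

Keeping the metaphor category optional mirrors the "if applicable" qualifier above, and freezing the record supports the claim that the taxonomy is finite and auditable.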

3.1 Sensory–Physical Domains

  • Cutting tools → penetration / sharp intrusion
  • Pressure / constriction → compression / restriction
  • Heat → inflammation / internal heat
  • Electric force → sudden discharge
  • Weight → gravitational burden

3.2 Emotional and Evaluative Domains

  • Entrapment → lack of escape or control
  • Predator / attack → external threat
  • Transformation → intrusion / loss of integrity

3.3 Contextual Occurrence

  • Menstruation-related
  • Ovulation-related
  • Intercourse-related
  • Bowel-related
  • Persistent / background

4. From Taxonomy to System Design

The taxonomy is operationalised in Explain My Pain, a rule-based, deterministic application built with a Flask backend and a taxonomy stored in YAML.

User selections are mapped to structured outputs across three dimensions:

  • Sensation (mechanism and quality)
  • Emotion (evaluative stance)
  • Context (trigger or cyclical pattern)
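
The three-dimensional mapping can be sketched as plain table lookup. The code below is an assumption-laden illustration, not the Explain My Pain implementation: the in-memory dictionaries stand in for the YAML taxonomy, the function name summarise is hypothetical, and the arrows in Section 3 are read here as "category to interpretive entailment".

```python
# Assumption: these tables stand in for the YAML taxonomy file.
SENSATION = {
    "cutting tools": "penetration / sharp intrusion",
    "pressure / constriction": "compression / restriction",
    "heat": "inflammation / internal heat",
    "electric force": "sudden discharge",
    "weight": "gravitational burden",
}
EMOTION = {
    "entrapment": "lack of escape or control",
    "predator / attack": "external threat",
    "transformation": "intrusion / loss of integrity",
}
CONTEXT = (
    "menstruation-related",
    "ovulation-related",
    "intercourse-related",
    "bowel-related",
    "persistent / background",
)


def summarise(sensations, emotions, contexts):
    """Map user selections to a structured summary by table lookup only."""
    for label in contexts:
        if label not in CONTEXT:
            raise KeyError(label)  # unknown context: refuse rather than guess
    return {
        # A selection missing from a table raises KeyError on lookup.
        "sensation": [f"{s}: {SENSATION[s]}" for s in sensations],
        "emotion": [f"{e}: {EMOTION[e]}" for e in emotions],
        "context": list(contexts),
    }


summary = summarise(["heat"], ["entrapment"], ["menstruation-related"])
for dimension, lines in summary.items():
    print(dimension, "->", lines)
```

Because every output line is produced by a dictionary lookup, the same selections always yield the same summary.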

No predictive modelling is used. The system prioritises interpretability, traceability, and ethical transparency.

5. Why Rule-Based Modelling?

In sensitive health contexts, black-box AI systems risk hallucination, overgeneralisation, and loss of patient nuance. A structured taxonomy offers:

  • Explicit descriptor → category mapping
  • Controlled vocabulary
  • Auditable semantic logic
  • Reproducible outputs
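
These properties can be made concrete with a small sketch in which every output line carries the identifier of the rule that produced it. The rule identifiers (S1, S3, E1) and the function name apply_rules are hypothetical, introduced here only for illustration.

```python
# Hypothetical rule table: rule id -> (category, entailment).
RULES = {
    "S1": ("cutting tools", "penetration / sharp intrusion"),
    "S3": ("heat", "inflammation / internal heat"),
    "E1": ("entrapment", "lack of escape or control"),
}

# Index the rules by category so each lookup is explicit and unambiguous.
BY_CATEGORY = {
    category: (rule_id, entailment)
    for rule_id, (category, entailment) in RULES.items()
}


def apply_rules(categories):
    """Return (output, trace); the trace records which rule fired for each item."""
    output, trace = [], []
    for category in categories:
        if category not in BY_CATEGORY:
            # Controlled vocabulary: out-of-vocabulary input is rejected.
            raise ValueError(f"'{category}' is not in the controlled vocabulary")
        rule_id, entailment = BY_CATEGORY[category]
        output.append(entailment)
        trace.append(f"{rule_id}: {category} -> {entailment}")
    return output, trace


first = apply_rules(["heat", "entrapment"])
second = apply_rules(["heat", "entrapment"])
print(first[1])
print("reproducible:", first == second)
```

The trace is what makes the logic auditable: a reviewer can check each line of a summary against the rule that generated it.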

This does not reject AI. It proposes semantic infrastructure beneath it.

6. Conclusion

The Language of Endometriosis taxonomy demonstrates how applied linguistics can function as infrastructure for digital health systems. Rather than replacing human interpretation, structured modelling supports it.

Interpretability can be designed deliberately. Language can be structured without being flattened.