Annotation Across Two Worlds: Linguistics vs. NLP
By Stella Bullo · Updated 4 September 2025
Introduction
Annotation sounds like one thing: adding information to data. But ask a linguist and ask an NLP engineer and you will get two very different answers. For a discourse analyst, annotation is a way of uncovering how language enacts power, identity and metaphor. For an NLP practitioner, annotation is the backbone of training data: short, consistent labels that feed machine learning models.
This piece compares the two traditions, showing how the same text can be read through the lenses of linguistic annotation and AI/NLP annotation.
Personal note
I come at annotation from both sides. My training is in linguistics and discourse analysis, where annotation is a way to unpack identity, agency and metaphor. More recently I have been working with annotation in the context of NLP projects, where the same word means something much more operational: preparing data so models can learn. Living in both of these spaces makes the differences clear, but it also shows how much the two approaches can gain from one another.
What Linguists Mean by Annotation
In linguistics and discourse studies, annotation is interpretive and theory-driven. Analysts draw on frameworks that give them categories to work with, and each framework shines a light on a different layer of meaning.
Systemic Functional Linguistics (SFL) (Halliday & Matthiessen, 2014) helps identify process types (material, mental, verbal, relational) and map how participants enact the roles of actor, sayer or experiencer. This is a way of capturing agency and voice in clauses.
Critical Discourse Analysis and Appraisal theory (Fairclough, 1992; Martin & White, 2005; Wodak & Meyer, 2009) provide tools for looking at stance, evaluation and ideology. Connectives like but or even when are treated seriously, as resources that shift authority or open space for resistance.
Conceptual Metaphor Theory (CMT) (Lakoff & Johnson, 1980) focuses on how abstract experiences are structured through source domains such as weight, battle or movement. Saying push through the pain draws on the physical act of pushing against resistance to frame the experience of illness as a struggle that can be overcome.
The point is to uncover layers of meaning: who is speaking, how they position themselves, what metaphors they use and which social schemas they activate. Annotation here is less about volume and more about depth, linking the small details of grammar or word choice to wider patterns of identity, ideology and lived experience.
Example
“Doctors keep telling me it’s just stress, but I know my body better than anyone. I push through the pain to go to work, even when it feels impossible.”
- Doctors keep telling me… → the patient is positioned as subordinate, doctors hold institutional authority (SFL: verbal process, asymmetry of roles).
- but I know my body better than anyone → adversative stance (Appraisal: counter-expectation), reclaiming epistemic authority.
- I push through the pain → material process, resilience, warrior-like identity, and a metaphor of illness as a barrier (CMT).
- even when it feels impossible → concessive stance (Appraisal: concession), which intensifies the sense of agency.
Using more than one framework is not just a stylistic choice but a sign of rigour. SFL maps roles and processes, CDA and Appraisal uncover stance and ideology, and CMT reveals the metaphorical scaffolding of experience. Taken together, these approaches provide a fuller picture of how language enacts identity and resistance. They also show that linguists do not sit in silos; we borrow, adapt and combine. That interdisciplinarity is essential when annotation is used not only in academic research but also in fields like AI and health communication.
What NLP and ML Practitioners Mean by Annotation
In the AI world annotation is operational. The aim is not interpretation but structured training data that models can learn from. Annotation guidelines are designed for clarity and speed, and the focus is on consistency and scalability. Annotators are expected to tag quickly and uniformly so that the dataset is machine-readable and reliable across huge volumes of examples.
Common types
- Text classification: categories like health, finance or politics, or simple sentiment values such as positive, negative or neutral.
- Entity recognition: spans tagged as names, symptoms, locations or dates.
- Intent classification in conversational AI: utterances marked as book a flight, ask a question or cancel an order.
- Coreference and relations: linking “she” to “Mary,” or doctor to hospital.
- Safety and moderation: harmful or misleading content flagged.
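To make these types concrete, here is a rough sketch of what exported records often look like. The field names, labels and character offsets are invented for illustration, not taken from any particular tool or dataset.

```python
# Illustrative annotation records. Field names, label sets and offsets
# are invented for this sketch, not drawn from any specific tool.

classification_record = {
    "text": "My GP finally referred me to a specialist.",
    "label": "health",          # topic classification
    "sentiment": "positive",    # simple polarity value
}

ner_record = {
    "text": "Doctors keep telling me it's just stress.",
    "entities": [
        {"start": 0, "end": 7, "label": "PERSON_GROUP", "span": "Doctors"},
        {"start": 34, "end": 40, "label": "CONDITION", "span": "stress"},
    ],
}

intent_record = {
    "utterance": "Can I move my appointment to Friday?",
    "intent": "reschedule_appointment",
}
```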
Layers
- Token level: each word tagged for part-of-speech or entity type.
- Span level: multi-word chunks such as chronic pain marked as a single unit.
- Sentence level: entire utterances labelled for sentiment, intent or safety.
- Document level: longer texts tagged for topic or stance.
- Discourse / cross-document: references linked across sentences or documents.
Example
“Doctors keep telling me it’s just stress, but I know my body better than anyone. I push through the pain to go to work, even when it feels impossible.”
- Token level: “Doctors” → PERSON, “stress” → CONDITION, “pain” → SYMPTOM.
- Sentence-level sentiment:
- “Doctors keep telling me…” → negative, dismissive.
- “but I know my body…” → positive/neutral, confidence.
- “I push through the pain…” → mixed, negative experience and positive resilience.
- Document-level stance: overall negative towards medical authority and positive self-assertion.
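Pulled together, the same passage can be stored as a single structured record with one field per layer. This is only a sketch: the schema and label names are illustrative, not a standard format.

```python
# One possible serialisation of the layered annotations above.
# The schema, field names and label values are invented for illustration.

annotated_document = {
    "text": (
        "Doctors keep telling me it's just stress, but I know my body "
        "better than anyone. I push through the pain to go to work, "
        "even when it feels impossible."
    ),
    "token_level": [
        {"span": "Doctors", "label": "PERSON"},
        {"span": "stress", "label": "CONDITION"},
        {"span": "pain", "label": "SYMPTOM"},
    ],
    "sentence_level": [
        {"unit": "Doctors keep telling me...", "sentiment": "negative"},   # dismissive framing
        {"unit": "but I know my body better than anyone", "sentiment": "positive"},
        {"unit": "I push through the pain...", "sentiment": "mixed"},      # hardship plus resilience
    ],
    "document_level": {
        "stance_towards_medical_authority": "negative",
        "self_assertion": "positive",
    },
}
```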
What matters here is consistency and computability. Labels need to be unambiguous, reproducible and simple enough that annotators can agree on them. While this often trims nuance, it enables datasets to scale to millions of examples. The challenge, and the opportunity, lies in designing schemas that are efficient and still sensitive to the ethical and social realities encoded in data.
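Agreement between annotators is usually checked with a chance-corrected statistic such as Cohen's kappa. A minimal sketch, assuming scikit-learn is available and using made-up labels from two hypothetical annotators:

```python
# Inter-annotator agreement on the same ten items, via Cohen's kappa.
# The labels below are made up purely for illustration.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["neg", "neg", "pos", "neu", "pos", "neg", "pos", "neu", "neg", "pos"]
annotator_b = ["neg", "pos", "pos", "neu", "pos", "neg", "neu", "neu", "neg", "pos"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # values above roughly 0.6 are often read as substantial agreement
```

Low scores are often a sign that the guidelines, rather than the annotators, need work.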
Side by Side Comparison
| Aspect | Linguistics / Discourse | NLP / Machine Learning |
|---|---|---|
| Unit of analysis | Clause or phrase | Token, sentence, document |
| Labels | Identity, agency, metaphor, stance | Sentiment, entity, intent, topic |
| Purpose | Interpretation and uncovering ideology | Dataset creation for model training |
| Output | Narrative and layered analysis | CSV or JSON with categorical labels |
| Strength | Nuance and theoretical depth | Scale and consistency |
| Limitation | Small scope, subjective interpretation | Oversimplification, hidden bias |
What Each World Can Learn from the Other
From linguistics, NLP can gain awareness of how identity, power and metaphor shape data. Without that awareness, AI models risk reproducing the very biases they are meant to avoid. From NLP, linguistics can borrow methods of standardisation, agreement checking and scale, strengthening corpus-based research.
The most exciting possibilities come from mixing the two: hybrid annotation frameworks where automatic tagging handles scale while human analysts add interpretive layers that capture nuance.
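One way to prototype that hybrid setup is to let an off-the-shelf model propose entity spans and leave an explicitly interpretive layer for a human analyst to complete. A minimal sketch, assuming spaCy and its small English model are installed; the discourse fields are invented placeholders, not an established schema.

```python
# Hybrid annotation sketch: automatic entity spans from spaCy,
# plus an empty interpretive layer for a human analyst to fill in later.
# Assumes spaCy and the en_core_web_sm model are installed.
import spacy

nlp = spacy.load("en_core_web_sm")

text = ("Doctors keep telling me it's just stress, but I know my body "
        "better than anyone. I push through the pain to go to work, "
        "even when it feels impossible.")

doc = nlp(text)

record = {
    "text": text,
    # Machine layer: fast, consistent, scalable.
    "auto_entities": [
        {"span": ent.text, "start": ent.start_char, "end": ent.end_char, "label": ent.label_}
        for ent in doc.ents
    ],
    # Human layer: interpretive fields a discourse analyst completes.
    "discourse_layer": {
        "stance": None,        # e.g. counter-expectation, concession
        "metaphor": None,      # e.g. illness as barrier
        "agency": None,        # e.g. reclaimed epistemic authority
    },
}
print(record["auto_entities"])
```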
Conclusion
Annotation is never neutral. Whether in a linguistics seminar or an AI lab, the act of assigning labels is also the act of assigning meaning. In linguistics, annotation shows how language encodes power, identity and metaphor. In NLP, annotation creates the datasets that will shape how machines understand human communication.
Both traditions have strengths: the nuance and theoretical depth of discourse analysis, and the scale and reliability of computational annotation. Both also have limitations: subjectivity and small scope on the linguistic side, oversimplification and bias reproduction on the computational side.
The future lies in building bridges. Hybrid annotation can bring the best of both worlds, creating large-scale datasets enriched by discourse-sensitive insights. This does not just improve models, it also ensures that the human realities in language — identity, agency, resistance, resilience — are not flattened into categories.
As someone moving between these two spaces, I see annotation not as a mechanical step but as a meeting point between humanities and technology. Done well, it can produce systems that are more accurate, more ethical and more attuned to the complexities of lived experience. And if we can sneak a little discourse analysis into the machine learning pipeline along the way, that is a win for both sides.