Azure Language: NLP and Text Analytics for Documents, Mail and Tickets

Azure AI Language in one view

One resource covers sentiment, key phrases, language detection, named entities, PII detection/redaction, summarization, and trainable custom classification (CLU for intents, custom text classification for documents). For mixed German/English corpora — normal in DACH companies — automatic language handling alone saves real engineering time.

Ticket triage: sentiment + PII redaction + routing

from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint="https://YOUR-RES.cognitiveservices.azure.com/",
    credential=AzureKeyCredential(KEY))

tickets = [
    "Die Maschine steht seit Montag! Rechnung 2231 ist trotzdem "
    "gekommen. Rufen Sie mich an: 0171 5550000 — Hr. Weber",
]

# 1) redact PII before the text goes anywhere else
red = client.recognize_pii_entities(tickets, language="de")[0]
clean = red.redacted_text   # "... Rufen Sie mich an: ******** — Hr. *****"

# 2) sentiment decides the queue
sent = client.analyze_sentiment([clean], language="de")[0]
queue = "eskalation" if sent.confidence_scores.negative > 0.7 \
        else "standard"

# 3) key phrases become routing tags
tags = client.extract_key_phrases([clean], language="de")[0].key_phrases
print(queue, tags)   # eskalation ['Maschine', 'Rechnung 2231', ...]

When a local model beats the API

Per-document pricing is fine at 1,000 tickets/month and painful at 100,000. Above that line — or when tickets contain data that may not leave the company at all — a fine-tuned local 3B classifier does sentiment + routing with single-digit-millisecond latency and zero marginal cost. Our rule of thumb: use Azure Language to prototype the taxonomy fast, then decide with volume data whether the steady-state runner is cloud or a small model in your rack. The redaction pattern above is also exactly what we apply before any text is allowed to reach a cloud endpoint in hybrid setups.

Want this running inside your own VPN?

Localized AI fine-tunes small open models on your data and deploys them on your hardware — GDPR by architecture, zero per-token costs. Average setup: 72 hours.

Plan my deployment