Previous data-driven work investigating the types and distributions of discourse
relation signals, including discourse markers such as 'however' or phrases such as 'as a
result', has focused on the relative frequencies of signal words within and outside text
from each discourse relation. Such approaches do not allow us to quantify the signaling
strength of individual instances of a signal on a scale (e.g. more or less
discourse-relevant instances of 'and'), to assess the distribution of ambiguity for
signals, or to identify words that hinder discourse relation identification in context
('anti-signals' or 'distractors'). In this paper we present a data-driven approach to
signal detection using a distantly supervised neural network and develop a metric, Δs
(or 'delta-softmax'), to quantify signaling strength. Ranging between -1 and 1 and
relying on recent advances in contextualized word embeddings, the metric represents
each word's positive or negative contribution to the identifiability of a relation in
specific instances in context (a candidate formulation is sketched below). Based on an
English corpus annotated for discourse relations in the framework of Rhetorical Structure
Theory, with signal type annotations anchored to
specific tokens, our analysis examines the reliability of the metric, the places where
it overlaps with and differs from human judgments, and the implications for identifying
features that neural models may need in order to perform better on automatic discourse
relation classification.
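
For concreteness, one plausible reading of Δs, consistent with the name 'delta-softmax'
and the stated range of -1 to 1, is the change in the classifier's softmax probability
for the gold relation when a single token is masked; the notation below ($W$, $w_i$, $r$)
is ours, offered as an illustrative sketch rather than the paper's own definition:

\[
  \Delta s(w_i) \;=\; P_{\mathrm{softmax}}\!\left(r \mid W\right)
  \;-\; P_{\mathrm{softmax}}\!\left(r \mid W_{\setminus i}\right)
\]

where $W = w_1 \dots w_n$ is the token sequence of the instance, $W_{\setminus i}$ is the
same sequence with $w_i$ replaced by a mask token, and $r$ is the gold relation label.
Since both terms are probabilities in $[0, 1]$, the difference is bounded by -1 and 1;
positive values pick out tokens whose presence aids identification of the relation
(signals), and negative values pick out tokens that hinder it (anti-signals or
distractors).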