politenessStrategies
Computes Politeness Strategies features.
Three strategy collections are currently offered, covering two languages:
- politeness_api: English politeness strategies described in A computational approach to politeness with application to social factors
- politeness_local: English politeness strategies realized through local markers as used in Facilitating the Communication of Politeness through Fine-Grained Paraphrasing
- politeness_cscw_zh: Chinese politeness strategies adapted from Studying Politeness across Cultures using English Twitter and Mandarin Weibo
Example usage:
- extracting politeness features and markers,
- understanding the (mis)use of politeness strategies in conversations gone awry on Wikipedia,
- assessing the permeability of politeness markers in machine-translated communication
class convokit.politenessStrategies.politenessStrategies.PolitenessStrategies(parse_attribute_name: str = 'parsed', strategy_attribute_name: str = 'politeness_strategies', marker_attribute_name: str = 'politeness_markers', strategy_collection: str = 'politeness_api', verbose: int = 0)
Encapsulates extraction of politeness strategies from utterances in a Corpus.
Parameters:
- parse_attribute_name – metadata attribute name to read parses from. Default is ‘parsed’.
- strategy_attribute_name – metadata attribute name to store politeness strategies features under during the transform() step. Default is ‘politeness_strategies’.
- marker_attribute_name – metadata attribute name to store politeness markers under during the transform() step. Default is ‘politeness_markers’.
- strategy_collection – collection of politeness strategies to extract. Default is “politeness_api”. Options include:
  - “politeness_api”: English politeness strategies proposed in A computational approach to politeness with application to social factors (https://www.cs.cornell.edu/~cristian/Politeness.html)
  - “politeness_local”: English politeness strategies realized through local markers as used in Facilitating the Communication of Politeness through Fine-Grained Paraphrasing (https://www.cs.cornell.edu/~cristian/Politeness_Paraphrasing.html)
  - “politeness_cscw_zh”: Chinese politeness strategies adapted from Studying Politeness across Cultures using English Twitter and Mandarin Weibo (https://dl.acm.org/doi/abs/10.1145/3415190)
- verbose – whether and how often to print status messages while computing features.
transform(corpus: convokit.model.corpus.Corpus, selector: Union[Callable[[convokit.model.utterance.Utterance], bool], NoneType] = <function PolitenessStrategies.<lambda>>, markers: bool = False)
Extracts politeness strategies from each utterance in the corpus and annotates the utterances with the extracted strategies. Requires that the corpus has previously been transformed by a Parser, so that each utterance has dependency parse information in its metadata table.
Parameters:
- corpus – the corpus to compute features for.
- selector – a (lambda) function that takes an Utterance and returns a bool indicating whether the utterance should be included in this annotation step.
- markers – whether or not to also annotate each utterance with politeness occurrence markers.
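The annotation step above can be sketched roughly as follows. This is a minimal sketch, not the library's canonical recipe: it assumes ConvoKit's TextParser is the parser that writes dependency parses under the default 'parsed' attribute (the text above only says "a Parser"), and the helper names is_long_enough and annotate_politeness, as well as the three-token threshold, are hypothetical. ConvoKit is imported inside the function so the definitions load even where it is not installed.

```python
def is_long_enough(utt, min_tokens=3):
    """Hypothetical selector: keep utterances with at least `min_tokens` whitespace tokens."""
    return len(utt.text.split()) >= min_tokens


def annotate_politeness(corpus):
    """Parse a Corpus, then annotate the selected utterances with politeness strategies."""
    # Lazy import: ConvoKit (and a spaCy English model) are only needed when this runs.
    from convokit import PolitenessStrategies, TextParser

    # Assumption: TextParser stores parses under its default 'parsed' attribute,
    # which matches the default parse_attribute_name of PolitenessStrategies.
    corpus = TextParser().transform(corpus)

    # Default collection ("politeness_api"); markers=True also requests occurrence markers.
    ps = PolitenessStrategies()
    return ps.transform(corpus, selector=is_long_enough, markers=True)
```

With the default attribute names, each selected utterance should then carry its strategy features under 'politeness_strategies' and, because markers=True, its markers under 'politeness_markers'.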
transform_utterance(utt: convokit.model.utterance.Utterance, spacy_nlp: Callable[[str], spacy.tokens.doc.Doc] = None, markers: bool = False)
Extracts politeness strategies for raw string inputs (or individual utterances).
Parameters:
- utt – the utterance to be annotated with politeness strategies.
- spacy_nlp – if provided, this spaCy object is used for parsing; otherwise an object is initialized via load(‘en’).
- markers – whether or not to add politeness occurrence markers.
Returns: the utterance with politeness annotations.
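For a single piece of text, a call might look like the sketch below. It rests on two assumptions: that a raw string can be passed directly, as the description above suggests even though the signature is typed as Utterance, and that the annotations are read back from the returned utterance's meta table under the default attribute names. The helper name annotate_text is hypothetical.

```python
def annotate_text(text, spacy_nlp=None):
    """Annotate one raw string and return its (strategies, markers) annotations."""
    # Lazy import: ConvoKit is only needed when the function is actually called.
    from convokit import PolitenessStrategies

    ps = PolitenessStrategies()  # default collection and attribute names

    # Assumption: a plain string is accepted here, per the method description.
    utt = ps.transform_utterance(text, spacy_nlp=spacy_nlp, markers=True)

    # Assumption: annotations live in utt.meta under the default attribute names.
    return utt.meta["politeness_strategies"], utt.meta["politeness_markers"]
```

Passing an explicitly loaded pipeline, for example spacy.load("en_core_web_sm") if that model is installed, avoids depending on how the default spaCy object is initialized.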
summarize(corpus: convokit.model.corpus.Corpus, selector: Callable[[convokit.model.utterance.Utterance], bool] = <function PolitenessStrategies.<lambda>>, plot: bool = False, y_lim=None)
Calculates strategy prevalence and plots a graph if plot == True, with an optional selector that specifies which utterances to include in the analysis.
Parameters:
- corpus – the target Corpus.
- selector – a function (typically a lambda function) that takes an Utterance and returns True or False (i.e. include / exclude). By default, the selector includes all Utterances in the Corpus.
- plot – whether or not to output a graph.
Returns: a pandas DataFrame of scores, with the graph optionally output.
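To make "strategy prevalence" concrete, here is a standalone sketch in plain Python. It assumes prevalence means the fraction of selected utterances in which each strategy occurs at least once; that is an interpretation, not a statement of how summarize is implemented. The function name strategy_prevalence and the feature names "Please" and "Gratitude" are illustrative placeholders, not the keys the library emits.

```python
from collections import defaultdict


def strategy_prevalence(feature_dicts):
    """Return, for each strategy, the fraction of utterances in which it occurs."""
    if not feature_dicts:
        return {}
    counts = defaultdict(int)
    for features in feature_dicts:
        for name, value in features.items():
            # Count an utterance once per strategy, however many times it fires.
            counts[name] += int(value > 0)
    total = len(feature_dicts)
    return {name: counts[name] / total for name in sorted(counts)}


# Hypothetical per-utterance annotations (placeholder feature names).
annotations = [
    {"Please": 1, "Gratitude": 0},
    {"Please": 0, "Gratitude": 0},
    {"Please": 1, "Gratitude": 1},
    {"Please": 0, "Gratitude": 0},
]
prevalence = strategy_prevalence(annotations)
# prevalence == {"Gratitude": 0.25, "Please": 0.5}
```

A selector narrows which utterances enter the denominator, so the same corpus can yield different prevalence values for different subsets.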