Speaker

class convokit.model.speaker.Speaker(owner=None, id: str = None, name: str = None, utts: MutableMapping[KT, VT] = None, convos: MutableMapping[KT, VT] = None, meta: Union[Dict[KT, VT], NoneType] = None, from_db=False, storage: Union[convokit.storage.storageManager.StorageManager, NoneType] = None)

Represents a single speaker in a dataset.

Parameters:
  • id (str) – id of the speaker.
  • utts – dictionary of utterances by the speaker, where key is utterance id
  • convos – dictionary of conversations the speaker has participated in, where key is conversation id
  • meta (dict) – arbitrary dictionary of attributes associated with the speaker.
Variables:
  • id – id of the speaker.
  • meta – A dictionary-like view object providing read-write access to speaker-level metadata.

add_meta(key: str, value) → None

Adds a key-value pair to the metadata of the corpus object.

Parameters:
  • key – name of metadata attribute
  • value – value of metadata attribute
Returns:

None

add_vector(vector_name: str)

Logs in the Corpus component object’s internal vectors list that the component object has a vector row associated with it in the vector matrix named vector_name. Transformers that add vectors to the Corpus should use this to update the relevant component objects during the transform() step.

Parameters: vector_name – name of vector matrix
Returns: None

delete_vector(vector_name: str)

Delete a vector associated with this Corpus component object.

Parameters: vector_name – name of the vector to delete
Returns: None

classmethod from_dbdoc(doc: convokit.storage.dbMappings.DBDocumentMapping)

Initialize a corpusComponent object with data contained in the DB document represented by doc.

Parameters:
  • cls – class to initialize: Utterance, Conversation, or Speaker
  • doc – DB document to initialize the corpusComponent from
Returns:

the initialized corpusComponent object

get_conversation(cid: str)

Get the Conversation with the specified Conversation id

Parameters: cid – The id of the Conversation
Returns: A Conversation object

get_conversation_ids(selector=<function Speaker.<lambda>>) → List[str]
Returns: a List of the ids of Conversations the speaker has participated in

get_conversations_dataframe(selector=<function Speaker.<lambda>>, exclude_meta: bool = False)

Get a DataFrame of the Conversations the Speaker has participated in, with fields and metadata attributes. Set an optional selector that filters for Conversations that should be included. Edits to the DataFrame do not change the corpus in any way.

Parameters:
  • exclude_meta – whether to exclude metadata
  • selector – a (lambda) function that takes a Conversation and returns True or False (i.e. include / exclude). By default, the selector includes all Conversations in the Corpus.
Returns:

a pandas DataFrame

get_info(key)

Gets attribute <key> of the corpus object. Returns None if the corpus object does not have this attribute.

Parameters: key – name of attribute
Returns: attribute <key>

get_utterance(ut_id: str)

Get the Utterance with the specified Utterance id

Parameters: ut_id – The id of the Utterance
Returns: An Utterance object

get_utterance_ids(selector=<function Speaker.<lambda>>) → List[str]
Returns: a List of the ids of Utterances made by the speaker

get_utterances_dataframe(selector=<function Speaker.<lambda>>, exclude_meta: bool = False)

Get a DataFrame of the Utterances made by the Speaker with fields and metadata attributes. Set an optional selector that filters for Utterances that should be included. Edits to the DataFrame do not change the corpus in any way.

Parameters:
  • exclude_meta – whether to exclude metadata
  • selector – a (lambda) function that takes an Utterance and returns True or False (i.e. include / exclude). By default, the selector includes all Utterances in the Corpus.
Returns:

a pandas DataFrame

get_vector(vector_name: str, as_dataframe: bool = False, columns: Union[List[str], NoneType] = None)

Get the vector stored as vector_name for this object.

Parameters:
  • vector_name – name of vector
  • as_dataframe – whether to return the vector as a dataframe (True) or in its raw array form (False). False by default.
  • columns – optional list of named columns of the vector to include. All columns are returned otherwise. This parameter is only used if as_dataframe is set to True.
Returns:

a numpy / scipy array, or a dataframe if as_dataframe is set to True

iter_conversations(selector=<function Speaker.<lambda>>)
Returns: An iterator of the Conversations that the speaker has participated in

iter_utterances(selector=<function Speaker.<lambda>>)

Get utterances made by the Speaker, with an optional selector that selects for Utterances that should be included.

Parameters: selector – a (lambda) function that takes an Utterance and returns True or False (i.e. include / exclude). By default, the selector includes all Utterances in the Corpus.
Returns: An iterator of the Utterances made by the speaker

print_speaker_stats()

Helper function for printing the number of Utterances made and Conversations participated in by the Speaker.

Returns: None (prints output)

retrieve_meta(key: str)

Retrieves a value stored under the key of the metadata of the corpus object.

Parameters: key – name of metadata attribute
Returns: value

set_info(key, value)

Sets attribute <key> of the corpus object to <value>.

Parameters:
  • key – name of attribute
  • value – value to set
Returns:

None