[Dataset] Emotional Support Conversation Dataset #315

@Alubeto

Basic info about the dataset

  • Total Conversations: 1300
  • Total Speakers: 2600
  • Total Utterances: 38365

The Emotional-Support-Conversation dataset contains approximately 1,300 conversations between supporters and seekers. It is organized in a relatively clean format: the entire dataset is stored in a single JSON file, with the conversations located under the "dialog" key. Each conversation averages about 30 utterances (38,365 utterances across 1,300 conversations). Each utterance includes a speaker label, researcher-provided annotations, and the textual content.
This is a ConvoKit-formatted version of the dataset originally distributed with the following paper:
https://arxiv.org/abs/2106.01144. The original dataset is available at https://github.com/thu-coai/Emotional-Support-Conversation
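
As a rough illustration of the layout described above, the sketch below counts conversations and utterances in a list of raw records. Only the "dialog" key comes from the description; the top-level list structure, the per-utterance key names ("speaker", "annotation", "content"), and the sample records are assumptions made up for illustration.

```python
import json

# Tiny made-up sample standing in for the raw JSON file. Only the "dialog"
# key is taken from the description; the inner key names are assumptions.
RAW = json.loads("""
[
  {"dialog": [
    {"speaker": "seeker", "annotation": {}, "content": "Hello"},
    {"speaker": "supporter", "annotation": {}, "content": "Hi, how are you?"}
  ]},
  {"dialog": [
    {"speaker": "seeker", "annotation": {}, "content": "I feel stressed."}
  ]}
]
""")


def summarize(records):
    """Return (conversation count, utterance count, mean utterances per conversation)."""
    n_convs = len(records)
    n_utts = sum(len(rec["dialog"]) for rec in records)
    mean_len = n_utts / n_convs if n_convs else 0.0
    return n_convs, n_utts, mean_len


print(summarize(RAW))
```

Running the same function over the full file should give totals comparable to the figures listed under "Basic info about the dataset", provided the file really has this shape.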

Dataset Details

Speaker-Level Information

  • id: identifier assigned to a speaker, in the format {role}_{i}, where role is either "seeker" or "supporter" and i is the index of the conversation the speaker belongs to
  • metadata:
    • role: either "supporter" or "seeker"
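
The {role}_{i} scheme could be built and parsed along the following lines. This is a minimal sketch: the helper names are hypothetical, and whether i starts at 0 or 1 is not stated in the description.

```python
VALID_ROLES = ("seeker", "supporter")


def make_speaker_id(role: str, i: int) -> str:
    """Build a speaker id of the form {role}_{i}."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return f"{role}_{i}"


def parse_speaker_id(speaker_id: str):
    """Split a speaker id back into (role, i)."""
    # rpartition splits on the last underscore, so the index is the final part.
    role, _, index = speaker_id.rpartition("_")
    if role not in VALID_ROLES or not index.isdigit():
        raise ValueError(f"malformed speaker id: {speaker_id!r}")
    return role, int(index)


print(make_speaker_id("seeker", 3))
print(parse_speaker_id("supporter_12"))
```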

Utterance-Level Information

  • conversation_id: the ID of the conversation this utterance belongs to
  • reply_to: the ID of the previous utterance that this utterance replies to (None if it is the first utterance of the conversation)
  • speaker: the speaker who authored the utterance
  • time_stamp: null for the entirety of this corpus; timestamps are not included in the dataset
  • text: content/text of the utterance
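
The fields above can be sketched with plain dictionaries, as below. The sketch assumes a strictly linear reply chain (each utterance replies to the one immediately before it), and the utterance id scheme is hypothetical; neither the id format nor the conversion code is given in the description.

```python
def build_utterances(conv_id, turns):
    """Turn a list of (speaker_id, text) pairs into utterance records.

    Each record carries the utterance-level fields listed above.
    """
    utterances = []
    prev_id = None  # the first utterance replies to nothing
    for k, (speaker_id, text) in enumerate(turns):
        utt_id = f"{conv_id}_utt_{k}"  # hypothetical id scheme
        utterances.append({
            "id": utt_id,
            "conversation_id": conv_id,
            "reply_to": prev_id,
            "speaker": speaker_id,
            "time_stamp": None,  # no timestamps in this corpus
            "text": text,
        })
        prev_id = utt_id
    return utterances


utts = build_utterances("conv_0", [
    ("seeker_0", "I have been feeling down lately."),
    ("supporter_0", "I am sorry to hear that. What happened?"),
])
for u in utts:
    print(u["id"], "->", u["reply_to"])
```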

Conversation-Level Information

  • metadata:
    • experience_type: category describing the kind of life experience the seeker is going through
    • emotion_type: emotion expressed by the seeker in the conversation
    • problem_type: one of the following categories: ongoing depression | breakup with partner | job crisis | problems with friends | academic pressure | procrastination | alcohol abuse | issues with parent | sleep problems | appearance anxiety | school bullying | issues with children
    • situation: open text describing the causes of the emotional problem.
    • survey_score: scores from post-conversation surveys in which the seeker and the supporter assess the quality of the support, stored as a dictionary keyed by role (the scores appear as strings in this example). For example: {'seeker': {'initial_emotion_intensity': '3', 'empathy': '3', 'relevance': '3', 'final_emotion_intensity': '2'}, 'supporter': {'relevance': '3'}}
    • seeker_question1: reflection questions answered by the seeker about their experience during the conversation
    • seeker_question2: additional reflection questions answered by the seeker about their experience during the conversation
    • supporter_question1: reflection questions answered by the supporter about their experience during the conversation
    • supporter_question2: additional reflection questions answered by the supporter about their experience during the conversation
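
Since the survey_score example stores its scores as strings, a consumer would likely convert them before doing arithmetic. The sketch below uses the example value from the list above; the helper names are hypothetical, and it assumes every score parses as an integer, which the description does not guarantee for the whole corpus.

```python
EXAMPLE = {
    "seeker": {
        "initial_emotion_intensity": "3",
        "empathy": "3",
        "relevance": "3",
        "final_emotion_intensity": "2",
    },
    "supporter": {"relevance": "3"},
}


def to_int_scores(survey_score):
    """Convert every score from string to int, keeping the role/field layout."""
    return {
        role: {field: int(value) for field, value in scores.items()}
        for role, scores in survey_score.items()
    }


def intensity_change(survey_score):
    """Final minus initial emotion intensity as reported by the seeker."""
    seeker = to_int_scores(survey_score)["seeker"]
    return seeker["final_emotion_intensity"] - seeker["initial_emotion_intensity"]


print(intensity_change(EXAMPLE))  # -1
```

A negative value means the seeker reported a lower emotion intensity after the conversation than before it.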
