Dataset
nyuuzyou/characterhub

Dataset Card for CharacterHub

Dataset Summary

This dataset contains metadata for 174,158 character profiles from CharacterHub, including character names, descriptions, backstories, tags, attributes (such as species, age, and appearance), and links to profile and cover images.

Languages

The dataset is primarily in English (en). This applies to names, descriptions, tags, and attributes.

Dataset Structure

Data Fields

Based on the example structure, this dataset includes the following fields:

  • slug: Unique identifier string for the character profile.
  • created: Timestamp of profile creation (string, ISO 8601 format).
  • info: A nested JSON object containing detailed character information:
    • name: Character's name (string).
    • tags: List of descriptive tags assigned by the user within the info section (array of strings, can be null).
    • about: Key-value pairs describing character attributes (object; keys are user-defined strings like "Eyes (Color)", "species", "age", etc., values are strings).
    • cover: URL to the character's cover image (string).
    • quote: A representative quote from or about the character (string).
    • trivia: List of trivia points about the character (array, typically of strings, can be null or empty).
    • backstory: Narrative backstory of the character (string, can be null).
    • description: General description of the character (string).
    • profilePicture: URL to the character's main profile picture (string).
    • cropProfilePicture: URL to a potentially cropped version of the profile picture (string).
  • is_nsfw: Flag indicating if the character profile is marked as Not Safe For Work (boolean).
  • tags: Top-level list of tags associated with the character profile (array of strings, can be empty []). Note: May overlap with info.tags.
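
Because several fields can be null and the two tag lists may overlap, it can help to normalize a record before use. The following is a minimal sketch, not part of the dataset tooling: the function name normalize_record and the sample values are hypothetical, and the code assumes info arrives either as a dict or as a JSON-encoded string, since the card does not say which form a loader returns.

```python
import json
from typing import Any


def normalize_record(record: dict[str, Any]) -> dict[str, Any]:
    """Flatten one record, replacing null fields with empty defaults."""
    info = record.get("info") or {}
    if isinstance(info, str):
        # Assumption: info may be serialized as a JSON string.
        info = json.loads(info)

    top_tags = record.get("tags") or []
    info_tags = info.get("tags") or []
    # Merge both tag lists, keeping first-seen order and dropping duplicates.
    merged_tags = list(dict.fromkeys([*top_tags, *info_tags]))

    return {
        "slug": record.get("slug"),
        "name": info.get("name"),
        "is_nsfw": bool(record.get("is_nsfw")),
        "tags": merged_tags,
        "about": info.get("about") or {},
        "trivia": info.get("trivia") or [],
        "backstory": info.get("backstory") or "",
    }


# Hypothetical record shaped like the field list above (values are made up).
sample = {
    "slug": "example-character",
    "created": "2024-01-01T00:00:00Z",
    "info": {
        "name": "Example",
        "tags": ["elf", "archer"],
        "about": {"species": "elf"},
        "trivia": None,
        "backstory": None,
        "description": "A placeholder description.",
    },
    "is_nsfw": False,
    "tags": ["fantasy", "elf"],
}

normalized = normalize_record(sample)
```

Merging with dict.fromkeys keeps tag order stable, which makes the output reproducible across runs.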

Data Splits

All 174,158 examples are contained within the 'train' split.
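
Since there is only one split, loading reduces to requesting 'train'. The sketch below assumes the Hugging Face datasets library is installed and that the repository id nyuuzyou/characterhub resolves with default settings. The helper names load_train and nsfw_fraction are hypothetical and only illustrate iterating over records.

```python
from typing import Any, Iterable


def load_train(streaming: bool = True):
    """Load the single 'train' split; streaming avoids a full download."""
    # Imported here so the helper below works without the library installed.
    from datasets import load_dataset

    return load_dataset("nyuuzyou/characterhub", split="train", streaming=streaming)


def nsfw_fraction(records: Iterable[dict[str, Any]]) -> float:
    """Return the share of records whose is_nsfw flag is set."""
    total = 0
    flagged = 0
    for record in records:
        total += 1
        flagged += bool(record.get("is_nsfw"))
    return flagged / total if total else 0.0
```

For a quick look at a sample rather than the full split, something like nsfw_fraction(load_train().take(1000)) should work in streaming mode.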