Dataset Card for FicWad
Dataset Summary
This dataset contains fan fiction stories collected from FicWad.com, a community platform for fan fiction authors and readers. The dataset includes complete stories with metadata such as titles, summaries, categories, ratings, genres, publication dates, and full text content.
Languages
The dataset is predominantly in one language:
- English (en): Most stories are written in English
Dataset Structure
Data Fields
- id: Unique identifier for the story
- title: Title of the fan fiction piece
- summary: Description or summary of the story
- category: Main category or fandom of the story (e.g., "Sci-Fi")
- rating: Content rating (e.g., "G", "PG", "R")
- genres: Array of genres associated with the story (e.g., ["Sci"])
- published_iso: Publication timestamp in ISO 8601 format
- published_timestamp: Unix timestamp of publication date
- word_count: Number of words in the story
- score: Numerical score of the story (distinct from the content rating above)
- score_adjective: Descriptive rating (e.g., "Unrated")
- story_text: Full text content of the story
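As a minimal sketch of how a single record could be parsed and checked against the fields listed above: the helper name `parse_story` and every value in the sample record are hypothetical and used only for illustration, and the value types (for example, whether `id` is a number or a string) are assumptions rather than something this card states.

```python
import json

# Field names taken from the "Data Fields" list above.
EXPECTED_FIELDS = (
    "id", "title", "summary", "category", "rating", "genres",
    "published_iso", "published_timestamp", "word_count",
    "score", "score_adjective", "story_text",
)


def parse_story(line: str) -> dict:
    """Parse one JSONL line and verify that every documented field is present."""
    record = json.loads(line)
    missing = [name for name in EXPECTED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"record is missing fields: {missing}")
    return record


# Hypothetical record: all values below are made up for illustration only.
sample_line = json.dumps({
    "id": 1,
    "title": "Example Title",
    "summary": "Example summary.",
    "category": "Sci-Fi",
    "rating": "PG",
    "genres": ["Sci"],
    "published_iso": "2010-01-01T00:00:00",
    "published_timestamp": 1262304000,
    "word_count": 3,
    "score": 0,
    "score_adjective": "Unrated",
    "story_text": "Once upon a time.",
})

story = parse_story(sample_line)
print(story["title"], story["genres"])
```

Checking only for field presence, rather than enforcing types, keeps the sketch from asserting more about the schema than the field list documents.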
Data Splits
The dataset has no predefined splits: all 158,991 entries are stored in a single JSONL file, with one JSON object per line.
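Because the data ships as one JSONL file, it can be read one line at a time instead of being loaded into memory all at once. The sketch below assumes nothing about the real file name: the helpers `iter_stories` and `count_by_rating` are hypothetical, and the demo runs on a tiny temporary file with made-up records rather than on the dataset itself.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable, Iterator


def iter_stories(path) -> Iterator[dict]:
    """Yield one record per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_by_rating(stories: Iterable[dict]) -> dict:
    """Count records per value of the `rating` field."""
    counts = {}
    for story in stories:
        rating = story.get("rating")
        counts[rating] = counts.get(rating, 0) + 1
    return counts


# Demo on a tiny temporary file; these records are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.jsonl"
    demo_path.write_text(
        '{"id": 1, "rating": "G"}\n'
        '{"id": 2, "rating": "PG"}\n'
        '{"id": 3, "rating": "G"}\n',
        encoding="utf-8",
    )
    rating_counts = count_by_rating(iter_stories(demo_path))

print(rating_counts)
```

To use it on the real data, pass the path of the downloaded JSONL file to `iter_stories`. The same generator can feed a filter (for example, on `word_count`) or a custom train/test split.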