-
Catalogue Entry: Video Game Data
Step charts from the video game Dance Dance Revolution, and audio files from the NES platform.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Understanding the interplay between titles, content, and communities in social media
Submissions of Reddit posts (and in particular resubmissions of the same content), along with metadata.
File available for download in the following formats:
- CSV
-
Catalogue Entry: Steam Video Game and Bundle Data
These datasets contain reviews from the Steam video game platform, and information about which games were bundled together.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Speech Recognition and Multi-Speaker Diarization of Long Conversations
This dataset contains program transcripts from This American Life. Data includes full program transcripts and associated audio.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Social Recommendation Data
These datasets include ratings as well as social (or trust) relationships between users. Data are from LibraryThing (a book review website) and Epinions (general consumer reviews).
-
Catalogue Entry: Recommendation on Live-Streaming Platforms: Dynamic Availability and Repeat...
This is a dataset of users consuming streaming content on Twitch. We retrieved all streamers, and all users connected in their respective chats, every 10 minutes over 43 days.
File available for download in the following formats:
- CSV
-
Catalogue Entry: Product Exchange/Bartering Data
These datasets contain peer-to-peer trades from various recommendation platforms.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Pinterest Fashion Compatibility
This dataset contains images (scenes) containing fashion products, which are labeled with bounding boxes and links to the corresponding products.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Multi-aspect Reviews
These datasets include reviews with multiple rated dimensions. The most comprehensive of these are beer review datasets from RateBeer and BeerAdvocate, which include sensory...
-
Catalogue Entry: Modeling heart rate and activity data for personalized fitness recommendation
This is a collection of workout logs from users of EndoMondo. Data includes multiple sources of sequential sensor data such as heart rate logs, speed, GPS, as well as sport...
File available for download in the following formats:
- JSON
-
Catalogue Entry: Marketing Bias data
These datasets contain attributes about products sold on ModCloth and Amazon which may be sources of bias in recommendations (in particular, attributes about how the products...
File available for download in the following formats:
- CSV
-
Catalogue Entry: Learning to Discover Social Circles in Ego Networks
These datasets contain social connections and "circles" from Facebook, Twitter, and Google Plus.
-
Catalogue Entry: Interview: Large-scale Modeling of Media Dialog with Discourse Patterns and...
This dataset contains interview transcripts from National Public Radio (NPR). Data includes full interview transcripts and news article headlines.
File available for download in the following formats:
- CSV
-
Catalogue Entry: Google Restaurants
This is a multi-modal dataset of restaurants from Google Local (Google Maps). Data includes images and reviews posted by users, as well as other metadata for each restaurant.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Google Local Reviews (2021)
This dataset contains review information from Google Maps (ratings, text, images, etc.), business metadata (address, geographic info, descriptions, category information, price,...
File available for download in the following formats:
- JSON
-
Catalogue Entry: Google Local Reviews (2018)
These datasets contain reviews about businesses from Google Local (Google Maps). Data includes geographic information for each business as well as reviews.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Goodreads Spoilers
These datasets contain reviews from the Goodreads book review website, along with annotated "spoiler" information from each review.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Goodreads Book Reviews
These datasets contain reviews from the Goodreads book review website, and a variety of attributes describing the items. Critically, these datasets have multiple levels of user...
File available for download in the following formats:
- JSON
-
Catalogue Entry: Generating Personalized Recipes from Historical User Preferences
These datasets contain recipe details and reviews from Food.com (formerly GeniusKitchen). Data includes cooking recipes and review texts.
File available for download in the following formats:
- CSV
-
Catalogue Entry: Clothing_Fit_Data
These datasets contain measurements of clothing fit from ModCloth and RentTheRunway.
File available for download in the following formats:
- JSON