-
Catalogue Entry: Video Game Data
Step charts from the video game Dance Dance Revolution, and audio files from the NES platform.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Understanding the interplay between titles, content, and communities in social media
Reddit post submissions (in particular, resubmissions of the same content), along with metadata.
File available for download in the following formats:
- CSV
-
Catalogue Entry: Steam Video Game and Bundle Data
These datasets contain reviews from the Steam video game platform, and information about which games were bundled together.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Speech Recognition and Multi-Speaker Diarization of Long Conversations
This dataset contains program transcripts from This American Life. Data includes full program transcripts and associated audio.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Social Recommendation Data
These datasets include ratings as well as social (or trust) relationships between users. Data are from LibraryThing (a book review website) and Epinions (general consumer reviews).
-
Catalogue Entry: Recommendation on Live-Streaming Platforms: Dynamic Availability and Repeat...
This is a dataset of users consuming streaming content on Twitch. We retrieved all streamers, and all users connected to their respective chats, every 10 minutes for 43 days.
File available for download in the following formats:
- CSV
-
Catalogue Entry: Product Exchange/Bartering Data
These datasets contain peer-to-peer trades from various recommendation platforms.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Pinterest Fashion Compatibility
This dataset contains scene images featuring fashion products, labeled with bounding boxes and links to the corresponding products.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Modeling heart rate and activity data for personalized fitness recommendation
This is a collection of workout logs from users of EndoMondo. Data includes multiple sources of sequential sensor data such as heart rate logs, speed, GPS, as well as sport...
File available for download in the following formats:
- JSON
-
Catalogue Entry: Marketing Bias data
These datasets contain attributes about products sold on ModCloth and Amazon which may be sources of bias in recommendations (in particular, attributes about how the products...
File available for download in the following formats:
- CSV
-
Catalogue Entry: Interview: Large-scale Modeling of Media Dialog with Discourse Patterns and...
This dataset contains interview transcripts from National Public Radio (NPR). Data includes full interview transcripts and news article headlines.
File available for download in the following formats:
- CSV
-
Catalogue Entry: Google Local Reviews (2018)
These datasets contain reviews about businesses from Google Local (Google Maps). Data includes geographic information for each business as well as reviews.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Goodreads Spoilers
These datasets contain reviews from the Goodreads book review website, along with annotated "spoiler" information from each review.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Goodreads Book Reviews
These datasets contain reviews from the Goodreads book review website, and a variety of attributes describing the items. Critically, these datasets have multiple levels of user...
File available for download in the following formats:
- JSON
-
Catalogue Entry: Generating Personalized Recipes from Historical User Preferences
These datasets contain recipe details and reviews from Food.com (formerly GeniusKitchen). Data includes cooking recipes and review texts.
File available for download in the following formats:
- CSV
-
Catalogue Entry: DogWhistle: Cant Understanding Data
DogWhistle is a Chinese dataset collected from the historical records of an online game. It provides hidden words and the cant for them, with human answers. The dataset is...
File available for download in the following formats:
- CSV
-
Catalogue Entry: Behance Community Art Data
Likes and image data from the community art website Behance. This is a small, anonymized version of a larger proprietary dataset.
-
Catalogue Entry: Average Scheduled Monthly Payments for New Mortgage Loans
This quarterly housing data provides you with average monthly payments for new mortgage loans. You can compare the average mortgage repayments made in Canada, the provinces and...
File available for download in the following formats:
- XLSX
-
Catalogue Entry: Amazon Question and Answer Data
This dataset contains Question and Answer data from Amazon, totaling around 1.4 million answered questions. This dataset can be combined with Amazon product review data,...
File available for download in the following formats:
- JSON