-
Catalogue Entry: Pinterest Fashion Compatibility
This dataset contains images (scenes) containing fashion products, which are labeled with bounding boxes and links to the corresponding products.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Operating and financial statistics for major Canadian airlines, monthly
Monthly operating and financial statistics (number of thousands of: passengers, passenger-kilometres, available seat-kilometres, load factor, hours flown, turbo fuel consumed in...
File available for download in the following formats:
- CSV
-
Catalogue Entry: Multi-aspect Reviews
These datasets include reviews with multiple rated dimensions. The most comprehensive of these are beer review datasets from Ratebeer and Beeradvocate, which include sensory...
-
Catalogue Entry: Modeling heart rate and activity data for personalized fitness recommendation
This is a collection of workout logs from users of EndoMondo. Data includes multiple sources of sequential sensor data such as heart rate logs, speed, GPS, as well as sport...
File available for download in the following formats:
- JSON
-
Catalogue Entry: Marketing Bias data
These datasets contain attributes about products sold on ModCloth and Amazon which may be sources of bias in recommendations (in particular, attributes about how the products...
File available for download in the following formats:
- CSV
-
Catalogue Entry: Learning to Discover Social Circles in Ego Networks
These datasets contain social connections and "circles" from Facebook, Twitter, and Google Plus.
-
Catalogue Entry: Interview: Large-scale Modeling of Media Dialog with Discourse Patterns and...
This dataset contains interview transcripts from National Public Radio (NPR). Data includes full interview transcripts and news article headlines.
File available for download in the following formats:
- CSV
-
Catalogue Entry: Google Restaurants
This is a multi-modal dataset of restaurants from Google Local (Google Maps). Data includes images and reviews posted by users, as well as other metadata for each restaurant.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Google Local Reviews (2021)
This dataset contains review information from Google Maps (ratings, text, images, etc.), business metadata (address, geographic info, descriptions, category information, price,...
File available for download in the following formats:
- JSON
-
Catalogue Entry: Google Local Reviews (2018)
These datasets contain reviews about businesses from Google Local (Google Maps). Data includes geographic information for each business as well as reviews.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Goodreads Spoilers
These datasets contain reviews from the Goodreads book review website, along with annotated "spoiler" information from each review.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Goodreads Book Reviews
These datasets contain reviews from the Goodreads book review website, and a variety of attributes describing the items. Critically, these datasets have multiple levels of user...
File available for download in the following formats:
- JSON
-
Catalogue Entry: Generating Personalized Recipes from Historical User Preferences
These datasets contain recipe details and reviews from Food.com (formerly GeniusKitchen). Data includes cooking recipes and review texts.
File available for download in the following formats:
- CSV
-
Catalogue Entry: DogWhistle: Cant Understanding Data
DogWhistle is a Chinese dataset collected from the historical records of an online game. It provides hidden words and the cant for them, with human answers. The dataset is...
File available for download in the following formats:
- CSV
-
Catalogue Entry: Clothing_Fit_Data
These datasets contain measurements of clothing fit from ModCloth and RentTheRunway.
File available for download in the following formats:
- JSON
-
Catalogue Entry: Behance Community Art Data
Likes and image data from the community art website Behance. This is a small, anonymized version of a larger proprietary dataset.
-
Catalogue Entry: Amazon Question and Answer Data
This dataset contains Question and Answer data from Amazon, totaling around 1.4 million answered questions. This dataset can be combined with Amazon product review data,...
File available for download in the following formats:
- JSON
-
Catalogue Entry: Amazon Product Reviews
This is a large crawl of product reviews from Amazon. This dataset contains 82.83 million unique reviews, from around 20 million users.
File available for download in the following formats:
- JSON