Refugee Law Lab - Canadian Legal Data

The Refugee Law Lab supports bulk open access to Canadian legal data to facilitate research and advocacy. Bulk open access helps avoid the asymmetrical access to justice and the amplification of marginalization that result when commercial actors leverage proprietary legal datasets for profit, a particular concern in the border control setting.

The Canadian Legal Data dataset includes the unofficial full text of thousands of court and tribunal decisions at the federal level. It can be used for legal analytics (i.e. identifying patterns in legal decision-making), to test ML and NLP tools on a bilingual dataset of Canadian legal materials, and to pretrain language models for various tasks.
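As a minimal sketch of the legal analytics use mentioned above, the snippet below tallies decisions by language. It makes several assumptions that are not confirmed by this record: that each row exposes a `language` column, that the data can be loaded with the Hugging Face `datasets` library under the identifier listed in this record, and that a subset name and a `train` split exist. The helper names `count_by_language` and `load_decisions` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_by_language(rows: Iterable[Mapping[str, object]],
                      field: str = "language") -> Counter:
    """Tally rows per value of `field` (the column name is an assumption)."""
    return Counter(str(row.get(field, "unknown")) for row in rows)


def load_decisions(subset: str):
    """Load one subset from Hugging Face.

    Assumptions: `pip install datasets`, network access, a subset name
    supplied by the caller, and a split called "train".
    """
    from datasets import load_dataset  # lazy import so the tally works offline

    return load_dataset("refugee-law-lab/canadian-legal-data", subset, split="train")


if __name__ == "__main__":
    # Tiny in-memory stand-in for real rows, so the sketch runs without a download.
    sample = [{"language": "en"}, {"language": "fr"}, {"language": "en"}]
    print(count_by_language(sample))
```

With real data, the result of `load_decisions(...)` could be passed directly to `count_by_language`, since iterating over a loaded dataset yields one dictionary per row.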

Datasets available for download

Additional Info

Field Value
Last Updated August 20, 2024, 17:54 (UTC)
Created March 6, 2024, 19:50 (UTC)
Domain / Topic
Domain or topic of the dataset being cataloged.
Legislature
Format (CSV, XLS, TXT, PDF, etc)
File format of the dataset.
Dataset Size
Dataset size in megabytes.
Metadata Identifier
Metadata identifier – can be used as the unique identifier for catalogue entry
Published Date
Published date of the dataset.
2024-01-01
Time Period Data Span (start date)
Start date of the data in the dataset.
2003-01-10
Time Period Data Span (end date)
End date of the data in the dataset.
2023-05-25
GeoSpatial Area Data Span
A spatial region or named place the dataset covers.
Canada
Access category
Type of access granted for the dataset (open, closed, service, etc).
Open
Limits on use
Limits on use of data.
No
Location
Location of the dataset.
https://huggingface.co/datasets/refugee-law-lab/canadian-legal-data
Data Service
Data service for accessing a dataset.
huggingface.co
Owner
Owner of the dataset.
Social Sciences and Humanities Research Council and the Law Foundation of Ontario
Contact Point
Who to contact regarding access?
infostats@statcan.gc.ca
Publisher
Publisher of the dataset.
huggingface.co
Publisher Email
Email of the publisher.
Author
Author of the dataset.
Sean Rehaag
Author Email
Email of the author.
srehaag@osgoode.yorku.ca
Accessed At
Date the data and metadata were accessed.
2024-03-04
Identifier
Unique identifier for the dataset.
refugee-law-lab/canadian-legal-data
Language
Language(s) of the dataset.
French, English
Link to dataset description
A URL to an external document describing the dataset.
https://huggingface.co/datasets/refugee-law-lab/canadian-legal-data
Persistent Identifier
Data is identified by a persistent identifier.
Yes
Globally Unique Identifier
Data is identified by a persistent and globally unique identifier.
Yes
Contains data about individuals
Does the data hold data about individuals?
Yes
Contains data about identifiable individuals
Does the data hold identifiable data about individuals?
Yes
Contains Indigenous Data
Does the data hold data about Indigenous communities?
No
Version
Version of the dataset.
1.0
Source
Source of the dataset.
https://huggingface.co/datasets/refugee-law-lab/canadian-legal-data
Version notes
Version notes about the dataset.
Is version of another dataset
Link to dataset that it is a version of.
No
Other versions
Link to datasets that are versions of it.
Provenance Text
Provenance Text of the data.
Sean Rehaag
Provenance URL
Provenance URL of the data.
https://huggingface.co/datasets/refugee-law-lab/canadian-legal-data/commits/main
Temporal resolution
Describes how granular the date/time data in the dataset is.
Daily
GeoSpatial resolution in meters
Describes how granular (in meters) geospatial data is in the dataset.
GeoSpatial resolution (in regions)
Describes how granular (in regions) geospatial data is in the dataset.
Country
Indigenous Community Permission
Who holds the Indigenous Community Permission, and who to contact regarding access to a dataset that has data about Indigenous communities.
Community Permission
Community permission (who gave permission).
The Indigenous communities the dataset is about
Indigenous communities from which data is derived.
Number of data rows
If tabular dataset, total number of rows.
174000
Number of data columns
If tabular dataset, total number of unique columns.
11
Number of data cells
If tabular dataset, total number of cells with data.
1914000
Number of data relations
If RDF dataset, total number of triples.
Number of entities
If RDF dataset, total number of entities.
1914000
Number of data properties
If RDF dataset, total number of unique properties used by the triples.
Data quality
Describes the quality of the data in the dataset.
10/10
Metric for data quality
A metric used to measure the quality of the data, such as missing values or invalid formats.
Organization of data
