This page provides access to thousands of free data science datasets to download. Use it to discover and share data, connect with other practitioners, and work together to solve problems faster. iLovePhd.com contains open metadata on 20 million texts, images, videos, and sounds, and aims to be a trusted and comprehensive resource for dataset download links in 2025.
Datasets Download Links 2025
Here is an updated and comprehensive list of free and reliable data science datasets available for download in 2025. These datasets span various domains such as machine learning, AI, economics, healthcare, and more. Each entry includes the dataset name, description, and a direct link for easy access.
Dataset Name | Description | Download Link |
---|---|---|
Kaggle Datasets | A vast collection of datasets across various domains, including competitions and user-contributed data. | kaggle.com/datasets |
UCI Machine Learning Repository | A classic repository offering datasets for machine learning research, including classification, regression, and clustering tasks. | archive.ics.uci.edu/ml |
Data.gov | The U.S. government’s open data portal, providing datasets on agriculture, climate, education, energy, and more. | data.gov |
Data.gov.in | India’s open government data platform, offering datasets from various Indian government departments and ministries. | data.gov.in |
World Bank Open Data | Provides global development data, including economic indicators, health statistics, and population metrics. | data.worldbank.org |
European Data Portal | Aggregates open data from European countries, covering various sectors like economy, health, and environment. | data.europa.eu |
Common Crawl | A repository of web crawl data that can be used for research in natural language processing, machine learning, and data mining. | commoncrawl.org |
Zenodo | An open-access repository developed under the European OpenAIRE program and operated by CERN, allowing researchers to share datasets, research papers, and software. | zenodo.org |
Harvard Public Domain Books Dataset | A collection of nearly one million public-domain books released by Harvard University, suitable for training AI models. | Harvard Dataset |
WorldMove | A synthetic human mobility dataset covering over 1,600 cities worldwide, useful for urban planning and transportation research. | WorldMove Dataset |
WeatherBench | A benchmark dataset for data-driven weather forecasting, providing processed data from the ERA5 archive. | WeatherBench |
PMLB (Penn Machine Learning Benchmarks) | A collection of standardized datasets for evaluating machine learning algorithms, facilitating easy comparison of methods. | PMLB GitHub |
UK Data Archive | Hosts over 6,000 social science datasets, including large-scale surveys like the Labour Force Survey and Crime Survey for England and Wales. | UK Data Archive |
FiveThirtyEight Datasets | Datasets used in FiveThirtyEight articles, covering topics like politics, sports, science, and economics. | FiveThirtyEight GitHub |
AWS Public Datasets | A repository of large public datasets hosted on Amazon Web Services, including satellite imagery, genomic data, and web crawls. | AWS Public Datasets |
Google Dataset Search | A tool to find datasets stored across the web, facilitating access to datasets in various domains and formats. | datasetsearch.research.google.com |
YouTube Labeled Video Dataset | A dataset containing labeled YouTube videos, useful for video classification and machine learning tasks. | YouTube-8M Dataset |
Analytics Vidhya Datasets | Offers datasets for practice and competitions in data science and machine learning. | Analytics Vidhya |
Quandl (now Nasdaq Data Link) | Provides financial, economic, and alternative datasets, suitable for investment and economic research. | quandl.com |
DrivenData | Hosts data science competitions for social good, providing datasets on various humanitarian topics. | DrivenData |
MNIST Database | A classic dataset of handwritten digits, widely used for training image processing systems. | MNIST Dataset |
MovieLens | Provides movie rating datasets, useful for building and evaluating recommendation systems. | MovieLens |
Jester Dataset | A dataset of joke ratings, used for research in collaborative filtering and recommendation systems. | Jester Dataset |
Awesome Public Datasets | A curated list of high-quality public datasets categorized by topic and domain. | Awesome Public Datasets |
Big Data Analytics News – 200+ Free Datasets | A comprehensive guide listing over 200 free datasets across various domains, including AI, NLP, and machine learning. | Big Data Analytics News |
These datasets are valuable resources for data scientists, researchers, and enthusiasts looking to explore and analyze data across different fields. They are freely available and can be used for various projects, including machine learning model training, data analysis, and academic research.
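
If you want to work with the list itself, for example to filter entries by topic, the table rows can be parsed into simple records. The following is a minimal sketch that assumes the rows keep the `Name | Description | Link |` shape shown above; `DatasetEntry`, `parse_rows`, and `find` are hypothetical helper names made up for this illustration, and the sample text copies three rows from the table.

```python
from typing import List, NamedTuple


class DatasetEntry(NamedTuple):
    """One table row. Illustrative type, not part of any library."""
    name: str
    description: str
    link: str


# Three rows copied from the table above, including its header and separator.
SAMPLE = """
Dataset Name | Description | Download Link |
---|---|---|
World Bank Open Data | Provides global development data, including economic indicators, health statistics, and population metrics. | data.worldbank.org |
MovieLens | Provides movie rating datasets, useful for building and evaluating recommendation systems. | MovieLens |
Jester Dataset | A dataset of joke ratings, used for research in collaborative filtering and recommendation systems. | Jester Dataset |
"""


def parse_rows(table_text: str) -> List[DatasetEntry]:
    """Split pipe-delimited rows into entries, skipping header and separator."""
    entries = []
    for line in table_text.strip().splitlines():
        # Drop the trailing pipe, then split into trimmed cells.
        cells = [cell.strip() for cell in line.strip().rstrip("|").split("|")]
        # Assumption: a valid row has exactly three cells.
        if len(cells) != 3 or cells[0] in ("Dataset Name", "---"):
            continue
        entries.append(DatasetEntry(*cells))
    return entries


def find(entries: List[DatasetEntry], keyword: str) -> List[DatasetEntry]:
    """Return entries whose name or description contains the keyword."""
    keyword = keyword.lower()
    return [
        entry
        for entry in entries
        if keyword in entry.name.lower() or keyword in entry.description.lower()
    ]


if __name__ == "__main__":
    for entry in find(parse_rows(SAMPLE), "recommendation"):
        print(entry.name, "->", entry.link)
```

The keyword match is a plain case-insensitive substring check, which is enough for a short list like this one. Note that a description containing a literal `|` would break the three-cell assumption and the row would be skipped.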