Data Set
In data science, arguably the most challenging and crucial step is securing and preprocessing data. Unfortunately, courses that teach what data exists in the world and how to acquire it are rare.
| Mark | Meaning |
|---|---|
| 😀 | Easy Access |
| 😡 | Difficult Access |
| 🔰 | Recommended for Beginners |
| 👨🎓 | Recommended for Experts |
| 👍 | Highly Recommended |
It might seem sufficient to follow only the thumbs-up 👍 entries, but navigating the world of data is not always that straightforward. The exact data a task requires often does not exist, and compromises must be made. Knowing several alternatives, even imperfect ones, helps, because any data is better than none.
Structured Data
- Environmental Big Data Platform: Provides an environmental data market, offering both free and paid data. (2021.08.02)
- Meteorological Data Open Portal 👨🎓🔰: Offers meteorological and disaster-related data. (2021.08.03)
- 👍 Our World in Data 😀🔰: Provides hundreds of annual data types related to various societal aspects, available by country and year, for free. Notably offers extensive COVID-19 data and statistics. (2021.12.30)
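As a minimal sketch of working with country-by-year data of this kind, the snippet below filters a CSV by country code and year range using only the standard library. The `Entity`, `Code`, `Year` column layout is an assumption about how a downloaded file is organized, `value` is a hypothetical column name, and the rows are placeholder numbers for illustration, not real statistics.

```python
import csv
import io

# Placeholder rows for illustration only; these are not real statistics.
SAMPLE_CSV = """Entity,Code,Year,value
South Korea,KOR,2019,1.0
South Korea,KOR,2020,2.0
South Korea,KOR,2021,3.0
Japan,JPN,2020,4.0
"""


def select_series(csv_text, code, start_year, end_year):
    """Return {year: value} for one country code within [start_year, end_year]."""
    series = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = int(row["Year"])
        if row["Code"] == code and start_year <= year <= end_year:
            series[year] = float(row["value"])
    return series


korea = select_series(SAMPLE_CSV, "KOR", 2020, 2021)
print(korea)
```

With a real download, the same function could be fed the file's text in place of `SAMPLE_CSV`, after adjusting the column names to match the actual header.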
Time Series
- investing.com 😀🔰: A global financial information site providing easy access to chart information for stocks like KOSPI and KOSDAQ. (2021.07.30)
- CYBOS Plus 😡👨🎓: Daishin Securities’ Open API offering daily or real-time time series data including stock codes, closing prices, market capitalization, institutional net buying, etc. (2021.07.15)
Local Governments of Korea
- D-Data Hub 😀: Offers public data for the Daegu region, with over 4,000 datasets and 13,000 services. (2021.06.08)
- Changwon Big Data Portal: Provides 172 datasets across 12 categories, big data studio, and commercial analysis for the Changwon area. (2021.07.30)
Unstructured Data
- AI Hub 👨🎓: Provides AI training data in fields such as voice/natural language, vision, healthcare, autonomous driving, safety, agriculture, and environment, in various formats like images, videos, texts, audio, 3D, and sensor data. (2021.07.14)
- kaggle 😀🔰: The most famous open data hub globally, hosting countless datasets and many competitions. (2021.07.15)
- KDX Korea Data Exchange 😡👨🎓: Unlike general data hubs, it’s a company that sells data. Offers high-quality data suitable for Korean contexts, with both paid and free options available. (2021.08.06)
Networks
- SEES:lab 👨🎓: Offers neatly organized data on networks such as airports and emails. (2021.12.31)
- Stanford Network Analysis Project 👨🎓: A network analysis and mining library maintained by Stanford University that also provides large network datasets. (2022.01.04)
- OpenFlights: Provides data on global airports and airline networks. Some preprocessing may be required, but comprehensive network data of this scale is rare. (2022.01.10)
- Mark Newman’s Network Data 😡: Offers 23 types of networks related to published research by the well-known Mark Newman. (2022.01.10)
- World Pop: Offers data on the global aviation network, international migration statistics, urbanization, age, and gender structure. (2022.01.04)
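Network data from sources like these is often distributed as a plain-text edge list. The sketch below assumes one common layout: comment lines starting with `#`, followed by one whitespace-separated `source target` pair per line. The sample edges are made up for illustration, and the function name `read_edge_list` is hypothetical; check each dataset's own description before relying on this format.

```python
# Tiny made-up edge list in the assumed format; not an actual dataset.
SAMPLE_EDGES = """# Directed graph (hypothetical example)
# FromNodeId ToNodeId
0 1
0 2
1 2
2 0
"""


def read_edge_list(text):
    """Build a directed adjacency map {source: set(targets)} from edge-list text."""
    adjacency = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        source, target = (int(token) for token in line.split()[:2])
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())  # nodes with no outgoing edges still get a key
    return adjacency


graph = read_edge_list(SAMPLE_EDGES)
out_degree = {node: len(neighbors) for node, neighbors in graph.items()}
print(out_degree)
```

A dictionary of sets keeps the example dependency-free; a dedicated graph library would be the more natural choice for large networks.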
Geographic Information
- ITS National Transportation Information Center 😀👨🎓: Provides domestic traffic congestion, construction accidents, CCTV, traffic prediction, vehicle detectors, VMS, traffic safety assistants, variable speed signs, vulnerable section information, and nationwide standard node links. (2021.08.03)
- 👍 GIS DEVELOPER 👨🎓: A blog run by GIS expert and developer Hyungjun Kim. It is said to be indispensable for projects that use Korean data. (2023.01.10)
All posts
- Introduction to D-Data Hub
- Introduction to AI Hub
- How to Download Data Using Kaggle API, Solving OSError: Could Not Find kaggle.json.
- Introduction to Kaggle
- Introduction to Investment Information Open API CYBOS Plus
- CYBOS Plus Installation Tutorial
- How to Load Stock Codes with CYBOS Plus CpUtil.CpStockCode
- How to Fetch Stock Prices for Securities Using CYBOS Plus CpSysDib.StockChart
- How to Import Institutional and Foreign Trade Volume with CYBOS Plus
- How to Fetch Short Selling Trends with CYBOS Plus
- Introduction to Changwon City Big Data Portal
- Introduction to investing.com
- Introduction to the Environmental Big Data Platform
- Introduction to the Meteorological Data Open Portal
- Introduction to the ITS National Transportation Information Center
- Introduction to the Korea Data Exchange (KDX)
- Introduction to Our World in Data
- Introduction to SEES:lab
- Introduction to World Pop
- Introduction to the Stanford Network Analysis Project
- Introduction to OpenFlights
- Introduction to Network Data by Mark Newman
- Introduction to GIS Developer
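For the Kaggle API post listed above, the `kaggle.json` error usually comes down to the credentials file not being where the client looks for it. The sketch below only checks for the file's presence. It assumes the default location `~/.kaggle/kaggle.json` and that the `KAGGLE_CONFIG_DIR` environment variable overrides the directory; the helper names are hypothetical, and this is not the Kaggle client's own code.

```python
import os
from pathlib import Path


def kaggle_config_dir(environ=None):
    """Directory where kaggle.json is expected (assumed convention, see above)."""
    environ = os.environ if environ is None else environ
    custom = environ.get("KAGGLE_CONFIG_DIR")
    return Path(custom) if custom else Path.home() / ".kaggle"


def find_kaggle_json(environ=None):
    """Return the path to kaggle.json if the file exists, otherwise None."""
    candidate = kaggle_config_dir(environ) / "kaggle.json"
    return candidate if candidate.is_file() else None


path = find_kaggle_json()
if path is None:
    print("kaggle.json not found in", kaggle_config_dir())
else:
    print("Found credentials at", path)
```

Running a check like this before calling the API makes it clear whether the file is missing or merely in an unexpected directory.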