National Center for Data Mining Large Data Archives
Today, research in data analysis and data mining is hindered by the lack of large data sets available to the research community.
The National Center for Data Mining at the University of Illinois at Chicago is hosting the Large Data Archive, which makes available to the research community a variety of large data sets (data sets larger than 10 GB).
If you are interested in providing your data set to the Large Data Archive, please contact us at info@ncdm.uic.edu. We are currently looking for data sets that are 10 GB or larger, preferably 1 TB or larger.
The initial collection consists of three multi-terabyte data sets: the Gateway Highway Testbed, containing highway sensor and related data; the Sloan Digital Sky Survey (SDSS) data; and data from the Angle Anomaly Detection Project.
We have developed a software application called Sector that is designed to simplify the downloading of large data sets over high-performance networks. We have also installed clusters at several sites around the world to improve access to the data. Currently, the SDSS data set is distributed using Sector.