This guide also borrows heavily from an edit of the original from the University of Rochester.
It is licensed under a Creative Commons Attribution 4.0 International License.
Any part of it may be used as long as credit is included.
"By open data in science we mean that it is freely available on the public internet permitting any user to download, copy, analyse, re-process, pass them to software or use them for any other purpose without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself." Panton Principles, Principles for open data in science. Murray-Rust, Peter; Neylon, Cameron; Pollock, Rufus; Wilbanks, John; (19 Feb 2010). Retrieved 10/22/2018 from https://pantonprinciples.org/
The types of open data resources you will find in this LibGuide include government data, data from non-governmental organizations (NGOs), vetted data, and crowdsourced data.
An interesting review of crowdsourcing is available in Wazny, K. "'Crowdsourcing' ten years in: A review", doi: 10.7189/jogh.07.020601
Please Note: This is not a comprehensive list of all available open data resources but it might be a good place for you to start! :-)
Learn about data repositories: some are open, some are restricted, and some offer both open and restricted data. ICPSR is a great repository that provides both open data and data managed for member institutions; it also provides excellent resources and information about data repositories, which you can find here.
Learn about data file types. Wikipedia has an excellent page listing file formats: https://en.wikipedia.org/wiki/List_of_file_formats If you know the possible file format or file extension for the data that you are looking for you can use that in your search (e.g. "filetype:xlsx" or ".xlsx").
Literature reviews are your friend. Read the research literature in your field of interest and identify the data used in inquiries of interest. Research libraries typically provide access to many peer-reviewed journals that provide reviews of the appropriate literature. A search of Journals and Periodicals with the word "review" in them is available here. Also, try a search for "applied methods" and your discipline of interest.
General Google Search
Include search terms like "data" or "table"
Google ignores the word AND as a search operator, but typing OR in all caps between terms finds pages that match any of them, which is useful for similar or related terms (e.g. data OR dataset OR "data set").
Search for a particular document type (e.g. filetype:xls)
Search for data on a particular site or domain (e.g. asthma site:.gov)
Exclude words by using the "-" sign in front of the word you wish to exclude
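The search tips above can be combined into a single query. Below is a minimal Python sketch of how that might look; `build_query` is a hypothetical helper written for this guide (it is not part of any Google tool or API), and the search URL pattern at the end is an assumption about how a query is usually passed to Google in a browser.

```python
from urllib.parse import urlencode


def build_query(terms, any_of=(), filetype=None, site=None, exclude=()):
    """Assemble a search string from the operators described in this guide.

    Hypothetical helper for illustration only; not an official Google API.
    """
    def quote(term):
        # Wrap multi-word phrases in double quotes so they stay together.
        return f'"{term}"' if " " in term else term

    parts = [quote(t) for t in terms]
    if any_of:
        # OR must be typed in all caps to work as an operator.
        parts.append(" OR ".join(quote(t) for t in any_of))
    if filetype:
        # Restrict results to one document type, e.g. xls or csv.
        parts.append(f"filetype:{filetype}")
    if site:
        # Restrict results to one site or domain, e.g. .gov.
        parts.append(f"site:{site}")
    # A leading minus sign excludes a word from the results.
    parts.extend(f"-{quote(t)}" for t in exclude)
    return " ".join(parts)


query = build_query(
    ["asthma"],
    any_of=["data", "dataset", "data set"],
    filetype="xls",
    site=".gov",
    exclude=["news"],
)
print(query)
# Assumed URL pattern for opening the query in a browser.
print("https://www.google.com/search?" + urlencode({"q": query}))
```

The first line printed is the text you would type into the search box: asthma data OR dataset OR "data set" filetype:xls site:.gov -news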
Advanced Google Search
Advanced Google search provides an interface for using many more parameters in your search. It first focuses on finding pages using word and number matches, then lets you narrow your results using filters for language, region, etc. Tip: Open your browser window wide enough so that you can see Google's information on what "To do in the search box". This provides specific examples.
Google Dataset Search (note: Google refers to this as Beta software)
This tool was specifically developed to assist scientists and data journalists with finding and accessing data, and was made available to the public on September 5, 2018.
Your library should have excellent search tools for reviewing the literature on your research subject but this is another place you can start.
DataLumos is an ICPSR archive for valuable government data resources. ICPSR has a long commitment to safekeeping and disseminating US government and other social science data. DataLumos accepts deposits of public data resources from the community and recommendations of public data resources that ICPSR itself might add to DataLumos.
UK Data Archive
The UK Data Archive is the lead organisation of the UK Data Service, which provides unified access to the UK's largest collection of social, economic and population data. Funded by the ESRC, the UK Data Service provides access to regional, national and international social and economic data, support for policy-relevant research and guidance and training for the development of skills in data use.
Published annually by the U.S. Department of Agriculture, Agriculture Statistics provides information on agricultural production, supplies, consumption, facilities, costs, and returns.
Public archive of Census of Agriculture publications published 1945-1987 in PDF format.
The USDA's National Agricultural Statistics Service (NASS) conducts hundreds of surveys every year and prepares reports covering virtually every aspect of U.S. agriculture.
The Census of Agriculture is the leading source of facts and figures about American agriculture. It is the only source of uniform, comprehensive agricultural data for every state and county in the United States.
Uniform Crime Reports - The Uniform Crime Reports (from the FBI) are produced from data provided by nearly 17,000 law enforcement agencies across the United States.
Contains data on crime and victimization, arrests, dispositions, law enforcement personnel, and more.
Contains statistics for Criminal Justice Characteristics, Public Opinion, "Crime, Victims," "Arrests, Seizures," "Courts, Prosecution, Sentencing," and "Parole, Jails, Prisons, Death Penalty."
2010 census data available for users who want to download the set of detailed tables for all of the geographies within a state and run their own analysis and rankings.
Includes data on churches and church membership, religious professionals, and religious groups (individuals, congregations and denominations).
These resources provide State and national statistics on child and family well-being indicators, such as health, child care, education, income, and marriage.
Provided by the University of Michigan, access is limited to one user at a time. Available in both Chinese and English. Includes yearly macro-economic statistics, monthly macro-economic statistics, and historical statistics (1949-) from China.
Historical, current and projected socioeconomic data for the United States, regions, and counties. Find info on population by age and race, employment by industry, earnings of employees by industry, personal income by source, households by income bracket and retail sales by kind of business. (1970-2050).
Primary source of labor force statistics for the population of the United States
Public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.
Major statistical data sources on disability
Military casualty data and Active Duty military/civilian personnel statistics by rank/grade, service totals, service by Region/Country.
Provides access to official statistical information available to the public from the Federal Government. Covers over 100 U.S. Federal agencies.
A standard 'core' of demographic, behavioral, and attitudinal questions, plus topics of special interest, that has tracked the opinions of Americans over the last four decades.
Data related to development: literacy, health, poverty, income inequality, climate change, crime, population, and more.
Assesses child well-being nationally via 16 key indicators 1990-present
Click on "International Statistical Links" to retrieve an A-Z country list with national statistical bureau links attached.
Gathers social and economic information on Mexican-US migration.
Search, customize and download datasets for national and international variables.
The National Center for Children in Poverty (NCCP) is the nation’s leading public policy center dedicated to promoting the economic security, health, and well-being of America’s low-income families and children.
Data and research studies cover all areas of social policy in the UK.
New York State’s primary and most comprehensive source for economic and demographic data, tracking the trends of the State of New York, its businesses and people.
Compendium of tables that provides data on foreign nationals, permanent legal residents, naturalized citizens and maps with various demographic characteristics (1996-present).
Includes multidimensional poverty index and specific summaries on the results of the MPI analyses in 104 developing nations.
PovcalNet is an interactive computational tool that allows you to replicate the calculations made by the World Bank's researchers in estimating the extent of absolute poverty in the world, including the $1 a day poverty measures.
UNDP is the United Nations' global development network, an organization advocating for change and connecting countries to knowledge, experience and resources to help people build a better life.
American FactFinder provides access to detailed tables and maps for population, housing, and businesses.
A compilation of international development indicators and socio-economic data.
Provides a yearly overview of the economic, social, and environmental state of the world. Also provides detailed economic data for most countries, quality of life indicators, and other demographic and environmental information.
Geochemical data from a variety of sources worldwide. Advanced search interface allows users to define many parameters to find appropriate information.
A large archive of climate data.
Datasets from the federal government, including the USGS and other agencies.
Geospatial information from the Canadian federal government, including topographic maps and GIS data.
Bathymetry, earth observations from space, geomagnetic data & models, marine geology & geophysics, natural hazards, space weather & solar events.
ShareGeo Open is a spatial data repository that promotes data sharing between creators and users of spatial data. Geospatial data of all kinds is available.
Contains economic data for EU member states, EU candidate countries and other OECD countries (United States, Japan, Canada, Switzerland, Norway, Iceland, Mexico, Korea, Australia and New Zealand).
Current and historical data for the VIX, a barometer for measuring investor sentiment and market volatility.
Provides access to thousands of data sets holding hundreds of millions of facts and figures from a wide range of public and private data providers including the United Nations, the World Bank, Eurostat and the Economist Intelligence Unit.
Contains several thousand economic time series, produced by a number of U.S. Government agencies and distributed in a variety of formats and media
Access structural and aggregated financial information & quarterly reports on FDIC-insured institutions
Provides information regarding home mortgage lending activity.
Data on IMF lending, exchange rates, trade statistics and other economic and financial indicators.
In-depth economic analyses of the home building industry based on private and government data
Contains useful economic datasets for download
Stats for OECD countries and selected non-member economies.
Information on how Recovery funds are/were spent by recipients of contracts, grants, and loans, and the distribution of Recovery entitlements and tax benefits under the Recovery Act.
Contains data on factors such as poverty, income, employment, and health insurance coverage
USAID data--Explains where U.S. foreign aid is invested in over 100 countries.
Interactive national, international, regional economic data or industry statistics.
Databases, Tables & Calculators by Subject for US labor statistics.
Futures and options markets data
Provides a complete historical record of all foreign assistance provided by the United States to the rest of the world.
World Investment Report (annual) covers trends and analysis in foreign direct investment
A Database of International Business Statistics with over 5000 variables from over 200 countries. Data available from mid-1990s to present. (Free registration is required to access data sets).
Data included in FedStats, but this website concentrates on "...statistics and reports on children and families."
Ed Watch Interactive is a user-friendly source of data on educational performance and equity by race and class, kindergarten through college.
Explore hundreds of measures of well-being for kids across the nation, or in your state, city, or community.
State Profiles presents key data about each state's performance in the National Assessment of Educational Progress (NAEP) in mathematics, reading, writing, and science for grades 4 and 8.
The Nation's Report Card presents the results of the National Assessment of Educational Progress (NAEP), which measures student achievement in the U.S. in various subjects over time.
Collects, analyzes and makes available data related to education in the U.S. and other nations.
The School Construction Data section includes current, forecast, and historical data about U.S. school construction. The School Building Statistics section provides answers to the most frequently asked questions about school facilities.
Contains data and statistics collected from New York schools and learning support resources.
The New York State Report Cards provide enrollment, demographic, attendance, suspension, dropout, teacher, assessment, accountability, graduation rate, post-graduate plan, career and technical education, and fiscal data for public and charter schools, districts, and the State.
Research and Data on NYC Schools.
Contains over 1,000 types of indicators and raw data on education, literacy, science and technology, culture and communication for more than 200 countries and territories
US Electoral and popular vote results from 1789 to the present for US Presidential Elections.
Contains data on voting, public opinion and political participation. Cumulative, time-series, panel and contextual data are available for download.
The Center for American Women and Politics (CAWP), a unit of the Eagleton Institute of Politics at Rutgers, The State University of New Jersey, is nationally recognized as the leading source of scholarly research and current data about American women’s political participation. This site presents data and analysis of women's voting behavior, including statistics on turnout and the gender gap in voting.
Official vote counts from 1920 to the present for presidential and congressional elections compiled by the Office of the Clerk of the U.S. House of Representatives. This site also contains links to election resources found on the websites of the Census Bureau, National Archives, Federal Election Commission, and state election offices.
Data and analyses of US elections, including data on presidential, congressional, and gubernatorial elections, political parties, campaigns, and demographic information.
Downloadable campaign finance information
A space to share and improve election data. You can create an account and either correct the cataloging information for the studies in this dataverse or upload new data files.
Four ICPSR studies that provide datasets of electoral returns for approx. 90% of all elections to the offices of president, governor, United States senator, and United States representative for all parties and candidates 1788-1990. Most returns are at the county level.
Library of Congress U.S. Election Statistics: A Resource Guide - A list of online and print resources that contain U.S. election statistics for both federal and state elections.
National Archives: Historical Election Results - The Office of the Federal Register at the National Archives coordinates the functions of the Electoral College on behalf of the Archivist of the United States, the States, and the Congress. This site contains the electoral votes and popular votes from 1789 to the present.
U.S. Census Bureau: Voting and Registration - Voting and Registration data have been collected biennially by the U.S. Census Bureau in the November Current Population Survey (CPS). The statistics presented on this website are based on replies to survey inquiries about whether individuals were registered and/or voted in specific national elections. For the purpose of these estimates, election types are considered to be either congressional or presidential.
United States Election Project: Voter Turnout - Dr. Michael McDonald, an Associate Professor at George Mason University, provides national and state voter turnout statistics from 1980 to the present.
Constituency-Level Elections Archive (CLEA) - Contains approximately 1100 elections from 70+ countries at a constituency level for lower house legislative elections. Includes votes received by each candidate/party, total votes cast, number of eligible voters, and seat figures where available. Available in Stata, SPSS, and raw data formats. Dates are variable, but generally 1945+
Build data sets on national and subnational elections around the world 1940s-2010s. Accessible in multiple formats: spreadsheets, tables and GIS maps.
Information on international elections: Subnational elections of high interest; Political parties and candidates; Referendum provisions; News on election-related laws and developments around the world; Political institutions and electoral systems; Election results and voter turnout.
BP, one of the world's largest energy companies, publishes statistical reviews of world energy, projections and historical data.
Data on environmentally sustainable energy programs in developing countries.
Energy statistics by country
Monthly data for the biggest oil producing and consuming countries.
A program of the U.S. Department of Education's National Center for Education Statistics, this site provides annual and national statistics for all public elementary and secondary schools, and school districts across the U.S. Data can be located under "Quick Facts", "Data", or by searching for a specific school or school district under "School/District Locator." Fiscal and Nonfiscal reports, and working papers can be located under "Publications."
Contains both experimental and evaluated nuclear data, including nuclear reaction data (the properties of interacting nuclei, e.g. cross sections) and nuclear structure data (the properties of single nuclei).
A journal covering the global energy market (1998-present).
Company level data on the supply and disposition of natural gas in the United States, Electric power data collected by surveys, international energy statistics, energy country profiles for 217 countries, state and territory energy profiles for the U.S., financial data collected from major energy producers, short-term and historical energy outlook data & projections, and real energy prices.
Provides energy data for more than 215 countries, areas and regions on the production, trade and intermediate and final consumption for primary and secondary conventional, non-conventional and new and renewable sources of energy from 1990 onwards.
Provides data sets on energy, climate, forests, water, & sustainability.
UN Food and Agriculture Organization (FAO) global water information system
CIESIN works at the intersection of the social, natural, and information sciences, and specializes in on-line data and information management, spatial data integration and training, and interdisciplinary research related to human interactions in the environment.
Provides datasets on environmental issues in the areas of population, health, society, natural systems, climate, energy, transportation, food & agriculture, and economy & policy.
Provides datasets and climate reports (land-based stations, satellite, radar, modeling, weather balloons, marine/ocean, paleoclimatology, and severe weather).
United Nations agency providing data on fresh water issues around the globe.
An interactive site from the UN Environment Programme, with data, maps, reports and more.
Reports on climate change
Provides datasets, maps, data visualizations, charts, and graphs on environmental issues.
Public health statistics
WONDER online databases utilize a rich ad-hoc query system for the analysis of public health data. Reports and other query systems are also available.
A centralized and comprehensive source of information and analyses on global health R&D activities for human disease.
Datasets, tools, and applications gathered from agencies across the Federal government
Data from 219 countries and areas on the prevalence of HIV infection and AIDS cases and deaths.
Contains current data from surveys such as the National Health Interview Survey (NHIS), the National Health and Nutrition Examination Survey (NHANES), birth and mortality detail files, National Immunization Survey, Longitudinal Study of Aging, and National Survey of Family Growth (NSFG)
International labor statistics, standards, key indicators of the labor market, labor force surveys, safety, work conditions, child labor, and labor legislation.
Information on international trade and economic development trends, markets, and labor force.
Provides international labor force and wage information and other economic statistics (1990-present).
Contains international policies and data on employment, health, families and children, pension systems, international migration and other social policies and data. Downloadable files in Excel, CSV, PC-axis, or XML.
Database covers four key elements of modern political economies in advanced capitalist societies: Institutional Characteristics of Trade Unions, Wage Setting, State Intervention and Social Pacts in 49 countries between 1960 and 2010.
Provides detailed, comparative tables for social protection systems for 31 countries. Areas covered include financing, healthcare, sickness, maternity, invalidity, old-age, survivors, employment injuries and occupational diseases, family, unemployment, guaranteed minimum resources and long-term care.
Highlights the principal features of social security programs in more than 170 countries. Public use data downloadable as TXT or SAS zip files.
Describes the basic principles of polling and sampling in a question and answer format.
Database of public opinion polls containing the full text of over 600,000 questions and responses, from more than 18,000 surveys and 1,700 polling organizations, conducted from 1986 through the present. This database contains a large collection of public opinion polls related to federal and state elections. Information is gathered by professional polling organizations, television networks, universities, newspapers, businesses and associations, including the Gallup Poll, the Roper Organization Poll and others.
Roper Center for Public Opinion - iPOLL
Survey results, questionnaires, and possible dataset download option for an extensive collection of public opinion polls.
Primary, national, and state exit polling data for presidential elections from 1976 to the present.
A repository of Federal, State, and Local Maps and Data
A database of spatial data sources and other GIS information.
A spatial database of the location of the world's administrative areas including countries and lower level subdivisions such as provinces, departments, bibhag, bundeslander, daerah istimewa, fivondronana, krong, landsvæðun, opština, sous-préfectures, counties, and thana.
U.S. Maps and Data
A GIS Data Depository.
Find interactive maps, GIS data sets, satellite imagery and related applications.
The Geospatial Data Gateway (GDG) is the One Stop Source for environmental and natural resources data.
A collection of 7,908 worldwide and regional geographic data layers, scanned historic maps and associated descriptive information that can be searched, mapped, and downloaded for use with your GIS software.
A large archive of climate data.
A gateway to resources that help visualize national geospatial data in map format.
Free aggregate census data and GIS-compatible boundary files for the United States between 1790 and 2015.
GIS data with a Meteorological slant issued by the National Weather Service
A public domain map dataset available at 1:10 million, 1:50 million, and 1:110 million scales. Includes vector and raster data, and can also be used to create maps.
Interactive query access to geographic boundary files, Census attribute data for various levels of statistical and political geography, and predefined thematic maps illustrating various economic and demographic characteristics of the population
An open source, federated web application framework to rapidly discover, preview and retrieve geospatial data from multiple repositories.
A resource that provides access to free downloads of various layers of National Geospatial Data.
Spatial extracts from the Census Bureau's MAF/TIGER database, containing features such as roads, railroads, rivers, as well as legal and statistical geographic areas
The GEO Data Portal is the authoritative source for data sets used by UNEP and its partners in the Global Environment Outlook (GEO) report and other integrated environment assessments. Its online database holds more than 500 different variables, as national, subregional, regional and global statistics or as geospatial data sets (maps), covering themes like Freshwater, Population, Forests, Emissions, Climate, Disasters, Health and GDP.
"Provides soil data and information [in map form] produced by the National Cooperative Soil Survey."
"A component of NASAs Earth Observing System (EOS) Data and Information System (EOSDIS). LP DAAC processes, archives, and distributes land data and products derived from the EOS sensors."
Here are a few articles and guidelines for deepening your understanding of the ethical considerations in social media research:
Infographics and stats on social media topics
Public datasets and tools for processing data for researchers
Reports on mobile and social media use.
Repository of datasets from social media sites. Included in this repository are BlogCatalog, Twitter, MyBlogLog, Digg, StumbleUpon, del.icio.us, MySpace, LiveJournal, The Unofficial Apple Weblog (TUAW), Reddit, etc.
Social Feed Manager is open source software that harvests social media data and web resources from Twitter, Tumblr, Flickr, and Sina Weibo. Find social media data such as Tweets from 60 Twitter accounts belonging to the President, Vice President, White House and administration officials, Cabinet members, political appointees, members of Congress, official offices, and other related accounts, activists for the 2017 women’s march, news outlets, and more.
Data drawn from the OPE Equity in Athletics Disclosure Website database reported by all co-educational postsecondary institutions that receive Title IV funding (i.e., those that participate in federal student aid programs) and that have an intercollegiate athletics program.
Provides statistical, historical and other data for college and professional players, leagues and teams.
Provides customized reports for public inquiries relating to equity in athletics data.
College sport program comparisons. Sort by institution, division, gender, profit margin or sport.
Rankings and championship historical data by school or sport.
Professional sports data for major and minor league baseball, NBA, NFL, NHL as well as college football, college basketball and Olympic sports.
Links to sports data on American Football, Baseball, Basketball, Cricket, Cycling, Fencing, Golf, Hockey, Olympics, Poker, Racing, Rugby, Soccer, Tennis, and Volleyball.
Provides statistical and financial data for the airline industry
Contains data relevant to a range of transport policy issues including investment & maintenance, transport statistics, emissions, road taxation and safety.
Part of the Research and Innovative Technology Administration of the U.S. Department of Transportation, the Bureau of Transportation Statistics (BTS) compiles, analyzes, and publishes statistical information on U.S. transportation systems. Data downloadable as CSV files (1995-present).