Census Data

This guide provides an introduction to U.S. census data: concepts, datasets, and data sources.

Glossary

Subject categories, census geographies, and terminology are fairly consistent across most census datasets. This page provides a summary of the most common concepts. Use the Census Bureau's Glossary to find definitions for different terms.

Geography

The Census Bureau's public summary data is published for many different administrative, legal, and statistical areas. These areas fit into a hierarchy where larger areas are built from smaller ones, and smaller areas are constrained by larger ones. The diagram below illustrates how census geographies fit within summary levels. For example, counties are directly connected to states, which means that they fit within states and do not cross state boundaries. In contrast, places (cities and towns) do not have a connection to counties, which means they may cross county boundaries.

Census Geography Hierarchy

What about neighborhoods? Neighborhoods are areas that are locally and informally defined. The Census Bureau does not have a definition of what constitutes a neighborhood, nor does it publish data for them. To study a neighborhood, you need to approximate its area with a census geography, based on some consensus about where its boundaries lie. A common approach is to aggregate census tracts into neighborhood-like areas. You can use either the PolicyMap or Social Explorer library databases to create customized areas (see the Library Databases on the Population Data page).

PolicyMap Interface

Geographic Resources

Subject Categories

People and housing units are summarized in a number of different categories and subcategories. Data is reported for each of these groups, so, for example, there are distinct tables for household income versus family income. While these terms sound commonplace, in the census they have highly specific definitions. Basic tips appear below; see the glossary or technical documentation for individual datasets (decennial census or American Community Survey) for more details.

  • Age is published in 5- or 10-year cohorts, large categories (under 18 / 65 and over), and as means and medians. The census does not use generational categories like Millennial or Gen X.
  • Sex refers to biological or anatomical sex (male / female). The census does not capture gender or sexual identity, but does capture same-sex households (married and unmarried).
  • Hispanic / Latino is not counted as a race in US government datasets, but as a separate ethnic category. There are five racial categories and a separate category for Hispanic / Latino origin. Data is published separately and in cross-tabulated tables. These categories have changed over time.
  • Households consist of 1 or more people who live in a residential setting. Families are a subset of households, defined as 2 or more related people who live together in a residential setting. People not living in households live in Group Quarters, where multiple people live together in a communal setting where they are free to leave (non-institutionalized: college dorms, military barracks, homeless shelters) or not (institutionalized: penitentiaries, mental hospitals, nursing homes).
  • Housing units are individual, self-contained domiciles where people live separately from each other and have individual access to their unit (e.g. single-family homes, individual apartment and condo units, and mobile homes). Units are classified as occupied or vacant, and occupied units are classified by tenure (owners or renters).
  • Residency is measured differently in different datasets. The decennial census employs usual residency: where people live and sleep most of the time as of April 1st of the census year. The American Community Survey employs current residency, where a person is considered the resident of an address if they are staying there for more than 2 months. If an individual lives in more than one housing unit (second or seasonal homes), the units where they are not usually or currently living are counted as vacant housing units.
  • The homeless population is included in the decennial census, but the Census Bureau does not explicitly define and tabulate homelessness. Some of this population is measured in subcategories of the group quarters population.

Margins of Error

Variables in the American Community Survey (ACS) are published with two values: an estimate and a margin of error (MOE). The estimate is the mid-point of a range of possible values, and the MOE is the distance from the estimate to either end of that range, published at a 90% confidence level. For example, the estimate of people enrolled in college or university in the City of Providence was 25,912 +/- 1,100 for the 2015-2019 period. This means this population could be as low as 24,812 or as high as 27,012. We are 90% confident that the true value falls within this range; there is a 10% chance that it falls outside this range.
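As a minimal sketch of the arithmetic in the Providence example, the snippet below computes the lower and upper bounds from an estimate and its MOE. The function name `moe_interval` is made up for illustration and is not part of any Census Bureau tool.

```python
def moe_interval(estimate, moe):
    """Return the (lower, upper) bounds implied by an ACS estimate and its MOE."""
    return estimate - moe, estimate + moe


# College or university enrollment, City of Providence, 2015-2019 (from the text)
low, high = moe_interval(25912, 1100)
print(low, high)  # 24812 27012
```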

It's important to interpret ACS statistics as intervals of likely values and not as exact counts, and to scrutinize the MOE to gauge the precision and overall reliability of an estimate. A statistic called the coefficient of variation can be calculated to gauge this reliability. To reduce the size of the MOE, you can use 5-year estimates as opposed to 1-year estimates, or study a larger geographic area or population group. If you summarize or aggregate data, you must recalculate the MOE for the new values. To sum two or more estimates that represent totals, add the estimates together, then square each MOE, add the squares together, and take the square root of that sum to create the new MOE. There are different formulas for calculating derived MOEs for proportions, ratios, and other estimates. See the resources below for additional info.
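The two calculations described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an official tool: the function names are hypothetical, the tract values in the summing example are made up, and the coefficient of variation is computed the usual way for ACS data, by dividing the MOE by 1.645 (the factor for a 90% confidence level) to get a standard error and then expressing that as a percentage of the estimate.

```python
import math

Z_90 = 1.645  # factor for a 90% confidence level, the level used for ACS MOEs


def coefficient_of_variation(estimate, moe):
    """Return the CV as a percentage: standard error relative to the estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    standard_error = moe / Z_90
    return standard_error / estimate * 100


def sum_with_moe(estimates, moes):
    """Sum estimates of totals; the new MOE is the root of the summed squared MOEs."""
    total = sum(estimates)
    new_moe = math.sqrt(sum(m ** 2 for m in moes))
    return total, new_moe


# CV for the Providence enrollment estimate given in the text
cv = coefficient_of_variation(25912, 1100)
print(round(cv, 2))  # 2.58

# Made-up values for two areas being combined (illustration only)
total, new_moe = sum_with_moe([1000, 2000], [30, 40])
print(total, new_moe)  # 3000 50.0
```

A lower CV indicates a more reliable estimate. Note that the combined MOE (50) is smaller than the simple sum of the two MOEs (70), which is one reason aggregating areas tends to improve relative precision.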