Data Types and Sources

Many potential researchers are unsure where they can obtain data to begin their research and analysis. We should first divide the different types of data into two major classifications.

Primary Data

When someone refers to "primary data," they mean data collected by the researchers themselves. This is data that has never been gathered before, whether in a particular way or at a certain period of time. Researchers tend to gather this type of data when what they want cannot be found from outside sources. You can tailor your data questions and collection to fit the needs of your research questions. Collecting primary data can be extremely costly and, if you are associated with a college or institute, requires permission and authorization. Issues of consent and confidentiality are of extreme importance. In practice, primary data collection should come after a review of secondary data: consult existing information and data first so that you are informed about what has already been discovered on your research topic.

Secondary Data

If the time or hassle of collecting your own data is too much, or the data collection has already been done, secondary data may be more appropriate for your research. This type of data typically comes from studies done by other institutions or organizations. Secondary data is no less valid than primary data, but you should be well informed about how it was collected. A number of free services are available online, and many others are made available through your status as a current BYU student.

Data Sources

Harvard Dataverse (Dataverse)

A community of Harvard and worldwide researchers who share data in a communal repository.

Inter-University Consortium for Political & Social Research (ICPSR)

This online data archive provided by the Institute for Social Research at the University of Michigan is free to all current BYU students. With the help of over 700 academic institutions and research organizations, ICPSR has over 500,000 data files relating to social science fields including education, aging, criminal justice, substance abuse, and terrorism.

To register to download data from ICPSR, you first need to be at an on-campus computer, because access is granted based on the campus IP address. Go to https://www.icpsr.umich.edu/ticketlogin and select Create Account. Fill out the information, and then you can search for data that is relevant to your research.

The Institute for Quantitative Social Science (IQSS Dataverse Network)

An open-source service, this "dataverse" network is provided by The Institute for Quantitative Social Science (IQSS) at Harvard University with over 300 "dataverses" and nearly 650,000 data files available for download.

U.S. Census Bureau (DataFerrett)

Provided on behalf of the United States Census Bureau, DataFerrett lets patrons download data from dozens of government surveys, including the American Community Survey (ACS), the Decennial Census of Population and Housing (1990 and 2000 available), the National Health and Nutrition Examination Survey (NHANES), and the Survey of Income and Program Participation (SIPP). Pop-up blockers must be turned off to run the DataFerrett application, which also requires a download to your computer. You will need to use either Microsoft Internet Explorer or Mozilla Firefox as your browser; Google Chrome will not work with this platform.

General Social Survey (GSS)

From their website: "The GSS contains a standard 'core' of demographic, behavioral, and attitudinal questions, plus topics of special interest. Many of the core questions have remained unchanged since 1972 to facilitate time-trend studies as well as replication of earlier findings. The GSS takes the pulse of America, and is a unique and valuable resource. It has tracked the opinions of Americans over the last four decades."

Integrated Public Use Microdata Series (IPUMS)

From Wikipedia: "Integrated Public Use Microdata Series (IPUMS) is the world's largest individual-level population database. IPUMS consists of microdata samples from United States (IPUMS-USA) and international (IPUMS-International) census records. The records are converted into a consistent format and made available to researchers through a web-based data dissemination system. Additional databases in the IPUMS family include: the North Atlantic Population Project, the National Historical Geographic Information System, the Integrated Health Interview Series (IHIS), and the Integrated Public Use Microdata Series-Current Population Survey (IPUMS-CPS)."

The Association of Religion Data Archives (ARDA)

From their website: "The Association of Religion Data Archives (ARDA) strives to democratize access to the best data on religion. Founded as the American Religion Data Archive in 1997 and going online in 1998, the initial archive was targeted at researchers interested in American religion. The targeted audience and the data collection have both greatly expanded since 1998, now including American and international collections and developing features for educators, journalists, religious congregations, and researchers. Data included in the ARDA are submitted by the foremost religion scholars and research centers in the world."

Common Core of Data (CCD)

From their website: "The Common Core of Data (CCD) is a program of the U.S. Department of Education's National Center for Education Statistics that annually collects fiscal and non-fiscal data about all public schools, public school districts and state education agencies in the United States. The data are supplied by state education agency officials and include information that describes schools and school districts, including name, address, and phone number; descriptive information about students and staff, including demographics; and fiscal data, including revenues and current expenditures."

EconData.net (EconData)

From their website: "We have 1,000 links to socioeconomic data sources, arranged by subject and provider, pointers to the Web's premiere data collections, and our own list of the ten best sites for finding regional economic data."

World Bank Data (WorldBank)

From their website: "At the World Bank, the Development Data Group coordinates statistical and data work and maintains a number of macro, financial and sector databases. These databases are used by teams to prepare Country Assistance Strategies, poverty assessments, research studies and other forms of economic and sector work. This site is meant to provide all users with improved access to World Bank data and to make that data easy to find and use."

Panel Study of Income Dynamics (PSID)

From their website: "The study began in 1968 with a nationally representative sample of over 18,000 individuals living in 5,000 families in the United States. Information on these individuals and their descendants has been collected continuously, including data covering employment, income, wealth, expenditures, health, marriage, childbearing, child development, philanthropy, education, and numerous other topics."

Statistics in Sports: Sports Data Resources (AMSTAT)

This site, hosted by the American Statistical Association, lists sports data resources around the web.