Using data competently has proven to be a path to success for organizations in many fields. In business, it has meant competitive advantage, innovation, and profit. To achieve these benefits, however, companies need to understand and apply different kinds of data analysis and handling practices. One important distinction is between primary and secondary data analysis. The importance of collecting new data is often, and rightly, stressed. So let’s look closer at why it’s vital to use secondary data as well, and what benefits its analysis can bring.

What is secondary data?

Secondary data refers to information that has already been collected, recorded, and published by another entity—typically for their own original purpose—and is later accessed and repurposed by others for new analyses. Unlike primary data, which researchers collect firsthand to answer a specific question, secondary data is pre-existing and available prior to the current research.

This type of data is often the result of primary research conducted by someone else, which makes the new user a secondary user of that data. It can be obtained through public or commercial channels and may be either free or purchased. Importantly, secondary data collection doesn't involve generating new information, but rather sourcing, evaluating, and applying existing data, such as government statistics, academic studies, internal company records, or large-scale web data.

Primary data vs. secondary data

The difference between primary and secondary data is not only a matter of source type or whether the data has been used before. The two types usually differ in features that have important implications when choosing which type of analysis to conduct.

Data collected for primary research is raw data that can be structured according to the goals of the analysis. Secondary data has usually already been structured or processed, often more than once, so it initially arrives in a form that was meant to suit some other purpose.

Qualitative data is more often used in primary research. Secondary research is more associated with quantitative data, such as administrative data or census data, often studied by social scientists. However, there are also valid qualitative research methods that can be applied to secondary data in marketing research or other business-relevant analysis. The table below summarizes how primary, secondary, and third-party data compare, including their main benefits and limitations.

| Aspect | Primary Data | Secondary Data | Third-Party Data |
|---|---|---|---|
| What is it? | Data you collect yourself for your specific purpose | Existing data originally collected for other purposes | Data collected and sold by external organizations |
| Examples | Your own surveys, interviews, experiments | Government reports, academic studies, public records | Consumer databases purchased from data brokers |
| Main benefits | Tailored to your exact needs; you control quality | Time and cost efficient; often larger sample sizes | Access to large datasets without collection effort |
| Limitations | Expensive and time-consuming to collect | May not perfectly fit your research questions | Limited transparency about collection methods |
| When to use | When you need specific information not available elsewhere | When suitable existing data is available for your needs | When you need broad market insights quickly |

Secondary data sources

Primary research is done with data collected directly from original, first-hand sources. This means that, for example, researchers conduct interviews or carry out field tests to get the data for the analysis.

Sources of secondary data, on the other hand, don’t need to be first-hand. Any information collected for any purpose can be a source for secondary data analysis. Naturally, this means that there are many such sources.

For businesses and other organizations, all these sources can be divided into internal and external. Internal sources are those that come from within the organization. For example, researchers may use existing data from accounting, customer feedback, or operational reports when doing marketing research to improve a firm’s marketing strategies. This data is still secondary as it was originally recorded for other purposes, but as it originates within the same company as the marketing research itself, it’s internal data.
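
As a rough illustration of reusing internal data, the sketch below takes a few invoice records of the kind an accounting system might export and regroups them by customer segment for a marketing question. The field names (`customer_segment`, `amount`, and so on) and the sample values are hypothetical, made up only for this example.

```python
from collections import defaultdict

# Hypothetical accounting export: one row per invoice, recorded for bookkeeping.
invoices = [
    {"invoice_id": 1, "customer_segment": "retail", "month": "2024-01", "amount": 120.0},
    {"invoice_id": 2, "customer_segment": "wholesale", "month": "2024-01", "amount": 900.0},
    {"invoice_id": 3, "customer_segment": "retail", "month": "2024-02", "amount": 80.0},
    {"invoice_id": 4, "customer_segment": "retail", "month": "2024-02", "amount": 150.0},
]


def revenue_by_segment(rows):
    """Re-aggregate invoice rows by customer segment for a marketing analysis."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer_segment"]] += row["amount"]
    return dict(totals)


segment_revenue = revenue_by_segment(invoices)
print(segment_revenue)
```

The records themselves are unchanged. Only the grouping differs from what accounting needed, which is typical of working with internal secondary data.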

All other sources, those that are outside of the organization, are external sources of secondary data. Of course, this group of sources is extensive and varies immensely. Here are some of the most common examples of such sources.

| Source | Availability | Cost | Reliability |
|---|---|---|---|
| Governmental websites and databases | High | Free | Very High |
| Academic research | Moderate to High | Free to Low | High |
| Industry reports | Moderate | Medium to High | High |
| Social media & web data | Very High | Low to Medium | Variable |
| Commercial data providers | High | Medium to High | Very High |

How to evaluate the quality of secondary data

  • Authoritativeness: Is the data published by a trusted, recognized institution?
  • Data freshness: Is the data current or outdated?
  • Source credibility: What is the origin of the data, and is it peer-reviewed or validated?
  • Method of collection: How was the data originally gathered, and does it align with your research needs?

[Figure: primary data vs. secondary data]

Advantages of secondary research

Saving time and effort

Collecting secondary data for research is much faster and easier than primary data collection. This allows researchers to save time by going straight to the analysis process. Additionally, researchers stay focused on the research goals without having to worry about finding and utilizing primary sources, which can be a lot of work on its own.

Cost-effectiveness

Secondary research is generally the cheaper option. It is quite costly to organize focus groups, hire interviewers to survey the people of interest, or build and maintain sensors able to record large amounts of data. Meanwhile, secondary data may cost next to nothing, as much of it is already available and easily accessible for free, for example through public libraries. Even when such data is not enough and one has to turn to data providers or otherwise pay for secondary data, it is usually still cheaper than primary data collection.

Cleaned and structured data

Secondary data has often been cleaned before being used for its original purpose. This means that the data already meets at least some data quality standards. Freshly gathered primary data, by contrast, may have many quality issues, so researchers have to spend additional resources cleaning it. Additionally, secondary data is usually structured. As mentioned, that structure may not suit the particular requirements of the research at hand, but it does bring some organization and readability, which can save time.

Large volume of data

Finally, there’s only so much primary data that researchers can collect before having to start the actual analysis. With secondary data, there’s no such limit. There is more information available in secondary sources than one could handle in a lifetime of data analysis. Thus, secondary data researchers certainly don’t have many restrictions on what sources to choose from.

Disadvantages of secondary research

Differing requirements

The biggest among the disadvantages of secondary data research is that one can’t quite be sure that the data will suit the goals of the research exactly. Primary data analysts can gather exactly what they need. Secondary researchers, on the other hand, work with what they were able to find from what is available.

Control over the collection process

Secondary data analysts can’t be completely sure that the data was collected according to rigorous standards and is therefore valid and representative. They may check the source and try to find out as much about the collection as possible, but there will always be a degree of uncertainty.

Lacking uniqueness

Primary researchers work on unique data that no one else has had before. Therefore they have a greater chance of arriving at unique insights. Secondary data analysis can be unique too, but only for as long as no one else uses the same data for the same research purposes.

[Figure: disadvantages of secondary research]

Five metrics for evaluating and analyzing secondary data

The first step of secondary data analysis is the evaluation of data. Although, as mentioned, it’s impossible to have complete quality control over secondary data, researchers can still exercise some control. The following criteria are crucial when evaluating secondary data in order to determine their suitability for the analysis at hand.

  1. Reliability of the source
    How trustworthy is the data source? Is it a reputable data provider or an established publisher? Researchers should also find out as much as possible about the circumstances of data collection.
  2. Relevance
    Not all trustworthy information is relevant data for a particular analysis. Researchers must first establish clear analysis goals to determine data relevance and then check what kind of information particular data sources hold.
  3. Overall quality
    Of course, analysts need to pay attention to any errors, redundancies, or other possible issues with the data they’re considering using. By some estimates, poor data quality costs businesses between $9.7 million and $14.2 million every year.
  4. Freshness
    How new is the data? When was it last updated? Outdated information may no longer answer the questions raised by the analysis goals.
  5. Accessibility
    The format of the data and how it is accessed are also pivotal for data analysis. The easier it is to access data, the more efficient and reliable secondary research will be.
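
Some of these criteria, notably overall quality and freshness, can be partly checked in code before any analysis starts. The following is a minimal sketch, assuming the dataset is already loaded as a list of dictionaries. The function name `evaluate_dataset`, the one-year freshness threshold, and the sample records are illustrative assumptions, not established standards.

```python
from datetime import date


def evaluate_dataset(rows, key_fields, last_updated, today, max_age_days=365):
    """Return simple quality and freshness indicators for a list of dict records."""
    total = len(rows)

    # Errors: count records with at least one missing (None or empty) value.
    incomplete = sum(
        1 for r in rows if any(v is None or v == "" for v in r.values())
    )

    # Redundancies: count records whose key fields repeat an earlier record.
    seen = set()
    duplicates = 0
    for r in rows:
        key = tuple(r.get(f) for f in key_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

    # Freshness: days since the source says the data was last updated.
    age_days = (today - last_updated).days

    return {
        "records": total,
        "incomplete_share": incomplete / total if total else 0.0,
        "duplicate_share": duplicates / total if total else 0.0,
        "age_days": age_days,
        "is_fresh": age_days <= max_age_days,  # threshold is an assumption
    }


# Hypothetical sample records, for illustration only.
sample = [
    {"id": 1, "region": "North", "revenue": 100},
    {"id": 2, "region": "", "revenue": 250},
    {"id": 2, "region": "", "revenue": 250},
    {"id": 3, "region": "South", "revenue": None},
]

report = evaluate_dataset(
    sample,
    key_fields=["id"],
    last_updated=date(2023, 1, 1),
    today=date(2024, 1, 1),
)
print(report)
```

Source reliability and relevance still call for human judgment. A report like this only flags the issues that can be measured mechanically.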

The importance of secondary data analysis in business

For years, business leaders and data analysts have lamented that most data never gets analyzed. For example, a few years ago, it was estimated that only about 0.5% of all data is ever analyzed and utilized.

Having this in mind, one can’t help but wonder whether it’s worth spending money on additional data production when so much existing data never gets used. Of course, primary research is often necessary, for example, when new qualitative data is required, but it is equally important not to overlook the potential of secondary data.

Especially when it comes to quantitative data, the large volumes of public web data already available suggest starting with secondary research and turning to primary research where existing data leaves gaps. Combining the two research methods in this way is the surest way for businesses to benefit from data analysis.

Wrapping up

Researchers can either collect new data for analysis or get secondary data from some of the many diverse sources. Whichever path is chosen, the key to success and business benefits is, as always, attention to data quality and choosing the right method for the right goals.