Developer community and repository data

  • GitHub, Docker Hub, and more sources
  • Parsed, clean & accurate data
  • Delivery in JSON
  • Organization, projects, location and many other data points
  • Valuable for investment, market research, and HR intelligence
4 total data sources

874M total data records


Always fresh, updated data

Top repository data sources


GitHub

GitHub data consists of over 866M records and provides you with data points such as the developer's username, location, hireability, projects' information, and more.

Docker Hub

Docker Hub data consists of over 2M records and provides you with data points such as the developer's name, location, company, repository information, and more.

Stay ahead of the game with fresh web data

Coresignal's data helps companies achieve their goals

Why do you need community and repository data?

Community and repository data lets you find software projects and top IT talent working with programming languages such as Python, Java, and C++. We offer data from coding, web development, app development, and other software development communities. Use this information to source IT talent or to find investment opportunities by analyzing software projects.

The data is collected from public web sources such as GitHub, Docker Hub, Kaggle, and Stack Exchange.

Main data fields

Here are some examples of the data fields you will find in our community and repository data.

Information | Description | Example values*
company_name | Name of the company where the person is employed | X Software
location | Highlights the location of the person | Netherlands
repos_summary | Data block on repositories including names, descriptions, sizes, watcher counts, licenses, programming languages, etc. | NA
scripts_summary | Data block on scripts by the user | NA
datasets_summary | Data block on datasets by the user | NA
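
As a rough illustration of how the fields above might be used, the Python sketch below filters a record by location and programming language. The top-level field names come from the table; the nested layout of repos_summary, the sample values inside it, and the helper names languages_used and matches are assumptions made for this example, not the actual delivered schema.

```python
import json

# Hypothetical record: the top-level field names come from the table above;
# the nested layout of repos_summary is an assumption for illustration only.
SAMPLE_RECORD = """
{
  "company_name": "X Software",
  "location": "Netherlands",
  "repos_summary": [
    {"name": "example-repo", "language": "Python", "watchers": 12}
  ],
  "scripts_summary": null,
  "datasets_summary": null
}
"""


def languages_used(record: dict) -> set:
    """Collect the programming languages listed in a record's repos_summary."""
    repos = record.get("repos_summary") or []
    return {repo["language"] for repo in repos if repo.get("language")}


def matches(record: dict, location: str, language: str) -> bool:
    """Return True if the record is in `location` and has a repo in `language`."""
    return record.get("location") == location and language in languages_used(record)


record = json.loads(SAMPLE_RECORD)
print(matches(record, "Netherlands", "Python"))
```

A real dataset may structure the summary blocks differently, so the key lookups would need to be adjusted to the schema you receive.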

Hire the top tech talent

Find the best IT specialists for your company by leveraging our community and repository data. Sourcing tech talent can be difficult in today's competitive environment. Save time and resources by finding talent in developer communities.

Find the fastest-growing software companies

Once you find a talented and motivated developer, you can see what company they work for. Such data might indicate that the company has the potential for fast growth if the developer keeps their pace. In turn, it signals a potential investment opportunity. For a more in-depth view of the company, you could also opt for firmographic data.

Data from the largest tech communities

The data is gathered from the world's largest repositories and communities of developers, data scientists, programmers, and other IT professionals. In this dataset, you can find top tech talent and new software projects, and even discover new investment opportunities.

Community and repository data use cases


Identify and track noteworthy software projects that might be ready for the next stage of growth.


Source the perfect candidate for your project by analyzing similar projects and their communities.

Data delivery


Tell us what you need

First, we discuss your specific needs. Optionally, we can offer a sample dataset. Then, you can either request the full dataset or data specific to selected countries and regions.


Get the requested data

The requested data is then uploaded in CSV or JSON formats as a web link or a file, directly to your preferred data storage.
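
Once a file arrives, either format can be read into the same in-memory shape. The Python sketch below is one way to do that with the standard library. It assumes that JSON is delivered either as a single array or as one object per line, and that CSV files carry a header row; neither detail is confirmed here, and load_records is a hypothetical helper name.

```python
import csv
import io
import json


def load_records(text: str, fmt: str) -> list:
    """Parse delivered text into a list of dicts.

    Assumes JSON arrives either as one array or as one object per line,
    and that CSV has a header row. Both are assumptions.
    """
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        stripped = text.strip()
        if stripped.startswith("["):
            return json.loads(stripped)
        # Fall back to one JSON object per non-empty line.
        return [json.loads(line) for line in stripped.splitlines() if line.strip()]
    raise ValueError(f"unsupported format: {fmt}")


# Tiny in-memory samples using field names from the data fields table.
csv_text = "company_name,location\nX Software,Netherlands\n"
json_text = '{"company_name": "X Software", "location": "Netherlands"}\n'

csv_records = load_records(csv_text, "csv")
json_records = load_records(json_text, "json")
print(csv_records == json_records)
```

Normalizing both formats to a list of dictionaries means downstream filtering code does not need to know which format was delivered. Note that CSV values are always read as strings, so numeric fields would need explicit conversion.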


Keep it fresh

Outdated data loses relevance. With Coresignal, get monthly or quarterly data updates.

But don't just take our word for it. Listen to our clients.

Venture capital client

"Coresignal has strong demographic and firmographic datasets both on quality and volume while keeping the data as fresh as it can be. We've been using Coresignal for years and we can only speak highly about the product and team behind it. Highly recommended."

Lead generation client

"We are using Coresignal to enrich our AI platform for Sales Pipeline Growth. We proactively recommend sales-ready opps, interested buyers, warm intros, and trusted actions, which results in +25% in net new pipeline in 2 months, and +40% after 6 months."

Venture capital client

"Before we started working with Coresignal, the percentage of investments that we made that had data influence was around 2% and currently it’s around 65%."

Why Coresignal?


In the market since 2016

Our team includes some of the most experienced web data extraction professionals.

Discover new and promising tech projects

In community and repository data you can find innovative projects that may redefine existing standards.

Follow open-source projects

If you discover a new open-source project that may bring you profit, you can buy and monetize it.

Find content gaps in developer topics

Developers share a lot of information in these communities, which may help you discover content gaps.

Monitor software project growth

Regularly check the developers that caught your interest and see how their projects progress over time.


Track trending topics among developers

See what developers are talking about and what topics or projects seem to be gaining traction.

Frequently asked questions

What is a developer community?

A developer community is a place where developers share their projects, knowledge, progress, and advice, among other things.

What are Coresignal's developer community and repository data sources?

Coresignal's developer community and repository data sources include GitHub, Docker Hub, Kaggle, and Stack Exchange.

Where to find tech talent?

You can find tech talent in community and repository data or employee data.

How is community and repository data collected?

We collect community and repository data from various public web sources and store it in several databases. Each data source has its own dataset of community and repository records.

Who uses community and repository data?

Coresignal’s community and repository data is used by investors and HR platforms to generate investment signals and source talent.

How secure is the data?

Data security is one of our main priorities. We store data in a protected dataset to avoid breaches and leaks of sensitive information.

Unlock new business opportunities with Coresignal. Let’s get in touch.

Coresignal © 2023 All Rights Reserved