Cohort and Data

FinnGen uses biobank samples and health register data

FinnGen has collected and correlated genome and longitudinal health data of more than 500 000 Finns, which is almost 10% of the Finnish population. FinnGen has used samples from existing biobank collections and received new samples from biobanks to produce genotype data. Data from health registers has been used to define clinical endpoints. FinnGen data therefore allows the study of associations between phenotypes and genotypes. In addition to imputed genotype data, NGS data is available for some of the individuals. The cohort is intentionally enriched with disease cases, as the samples have been collected from hospital biobanks. A subset of samples, collected through the Finnish Red Cross Blood Service Biobank, represents healthy individuals.

The final FinnGen cohort consists of over 500 000 individuals

The combined number of legacy samples and newly collected samples is 520 000. The median age of the participants at the time of donation was 53 years; 43% are men and 57% are women.


FinnGen Data

FinnGen is a biobank study

This means that the study did not recruit participants specifically for this research project; instead, the samples FinnGen utilises were collected by Finnish biobanks.

Read more about the recruitment

Clinical endpoints

During FinnGen, a significant effort has been put into creating meaningful clinical endpoints based on the digital health record data from Finnish health registries. 

More information about the endpoint definitions

Genetic data

Genome variant data from most of the samples has been produced using a customised genotyping chip with about 700 000 markers combined with imputation. NGS data is available for some of the individuals.

Read more about the genetic data

Health register data

Most of the phenotype data in FinnGen comes from the national health registers, which cover the entire lifespan of the study subjects. The data is drawn from more than 10 registers.

Read more about the health registers

FinnGen is expanding to other -omics and clinical data

Through the expansion areas of FinnGen 2 and during FinnGen 3 (2023-2027), the data resource will be expanded to include other omics data (proteomics, metabolomics and single-cell ATAC sequencing), clinical data, and laboratory values from the national Kanta register.

Read more about the laboratory values

The data added from the Kanta register include all basic laboratory tests used in medical care as well as a number of less common laboratory results for more than 400 000 FinnGen participants.

Read more about additional phenotype data

FinnGen has supplemented register data with additional phenotype data, including clinical and questionnaire data from a subset of individuals.

Read more about other biological data

FinnGen is expanding to generate other biological data types, such as proteomics, for a subset of its participants.