Latest revision as of 13:56, 10 June 2025

BiomarkerKB collects data from many different resources. Collected data is not always integrated directly into the data model; sometimes data from a resource is added only as contextual annotations or cross-references.

Other resources to be explored: CADSR Cancer, https://themarker.idrblab.cn/, biomarker.org, ResMarkerDB, SalivaDB, https://glycanage.com/publications, https://www.cancergenomeinterpreter.org/biomarkers, loinc.org (MW effort), EDRN Cancer Biomarkers (EDRN effort)

Please contact us at mazumder_lab@gwu.edu and daniallmasood@gwu.edu if you know of any other resources that may contain biomarker data.

CIViC

Status: Direct Integration into Data Model

Clinical Interpretation of Variants in Cancer (CIViC)
Provides cancer biomarkers in the form of DNA mutations (dbSNPs)
The platform provides clinicians with treatment options for patients based on each patient's unique tumor profile
License: Creative Commons Attribution-NonCommercial 4.0 International License.

ClinVar

Status: Direct Integration into Data Model

Public archive of reports of human variations classified for diseases and drug responses
Provides biomarkers for all diseases, but we have curated only cancer biomarkers so far
- dbSNPs
- The file is very large; we will go back and use an existing script to map all of its biomarkers into the data model
License: Creative Commons Attribution-NonCommercial 4.0 International License.

GWAS

Status: Direct Integration into Data Model

Catalog of published genome-wide association studies (GWAS)
Provides biomarkers in the form of SNPs
GWAS Catalog contains SNPs for a vast number of diseases
- Preliminary curation focused only on cancer
- We will use an existing script to map all biomarkers into the data model
License: Creative Commons Attribution-NonCommercial 4.0 International License.
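
The preliminary cancer-only curation described above can be pictured as a filter over a tab-separated association export. The following is a minimal sketch, not the script BiomarkerKB actually uses: the column names `DISEASE/TRAIT` and `SNPS`, the `;` separator inside the SNP cell, the keyword list, and the placeholder rsIDs in the sample are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical keyword list; a real curation would more likely match ontology terms.
CANCER_KEYWORDS = ("cancer", "carcinoma", "tumor", "leukemia", "lymphoma", "melanoma")


def cancer_snps(tsv_text):
    """Return sorted (rsID, trait) pairs for rows whose trait looks cancer-related."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    pairs = set()
    for row in reader:
        trait = row["DISEASE/TRAIT"]  # assumed column name
        if not any(keyword in trait.lower() for keyword in CANCER_KEYWORDS):
            continue
        # One cell may list several SNPs; the ";" separator is an assumption.
        for snp in row["SNPS"].split(";"):
            snp = snp.strip()
            if snp.startswith("rs"):
                pairs.add((snp, trait))
    return sorted(pairs)


# Placeholder rows for illustration only, not an extract of any real catalog.
sample = (
    "DISEASE/TRAIT\tSNPS\n"
    "Breast cancer\trs0000001\n"
    "Type 2 diabetes\trs0000002\n"
    "Prostate cancer\trs0000003; rs0000004\n"
)

for snp, trait in cancer_snps(sample):
    print(snp, trait, sep="\t")
```

Dropping the keyword check would keep every trait, which corresponds to the planned step of mapping all biomarkers rather than only the cancer subset.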

HPO

Status: Cross-Reference

The Human Phenotype Ontology (HPO) provides disease and entity associations
Does not provide a change within the entity
So we cannot collect biomarker data from here
However, we can use it as a cross-reference within our cross-referencing section
Provides cross-references to OMIM, SNOMED, and MONDO
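
The reasoning above (no reported change in the entity means no biomarker, so the resource can only be cross-referenced) can be written as a small rule. This is a sketch under stated assumptions: `ResourceRecord`, `Status`, and `integration_status` are hypothetical names, not part of the BiomarkerKB data model, and the rule ignores other factors such as licensing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    DIRECT = "Direct Integration into Data Model"
    CROSS_REFERENCE = "Cross-Reference"


@dataclass
class ResourceRecord:
    """Hypothetical shape of one association taken from a source resource."""
    entity: str
    condition: str
    change: Optional[str] = None  # None when the resource reports no change in the entity


def integration_status(record):
    """A record with a change in the entity can be integrated; otherwise it is a cross-reference."""
    return Status.DIRECT if record.change else Status.CROSS_REFERENCE


# An association without a change, as described for HPO above.
association_only = ResourceRecord("GENE_A", "condition X")
# The same association with a change in the entity.
with_change = ResourceRecord("GENE_A", "condition X", change="increased expression")

print(integration_status(association_only).value)
print(integration_status(with_change).value)
```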

MarkerDB

Status: Direct Integration into Data Model

Provides a lot of useful biomarker data and cross-references other resources as well
License: Creative Commons Attribution-NonCommercial 4.0 International License.
Information includes: panel information, abnormal levels of biomarkers by disease, structural information, etc.
Annotations that can be cross-referenced include the above
By cross-referencing, BiomarkerKB will allow users to find more information for specific biomarkers and move towards the goal of being a comprehensive resource for biomarkers

Metabolomics Workbench

Status: Direct Integration into Data Model

Data provided by Metabolomics Workbench

Metabolite biomarkers used in the uniform newborn screening program
Detect treatable disorders
- that are life-threatening or cause long-term morbidity, before they become symptomatic.

OncoKB

Status: Cross-Reference

Provides useful information on drugs and therapy options for different biomarker entities
Also provides information based on what condition the entity is related to
License: A license is required to use OncoKB for commercial and/or clinical purposes, and to access OncoKB data programmatically for academic purposes.
Paid license is required
Cross-referencing biomarkers in BiomarkerKB to the appropriate drug and therapy information is the best solution

OncoMX

Status: Direct Integration into Data Model

Integrated cancer mutation and expression resource for exploring cancer biomarkers
Manual curation effort by GWU and JPL
Over 600 single and panel biomarkers
License: Creative Commons Attribution-NonCommercial 4.0 International License.

OpenTargets

Status: Direct Integration into Data Model

Collects potential drug targets and therapeutic targets
Some effort was required to find the correct biomarker data
1200 biomarkers collected
- dbSNPs related to cancer and other diseases
License: Creative Commons Attribution-NonCommercial 4.0 International License.

PubMed Central Biomarker Gene Set Curation

Status: Direct Integration into Data Model

Data provided by Avi Ma'ayan's LINCS group

This data set was created through manual curation of biomarker gene sets on PubMed Central using the gene sets returned by Rummagene.
Using the search results returned by the Rummagene web server, we manually identified publications that associated different conditions and environmental exposures with biomarker gene sets.
The biomarker gene sets were retrieved by validating the genes mentioned within each of the publications.
The primary use case for this data is to identify biomarker panels/gene sets associated with conditions.

UniProtKB

Status: Direct Integration into Data Model

Can provide biomarker (change in entity), entity, condition, and sampling data
This data is in a text file that has to be fully reviewed to make sure it can be extracted automatically
Contextual information can be imputed if necessary
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
UniProt contains found_in entries and entries that are actual biomarkers
- found_in entries will get a cross-reference
- actual biomarkers will be directly integrated
Manual curation of 56 reviewed entries that mention "biomarker" in the flat text file
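
Candidate entries for this kind of manual curation could be shortlisted by scanning the flat text file for reviewed entries that mention "biomarker". The sketch below assumes the usual UniProtKB flat-file layout (a two-letter line code, data starting at column 6, entries terminated by `//`) and assumes that only comment (`CC`) lines are searched. The function name and the sample entries, including their accessions, are hypothetical.

```python
def biomarker_entries(flat_text, keyword="biomarker"):
    """Return primary accessions of reviewed entries whose CC lines mention the keyword."""
    hits = []
    accession, reviewed, mentions = None, False, False
    for line in flat_text.splitlines():
        if line.startswith("//"):
            # End of entry: keep it only if it is reviewed and mentions the keyword.
            if accession and reviewed and mentions:
                hits.append(accession)
            accession, reviewed, mentions = None, False, False
        elif line.startswith("ID "):
            reviewed = "Reviewed;" in line.split()
        elif line.startswith("AC ") and accession is None:
            # The first accession on the first AC line is the primary one.
            accession = line[5:].split(";")[0].strip()
        elif line.startswith("CC ") and keyword in line[5:].lower():
            mentions = True
    return hits


# Hypothetical entries for illustration only; the accessions are placeholders.
sample = """\
ID   EXAMPLE1_HUMAN          Reviewed;         100 AA.
AC   X00001; X00009;
CC   -!- MISCELLANEOUS: Proposed as a biomarker for condition X.
//
ID   EXAMPLE2_HUMAN          Reviewed;         200 AA.
AC   X00002;
CC   -!- FUNCTION: Example function text.
//
ID   EXAMPLE3_HUMAN          Unreviewed;       300 AA.
AC   X00003;
CC   -!- MISCELLANEOUS: Candidate biomarker.
//
"""

print(biomarker_entries(sample))
```

A keyword hit only flags an entry for review; deciding whether it is a found_in cross-reference or an actual biomarker still requires reading the comment text.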

@@ Line 1: / Line 1: @@
 BiomarkerKB collects data from many different resources. The data that is collected is not always directly integrated into the data model and data from a resource is sometimes just added as valuable contextual annotations or cross references.
-=UniProtKB=
+Other resources to be explored: [https://cadsr.cancer.gov/onedata/Home.jsp CADSR Cancer], https://themarker.idrblab.cn/, biomarker.org, ResMarkerDB, SalivaDB, https://glycanage.com/publications, https://www.cancergenomeinterpreter.org/biomarkers, loinc.org (MW effort), EDRN Cancer Biomarkers (EDRN effort)
+Please contact us at mazumder_lab@gwu.edu and daniallmasood@gwu.edu if you have any other resources that may contain biomarker data
+=CIViC=
+Status: Direct Integration into Data Model
+* Clinical Interpretation of Variants in Cancer (CIViC)
+* Provides cancer biomarkers in form of DNA mutations (dbSNPs)
+* Platform provides clinicians treatment options for patients based on unique tumor profile
+* License: Creative Commons Attribution-NonCommercial 4.0 International License.
+=ClinVar=
+Status: Direct Integration into Data Model
+* public archive of reports of human variations classified for diseases and drug responses
+* Provides biomarker for all disease, but we have only curated cancer biomarkers for now
+** dbSNPs
+** File is really big but will go back and use existing script to map all biomarkers from here into the data model
+* License: Creative Commons Attribution-NonCommercial 4.0 International License.
+=GWAS=
 Status: Direct Integration into Data Model
-* Can provide biomarker (change in entity), entity, condition, and sampling data
+* published genome-wide association studies (GWAS)
-* This data is in a text file that has to be reviewed fully and to make sure it will be able to be automatically extracted
+* Provides biomarkers in form of SNPs
-* Contextual information can be imputed if necessary
+* GWAS Catalog contains SNPs for a vast amount of diseases
-* License is Creative Commons Attribution 4.0 International (CC BY 4.0)
+** Preliminary curation only focused on cancer
-* In UniProt there are found_in and entries that are actual biomarkers
+** Will use existing script to map all biomarkers into data model
-** found_in will get an cross reference
+* License: Creative Commons Attribution-NonCommercial 4.0 International License.
-** actual biomarkers will be directly integrated
+=HPO=
+Status: Cross-Reference
+* HPO provides disease and entity associations
+* Does not provide a change within the entity
+* So we cannot collect biomarker data from here
+* However we can use it as a cross reference within our cross referencing section
+* Provides cross reference to OMIM, SNOMED, and MONDO
 =MarkerDB=
-Status: Cross Reference
+Status: Direct Integration into Data Model
 * Provides a lot of useful biomarker data and cross-references other resources as well
 * License: Creative Commons Attribution-NonCommercial 4.0 International License.
 * Information includes: panel information, abnormal levels of biomarkers by disease, structural information, etc
-* Annotations that can be cross reference include the above
+* Annotations that can be cross-referenced include the above
 * By cross-referencing, BiomarkerKB will allow users to find more information for specific biomarkers and move towards the goal of being a comprehensive resource for biomarkers
+=Metabolomics Workbench=
+Status: Direct integration into the model
+''Data provided by Metabolomics Workbench''
+* Metabolite biomarkers utilized in the uniform newborn screening program
+* detect treatable disorders
+** that are life threatening or having long-term morbidity, before they become symptomatic.
 =OncoKB=
@@ Line 30: / Line 70: @@
 * Cross reference from biomarkers in BiomarkerKB to the appropriate drug information and therapy information is the best solution
+=OncoMX=
+Status: Direct Integration into Data Model
+* integrated cancer mutation and expression resource for exploring cancer biomarkers
+* Manual curation effort by GWU and JPL
+* Over 600 single and panel biomarkers
+* License: Creative Commons Attribution-NonCommercial 4.0 International License.
+=OpenTargets=
+Status: Direct Integration into Data Model
-Other resources to be explored: [https://cadsr.cancer.gov/onedata/Home.jsp CADSR Cancer], https://themarker.idrblab.cn/, biomarker.org, ResMarkerDB, SalivaDB
+* Collects potential drug targets and therapeutic targets
+* Some effort was required to find the correct biomarker data
+* 1200 biomarkers collected
+** dbSNPs related to cancer and other disease
+* License: Creative Commons Attribution-NonCommercial 4.0 International License.
+=PubMed Central Biomarker Gene Set Curation=
+Status: Direct Integration into Data Model
-Please contact us at mazumder_lab@gwu.edu if you have any other
+''Data provided by Avi Ma'ayan's LINCS group''
+* This data set was created through manual curation of biomarker gene sets on Pubmed Central using the results of gene sets returned from Rummagene.
+* Using the outputted search results within the Rummagene web server, we manually identified publications that associated different conditions and environmental exposures to biomarker gene sets.
+* The biomarker gene sets were retrieved through the validation of the gene mentioned within each of the publications.
+* The primary use case for this data is to identify biomarker panels/ gene sets associated with conditions.
+=UniProtKB=
+Status: Direct Integration into Data Model
+* Can provide biomarker (change in entity), entity, condition, and sampling data
+* This data is in a text file that has to be reviewed fully and to make sure it will be able to be automatically extracted
+* Contextual information can be imputed if necessary
+* License is Creative Commons Attribution 4.0 International (CC BY 4.0)
+* In UniProt there are found_in and entries that are actual biomarkers
+** found_in will get an cross reference
+** actual biomarkers will be directly integrated
+*Manual curation of 56 reviewed entries with mention of "biomarker" in flat text file

BiomarkerKB Resource Integration: Difference between revisions
