<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.biomarkerkb.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=RajaMazumder</id>
	<title>BiomarkerKB Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.biomarkerkb.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=RajaMazumder"/>
	<link rel="alternate" type="text/html" href="https://wiki.biomarkerkb.org/Special:Contributions/RajaMazumder"/>
	<updated>2026-05-08T14:14:08Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.42.3</generator>
	<entry>
		<id>https://wiki.biomarkerkb.org/index.php?title=Data_Release_Notes&amp;diff=148</id>
		<title>Data Release Notes</title>
		<link rel="alternate" type="text/html" href="https://wiki.biomarkerkb.org/index.php?title=Data_Release_Notes&amp;diff=148"/>
		<updated>2025-12-15T17:41:14Z</updated>

		<summary type="html">&lt;p&gt;RajaMazumder: /* Versioning Format */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Versioning Format ==&lt;br /&gt;
The versioning format follows a three-part structure: X.Y.Z.&lt;br /&gt;
* The first number (X) changes when a major update is introduced, such as changes in the data model.&lt;br /&gt;
* The second number (Y) increments when new data is added.&lt;br /&gt;
* The third number (Z) is updated for bug fixes or minor changes.&lt;br /&gt;
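As an illustrative sketch only (not part of the BiomarkerKB codebase), the X.Y.Z scheme described above could be parsed and compared numerically in Python. The function name parse_version is hypothetical, and the version strings are taken from the release headings on this page.&lt;br /&gt;

```python
def parse_version(version):
    """Split an X.Y.Z string into a (major, data, fix) tuple of integers."""
    parts = version.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected X.Y.Z, got " + repr(version))
    major, data, fix = (int(p) for p in parts)
    return (major, data, fix)


# Versions taken from the release headings on this page.
releases = ["1.0.6", "2.1.0", "2.0.2", "1.0.0"]

# Sorting by the parsed tuple orders releases numerically, not alphabetically.
ordered = sorted(releases, key=parse_version)
latest = ordered[-1]
print(latest)
```

Comparing integer tuples rather than raw strings keeps the ordering correct if any part ever reaches two digits.&lt;br /&gt;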
&lt;br /&gt;
== Version 2.1.0 ==&lt;br /&gt;
Date: December 11, 2025&lt;br /&gt;
=== Data Updates ===&lt;br /&gt;
* Added the LLM-extracted glycan biomarker dataset provided by Cyrus Chun Hong Au Yeung.&lt;br /&gt;
=== Backend and Infrastructure Updates ===&lt;br /&gt;
* The incorrect download links on the [https://data.biomarkerkb.org Data Portal] have been fixed.&lt;br /&gt;
* LOINC codes are no longer tied to specimen IDs.&lt;br /&gt;
&lt;br /&gt;
== Version 2.0.2 ==&lt;br /&gt;
Date: December 4, 2025&lt;br /&gt;
=== Bug Fixes ===&lt;br /&gt;
* LOINC codes are no longer tied to specimen (UBERON) IDs.&lt;br /&gt;
* For biomarkers that could not be mapped to the [[Controlled Vocabulary and Keywords|Controlled Vocabulary]], the original biomarker name is displayed, followed by &amp;quot;in review&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== Version 2.0.1 ==&lt;br /&gt;
=== Data Updates ===&lt;br /&gt;
* Added cross-references to the Common Fund Data Ecosystem ([https://commonfund.nih.gov/dataecosystem CFDE]) Data Coordinating Centers and other resources:&lt;br /&gt;
** [https://www.gtexportal.org/home/ GTEx]&lt;br /&gt;
** [https://pharos.nih.gov/ Pharos]&lt;br /&gt;
** [https://reactome.org/ Reactome]&lt;br /&gt;
** [https://undiagnosed.hms.harvard.edu/ Undiagnosed Diseases Network]&lt;br /&gt;
** [https://idg.reactome.org/ Illuminating the Druggable Genome (IDG) Reactome Portal]&lt;br /&gt;
** [https://www.metabolomicsworkbench.org/ Metabolomics Workbench]&lt;br /&gt;
** [https://maayanlab.cloud/sigcom-lincs SigCom LINCS]&lt;br /&gt;
&lt;br /&gt;
== Version 2.0.0 ==&lt;br /&gt;
=== Data Updates ===&lt;br /&gt;
* The biomarker field is now standardized using controlled vocabulary terms.&lt;br /&gt;
* Added metabolite as an &amp;lt;code&amp;gt;assessed_entity_type&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;mw_loinc_biomarkers.tsv&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Added [https://rnacentral.org/ RNAcentral] cross-reference support.&lt;br /&gt;
* Added normal-range data from Electronic Health Records (Oracle Health) for Troponin I as an example.&lt;br /&gt;
&lt;br /&gt;
== Version 1.0.6 ==&lt;br /&gt;
=== Data Updates ===&lt;br /&gt;
* Added a new dataset: MW LOINC biomarkers (&amp;lt;code&amp;gt;mw_loinc_biomarkers.tsv&amp;lt;/code&amp;gt;).&lt;br /&gt;
* Added [https://ncithesaurus.nci.nih.gov/ National Cancer Institute Thesaurus] and [https://www.rcsb.org/ Protein Data Bank] cross-references.&lt;br /&gt;
=== Backend and Infrastructure Updates ===&lt;br /&gt;
* Added the &amp;lt;code&amp;gt;display_name&amp;lt;/code&amp;gt; field to the &amp;lt;code&amp;gt;format-converter&amp;lt;/code&amp;gt; so data source names appear with correct casing.&lt;br /&gt;
&lt;br /&gt;
== Version 1.0.5 ==&lt;br /&gt;
=== Data Updates ===&lt;br /&gt;
* Updated the Troponin biomarker value &amp;lt;code&amp;gt;assessed_biomarker_entity&amp;lt;/code&amp;gt; for consistency.&lt;br /&gt;
* Added normal ranges from Electronic Health Records provided by the University of New Mexico for Troponin biomarkers.&lt;br /&gt;
* Added Cell Ontology and Protein Ontology cross-references.&lt;br /&gt;
=== Backend and Infrastructure Updates ===&lt;br /&gt;
* Updated all script paths to use &amp;lt;code&amp;gt;data_source.conf&amp;lt;/code&amp;gt; and validated data source names.&lt;br /&gt;
&lt;br /&gt;
== Version 1.0.4 ==&lt;br /&gt;
This release introduces new datasets, cross-references, and bug fixes.&lt;br /&gt;
=== Data Updates ===&lt;br /&gt;
* Added Cancer Genome Interpreter data on cancer biomarkers from MetaKB.&lt;br /&gt;
* Added Metabolomics Workbench LOINC data on metabolite biomarkers.&lt;br /&gt;
* Added Cell Ontology and Protein Ontology cross-references.&lt;br /&gt;
=== Bug Fixes ===&lt;br /&gt;
* Fixed an issue where cookie preferences were not saved when selecting &amp;quot;Allow&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== Version 1.0.3 ==&lt;br /&gt;
This release introduces new cross-references and updates to ensure compatibility with external resources.&lt;br /&gt;
=== Data Updates ===&lt;br /&gt;
* NCBI cross-references added across gene biomarker entries.&lt;br /&gt;
* ChEBI cross-references integrated for small molecules and metabolites.&lt;br /&gt;
=== Backend and Infrastructure Updates ===&lt;br /&gt;
* ChEBI API migration: Updated all programmatic links from the legacy SOAP services to the new REST API endpoints, following ChEBI’s platform migration.&lt;br /&gt;
** Old services retired 1 September 2025.&lt;br /&gt;
** New stable API: [https://www.ebi.ac.uk/chebi/backend/api/docs ChEBI REST API docs]&lt;br /&gt;
** New data products and beta interface available at [https://www.ebi.ac.uk/chebi/beta/ ChEBI 2.0].&lt;br /&gt;
&lt;br /&gt;
== Version 1.0.2 ==&lt;br /&gt;
=== Data Updates ===&lt;br /&gt;
* Published updated [https://www.metabolomicsworkbench.org/ Metabolomics Workbench] data.&lt;br /&gt;
* Published sample data from the [https://edrn.nci.nih.gov/ Early Detection Research Network].&lt;br /&gt;
=== Backend and Infrastructure Updates ===&lt;br /&gt;
* &amp;lt;code&amp;gt;evidence_source&amp;lt;/code&amp;gt; database names now retain their original casing for accuracy and consistency.&lt;br /&gt;
* EDRN identifiers were added to the [https://github.com/clinical-biomarkers/format-converter/blob/main/mapping_data/namespace_map.json namespace map].&lt;br /&gt;
* [https://www.genenames.org/ HUGO Gene Nomenclature Committee] (HGNC) was added to the cross-reference JSON file.&lt;br /&gt;
* Fixed an issue where &amp;lt;code&amp;gt;evidence_source&amp;lt;/code&amp;gt; values without tags were previously dropped; these are now preserved.&lt;br /&gt;
* Added a user-guided spelling correction function to improve data entry quality.&lt;br /&gt;
* The TSV-to-JSON converter now automatically checks for header spelling errors.&lt;br /&gt;
* Introduced &amp;lt;code&amp;gt;_suggest_header_corrections&amp;lt;/code&amp;gt; to flag and propose fixes for misspelled headers.&lt;br /&gt;
* Enhanced &amp;lt;code&amp;gt;_stream_tsv&amp;lt;/code&amp;gt; with a call to &amp;lt;code&amp;gt;_check_header_spelling&amp;lt;/code&amp;gt; to prevent invalid headers from being processed.&lt;br /&gt;
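A minimal sketch of how header spell-checking of this kind could work, using difflib from the Python standard library. This is not the format-converter's actual implementation: the function suggest_header_corrections, the EXPECTED_HEADERS list, and the 0.8 similarity cutoff are all assumptions made for illustration, and treating the field names mentioned on this page as TSV headers is likewise an assumption.&lt;br /&gt;

```python
import difflib

# Hypothetical list of valid headers; only names mentioned on this page are used.
EXPECTED_HEADERS = ["assessed_biomarker_entity", "assessed_entity_type", "evidence_source"]


def suggest_header_corrections(headers, expected=EXPECTED_HEADERS, cutoff=0.8):
    """Map each unrecognized header to its closest expected header, or None."""
    suggestions = {}
    for header in headers:
        if header in expected:
            continue
        matches = difflib.get_close_matches(header, expected, n=1, cutoff=cutoff)
        suggestions[header] = matches[0] if matches else None
    return suggestions


result = suggest_header_corrections(["evidence_sorce", "assessed_entity_type", "notes"])
print(result)
```

Headers with no sufficiently close match map to None, so a caller can reject them instead of silently guessing.&lt;br /&gt;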
&lt;br /&gt;
== Version 1.0.1 ==&lt;br /&gt;
=== Data Updates ===&lt;br /&gt;
* Added &lt;code&gt;xrefs.tsv&lt;/code&gt; to the list of datasets.&lt;br /&gt;
=== Backend and Infrastructure Updates ===&lt;br /&gt;
* Fixed ID formatting issues in NCBI and UniProt references within &lt;code&gt;oncomx.tsv&lt;/code&gt;, removing erroneous spaces (e.g., &lt;code&gt;NCBI: 3288&lt;/code&gt; → &lt;code&gt;NCBI:3288&lt;/code&gt;) and extraneous text (e.g., &lt;code&gt;&amp;quot;(composition)&amp;quot;&lt;/code&gt;). Affected biomarkers included AN6295-1, AN6756-1, AN6728-1, and others.&lt;br /&gt;
* Merged assessed entity type synonyms.&lt;br /&gt;
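The ID cleanup described above for oncomx.tsv could be sketched roughly as follows. This is an assumed approach, not the script that was actually used: the function name normalize_xref is hypothetical, and the sketch assumes the extraneous text appears as a trailing parenthetical after the identifier.&lt;br /&gt;

```python
import re


def normalize_xref(raw):
    """Drop a trailing parenthetical and any spaces after the namespace colon."""
    # Assumption: extraneous text such as "(composition)" trails the identifier.
    without_note = re.sub(r"\s*\([^)]*\)\s*$", "", raw.strip())
    # Collapse whitespace between the namespace prefix and the ID itself.
    return re.sub(r"^([A-Za-z]+):\s+", r"\1:", without_note)


print(normalize_xref("NCBI: 3288"))
print(normalize_xref("NCBI:3288 (composition)"))
```

Identifiers that are already clean pass through unchanged, so the function is safe to apply to every row.&lt;br /&gt;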
&lt;br /&gt;
== Version 1.0.0 ==&lt;br /&gt;
* The BiomarkerKB data portal is available with biomarker data from OncoMX, OpenTargets, MarkerDB, ClinVar, PubMed Central Biomarker Gene Set Curation, Metabolomics Workbench (MW), UniProtKB, GWAS, and CIViC.&lt;/div&gt;</summary>
		<author><name>RajaMazumder</name></author>
	</entry>
	<entry>
		<id>https://wiki.biomarkerkb.org/index.php?title=BiomarkerKB_Resource_Integration&amp;diff=118</id>
		<title>BiomarkerKB Resource Integration</title>
		<link rel="alternate" type="text/html" href="https://wiki.biomarkerkb.org/index.php?title=BiomarkerKB_Resource_Integration&amp;diff=118"/>
		<updated>2025-10-23T19:25:46Z</updated>

		<summary type="html">&lt;p&gt;RajaMazumder: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;BiomarkerKB collects data from many different resources. Collected data is not always integrated directly into the data model; data from some resources is added only as contextual annotations or cross-references.&lt;br /&gt;
&lt;br /&gt;
Other resources to be explored: [https://search.cancervariants.org/ MetaKB], [https://cadsr.cancer.gov/onedata/Home.jsp CADSR Cancer], https://themarker.idrblab.cn/, biomarker.org, ResMarkerDB, SalivaDB, https://glycanage.com/publications, https://www.cancergenomeinterpreter.org/biomarkers, [https://github.com/issues/assigned?issue=clinical-biomarkers%7Cbiomarker-issue-repo%7C248 Glycan Biomarkers] ([https://github.com/glygener/CarboCurator code])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Please contact us at mazumder_lab@gwu.edu and daniallmasood@gwu.edu if you know of any other resources that may contain biomarker data.&lt;br /&gt;
&lt;br /&gt;
=CIViC=&lt;br /&gt;
Status: Direct Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
* Clinical Interpretation of Variants in Cancer (CIViC).&lt;br /&gt;
* Provides cancer biomarkers in the form of DNA mutations (dbSNPs).&lt;br /&gt;
* The platform provides clinicians with treatment options for patients based on their unique tumor profiles.&lt;br /&gt;
* License: Creative Commons Attribution-NonCommercial 4.0 International License.&lt;br /&gt;
&lt;br /&gt;
=ClinVar=&lt;br /&gt;
Status: Direct Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
* Public archive of reports of human variations classified for diseases and drug responses.&lt;br /&gt;
* Provides biomarkers for all diseases, but only cancer biomarkers have been curated so far.&lt;br /&gt;
** dbSNPs&lt;br /&gt;
** The source file is very large; the existing script will be used later to map all of its biomarkers into the data model.&lt;br /&gt;
* License: Creative Commons Attribution-NonCommercial 4.0 International License.&lt;br /&gt;
&lt;br /&gt;
=EDRN=&lt;br /&gt;
Status: Sample Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
* Cancer biomarkers.&lt;br /&gt;
&lt;br /&gt;
=GWAS=&lt;br /&gt;
Status: Direct Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
* Published genome-wide association studies (GWAS).&lt;br /&gt;
* Provides biomarkers in the form of SNPs.&lt;br /&gt;
* The GWAS Catalog contains SNPs for a large number of diseases.&lt;br /&gt;
** Preliminary curation focused only on cancer.&lt;br /&gt;
** The existing script will be used to map all biomarkers into the data model.&lt;br /&gt;
* License: Creative Commons Attribution-NonCommercial 4.0 International License.&lt;br /&gt;
&lt;br /&gt;
=HPO=&lt;br /&gt;
&lt;br /&gt;
Status: Cross-Reference&lt;br /&gt;
&lt;br /&gt;
* HPO provides disease and entity associations.&lt;br /&gt;
* It does not describe a change within the entity, so biomarker data cannot be collected from it.&lt;br /&gt;
* However, it can be used as a cross-reference within our cross-referencing section.&lt;br /&gt;
* Provides cross-references to OMIM, SNOMED, and MONDO.&lt;br /&gt;
&lt;br /&gt;
=LOINC=&lt;br /&gt;
Status: Cross-Reference&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;Data provided by Metabolomics Workbench&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=MarkerDB=&lt;br /&gt;
Status: Direct Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
* Provides a lot of useful biomarker data and cross-references other resources as well.&lt;br /&gt;
* Information includes: panel information, abnormal levels of biomarkers by disease, structural information, etc.&lt;br /&gt;
* Annotations that can be cross-referenced include the above.&lt;br /&gt;
* By cross-referencing, BiomarkerKB will allow users to find more information for specific biomarkers and move towards the goal of being a comprehensive resource for biomarkers.&lt;br /&gt;
* License: Creative Commons Attribution-NonCommercial 4.0 International License.&lt;br /&gt;
&lt;br /&gt;
=Metabolomics Workbench=&lt;br /&gt;
Status: Direct Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;Data provided by Metabolomics Workbench&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Metabolite biomarkers utilized in the uniform newborn screening program.&lt;br /&gt;
* These biomarkers help detect treatable disorders that are life-threatening or cause long-term morbidity before they become symptomatic.&lt;br /&gt;
&lt;br /&gt;
=OncoKB=&lt;br /&gt;
Status: Cross-Reference&lt;br /&gt;
&lt;br /&gt;
* Provides useful information on drugs and therapy options for different biomarker entities.&lt;br /&gt;
* Also provides information based on what condition the entity is related to.&lt;br /&gt;
* License: A license is required to use OncoKB for commercial and/or clinical purposes, and to access OncoKB data programmatically for academic purposes.&lt;br /&gt;
* A paid license is required.&lt;br /&gt;
* Cross-referencing from biomarkers in BiomarkerKB to the appropriate drug and therapy information is the best solution.&lt;br /&gt;
&lt;br /&gt;
=OncoMX=&lt;br /&gt;
Status: Direct Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
* Integrated cancer mutation and expression resource for exploring cancer biomarkers.&lt;br /&gt;
* Manual curation effort by GWU and JPL.&lt;br /&gt;
* Over 600 single and panel biomarkers.&lt;br /&gt;
* License: Creative Commons Attribution-NonCommercial 4.0 International License.&lt;br /&gt;
&lt;br /&gt;
=OpenTargets=&lt;br /&gt;
Status: Direct Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
* Collects potential drug targets and therapeutic targets.&lt;br /&gt;
* Some effort was required to find the correct biomarker data.&lt;br /&gt;
* 1200 biomarkers collected.&lt;br /&gt;
** dbSNPs related to cancer and other diseases.&lt;br /&gt;
* License: Creative Commons Attribution-NonCommercial 4.0 International License.&lt;br /&gt;
&lt;br /&gt;
=PubMed Central Biomarker Gene Set Curation=&lt;br /&gt;
Status: Direct Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;Data provided by Avi Ma&#039;ayan&#039;s LINCS group&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* This data set was created through manual curation of biomarker gene sets in PubMed Central, using gene sets returned by Rummagene.&lt;br /&gt;
* Using the search results from the Rummagene web server, we manually identified publications that associated different conditions and environmental exposures with biomarker gene sets.&lt;br /&gt;
* The biomarker gene sets were retrieved by validating the genes mentioned within each of the publications.&lt;br /&gt;
* The primary use case for this data is to identify biomarker panels/gene sets associated with conditions.&lt;br /&gt;
&lt;br /&gt;
=UniProtKB=&lt;br /&gt;
&lt;br /&gt;
Status: Direct Integration into Data Model&lt;br /&gt;
&lt;br /&gt;
* Can provide biomarker (change in entity), entity, condition, and sampling data.&lt;br /&gt;
* This data is in a text file that must be reviewed in full to confirm that it can be extracted automatically.&lt;br /&gt;
* Contextual information can be imputed if necessary.&lt;br /&gt;
* UniProt contains both found_in entries and entries that are actual biomarkers:&lt;br /&gt;
** found_in will get a cross-reference;&lt;br /&gt;
** actual biomarkers will be directly integrated.&lt;br /&gt;
* Manual curation of 56 reviewed entries that mention &amp;quot;biomarker&amp;quot; in the flat text file.&lt;br /&gt;
* License: Creative Commons Attribution 4.0 International (CC BY 4.0).&lt;/div&gt;</summary>
		<author><name>RajaMazumder</name></author>
	</entry>
</feed>