Data Release Notes

Versioning Format

The versioning format follows a three-part structure: X.Y.Z.

  • The first number (X) changes when a major update is introduced, such as changes in the data model.
  • The second number (Y) increments with each new release.
  • The third number (Z) is updated for bug fixes or minor changes.
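
As an illustration of the scheme above, the following minimal Python sketch parses an X.Y.Z string and reports which component differs between two versions. The function names parse_version and change_level and the returned labels are hypothetical and are not part of BiomarkerKB.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split an 'X.Y.Z' string into three integers."""
    parts = version.strip().split(".")
    if len(parts) != 3 or not all(p.isdecimal() for p in parts):
        raise ValueError(f"expected X.Y.Z, got {version!r}")
    x, y, z = (int(p) for p in parts)
    return x, y, z


def change_level(old: str, new: str) -> str:
    """Name the leftmost component that differs between two versions."""
    labels = ("major (X)", "release (Y)", "fix (Z)")
    for label, a, b in zip(labels, parse_version(old), parse_version(new)):
        if a != b:
            return label
    return "none"


print(change_level("1.0.3", "1.0.4"))  # prints "fix (Z)"
```

Comparing the parsed tuples directly (for example, parse_version("1.0.4") > parse_version("1.0.3")) also gives a correct ordering, which a plain string comparison would not guarantee once a component reaches two digits.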

Version 1.0.4

This release introduces new datasets, cross-references, and bug fixes.

Data Updates

  • Added Cancer Genome Interpreter data on cancer biomarkers from MetaKB.
  • Added Metabolomics Workbench LOINC data on metabolite biomarkers.
  • Added Cell Ontology and Protein Ontology cross-references.

Bug Fixes

  • Fixed issue where cookie preferences weren't being saved when selecting "Allow".

Version 1.0.3

This release introduces new cross-references and updates to ensure compatibility with external resources.

Data Updates

  • NCBI cross-references added across gene biomarker entries.
  • ChEBI cross-references integrated for small molecules and metabolites.

Backend and Infrastructure Updates

  • ChEBI API migration: Updated all programmatic links from the legacy SOAP services to the new REST API endpoints, following ChEBI’s platform migration.

Version 1.0.2

Data Updates

Backend and Infrastructure Updates

  • evidence_source database names now retain their original casing for accuracy and consistency.
  • EDRN identifiers were added to the namespace map.
  • HUGO Gene Nomenclature Committee (HGNC) was added to the cross-reference JSON file.
  • Fixed an issue where evidence_source values without tags were dropped; these are now preserved.
  • Added a user-guided spelling correction function to improve data entry quality.
  • The TSV-to-JSON converter now automatically checks for header spelling errors.
  • Introduced _suggest_header_corrections to flag and propose fixes for misspelled headers.
  • Enhanced _stream_tsv with a call to _check_header_spelling to prevent invalid headers from being processed.
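
The header check described above could work roughly as follows. This is a minimal sketch, not the converter's actual code: the signatures of _suggest_header_corrections, _check_header_spelling, and _stream_tsv are not given in these notes, so the functions below use different, hypothetical names, and the EXPECTED_HEADERS list and the 0.8 similarity cutoff are assumptions. Fuzzy matching uses difflib from the Python standard library.

```python
import csv
import difflib
import io
from typing import Dict, Iterator, List

# Hypothetical expected headers; the real converter's list is not given here.
EXPECTED_HEADERS = [
    "biomarker_id",
    "biomarker",
    "assessed_biomarker_entity",
    "evidence_source",
]


def suggest_corrections(
    headers: List[str],
    expected: List[str] = EXPECTED_HEADERS,
    cutoff: float = 0.8,
) -> Dict[str, str]:
    """Map each unrecognized header to its closest expected header, if any."""
    suggestions: Dict[str, str] = {}
    for header in headers:
        if header in expected:
            continue
        matches = difflib.get_close_matches(header, expected, n=1, cutoff=cutoff)
        if matches:
            suggestions[header] = matches[0]
    return suggestions


def stream_rows(text: str) -> Iterator[Dict[str, str]]:
    """Yield TSV rows as dicts, refusing input whose headers look misspelled."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    suggestions = suggest_corrections(list(reader.fieldnames or []))
    if suggestions:
        raise ValueError(f"possible misspelled headers: {suggestions}")
    yield from reader


print(suggest_corrections(["biomarker_id", "evidnce_source"]))
# prints {'evidnce_source': 'evidence_source'}
```

This sketch raises an error instead of prompting the user; a user-guided variant would present each suggestion and ask for confirmation before renaming the column.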

Version 1.0.1

Data Updates

  • Added xrefs.tsv to the list of datasets.

Backend and Infrastructure Updates

  • Fixed ID formatting issues in NCBI and UniProt references within oncomx.tsv, removing erroneous spaces (e.g., "NCBI: 3288" corrected to "NCBI:3288") and extraneous text (e.g., "(composition)"). Affected biomarkers included AN6295-1, AN6756-1, AN6728-1, and others.
  • Merged assessed entity type synonyms.
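
The kind of ID cleanup described above can be sketched with two regular expressions. This is an illustrative sketch only, assuming that every parenthetical note in a reference is extraneous and that the first colon separates the database prefix from the identifier; the function name clean_xref is hypothetical, and the actual fix applied to oncomx.tsv may differ.

```python
import re


def clean_xref(raw: str) -> str:
    """Remove parenthetical notes and whitespace around the prefix colon."""
    # Assumption: any "(...)" fragment is extraneous text, e.g. "(composition)".
    without_notes = re.sub(r"\s*\([^)]*\)", "", raw)
    # Collapse spaces around the first colon: "NCBI: 3288" becomes "NCBI:3288".
    return re.sub(r"\s*:\s*", ":", without_notes.strip(), count=1)


print(clean_xref("NCBI: 3288"))                # prints "NCBI:3288"
print(clean_xref("NCBI: 3288 (composition)"))  # prints "NCBI:3288"
```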

Version 1.0.0

  • BiomarkerKB data portal made available with biomarker data from OncoMX, OpenTargets, MarkerDB, ClinVar, PubMed Central Biomarker Gene Set Curation, MW, UniProtKB, GWAS, and CIViC.