Knowledge Graph
Recommended development process:
- Revise the edge and node files for BIOMARKER.
- Optional: Upload the edge and node files to the Globus folder.
- Copy the latest set of Data Distillery ontology CSVs that excludes the BIOMARKER data (DD-no-BIOMARKER) into your ETL environment.
- Add your new edge and node files to the folder that serves as the download folder of your Globus Connect Personal setup. Your copy of edges_nodes.ini should point to this folder. For example, I download everything from Globus to a subfolder of my Documents folder on my macOS machine. My ini file looks like:
- Run the ingestion script (./build_csv.sh -v BIOMARKER) to generate a new set of ontology CSVs that integrates your version of BIOMARKER with DD-no-BIOMARKER.
- Using the ontology CSVs generated in the previous step, execute the workflow described in ubkg-neo4j to build a Docker container. As you've probably experienced, the longest waits are the import of the CSVs and the creation of the relationship indexes. (Pro [or maybe jaded amateur] tip: if the import seems to take forever, especially for relationships, you're probably running into memory issues. Reboot and start over.)
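
Because the import is slow, it can be worth confirming that the edge and node files are actually in place before running the ingestion script. The following is a minimal sketch of such a check, not part of the UBKG tooling. The file names `edges.tsv` and `nodes.tsv` and the helper `missing_inputs` are assumptions made for illustration; substitute whatever names your own files use.

```python
from pathlib import Path

# Assumed file names (hypothetical); change them to match your edge and node files.
REQUIRED_FILES = ("edges.tsv", "nodes.tsv")


def missing_inputs(folder, required=REQUIRED_FILES):
    """Return the required file names that are absent or empty in the folder."""
    base = Path(folder).expanduser()
    missing = []
    for name in required:
        path = base / name
        # Treat a zero-byte file as missing: it usually means a failed download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Call `missing_inputs` with the folder that your edges_nodes.ini points to. An empty list means both files are present and non-empty, so it is reasonable to go ahead with ./build_csv.sh -v BIOMARKER.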