Created
April 8, 2022 06:47
Loading the GI50 of NCI60 into neo4j
// CSV file can be downloaded here:
// https://wiki.nci.nih.gov/download/attachments/147193864/GI50.zip?version=2&modificationDate=1649214698000&api=v2
LOAD CSV WITH HEADERS FROM 'file:///GI50.csv' AS row
MERGE (chem:Chemical {nsc: toInteger(row.NSC)})
MERGE (cell:CellLine {name: row.CELL_NAME})
MERGE (dis:Disease {name: row.PANEL_NAME})
WITH chem, cell, dis, row
MERGE (chem)-[:GI50 {concentration: row.AVERAGE, research: "NCI60", unit: row.CONCENTRATION_UNIT, experiment_id: row.EXPID, count: row.COUNT}]->(cell)
MERGE (cell)-[:CELL_LINE_OF]->(dis);
I think creating experiments would make sense. I have stayed away from these questions until I have more experience and can set it up more clearly. I want to add graph data science (similarity and community detection) first and then revisit the problem.
I don't think it makes sense to create that benchmark, except perhaps as a description of a learning experience, but I guess you have better things to spend your time on. (We have docs and courses on GraphAcademy that explain these things.)
You could even have used our data import tool http://data-importer.graphapp.io/
If you want to try it, here is a model + CSV file that can be loaded from the "..." menu at the top right.
https://drive.google.com/file/d/1EPFRGPhvDSoE9hKGa05-iPHq2fztBQxd/view?usp=sharing
@jexp Thank you for your help. I got it to work (you need :auto before USING PERIODIC COMMIT in the desktop version), and it is extremely fast compared to what I was doing.
I was busy creating a benchmark between Redis and Neo4j in which Neo4j is 15x slower than Redis with my stupid, stupid query. Is that noteworthy?
But with normal MATCH queries they perform the same.
https://github.com/wagenrace/medical_data_blog/tree/bench_mark/Adding_NCI60/bench_mark
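For readers following along, the batched import mentioned above (:auto before USING PERIODIC COMMIT) could look roughly like the sketch below. It reuses the gist's query unchanged. The batch size of 1000 is an assumption and does not come from this thread, and the :auto prefix is only needed in Neo4j Browser/Desktop. This is the Neo4j 4.x syntax in use when the thread was written.

```cypher
:auto USING PERIODIC COMMIT 1000
// Assumption: 1000 rows per transaction is an illustrative value, not taken from the thread.
LOAD CSV WITH HEADERS FROM 'file:///GI50.csv' AS row
MERGE (chem:Chemical {nsc: toInteger(row.NSC)})
MERGE (cell:CellLine {name: row.CELL_NAME})
MERGE (dis:Disease {name: row.PANEL_NAME})
WITH chem, cell, dis, row
MERGE (chem)-[:GI50 {concentration: row.AVERAGE, research: "NCI60", unit: row.CONCENTRATION_UNIT, experiment_id: row.EXPID, count: row.COUNT}]->(cell)
MERGE (cell)-[:CELL_LINE_OF]->(dis);
```

Committing in batches keeps each transaction small, which is likely part of why the import runs much faster on a file of this size.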