DSE Graph schema management examples taken from the excellent DS330: DataStax Enterprise Graph course.
List all graph names:
system.graphs(); // => KillrVideo
Describe all graphs:
system.describe(); // => system.graph("KillrVideo").create()
Describe a specific graph:
system.graph("KillrVideo").describe(); // => system.graph("KillrVideo").create()
Examining schema:
schema.describe()
schema.edgeLabel('includedIn').describe()
schema.describe().split('\n').grep(~/.*vertexLabel.*/)
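The grep pattern above can be tried without a running graph. The sketch below is plain Groovy and assumes that schema.describe() returns one schema statement per line; the sample string is a hypothetical stand-in assembled from definitions used elsewhere in this document, not real output.

```groovy
// Stand-in for the String returned by schema.describe() (hypothetical sample,
// assembled from statements that appear elsewhere in this document).
def described = '''schema.propertyKey("year").Int().single().create()
schema.vertexLabel("genre").properties("genreId", "name").create()
schema.vertexLabel("person").properties("personId", "name").create()
schema.edgeLabel("actor").multiple().connection("movie", "person").create()'''

// split('\n') yields one statement per element. grep() keeps the elements that
// the pattern matches in full, which is why the pattern starts and ends with .*
def lines = described.split('\n')
def vertexLabelLines = lines.grep(~/.*vertexLabel.*/)
def edgeLabelLines = lines.grep(~/.*edgeLabel.*/)

vertexLabelLines.each { println it }
```

Swapping the pattern for ~/.*propertyKey.*/ or ~/.*index.*/ narrows the output to property keys or index definitions in the same way.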
Create a graph:
system.graph("KillrVideo").create()
system.graph("KillrVideo").ifNotExists().create()
Create a graph with replication:
system
.graph("KillrVideo")
.option("graph.replication_config")
.set("{'class' : 'NetworkTopologyStrategy','DC-East' : 3,'DC-West' : 5}")
.option("graph.system_replication_config")
.set("{'class' : 'NetworkTopologyStrategy','DC-East' : 3,'DC-West' : 3}")
.ifNotExists()
.create()
Create a graph in development mode:
system.graph("KillrVideo").create()
schema.config().option("graph.schema_mode").set("Development")
Check if a graph exists:
system.graph("KillrVideo").exists() // => true
Drop a specific graph:
system.graph("SomeGraph").drop();
Define a property key for a scalar integer:
// Property key with a single integer value
schema.propertyKey("year").Int().single().create();
// single() is assumed by default and can be omitted
schema.propertyKey("year").Int().create();
Define a multi-property (note that multi-properties can only be associated with vertices):
// Property key that allows many text values for film production companies
schema.propertyKey("production").Text().multiple().create();
Define a meta-property (note that meta-properties can only be associated with vertex properties):
// Property key definitions
schema.propertyKey("source").Text().create();
schema.propertyKey("date").Timestamp().create();
// Multi-property key with two meta-properties
schema.propertyKey("budget").Text().multiple().properties("source","date").create();
Creating vertex labels:
// Vertex label with six associated property keys
schema
.vertexLabel("movie")
.properties("movieId","title","year", "duration","country","production")
.create();
// Different vertex labels can use the same property key
// Property key "name" definition
schema.propertyKey("name").Text().create();
// Property key "name" is associated with both vertex labels
schema.vertexLabel("genre").properties("genreId","name").create();
schema.vertexLabel("person").properties("personId","name").create();
Creating Edge Labels
// A single-cardinality edge label definition
// A movie can be rated by a user at most once
schema
.edgeLabel("rated")
.single()
.properties("rating")
.connection("user","movie")
.create();
// A multi-cardinality edge label definition
// Acting in a movie in multiple roles is possible
schema
.edgeLabel("actor")
.multiple()
.connection("movie","person")
.create();
// multiple() is assumed by default and can be omitted
schema
.edgeLabel("actor")
.connection("movie","person")
.create();
// A multi-connection edge label definition
// Edge label with different domains and ranges
schema
.edgeLabel("knows")
.single()
.connection("user","user")
.connection("user","person")
.connection("person","user")
.create();
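Schema scripts are often run more than once, so the ifNotExists() guard shown for graph creation is also useful on schema elements. This is a minimal sketch, assuming a console session aliased to the KillrVideo graph and assuming that the property key, vertex label, and edge label builders accept ifNotExists() before create(); the exact position in the chain is an assumption worth checking against your DSE version.

```groovy
// Assumption: ifNotExists() is accepted by each schema builder before create().
// With the guard, each statement is skipped when the element already exists.
schema.propertyKey("name").Text().ifNotExists().create()
schema.vertexLabel("person").properties("personId","name").ifNotExists().create()
schema.edgeLabel("actor").multiple().connection("movie","person").ifNotExists().create()
```

The guard only checks for existence; it does not update an element whose definition differs from the one in the script.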
Dropping graph schemas:
// Dropping the graph schema also deletes all graph data!
schema.clear()
Retrieve a vertex with a default ID:
g.V().hasId("{~label=movie, member_id=0, community_id=63341568}")
// Sample output:
// v[{~label=movie, member_id=0, community_id=63341568}]
g.V("{~label=movie, member_id=0, community_id=63341568}")
// Sample output:
// v[{~label=movie, member_id=0, community_id=63341568}]
Defining Custom Vertex IDs
// Property keys
schema.propertyKey("username").Text().create();
schema.propertyKey("age").Int().create();
schema.propertyKey("gender").Text().create();
// Vertex label with a custom ID
schema
.vertexLabel("user")
.partitionKey("username")
.properties("age","gender")
.create();
// Property keys
schema.propertyKey("movieId").Text().create();
schema.propertyKey("title").Text().create();
schema.propertyKey("year").Int().create();
schema.propertyKey("duration").Int().create();
schema.propertyKey("country").Text().create();
// Vertex label with a custom ID
schema.vertexLabel("movie").
partitionKey("year","country").
clusteringKey("movieId").
properties("title","duration").create();
Retrieving a Vertex with a Custom ID
g.V("{~label=user, username=agent007}") // or
g.V().hasId("{~label=user, username=agent007}")
// Sample output:
// v[{~label=user, username=agent007}]
g.V("{country=United States, movieId=m267, ~label=movie, year=2010}")
// or
g.V().hasId("{country=United States, movieId=m267, ~label=movie, year=2010}")
// Sample output:
// v[{country=United States, movieId=m267, ~label=movie, year=2010}]
Create a materialized view index on a high-cardinality property
// Indexing movies by movieId
schema.vertexLabel("movie").index("moviesById").materialized().by("movieId").add()
// Find a movie with a given movieId
// Both vertex label and property key-value must be
// specified for a traversal to use an index.
g.V().hasLabel("movie").has("movieId","m267")
// or
g.V().has("movie","movieId","m267")
Create a secondary index on a low-cardinality property
// Indexing movies by year
schema.vertexLabel("movie").index("moviesByYear").secondary().by("year").add()
// Find movies from a given year
g.V().hasLabel("movie").has("year",2010)
// or
g.V().has("movie","year",2010)
// Note: A traversal with no explicitly specified vertex label,
// e.g., g.V().has("year",2010), cannot take advantage of a vertex index.
Create a full text search index on a text property
// Indexing movies by title
schema.vertexLabel("movie").index("search").search().by("title").asText().add()
// Find movies with words that start with "Wonder" in their titles
g.V().has("movie","title",Search.tokenRegex("Wonder.*"))
// Indexed properties can be queried using token(), tokenPrefix(), and tokenRegex().
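The other two predicates for an asText() property can be sketched the same way. The example assumes the search index on "title" defined above already exists; the search terms are illustrative only.

```groovy
// token(): titles that contain the word "Wonder" as a whole token
g.V().has("movie","title",Search.token("Wonder"))

// tokenPrefix(): titles that contain a token starting with "Wond"
g.V().has("movie","title",Search.tokenPrefix("Wond"))
```

tokenRegex("Wonder.*") from the example above is the most general of the three; the token and prefix forms state the intent more directly when no pattern is needed.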
Create a string search index on a text property
// Indexing movies by country
schema.vertexLabel("movie").index("search").search().by("country").asString().add()
// Find movies from countries that start with the letter "U"
g.V().has("movie","country",Search.prefix("U"))
// Indexed properties can be queried using prefix(), regex(), eq(), and neq().
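The remaining predicates for an asString() property follow the same shape. This sketch assumes the search index on "country" defined above exists; the country values are illustrative.

```groovy
// regex(): the whole country value must match the pattern
g.V().has("movie","country",Search.regex("United.*"))

// eq() and neq(): exact match and its negation
g.V().has("movie","country",eq("United States"))
g.V().has("movie","country",neq("United States"))
```

Unlike the token predicates of an asText() index, these compare against the entire string value rather than individual words.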
Create additional specialized search index capabilities
// Spatial indexes: asCartesian() and asGeo()
// Other non-text indexes: no special index type
// One search index, many properties
// Indexing users by name, location, and age
schema
.vertexLabel("user")
.index("search")
.search()
.by("name").asText()
.by("location").asCartesian(0.0,0.0,100.0,100.0)
.by("age").add()
List vertex indexes
// Listing all graph schema information
schema.describe()
// Listing schema information for a particular vertex label
schema.vertexLabel("movie").describe()
// Sample output:
// schema.vertexLabel("movie").properties("movieId", "title", "year", "duration",
// "country", "production").create()
// schema.vertexLabel("movie").index("moviesById").materialized().by("movieId").add()
// schema.vertexLabel("movie").index("moviesByYear").secondary().by("year").add()
// schema.vertexLabel("movie").index("search").search().by("title").asText()
// .by("country").asString().add()
// ...
Drop a vertex index
// Dropping a materialized view index
schema.vertexLabel("movie").index("moviesById").remove()
// Dropping a secondary index
schema.vertexLabel("movie").index("moviesByYear").remove()
// Dropping a specific search index property
schema.vertexLabel("user").index("search").
search().properties("location").remove()
// Dropping a search index
schema.vertexLabel("user").index("search").remove()
A property index uses a materialized view in Cassandra to efficiently retrieve the properties of a known vertex whose meta-property values are known or fall into a known range.
Create and use a Property Index
// Indexing movie budget estimates by source
schema.vertexLabel("movie").index("movieBudgetBySource")
.property("budget").by("source").add()
// Querying movie budget estimates based on source
g.V().has("movie","movieId","m267").properties("budget")
.has("source","Los Angeles Times").value()
// Indexing movie budget estimates by date
schema.vertexLabel("movie").index("movieBudgetByDate")
.property("budget").by("date").add()
// Querying movie budget estimates based on date
g.V().has("movie","movieId","m267").properties("budget")
.has("date", gt(Instant.now().minusSeconds(86400 * 365)))
.value()
An edge index uses a materialized view in Cassandra to efficiently traverse edges that are incident to a known vertex, have a known label, and have property values that are known or fall into a known range.
Create and use an Edge Index
// Find how many users rated a particular movie with an 8-star rating
schema.vertexLabel("movie")
.index("toUsersByRating")
.inE("rated").by("rating")
.add();
g.V().has("movie","movieId","m267")
.inE("rated").has("rating",8).count()
// Find movies rated with a greater-than-7 rating by a particular user
schema.vertexLabel("user")
.index("toMoviesByRating")
.outE("rated").by("rating")
.add();
g.V().has("user","userId","u1")
.outE("rated").has("rating",gt(7)).inV()
// Both incoming and outgoing edges of a vertex can be indexed by
// specifying bothE() when creating an edge index.
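A bothE() index could look like the following sketch. The "since" property and the index name "knowsBySince" are hypothetical and are not part of the schema defined in this document; the sketch also assumes that a property key can be attached to an existing edge label with properties(...).add().

```groovy
// Hypothetical property on "knows" edges (not defined elsewhere in this document)
schema.propertyKey("since").Timestamp().create()
schema.edgeLabel("knows").properties("since").add()

// Index incoming and outgoing "knows" edges of a user by "since"
schema.vertexLabel("user")
    .index("knowsBySince")
    .bothE("knows").by("since")
    .add()

// Find acquaintances of a user made within the last year, in either direction
g.V().has("user","userId","u1")
    .bothE("knows").has("since", gt(Instant.now().minusSeconds(86400 * 365)))
    .otherV()
```

bothE() fits the "knows" label because both of its endpoints can be "user" vertices, so one index serves traversals in either direction.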
Change the schema mode for a graph:
schema.config().option("graph.schema_mode").get() // Production
schema.config().option("graph.schema_mode").set("Development")
schema.config().option("graph.schema_mode").get() // Development
Enabling graph scans in production mode
// This is OK for scanning small portions of data, but caution is advised
schema.config().option("graph.schema_mode").get() // Production
g.V().hasLabel("genre").values("name")
// Could not find a suitable index ... and graph scans are disabled
schema.config().option("graph.allow_scan").set(true)
g.V().hasLabel("genre").values("name")
// Action Adventure Animation Comedy ... 18 genres in total
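Once the ad hoc query is done, the same option can be switched back so that unindexed traversals are rejected again. A minimal sketch, using only the config calls shown above:

```groovy
// Inspect the current setting, then disable graph scans again
schema.config().option("graph.allow_scan").get()
schema.config().option("graph.allow_scan").set(false)
```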
Execute a query and track profiling data
OLTP traversal example
g.V().has("person","name","Johnny Depp").in("actor").values("title").profile()
Traversal Metrics
Step                                                Count  Traversers   Time (ms)    % Dur
==========================================================================================
DsegGraphStep([~label.eq(person), name.eq(Johnn...      1           1       1.103    16.25
  query-optimizer                                                           0.117
  query-setup                                                               0.004
  index-query                                                               0.330
DsegVertexStep(IN,[actor],vertex)                      14          14       1.312    19.33
  query-optimizer                                                           0.068
  query-setup                                                               0.001
  vertex-query                                                              0.519
DsegPropertiesStep([title],value)                      14          14       4.373    64.43
  query-optimizer                                                           0.154
  query-setup                                                               0.000
  vertex-query                                                              0.161
  query-setup                                                               0.000
  vertex-query                                                              0.150
  query-setup                                                               0.000
  vertex-query                                                              0.179
  query-setup                                                               0.000
  vertex-query                                                              0.209
  query-setup                                                               0.000
>TOTAL                                                  -           -       6.789        -
OLAP traversal example
g.E().groupCount().by(label).profile()
Traversal Metrics
Step                                                Count  Traversers   Time (ms)    % Dur
==========================================================================================
GraphStep(edge,[])                                  69054       69054    4161.214    98.50
GroupCountStep(label)                                   1           1      63.277     1.50
>TOTAL                                                  -           -    4224.492        -
Switching to the OLTP traversal engine
:remote config alias g KillrVideo.g
g.V().has("person","name","Johnny Depp").in("actor").values("title")
Switching to the OLAP traversal engine
:remote config alias g KillrVideo.a
g.E().groupCount().by(label)