Evaluation Scenarios
We defined 12 scenarios on 6 graphs for the evaluation the graph-native approach to LPG normalization
Detailed results of the evaluation presented in the paper (http://arxiv.org/abs/2603.02995) can be found here.
Graphs
-
London Public Transport
The London Public Transport graph's sources are available on Zenodo. Adapted from 2, it models the public transport network of London. -
Northwind
The Northwind graph is used in 1 and is available on GitHub. It contains the "Northwind Traders" retail data set. -
No Socks
The No Socks graph originates from the motivating example of 1. -
Offshore
The Offshore graph is used in 1 and is available on GitHub. It contains the "The Offshore Leaks Database and guide from the International Consortium of Investigative Journalists (ICIJ)". -
University
The University graph originate from Figure 1a of the graph-native LPG normalization paper. -
Train Services
The Train Services graph is based on data from Rijden de Treinen and models train services performed by different operators that stop at train stations. It is available on Zenodo.
Scenarios
Evaluation scenarios combine graphs with dependency sets. In the following we list the minimal covers (i.e., minimal dependency sets) used for each graph. The dependencies are shown as expressions, like in the paper use an ASCII syntax (see Grammar) for expressing GO-FDs.
\(S_\text{Lon}\)
Graph: London Public Transport
Minimal cover of dependencies:
- \((s:\{\mathsf{Station}\}:\{\mathsf{zone},\mathsf{zoneOriginal}\})::s.\mathsf{zoneOriginal}\Rightarrow s.\mathsf{zone}\)
- \((s:\{\mathsf{Station}\}:\{\mathsf{latitude},\mathsf{longitude}\})::s.\mathsf{latitude},s.\mathsf{longitude}\Rightarrow s\)
- \((s:\{\mathsf{Station}\}:\emptyset)\xrightarrow{c:\{\mathsf{CONNECTED\_THROUGH}\}:\{\mathsf{line},\mathsf{color},\mathsf{type}\}}()::c.\mathsf{line}\Rightarrow c.\mathsf{color},c.\mathsf{type}\)
(s:Station:zone&zoneOriginal)::s.zoneOriginal=>s.zone(s:Station:latitude&longitude)::s.latitude,s.longitude=>s(s:Station)-[c:CONNECTED_THROUGH:line&color&type]->()::c.line=>c.color,c.type
\(S_\text{Nw}\)
Graph: Northwind
Minimal cover of dependencies:
- \((o:\{\mathsf{Order}\}:\{\mathsf{orderID}\})::o.\mathsf{orderID}\Rightarrow o\)
- \((o:\{\mathsf{Order}\}:\{\mathsf{orderID},\mathsf{orderDate},\mathsf{customerID},\mathsf{shipCity},\mathsf{shipPostalCode},\mathsf{shipCountry},\mathsf{shipAddress},\mathsf{shipRegion}\})::o.\mathsf{customerID}\Rightarrow o.\mathsf{shipCity},o.\mathsf{shipPostalCode},o.\mathsf{shipCountry},o.\mathsf{shipAddress},o.\mathsf{shipRegion}\)
(o:Order:orderID)::o.orderID=>o(o:Order:orderID&orderDate&customerID&shipCity&shipPostalCode&shipCountry&shipAddress&shipRegion)::o.customerID=>o.shipCity,o.shipPostalCode,o.shipCountry,o.shipAddress,o.shipRegion
\(S_\text{No-1}\)
Graph: No Socks
Minimal cover of dependencies:
- \((e:\{\mathsf{Event}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{company},e.\mathsf{time}\Rightarrow e\)
- \((e:\{\mathsf{Event}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{name}\Rightarrow e.\mathsf{company}\)
- \((e:\{\mathsf{Event}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{name},e.\mathsf{time}\Rightarrow e.\mathsf{venue}\)
- \((e:\{\mathsf{Event}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{time},e.\mathsf{venue}\Rightarrow e.\mathsf{name}\)
(e:Event:company&name&time&venue)::e.company,e.time=>e(e:Event:company&name&time&venue)::e.name=>e.company(e:Event:company&name&time&venue)::e.name,e.time=>e.venue(e:Event:company&name&time&venue)::e.time,e.venue=>e.name
\(S_\text{No-2}\)
Graph: No Socks
Minimal cover of dependencies:
- \((e:\{\mathsf{Event}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{company},e.\mathsf{time} \Rightarrow e.\mathsf{venue}\)
- \((e:\{\mathsf{Event}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{name} \Rightarrow e.\mathsf{company}\)
- \((e:\{\mathsf{Event}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{name},e.\mathsf{time}\Rightarrow e.\mathsf{venue}\)
- \((e:\{\mathsf{Event}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{time},e.\mathsf{venue}\Rightarrow e.\mathsf{name}\)
(e:Event:company&name&time&venue)::e.company,e.time=>e.venue(e:Event:company&name&time&venue)::e.name=>e.company(e:Event:company&name&time&venue)::e.name,e.time=>e.venue(e:Event:company&name&time&venue)::e.time,e.venue=>e.name
\(S_\text{No-3}\)
Graph: No Socks
Minimal cover of dependencies:
- \((e:\{\mathsf{Event},\mathsf{Confirmed}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{company},e.\mathsf{time}\Rightarrow e\)
- \((e:\{\mathsf{Event},\mathsf{Confirmed}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{name} \Rightarrow e.\mathsf{company}\)
- \((e:\{\mathsf{Event},\mathsf{Confirmed}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{name},e.\mathsf{time} \Rightarrow e.\mathsf{venue}\)
- \((e:\{\mathsf{Event},\mathsf{Confirmed}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{time},e.\mathsf{venue}\Rightarrow e.\mathsf{name}\)
- \((e:\{\mathsf{Event},\mathsf{Confirmed}\}:\{\mathsf{company},\mathsf{name},\mathsf{time},\mathsf{venue}\})::e.\mathsf{name},e.\mathsf{company} \Rightarrow e.\mathsf{time}\)
(e:Event&Confirmed:company&name&time&venue)::e.company,e.time=>e(e:Event&Confirmed:company&name&time&venue)::e.name=>e.company(e:Event&Confirmed:company&name&time&venue)::e.name,e.time=>e.venue(e:Event&Confirmed:company&name&time&venue)::e.time,e.venue=>e.name(e:Event&Confirmed:company&name&time&venue)::e.name,e.company=>e.time
\(S_\text{Off-1}\)
Graph: Offshore
Minimal cover of dependencies:
- \((e:\{\mathsf{Entity}\}:\{\mathsf{jurisdiction\_description},\mathsf{countries},\mathsf{service\_provider},\mathsf{country\_codes}\}):: e.\mathsf{countries} \Rightarrow e.\mathsf{country\_codes}\)
- \((e:\{\mathsf{Entity}\}:\{\mathsf{jurisdiction\_description},\mathsf{countries},\mathsf{service\_provider},\mathsf{country\_codes}\}):: e.\mathsf{country\_codes}\Rightarrow e.\mathsf{countries}\)
(e:Entity:jurisdiction_description&countries&service_provider&country_codes):: e.countries=>e.country_codes(e:Entity:jurisdiction_description&countries&service_provider&country_codes):: e.country_codes=>e.countries
\(S_\text{Off-2}\)
Graph: Offshore
Minimal cover of dependencies:
- \((e:\{\mathsf{Entity}\}:\{\mathsf{jurisdiction\_description},\mathsf{valid\_until},\mathsf{countries},\mathsf{sourceID},\mathsf{country\_codes}\})::e.\mathsf{countries},e.\mathsf{jurisdiction\_description}\Rightarrow e.\mathsf{country\_codes}\)
- \((e:\{\mathsf{Entity}\}:\{\mathsf{jurisdiction\_description},\mathsf{valid\_until},\mathsf{countries},\mathsf{sourceID},\mathsf{country\_codes}\}):: e.\mathsf{countries},e.\mathsf{sourceID}\Rightarrow e.\mathsf{country\_codes}\)
- \((e:\{\mathsf{Entity}\}:\{\mathsf{jurisdiction\_description},\mathsf{valid\_until},\mathsf{countries},\mathsf{sourceID},\mathsf{country\_codes}\})::e.\mathsf{countries},e.\mathsf{valid\_until}\Rightarrow e.\mathsf{country\_codes}\)
- \((e:\{\mathsf{Entity}\}:\{\mathsf{jurisdiction\_description},\mathsf{valid\_until},\mathsf{countries},\mathsf{sourceID},\mathsf{country\_codes}\}):: e.\mathsf{country\_codes},e.\mathsf{sourceID}\Rightarrow e.\mathsf{countries}\)
- \((e:\{\mathsf{Entity}\}:\{\mathsf{jurisdiction\_description},\mathsf{valid\_until},\mathsf{countries},\mathsf{sourceID},\mathsf{country\_codes}\}):: e.\mathsf{country\_codes},e.\mathsf{valid\_until}\Rightarrow e.\mathsf{countries}\)
(e:Entity:jurisdiction_description&valid_until&countries&sourceID&country_codes):: e.countries,e.jurisdiction_description=>e.country_codes(e:Entity:jurisdiction_description&valid_until&countries&sourceID&country_codes):: e.countries,e.sourceID=>e.country_codes(e:Entity:jurisdiction_description&valid_until&countries&sourceID&country_codes):: e.countries,e.valid_until=>e.country_codes(e:Entity:jurisdiction_description&valid_until&countries&sourceID&country_codes):: e.country_codes,e.sourceID=>e.countries(e:Entity:jurisdiction_description&valid_until&countries&sourceID&country_codes):: e.country_codes,e.valid_until=>e.countries
\(S_\text{Off-3}\)
Graph: Offshore
Minimal cover of dependencies:
- \((e:\{\mathsf{Entity}\}:\{\mathsf{countries} ,\mathsf{service\_provider},\mathsf{country\_codes},\mathsf{jurisdiction\_description},\mathsf{sourceID},\mathsf{valid\_until}\})::e.\mathsf{countries} \Rightarrow e.\mathsf{country\_codes}\)
- \((e:\{\mathsf{Entity}\}:\{\mathsf{countries} ,\mathsf{service\_provider},\mathsf{country\_codes},\mathsf{jurisdiction\_description},\mathsf{sourceID},\mathsf{valid\_until}\})::e.\mathsf{country\_codes} \Rightarrow e.\mathsf{countries}\)
- \((e:\{\mathsf{Entity}\}:\{\mathsf{countries} ,\mathsf{service\_provider},\mathsf{country\_codes},\mathsf{jurisdiction\_description},\mathsf{sourceID},\mathsf{valid\_until}\})::e.\mathsf{service\_provider} \Rightarrow e.\mathsf{sourceID}\)
- \((e:\{\mathsf{Entity}\}:\{\mathsf{countries} ,\mathsf{service\_provider},\mathsf{country\_codes},\mathsf{jurisdiction\_description},\mathsf{sourceID},\mathsf{valid\_until}\})::e.\mathsf{sourceID} \Rightarrow e.\mathsf{valid\_until}\)
- \((e:\{\mathsf{Entity}\}:\{\mathsf{countries} ,\mathsf{service\_provider},\mathsf{country\_codes},\mathsf{jurisdiction\_description},\mathsf{sourceID},\mathsf{valid\_until}\})::e.\mathsf{valid\_until}\Rightarrow \mathsf{e.sourceID}\)
(e:Entity:countries&service_provider&country_codes&jurisdiction_description&sourceID&valid_until):: e.countries=>e.country_codes(e:Entity:countries&service_provider&country_codes&jurisdiction_description&sourceID&valid_until):: e.country_codes=>e.countries(e:Entity:countries&service_provider&country_codes&jurisdiction_description&sourceID&valid_until):: e.service_provider=>e.sourceID(e:Entity:countries&service_provider&country_codes&jurisdiction_description&sourceID&valid_until):: e.sourceID=>e.valid_until(e:Entity:countries&service_provider&country_codes&jurisdiction_description&sourceID&valid_until):: e.valid_until=>e.sourceID
\(S_\text{Ts-1}\)
Graph: Train Services
Minimal cover of dependencies:
- \(()\xrightarrow{t:\{\mathsf{STOPS\_AT}\}:\{\mathsf{code}\}}(s:\{\mathsf{Station}\}:\{\mathsf{name}\})::s.\mathsf{name}\Rightarrow t.\mathsf{code}\)
- \((ts:\{\mathsf{TrainService}\}:\{\mathsf{date},\mathsf{number},\mathsf{type}\})::ts.\mathsf{number},ts.\mathsf{date}\Rightarrow ts.\mathsf{type}\)
- \((ts:\{\mathsf{TrainService}\}:\{\mathsf{date},\mathsf{number},\mathsf{operator}\})\xrightarrow{:\{\mathsf{STOPS\_AT}\}:\emptyset}():: ts.\mathsf{number},ts.\mathsf{date}\Rightarrow ts.\mathsf{operator}\)
- \((ts:\{\mathsf{TrainService}\}:\{\mathsf{serviceid}\})\xrightarrow{:\{\mathsf{STOPS\_AT}\}:\emptyset}()::ts.\mathsf{serviceid} \Rightarrow ts\)
- \(()\xrightarrow{t:\{\mathsf{STOPS\_AT}\}:\{\mathsf{stopid}\}}()::t.\mathsf{stopid} \Rightarrow t\)
- \(()\xrightarrow{t:\{\mathsf{STOPS\_AT}\}:\{\mathsf{departure},\mathsf{stopid}\}}(s)::t.\mathsf{stopid}\Rightarrow s\)
()-[t:STOPS_AT:code]->(s:Station:name)::s.name=>t.code(ts:TrainService:date&number&type)::ts.number,ts.date=>ts.type(ts:TrainService:date&number&operator)-[:STOPS_AT]->()::ts.number,ts.date=>ts.operator(ts:TrainService:serviceid)-[:STOPS_AT]->()::ts.serviceid=>ts()-[t:STOPS_AT:stopid]->()::t.stopid=>t()-[t:STOPS_AT:departure&stopid]->(s)::t.stopid=>s
\(S_\text{Ts-2}\)
Graph: Train Services
Minimal cover of dependencies:
- \(()\xrightarrow{t:\{\mathsf{STOPS\_AT}\}:\{\mathsf{code}\}}(s:\{\mathsf{Station}\}:\{\mathsf{name}\})::s.\mathsf{name}\Rightarrow t.\mathsf{code}\)
- \(()\xrightarrow{t:\{\mathsf{STOPS\_AT}\}:\{\mathsf{code}\}}(s:\{\mathsf{Station}\}:\{\mathsf{name}\})::t.\mathsf{code}\Rightarrow s.\mathsf{name}\)
- \((ts:\{\mathsf{TrainService}\}:\{\mathsf{date},\mathsf{number},\mathsf{type}\})::ts.\mathsf{number},ts.\mathsf{date}\Rightarrow ts.\mathsf{type}\)
- \((ts:\{\mathsf{TrainService}\}:\{\mathsf{date},\mathsf{number},\mathsf{operator}\})\xrightarrow{:\{\mathsf{STOPS\_AT}\}:\emptyset}():: ts.\mathsf{number},ts.\mathsf{date}\Rightarrow ts.\mathsf{operator}\)
- \((ts:\{\mathsf{TrainService}\}:\{\mathsf{serviceid}\})\xrightarrow{:\{\mathsf{STOPS\_AT}\}:\emptyset}()::ts.\mathsf{serviceid} \Rightarrow ts\)
- \(()\xrightarrow{t:\{\mathsf{STOPS\_AT}\}:\{\mathsf{stopid}\}}()::t.\mathsf{stopid} \Rightarrow t\)
- \(()\xrightarrow{t:\{\mathsf{STOPS\_AT}\}:\{\mathsf{departure},\mathsf{stopid}\}}(s)::t.\mathsf{stopid}\Rightarrow s\)
()-[t:STOPS_AT:code]->(s:Station:name)::s.name=>t.code()-[t:STOPS_AT:code]->(s:Station:name)::t.code=>s.name(ts:TrainService:date&number&type)::ts.number,ts.date=>ts.type(ts:TrainService:date&number&operator)-[:STOPS_AT]->()::ts.number,ts.date=>ts.operator(ts:TrainService:serviceid)-[:STOPS_AT]->()::ts.serviceid=>ts()-[t:STOPS_AT:stopid]->()::t.stopid=>t()-[t:STOPS_AT:departure&stopid]->(s)::t.stopid=>s
\(S_\text{Uni-1}\)
Graph: University
Minimal cover of dependencies:
- \((c:\{\mathsf{Course}\}:\emptyset)\xleftarrow{t:\{\mathsf{TEACHES}\}:\{\mathsf{usingBook}\}}()::c=>t.\mathsf{usingBook}\)
(c:Course)<-[t:TEACHES:usingBook]-()::c=>t.usingBook
\(S_\text{Uni-2}\)
Graph: University
Minimal cover of dependencies:
- \((c:\{\mathsf{Course}\}:\{\mathsf{title}\})\xleftarrow{t:\{\mathsf{TEACHES}\}:\{\mathsf{usingBook}\}}()::c.\mathsf{title}\Rightarrow t.\mathsf{usingBook}\)
(c:Course:title)<-[t:TEACHES:usingBook]-()::c.title=>t.usingBook
Queries
In our query experiment we run in total 17 queries before and after normalization on 4 scenarios. For several queries we used different variants after normalization.
\(S_\text{Lon}\)
Query 1
Original query
PROFILE MATCH ()-[c:CONNECTED_THROUGH]->() RETURN DISTINCT c.line, c.color
After normalization
PROFILE MATCH (c:Ccolorclinectype) RETURN DISTINCT c.Cline, c.CcolorPROFILE MATCH ()-[]-(d:CONNECTED_THROUGH)-[]-(), (d)-[]-(c:Ccolorclinectype) RETURN DISTINCT c.Cline, c.Ccolor
Query 2
Original query
PROFILE MATCH (s:Station)-[c:CONNECTED_THROUGH]-() WHERE s.name = "Piccadilly Circus" RETURN DISTINCT c.line
After normalization
PROFILE MATCH (s:Station)-[]-(:CONNECTED_THROUGH)-[:CCOLORCLINECTYPE]-(c:Ccolorclinectype) WHERE s.name = "Piccadilly Circus" RETURN DISTINCT c.Cline
Query 3
Original query
PROFILE MATCH (s:Station) WHERE s.zoneOriginal IS NOT NULL RETURN DISTINCT s.zone, s.zoneOriginal
After normalization
PROFILE MATCH (s:SzoneszoneOriginal) WHERE s.SzoneOriginal IS NOT NULL RETURN DISTINCT s.Szone, s.SzoneOriginalPROFILE MATCH (s:Station)-[]-(s:SzoneszoneOriginal) WHERE s.SzoneOriginal IS NOT NULL RETURN DISTINCT s.Szone, s.SzoneOriginal
Query 4
Original query
PROFILE MATCH (s:Station)-[c:CONNECTED_THROUGH]-(:Station) WHERE s.zone = "6" RETURN DISTINCT c.line
After normalization
PROFILE MATCH (s:Station)-[]-(c:CONNECTED_THROUGH)-[]-(:Station), (s)-[]-(z:SzoneszoneOriginal), (c)-[]-(d:Ccolorclinectype) WHERE z.Szone = "6" RETURN DISTINCT d.Cline
Query 5
Original query
PROFILE MATCH (s:Station)-[c:CONNECTED_THROUGH]->(t:Station) WHERE s.name = "Piccadilly Circus" AND c.line = "Bakerloo" RETURN DISTINCT s.name, t.name
After normalization
PROFILE MATCH (s:Station)-[]->(c:CONNECTED_THROUGH)-[]->(t:Station), (c)-[]-(d:Ccolorclinectype) WHERE s.name = "Piccadilly Circus" and d.Cline ="Bakerloo" RETURN t.name
Query 6
Original query
PROFILE MATCH ()-[c:CONNECTED_THROUGH]->() WHERE c.line = "Bakerloo" SET c.color = "Rainbow"
After normalization
PROFILE MATCH (c:Ccolorclinectype) WHERE c.Cline = "Bakerloo" SET c.Ccolor = "Rainbow"PROFILE MATCH (:CONNECTED_THROUGH)-->(c:Ccolorclinectype) WHERE c.Cline = "Bakerloo" SET c.Ccolor = "Rainbow"PROFILE MATCH (:CONNECTED_THROUGH)-->(c:Ccolorclinectype) WHERE c.Cline = "Bakerloo" WITH DISTINCT c SET c.Ccolor = "Rainbow"
\(S_\text{Nw}\)
Query 1
Original query
PROFILE MATCH (o:Order) RETURN DISTINCT o.orderID, o.customerID
After normalization
PROFILE MATCH (o:Order)-[]-(c:OcustomerIDoshipAddressoshipCityoshipCountryoshipPostalCodeoshipRegion) RETURN DISTINCT o.orderID, c.OcustomerID
Query 2
Original query
PROFILE MATCH (o:Order) WHERE o.customerID IS NOT NULL AND o.shipCity IS NOT NULL AND o.shipPostalCode IS NOT NULL AND o.shipCountry IS NOT NULL AND o.shipAddress IS NOT NULL AND o.shipRegion IS NOT NULL RETURN DISTINCT o.customerID
After normalization
PROFILE MATCH (o:OcustomerIDoshipAddressoshipCityoshipCountryoshipPostalCodeoshipRegion) WHERE o.OcustomerID IS NOT NULL AND o.OshipCity IS NOT NULL AND o.OshipPostalCode IS NOT NULL AND o.OshipCountry IS NOT NULL AND o.OshipAddress IS NOT NULL AND o.OshipRegion IS NOT NULL RETURN DISTINCT o.OcustomerIDPROFILE MATCH (:Order)-->(o:OcustomerIDoshipAddressoshipCityoshipCountryoshipPostalCodeoshipRegion) WHERE o.OcustomerID IS NOT NULL AND o.OshipCity IS NOT NULL AND o.OshipPostalCode IS NOT NULL AND o.OshipCountry IS NOT NULL AND o.OshipAddress IS NOT NULL AND o.OshipRegion IS NOT NULL RETURN DISTINCT o.OcustomerID
Query 3
Original query
PROFILE MATCH (o:Order) WHERE o.customerID = "FOLKO" SET o.shipCountry = "Rainbow"
After normalization
PROFILE MATCH (o:Order)-[]-(c:OcustomerIDoshipAddressoshipCityoshipCountryoshipPostalCodeoshipRegion) WHERE c.OcustomerID = "FOLKO" SET c.OshipCountry = "Rainbow"PROFILE MATCH (o:Order)-[]-(c:OcustomerIDoshipAddressoshipCityoshipCountryoshipPostalCodeoshipRegion) WHERE c.OcustomerID = "FOLKO" WITH DISTINCT c SET c.OshipCountry = "Rainbow"PROFILE MATCH (c:OcustomerIDoshipAddressoshipCityoshipCountryoshipPostalCodeoshipRegion) WHERE c.OcustomerID = "FOLKO" SET c.OshipCountry = "Rainbow"
\(S_\text{Off-1}\)
Query 1
##### Original query
* `PROFILE MATCH (e:Entity) WHERE e.countries IS NOT NULL AND e.jurisdiction_description IS NOT NULL AND e.country_codes IS NOT NULL RETURN DISTINCT e.countries, e.country_codes`
##### After normalization
* `PROFILE MATCH (c:EcountriesecountryCodes) WHERE c.Ecountries IS NOT NULL AND c.EcountryCodes IS NOT NULL RETURN DISTINCT c.Ecountries, c.EcountryCodes`
* `PROFILE MATCH (:Entity)-[]->(c:EcountriesecountryCodes) WHERE c.Ecountries IS NOT NULL AND c.EcountryCodes IS NOT NULL RETURN DISTINCT c.Ecountries, c.EcountryCodes`
Query 2
Original query
PROFILE MATCH (e:Entity) WHERE e.countries IS NOT NULL AND e.service_provider IS NOT NULL AND e.country_codes IS NOT NULL RETURN DISTINCT e.countries,e.service_provider, e.country_codes
After normalization
PROFILE MATCH (e:Entity)-[]-(c:EcountriesecountryCodes) WHERE c.Ecountries IS NOT NULL AND c.EcountryCodes IS NOT NULL RETURN DISTINCT c.Ecountries, c.EcountryCodes, e.service_provider
Query 3
Original query
PROFILE MATCH (e:Entity) WHERE e.countries IS NOT NULL AND e.jurisdiction_description IS NOT NULL AND e.country_codes IS NOT NULL AND e.countries = "Singapore" SET e.country_codes = "Sp"
After normalization
PROFILE MATCH (c:EcountriesecountryCodes) WHERE c.Ecountries IS NOT NULL AND c.EcountryCodes IS NOT NULL AND c.Ecountries = "Singapore" SET c.EcountryCodes = "Sp"PROFILE MATCH (:Entity)-[]->(c:EcountriesecountryCodes) WHERE c.Ecountries IS NOT NULL AND c.EcountryCodes IS NOT NULL AND c.Ecountries = "Singapore" SET c.EcountryCodes = "Sp"PROFILE MATCH (:Entity)-[]->(c:EcountriesecountryCodes) WHERE c.Ecountries IS NOT NULL AND c.EcountryCodes IS NOT NULL AND c.Ecountries = "Singapore" WITH DISTINCT c SET c.EcountryCodes = "Sp"
\(S_\text{Ts-2}\)
Query 1
Original query
PROFILE MATCH ()-[t:STOPS_AT]->(s:Station) WHERE s.name IS NOT NULL AND t.code IS NOT NULL RETURN DISTINCT t.code, s.name
After normalization
PROFILE MATCH (n:Snametcode) RETURN DISTINCT n.Sname, n.TcodePROFILE MATCH (s:Station)-->(n:Snametcode), (t:STOPS_AT)-->(n), (t)-->(s) RETURN DISTINCT n.Sname, n.Tcode
Query 2
Original query
PROFILE MATCH (ts:TrainService)-[t:STOPS_AT]->(s:Station) WHERE ts.type CONTAINS "Nightjet" RETURN DISTINCT ts.number, ts.type, t.code
After normalization
PROFILE MATCH (ts:TrainService)-->(t:STOPS_AT)-->(s:Station), (ts)-->(d:Tsdatetsnumbertstype), (s)-->(n:Snametcode), (t)-->(n) WHERE d.Tstype CONTAINS "Nightjet" RETURN DISTINCT d.Tsnumber, d.Tstype, n.Tcode
Query 3
Original query
PROFILE MATCH (ts:TrainService) WHERE ts.type CONTAINS "Nightjet" RETURN DISTINCT ts.number, ts.completely_cancelled
After normalization
PROFILE MATCH (ts:TrainService)-->(d:Tsdatetsnumbertstype) WHERE d.Tstype CONTAINS "Nightjet" RETURN DISTINCT d.Tsnumber, ts.completely_cancelled
Query 4
Original query
PROFILE MATCH (ts:TrainService) WHERE ts.type CONTAINS "Sprinter" RETURN DISTINCT ts.number, ts.operator
After normalization
PROFILE MATCH (ts:TrainService)-->(d:Tsdatetsnumbertstype), (ts)-->(e:Tsdatetsnumbertsoperator) WHERE d.Tstype CONTAINS "Sprinter" RETURN DISTINCT d.Tsnumber, e.TsoperatorPROFILE MATCH (ts:TrainService)-->(d:Tsdatetsnumbertstype), (ts)-->(e:Tsdatetsnumbertsoperator) WHERE d.Tstype CONTAINS "Sprinter" RETURN DISTINCT d.Tsnumber, e.Tsoperator
Query 5
Original query
PROFILE MATCH ()-[t:STOPS_AT]->(s:Station) WHERE s.name IS NOT NULL AND t.code IS NOT NULL AND s.name = "Innsbruck Hbf" SET t.code = "Rainbow"
After normalization
PROFILE MATCH (n:Snametcode) WHERE n.Sname = "Innsbruck Hbf" SET n.Tcode = "Rainbow"PROFILE MATCH (s:Station)-->(n:Snametcode), (t:STOPS_AT)-->(n), (t)-->(s) WHERE n.Sname = "Innsbruck Hbf" SET n.Tcode = "Rainbow"PROFILE MATCH (s:Station)-->(n:Snametcode), (t:STOPS_AT)-->(n), (t)-->(s) WHERE n.Sname = "Innsbruck Hbf" WITH DISTINCT n SET n.Tcode = "Rainbow"
-
Philipp Skavantzos and Sebastian Link. 2025. Third and Boyce-Codd normal form for property graphs. VLDB J. 34, 2 (2025), 23. ↩↩↩
-
https://github.com/neo4j-partners/neo4j-transport-for-london ↩