Subscribe to DSC Newsletter

Exploring €1.3 trillion in public contracts with graph visualization

The European Union is giving free access to detailed information about public purchasing contracts. That data describes which European institutions are spending money, for what and who benefits from it. We are going to explore the network of public institutions and suppliers and look for interesting patterns with graph visualization.

Public spending in Europe is a €1.3 trillion business

Every year, more than €1.3 trillion in contracts are awarded by public entities in Europe. In an effort to make these public contracts more transparent, the European Union has decided to make the tenders announcements public. The information can be found online through the EU’s open data portalOpenTED, a non profit organization, has gone a step further and made the data available in a CSV format.

Public contracts are complex though. It involves at least one commercial entity which is awarded a contract by a public authority. The public entity may be acting as a “delegate” for another public entity. The contract can be disputed in a certain jurisdiction of appeal.

We have multiple entities and relationships. What this means is that the tenders data describe a graph or network. We are going to explore the tenders graph with Neo4j and Linkurious.

Modeling public contracts as a graph

We will focus on the 2015 tenders. There are 73,269 tenders in one single CSV file with 45 columns.

tenders excel

We decided to model the graph in the following way:

tender data model neo4j

The graph model above highlights the relationships between the contracts, appeal bodies, operators, delegates and authorities in our data.

To put it in a Neo4j database, we wrote a script that you can see here.

This script takes the 2015 tenders data and turn it into a Neo4j graph database with 161,541 nodes and 536,936 edges. Now that it’s in Neo4j, we can search, explore and visualize it with Linkurious. It’s time to start asking questions to the data!

The biggest tenders and the authorities and companies which are involved

As a first step, let’s try to identify the big public contracts which happened in Europe in 2015 and what organizations they involved. In order to get that answer, we’ll use Cypher(the Neo4j query language) within Linkurious.

Here’s how to find the 10 biggest contracts and the public authorities and commercial providers they involved.

// The top 10 biggest contracts and who they involve
WHERE b.contract_initial_value_cost_eur <> ‘null’
ORDER BY b.contract_initial_value_cost_eur DESC
RETURN a, c, b

The result is the following graph.

biggest public contracts

We can see for example that Ministry of Defence has awarded a large contract toBabcock International Group Plc.

Missed connections

The graph structure of the data also allows us to spot missed opportunities. Let’s take for example KPMG. It’s one of the biggest recipient of public contracts within our dataset. What other opportunities the company could have been awarded?

To answer that question, we can identify which of KPMG’s customers awarded contracts to its competitors.

Let’s identify the biggest missed opportunities of KPMG:

// KPMG’s biggest missed opportunities
MATCH (a:OPERATOR {official_name: ‘KPMG’})-[:IS_OPERATOR_OF]->(b:CONTRACT)<-[:IS_AUTHORITY_OF]-(customer:AUTHORITY)-[:IS_AUTHORITY_OF]->(lost_opportunity:CONTRACT)<-[:IS_OPERATOR_OF]-(competitor:OPERATOR)
WHERE NOT a-[:IS_OPERATOR_OF]->competitor
RETURN a,b,customer,lost_opportunity,competitor

We can visualize the result in Linkurious:

KPMG's network.

KPMG’s network.

KPMG is highlighted in red. It is connected to 10 contracts by 9 customers. These customers have awarded 246 other contracts to 181 firms.

This visualization could be used by KPMG to identify its “unfaithful” customers and its competitors. Specifically we may want to filter the visualization to focus on the contracts for similar services to the ones KPMG offer. To do that we will use the CPV code, a procurement taxonomy of products and services.

Here is a visualization filtered to only display contracts with similar CPV codes as KMPG’s contracts:

Filtering on contracts KPMG could have won.

Filtering on contracts KPMG could have won.

If we zoom in, we can see for example that GCS Uniha, one of KMPG’s customer, has also awarded contracts to some of KPMG’s competitor (Ernst and YoungPwC and Deloitte).

GCS Uniha has also awarded contracts to Ernst and Young, PwC and Deloitte.

GCS Uniha has also awarded contracts to Ernst and Young, PwC and Deloitte.

Exploring the French IT sector

Finally, we can use Linkurious to visualize a particular economic sector. Let’s focus for example on IT spending in France. Who are the biggest spenders in that sector? Which companies are capturing these contracts? Finally what are the relationships between all these organizations?

Using a Cypher query, we can identify the customers and suppliers linked to IT contracts (which all have a CPV code starting with “48”):

//The French public IT market
WHERE b.contract_cpv_code STARTS WITH “48” AND = ‘FR’
RETURN a,b,c

We can visualize the result directly in Linkurious.

Public IT contracts in France in 2015.

Public IT contracts in France in 2015.

If we zoom in, we can see that Conseil Général des Bouches du Rhône has awarded 9 IT contracts in 2015, mostly to Dalkia and Idex énérgies.

Conseil Général des Bouches du Rhône has awarded 9 IT contracts in 2015.

Conseil Général des Bouches du Rhône has awarded 9 IT contracts in 2015.

We have dived into €1.3 trillion of public spending using Neo4j and Linkurious. We were able to identify some key actors of the public contracts market and map interesting ecosystems. Want to explore and understand your graph data? Simply try the demo of Linkurious!

Views: 2204


You need to be a member of AnalyticBridge to add comments!

Join AnalyticBridge

Comment by Kouassi Eli on February 6, 2016 at 3:00am

Great job

Comment by Alfred on January 8, 2016 at 4:47am

good job!

Follow Us

On Data Science Central

On DataViz

On Hadoop

© 2016 is a subsidiary and dedicated channel of Data Science Central LLC   Powered by

Badges  |  Report an Issue  |  Terms of Service