Validating RDF Data: a Community Survey

This webpage gives an overview of the results from the RDF Validation Survey, specifically for the answers collected between December 2024 and March 2025. The survey itself is still available here.

All graphs were generated from the cleaned data, which is available as CSV files or as a ready-made DuckDB file in the associated Git repository. The SQL query that retrieves the data for each graph is listed below that graph. You can try the queries directly in the browser: copy the SQL associated with a chart and paste it into the DuckDB browser shell.
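As a rough illustration of the kind of aggregate query behind these charts, the sketch below counts respondents per professional background. It is not one of the published queries: the table name `responses`, the column `professional_background`, and the sample rows are made up for illustration, and Python's built-in `sqlite3` module stands in for DuckDB so that the example runs without extra dependencies. The `GROUP BY` statement uses only standard SQL, so it should carry over to the DuckDB shell once adapted to the real table and column names.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one of the cleaned CSV files;
# the real files use their own column names and values.
SAMPLE_CSV = """respondent_id,professional_background
1,Academia
2,Industry
3,Academia
4,Public sector
5,Industry
6,Academia
"""

# Standard SQL: one row per answer with the number of respondents.
QUERY = """
SELECT professional_background AS answer, COUNT(*) AS respondents
FROM responses
GROUP BY professional_background
ORDER BY respondents DESC, answer ASC
"""


def count_by_background(csv_text: str) -> list[tuple[str, int]]:
    """Load CSV text into an in-memory table and count rows per answer."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    con = sqlite3.connect(":memory:")  # sqlite3 is a stand-in for DuckDB here
    try:
        con.execute(
            "CREATE TABLE responses "
            "(respondent_id INTEGER, professional_background TEXT)"
        )
        con.executemany(
            "INSERT INTO responses "
            "VALUES (:respondent_id, :professional_background)",
            rows,
        )
        return con.execute(QUERY).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for answer, respondents in count_by_background(SAMPLE_CSV):
        print(f"{answer}: {respondents}")
```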

This study is a direct result of the community's engagement, and we offer our warmest thanks to every participant who took the time to share their perspective!

Highlighted Insights

This section highlights key figures and tables as presented in the accompanying paper.

Professional Background

Professional Background of Respondents (2025 vs 2022)

Application Domains

Main Application Domains for SHACL or ShEx

Years of Experience

Years of Experience with RDF Validation/Shape Technologies by Professional Background

Language Familiarity

Familiarity with RDF Validation Technologies

Methods for Shape Creation

Comparison of Shape Creation Methods (2025 vs 2022)

Validation Usage Frequency

Frequency of Working with RDF Shape Languages by Professional Background

SHACL-SPARQL Usage Frequency

Frequency of Using SPARQL-based Constraint Components

SHACL-SPARQL Motivation

Motivation for Using SPARQL-based Constraint Components

Methods for Validating Evolving KGs

Adapting Validation Processes for Evolving Knowledge Graphs

Graph Size vs. Performance Concerns

Performance Concerns in Relation to Graph Size at Validation Time

Frequency of Use vs. Advanced Features

Usage of Advanced SHACL Features Based on Overall Usage Frequency

Shape Generation/Extraction Tools

Tools Used for Shape Generation/Extraction (2025 vs 2022)

SHACL Validators

SHACL Validation Software/Framework Usage by Professional Background

Advanced SHACL Feature Usage

Usage of Advanced SHACL Features by Professional Background

Graph Size at Validation Time

Typical RDF Graph Size at Validation Time by Professional Background

Perceived Advantages

Main Perceived Advantages of SHACL/ShEx

Reported Issues and Limitations

Reported Issues, Limitations, and Desired Enhancements

Validation Report Usefulness

Features to Improve Usefulness of Validation Reports

Desired Future Language Features

Desired Future Extensions of SHACL Standard

Overview

This overview shows the number of respondents per answer for each of the questions from the original survey.
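Several questions allow multiple selections, so a single respondent can contribute to more than one answer count. The sketch below shows one way such a tally could be computed in Python. The `;` separator, the answer labels, and the function name are assumptions for illustration and may not match how the cleaned CSV files encode multi-select answers. Empty cells are treated as skipped questions, since most questions were optional.

```python
from collections import Counter


def respondents_per_answer(cells: list[str], sep: str = ";") -> Counter:
    """Count how many respondents selected each answer option.

    Each element of `cells` is one respondent's raw answer to a single
    multi-select question. An empty cell means the question was skipped.
    The separator is an assumption, not the dataset's documented format.
    """
    counts: Counter = Counter()
    for cell in cells:
        # A set ensures a respondent is counted at most once per option.
        selected = {part.strip() for part in cell.split(sep) if part.strip()}
        counts.update(selected)
    return counts


if __name__ == "__main__":
    # Hypothetical answers from four respondents; the third skipped the question.
    sample = ["SHACL;ShEx", "SHACL", "", "SHACL; SPARQL"]
    for answer, n in respondents_per_answer(sample).most_common():
        print(f"{answer}: {n}")
```

Because each respondent may select several options, the counts for a multi-select question can add up to more than the number of respondents.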

1 Background and Demographics

Question 1: What is your professional background?

Question 2: How many years of experience do you have with RDF validation/shape technologies?

Question 3: Do you work with different graph models or RDF standards? Please select all that apply.

2 Usage Context and Specifics

Question 4: What RDF validation technologies do you have experience with? Please select all that apply.

Question 5: How frequently do you work with RDF shape languages?

Question 6: What are the main data domains you use SHACL or ShEx for? Please select all that apply.

Question 7: In your projects, how do SHACL and ShEx affect data quality and consistency?

Question 8: If you had to choose one language for future projects, which would it be?

3 Validation Usage

Question 9: When validating, how many distinct classes and properties does your data graph usually have?

Question 10: When validating, how large is your RDF graph usually?

Question 11: If you are a user of SHACL, which validation software/framework do you use?

Question 12: If you are a user of ShEx, which validation software/framework do you use?

Question 13: What do you consider the main advantages of SHACL/ShEx in your work or projects? Please select all that apply.

Question 14: Have you encountered any limitations or challenges when using SHACL/ShEx?

Question 15: Have you worked with evolving or dynamic RDF knowledge graphs? If yes, how do you adapt your validation processes?

4 Specific Features of SHACL

Question 16: How often do you use SHACL Core constraint components (for example, sh:minCount)?

Question 17: Do you use SPARQL-based constraint components?

Question 18: If you use SPARQL-based constraint components, please specify the reason:

Question 19: Is the information provided by the validation report sufficient for your use cases?

Question 20: Which of the SHACL advanced features do you use?

Question 21: Which extensions of the SHACL standard do you wish for?

5 Shape Creation and Extraction

Question 22: How do you generate validating shapes (SHACL/ShEx)?

Question 23: Which tools or methods do you use to extract/create validating shapes (SHACL/ShEx)?

Question 24: How many shapes do you usually generate?

Question 25: How large is your ontology usually in the context of shape extraction?

Question 26: How many distinct predicate types are present in your RDF graph usually?

Question 27: How large is your RDF graph usually in the context of shape extraction?

Question 28: Do you generate shapes for the entire graph or only for some portions?

6 Future Development

Question 29: What are the shortcomings you experienced when working with SHACL/ShEx? Please select all that apply.

Question 30: What features would improve the usefulness of validation reports? Please select all that apply.

Methodology

The online survey targeted practitioners, researchers, and developers experienced with RDF data validation, SHACL, or ShEx. The data presented here was collected from December 2024 to March 2025. The survey was administered anonymously using Google Forms to protect participant privacy.

We distributed the survey through multiple channels to reach a broad yet relevant audience. We used social media platforms, including the SHACL community Discord group and the authors' LinkedIn and Twitter profiles, and also circulated the survey on the Semantic Web mailing list. For targeted outreach, we ran a DBLP query to identify authors of relevant publications and manually contacted them by email, where an address was given in the paper, to invite them to participate. Because the survey is completely anonymous, the results rely on the honesty of the respondents.

The questionnaire was designed for flexibility. Most questions were optional and offered closed-ended answer options, often allowing multiple selections. To capture richer, more nuanced feedback, some questions also included a free-text option. Where trends were observed in these responses, they were coded and included in the available dataset.

Source data

The raw and cleaned data used for generating these charts can be found on the associated Git repository.