FME 2003 poster: Data is Power. With great data comes great responsibility.

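To paraphrase Stan Lee: with great data comes great responsibility. As industry professionals, part of that responsibility is ensuring our data is of good quality. Depending on the nature of one's job, making decisions based on bad data could have extreme ramifications.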

Before trying to use a dataset for some important task, all aspects should be checked for completeness, correctness, consistency, and compliance. Good quality data means checking that it meets the requirements, and then fixing whatever doesn't pass.

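Below is a data quality checklist that can help you validate and repair your geospatial data. The validation steps will vary depending on the type of data (2D, GIS, raster, etc.), but this list should provide good guidance. You can validate your data using manual checks and out-of-the-box tools, or you can use FME to detect and repair problems automatically.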


The first step is to make sure the data model, or schema, is correct for your destination system. This includes enforcing the correct:


☑Attribute names and types.

☑Coordinate system.
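
As a rough illustration of the schema checks above, here is a minimal sketch in plain Python (not FME). The attribute names, the expected types, the `EPSG:4326` target, and the dictionary layout of a feature are all assumptions made up for this example; a real destination system defines its own.

```python
# Hypothetical destination schema: attribute name -> expected Python type.
EXPECTED_ATTRIBUTES = {"parcel_id": int, "owner": str, "area_m2": float}
# Hypothetical coordinate system the destination expects.
EXPECTED_CRS = "EPSG:4326"


def check_schema(feature, crs):
    """Return a list of human-readable schema problems (empty if none)."""
    problems = []
    attributes = feature.get("attributes", {})

    # Missing or mistyped attributes.
    for name, expected_type in EXPECTED_ATTRIBUTES.items():
        if name not in attributes:
            problems.append(f"missing attribute: {name}")
        elif not isinstance(attributes[name], expected_type):
            actual = type(attributes[name]).__name__
            problems.append(f"{name}: expected {expected_type.__name__}, got {actual}")

    # Attributes the destination schema does not know about.
    for name in attributes:
        if name not in EXPECTED_ATTRIBUTES:
            problems.append(f"unexpected attribute: {name}")

    # Coordinate system mismatch.
    if crs != EXPECTED_CRS:
        problems.append(f"coordinate system: expected {EXPECTED_CRS}, got {crs}")

    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report everything that needs fixing in a single pass.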





☑Is the value within the valid range or part of a domain or enumerated list?


☑Check for nulls. Are there mandatory values, or are null / empty values allowed? Are the null types consistent (NaN, infinity, empty strings, etc.)?
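
The value checks above could look something like the following in plain Python. This is only a sketch: the attribute names, the range, the domain, and the choice to treat NaN, infinity, and empty strings as null are assumptions for illustration, and it assumes attribute types have already been checked.

```python
import math

# Hypothetical rules for this example.
VALID_RANGES = {"elevation_m": (-500.0, 9000.0)}
VALID_DOMAINS = {"land_use": {"residential", "commercial", "industrial"}}
MANDATORY = {"elevation_m"}


def normalize_null(value):
    """Map the different null-like representations onto None."""
    if value is None:
        return None
    if isinstance(value, str) and value.strip() == "":
        return None
    if isinstance(value, float) and (math.isnan(value) or math.isinf(value)):
        return None
    return value


def check_values(attributes):
    """Return a list of problems with nulls, ranges, and domains."""
    problems = []
    cleaned = {name: normalize_null(value) for name, value in attributes.items()}

    # Mandatory attributes must not be null (or absent).
    for name in MANDATORY:
        if cleaned.get(name) is None:
            problems.append(f"{name}: mandatory value is null")

    # Numeric values must fall inside their valid range.
    for name, (low, high) in VALID_RANGES.items():
        value = cleaned.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}: {value} is outside {low}..{high}")

    # Coded values must belong to their enumerated domain.
    for name, domain in VALID_DOMAINS.items():
        value = cleaned.get(name)
        if value is not None and value not in domain:
            problems.append(f"{name}: {value!r} is not in the domain")

    return problems
```

Normalizing every null-like value to a single representation first keeps the later checks simple and makes the null handling consistent across the dataset.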


Invalid geometry comes in many forms. Problems might include:






☑Invalid solid boundaries (could include unclosed boundaries, invalid projection, incorrect face orientation, unused vertices, free faces).


☑Non-planar surfaces, i.e. vertices are not on the same plane in 3D space.

☑Duplicate consecutive points, in 2D or 3D.
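
To make two of the geometry checks above concrete, here is a minimal plain-Python sketch (not FME's GeometryValidator) that removes duplicate consecutive points and tests whether a 3D surface is planar. The function names and the tolerance defaults are assumptions for this example, and points are assumed to be simple coordinate tuples.

```python
import math


def remove_duplicate_consecutive_points(points, tolerance=0.0):
    """Drop each point that matches the last kept point (2D or 3D tuples)."""
    cleaned = []
    for point in points:
        if cleaned and all(abs(a - b) <= tolerance for a, b in zip(point, cleaned[-1])):
            continue
        cleaned.append(point)
    return cleaned


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def is_planar(vertices, tolerance=1e-9):
    """True if all 3D vertices lie on one plane, within `tolerance`."""
    if len(vertices) < 4:
        return True  # three or fewer points always share a plane
    origin = vertices[0]
    offsets = [_sub(v, origin) for v in vertices[1:]]

    # First offset that is not zero-length.
    first = next((o for o in offsets if _dot(o, o) > 0.0), None)
    if first is None:
        return True  # every vertex coincides with the first one

    # Plane normal from the first offset that is not parallel to `first`.
    normals = (_cross(first, o) for o in offsets)
    normal = next((n for n in normals if _dot(n, n) > 0.0), None)
    if normal is None:
        return True  # all vertices are collinear

    # Distance of each vertex from the plane through `origin`.
    length = math.sqrt(_dot(normal, normal))
    return all(abs(_dot(o, normal)) / length <= tolerance for o in offsets)
```

For example, `is_planar([(0, 0, 0), (1, 0, 0), (1, 1, 0.5), (0, 1, 0)])` is false because the third vertex is lifted off the plane of the others. Real data usually needs a tolerance matched to the coordinate precision rather than the tiny default used here.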

Compliance to standards

Sharing data usually means ensuring it meets a set of standards, or enforcing its compliance with an initiative. For example:


☑INSPIRE compliance involves a few things. Thankfully, we have a blog post and other resources to help you accomplish this.

☑Other specific international standards or trade standards for data.

☑Your company’s standards. For example, you might need to design your own tests to ensure your topology or attributes meet your established design constructs.

Format-specific QA/QC

You should perform quality checks that are tailored to the format of your destination system, especially if you’ve converted from another data type. Examples include:

☑CAD data: ensure the robust extraction of layers, geometry, text, line types, blocks, extended entity data, etc.



☑Databases: check the data and geometry before attempting to load it into a central repository.


Workflow-based validation

If you work with data in an environment that demands a workflow (real time, self serve, automated uploads, or otherwise), your data would likely benefit from other validation techniques. For example:

☑Detect differences in an updated version of the same data.

☑Validate submitted data (via email, upload, directory watch, scheduled task) and immediately give feedback to stop bad data from being processed.

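☑Check that a submitted package contains the required files and formats, perform schema checks and data-level checks, then zip and encrypt the package for the client to download, as an example self-serve workflow. You may have other requirements.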
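
The first step of the self-serve example above, confirming that a submitted package contains the required files, might be sketched like this in plain Python. The list of required extensions is a hypothetical requirement chosen for illustration, and the sketch deliberately stops before the schema checks and the encryption step.

```python
import zipfile
from pathlib import PurePosixPath

# Hypothetical requirement: the package must contain each of these file types.
REQUIRED_EXTENSIONS = {".shp", ".shx", ".dbf", ".prj"}


def check_package(zip_source):
    """Return the sorted required extensions missing from a zip package.

    `zip_source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as package:
        present = {PurePosixPath(name).suffix.lower() for name in package.namelist()}
    return sorted(REQUIRED_EXTENSIONS - present)
```

An empty result means the package can move on to the schema and data-level checks; anything else can be sent straight back to the submitter as feedback.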



☑Map the schema to fit the destination data model.


☑Enforce compliance with your company’s standards. For example: remove duplicate features, filter out the wrong kind of geometry, test for specific attribute values or ranges of values.



☑Measure and describe the quality of the data in a standardized way, e.g. data download or PDF.
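
Enforcing company standards, as in the duplicate-feature and geometry-type examples above, could be sketched as follows in plain Python. The GeoJSON-like feature layout and the allowed geometry types are assumptions for this example, and "duplicate" here means an exact match on the whole feature.

```python
import json

# Hypothetical company standard: only polygon geometry is accepted.
ALLOWED_GEOMETRY_TYPES = {"Polygon", "MultiPolygon"}


def enforce_standards(features):
    """Split features into (accepted, rejected); rejected pairs carry a reason."""
    seen = set()
    accepted, rejected = [], []
    for feature in features:
        # A canonical JSON string acts as a fingerprint for exact duplicates.
        fingerprint = json.dumps(feature, sort_keys=True)
        if fingerprint in seen:
            rejected.append((feature, "duplicate feature"))
        elif feature["geometry"]["type"] not in ALLOWED_GEOMETRY_TYPES:
            rejected.append((feature, "geometry type not allowed"))
        else:
            accepted.append(feature)
        seen.add(fingerprint)
    return accepted, rejected
```

Keeping the rejected features together with a reason, rather than silently dropping them, gives you the raw material for a quality report.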


Printable graphic

Click the image to see the enlarged, printable version of The Ultimate Geospatial Data Validation Checklist.

Data QA Checklist

Automatic quality control with FME

It's easy to be overwhelmed by the size of the above list, but the important thing I want to stress here is that FME helps you do all of this automatically. For example, all geometry validation listed above can be done in one step using the GeometryValidator transformer, while custom checks against your own standards can also be set up using the intuitive Tester transformer. Browse these presentation slides to see exactly which transformers can help you validate different parts of the above checklist.

The following presentations and case studies provide some good examples of real-world data validation workflows.

What steps do you take to validate your data? What can you add to the above list?

FME Switch ad - inspired by Ellen Feiss

Tiana Warner



  Pierre says:

    Hey, great idea & nice poster !

    If you post / send the source file, I will be happy to translate it into French

  2. Of course FME is a great tool to check the quality of your data. I have used it for that purpose in multiple cases.
    To be able to check the quality, you should first know what your data should look like. What is the quality needed for your purpose? When is it fit for use?
    Let me know if you are interested.

    Maarten Storm.
    Geo-information Specialist & Quality Coordinator
    Alterra, Wageningen UR

