FME 2003 poster: Data is Power

FME 2003 poster: Data is Power. With great data comes great responsibility.

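To paraphrase Stan Lee: with great data comes great responsibility. As industry professionals, part of that responsibility is ensuring our data is of good quality. Depending on the nature of one's work, making decisions based on bad data could have extreme ramifications.

Before trying to use a dataset for some important task, all aspects should be checked for completeness, correctness, consistency, and compliance. Good quality data means checking that it meets requirements, and then repairing whatever doesn't pass.

Below is a data quality checklist to help you validate and repair your geospatial data. The validation steps will vary depending on the data type (2D, GIS, raster, etc.), but this list should provide good guidance. You can validate your data using manual verification and out-of-the-box tools, or you can use FME to detect and repair problems automatically.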

Check the schema

The first step is to make sure the data model, or schema, is correct for your destination system. This includes enforcing the correct:

☑Feature type (i.e. layer, level, table, feature class) names.

☑Attribute names and types.

☑Coordinate system.

☑Allowed geometry types.
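
As a rough illustration of the schema checks above, here is a minimal Python sketch. The attribute names, types, and allowed geometry types are hypothetical placeholders, not part of any real data model, and coordinate system checks are left out because they depend on your format and tooling.

```python
# Hypothetical expected schema for one feature type; all names are illustrative.
EXPECTED_SCHEMA = {
    "parcel_id": int,
    "owner": str,
    "area_m2": float,
}
ALLOWED_GEOMETRIES = {"Polygon", "MultiPolygon"}


def check_schema(record, geometry_type):
    """Return a list of human-readable schema problems for one record."""
    problems = []
    # Every expected attribute must be present with the expected type.
    for name, expected_type in EXPECTED_SCHEMA.items():
        if name not in record:
            problems.append(f"missing attribute: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(record[name]).__name__}"
            )
    # Attributes that the destination schema does not know about.
    for name in record:
        if name not in EXPECTED_SCHEMA:
            problems.append(f"unexpected attribute: {name}")
    if geometry_type not in ALLOWED_GEOMETRIES:
        problems.append(f"geometry type not allowed: {geometry_type}")
    return problems
```

An empty list means the record fits the assumed schema; anything else can be collected for repair or reporting.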

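Check data values

Check the contents of the attributes, traits, rows, and/or other properties specific to your dataset.

☑Is the data type correct for this field?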

☑Is the value within the valid range or part of a domain or enumerated list?

☑Check for duplicates, e.g. in unique keys.

☑Check for nulls. Are there mandatory values, or are null / empty values allowed? Are the null types consistent (NaN, infinity, empty strings, etc.)?
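
The value checks above could be sketched in Python roughly as follows. The field names, the zone domain, and the area range are assumptions made up for the example, and the sketch assumes types have already been checked so that the range comparison is safe.

```python
import math

# Hypothetical rules for an illustrative dataset.
VALID_ZONES = {"residential", "commercial", "industrial"}
MANDATORY = ("parcel_id", "zone")


def is_null(value):
    """Treat None, empty strings, NaN, and infinity as one null concept."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, float):
        return math.isnan(value) or math.isinf(value)
    return False


def check_values(rows):
    """Yield (row_index, message) for every failed check."""
    seen_ids = set()
    for i, row in enumerate(rows):
        # Mandatory fields must not be null in any of its forms.
        for field in MANDATORY:
            if is_null(row.get(field)):
                yield i, f"{field} is mandatory but null"
        # Domain (enumerated list) check.
        zone = row.get("zone")
        if not is_null(zone) and zone not in VALID_ZONES:
            yield i, f"zone not in domain: {zone}"
        # Range check.
        area = row.get("area_m2")
        if not is_null(area) and not (0 < area <= 1_000_000):
            yield i, f"area_m2 out of range: {area}"
        # Unique key check.
        key = row.get("parcel_id")
        if not is_null(key):
            if key in seen_ids:
                yield i, f"duplicate parcel_id: {key}"
            seen_ids.add(key)
```

Funnelling every null flavour through one `is_null` helper is one way to keep the null handling consistent across checks.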

Validate geometry

Invalid geometry comes in many forms. Problems might include:

☑Self-intersections.

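☑Degenerate or corrupt geometries. For example, a donut with no holes, a multi with no parts, or an arc whose endpoints are inconsistent with the actual arc.

☑Null geometry.

☑Vertices with missing normals.

☑For geometries with textures, check the associated texture coordinates.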

☑Invalid solid boundaries (could include unclosed boundaries, invalid projection, incorrect face orientation, unused vertices, free faces).

☑Invalid solid voids. For example, a disconnected shell interior.

☑Non-planar surfaces, i.e. vertices are not on the same plane in 3D space.

☑Duplicate consecutive points, in 2D or 3D.
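
To make two of these geometry checks concrete, here is a small standard-library Python sketch for duplicate consecutive points and non-planar surfaces. The function names and tolerances are illustrative assumptions; a real workflow would normally rely on a dedicated geometry library or a validation tool rather than hand-rolled tests like these.

```python
import math


def duplicate_consecutive_points(coords, tolerance=0.0):
    """Return indices of points that repeat the point just before them (2D or 3D)."""
    return [
        i
        for i in range(1, len(coords))
        if math.dist(coords[i - 1], coords[i]) <= tolerance
    ]


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def is_planar(vertices, tolerance=1e-9):
    """Check that every 3D vertex lies within `tolerance` of one common plane."""
    origin = vertices[0]
    offsets = [_sub(v, origin) for v in vertices[1:]]
    # The first non-zero offset gives one direction inside the candidate plane.
    direction = next((d for d in offsets if _dot(d, d) > 0), None)
    if direction is None:
        return True  # all vertices coincide
    # The first offset not parallel to that direction fixes the plane normal.
    normal = None
    for d in offsets:
        n = _cross(direction, d)
        length = math.sqrt(_dot(n, n))
        if length > 0:
            normal = tuple(c / length for c in n)
            break
    if normal is None:
        return True  # all vertices are collinear
    # Distance of each vertex from the plane through `origin`.
    return all(abs(_dot(d, normal)) <= tolerance for d in offsets)
```

For example, `duplicate_consecutive_points([(0, 0), (1, 1), (1, 1), (2, 2)])` flags index 2, and a square with one corner lifted off the plane fails `is_planar`.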

Compliance to standards

Sharing data usually means ensuring it meets a set of standards, or enforcing its compliance with an initiative. For example:

☑OGC compliance involves checking for self-intersections, duplicate points, and similar geometry problems.

☑INSPIRE compliance involves a few things. Thankfully, we have a blog post and other resources to help you through it.

☑Other specific international standards or trade standards for data.

☑Your company’s standards. For example, you might need to design your own tests to ensure your topology or attributes meet your established design constructs.

Format-specific QA/QC

You should perform quality checks that are tailored to the format of your destination system, especially if you’ve converted from another data type. Examples include:

☑CAD data: ensure the robust extraction of layers, geometry, text, line types, blocks, extended entity data, etc.

☑XML/JSON: validate the syntax or schema.

☑Tabular data: ensure values pass logic tests; check integration with spatial details.

☑Databases: check the data and geometry before attempting to load it into a central repository.

☑Point clouds: check for the correct components and values.
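
As one example of a format-specific check, the sketch below validates JSON syntax with Python's standard `json` module and then applies a very small structural test. The required keys are an assumption loosely modelled on a GeoJSON-style document; full schema validation would need a dedicated schema validator, which is outside this sketch.

```python
import json


def validate_json_text(text, required_keys=("type", "features")):
    """Return (ok, message) after a syntax check and a minimal structure check."""
    try:
        document = json.loads(text)
    except json.JSONDecodeError as error:
        return False, (
            f"syntax error at line {error.lineno}, "
            f"column {error.colno}: {error.msg}"
        )
    if not isinstance(document, dict):
        return False, "top-level value is not an object"
    missing = [key for key in required_keys if key not in document]
    if missing:
        return False, "missing keys: " + ", ".join(missing)
    return True, "ok"
```

Returning a message alongside the pass/fail flag makes it easier to feed the result straight into a report.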

Workflow-based validation

If you work with data in an environment that demands a workflow (real time, self serve, automated uploads, or otherwise), your data would likely benefit from other validation techniques. For example:

☑Detect differences in an updated version of the same data.

☑Validate submitted data (via email, upload, directory watch, scheduled task) and immediately give feedback to stop bad data from being processed.

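☑Check that a submitted package contains the required files and formats, perform schema checks and data-level checks, then zip and encrypt the package for client download, as an example self-serve workflow. You might have other requirements.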
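
Detecting differences between two versions of a dataset can be sketched in Python as a keyed comparison. This assumes every record carries a stable unique key (here a hypothetical `id` field) and compares attributes only; comparing geometry would need its own tolerance rules.

```python
def detect_changes(old_rows, new_rows, key="id"):
    """Classify records as added, deleted, or updated between two versions."""
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}
    return {
        "added": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "updated": sorted(
            k for k in old.keys() & new.keys() if old[k] != new[k]
        ),
    }
```

Records whose key appears in both versions with identical content are treated as unchanged and left out of the result.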

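Repair and report bad data

Of course, after finding errors, inconsistencies, and/or non-compliance in any of the above checks, the next step is to repair the parts of the dataset that don't pass. Repair might include:

☑Map the schema to fit the destination data model.

☑Geometry manipulation. For example: snap dangling lines closed, clip overshooting lines, fill slivers, remove spikes, and bring vertices together if they are within a certain distance of each other.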

☑Enforce compliance with your company’s standards. For example: remove duplicate features, filter out the wrong kind of geometry, test for specific attribute values or ranges of values.

☑Simply flag the bad data and return it for human analysis.
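
One of the geometry repairs above, bringing vertices together when they are within a certain distance of each other, might look like this in Python. It is a deliberately simple greedy sketch: the result depends on input order, it compares every point against every anchor, and the tolerance value is an assumption you would tune to your data.

```python
import math


def snap_vertices(points, tolerance):
    """Snap each point to the first earlier anchor within `tolerance`."""
    anchors = []
    snapped = []
    for point in points:
        for anchor in anchors:
            if math.dist(point, anchor) <= tolerance:
                snapped.append(anchor)  # reuse the existing anchor position
                break
        else:
            # No anchor is close enough, so this point becomes a new anchor.
            anchors.append(point)
            snapped.append(point)
    return snapped
```

With a tolerance of 0.1, the points `(0.0, 0.0)` and `(0.05, 0.0)` end up sharing the first position, while points farther apart are left alone.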

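Regardless of the results of the validation checks, an intuitive report should be created that can be shared with interested parties. The final steps, therefore, are: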

☑Measure and describe the quality of the data in a standardized way, e.g. data download or PDF.

☑Send the report via email, SMS, form, etc.
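
A standardized quality summary can be as simple as a pass rate plus a count per failed check. The Python sketch below assumes problems arrive as hypothetical `(feature_id, check_name)` pairs and produces plain text that could be attached to an email or written to a file; the layout is only one possible choice.

```python
from collections import Counter


def summarize(total_features, problems):
    """Build a plain-text quality summary from (feature_id, check_name) pairs."""
    problems = list(problems)
    failed_ids = {feature_id for feature_id, _ in problems}
    by_check = Counter(check for _, check in problems)
    passed = total_features - len(failed_ids)
    rate = passed / total_features if total_features else 0.0
    lines = [
        f"Features checked: {total_features}",
        f"Features passed: {passed} ({rate:.1%})",
    ]
    # One line per check, sorted by name so reports are comparable over time.
    lines += [f"  {check}: {count}" for check, count in sorted(by_check.items())]
    return "\n".join(lines)
```

Keeping the same layout from run to run makes it easier to compare quality between deliveries.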

Printable graphic

Click the image to see the enlarged, printable version of The Ultimate Geospatial Data Validation Checklist.

Data QA Checklist

Automatic quality control with FME

It's easy to be overwhelmed by the size of the above list, but the important thing I want to stress here is that FME helps you do all of this automatically. For example, all geometry validation listed above can be done in one step using the GeometryValidator transformer, while custom checks against your own standards can also be set up using the intuitive Tester transformer. Browse these presentation slides to see exactly which transformers can help you validate different parts of the above checklist.

The following presentations and case studies provide some good examples of real-world data validation workflows.

What steps do you take to validate your data? What can you add to the above list?

FME switchboard ad, inspired by 艾伦菲纷


Tiana Warner

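Tiana is a Senior Marketing Specialist at Safe Software. Her background in computer programming and creative hobbies has made her one of the main producers of creative content for Safe Software. Tiana spends her free time writing fantasy novels, riding her horse, and exploring nature with her rescue pup, Joey.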

Comments

2 Responses to “The Ultimate Geospatial Data Validation Checklist”

  1. Pierre says:

    Hey, great idea & nice poster !

    If you post / send the source file, I will be happy to translate it in french

  2. Of course FME is a great tool to check the quality of your data. I have used it for that purpose in multiple cases.
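    To be able to check the quality, you should first know what your data should look like. What quality is needed for your purpose? When is it fit for use?
    Making your data 100% good could cost a lot, when, for example, it might take less time and money to make it good enough for your workflow/decision/use case.
    At Alterra, we created a data quality framework to help determine, as part of that process, the quality standard needed for a specific use case.
    Let me know if you are interested.

    Kind regards,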
    Maarten Storm.
    Geo-information Specialist & Quality Coordinator
    Alterra, Wageningen UR
