FME's Adobe Geospatial PDF Reader can extract a wide range of information from PDF documents. Imagery, rasters, vector data, text, spatial information, and attributes can all be read.
However, extracting information from a PDF document can be complex. One complication is that PDF is a document format, so contents can vary greatly: a document may spread a lot of information over many pages, contain maps (basically embedded pictures), or hold a CAD drawing with lines all over the page. This makes it hard to know how to read a PDF document before you have seen it and know what you need to extract from it. Sometimes you care about where information sits on the page; other times you simply want to extract the content and the location doesn't matter.
A PDF document in FME Data Inspector (left); the same PDF document in Adobe PDF Reader (right)
The PDF Reader has many options for extracting data. Your PDF may contain:
The main choice is whether to read the PDF as spatial or non-spatial (tabular). In other words, does the location of each feature on the page matter, or are you simply concerned with the page as a whole? It is also possible to select both the Spatial and Non-Spatial (tabular) PDF Reader options at the same time.
Detailed information about the Spatial parameter options can be found in the help documentation.
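The choice above can be sketched as a small decision helper. This is only an illustration of the reasoning, not part of FME: the `ReadingNeeds` class, the `choose_reader_modes` function, and the mode labels are hypothetical names invented for this example, and the actual selection is made in the PDF Reader parameters.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReadingNeeds:
    """Hypothetical description of what you want from the PDF."""
    location_matters: bool      # do feature positions on the page matter?
    page_as_a_whole: bool       # do you want page-level text, metadata, or a raster?


def choose_reader_modes(needs: ReadingNeeds) -> List[str]:
    """Return the reader options suggested by the needs (labels are illustrative)."""
    modes = []
    if needs.location_matters:
        modes.append("Spatial")
    if needs.page_as_a_whole:
        modes.append("Non-Spatial (tabular)")
    # Both options can be selected at the same time, so the list may hold two entries.
    return modes
```

For example, a document where you need both the map geometry and the page text would pass `ReadingNeeds(location_matters=True, page_as_a_whole=True)` and get both options back.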
The Spatial section applies when the PDF document contains information that has a particular location on the page. That location may translate to a specific location on the earth if one or more coordinate systems are defined for the PDF document. PDF documents can contain multiple coordinate systems per page.
If you would like to display PDF data in the Data Inspector with a background map, it is necessary to set Coordinate Units to Geospatial (if possible). PDF data can be displayed with a background map in the Data Inspector only if a coordinate system exists.
Detailed information about the Non-Spatial parameter options can be found in the help documentation.
If your PDF document contains tabular data, it is possible to extract metadata and text, and even to rasterize the entire PDF page. The Non-Spatial Metadata parameter can be useful for extracting information such as attributes, or information about the document, including its creation date.
This article covers how to read a simple PDF which contains a title, a couple of maps, some text, and a table.
Learn how to inspect and extract the content of PDF map frames.
More PDF reading articles are in progress and coming soon!
© 2019 Safe Software Inc | Legal