Idea by fmelizard · Oct 29, 2015 at 03:34 PM · readers pdf

PDF Reader

Complementing the PDF writer (which is being unified from separate 2D/3D variants), this one would read vector/raster features out of geospatial PDFs.

daleatsafe commented · Jan 04, 2018 at 12:57 AM

Hi all -- what better way to start the year than to try out the new PDF reader in the FME 2018 betas. Builds 18236 and later have it. Get it from http://www.safe.com/download and let us know what you think. @ciarab @marko @redgeographics @geospatiallover @gschleusner @sigtill @cartoscro @dannymatranga @zubairsm FYI

ciarab · Jan 04, 2018 at 10:24 AM

@croningarrett our long-awaited PDF reader ;)

cartoscro commented · Dec 15, 2017 at 01:24 PM

I'm late to the party, but I vote for this. My primary use would be change detection between two GeoPDFs.

paalped commented · Dec 07, 2017 at 09:58 AM

I use poppler to read PDFs as raster. Basically, it just converts the PDF files to JPEGs, and then you read the JPEGs.

https://poppler.freedesktop.org/
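
For anyone wanting to try this route, here is a minimal sketch of the conversion step, assuming poppler's pdftoppm utility is installed and on the PATH. The function names, the default resolution, and the output directory layout are illustrative choices, not part of poppler or FME.

```python
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, out_prefix, dpi=150):
    """Return the argument list for rendering every page of a PDF to JPEG."""
    # -jpeg selects JPEG output, -r sets the resolution in DPI.
    return ["pdftoppm", "-jpeg", "-r", str(dpi), str(pdf_path), str(out_prefix)]


def pdf_to_jpegs(pdf_path, out_dir, dpi=150):
    """Render the PDF with pdftoppm and return the JPEG paths it produced."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem
    subprocess.run(build_pdftoppm_command(pdf_path, out_dir / stem, dpi), check=True)
    # pdftoppm names its output <prefix>-<page number>.jpg
    return sorted(out_dir.glob(stem + "-*.jpg"))
```

The resulting files can then be read like any other JPEG. Note that a plain render like this does not carry over any georeferencing stored in the PDF.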

ottadini · May 16, 2018 at 12:41 AM

Me too, but on MS Windows the latest binary I could find was for v0.51, quite a way behind the latest. Not that it seems to matter for simple image extraction.

13gamat commented · Aug 11, 2017 at 09:26 PM

Please build something for PDF converter!

13gamat commented · Aug 11, 2017 at 08:28 PM

Is there a PDF to Excel reader in FME?

stalknecht commented · Jul 17, 2017 at 09:07 AM

There is also a custom reader at the hub:

PDF2TextReader

helmoet commented · Jul 16, 2017 at 07:50 PM

I tried to read text from a PDF file using a PythonCaller and the pdfminer plugin, and it went pretty well. As a start, something like this:

# Python 2 code (uses cStringIO, file() and unicode())
import fme
import fmeobjects
import sys
import chardet
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage
from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
from pdfminer.layout import LAParams
from cStringIO import StringIO

# Template Function interface:
# When using this function, make sure its name is set as the value of
# the 'Class or Function to Process Features' transformer parameter
def processFeature(feature):
    data = FME_MacroValues['SourcePdfFile']
    fp = file(data, 'rb')
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    # Create a PDF interpreter object.
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    # Process each page contained in the document.
    for page in PDFPage.get_pages(fp):
        interpreter.process_page(page)
    # Collect the extracted text once all pages are processed, then clean up.
    data = retstr.getvalue()
    device.close()
    fp.close()

    # Guess the encoding; only trust the guess above a confidence threshold.
    e = chardet.detect(data)
    u = None
    try:
        if e['confidence'] > 0.3:
            u = unicode(data, e['encoding'])
    except:
        pass

    # Store the decoded text if decoding worked, otherwise the raw bytes.
    if u:
        feature.setAttribute('pdfcontent', u)
    else:
        feature.setAttribute('pdfcontent', data)
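
The encoding-detection step at the end of that snippet can be isolated into a small helper. Below is a rough Python 3 sketch of the same idea, assuming `detected` is a dict shaped like the result of `chardet.detect` (with `encoding` and `confidence` keys). The helper name is hypothetical, and unlike the snippet above it falls back to UTF-8 with replacement characters rather than returning the raw bytes.

```python
def decode_with_fallback(data, detected, min_confidence=0.3):
    """Decode bytes using a detected encoding, if the guess is trusted.

    `detected` is assumed to look like chardet.detect() output:
    {"encoding": <str or None>, "confidence": <float>}.
    """
    encoding = detected.get("encoding")
    confidence = detected.get("confidence") or 0.0
    if encoding and confidence > min_confidence:
        try:
            return data.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes that do not fit the guess.
            pass
    # Fallback (a choice made for this sketch): lossy UTF-8 decoding.
    return data.decode("utf-8", errors="replace")
```

In a PythonCaller this would be called with the extracted bytes and the `chardet.detect` result before setting the `pdfcontent` attribute.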

dannymatranga commented · Jul 12, 2017 at 02:51 PM

Any notable progress on the PDF reader?

erik_jan commented · Apr 17, 2017 at 05:38 PM

At this moment I have no need for a PDF reader.

But I will vote for it as it might speed up the improvements for the PDF writer that I do need:

https://knowledge.safe.com/idea/38680/better-pdf-writer-support.html

davidwesstrom commented · Apr 13, 2017 at 11:14 AM

We have been using A-PDF Data Extractor to extract data from PDFs. We use a SystemCaller to connect to the app. We hope to see a similar feature directly in FME, without the need for a third-party app.
