daleatsafe commented ·
Hi all, what better way to start the year than to try out the new PDF reader in the FME 2018 betas? Builds 18236 and later have it. Get it from http://www.safe.com/download and let us know what you think. @ciarab @marko @redgeographics @geospatiallover @gschleusner @sigtill @cartoscro @dannymatranga @zubairsm FYI
cartoscro commented ·
I'm late to the party, but I vote for this. My primary use would be change detection between two GeoPDFs.
paalped commented ·
I use poppler to read PDFs as rasters. Basically, it just converts the PDF files to JPEGs, and then you read the JPEGs.
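For anyone who wants to try the poppler route described above, here is a minimal sketch in Python. It assumes poppler's pdftoppm command-line tool is installed and on the PATH. The function names, the 150 dpi default, and the output folder layout are illustrative choices, not something from the original post.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, output_prefix, dpi=150):
    """Return the pdftoppm argument list that renders each page to a JPEG.

    -jpeg selects JPEG output and -r sets the resolution in dpi.
    """
    return ["pdftoppm", "-jpeg", "-r", str(dpi), str(pdf_path), str(output_prefix)]


def pdf_to_jpegs(pdf_path, output_dir, dpi=150):
    """Render every page of pdf_path into output_dir and return the JPEG paths."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm (part of poppler) was not found on PATH")
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem
    # pdftoppm appends "-<page number>.jpg" to the prefix it is given.
    subprocess.run(build_pdftoppm_command(pdf_path, output_dir / stem, dpi), check=True)
    return sorted(output_dir.glob(stem + "-*.jpg"))
```

The resulting JPEG files can then be read with any raster reader. Note that the images carry no georeferencing, so this only suits cases where the picture itself is what matters.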
helmoet commented ·
I tried to read text from a PDF file using a PythonCaller and the pdfminer plugin, and it went pretty well. As a starting point, something like this:
import fme
import fmeobjects
import sys
import chardet
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage
from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
from pdfminer.layout import LAParams
from cStringIO import StringIO

# Template Function interface:
# When using this function, make sure its name is set as the value of
# the 'Class or Function to Process Features' transformer parameter
def processFeature(feature):
    data = FME_MacroValues['SourcePdfFile']
    fp = file(data, 'rb')
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    # Create a PDF interpreter object.
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    # Process each page contained in the document.
    for page in PDFPage.get_pages(fp):
        interpreter.process_page(page)
    data = retstr.getvalue()
    e = chardet.detect(data)
    u = None
    try:
        if e['confidence'] > 0.3:
            u = unicode(data, e['encoding'])
    except:
        pass
    if u:
        feature.setAttribute('pdfcontent', u)
    else:
        feature.setAttribute('pdfcontent', data)
    pass
erik_jan commented ·
At this moment I have no need for a PDF reader.
But I will vote for it, as it might speed up the improvements to the PDF writer that I do need:
https://knowledge.safe.com/idea/38680/better-pdf-writer-support.html
davidwesstrom commented ·
We have been using A-PDF Data Extractor to extract data from PDFs. We use a SystemCaller to connect to the app. We hope to see a similar feature directly in FME, without the need for a third-party app.
© 2019 Safe Software Inc.