Hi,
I have a table containing IDs and URLs referring to online PDFs. For each record in the table I want to search for a specific string in the associated PDF. If the string has a match in the PDF, I want to retrieve the ID from the record containing the URL to that PDF. I tried using a FeatureReader and the PDF reader but didn't get far, so any help would be welcome.
Idris
Hi @idrispeiren,
Please could you provide some sample data to help us understand exactly what you are trying to achieve?
For some reason it won't work if I plug the URL straight into a FeatureReader, but if I use an HTTPCaller to save a local copy of the PDF and then open that with the FeatureReader, it does work.
Note that I strongly recommend a Decelerator. You will be hitting the web server that hosts the PDFs once per feature, so that's over 1700 times for this dataset. If you do that at FME's regular speed, it might overload the server or be seen as a DDoS attack (I've done that once).
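For anyone who wants to see the same idea outside a workspace, here is a minimal Python sketch of "download a copy of each PDF, but pause between requests". It is not what the HTTPCaller and Decelerator do internally. The names `fetch_url` and `download_throttled` and the one-second default delay are assumptions made up for this illustration.

```python
import time
from typing import Callable, Dict, Iterable, Tuple
from urllib.request import urlopen


def fetch_url(url: str, timeout: float = 30.0) -> bytes:
    """Download one URL and return the raw response body."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_throttled(
    records: Iterable[Tuple[str, str]],
    delay_seconds: float = 1.0,  # hypothetical default, tune it to the server
    fetch: Callable[[str], bytes] = fetch_url,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bytes]:
    """Fetch each (record_id, url) pair in turn, pausing between requests."""
    results: Dict[str, bytes] = {}
    for index, (record_id, url) in enumerate(records):
        if index > 0:
            sleep(delay_seconds)  # pause before every request except the first
        try:
            results[record_id] = fetch(url)
        except OSError:
            # urllib's URLError and HTTPError derive from OSError;
            # skip this record and carry on with the next one.
            continue
    return results
```

Records whose download fails are simply left out of the result, so one bad URL does not stop the rest of the run.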
You're also very much dependent on how the PDF is structured. The first one, which I used as a sample, appears to be a fairly good one, but there's no guarantee they'll all be like that. If it's a scanned form, you're out of luck.
A very important parameter is in the FeatureReader: make sure that in the PDF parameters there you set the Spatial Text one to "Feature Per Block". That way it tries to make one text object per line.
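Once the text comes out as one block per feature, the matching step is a substring test per block. Here is a rough Python sketch of that logic, assuming each block arrives as an `(id, text)` pair and that a case-insensitive match is wanted. The function name `ids_with_match` is invented for this example; inside a workspace a Tester or StringSearcher would play this role.

```python
from typing import Iterable, List, Tuple


def ids_with_match(blocks: Iterable[Tuple[str, str]], needle: str) -> List[str]:
    """Return record ids, in first-seen order, whose text contains needle."""
    wanted = needle.casefold()  # casefold() gives a case-insensitive comparison
    matched: List[str] = []
    seen = set()
    for record_id, text in blocks:
        # Report each id once, even if several of its blocks match.
        if record_id not in seen and wanted in text.casefold():
            seen.add(record_id)
            matched.append(record_id)
    return matched


blocks = [
    ("A", "Zone voor Bedrijvigheid"),
    ("A", "bedrijvigheid, tweede vermelding"),
    ("B", "Zone voor wonen"),
    ("C", "BEDRIJVIGHEID"),
]
print(ids_with_match(blocks, "bedrijvigheid"))  # prints ['A', 'C']
```

Because the test is a plain substring check, a word that is split across two text blocks would not be found.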
Thanks for the sample, that's what I wanted to achieve! One more question: how can I keep the initial attributes for the matched records (they seem to be on the Initiator port)?
Check the accumulation mode of the FeatureReader; if you set it to "Merge initiator and result", it should do the trick.
Hi, I'm deploying the model for all of the data and I'm encountering another problem. The FME model is stopping for some reason, although "Ignore Failed Readers" is set to "Yes".
Any suggestions here?
Idris
DSI_terreinen_in_planning_v6_stringsearch_categoriebedrijvigheid.fmw
Something different now: "PDF Reader: Failed to open document 'xxx.pdf' because the file is not in PDF format, or because it is corrupted."
"Ignore Failed Readers" is set to "Yes", but the translation is still aborted.
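One possible workaround, offered as a guess and not a confirmed fix, is to check each downloaded file before it reaches the PDF reader and route the bad ones elsewhere. A PDF file starts with the marker `%PDF-`, so a response that is really an HTML error page can be caught with a simple byte test. The sketch below assumes the downloaded bytes are available (for example in a PythonCaller); `looks_like_pdf` and the 1024-byte window are choices made for this example. It only detects files that are not PDFs at all, and would not detect a PDF that is damaged further in.

```python
def looks_like_pdf(data: bytes, search_window: int = 1024) -> bool:
    """Return True if the PDF header marker appears near the start of data."""
    # Only the first search_window bytes are inspected, so some leading
    # junk before the header is tolerated.
    return b"%PDF-" in data[:search_window]


print(looks_like_pdf(b"%PDF-1.7\n..."))              # prints True
print(looks_like_pdf(b"<html><body>404</body></html>"))  # prints False
```

Features that fail the check can be logged with their ID and URL, so the broken links in the table can be followed up separately.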
Hi Holly,
Here's a screenshot of the table:
For each record in the table I want to search for a specified string (e.g. bedrijvigheid) in the PDF linked from the attribute "Stedenbouwkundige voorschriften" (sorry for the Dutch terms). If the string has a match in the PDF, I want to keep that record. A sample file is included as an attachment: Link_pdf.gdb.zip
© 2019 Safe Software Inc