Gamingforce Interactive Forums
Searching for words within PDFs
Chocobo


Member 5982

Level 14.36

Apr 2006


Jul 27, 2007, 08:30 PM #1 of 7
Searching for words within PDFs

I have a collection of 100+ .pdf documents that are interrelated. Sometimes I would like to find a common word within these documents. Is there a way in which I can search within all of them to produce a result? Thank you
blue baby burgers!


Member 512

Level 20.64

Mar 2006


Jul 30, 2007, 12:20 AM #2 of 7
You could try "find | xargs grep wordswordswords", but that requires a Linux command line and plain text files. I'm pretty sure PDFs compress or otherwise encode their text, so running it on the PDFs directly is moot.
Try looking for a program to rip the text out of a PDF and then use the command above.
Ah, here we go. Google to the rescue.
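
For what it's worth, here is a rough sketch of the "find | xargs grep" idea once the text has been extracted. It assumes the extracted files end in .txt and that find and xargs support -print0 / -0 (GNU and BSD versions do); the function name search_text_files is made up for illustration.

```shell
# Hypothetical helper: list the text files under directory $1 that contain word $2.
search_text_files() {
    # -print0 / -0 keep file names with spaces intact (GNU and BSD tools).
    # grep -i ignores case; -l prints only the names of matching files.
    find "$1" -type f -name '*.txt' -print0 | xargs -0 grep -il -- "$2"
}
```

Calling it as search_text_files ~/docs chocobo would print one matching file name per line. It will not find anything in the PDFs themselves, only in text that has already been pulled out of them.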

Last edited by neus : Jul 30, 2007 at 12:23 AM.
Mountain Chocobo


Member 6745

Level 27.96

May 2006


Jul 30, 2007, 08:13 AM (local time: Jul 30, 2007, 01:13 PM) #3 of 7
There are pdf2txt utilities on Linux that extract the text from a PDF, so this shouldn't be a problem when scripting. You only get into trouble when the text in the PDF is in fact not text but a bitmap. Then even Acrobat Reader will fail when searching for text.
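
A minimal sketch of how such a script might look, assuming the pdftotext tool (from xpdf or poppler-utils) is installed and that all the PDFs sit in one directory. The function name search_pdfs and the PDF_TO_TEXT override are hypothetical, added only so a different extractor can be swapped in. PDFs that yield no text at all are reported on stderr, since they are probably scanned bitmaps.

```shell
# Hypothetical helper: print the PDFs in directory $1 whose extracted text
# contains word $2 (case-insensitive).
# Assumes pdftotext is installed; set PDF_TO_TEXT to any other extractor
# that accepts "input.pdf -" and writes the text to stdout.
search_pdfs() {
    dir=$1
    word=$2
    extractor=${PDF_TO_TEXT:-pdftotext}
    for pdf in "$dir"/*.pdf; do
        [ -f "$pdf" ] || continue    # skip when the glob matches nothing
        # "-" asks the extractor to write to stdout instead of a .txt file.
        text=$("$extractor" "$pdf" - 2>/dev/null </dev/null)
        if [ -z "$(printf '%s' "$text" | tr -d '[:space:]')" ]; then
            # Nothing but whitespace came out: likely a scanned image.
            echo "no extractable text (scanned image?): $pdf" >&2
        elif printf '%s\n' "$text" | grep -qi -- "$word"; then
            echo "$pdf"
        fi
    done
}
```

Running search_pdfs ~/docs chocobo would list the matching PDFs on stdout, while image-only files show up as warnings. Those files would need OCR before any text search can work on them.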
Professional Mac-head


Member 277

Level 15.11

Mar 2006


Jul 30, 2007, 09:23 AM (local time: Jul 30, 2007, 06:23 AM) #4 of 7
Macs index the contents of PDFs automatically, so they are searchable in Spotlight.

I was under the impression that any of the desktop search programs for Windows did the same thing (MSN/Google Desktop Search for XP, or the built-in search in Vista). Do they not?
killmoms - Well, don't really.
Mountain Chocobo


Member 6745

Level 27.96

May 2006


Jul 30, 2007, 06:33 PM (local time: Jul 30, 2007, 11:33 PM) #5 of 7
I think not. At least the normal search in Windows 2000 treats any file other than a standard text file as binary data.
hey YOU!


Member 1790

Level 16.72

Mar 2006


Aug 2, 2007, 12:25 AM #6 of 7
Use Foxit. Free Download
Chocobo


Member 23806

Level 9.15

Aug 2007


Aug 12, 2007, 02:46 AM (local time: Aug 12, 2007, 02:46 PM) #7 of 7
Quote: "There are pdf2txt utilities on Linux that extract the text from a PDF, so this shouldn't be a problem when scripting. You only get into trouble when the text in the PDF is in fact not text but a bitmap. Then even Acrobat Reader will fail when searching for text."
As LiquidAcid said, there's no reliable way to search for words if the words are actually images. But if they ARE text, Adobe Acrobat Reader itself has an option to search all the PDF files in an entire directory. The function is called Full Reader Search, I believe.
All times are GMT -5.