Sample Document URL Schemes
Andy Jackson edited this page Aug 10, 2015 · 2 revisions
The “Document URL Scheme” acts as a filter that keeps unwanted PDFs out of the Watched Target crawl. The general rule for defining a Document URL Scheme is to start with the URL of an interesting Document, remove the leading `http://` or `https://`, and then trim the resulting string to the part that you expect to be common to all related PDFs.
Several examples of this process are given below:
| Property | URL |
|---|---|
| Document URL | http://www.ofsted.gov.uk/sites/default/files/documents/surveys-and-good-practice/a/Are%20you%20ready%20Good%20practice%20in%20school%20readiness.pdf |
| Document URL Scheme | www.ofsted.gov.uk/sites/default/files |
| Document URL | http://www.ifs.org.uk/uploads/publications/comms/r96.pdf |
| Document URL Scheme | www.ifs.org.uk/uploads/publications/comms |
| Document URL | https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/325491/family-resources-survey-statistics-2012-2013.pdf |
| Document URL Scheme | www.gov.uk/government/uploads/system/uploads/attachment_data/file |
| Document URL | http://bmdynamics.com/issue_pdf/bmd110501-%2001-14.pdf |
| Document URL Scheme | bmdynamics.com/issue_pdf |
| Document URL | http://www.cipd.co.uk/binaries/labour-market-outlook_2014-summer.pdf |
| Document URL Scheme | www.cipd.co.uk/binaries |
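
The rule illustrated in the table can be sketched in a few lines of Python. This is only an illustration, not the crawler's own code: the function names `strip_protocol`, `suggest_scheme` and `matches_scheme` are hypothetical, and the sketch assumes that a URL passes the filter when, with its protocol removed, it starts with the Document URL Scheme. The actual matching rule used by the Watched Target crawl may differ.

```python
import os.path
import re


def strip_protocol(url: str) -> str:
    """Remove a leading http:// or https:// from a URL."""
    return re.sub(r"^https?://", "", url, flags=re.IGNORECASE)


def suggest_scheme(document_urls: list[str]) -> str:
    """Suggest a Document URL Scheme from one or more example Document URLs.

    Hypothetical helper: takes the longest prefix shared by all examples
    and trims it back to the last complete path segment. A curator may
    still want to trim the result further by hand.
    """
    stripped = [strip_protocol(u) for u in document_urls]
    prefix = os.path.commonprefix(stripped)
    # Drop the trailing file name or partial segment after the last "/".
    return prefix.rpartition("/")[0]


def matches_scheme(url: str, scheme: str) -> bool:
    """Assumed filter rule: the protocol-less URL starts with the scheme."""
    return strip_protocol(url).startswith(scheme)


if __name__ == "__main__":
    example = "http://www.ifs.org.uk/uploads/publications/comms/r96.pdf"
    scheme = suggest_scheme([example])
    print(scheme)
    print(matches_scheme(example, scheme))
```

Given a single example URL, `suggest_scheme` simply returns its directory, which matches the `www.ifs.org.uk` row above. Other rows, such as the `www.ofsted.gov.uk` one, are trimmed further than that, so the suggestion is best treated as a starting point rather than a final value.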