Skip to content

axel-angel/UrlObs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

48 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

UrlObs

Url Observer: monitor changes of web pages according to user-defined XPath rules. Written in Perl, use YAML config format. Dead simple.

Require (cpan install):

  • LWP::UserAgent
  • HTML::TreeBuilder::XPath
  • XML::XPath
  • XML::XPath::XMLParser
  • YAML::Syck
  • Digest::MD5
  • Text::Diff

Use:

perl urlobs.pl example.yaml

YAML format is a list of entry where:

  • url: is the url
  • xpath: XPath rule selecting monitored nodes (text extracted)
  • title: used for humans [optional]
  • interval: is the minimum fetching interval (seconds) [optional]
  • type: whether it is html or xml [missing means html]
  • no_order: whether we do a sorting (effectively ignoring the output ordering)
  • keep_old: whether we should ignore deletions by keeping old entries (useful for youtube RSS, they blink). This option forces no_order by default.
  • user_agent: Specify the browser signature, useful to circumvent certain useless site protections [optional].
  • cookie: raw data sent directly as the Cookie header [optional]

The file is modified by the program as a database. Look at example.yml for an example.

About

Url Observer (Perl+YAML, detect change)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published