@@ -14,7 +14,6 @@ Example
 
 id: myjob
 time_limit: 60 # seconds
-proxy: 127.0.0.1:8000 # point at warcprox for archiving
 ignore_robots: false
 max_claimed_sites: 2
 warcprox_meta:
@@ -219,16 +218,6 @@ enforced at the seed level. If a time limit is specified at the top level, it
 is inherited by each seed as described above, and enforced individually on each
 seed.
 
-``proxy``
-~~~~~~~~~
-+--------+----------+---------+
-| type   | required | default |
-+========+==========+=========+
-| string | no       | *none*  |
-+--------+----------+---------+
-HTTP proxy, with the format ``host:port``. Typically configured to point to
-warcprox for archival crawling.
-
 ``ignore_robots``
 ~~~~~~~~~~~~~~~~~
 +---------+----------+-----------+
@@ -259,8 +248,8 @@ to contact the operator if the crawl is causing problems.
 +============+==========+===========+
 | dictionary | no       | ``false`` |
 +------------+----------+-----------+
-Specifies the ``Warcprox-Meta`` header to send with every request, if ``proxy``
-is configured. The value of the ``Warcprox-Meta`` header is a json blob. It is
+Specifies the ``Warcprox-Meta`` header to send with every request, if warcprox
+is enabled. The value of the ``Warcprox-Meta`` header is a json blob. It is
 used to pass settings and information to warcprox. Warcprox does not forward
 the header on to the remote site. For further explanation of this field and
 its uses see