Commit 0b8a8f0

live rewrite: catch errors from live rewrite and raise a new LiveResourceException with a 400 status code,
indicating a bad request for a live resource. Add tests for invalid live rewrite requests.
1 parent 2f50a3e commit 0b8a8f0

4 files changed

Lines changed: 28 additions & 4 deletions

CHANGES.rst

Lines changed: 4 additions & 2 deletions
@@ -1,9 +1,11 @@
 pywb 0.5.0 changelist
 ~~~~~~~~~~~~~~~~~~~~~

+* Catch live rewrite errors and display more friendly pywb error message.
+
 * LiveRewriteHandler and WBHandler refactoring: LiveRewriteHandler now supports a root search page html template.

-* Proxy mode option: 'unaltered_replay' to proxy archival data with no modifications (no banner, no server or client side rewriting)
+* Proxy mode option: 'unaltered_replay' to proxy archival data with no modifications (no banner, no server or client side rewriting).

 * Fix client side rewriting (wombat.js) for proxy mode: only rewrite https -> http in absolute urls.

@@ -13,7 +15,7 @@ pywb 0.5.0 changelist

 The handler, specified via the ``fallback`` option, can be the name of any other replay handler. Typically, it can be used with a live rewrite handler to fetch missing content from live instead of showing a 404.

-* Live Rewrite can now be included as a 'collection type' in a pywb deployment by setting index path to ``$liveweb``
+* Live Rewrite can now be included as a 'collection type' in a pywb deployment by setting index path to ``$liveweb``.

 * ``live-rewrite-server`` has optional ``--proxy host:port`` param to specify a loading live web data through an HTTP/S proxy, such as for use with a recording proxy.

pywb/cdx/cdxserver.py

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ def _check_cdx_iter(self, cdx_iter, query):
                 return self.load_cdx(**fuzzy_query_params)

         msg = 'No Captures found for: ' + query.url
-        raise NotFoundException(msg)
+        raise NotFoundException(msg, url=query.url)

     def _calc_search_keys(self, query):
         return calc_search_range(url=query.url,

pywb/webapp/live_rewrite_handler.py

Lines changed: 15 additions & 1 deletion
@@ -6,6 +6,14 @@

 from replay_views import RewriteLiveView

+from pywb.utils.wbexception import WbException
+
+
+#=================================================================
+class LiveResourceException(WbException):
+    def status(self):
+        return '400 Bad Live Resource'
+

 #=================================================================
 class RewriteHandler(SearchPageWbUrlHandler):
@@ -17,7 +25,13 @@ def __call__(self, wbrequest):
         if wbrequest.wb_url_str == '/':
             return self.render_search_page(wbrequest)

-        return self.rewrite_view(wbrequest)
+        try:
+            return self.rewrite_view(wbrequest)
+
+        except Exception as exc:
+            url = wbrequest.wb_url.url
+            msg = 'Could not load the url from the live web: ' + url
+            raise LiveResourceException(msg=msg, url=url)

     def __str__(self):
         return 'Live Web Rewrite Handler'
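
The change to `__call__` follows a common wrap-and-re-raise pattern: any failure while loading from the live web is converted into one exception type whose `status()` reports a 400. The sketch below illustrates that pattern in isolation. It is not pywb code: the `WbException` class here is a stand-in whose constructor and default status are assumptions, and `fetch_live`, `handle`, and `status_for` are hypothetical helpers introduced only for this example.

```python
class WbException(Exception):
    """Stand-in for pywb.utils.wbexception.WbException (shape assumed)."""

    def __init__(self, msg=None, url=None):
        super().__init__(msg)
        self.msg = msg
        self.url = url

    def status(self):
        # Assumed default status for a generic error
        return '500 Internal Server Error'


class LiveResourceException(WbException):
    def status(self):
        return '400 Bad Live Resource'


def fetch_live(url):
    """Hypothetical live fetch that always fails, for illustration."""
    raise IOError('cannot resolve host for ' + url)


def handle(url):
    """Wrap any failure from the live fetch in a LiveResourceException."""
    try:
        return fetch_live(url)
    except Exception:
        msg = 'Could not load the url from the live web: ' + url
        raise LiveResourceException(msg=msg, url=url)


def status_for(url):
    """Return the status line a caller would report for this url."""
    try:
        handle(url)
    except WbException as exc:
        return exc.status()
    return '200 OK'
```

One consequence of catching the broad `Exception` is that unrelated programming errors inside the view would also surface as a 400, so a narrower `except` clause may be preferable where the set of expected network errors is known.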

tests/test_live_rewriter.py

Lines changed: 8 additions & 0 deletions
@@ -23,4 +23,12 @@ def test_live_rewrite_frame(self):
         assert '<iframe ' in resp.body
         assert 'src="/rewrite/mp_/http://example.com/"' in resp.body

+    def test_live_invalid(self):
+        resp = self.testapp.get('/rewrite/mp_/http://abcdef', status=400)
+        assert resp.status_int == 400
+
+    def test_live_invalid_2(self):
+        resp = self.testapp.get('/rewrite/mp_/@#$@#$', status=400)
+        assert resp.status_int == 400
+
2634
