Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offline support with service worker #978

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open

Offline support with service worker #978

wants to merge 22 commits into from

Conversation

xtuc
Copy link
Member

@xtuc xtuc commented Nov 15, 2016

First naive implementation to support offline mode.

Refers to issue #647

Some points:

  • Search is broken (search will work only for already searched items because Algolia doesn't support offline mode)
  • Handle trailing slashes in URLs (cache first strategy won't get invalid URL)
  • Remove worker development logs
  • Add offline indicator ?
  • If the ressource is not in the cache, show an offline page ?

Note that it is necessary to know the files to be cached in advance. That's why I need Jekyll templating in service-worker.js.

@thejameskyle @marcobiedermann What do you think ?

@boopathi
Copy link
Member

Why not use sw-toolbox ? ... which provides a declarative way of handling routes

@xtuc xtuc changed the title Offiline support with service worker Offline support with service worker Nov 19, 2016
var swOptions = {
debug: true
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be good to use a versioned cache name.

const VERSION = 1;

toolbox.cache.name = "Babel-Cache-" + VERSION;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also,

toolbox.cache.maxEntries = 20 // or 15

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will add cache versioning.

What do you think about a max age ? Like 7 days ?

I don't know if limiting entries is a good idea. I pre-cache already all pages.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL for sw-toolbox and sw-precache here. Great to see you using it :)

If you're going to set maxEntries to 20, it means that after the 21st entry is cached, the least-recently used entry would be automatically deleted. Unless you're storing a number of really large assets, I'd consider whether this is really that valuable. I don't think the Babel site would benefit from setting this all that much.

I do think there's use in imposing a maxAgeSeconds age. @boopathi can probably comment to the length of time, but a week doesn't sound like a bad ballpark.


toolbox.precache(preCachedRessources);

toolbox.router.get('/(.*)', toolbox.cacheFirst, swOptions);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see requests to these domains as well,

toolbox.router.get("/*", toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });
toolbox.router.get("/*", toolbox.cacheFirst, { origin: "cdn.jsdelivr.net" });
toolbox.router.get("/*", toolbox.cacheFirst, { origin: "unpkg.com" }); // for repl

Copy link
Member Author

@xtuc xtuc Nov 21, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will add these origins.

I'm currently searching a way to respond to Algolia queries. It seems to not support offline mode. For pre-caching I already use a list of all pages, it should be easy to search in it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Algolia queries are POST queries - toolbox.router.post(...)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on how you want to handle post requests, you could do toolbox.router.post('/endpoint', toolbox.networkOnly). This means you're not just returning a cached response but are instead stating that requests made to endpoint need to be network only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do networkOnly, but @xtuc wanted that the user be able to search even when offline - as mostly it is just static data updated via commits. So, it could be networkFirst with a separate cache name and maxEntries.

'{{ "/css/main.css" | prepend: site.baseurl }}?t={{ site.time | date_to_xmlschema }}"',
{% for page in site.pages %}
'{{ page.url }}',
{% endfor %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if the following is valid. I see this to be valid at-least for page transitions from and to REPL.

The Service Worker shouldn't change from pageA to pageB if both the pages are under the scope of the same service worker. This is because, service worker updates based on the byte diff, and during a page change, a new service worker will install (if the resources are different -> i.e new service-worker.js).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the strategy is cacheFirst I removed the file main.css which has a time in it.

@xtuc
Copy link
Member Author

xtuc commented Nov 21, 2016

The offline indicator looks like this :

offline-babel

Might not be the most beautiful way to do.

@xtuc
Copy link
Member Author

xtuc commented Nov 23, 2016

Or the grey version:

offline-grey

@xtuc
Copy link
Member Author

xtuc commented Nov 23, 2016

Label version:

offline-label

@existentialism
Copy link
Member

Nice work!

I feel like the notification should stand out enough to draw attention, but not be distracting. Maybe black (fits color scheme and contrasts with the yellow header)?

image

@boopathi
Copy link
Member

boopathi commented Nov 23, 2016

I tried a variant of grey version. I like the label version. My opinion is that removing/changing yellow might give the feeling of offline.

screenshot 2016-11-23 22 20 51

@addyosmani
Copy link

This is all really awesome to see come together. Fwiw, we're seeing a few sites choose the grayed out pattern for indicating offline (coupled with a top or bottom bar with the 'offline' labelling):

Some alternatives we've suggested elsewhere include showing a toast when switching online state or just having something prominent in the UI that can be toggled depending on when the user has a network connection again.

scripts/index.js Outdated
window.addEventListener('offline', function() {
offlineIndicator.removeClass("hidden");
});

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly leaving here as a note in case helpful later :)

The navigator.{online/offline} events may not accurately indicate that you can or can't access the network. There are well documented gotchas where you might need additional means to check that you're really online. One is checking connection loss by making failed XHR requests (retry a request a few times, if it doesn't go through, you're definitely offline).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@addyosmani flipkart is awesome !

@ikhattab
Copy link

using glyphicon glyphicon-off and background color of #f55555 and we can fade color to green and fade out the offline div when the user is online

screen shot 2016-11-24 at 12 11 53 am

@marcobiedermann
Copy link
Contributor

@xtuc Awesome, great work you did there. I quite like that there would be a little indicator on the top which shows if the user is offline.
Looking forward getting this into production 😄 🎉

toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdn.jsdelivr.net" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "unpkg.com" }); // for repl
toolbox.router.post('/*', toolbox.cacheFirst, { origin: "algolia.net" }); // Cache Algolia search response

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this in your earlier commits, but is the goal here to always return a cached post response from Algolia or to only have this work when the network is available?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to keep search even once offline. The goal here is to cache some responses from Algolia because it does not support offline searching. I'm not sure about this since the user will be able to find what he already searched online.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@addyosmani I guess not. it should be networkFirst.

@xtuc cacheFirst, networkFirst, fastest - all these ultimately hit the cache in some way.

@mbj36
Copy link

mbj36 commented Nov 23, 2016

'offline fallback' for the last point would do the same. We have to make a offline page for it. We have to add service worker 'fetch' event in index.js .

Any idea what should offline page looks like ?

@jeffposnick
Copy link

Apologies for the drive-by feedback. I work on sw-toolbox/sw-precache, and I've spent some time thinking about best practices for using service workers with templated sites, so I wanted to throw in my two cents.

I'll leave some inline feedback on this PR, and I've started a series of posts laying out some best practices for this scenario, with part 1 at https://jeffy.info/2016/11/02/offline-first-for-your-templated-site-part-1.html


importScripts('/scripts/sw-toolbox.js');

const VERSION = 1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will you make sure that VERSION is incremented whenever any existing content anywhere on the site is updated? If VERSION doesn't get incremented, then the previously cached content for a given URL will never get refreshed, since there's a cacheFirst policy being used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sure. We could use the site.date from Jekyll. But this way the entier cache will be invalided even if the content hasn't changed.

toolbox.precache(preCachedRessources);

toolbox.router.get('/*', toolbox.cacheFirst);
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd probably want to configure a different cache name here the runtime CDN caches, so that they're reused indefinitely and don't become irrelevant whenever you change the default toolbox.cache.name value above.

If you're using cacheFirst strategy for them, please make sure that all of the URLs that they refer to are versioned. E.g. https://unpkg.com/[email protected]/dist/react.min.js includes 15.3.1 in the URL, so it's fine to use cacheFirst. But if you request URLs like https://unpkg.com/react/dist/react.min.js, once it's cached for the first time, the initial version will be reused indefinitely (at least until the cache name changes).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes good point. This is part of good practice I guess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked, resources from CDN (Cloudflare and jsdelivr) have a version in the URL.

toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdn.jsdelivr.net" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "unpkg.com" }); // for repl
toolbox.router.post('/*', toolbox.cacheFirst, { origin: "algolia.net" }); // Cache Algolia search response

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know exactly what algolia.net is used for specifically, but if it's search-related, then cacheFirst doesn't sound like a good strategy to use. networkFirst would normally be more appropriate, since you'd want fresh search results, but could fall back to stale results if the network is unavailable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

algolia.net is for the search functionality

screen shot 2016-11-15 at 11 00 34 am

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm now thinking that idea wasn't that good. (#978 (comment))


toolbox.precache(preCachedRessources);

toolbox.router.get('/*', toolbox.cacheFirst);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super broad. Are you sure you're okay with every URL under your origin being served cacheFirst? And aside from the caching strategy, are you sure that any time there's any request made for a resource on your site, you want that added to the cache? If there are large images or other large media, then that's not necessarily a good approach.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted an offline documentation. This caching strategy might indeed not be the best approach.


const VERSION = 1;

toolbox.cache.name = "Babel-Cache-" + VERSION;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything here that will delete old caches once this cache name changes. That means that you'll end up with many copies of your full site in the SW site, each time you change the name.

Normally, you'd want to clean up unneeded caches inside an activate handler.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to Jeff's comment. You'll want to iterate through your cache entries and cache.delete doing something along these lines:

// Where urlsToCacheKeys is a map of your items to cache
self.addEventListener('activate', function(event) {
  let setOfExpectedUrls = new Set(urlsToCacheKeys.values());
  event.waitUntil(
    caches.open(cacheName).then(function(cache) {
      return cache.keys().then(function(existingRequests) {
        return Promise.all(
          existingRequests.map(function(existingRequest) {
            if (!setOfExpectedUrls.has(existingRequest.url)) {
              return cache.delete(existingRequest);
            }
          })
        );
      });
    }).then(function() {
      return self.clients.claim();
    })
  );
});

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than cleaning manually the cache, I think we need to limit the cache using cache.maxEntries or cache.maxAgeSeconds.
The old cached website(s) will be cleaned up given these configurations.


var preCachedRessources = [
{% for page in site.pages %}
'{{ page.url }}',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're precaching entire HTML documents, then you're going to end up populating user's caches with a lot of duplicated data. I.e. if your HTML shares common headers/footers (because they're all using some underlying layouts) then the structure of those headers and footers will be take up space on your user's disk when they don't actually add any extra value.

The ideal implementation would just cache the layout elements once, then cache the underlying content once, and then perform the templating logic in the service worker to combine them at runtime.

I've got a very experimental example of this implemented at https://github.com/jeffposnick/jeffposnick.github.io/tree/work/src, but I don't know that it's stable enough to recommend using for a high-traffic site.

toolbox.router.get('/*', toolbox.cacheFirst);
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdn.jsdelivr.net" });
toolbox.router.get('/*', toolbox.cacheFirst, contentCacheOptions);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

networkFirst or fastest - cacheFirst is never going to hit the network after first time.

@xtuc
Copy link
Member Author

xtuc commented Nov 26, 2016

An other try :

offline-v

var offlineIndicator = $("#offline-indicator");

window.addEventListener('online', function() {
offlineIndicator.addClass("hidden");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: quotes are inconsistent (single quotes are used elsewhere)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks. I updated both files with single quotes.

var VERSION = '{{ site.time }}';

var contentCacheOptions = {
name: "Babel-Cache-" + VERSION,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same in this file

@xtuc
Copy link
Member Author

xtuc commented Dec 1, 2016

@boopathi @marcobiedermann @mathiasbynens @jeffposnick what are you thought about this ?

@jeffposnick
Copy link

The current implementation, with toolbox.fastest for (most of) the requests means that you'll have a much better upgrade story, and you'll avoid the issues I've previously pointed out where content would be cached once and potentially reused indefinitely. I don't see any specific shooting-yourself-in-the-foot issues.

What I'd recommend is that you try doing some sanity checks in a local deployment by keeping an eye on Chrome DevTool's Cache Storage display in the Applications panel, and the Network panel.

The Cache Storage display will give you a list of URLs that are being cached, and you can verify that you're not accidentally caching anything that you don't intend to. (Your /* route is very broad, and it's possible that there are requests that the site is making which you realize don't need to be cached.)

@xtuc
Copy link
Member Author

xtuc commented Dec 3, 2016

Thanks a lot @jeffposnick for your explanations.

LGTM, @hzoo @boopathi can we merge it ?

@xtuc xtuc requested review from existentialism and hzoo December 8, 2016 11:51
@xtuc xtuc requested a review from raspo December 17, 2016 16:58
@stale stale bot added the stale label Jun 12, 2017
@stale
Copy link

stale bot commented Jun 12, 2017

This issue has been automatically marked as stale because it has not had recent activity 😴. It will be closed if no further activity occurs. Thank you for your contributions 👌!

@stale stale bot closed this Jun 20, 2017
@hzoo hzoo reopened this Jun 20, 2017
@stale stale bot removed the stale label Jun 20, 2017
@babel-bot
Copy link
Contributor

babel-bot commented Nov 4, 2017

Deploy preview ready!

Built with commit 18d2580

https://deploy-preview-978--babel.netlify.com

@thenamankumar
Copy link

@jeffposnick @xtuc Hello I am new here and trying to contribute.

I was thinking of if we can have different named caches for different kind of assets. Like for images we can have a separate cache, so that we can manage it differently and even it may improve the performance while matching requests from the cache.

Also I think it would be safer to go with networkFirst for the website assets, with cache maxAge.

And do we need maxEntries? Can't we let it be maximum as they will get expired by age and so the the caching of assets won't be limited.

Copy link
Member Author

@xtuc xtuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove the scripts/sw-toolbox.js in the sources now.

, cdnCacheOptions = {
name: 'cdn',
maxAgeSeconds: 60 * 60 * 6
};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please make multiple var declarations instead here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.