Description
I've just spent some time implementing support for gvar's shared point data, and I had an idea for an alternative approach to computing the shared points that I believe should reduce binary size in some cases, at the cost of a more complicated implementation.
The basic idea is simple: we should sometimes force some tuples to use a non-optimal encoding when the cost in bytes of doing so is less than the cost of encoding the optimal private point numbers.
To illustrate this idea, imagine a set of tuples with the following point indices:
```
point#  0 1 2 3 4 5 6 7 8 9
tuple1  x   x   x x
tuple2      x   x     x   x
tuple3  x   x   x x   x   x
tuple4  x   x     x   x   x
tuple5  x       x x   x   x
tuple6      x   x x   x   x
```
In this pathological case, where no pointsets are shared, we currently spend 29 of 137 bytes encoding private points. The pointset of the third tuple is a superset of all the other pointsets, and it costs 6 bytes to encode. If we were to share this set across all the tuples, we would save 23 bytes encoding pointsets, at the cost of encoding 7 additional deltas (two each for tuples 1 and 2, and one each for tuples 4–6). Assuming each delta requires one byte per axis (it could be less if the deltas are zeros, and possibly more if they're large), we save 29 - (6 + 7 * 2) = 9 bytes in this example, which comes out to a ~6.6% reduction in overall size.
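To sanity-check the arithmetic, here's a small Python sketch using the same simplified cost model (one byte per private point index, one byte per axis per extra delta — assumptions that ignore run headers and the cheaper encoding of zero deltas):

```python
# Point counts per tuple, taken from the example above.
private_sizes = [4, 4, 6, 5, 5, 5]
shared_size = 6   # tuple3's superset, encoded once and shared
axes = 2          # one delta byte per axis (x and y) — an assumption

current_cost = sum(private_sizes)                           # 29 bytes of private points
extra_deltas = sum(shared_size - n for n in private_sizes)  # 7 points gain deltas
new_cost = shared_size + extra_deltas * axes                # 6 + 14 = 20
savings = current_cost - new_cost                           # 9 bytes

print(current_cost, extra_deltas, savings)  # 29 7 9
```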
Obviously real-world fonts differ significantly from this example, but I believe the same principle applies and is at least worth investigating. It should not be too hard to hack together a naive implementation that iteratively combines point sets, keeping combinations that produce space savings; this would at least give us a real sense of whether real-world savings exist.
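A rough sketch of what that naive experiment could look like, in Python. The cost model here is deliberately crude (one byte per point index in each distinct encoded set, two bytes per extra delta) and is a placeholder for the real packed-point-number and delta encoding sizes; `greedy_combine` and its helpers are hypothetical names, not anything in an existing codebase:

```python
from itertools import combinations

DELTA_BYTES = 2  # assumed: one byte per axis (x and y) for each extra delta

def total_cost(originals, assigned):
    """Bytes under the simplified model: one byte per point index for each
    distinct encoded set, plus DELTA_BYTES for every point a tuple gained."""
    point_bytes = sum(len(s) for s in set(assigned))
    delta_bytes = sum((len(a) - len(o)) * DELTA_BYTES
                      for o, a in zip(originals, assigned))
    return point_bytes + delta_bytes

def greedy_combine(originals):
    """Iteratively merge pairs of point sets, keeping merges that save bytes.

    Each tuple starts with its own point set; a merge replaces two distinct
    sets with their union, and every tuple using either set switches to the
    combined (non-optimal, but shared) set.
    """
    originals = [frozenset(s) for s in originals]
    assigned = list(originals)
    improved = True
    while improved:
        improved = False
        base = total_cost(originals, assigned)
        for a, b in combinations(set(assigned), 2):
            union = a | b
            trial = [union if s in (a, b) else s for s in assigned]
            if total_cost(originals, trial) < base:
                assigned = trial
                improved = True
                break
    return assigned
```

On a toy input shaped like the example above (five pointsets that are subsets of a sixth), this converges on sharing the superset, reproducing the 9-byte saving; the interesting question is what it does on real fonts.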