[ech] would a callback for ECH retry-configs be useful?

Tue Apr 18 15:14:34 UTC 2023

Hi David,

Thanks for all that, which makes sense (and yet again
shows that I don't know about big servers;-). I'll look
at mimicking that approach.

One question: did you do anything that enables a server
to return a borked ECHConfigList as the retry-config for
test purposes? (I guess I can do up a special server for
that, but it'd be nicer were it possible to do such tests
as part of the openssl ``make test`` target.)

Cheers,
S.

On 18/04/2023 16:06, David Benjamin wrote:
> I don't think using the most recent config is right. We started with that
> in an early prototype, but quickly realized that doesn't work. For
> BoringSSL's API, when we make a config set, we just let you mark which
> configs are to be sent as retry configs and which aren't. I don't think you
> need a callback.
> 
> First, consider a service backed by multiple servers, as in most large
> deployments. It is impossible to atomically deploy something across all
> servers concurrently, but you also need to rotate keys, so your deployment
> model needs to take all this into account. You'll need to satisfy the
> following:
> 
> - New ECHConfigs, when they're generated, do not end up in DNS until all
> [or almost all] your servers support it
> - DNS caches should be assumed stale, so servers also need to support the
> last few generations of keys
> - The recovery flow makes a new connection, so it may hit a new server.
> Thus retry configs should also be ones that are expected to be supported by
> all servers
> 
> Put all this together and you get something of this rough shape:
> 
> - Servers keep a window of N configs. The most recent or so is in the
> process of being rolled out and may not be available on all its peers yet.
> Then there's one config that the server believes is rolled out to all its
> peers. *That* one should be the retry config. Then it also retains N-2 or
> so configs past that to deal with stale caches.
> 
> - Periodically, some provisioning process generates a new ECHConfig,
> prepends it to this list, retires the oldest one, and deploys the new
> window to all servers. The new config is *not* put in DNS yet because not
> all servers have it.
> 
> - Once the new config is sufficiently rolled out, put the new config in
> DNS, replacing the previous config. Possibly also do a round of rollout to
> all the servers to tell them they can bump the retry config forward by one
> generation, though the generation before that should also work fine
> provided N is large enough.
> 
> Second, there may be more than one retry config, hence why it's an
> ECHConfigList. ECH's extensibility model is that the server presents
> multiple ECHConfigs and the client picks one. Suppose an ECH server
> supports both P-256 and X25519 KEMs. It would then provision pairs of
> (P-256, X25519) configs every time it rotates. So while I said N configs
> above, it's really N pairs (or more depending on what your generation size
> is). The thing that goes in DNS and retry configs is one full generation of
> configs. That means your API should be able to express that.
> 
> Put another way: the retry config should be the answer to "what would I, an
> individual server instance, want to put in DNS, as of the information I
> have right now?" Having a simple boolean associated with each config
> suffices to let the caller control this, which is what we've done. (Though
> looking back, I see we didn't document the details here as well as I
> thought we had. I'll see about fixing that...)
> 
> On Sun, Apr 16, 2023 at 9:08 PM Stephen Farrell <stephen.farrell at cs.tcd.ie>
> wrote:
> 
>>
>> Hiya,
>>
>> I've been adding code for testing badly encoded ECH stuff
>> to my branch, esp. for EncodedClientHelloInner which is the
>> new thing that could cause server bugs. That's in [1] and
>> seems like a reasonable start to doing that well. And that
>> approach (for testing) also seems to work ok for badly
>> constructed values for the ECH acceptance signal in SH.random
>> or within HRRs.
>>
>> One problem I've not solved (within the test harness) is
>> how to do similarly for the retry-config values returned
>> by a server when the wrong ECH public value is used by a
>> client (or if a client GREASEs). Right now, a server (that
>> has some ECH private values loaded) will return the ECHConfig
>> corresponding to the most recently loaded ECH private value,
>> which I think is reasonable.
>>
>> However, for testing, it might be useful to enable a server
>> to trigger a callback, so that it could return a borked
>> retry-config value, to check that doesn't result in badness
>> for a client.
>>
>> My question is: would it be useful for real servers to be
>> able to choose the retry-config value to return via a new
>> callback? I guess that might be useful for servers that
>> use multiple CDNs, but I'm not at all sure, since I don't
>> get near such servers... hence asking:-)
>>
>> Secondary question: if useful, then what params might such
>> a callback need?
>>
>> Opinions welcome!
>> Thanks,
>> S.
>>
>> [1] https://github.com/sftcd/openssl/blob/ECH-draft-13c/test/echcorrupttest.c#L41
>>
> 
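
For reference, the window-of-generations scheme David describes above
could be sketched roughly as below. This is only an illustration in
Python, not BoringSSL's or OpenSSL's API: EchConfig, ConfigWindow and
all of their fields and methods are hypothetical names, and the configs
carry no real key material. The assumptions are that a generation is a
list of configs (e.g. one per KEM), that each config has a boolean
retry flag, and that the newest generation is not a retry config until
the caller says the rollout is complete.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class EchConfig:
    config_id: int          # hypothetical identifier, for illustration only
    kem: str                # e.g. "P-256" or "X25519"
    is_retry: bool = False  # the per-config boolean the caller controls


class ConfigWindow:
    """Keeps the N most recent generations of configs, newest first."""

    def __init__(self, max_generations):
        # deque(maxlen=N) discards from the right when we appendleft,
        # which retires the oldest generation automatically.
        self.generations = deque(maxlen=max_generations)

    def add_generation(self, configs):
        # A new generation is still rolling out, so it is never a retry
        # set at first: a retry may land on a peer that lacks it.
        for config in configs:
            config.is_retry = False
        self.generations.appendleft(list(configs))

    def promote(self, index):
        # Mark exactly one whole generation as the retry set, i.e. the
        # one this server believes all of its peers already have.
        if not 0 <= index < len(self.generations):
            raise IndexError("no such generation")
        for i, generation in enumerate(self.generations):
            for config in generation:
                config.is_retry = (i == index)

    def retry_configs(self):
        # One full generation: what this server would put in DNS now.
        return [c for g in self.generations for c in g if c.is_retry]

    def decryptable_ids(self):
        # Everything in the window stays usable, to cope with stale DNS.
        return [c.config_id for g in self.generations for c in g]


window = ConfigWindow(max_generations=3)
window.add_generation([EchConfig(1, "P-256"), EchConfig(2, "X25519")])
window.promote(0)  # generation 1 is deployed everywhere
window.add_generation([EchConfig(3, "P-256"), EchConfig(4, "X25519")])
# Generation 2 is still rolling out, so retries stay on generation 1.
retry_ids_during_rollout = [c.config_id for c in window.retry_configs()]
window.promote(0)  # rollout done: bump the retry set forward by one
retry_ids_after_rollout = [c.config_id for c in window.retry_configs()]
```

With a flag per config and the caller deciding when to promote, no
callback is needed to pick the retry set; a test harness could still
swap in a deliberately broken encoding at the point where
retry_configs() is serialised.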