What invalid bit pattern? A u8 can be anything from 0 to 255, so the Option necessarily has to put its discriminant into another byte. If you replace it with a NonZeroU8, then the compiler will duly use the forbidden 0 value for the first Option level, and a separate byte for all further levels.
(Granted, in the None variant, the byte used for the u8 is not usable, but if we're already using a separate discriminant byte, 256 variants should be plenty.)
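A quick way to check this is to ask the compiler for the sizes. This is a minimal sketch; the helper name `option_sizes` is made up for illustration. Only the `Option<NonZeroU8>` size is a documented guarantee; the other figures are what current compilers produce for the default (unspecified) enum layout.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Sizes in bytes of four nested Option types (hypothetical helper).
fn option_sizes() -> [usize; 4] {
    [
        // u8 has no forbidden value, so the discriminant needs its own byte.
        size_of::<Option<u8>>(),
        // The forbidden 0 value of NonZeroU8 encodes None.
        size_of::<Option<NonZeroU8>>(),
        // The single niche value is used up, so a separate byte is added...
        size_of::<Option<Option<NonZeroU8>>>(),
        // ...and that separate byte has room for further levels.
        size_of::<Option<Option<Option<NonZeroU8>>>>(),
    ]
}
```

Printing `option_sizes()` from a `main` function is enough to see where the extra byte appears.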
The same way as it does in the bool case? The u8 bits are invalid if either of the Options is None. In particular, if the Outer option is Some and the Inner is None, the bit that would otherwise be used for the bool (in the first example) is used to discriminate Outer, but that doesn't happen in the case of the u8.
I don't really get what distinction you're trying to draw between niche optimization and the bitwise stuff you keep talking about. As far as I am aware, the Rust compiler never works by counting bits in any context. The LLVM backend sometimes does, but it's not responsible for enum layout.
The mental model is, an enum payload will have some number of integer/pointer values with niches in their representation. Niches don't work by counting bits, they work as numeric ranges. E.g., a char is just a u32 with values 0 through 0x10ffff, a bool is a u8 with values 0 and 1, a reference is just a pointer with any value except 0, etc., and the niche is precisely the complement of this range.
Sometimes the niche corresponds to bits used in a valid value (e.g., the 0 value of a NonZeroU8), and sometimes it corresponds to other bits (e.g., values 2 through 255 of a bool): the compiler only cares about the ranges, not the bits. If there is no large-enough niche, then the discriminant is placed in a separate byte.
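To see the "ranges, not bits" model in action, one can check which types let Option piggyback on a niche. This is a sketch with a hypothetical helper `option_is_free`; apart from the documented guarantees for references and the NonZero types, the results reflect what current compilers do rather than a stable promise.

```rust
use std::mem::size_of;

/// True if wrapping T in Option costs no extra space,
/// i.e. the compiler found a niche in T for the None case.
/// (Hypothetical helper, named here only for illustration.)
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}
```

With this, `option_is_free::<char>()`, `option_is_free::<bool>()` and `option_is_free::<&u8>()` report a niche, while `option_is_free::<u8>()` does not, since every value from 0 to 255 is valid.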
An outer enum can't use sometimes-valid payloads in an inner enum to represent its discriminant, if that's what you're trying to say. Multiple discriminants can be 'flattened' into a single continuous range of niche values, but they can't be 'flattened' into inner enums' payloads. That would cause a weird inversion where you need to read the inner discriminant just to know whether the outer discriminant is valid.
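The flattening of several discriminants into one range of niche values can be observed by looking at the raw byte of an Option<Option<bool>>. This is a sketch that relies on the unspecified default layout, so treat the result as an observation about current compilers; `encode` is a hypothetical helper name.

```rust
use std::mem::transmute;

/// Raw byte used to store an Option<Option<bool>>.
/// The transmute only compiles if the whole type fits in one byte,
/// which is itself evidence that both discriminants share the bool's niche.
fn encode(v: Option<Option<bool>>) -> u8 {
    // Assumption: the layout is a single initialized byte, as on current rustc.
    unsafe { transmute::<Option<Option<bool>>, u8>(v) }
}
```

Calling `encode` on `Some(Some(false))`, `Some(Some(true))`, `Some(None)` and `None` should give four distinct values: the two valid bool values, plus two values taken from the niche range above them.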
(The compiler does have a few tricks up its sleeve to make the most of niches. E.g., in a Result<(u8, bool), u8>, an Err(42) becomes [42, 2], but in a Result<(bool, u8), u8>, an Err(42) becomes [2, 42] (https://play.rust-lang.org/?version=stable&mode=debug&editio...). The 42 is repositioned to keep the niche intact.)
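Since the playground link is cut off, here is a rough sketch of how one might reproduce that observation. The function names are made up, and the byte positions depend on the unspecified default layout, so the output may differ between compiler versions.

```rust
use std::mem::transmute;

/// Raw bytes of Err(e) for Result<(u8, bool), u8> (hypothetical helper).
/// Only built from an Err value, where both bytes are initialized.
fn err_bytes_u8_bool(e: u8) -> [u8; 2] {
    let r: Result<(u8, bool), u8> = Err(e);
    // Compiles only if the Result is exactly two bytes (no separate tag byte).
    unsafe { transmute::<Result<(u8, bool), u8>, [u8; 2]>(r) }
}

/// Same, with the tuple fields swapped.
fn err_bytes_bool_u8(e: u8) -> [u8; 2] {
    let r: Result<(bool, u8), u8> = Err(e);
    unsafe { transmute::<Result<(bool, u8), u8>, [u8; 2]>(r) }
}
```

Printing `err_bytes_u8_bool(42)` and `err_bytes_bool_u8(42)` with `{:?}` shows which byte holds the payload and which holds the out-of-range bool value that marks the Err variant.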