For some reason, gcc 4.8 does not inline this function, even though it
is always called with constant arguments and reduces to just a few
instructions when inlined. Adding the always_inline attribute forces gcc
to inline it, saving 46 bytes on the Arduino Uno.
gcc 4.3 already inlined this function, so there are no space savings
with that compiler version.
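
As a rough illustration of the attribute (not the actual function from
this change), the sketch below marks a small helper as always_inline.
The names set_mask_bit, fake_mask_register, enable_rx and disable_rx
are hypothetical, and a plain variable stands in for a hardware
register so the example also builds on a host compiler.

```cpp
#include <stdint.h>

// Hypothetical stand-in for a hardware mask register (assumption, not
// taken from the original code).
static volatile uint8_t fake_mask_register = 0;

// GCC (and Clang) attribute: inline this function at every call site,
// regardless of the compiler's own inlining heuristics.
static inline void set_mask_bit(bool enable, uint8_t bit)
    __attribute__((always_inline));

static inline void set_mask_bit(bool enable, uint8_t bit) {
    const uint8_t mask = static_cast<uint8_t>(1u << bit);
    if (enable)
        fake_mask_register = static_cast<uint8_t>(fake_mask_register | mask);
    else
        fake_mask_register = static_cast<uint8_t>(fake_mask_register & ~mask);
}

// Both callers pass constants, so once the helper is inlined the branch
// on `enable` and the shift can be folded away at compile time.
void enable_rx()  { set_mask_bit(true, 2); }
void disable_rx() { set_mask_bit(false, 2); }
```

Whether this saves space depends on the compiler version and target, so
comparing the size of the resulting binary before and after is the only
reliable check.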
If an interrupt that causes an overflow occurred between reading
_buffer_overflow and clearing it, that overflow condition would be
cleared immediately and never returned by overflow().
Clearing the overflow flag only if an overflow was actually observed
makes this problem go away. In the worst case, overflow() returns false
even though an overflow _just_ occurred, but the next call to overflow()
will then return true.
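
The difference between the two behaviours can be sketched as follows.
This is a minimal model under stated assumptions, not the actual class:
RxState, on_buffer_full() and overflow_racy() are hypothetical names,
and only _buffer_overflow and overflow() come from the description
above.

```cpp
// Hypothetical minimal model of the overflow flag handling.
class RxState {
public:
    // Stands in for the receive interrupt noticing a full buffer.
    void on_buffer_full() { _buffer_overflow = true; }

    // Old behaviour: unconditionally clear the flag after reading it.
    bool overflow_racy() {
        bool ret = _buffer_overflow;
        // An interrupt setting the flag at this point would be wiped
        // out by the unconditional clear below and never reported.
        _buffer_overflow = false;
        return ret;
    }

    // New behaviour: only clear the flag if an overflow was observed.
    bool overflow() {
        bool ret = _buffer_overflow;
        // If ret is false, an interrupt setting the flag at this point
        // leaves it set, so the next call reports the overflow.
        if (ret)
            _buffer_overflow = false;
        return ret;
    }

private:
    volatile bool _buffer_overflow = false;
};
```

In this sketch, when overflow() reads false it never writes to the flag,
so a concurrent overflow is at worst reported one call late rather than
lost.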