Switch the tx and rx buffer head/tail entries in the HardwareSerial
initialisation list so that they match the order the fields are defined
in. This fixes a compiler warning (repeated for each of the
HardwareSerial source files the header is used in).
This helps improve the effective datarate on high (>500kbit/s) bitrates,
by skipping the interrupt and associated overhead. At 1 Mbit/s the
implementation previously got up to about 600-700 kbit/s, but now it
actually gets up to the 1Mbit/s (values are rough estimates, though).
Moreover, declaring pointers-to-registers as const and using initializer
list in class constructor allows the compiler to further improve inlining
performance.
This change recovers about 50 bytes of program space on single-UART devices.
See #1711
By putting the ISRs and HardwareSerial instance for each instance in a
separate compilation unit, the compile will only consider them for
linking when the instance is actually used. The ISR is always referenced
by the compiler runtime and the Serialx_available() function is always
referenced by SerialEventRun(), but both references are weak and thus do
not cause the compilation to be included in the link by themselves.
The effect of this is that when multiple HardwareSerial ports are
available, but not all are used, buffers are only allocated and ISRs are
only included for the serial ports that are used. On the mega, this
lowers memory usage from 653 bytes to just 182 when only using the first
serial port.
On boards with just a single port, there is no change, since the code
and memory was already left out when no serial port was used at all.
This fixes#1425 and fixes#1259.
Before, this decision was made in few different places, based on
sometimes different register defines.
Now, HardwareSerial.h decides wich UARTS are available, defines
USE_HWSERIALn macros and HardwareSerial.cpp simply checks these macros
(together with some #ifs to decide which registers to use for UART 0).
For consistency, USBAPI.h also defines a HAVE_CDCSERIAL macro when
applicable.
For supported targets, this should change any behaviour. For unsupported
targets, the error messages might subtly change because some checks are
moved or changed.
Additionally, this moves the USBAPI.h include form HardareSerial.h into
Arduino.h and raises an error when both CDC serial and UART0 are
available (previously this would silently use UART0 instead of CDC, but
there is not currently any Atmel chip available for which this would
occur).
Before, the interrupt was disabled when it was triggered and it turned
out there was no data to send. However, the interrupt can be disabled
already when the last byte is written to the UART, since write() will
always re-enable the interrupt when it adds new data to the buffer.
Closes: #1008
When interrupts are disabled, writing to HardwareSerial could cause a
lockup. When the tx buffer is full, a busy-wait loop is used to wait for
the interrupt handler to free up a byte in the buffer. However, when
interrupts are disabled, this will of course never happen and the
Arduino will lock up. This often caused lockups when doing (big) debug
printing from an interrupt handler.
Additionally, calling flush() with interrupts disabled while
transmission was in progress would also cause a lockup.
When interrupts are disabled, the code now actively checks the UDRE
(UART Data Register Empty) and calls the interrupt handler to free up
room if the bit is set.
This can lead to delays in interrupt handlers when the serial buffer is
full, but a delay is of course always preferred to a lockup.
Closes: #672
References: #1147
It turns out there is an additional corner case. The analysis in the
previous commit wrt to flush() assumes that the data register is always
kept filled by the interrupt handler, so the TXC bit won't get set until
all the queued bytes have been transmitted. But, when interrupts are
disabled for a longer period (for example when an interrupt handler for
another device is running for longer than 1-2 byte times), it could
happen that the UART stops transmitting while there are still more bytes
queued (but these are in the buffer, not in the UDR register, so the
UART can't know about them).
In this case, the TXC bit would get set, but the transmission is not
complete yet. We can easily detect this case by looking at the head and
tail pointers, but it seems easier to instead look at the UDRIE bit
(the TX interrupt is enabled if and only if there are bytes in the
queue). To fix this corner case, this commit:
- Checks the UDRIE bit and only if it is unset, looks at the TXC bit.
- Moves the clearing of TXC from write() to the tx interrupt handler.
This (still) causes the TXC bit to be cleared whenever a byte is
queued when the buffer is empty (in this case the tx interrupt will
trigger directly after write() is called). It also causes the TXC bit
to be cleared whenever transmission is resumed after it halted
because interrupts have been disabled for too long.
As a side effect, another race condition is prevented. This could occur
at very high bitrates, where the transmission would be completed before
the code got time to clear the TXC0 register, making the clear happen
_after_ the transmission was already complete. With the new code, the
clearing of TXC happens directly after writing to the UDR register,
while interrupts are disabled, and we can be certain the data
transmission needs more time than one instruction to complete. This
fixes#1463 and replaces #1456.
The flush() method blocks until all characters in the serial buffer have
been written to the uart _and_ transmitted. This is checked by waiting
until the "TXC" (TX Complete) bit is set by the UART, signalling
completion. This bit is cleared by write() when adding a new byte to the
buffer and set by the hardware after tranmission ends, so it is always
guaranteed to be zero from the moment the first byte in a sequence is
queued until the moment the last byte is transmitted, and it is one from
the moment the last byte in the buffer is transmitted until the first
byte in the next sequence is queued.
However, the TXC bit is also zero from initialization to the moment the
first byte ever is queued (and then continues to be zero until the first
sequence of bytes completes transmission). Unfortunately we cannot
manually set the TXC bit during initialization, we can only clear it. To
make sure that flush() would not (indefinitely) block when it is called
_before_ anything was written to the serial device, the "transmitting"
variable was introduced.
This variable suggests that it is only true when something is
transmitting, which isn't currently the case (it remains true after
transmission is complete until flush() is called, for example).
Furthermore, there is no need to keep the status of transmission, the
only thing needed is to remember if anything has ever been written, so
the corner case described above can be detected.
This commit improves the code by:
- Renaming the "transmitting" variable to _written (making it more
clear and following the leading underscore naming convention).
- Not resetting the value of _written at the end of flush(), there is
no point to this.
- Only checking the "_written" value once in flush(), since it can
never be toggled off anyway.
- Initializing the value of _written in both versions of _begin (though
it probably gets initialized to 0 by default anyway, better to be
explicit).
The actual interrupt vectors are of course defined as before, but they
let new methods in the HardwareSerial class do the actual work. This
greatly reduces code duplication and prepares for one of my next commits
which requires the tx interrupt handler to be called from another
context as well.
The actual content of the interrupts handlers was pretty much identical,
so that remains unchanged (except that store_char was now only needed
once, so it was inlined).
Now all access to the buffers are inside the HardwareSerial class, the
buffer variables can be made private.
One would expect a program size reduction from this change (at least
with multiple UARTs), but due to the fact that the interrupt handlers
now only have indirect access to a few registers (which previously were
just hardcoded in the handlers) and because there is some extra function
call overhead, the code size on the uno actually increases by around
70 bytes. On the mega, which has four UARTs, the code size decreases by
around 70 bytes.
Previously, the constants to use for the bit positions of the various
UARTs were passed to the HardwareSerial constructor. However, this
meant that whenever these values were used, the had to be indirectly
loaded, resulting in extra code overhead. Additionally, since there is
no instruction to shift a value by a variable amount, the 1 << x
expressions (inside _BV and sbi() / cbi()) would be compiled as a loop
instead of being evaluated at compiletime.
Now, the HardwareSerial class always uses the constants for the bit
positions of UART 0 (and some code is present to make sure these
constants exist, even for targets that only have a single unnumbered
UART or start at UART1).
This was already done for the TXC0 constant, for some reason. For the
actual register addresses, this approach does not work, since these are
of course different between the different UARTs on a single chip.
Of course, always using the UART 0 constants is only correct when the
constants are actually identical for the different UARTs. It has been
verified that this is currently the case for all targets supported by
avr-gcc 4.7.2, and the code contains compile-time checks to verify this
for the current target, in case a new target is added for which this
does not hold. This verification was done using:
for i in TXC RXEN TXEN RXCIE UDRIE U2X UPE; do echo $i; grep --no-filename -r "#define $i[0-9]\? " /usr/lib/avr/include/avr/io* | sed "s/#define $i[0-9]\?\s*\(\S\)\+\s*\(\/\*.*\*\/\)\?$/\1/" | sort | uniq ; done
This command shows that the above constants are identical for all uarts
on all platforms, except for TXC, which is sometimes 6 and sometimes 0.
Further investigation shows that it is always 6, except in io90scr100.h,
but that file defines TXC0 with value 6 for the UART and uses TXC with
value 0 for some USB-related register.
This commit reduces program size on the uno by around 120 bytes.