Rebased the bugfix from the original Google Code issue #292 to work with Arduino 1.6.x
Description of original fix provided by Pete62:
Later 8-bit AVRs use two registers (TCCRxA, TCCRxB), whereas the ATmega8 uses only a single register (TCCR2) to house the control bits for Timer 2. Bits were inadvertently being cleared.
To avoid having a .cpp just for an extern variable definition, `static`
has been chosen over `extern`.
As the `EEPROMClass` class simply wraps functionality located elsewhere,
it is completely compiled away. Even though each translation unit which
includes the header will get a copy with internal linkage, there is no
associated overhead.
More info
[here](http://stackoverflow.com/questions/29098518/extern-variable-only-in-header-unexpectedly-working-why)
Previously, the TX pin would be set to output first and then written
high (assuming non-inverted logic). When the pin was previously
configured for input without pullup (which is the normal reset state), this
results in driving the pin low for a short time when initializing. This could
accidentally be seen as a start bit by the receiving side.
By first writing HIGH and then setting the mode to OUTPUT, the pin will
have its pullup enabled for a short while, which is harmless.
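To illustrate the ordering issue, here is a small host-side model of a single AVR I/O pin. It is only a sketch: `PinModel`, `PinState` and the two helper functions are hypothetical names made up for this example, not part of the Arduino API. It relies on the usual AVR behaviour that the PORT bit selects the output level when the pin is an output and enables the pullup when it is an input.

```cpp
#include <cassert>

// Hypothetical host-side model of one AVR I/O pin (not Arduino API).
enum class PinState { Floating, PulledUp, DrivenLow, DrivenHigh };

struct PinModel {
    bool ddr = false;   // direction bit: false = input, true = output
    bool port = false;  // PORT bit: output level, or pullup enable when input

    PinState state() const {
        if (ddr) return port ? PinState::DrivenHigh : PinState::DrivenLow;
        return port ? PinState::PulledUp : PinState::Floating;
    }
};

// Old order: OUTPUT first, then HIGH.
// Returns the state the pin passes through between the two steps.
PinState oldOrderIntermediate() {
    PinModel pin;                 // reset state: input, no pullup
    pin.ddr = true;               // corresponds to pinMode(tx, OUTPUT)
    PinState between = pin.state();
    pin.port = true;              // corresponds to digitalWrite(tx, HIGH)
    assert(pin.state() == PinState::DrivenHigh);
    return between;
}

// New order: HIGH first, then OUTPUT.
PinState newOrderIntermediate() {
    PinModel pin;                 // reset state: input, no pullup
    pin.port = true;              // corresponds to digitalWrite(tx, HIGH)
    PinState between = pin.state();
    pin.ddr = true;               // corresponds to pinMode(tx, OUTPUT)
    assert(pin.state() == PinState::DrivenHigh);
    return between;
}
```

In this model the old order passes through `DrivenLow`, while the new order only passes through `PulledUp`.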
Instead of using a lookup table with (wrong) timings, this calculates
the timings in SoftwareSerial::begin. This is probably a bit slower, but
since it typically happens once, this shouldn't be a problem.
Additionally, since the lookup tables can be removed, this is also a lot
smaller, as well as supporting arbitrary CPU speeds and baudrates,
instead of the limited set that was defined before.
Furthermore, this switches to use the _delay_loop_2 function from
avr-libc instead of a handcoded delay function. The avr-libc function
only takes two instructions, as opposed to four instructions for the old
one. The compiler also inlines the avr-libc function, which makes the
timings more reliable.
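A rough sketch of how such a calculation could look, written so it compiles on a host. It assumes the documented avr-libc behaviour that `_delay_loop_2` takes four CPU cycles per iteration and treats a count of 0 as 65536 iterations. The function names and the overhead value used in the usage note are placeholders for illustration, not the constants from the actual patch.

```cpp
#include <cassert>
#include <cstdint>

// Subtract, but never return less than 1: a count of 0 passed to
// _delay_loop_2 would mean 65536 iterations instead of "no delay".
constexpr uint16_t subtractCap(uint16_t num, uint16_t sub) {
    return num > sub ? static_cast<uint16_t>(num - sub) : 1;
}

// Number of 4-cycle delay-loop iterations that fit in one bit time.
constexpr uint16_t bitDelay(uint32_t cpuHz, uint32_t baud) {
    return static_cast<uint16_t>((cpuHz / baud) / 4);
}

// Delay count for one bit after removing a fixed per-bit code overhead.
// overheadCycles is a hypothetical, compiler-dependent cycle count.
constexpr uint16_t txDelay(uint32_t cpuHz, uint32_t baud,
                           uint16_t overheadCycles) {
    return subtractCap(bitDelay(cpuHz, baud), overheadCycles / 4);
}
```

For example, at 16 MHz and 9600 baud one bit time is 1666 cycles, or 416 loop iterations. With an assumed overhead of 15 cycles, `txDelay(16000000, 9600, 15)` yields 413.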
The calculated timings directly rely on the instructions generated by
the compiler, since a significant amount of time is spent processing
(compared to the delays, especially at higher speeds). This means that
if the code is changed, or a different compiler is used, the
calculations might need changing (though a few cycles more or less
shouldn't cause immediate breakage).
The timings in the code have been calculated from the assembly generated
by gcc 4.8.2 and gcc 4.3.2.
The RX baudrates supported by SoftwareSerial are still not unlimited. At
16 MHz, using gcc 4.8.2, everything up to 115200 works. At 8 MHz, it works
up to 57600. Using gcc 4.3.2, it works up to 57600 at 16 MHz and up
to 38400 at 8 MHz. Note that at these highest speeds, communication
works, but is still quite sensitive to other interrupts (like the
millis() interrupts) when bytes are sent back-to-back, so there still
are corrupted bytes in RX.
TX works up to 115200 for all combinations of compiler and clock rates.
This fixes #2019
Before, the interrupt would remain enabled during reception, which would
re-set the PCINT flag because of the level changes inside the received
byte. Because interrupts are globally disabled, this would not
immediately trigger an interrupt, but the flag would be remembered to
trigger another PCINT interrupt immediately after the first one is
processed.
Typically this was not a problem, because the second interrupt would see
the stop bit, or an idle line, and decide that the interrupt triggered
for someone else. However, at high baud rates, this could cause the
next interrupt for the real start bit to be delayed so much that the
byte got corrupted.
By clearing the interrupt mask bit for just the RX pin (as opposed to
the PCINT mask bit for the entire port), any PCINT events on other bits
can still set the PCINT flag and be processed as normal. In this case,
it's likely that there will be corruption, but that's inevitable when
(other) interrupts happen during SoftwareSerial reception.
This precalculates the mask register and value, making setRxIntMask
considerably less complicated. Right now, this is not a big deal, but
simplifying it allows using it inside the ISR next.
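A minimal sketch of the idea, assuming the register address and the bit value are computed once up front. The struct and function names are hypothetical, and a plain variable stands in for the real pin change mask register so the example runs on a host.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for a precalculated mask register and bit value.
struct RxIntMask {
    volatile uint8_t* reg;  // address of the mask register, looked up once
    uint8_t value;          // bit for the RX pin inside that register

    // Enabling or disabling is now a single read-modify-write.
    void set(bool enable) const {
        if (enable)
            *reg = static_cast<uint8_t>(*reg | value);
        else
            *reg = static_cast<uint8_t>(*reg & ~value);
    }
};

// Host-side demonstration: a local variable plays the role of the register.
uint8_t applyToFakeRegister(uint8_t initial, uint8_t pinBit, bool enable) {
    volatile uint8_t fakeRegister = initial;
    RxIntMask mask{&fakeRegister, pinBit};
    mask.set(enable);
    return fakeRegister;
}
```

Only the bit belonging to the RX pin changes; other bits in the same register keep their value.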
Since those functions are only called once now, it makes sense to inline
them. This saves a few bytes of program space, but also saves a few
cycles in the critical RX path.
Previously, up to four separate but identical ISR routines were defined,
for PCINT0, PCINT1, PCINT2 and PCINT3. Each of these would generate
their own function, with a lot of push-popping because another function
was called.
Now, the ISR_ALIASOF macro from avr-libc is used to declare just the
PCINT0 version and make all other ISRs point to that one, saving a lot
of program space, as well as some speed because of improved inlining.
On an Arduino Uno with gcc 4.3, this saves 168 bytes. With gcc 4.8, this
saves 150 bytes.
Similar to SoftwareSerial::write, this rewrites the loop to only touch
the MSB and then shift those bits up, allowing the compiler to generate
more efficient code. Unlike the write function, however, there is no need
to put all instance variables used into local variables: for some reason
the compiler already does this (and doing it manually even makes the
code bigger).
On the Arduino Uno using gcc 4.3 this saves 26 bytes. Using gcc 4.8 this
saves 30 bytes.
Note that this removes the else clause in the code, making the C code
unbalanced, which looks like it breaks timing balance. However, looking
at the code generated by the compiler, it turns out that the old code
was actually unbalanced, while the new code is properly balanced.
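As an illustration of a loop that only ever touches the MSB, here is a host-side sketch. It assumes the data bits arrive LSB first, so each new sample is placed in the top bit and moved along by one position per received bit. `receiveByte` and `roundTrip` are hypothetical names, and the pin is replaced by a callable so the example can run without hardware.

```cpp
#include <cassert>
#include <cstdint>

// Assemble one byte from eight samples, touching only the MSB.
// readPin is any callable returning the current line level.
template <typename ReadPin>
uint8_t receiveByte(ReadPin readPin) {
    uint8_t d = 0;
    for (uint8_t i = 8; i > 0; --i) {
        d >>= 1;                    // move earlier bits one position along
        if (readPin()) d |= 0x80;   // no else branch: the bit is already 0
    }
    return d;
}

// Feed the bits of value, LSB first, into receiveByte.
uint8_t roundTrip(uint8_t value) {
    uint8_t remaining = value;
    return receiveByte([&remaining]() {
        bool level = (remaining & 1) != 0;
        remaining >>= 1;
        return level;
    });
}
```

Whether the C-level branches are balanced says little about the timing; as noted above, only the generated assembly does.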
This change restructures the loop to help the compiler generate shorter
code (because now only the LSB of the data byte is checked and
subsequent bits are shifted down one by one, it can use the "skip if bit
set" instruction).
Furthermore, it puts most attributes in local variables, which causes
the compiler to put them into registers. This makes the timing-critical
part of the code smaller, making it easier to provide accurate timings.
On an Arduino Uno using gcc 4.3, this saves 58 bytes. On gcc 4.8, this
saves 14 bytes.
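The restructured loop could look roughly like the following host-side sketch. The names are hypothetical and the pin write is replaced by a callable; the real code also has to insert delays between bits, which is left out here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Send the eight data bits, LSB first. Only bit 0 is ever tested and the
// byte is shifted down afterwards. b is a by-value copy, similar in
// spirit to copying attributes into locals so they can live in registers.
template <typename WritePin>
void sendDataBits(uint8_t b, WritePin writePin) {
    for (uint8_t i = 8; i > 0; --i) {
        writePin((b & 1) != 0);
        b >>= 1;
    }
}

// Record the levels that would be put on the wire, in transmit order.
std::string sentLevels(uint8_t value) {
    std::string levels;
    sendDataBits(value, [&levels](bool level) {
        levels += level ? '1' : '0';
    });
    return levels;
}
```

For 0x41 (binary 01000001) the recorded sequence starts with the LSB, giving "10000010".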
Somehow gcc 4.8 doesn't inline this function, even though it is always
called with constant arguments and can be reduced to just a few
instructions when inlined. Adding the always_inline attribute makes gcc
inline it, saving 46 bytes on the Arduino Uno.
gcc 4.3 already inlined this function, so there are no space
savings there.
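For reference, this is the general shape of the attribute on a small helper. The function below is a made-up example, not the function this commit changes, and `__attribute__((always_inline))` is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: with constant arguments the whole body folds
// into a constant once it is inlined.
static inline __attribute__((always_inline))
uint8_t toggleBit(uint8_t port, uint8_t bit) {
    return static_cast<uint8_t>(port ^ (1u << bit));
}

// Caller with constant arguments, as in the situation described above.
uint8_t demoToggle() {
    return toggleBit(0x00, 3);
}
```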
Before, there was nearly identical code for the inverted and regular
cases. However, simply inverting the byte in the inverted case allows
using the regular code twice, reducing the generated code size by 100
bytes (on an Arduino Uno and gcc 4.3, on gcc 4.8 the reduction is 50
bytes).
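A sketch of the inversion trick, under the assumption that a frame is one start bit, eight data bits (LSB first) and one stop bit. `frameLevels` is a hypothetical helper that records line levels as characters instead of driving a pin.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Build the sequence of line levels for one frame. In the inverted case
// the byte is inverted once up front, so the data loop is shared.
std::string frameLevels(uint8_t b, bool inverse) {
    if (inverse) b = static_cast<uint8_t>(~b);

    std::string levels;
    levels += inverse ? '1' : '0';           // start bit
    for (uint8_t i = 8; i > 0; --i) {        // same loop for both cases
        levels += (b & 1) ? '1' : '0';
        b >>= 1;
    }
    levels += inverse ? '0' : '1';           // stop bit
    return levels;
}
```

The inverted frame comes out as the exact complement of the regular one, without a second copy of the data loop.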
stopListening also disables the interrupt, if needed, so calling that
function makes more sense. Since stopListening only disables the
interrupt when the current SoftwareSerial is the active object, and that
can only be the case when _rx_delay_stopbit is non-zero, there is no
need to separately check _rx_delay_stopbit anymore.
If an interrupt causing overflow would occur between reading
_buffer_overflow and clearing it, this overflow condition would be
immediately cleared and never be returned by overflow().
By only clearing the overflow flag if an overflow actually occurred,
this problem goes away (worst case overflow() returns false even though
an overflow _just_ occurred, but then the next call to overflow() will
return true).
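The pattern can be sketched as follows. `OverflowFlag` and its member names are hypothetical; a plain method call stands in for the interrupt handler so the example runs on a host.

```cpp
#include <cassert>

// Hypothetical stand-in for the overflow bookkeeping.
struct OverflowFlag {
    volatile bool bufferOverflow = false;  // set from the ISR

    void onOverflowFromIsr() { bufferOverflow = true; }

    // Clear the flag only when an overflow was actually observed. An
    // overflow that happens after the read is left for the next call.
    bool overflow() {
        bool ret = bufferOverflow;
        if (ret) bufferOverflow = false;
        return ret;
    }
};

// One overflow event should be reported exactly once.
bool reportsOnce() {
    OverflowFlag flag;
    flag.onOverflowFromIsr();
    bool first = flag.overflow();
    bool second = flag.overflow();
    return first && !second;
}

// Without any overflow event, nothing is reported.
bool quietWhenIdle() {
    OverflowFlag flag;
    return !flag.overflow();
}
```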
This prevents interrupts from triggering when the SoftwareSerial
instance is not even listening.
Additionally, this removes the need to disable interrupts in
SoftwareSerial::listen, since no interrupts are active while it touches
the variables.
The current check is still always false when the old check was, but
additionally it will not disable the interrupts when they were never
enabled (which shouldn't matter much, but this is more consistent).
In this case, SoftwareSerial::begin will not have enabled the
interrupts, so better not allow the SoftwareSerial instance to enter the
listening state either.
Before enabling interrupts, begin would see if the given receive pin
actually has an associated PCINT register. If not, the interrupts would
not be enabled.
Now, the same check is done, but when no register is available, the rx
parameters are not loaded at all (which in turn prevents the interrupt
from being enabled). This allows all code to use the same "is rx
enabled" check (which will be added next).
Previously, it could happen that SPI::beginTransaction was
interrupted by an ISR, while it is changing the SPI_AVR_EIMSK
register or interruptSave variable (it seems that there is
a small window after changing SPI_AVR_EIMSK where an interrupt
might still occur). If this happens, interruptSave is overwritten
with an invalid value, permanently disabling the pin interrupts.
To prevent this, disable interrupts globally while changing
these values.
From https://github.com/arduino/Arduino/pull/2376#issuecomment-59671152
Quoting Andrew Kroll:
[..this commit..] introduces a small delay that can prevent the wait
loop from iterating when running at the maximum speed. This gives
you a little more speed, even if it seems counter-intuitive. At
lower speeds, it is unnoticed. Watch the output on an oscilloscope
when running full SPI speed, and you should see closer back-to-back
writes.
Quoting Paul Stoffregen:
I did quite a bit of experimenting with the NOP addition. The one
that's in my copy gives about a 10% speedup on AVR.
Previously, when verbose uploads were enabled, avrdude was run with four
-v options, causing it to dump all raw bytes exchanged with the
bootloader. This floods the console so much that meaningful output
mostly disappears.
Most users probably want to enable verbose mode just to see what avrdude
command is run. Furthermore, users that benefit from the raw bytes
dumped are perfectly capable of either running avrdude manually, or
modifying platform.txt. Given that, running avrdude with just one -v
should be plenty.
This fixes #891.
When checking the `left` argument, it previously allowed having
left == len. However, this means the substring starts one past the last
character in the string and should return the empty string. In practice,
this already worked correctly, because buffer[len] contains the trailing
nul, so it would (re)assign the empty string to `out`.
However, fixing this check makes it a bit more logical, and prevents a
fairly unlikely out-of-buffer write (to address 0x0) when calling
substring on an invalidated String:
String bar = (char*)NULL;
bar.substring(0, 0);
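The corrected bounds check can be modelled on a host with std::string. This is only a sketch of the argument handling; `substringModel` is a hypothetical name and the model does not reproduce the internal buffer of the Arduino String class.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Model of the argument checks: left == len now returns the empty
// string early instead of falling through to the copy.
std::string substringModel(const std::string& s,
                           unsigned int left, unsigned int right) {
    if (left > right) std::swap(left, right);
    const unsigned int len = static_cast<unsigned int>(s.length());
    if (left >= len) return std::string();  // previously: left > len
    if (right > len) right = len;
    return s.substr(left, right - left);
}
```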
Previously, this method calculated the length of the string from the
given index onwards. However, the other remove() method called already
contains code for this calculation, which is used when the count passed
in is too big. This means we can just pass in a very big count that is
guaranteed to point past the end of the string, shrinking the remove
method by a few bytes.
Previously, if you passed in a very big index and/or count, the
`index + count` could overflow, making the count be used as-is instead
of being truncated (causing the string to be updated wrongly and
potentially writing to arbitrary memory locations).
We can rewrite the comparison to use `len - index` instead. Since we
know that index < len, we are sure this subtraction does not overflow,
regardless of what values of index and count we pass in.
As an added bonus, the `len - index` value already needed to be calculated
inside the if, so this saves a few instructions in the generated code.
To illustrate this problem, consider this code:
String foo = "foo";
Serial.println(foo.length()); // Prints 3
foo.remove(1, 65535); // Should remove all but first character
Serial.println(foo.length()); // Prints 4 without this patch
Not shown in this example is that some arbitrary memory is written
as well.
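The arithmetic can be reproduced on a host by using uint16_t to stand in for the 16-bit unsigned int of AVR. The two functions below are hypothetical helpers that only compute the clamped count; they assume index < len, as stated above.

```cpp
#include <cassert>
#include <cstdint>

// Old check: index + count can wrap around in 16 bits.
uint16_t clampCountOld(uint16_t len, uint16_t index, uint16_t count) {
    // Explicit cast reproduces the 16-bit wraparound seen on AVR.
    uint16_t end = static_cast<uint16_t>(index + count);
    if (end > len) count = static_cast<uint16_t>(len - index);
    return count;
}

// New check: len - index cannot overflow because index < len.
uint16_t clampCountNew(uint16_t len, uint16_t index, uint16_t count) {
    uint16_t maxCount = static_cast<uint16_t>(len - index);
    if (count > maxCount) count = maxCount;
    return count;
}
```

With len = 3, index = 1 and count = 65535, the old sum wraps to 0, so the count is used as-is; the new comparison truncates it to 2.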
The following empty stubs have been replaced by the gcc
flag -fno-threadsafe-statics:
int __cxa_guard_acquire(__guard *);
void __cxa_guard_release (__guard *);
void __cxa_guard_abort (__guard *);
The following empty stubs have been moved into their specific
module abi.cpp:
void __cxa_pure_virtual(void) __attribute ((noreturn));
void __cxa_deleted_virtual(void) __attribute ((noreturn));
Fix #107
Stream::find(char *target) passes an empty terminator string to
Stream::findUntil(char *target, char *terminator) which caused a compiler
warning with the updated toolchain, so cast it to a char*.
These chips were previously supported, but this support broke when
parity error checking was added. Most chips define UPE0 (etc.) for
the parity error bit. Some chips don't have numbered UARTs and so only
define UPE, and even fewer define PE instead of UPE. This adds support
for those chips again.
Closes: #2137
Some devices, such as the atmega2560 or the atmega256rfr2 have a timer1c
output. It seems this output is not connected to anything on the Arduino
Mega, but this allows using it on third party hardware nonetheless.
Before, HardwareSerial1+.cpp were a copy of HardwareSerial0.cpp with all
0's replaced by the corresponding number. This would mean that e.g.
the Serial1 object would use the UBRRL register instead of UBRR1L when
it was defined, or the USART_RX_vect instead of USART1_RX_vect.
In practice, this would never actually cause problems, since:
- No avr chip currently has both the non-numbered registers as well as
numbered registers.
- HardwareSerial.h would only define HAVE_HWSERIALx when the
corresponding numbered register is defined (except for
HAVE_HWSERIAL0, which is also defined when the unnumbered registers
are present).
Furthermore, before, both the UARTx_xx_vect and USARTx_xx_vect were used.
Looking at the include files, only UART1_xx_vect is actually used (by
iom161.h), the others use USARTx_xx_vect. For this reason,
HardwareSerial1.cpp keeps the preprocessor conditional to select either
UART or USART and the other files use USART unconditionally.
While we're here, also fix the compiler error message when no valid ISR
name was found (it previously said "for the first UART" in all cases).
Previously, this relied on an (ugly, avr-specific) magic default for the
compiler.path variable, set by the IDE. This allowed the IDE to fall
back to a system-wide toolchain when no bundled toolchain was found (by
making compiler.path empty).
However,
- this only worked for avr, not sam, and
- this only worked for gcc; a system-wide avrdude would break on the
avrdude.conf path in platform.txt.
This meant that automatic system-wide fallback didn't work in all
situations, so you'd still have to modify platform.txt (or create
platform.local.txt). Since doing that explicitly is the most reliable
way, this commit removes the partially working ability to do this
automatically.
Note that the code to automatically set compiler.path is still kept
around, in case third-party hardware still relies on this. At some
point, this code should be removed, but for now it just shows a warning
message.
In commit 0e97bcb (Put each HardwareSerial instance in its own .cpp
file), the serial event handling was changed. This was probably a
copy-paste typo.
The effect of this bug was that SerialEvent3 would not run, unless
SerialEvent2 was defined, but also that if SerialEvent2 is defined but
SerialEvent3 is not, this could cause a reset (call to NULL pointer).
This closes #1967, thanks to Peter Olson for finding the bug and fix.
Added support for buffer sizes bigger than 256 bytes.
Added possibility to overrule the default size.
Added support for different size of TX and RX buffer sizes.
The default values remain the same. You can, however, specify different
values for the TX and RX buffers by defining SERIAL_TX_BUFFER_SIZE and
SERIAL_RX_BUFFER_SIZE on the command line.
Because the buffer sizes can now be changed, sizes larger than 256 bytes
must be supported.
The type of the indexes is chosen based on the size of the buffers, so
there is no increase in program/data size when the buffers are smaller
than 257 bytes.
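One way to pick the index type from the buffer size is sketched below. The default of 64 is a placeholder for this example, and the `std::conditional` alias is an assumption chosen for illustration; the core may well select the type with preprocessor conditionals instead.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Sizes can be overridden on the command line, for example with
// -DSERIAL_RX_BUFFER_SIZE=512. The value 64 is a placeholder default.
#ifndef SERIAL_TX_BUFFER_SIZE
#define SERIAL_TX_BUFFER_SIZE 64
#endif
#ifndef SERIAL_RX_BUFFER_SIZE
#define SERIAL_RX_BUFFER_SIZE 64
#endif

// 8-bit index for buffers of up to 256 bytes, 16-bit index above that.
template <unsigned long Size>
using buffer_index_t =
    typename std::conditional<(Size > 256), uint16_t, uint8_t>::type;

typedef buffer_index_t<SERIAL_TX_BUFFER_SIZE> tx_buffer_index_t;
typedef buffer_index_t<SERIAL_RX_BUFFER_SIZE> rx_buffer_index_t;
```

With buffers of 256 bytes or less the indexes stay one byte wide, so the generated code is unchanged for the default configuration.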
peekNextDigit() returns an int, so it can return -1 in addition to all
256 possible bytes. By putting the result in a signed char, all bytes
over 127 will be interpreted as "no bytes available". Furthermore, it
seems that on SAM "char" is unsigned by default, causing the
"if (c < 0)" line a bit further down to always be false.
Using an int is more appropriate.
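The difference is easy to reproduce on a host. The two helpers below are hypothetical and only mimic the "is this a timeout?" test on a value shaped like the return of peekNextDigit(): -1 for nothing available, otherwise 0 to 255.

```cpp
#include <cassert>

// Old behaviour: the int result is squeezed into a signed char first.
// On common two's complement targets, values above 127 turn negative.
bool looksEmptyWithSignedChar(int peeked) {
    signed char c = static_cast<signed char>(peeked);
    return c < 0;
}

// New behaviour: keep the full int, so only -1 is negative.
bool looksEmptyWithInt(int peeked) {
    int c = peeked;
    return c < 0;
}
```

A byte such as 0xE9 is a valid value, but the signed char version mistakes it for "nothing available".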
A different fix for this issue was suggested in #1399. This fix helps
towards #1728.
In C++, true and false are language keywords, so there is no need to
define them as macros. Including stdbool.h in C++ effectively changes
nothing. In C, true, false and also the bool type are not available, but
including stdbool.h will make them available.
Using stdbool.h means that we get true, false and the bool type in
whatever way the compiler thinks is best, which seems like a good idea
to me.
This also fixes the following compiler warnings if a .c file includes
both stdbool.h and Arduino.h:
warning: "true" redefined [enabled by default]
#define true 0x1
warning: "false" redefined [enabled by default]
#define false 0x0
This fixes #1570 and helps toward fixing #1728.
This only changed the AVR core, the SAM core already doesn't define true
and false (but doesn't include stdbool.h either).
Previously, pointer casting was used, but this resulted in strict-aliasing warnings:
IPAddress.h: In member function ‘IPAddress::operator uint32_t() const’:
IPAddress.h:46:61: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
operator uint32_t() const { return *((uint32_t*)_address); };
^
IPAddress.h: In member function ‘bool IPAddress::operator==(const IPAddress&) const’:
IPAddress.h:47:81: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
bool operator==(const IPAddress& addr) const { return (*((uint32_t*)_address)) == (*((uint32_t*)addr._address)); };
^
IPAddress.h:47:114: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
bool operator==(const IPAddress& addr) const { return (*((uint32_t*)_address)) == (*((uint32_t*)addr._address)); };
Converting between unrelated types like this is commonly done using a union,
which does not break the strict-aliasing rules. Using that union, inside
IPAddress there is now an attribute _address.bytes for the raw byte
array, or _address.dword for the uint32_t version.
Since we now have easy access to the uint32_t version, this also removes
two memcpy invocations that can just become assignments.
This patch does not change the generated code in any way: the compiler
already optimized away the memcpy calls, and the previous casts mean
exactly the same.
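A reduced sketch of the union layout, with a hypothetical class name and only the members needed for the illustration. Note that reading a union member other than the one last written is not defined by ISO C++; GCC documents it as supported when the access goes through the union type, which is what this sketch relies on.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, cut-down stand-in for the address storage.
class IpAddressSketch {
    union {
        uint8_t bytes[4];   // raw byte array
        uint32_t dword;     // the same four bytes as one 32-bit value
    } _address;

public:
    IpAddressSketch(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
        _address.bytes[0] = a;
        _address.bytes[1] = b;
        _address.bytes[2] = c;
        _address.bytes[3] = d;
    }

    // A plain assignment replaces what used to be a memcpy.
    explicit IpAddressSketch(uint32_t raw) { _address.dword = raw; }

    operator uint32_t() const { return _address.dword; }

    bool operator==(const IpAddressSketch& other) const {
        return _address.dword == other._address.dword;
    }

    uint8_t operator[](int index) const { return _address.bytes[index]; }
};
```

Comparison and conversion now go through `_address.dword` directly, with no pointer casts involved.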
This is a different implementation of a part of #1399 and it helps
toward fixing #1728.
This was already fixed for HardwareSerial.cpp in #1863, but there was
one more case hidden in HardwareSerial_private.h.
The index attributes have been uint8_t for a while, so there is no point
in using int for local variables. This should allow the compiler to
generate slightly more efficient code, but (at least on gcc 4.8.2) it
also confuses the register allocator, causing this change to increase
code size by 2 bytes instead due to extra push/pop instructions (but
this will probably change in the future if the compiler improves).
Switch the tx and rx buffer head/tail entries in the HardwareSerial
initialisation list so that they match the order the fields are defined
in. This fixes a compiler warning (repeated for each of the
HardwareSerial source files the header is used in).
This helps improve the effective datarate at high (>500 kbit/s) bitrates,
by skipping the interrupt and associated overhead. At 1 Mbit/s the
implementation previously got up to about 600-700 kbit/s, but now it
actually gets up to the full 1 Mbit/s (values are rough estimates, though).
Moreover, declaring pointers-to-registers as const and using initializer
list in class constructor allows the compiler to further improve inlining
performance.
This change recovers about 50 bytes of program space on single-UART devices.
See #1711
By putting the ISRs and HardwareSerial instance for each serial port in a
separate compilation unit, the linker will only consider them for
linking when the instance is actually used. The ISR is always referenced
by the compiler runtime and the Serialx_available() function is always
referenced by SerialEventRun(), but both references are weak and thus do
not cause the compilation unit to be included in the link by themselves.
The effect of this is that when multiple HardwareSerial ports are
available, but not all are used, buffers are only allocated and ISRs are
only included for the serial ports that are used. On the mega, this
lowers memory usage from 653 bytes to just 182 when only using the first
serial port.
On boards with just a single port, there is no change, since the code
and memory was already left out when no serial port was used at all.
This fixes #1425 and fixes #1259.
Before, this decision was made in a few different places, based on
sometimes different register defines.
Now, HardwareSerial.h decides which UARTs are available, defines
USE_HWSERIALn macros and HardwareSerial.cpp simply checks these macros
(together with some #ifs to decide which registers to use for UART 0).
For consistency, USBAPI.h also defines a HAVE_CDCSERIAL macro when
applicable.
For supported targets, this should not change any behaviour. For unsupported
targets, the error messages might subtly change because some checks are
moved or changed.
Additionally, this moves the USBAPI.h include from HardwareSerial.h into
Arduino.h and raises an error when both CDC serial and UART0 are
available (previously this would silently use UART0 instead of CDC, but
there is not currently any Atmel chip available for which this would
occur).
Before, the interrupt was disabled when it was triggered and it turned
out there was no data to send. However, the interrupt can be disabled
already when the last byte is written to the UART, since write() will
always re-enable the interrupt when it adds new data to the buffer.
Closes: #1008
When interrupts are disabled, writing to HardwareSerial could cause a
lockup. When the tx buffer is full, a busy-wait loop is used to wait for
the interrupt handler to free up a byte in the buffer. However, when
interrupts are disabled, this will of course never happen and the
Arduino will lock up. This often caused lockups when doing (big) debug
printing from an interrupt handler.
Additionally, calling flush() with interrupts disabled while
transmission was in progress would also cause a lockup.
When interrupts are disabled, the code now actively checks the UDRE
(UART Data Register Empty) and calls the interrupt handler to free up
room if the bit is set.
This can lead to delays in interrupt handlers when the serial buffer is
full, but a delay is of course always preferred to a lockup.
Closes: #672
References: #1147
It turns out there is an additional corner case. The analysis in the
previous commit with respect to flush() assumes that the data register is always
kept filled by the interrupt handler, so the TXC bit won't get set until
all the queued bytes have been transmitted. But, when interrupts are
disabled for a longer period (for example when an interrupt handler for
another device is running for longer than 1-2 byte times), it could
happen that the UART stops transmitting while there are still more bytes
queued (but these are in the buffer, not in the UDR register, so the
UART can't know about them).
In this case, the TXC bit would get set, but the transmission is not
complete yet. We can easily detect this case by looking at the head and
tail pointers, but it seems easier to instead look at the UDRIE bit
(the TX interrupt is enabled if and only if there are bytes in the
queue). To fix this corner case, this commit:
- Checks the UDRIE bit and only if it is unset, looks at the TXC bit.
- Moves the clearing of TXC from write() to the tx interrupt handler.
This (still) causes the TXC bit to be cleared whenever a byte is
queued when the buffer is empty (in this case the tx interrupt will
trigger directly after write() is called). It also causes the TXC bit
to be cleared whenever transmission is resumed after it halted
because interrupts have been disabled for too long.
As a side effect, another race condition is prevented. This could occur
at very high bitrates, where the transmission would be completed before
the code got time to clear the TXC0 bit, making the clear happen
_after_ the transmission was already complete. With the new code, the
clearing of TXC happens directly after writing to the UDR register,
while interrupts are disabled, and we can be certain the data
transmission needs more time than one instruction to complete. This
fixes #1463 and replaces #1456.
The flush() method blocks until all characters in the serial buffer have
been written to the UART _and_ transmitted. This is checked by waiting
until the "TXC" (TX Complete) bit is set by the UART, signalling
completion. This bit is cleared by write() when adding a new byte to the
buffer and set by the hardware after transmission ends, so it is always
guaranteed to be zero from the moment the first byte in a sequence is
queued until the moment the last byte is transmitted, and it is one from
the moment the last byte in the buffer is transmitted until the first
byte in the next sequence is queued.
However, the TXC bit is also zero from initialization to the moment the
first byte ever is queued (and then continues to be zero until the first
sequence of bytes completes transmission). Unfortunately we cannot
manually set the TXC bit during initialization, we can only clear it. To
make sure that flush() would not (indefinitely) block when it is called
_before_ anything was written to the serial device, the "transmitting"
variable was introduced.
This variable suggests that it is only true when something is
transmitting, which isn't currently the case (it remains true after
transmission is complete until flush() is called, for example).
Furthermore, there is no need to keep the status of transmission, the
only thing needed is to remember if anything has ever been written, so
the corner case described above can be detected.
This commit improves the code by:
- Renaming the "transmitting" variable to _written (making it more
clear and following the leading underscore naming convention).
- Not resetting the value of _written at the end of flush(), there is
no point to this.
- Only checking the "_written" value once in flush(), since it can
never be toggled off anyway.
- Initializing the value of _written in both versions of _begin (though
it probably gets initialized to 0 by default anyway, better to be
explicit).
The actual interrupt vectors are of course defined as before, but they
let new methods in the HardwareSerial class do the actual work. This
greatly reduces code duplication and prepares for one of my next commits
which requires the tx interrupt handler to be called from another
context as well.
The actual content of the interrupts handlers was pretty much identical,
so that remains unchanged (except that store_char was now only needed
once, so it was inlined).
Now that all access to the buffers is inside the HardwareSerial class, the
buffer variables can be made private.
One would expect a program size reduction from this change (at least
with multiple UARTs), but due to the fact that the interrupt handlers
now only have indirect access to a few registers (which previously were
just hardcoded in the handlers) and because there is some extra function
call overhead, the code size on the Uno actually increases by around
70 bytes. On the Mega, which has four UARTs, the code size decreases by
around 70 bytes.