Nerd Ralph writes:
I’ve written a few software UARTs for AVR MCUs. All of them have bit-banged the output, using cycle-counted assembler busy loops to time the output of each bit. The code requires interrupts to be disabled to ensure accurate timing between bits. This makes it impossible to receive data at the same time as it is being transmitted, and therefore the bit-banged implementations have been half-duplex. By using the waveform generator of the timer/counter in many AVR MCUs, I’ve found a way to implement a full-duplex UART, which can simultaneously send and receive at up to 115kbps when the MCU is clocked at 8Mhz.
More details on Nerd Ralph blog.