Discussion:
Low-level RS232 - How to resynchronize a block-wise transmission after a failure ?
R.Wieser
2019-06-29 07:11:29 UTC
Hello all,

I'm trying to wrap my head around some simple block-wise RS232
communication, specifically what the receiving end should do when it detects
/something/ has gone wrong (bits and/or whole bytes got lost).

With "block wise" I mean that I intend to chop the stream up into blocks,
allowing resending of the damaged ones (instead of the whole thing).

The problem is that when one block is damaged I have no good idea how to
detect the start of the next block.


Or rather, I've got several ideas, all with their own problems :

In a /fully wired/ connection I can (ab)use one of the handshake lines to
indicate a start-of-block.

Though this would (of course) be unusable for any 3-wire connection (TXD,
RXD, ground), or when an intermediate device (modem) is involved.

The old "send a block and wait for it to be acknowledged before sending a
new one" method of course would work well for short distances, but not really
for longer ones where the turn-around time can be measured in tenths of, or
even whole, seconds.

I also could introduce pauses between send blocks, but those would need to
be long enough to allow for some inter-byte delay (caused by the sender),
likely slowing the whole transmission down quite a bit.

Argh!


The infuriating part of all of this is knowing that it's (most likely) a
rather common problem, and as such has been solved decades ago - but I can't
seem to find anything worthwhile(1) while googling for it. :-\

(1) other than making the start-of-block bit pattern unique - forcing a
re-encoding / escaping of the actual sent data (causing its own problems,
including what would happen when a start bit is missed and the subsequently
received byte patterns are up for grabs ...)
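That escaping approach, for what it's worth, only takes a few lines. The sketch below uses marker and escape values borrowed from HDLC/PPP-style framing; nothing in this thread prescribes them, they're purely illustrative:

```python
# Reserve one byte value as a frame marker and escape any occurrence of it
# (and of the escape byte itself) inside the payload. FRAME then never
# appears raw inside a block, so the receiver can resync on it.
FRAME = 0x7E  # start-of-block marker (illustrative value)
ESC   = 0x7D  # escape prefix (illustrative value)
XOR   = 0x20  # escaped bytes are stored XORed with this

def stuff(payload: bytes) -> bytes:
    """Encode a block so FRAME only ever appears as a block delimiter."""
    out = bytearray([FRAME])
    for b in payload:
        if b in (FRAME, ESC):
            out += bytes([ESC, b ^ XOR])
        else:
            out.append(b)
    return bytes(out)

def unstuff(frame: bytes) -> bytes:
    """Decode; assumes the leading FRAME byte was consumed by the scanner."""
    out, esc = bytearray(), False
    for b in frame:
        if esc:
            out.append(b ^ XOR)
            esc = False
        elif b == ESC:
            esc = True
        else:
            out.append(b)
    return bytes(out)
```

Resynchronising then means: discard bytes until the next FRAME value. The worry in the footnote still applies, of course: a missed start bit can make a data byte look like FRAME.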

Regards,
Rudy Wieser

P.s.
Although the first machine (server if you like) will (currently) be a
Windows machine, the second one (client) might also be DOS, Linux or even a
microcontroller. In other words, I need an approach I can implement myself.
Johann Klammer
2019-06-29 09:26:28 UTC
https://en.wikipedia.org/wiki/XMODEM
R.Wieser
2019-06-29 10:09:54 UTC
Johann ,
Post by Johann Klammer
https://en.wikipedia.org/wiki/XMODEM
While that mentions a few of the same problems I penned down, the actual
solutions for them do not quite materialize, I'm afraid. For example, it
recognises that send-ack, send-ack, etc. doesn't quite work at higher
speeds, but then falls flat on its face in regard to how to handle it.

It's exactly that aspect (detecting the start of a block in a continuous
stream) which often is glossed over and thus hard to find anything about. :-\
How the blazes did the modems of yesteryear do it ?


And by the way, I'm aware of the "sliding window" mechanism, and have the
intent to (at least try to) implement it. But as long as I cannot detect
one (malformed) block ending and the next one starting I don't think I
should put any energy in that. "First things first" and all that.
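The bookkeeping behind that sliding-window intent might look something like this. A sketch only: `tx` stands in for whatever framing/serial layer is assumed to exist, and the class names are invented:

```python
from collections import OrderedDict

class SlidingWindowSender:
    """Keep up to `window` blocks in flight; advance on ACK, retransmit on
    NAK. Sequence numbers let the receiver say *which* block failed."""
    def __init__(self, window=4):
        self.window = window
        self.in_flight = OrderedDict()   # seq -> block bytes
        self.next_seq = 0

    def can_send(self):
        return len(self.in_flight) < self.window

    def send(self, block, tx):
        seq = self.next_seq
        self.in_flight[seq] = block
        self.next_seq += 1
        tx(seq, block)                   # tx() = the (assumed) framing layer
        return seq

    def on_ack(self, seq):
        self.in_flight.pop(seq, None)    # block delivered, free the slot

    def on_nak(self, seq, tx):
        if seq in self.in_flight:
            tx(seq, self.in_flight[seq])  # resend just the damaged block
```

None of this answers the resync question, which is exactly the point being made: the window logic is easy, finding the next block boundary is the hard part.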

Regards,
Rudy Wieser
Johann Klammer
2019-06-29 16:51:34 UTC
You could reencode to a different alphabet and set aside distinct symbols for
communicating those protocol states. This reduces the net data rate and
adds some overhead in your nodes.
I believe the traditional thing to do, is to use a (not unique)
packet header and try to decode it. But discard a packet if the checksum
is incorrect.
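Concretely, that traditional approach can be sketched as follows. The 4-byte header layout (magic byte, 16-bit length, header checksum) is invented purely for illustration; a real protocol would also CRC the payload:

```python
import struct

MAGIC = 0xA5  # illustrative, deliberately *not* guaranteed unique in the data

def find_block(buf: bytes):
    """Scan byte-by-byte for a header that parses and checksums correctly.
    Returns (offset, length) of a plausible block, or None to wait for
    more data. Bytes before the offset are discarded damaged-block garbage."""
    for i in range(len(buf) - 3):
        magic, length, hsum = struct.unpack_from(">BHB", buf, i)
        if magic != MAGIC:
            continue
        if hsum == (magic + (length >> 8) + (length & 0xFF)) & 0xFF:
            return i, length   # plausible header; caller still checks payload
    return None
```

A false match (payload bytes that happen to look like a header) is caught one level up, when the payload checksum fails and the scan simply resumes one byte further on.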
R.Wieser
2019-06-30 07:23:19 UTC
Johann,
Post by Johann Klammer
You could reencode to a different alphabet and set aside distinct
symbols for communicating those protocol states.
Possible. But then, how should I deal with a missed start bit (a one-in-ten
chance) possibly causing unique symbols to appear ?
Post by Johann Klammer
I believe the traditional thing to do, is to use a (not unique)
packet header and try to decode it. But discard a packet
if the checksum is incorrect.
:-) Which brings me straight back to my initial question : How do I discard
that sent data and resync on the next header ? As the checksum is
incorrect I have no idea of how much data (or even header!) actually came in,
and thus where the next header starts.

And that's apart from the possibility of it being out of byte-sync and
reading data bits as if they are start bits ...

Also, do you notice ? We're now exchanging our own thoughts about possible
approaches, even though the problem must have been solved before ... :-)

Regards,
Rudy Wieser
Johann Klammer
2019-06-30 13:06:31 UTC
You must be very lonely.
R.Wieser
2019-06-30 15:03:16 UTC
Johann,
Post by Johann Klammer
You must be very lonely.
Pardon me ? What's that supposed to mean ?

I'm simply trying to create an RS232 connection between two computers
(regardless of whether close by or at a distance). While doing that I've
noticed a few rather low-level problem points (as mentioned in the first post
and repeated in later replies) which I'm trying to solve and have asked help
with.

Personally I don't think that ignoring those problems will do me any good.
Then again, maybe you know something I'm not yet privy to. If so, why don't
you just tell me ?

Regards,
Rudy Wieser
Charlie Gibbs
2019-06-30 20:29:28 UTC
Post by R.Wieser
Johan,
Post by Johann Klammer
You could reencode to a different alphabet and set aside distinct
symbols for communicating those protocol states.
Possible. But than, how should I deal with a missed startbit (a one-in-ten
chance) possibly causing unique symbols to appear ?
Are you talking about a missed start bit or missing start bytes in a protocol?
If it's a missing start bit, you're going to get framing errors so you'll know
that something isn't right. I presume you'll be working in a particularly
noisy environment; normally such errors don't happen too often, and they can
be detected by checking UART status and whatever checksum or CRC scheme you're
using.
Post by R.Wieser
Post by Johann Klammer
I believe the traditional thing to do, is to use a (not unique)
packet header and try to decode it. But discard a packet
if the checksum is incorrect.
:-) Which brings me straight back to my initial question : How do I discard
that send data and resync on the next header ? As the checksum is
incorrect I have no idea of how much data (or even header!) actually came in
and thus where the next header starts.
And thats apart of the possibility it being outof byte-sync and reading
databits as if they are startbits ...
Again, make sure you distinguish start bits (that mark-to-space
transition at the beginning of a byte) from start bytes that are defined
by your protocol.
Post by R.Wieser
Also, do you notice ? We're now exchanging our own thoughts about possible
approaches, even though the problem must have been solved before ... :-)
Yup. The data you're dealing with determines which choices are best.
I presume you're sending a stream of random data in which all 256
combinations are possible. On the other hand, if you send ASCII
text (including data encoded, e.g. with base64), you can tell the
end of a record by the termination characters (e.g. CRLF). You'll
still need some sort of checksum or CRC to guard against errors,
plus some sort of ACK/NAK protocol to request retransmission of
failing blocks. If you don't receive CRLF in a given period of
time, assume the end of the record is missing and ask for a
retransmission.
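A sketch of such a CRLF-terminated record scheme; the `*`-separated hex checksum layout below is invented for illustration (a real protocol would also carry a sequence number for the ACK/NAK):

```python
import base64, binascii

def make_record(data: bytes) -> bytes:
    """Wrap binary data as '<base64>*<2-hex-digit checksum>\\r\\n'."""
    body = base64.b64encode(data)
    csum = sum(body) & 0xFF
    return body + b"*%02X\r\n" % csum

def parse_record(line: bytes):
    """Return the payload, or None if the record is damaged (caller NAKs)."""
    try:
        body, csum = line.rstrip(b"\r\n").rsplit(b"*", 1)
        if int(csum, 16) != sum(body) & 0xFF:
            return None
        return base64.b64decode(body)
    except (ValueError, binascii.Error):
        return None
```

Because records are CRLF-delimited and the encoding keeps CR/LF out of the body, resync is trivial: after any error, read and discard up to the next LF.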

Keep in mind that if the data link is as noisy as you're implying it is,
you should keep block sizes small; it's less efficient, but improves the
odds that a block will get through between line hits. Too large a block
on a noisy line will ensure that nearly every block gets corrupted, and
your throughput can fall to zero.
--
/~\ ***@kltpzyxm.invalid (Charlie Gibbs)
\ / I'm really at ac.dekanfrus if you read it the right way.
X Top-posted messages will probably be ignored. See RFC1855.
/ \ Fight low-contrast text in web pages! http://contrastrebellion.com
R.Wieser
2019-07-01 09:02:03 UTC
Charlie,
Post by Charlie Gibbs
Are you talking about a missed start bit or missing start bytes in a protocol?
The first (though the first may also cause the latter on the next block).
But also about the more mundane "the checksum of the header doesn't match"
(I intend to checksum the header and data separately, with both checksums
stored in the header), meaning I cannot trust the data size field to be
correct.
Post by Charlie Gibbs
If it's a missing start bit, you're going to get framing errors so you'll
know that something isn't right.
Detecting that something goes wrong is not the problem. What is, is what I
should do next.

The moment I want something other than sending blocks in a lock-step fashion
(send, wait for ack, send, wait for ack, etc.) (read: no block-syncing with
the receiver takes place) I will need to find another way to
detect where the malformed block stops and the next one starts.
Post by Charlie Gibbs
I presume you'll be working in a particularly noisy environment
Not per se, but I'm considering that possibility too, as well as just having
a 3-wire connection (no easy handshakes), or one with intermediate devices
(modems). In short, I'm trying to cover all my bases - or at least
think about them.

Bottom line: I'm not trying to hack a "just for me, just for here" solution
together.
Post by Charlie Gibbs
I presume you're sending a stream of random data in which all 256
combinations are possible.
Indeed. Although that data /could/ be text, it could as easily be not.
Post by Charlie Gibbs
e.g. with base64
Although that would always blow up the data (and that on a slow transmission
channel...), that (or something else freeing up certain bytes as being
"special") would be a possibility. It won't solve the disappearing
start bit (and resulting mis-read bits/bytes that follow, which /could/ be
matching one of the special symbols) problem though.
Post by Charlie Gibbs
On the other hand, if you send ASCII text .... you can tell the end
of a record by the termination characters (e.g. CRLF).
Assuming the receiver is still in byte-sync that would be possible there,
yes.

For binary data I was thinking of introducing two or three byte-time-long
pauses between the blocks (functioning both as a byte as well as a block
(re)sync). But that could easily turn to dust as soon as an intermediate
device (modem) is introduced (possibly applying its own blocking system
on top of it, making such pauses disappear)...

I've even been thinking about having the transmitter abuse the BREAK signal
for that (keeping the TXD line low), but I'm not so sure that that won't
have unintended side effects ....

I've also been thinking about simply prepending a byte-sync and a
(hopefully) unique pattern to each block header, making it easier to scan for
the new block. (And I say "hopefully", as it won't be fun if the data
contains the same pattern ...)

In short, there are multiple possibilities. But all seem to come with their
own sets of problems. :-\
Post by Charlie Gibbs
You'll still need some sort of checksum or CRC to guard against errors,
Absolutely.
Post by Charlie Gibbs
plus some sort of ACK/NAK protocol to request retransmission of
failing blocks.
:-) Of course. But that's something I will only (really) start to think of
when I know how I can resynchronise. If I can't find a dependable method,
any time spent on that is wasted. :-\

By the way: If I cannot find a method to skip over the faulty block and read
the next, then I will probably just send a BREAK until data stops arriving
and then purge. After releasing it the sender will start (re)sending
blocks again. Will be the pits on noisy lines though (throughput goes down
the drain as /lots/ of blocks get resent, including the good ones), but it
can be depended on.
Post by Charlie Gibbs
Keep in mind that if the data link is as noisy as you're implying it is,
you should keep block sizes small; it's less efficient, but improves
the odds that a block will get through between line hits.
Agreed. Although that was not the reason for my blocked sending, I did
realize it was an unintended benefit. I'll probably make the size
configurable.

Regards,
Rudy Wieser
Charlie Gibbs
2019-07-01 16:00:25 UTC
Post by R.Wieser
Charlie,
Post by Charlie Gibbs
Are you talking about a missed start bit or missing start bytes in a protocol?
The first (though the first may also cause the latter on the next block).
I wouldn't worry too much about that. As I mentioned previously, a missing
start bit will result in a framing error (likely more than one), which your
receiver can detect.
Post by R.Wieser
But also about the more mundain "the checksum of the header doesn't match"
(I intend to checksum the header and data seperatily, with both checksums
stored in the header), meaning I cannot trust the data size field to be
correct.
This is probably overkill. Most protocols I've worked with have a single
checksum at the end that encompasses the entire packet (header and data).
If something goes wrong, you'll find out soon enough.
Post by R.Wieser
Post by Charlie Gibbs
If it's a missing start bit, you're going to get framing errors so you'll
know that something isn't right.
Detecting that something goes wrong is not the problem. What is is what I
should do next.
The moment I want something else than sending blocks in a lock-step fashion
(send, wait for ack, send, wait for ack, etc) (read: no blocksyncing with
the receiver takes place) I will need to be able to find another way to
detect where the malformed block stops and the next one starts.
If you're looking for a header with a predefined format, you'll be able to
tell quite quickly. If the length field gets corrupted, you'll count the
wrong number of bytes. If you think the block is shorter than it actually
is, you'll assume that the next block starts in what is actually the middle
of the current block's data. The chances that this portion of the data
matches the layout of a block header are very small; count forward the
number of bytes indicated by your supposed next header, and the odds
decrease by the same factor again. If you think the block is longer
than it actually is, you'll probably wind up waiting for data which
won't arrive before your timeout expires. If the next block is already
coming in, you'll again be looking for a header in the middle of its data,
and get a mismatch.
Post by R.Wieser
Post by Charlie Gibbs
I presume you'll be working in a particularly noisy environment
Not per se, but I'm considering that possibility too, as well as just having
a 3-wire connection (no easy handshakes), or one with intermediate devices
(modems). In short, I'm trying to cover my all my bases - or at least
think about them.
Three-wire connections have no easy _hardware_ handshakes, but a simple
ACK/NAK protocol is enough to provide a software handshake.
Post by R.Wieser
Bottom line: I'm not trying to hack a "just for me, just for here" solution
together.
Post by Charlie Gibbs
I presume you're sending a stream of random data in which all 256
combinations are possible.
Indeed. Although that data /could/ be text, it could as easily be not.
Post by Charlie Gibbs
e.g. with base64
Although that would always blow up the data (and that on a slow transmission
channel...) that (or something else freeing up certain bytes as being
"special") would be a possibility. It won't solve the disappearing
startbit (and resulting mis-read bits/bytes that follow and /could/ be
matching one of the special symbols) problem though.
You're obsessing unnecessarily over the missing start bit issue.
As I pointed out earlier, the hardware will detect this and let you know.
Post by R.Wieser
Post by Charlie Gibbs
On the other hand, if you send ASCII text .... you can tell the end
of a record by the termination characters (e.g. CRLF).
Assuming the receiver is still in byte-sync that would be possible there,
yes.
For binary data I was thinking of introducing two or three byte-time long
pauses between the blocks (functions both as a byte as well as block
(re)sync). But that could easily turn to dust as soon as an intermediate
device (modem) is introduced (possibly applying its own blocking system
ontop of it, making such pauses disappear)...
Ugly. Ugly. Ugly. And unnecessary.
Post by R.Wieser
I've even be thinking about having the transmitter abusing the BREAK signal
for that (keeping the TXD line low), but I'm not so sure that that won't
have unintended side effects ....
I've been writing data collection software using serial ports for 35 years.
I can't think of the last time I saw anyone use the BREAK signal. As I
said above, ugly and unnecessary.
Post by R.Wieser
I've also been thinking about simply prepending a byte-sync and a
(hopefully) unique pattern to each blockheader, making it easier to scan for
the new block. (and I say "hopefully", as it won't be fun if the data
contains the same pattern ...)
Oh, you might get a few data bytes matching a header, but things will
fall apart soon enough. Don't worry about it - there are plenty of
far more deserving things to worry about.
Post by R.Wieser
In short, there are multiple possibilities. But all seem to come with their
own sets of problems. :-\
Post by Charlie Gibbs
You'll still need some sort of checksum or CRC to guard against errors,
Absolutily.
Post by Charlie Gibbs
plus some sort of ACK/NAK protocol to request retransmission of
failing blocks.
:-) Ofcourse. But thats something I will only (really) start to think of
when I know how I can resynchronise. If I can't find a dependable method,
any time spend on that is wasted. :-\
Again, you're overestimating the problem. I presume you haven't had much
real-world experience. You'll find that these problems aren't nearly as
subtle as you think. A lot of us have been working with it for years,
and have effective ways of dealing with it.
Post by R.Wieser
By the way: If I cannot find a method to skip over the faulty block and read
the next than I will probably just send a BREAK until data stops arriving
and than purge. After releasing it the the sender will start (re)sending
blocks again. Will be the pits on noisy lines though (thruput goes down
the drain as /lots/ of blocks get resend, including the good ones), but can
be depended on.
Please don't mention BREAK again. As I said, I haven't heard of anyone
using it for literally decades. An ACK or NAK response (wrapped in a
header if you're really paranoid) will suffice without disrupting the
data flow.
Post by R.Wieser
Post by Charlie Gibbs
Keep in mind that if the data link is as noisy as you're implying it is,
you should keep block sizes small; it's less efficient, but improves
the odds that a block will get through between line hits.
Agreed. Although that was not the reason of my blocked sending, I did
realize it was an unintended benefit. I'll probably make the size
configurable.
Good plan. But please remember... KISS. These problems aren't nearly
as bad as you think they are. I've written a lot of systems that send
data over serial connections hundreds of feet long, and unless the
installer wraps the cable around a big power transformer (which has
happened), you're just not going to get that many line hits. (Running
modems over a noisy phone line can be a bit more problematic...)

Study the XMODEM (and optionally YMODEM) protocol. There have been
working systems out there for a long time.
R.Wieser
2019-07-02 10:28:27 UTC
Charlie,
As I mentioned previously, a missing start bit will result in a framing
error
I think you mean ".. will likely, eventually result ...".

A byte of which the start bit is not seen, followed by a 0xFF byte (or others
with all high bits towards its end), for instance will, as far as I can tell,
not ever cause such an error.
If you're looking for a header with a predefined format, you'll be
able to tell quite quickly.
I disagree. The /format/ may be predefined, its contents are not. Unless
you want to put some fixed-value bytes in there. But in that case I think
that having a single header-checksum would be cheaper and more
beneficial. :-)
If the length field gets corrupted, you'll count the wrong number of
bytes.
And possibly into the next block ...
If you think the block is longer than it actually is, you'll probably
wind up waiting for data which won't arrive before your timeout
expires
Whut ?

The whole thing of the "sliding window" method is that the sender /doesn't/
wait for the receiver to acknowledge a block, and (thus) sends the next
block directly after it (until the "window" is full, of course).

If I could actually depend on a pause being present between blocks the
detection of a new header would be a lot simpler (I think I already
mentioned considering creating/using such a pause myself).
count forward the number of bytes indicated by your supposed next header
Bad idea. There is no way to tell what was just read (likely from within
the payload data) and thus what currently is in that field. It could be
/anything/, from zero up to 0xFFFF. (And yes, I could do an "at least a
header's worth, but no more than max block size" check, but that would not
actually help ...)
Three-wire connections have no easy _hardware_ handshakes, but a
simple ACK/NAK protocol is enough to provide a software handshake.
:-( I think I already mentioned (a few times) that I wanted to replace
such a lock-stepped solution with something else.
You're obsessing unnecessarily over the missing start bit issue.
As I pointed out earlier, the hardware will detect this and let you know.
I disagree. The hardware has got a good chance /not/ to detect it
(depending on the contents of the following byte(s)).
Post by R.Wieser
For binary data I was thinking of introducing two or three byte-time long
pauses between the blocks [snip]
Ugly. Ugly. Ugly. And unnecessary.
Ugly ? Agreed. Unnecessary ? As long as I do not see/find a better
method ....
Oh, you might get a few data bytes matching a header,
As I already said, I think there is nothing to match with.
but things will fall apart soon enough.
Absolutely, as your proposed "just skip a(n effectively) random amount of
bytes and try to match again" will, statistically, cause
synchronisation only once in max_block_size attempts (or 1/65536 if no
sanity checks on the block-size field are performed). <whistle>
Please don't mention BREAK again. As I said, I haven't heard of
anyone using it for literally decades.
:-) A "but /we/ have not used it for ages" doesn't sound like a good reason
for me to ignore it - especially as it seems to solve a problem. Is there
a problem with its use ? If so, I'd like to hear. Do you have a better
solution ? Then I'd like to hear that too.

But I could of course just wait until the "sliding window" buffer fills up,
causing such a receive timeout, and then have both programs purge their
buffers. Would that be a better approach ? If so, why ?
An ACK or NAK response (wrapped in a header if you're really
paranoid) will suffice without disrupting the data flow.
?.....

I thought I mentioned a few times now that I do /not/ intend to use a
lock-stepped method, but a "sliding window" one instead. As such the
incoming ACK or NAK will not have much, if any, possibility to affect the
timing between the ACK/NAKed block and the next one (that next block, and
then some, are already in the receiver's buffers).

Also, in case of the "sliding window" protocol the ACK/NAK needs to be
accompanied by /which/ block it's talking about - or stand the chance that it
also gets out of sync, and ACK/NAKs the wrong blocks. So yes, it would be
wrapped in - or rather part of - a (checksummed) response header.

And that means you /again/ jumped to a lock-stepped method (where the sender
only transmits a new block when either the ACK or NAK is received), even
though I've mentioned several times that that's /not/ what I intend to use.
:-(
But please remember... KISS
:-) In that case I would just use the lock-stepped one (throughput be
damned).
Study the XMODEM (and optionally YMODEM) protocol. There have
been working systems out there for a long time.
Yeah, I know them. In my DOS days I implemented at least one of them.
AFAIR both of them use the lock-stepped method. Z-Modem even mentions a
hack to enhance throughput on higher-latency lines by "pre-acknowledging" the
incoming blocks (and supposedly then just hard-aborting the whole
transmission when a block checksum fails). So, of no value in regard to a
sliding window approach.

Regards,
Rudy Wieser

P.s.
Do me a favour and do NOT mention (or think of) the lock-stepped method
again. As far as I can tell solutions in regard to it are non-applicable
to the "sliding window" one. Mixing them up only confuses the matter (but
feel free to correct me if you think I'm wrong here).
Charlie Gibbs
2019-07-02 17:56:30 UTC
Post by R.Wieser
Charlie,
As I mentioned previously, a missing start bit will result in a framing
error
I think you mean ".. will likely, eventually result ...".
A byte of which the startbit is not seen folowed by an 0xFF byte (or others
with all high bits towards its end) for instance will, as far as I can tell,
not ever cause such an error.
In theory, perhaps. In practice, you'll get a framing error within
a few bytes. Remember, the hardware is looking for a proper stop bit,
and any sort of bit shifting will likely put a zero there sooner rather
than later (unless you're sending records consisting of nothing but 0xff).
Post by R.Wieser
If you're looking for a header with a predefined format, you'll be
able to tell quite quickly.
I disagree. The /format/ may be predefined, its contents are not. Unless
you want to put some fixed-value bytes in there. But in that case I think
that having a single header-checksum would be the cheaper and more
beneficial. :-)
Don't knock fixed values. You might want to have a few predetermined
record types, for instance. Or you can do the trick that Xmodem does:
send a one-byte block number followed by its inverse, and assume an
error if the two bytes don't XOR to zero.
Post by R.Wieser
If the length field gets corrupted, you'll count the wrong number of
bytes.
And possibly into the next block ...
If you think the block is longer than it actually is, you'll probably
wind up waiting for data which won't arrive before your timeout
expires
Whut ?
"Into the next block" might encounter the situation where the next block
hasn't been received yet. This can happen if we're looking at the last
block of a transmission, or if there's a considerable delay between blocks.
Post by R.Wieser
The whole thing of the "sliding window" method is is that the send /doesn't/
wait for the receiver to acknowledge a block, and (thus) sends the next
block directly after it (until the "window" is full ofcourse).
If I could actually depend on a pause being present between blocks the
detection of a new header would be a lot simpler (I think I already
mentioned considering creating/using such a pause myself).
On the other hand, you seem to be depending on a pause _not_ being present.
Either assumption is not good; you need a scheme which works whether you
get one record an hour or have them coming at you like a fire hose.
Post by R.Wieser
count forward the number of bytes indicated by your supposed next header
Bad idea. There is no way to tell what was just read (likely from within
the payload data) and thus is currently is in that field. It could be
/anything/, from zero upto 0xFFFF. (And yes, I could do a "at least a
headers worth, but no more than max block size" check, but that would not
actually help ...)
Indeed; the purpose of my example was to show what could go wrong.
Post by R.Wieser
Three-wire connections have no easy _hardware_ handshakes, but a
simple ACK/NAK protocol is enough to provide a software handshake.
:-( I think I already mentioned (a few times) that I wanted to replace
such a lock-stepped solution with something else.
There's nothing preventing you from using a sliding-window protocol
on a three-wire connection.
Post by R.Wieser
You're obsessing unnecessarily over the missing start bit issue.
As I pointed out earlier, the hardware will detect this and let you know.
I disagree. The hardware has got a good chance /not/ to detect it
(depending on the contents of the following byte(s)).
I have years of experience proving the contrary. And even if your
hardware is sufficiently brain-damaged to not properly detect framing
errors, a decent checksumming protocol will quickly catch errors.

I'm not saying that you can never encounter a combination that will
fool various error-checking schemes. But in the real world, the odds
of such things are very small. More sophisticated protocols and
improved CRCs reduce the probability of an error still further.
If this wasn't so, then the billions of TCP/IP packets that flow
through the Internet each second would show lots of corruption.
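For the record, the kind of "improved CRC" being referred to could be something like CRC-16/XMODEM (polynomial 0x1021), the checksum Xmodem-CRC itself uses. Sketched bit-by-bit for clarity; table-driven versions are the usual optimisation:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM: poly 0x1021, initial value 0x0000, no reflection."""
    crc = 0x0000
    for byte in data:
        crc ^= byte << 8                 # fold the next byte into the top
        for _ in range(8):               # shift out one bit at a time
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc
```

A 16-bit CRC misses a random corruption with probability about 2^-16, and catches all single- and double-bit errors and all burst errors up to 16 bits, which is why it beats a simple additive checksum on serial lines.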

<remainder snipped>
Post by R.Wieser
P.s.
Do me a favour and do NOT mention (or think of) the lock-stepped method
again. As far as I can tell, solutions for it are not applicable
to the "sliding window" one. Mixing them up only confuses the matter (but
feel free to correct me if you think I'm wrong here).
I think you had better go away and do some reading and experimenting.
It's obvious to me that you have very little real-world experience
in these things. Otherwise you'd understand how well these problems
have been dealt with in the past.
--
/~\ ***@kltpzyxm.invalid (Charlie Gibbs)
\ / I'm really at ac.dekanfrus if you read it the right way.
X Top-posted messages will probably be ignored. See RFC1855.
/ \ Fight low-contrast text in web pages! http://contrastrebellion.com
Charlie Gibbs
2019-07-03 05:20:08 UTC
Permalink
Post by Charlie Gibbs
Don't knock fixed values. You might want to have a few predetermined
send a one-byte block number followed by its inverse, and assume an
error if the two bytes don't XOR to zero.
s/zero/0xff/

My bad.
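In code, the corrected check is tiny. A minimal sketch in Python (the helper names are made up; the number-plus-complement pair itself is the classic Xmodem convention):

```python
def make_blockno(n: int) -> bytes:
    """Xmodem-style block number: the number (mod 256) followed by its
    ones' complement, so a valid pair always XORs to 0xFF."""
    b = n & 0xFF
    return bytes([b, b ^ 0xFF])

def blockno_valid(pair: bytes) -> bool:
    """A corrupted or out-of-sync byte pair is unlikely to still XOR to 0xFF."""
    return len(pair) == 2 and (pair[0] ^ pair[1]) == 0xFF
```

Note that the pair XORs to 0xFF, not to zero, which is exactly the correction above.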
R.Wieser
2019-07-03 09:25:40 UTC
Permalink
Charlie,
Post by Charlie Gibbs
In theory, perhaps. In practice, you'll get a framing error
within a few bytes.
For me you're way too eager to ignore probabilities with a low occurrence,
I'm afraid. :-\
Post by Charlie Gibbs
Remember, the hardware is looking for a proper stop bit,
About that: when a start bit disappears, the data bit next to it may /not/
be a zero, otherwise the stop bit of the out-of-sync byte will fall on the
next byte's start bit, which causes a framing error (just realized that
this morning).
Post by Charlie Gibbs
(unless you're sending records consisting of nothing but 0xff).
Remember that I said that I have no idea of what might be in the datablock ?
Yes, it's rather possible that a bunch of bytes, or even a number of blocks,
filled with 0xFF chars could be there (0x00 or 0xFF, the most used fillers in
the world). And that's even worse: a missed start bit will just make that byte
disappear, no help from subsequent bytes needed and no framing error either.
Post by Charlie Gibbs
Don't knock fixed values.
I don't. But as you seemed to think that checksumming the header separately
was already too much ...
Post by Charlie Gibbs
Or you can do the trick that Xmodem does: send a one-byte block number
followed by its inverse, and assume an error if the two bytes don't XOR to
zero.
:-) I had already written that method down in the previous reply, but
ultimately decided not to include it.
Post by Charlie Gibbs
On the other hand, you seem to be depending on a pause _not_ being present.
No, I'm not.

Geez man, can't you understand that when using a "sliding window" method the
receiver has /no control/ over how fast one block follows the other ? The
first number of blocks /will/ come in without any (noticeable) pause between
them. Then, depending on the "sliding window" size and the latency of the
line, the blocks /could/ have a delay between them (meaning: the above size
has been chosen too small), but not necessarily so.

And /if/ such a delay is present when blocks are received into the "sliding
window" FIFO *you cannot use it* because you're sucking on the other side of
it (with /lots/ of bytes in between the two ends).
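Since the two methods keep getting conflated in this thread, a toy time-stepped model (Python; the window size and per-tick timing are assumptions, not anything from a real protocol) shows exactly the behaviour described above: the first window's worth of blocks goes out back to back, and only afterwards can the round-trip latency open up gaps.

```python
def simulate(window: int, total: int, ack_latency: int) -> dict:
    """Toy model of a sliding-window sender.  One block takes one tick
    to transmit; the ACK for a block arrives ack_latency ticks after it
    was sent.  Returns, per block number, the tick at which its
    transmission started."""
    send_times = {}
    pending = []        # (tick the ACK arrives, block number)
    acked = 0           # blocks 0 .. acked-1 are acknowledged
    next_blk = 0
    t = 0
    while acked < total:
        # deliver any ACKs that are due at this tick
        while pending and pending[0][0] <= t:
            pending.pop(0)
            acked += 1
        # transmit the next block if the window allows it
        if next_blk < total and next_blk - acked < window:
            send_times[next_blk] = t
            pending.append((t + ack_latency, next_blk))
            next_blk += 1
        t += 1
    return send_times
```

With a window of 4 and a round trip of 6 ticks, blocks 0 to 3 leave at ticks 0, 1, 2, 3 without any pause, and block 4 is only sent at tick 6, once the ACK for block 0 returns.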
Post by Charlie Gibbs
Indeed; the purpose of my example was to show what could go
wrong.
I must have missed that part.
Post by Charlie Gibbs
There's nothing preventing you from using a sliding-windows
protocol on a three-wire connection.
Apart from the problem for which I started my current thread, you mean ?
:-(
Post by Charlie Gibbs
Post by R.Wieser
I disagree. The hardware has got a good chance /not/ to detect it
(depending on the contents of the following byte(s)).
I have years of experience proving the contrary.
No, you haven't. All you have is never having been in a situation where
such errors mattered. Most likely because you have never used anything
other than lock-stepped communication. :-(
Post by Charlie Gibbs
And even if your hardware is sufficiently brain-damaged to
not properly detect framing errors,
Sigh .... Or the errors are not of the framing kind, and thus pass that kind
of check without triggering it.
Post by Charlie Gibbs
a decent checksumming
protocol will quickly catch errors.
As I've mentioned a few times now, detecting errors is not the problem.
The problem is what the best way is to progress afterwards (to the start
of the next block's header).
Post by Charlie Gibbs
I'm not saying that you can never encounter a combination that
will fool various error-checking schemes.
Thank {deity} for at least acknowledging that !
Post by Charlie Gibbs
But in the real world, the odds of such things are very small.
More sophisticated protocols and improved CRCs reduce
the probability of an error still further.
Guess what I'm currently trying to create ...

Indeed, that "sophisticated protocol".
Post by Charlie Gibbs
I think you had better go away and do some reading and experimenting.
It's obvious to me that you have very little real-world experience
in these things.
If I thought I had, I would not have felt the need to post (my rather
explicit) question here (and I do suggest you (re-)read at least the
caption, if not the whole first post).
Post by Charlie Gibbs
Otherwise you'd understand how well these problems
have been dealt with in the past.
Which I, in that same first post, complained about that I could not find any
examples of.

And no, X-, Y- or Z-modem definitely is /not/ an example of how a "sliding
window" approach is implemented. As I already mentioned, Z-modem even
describes a "pre-acknowledge" hack which trashes its whole "ask for a resend
when something goes wrong" mechanism - and thus, I imagine, has to
hard-abort when anything goes wrong. A single noise spike on a
high-latency line ? Resend everything please. :-(


You simply have no idea of what a "sliding window" approach is, have you ?
Even though I've tried to tell you several times that a lock-stepped method
is something rather different, you /still/ keep thinking from that "just wait
for a timeout" perspective.

You might have a lot of experience, but so does that guy that has pushed a
button on a certain machine for 20 years. It doesn't mean shit when you put
him in front of a different one. What's your excuse ?

Regards,
Rudy Wieser
Charlie Gibbs
2019-07-03 16:16:13 UTC
Permalink
On 2019-07-03, R.Wieser <***@not.available> wrote:

<snip>
Post by R.Wieser
You might have a lot of experience, but so does that guy that has pushed a
button on a certain machine for 20 years. It doesn't mean shit when you put
him in front of a different one. What's your excuse ?
You seem so sure of yourself that I'm surprised you even bother to post
questions here. What you have said is an insult to the thousands of us
who have been solving these problems for the past half a century. I have
better ways of spending my time than putting up with crap like this.

*plonk*
R.Wieser
2019-08-03 08:17:21 UTC
Permalink
Charlie,
Post by Charlie Gibbs
You seem so sure of yourself that I'm surprised you even
bother to post questions here.
:-) I'm rather sure about the questions I post, and have enough experience
to notice if the replies are actually addressing the presented problem.
Post by Charlie Gibbs
What you have said is an insult to the thousands of us
who have been solving these problems for the past half
a century.
The problem is that /you/ have (rather obviously) not worked with "these
problems", but refused to accept that, and as a result offered dumb
suggestions - like there being no need to wrap an ack/nak in a header.
 
And that's apart from the stupidity of pruning a problem tree (you don't
need to worry about missed start bits; we never used a break signal, so you
don't need to either) even before the problem is clear to you.
Post by Charlie Gibbs
I have better ways of spending my time than putting
up with crap like this.
I'm sure you have.

But let's face it: I've tried several times to inform you of the problems a
"sliding window" approach has, specifically that /no/ synchronisation is
done in between the sending of the blocks. Something that still seems to
go over your head. And yes, there /is/ a need to wrap the ack/nak in a
header (indicating which block it's for). Otherwise a disappearing ack/nak
causes the whole acknowledgement scheme to go out of sync.
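To make that concrete, a numbered acknowledgement can be as small as three bytes. A sketch (Python for brevity; the ACK/NAK values are the classic ASCII control codes, and the complemented block number is borrowed from Xmodem - the frame layout itself is made up for illustration):

```python
ACK, NAK = 0x06, 0x15  # classic ASCII control codes

def make_ack(block_no: int, ok: bool) -> bytes:
    """A numbered acknowledgement: type byte, block number (mod 256),
    and its ones' complement so a corrupted number is detectable."""
    b = block_no & 0xFF
    return bytes([ACK if ok else NAK, b, b ^ 0xFF])

def parse_ack(frame: bytes):
    """Returns (ok, block_no), or None if the frame is invalid."""
    if len(frame) != 3 or frame[0] not in (ACK, NAK):
        return None
    if frame[1] ^ frame[2] != 0xFF:
        return None
    return (frame[0] == ACK, frame[1])
```

Because every acknowledgement names its block, a lost or corrupted one costs a retransmission of that block at worst; it cannot shift every subsequent ack/nak onto the wrong block.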


And as a last jab at your "experience": RFC1055. Yes, SLIP. Over 30
years old. Also:
https://en.wikibooks.org/wiki/Serial_Programming/IP_Over_Serial_Connections

The latter made me realize that I considered the synchronisation /
delimiting as a part of the datablock itself, instead of regarding it as a
separate layer. An approach which isolates the datablocks from each
other, so that problems with them are of the same level as with the
X/Y/Z-modem protocols.
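That framing layer fits in a few lines. A Python sketch of RFC 1055's byte stuffing (the constants END=0xC0, ESC=0xDB, ESC_END=0xDC, ESC_ESC=0xDD are the ones the RFC defines; the leading END is the RFC's own trick for flushing line noise accumulated between frames):

```python
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD

def slip_encode(payload: bytes) -> bytes:
    """Frame a block RFC 1055-style: escape END/ESC inside the payload,
    then terminate with END.  A receiver that loses sync simply discards
    bytes until the next END and is realigned for the following block."""
    out = bytearray([END])          # flush any line noise before the frame
    for b in payload:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(END)
    return bytes(out)

def slip_decode(stream: bytes):
    """Split a raw byte stream back into frames, dropping empty ones."""
    frames, cur, esc = [], bytearray(), False
    for b in stream:
        if esc:
            cur.append(END if b == ESC_END else ESC if b == ESC_ESC else b)
            esc = False
        elif b == ESC:
            esc = True
        elif b == END:
            if cur:
                frames.append(bytes(cur))
                cur = bytearray()
        else:
            cur.append(b)
    return frames
```

Garbled frames still have to be rejected by a checksum at the layer above, but resynchronisation itself costs nothing: the next END byte is always a clean block boundary.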

In other words: I've solved my problem. Thanks for your "help".

Regards,
Rudy Wieser

P.s.
I had all of the above even before you posted your "plonk" message. I just
got angry enough at your "I know it, 'cus I got /experience/!" stance -
without showing any kind of understanding (and refusing to respond to my
attempts to explain this-and-that) - that I wanted to give myself some time
to cool down. As such I just read your post today.
