Modbus RS485 Half-Duplex Communications Faults

I've been using the serial interface cards with few issues - rarely have problems getting devices to communicate. Lately I have one that has us stumped. The instrument in question is a Sodium analyzer (manufactured in France, I believe) whose configuration provides the usual settings for baud rate, parity, and so on. It is called out in its manual as "Modbus RTU . . . RS 485 link is very easy to use; installation of the twisted pair does not require any specific knowledge . . ."

Initially we added it to an existing twisted-pair network with seven other devices (another analyzer, a WiHart Gateway, and five MTL 8000 remote IO boxes). This network is running 19.2 kbaud / even parity etc., and we set the new analyzer to match. Whenever we gave it an address the DCS was polling, not only would it not communicate, but several other devices on the network would start having faults. We made many attempts: adding / subtracting termination resistors, changing polarity, swapping out the Modbus option board for a new one, experimenting with port settings (transmit delay, message timeout, etc.), and trying various network addresses, all to no avail. Same behavior every time. As soon as we changed it to an address DeltaV wasn't polling, the network would clean up and normal communications would resume.

Now we have moved it to a spare serial port where it is the only device. When we slow the baud rate to 1200, no parity, we start to get a few good messages. Every few seconds "Diagnostics" toggles from "Not Communicating" to "Error Response" (CRC Error is often shown in the "View Dataset Registers" dialog box). Since I last reset the stats for the port, there have been almost 10,000 "Good" messages received versus over 670,000 "Messages with Errors".
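For anyone who wants to check captured frames by hand, here is a minimal Python sketch of the CRC-16 that Modbus RTU appends to every frame (initial value 0xFFFF, reflected polynomial 0xA001, low byte sent first). The function names are my own and are not part of DeltaV or the analyzer; this only shows what a "CRC Error" means at the byte level.

```python
def modbus_crc16(data: bytes) -> int:
    """Return the CRC-16/MODBUS value of data as a 16-bit integer."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def append_crc(frame_without_crc: bytes) -> bytes:
    """Append the CRC in wire order (low byte first, then high byte)."""
    crc = modbus_crc16(frame_without_crc)
    return frame_without_crc + bytes([crc & 0xFF, crc >> 8])


def frame_is_valid(frame: bytes) -> bool:
    """True if the last two bytes match the CRC of everything before them."""
    if len(frame) < 4:  # address + function + 2 CRC bytes at minimum
        return False
    received = frame[-2] | (frame[-1] << 8)
    return modbus_crc16(frame[:-2]) == received
```

A single corrupted bit anywhere in the reply, for example in the last character, is enough to make `frame_is_valid` return False, which is consistent with replies that are mostly rejected but carry correct data whenever they do pass.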

When I do get a "Good" message, the data is correct (we are polling for 8 floating point data registers (16 bytes) currently). When I add another data set, the performance gets much worse (fewer "good" responses, which are few to begin with). I have tried polling for more or fewer registers in the primary data set. The OEM's mapping is a hodge-podge e.g. a dozen floating point, then an integer, then another few floating point, and so on.
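Since the map mixes floats and integers, it may help to decode raw register values offline. The sketch below assumes each IEEE 754 float occupies two consecutive 16-bit registers, and the word order (high word first, or swapped) is an assumption to check against the OEM manual; `registers_to_floats` is a hypothetical helper, not a DeltaV function.

```python
import struct


def registers_to_floats(registers, word_swap=False):
    """Convert pairs of 16-bit Modbus registers into 32-bit floats.

    word_swap=False assumes the high word comes first; word_swap=True
    assumes the low word comes first. Which one a given device uses
    is device-specific.
    """
    if len(registers) % 2:
        raise ValueError("need an even number of registers")
    values = []
    for i in range(0, len(registers), 2):
        first, second = registers[i], registers[i + 1]
        if word_swap:
            first, second = second, first
        raw = struct.pack(">HH", first, second)
        values.append(struct.unpack(">f", raw)[0])
    return values
```

For example, `registers_to_floats([0x41C8, 0x0000])` gives `[25.0]`, and the same value arrives as `[0x0000, 0x41C8]` from a device that sends the low word first.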

The OEM provides built-in, DIP-switch-selectable terminators on their Modbus board, which don't seem to affect much. Shield is continuous to the instrument and terminated only on the house end.

I have looked at the signals on the oscilloscope (connected at the host end) and you can see two distinct packets: one longer and higher in amplitude (10 V maybe?), which I think is DeltaV, immediately followed by a shorter, lower-amplitude packet (~5 V). There is an artifact - I believe - on the response signal, like a high-frequency spike on the "rising" edge of the square wave.

Anyone fought one like this before?

  • I haven't had exactly that issue but I've seen similar things, and they are frequently caused by the host and the device having a different ground reference, or just a device that doesn't have a very well built RS485 card. To compensate for this I install an isolator between them (I typically use one from B&B: www.bb-elec.com/.../485OPDRI_3713ds.pdf).

  • In reply to chessley:

    That sounded like it had promise so we obtained one. So far not much luck. Lights blink but message errors still abound, worse "with" parity than with no parity. Tried ± terminators on both channels, slowed it down to 2400 baud (the slowest DIP-switch-selectable rate), and the ratio of "good" responses to errors is maybe a little better than straight through at 1200 baud / no parity.

    The 4850 is out in the shelter close to the analyzer. Are both ends of a 485 half-duplex cable better to have terminators, or just one? The two are only maybe 5 feet apart.

  • In reply to John Rezabek:

    If only the Sodium Analyzer is connected to the spare serial port, have you tried using RS232 instead of RS485, and checking the responses over a point-to-point serial connection?

    Regards

  • In reply to John Rezabek:

    485, right. The receivers and transmitter of all multidropped devices share the same pair of wires and therefore only one device can be transmitting at any one time. A problem I've seen a few times now with symptoms similar to yours is due to one of the devices on your network holding on to the transmit state for too long after it has finished transmitting or disabling its transmitter while still transmitting the last character(s) of the response message.

    After transmitting, the device should switch its transmitter to high impedance (tristate). The timing of this is quite critical, and a slower baud rate can often make things worse in the latter situation. If the device turns off the transmitter as the last character is transmitted from the UART, it can corrupt that last character because the transmitter switches off too quickly. If the device waits for the transmit buffer to be empty before switching the transmitter to tristate, and it is slow returning to tristate, you end up corrupting the first character of the next transmission from the master device, DeltaV in this case.

    I would suggest your new device has one of the poor characteristics I've described.

    If it's turning the transmitter off too early and corrupting its own transmitted message, I would recommend a higher baud rate.

    If it is turning its transmitter off too slowly and corrupting the next master request message, I would suggest increasing the inter-transmission delay available on the DeltaV serial card port/device setting form. Put it up to 1000ms and then back it down as low as you can go before the problem comes back. Then put the value up just 100ms more than that value. The slower the device is at switching off its transmitter, and the lower the baud rate, the higher the value required.

    It is likely to be the latter problem, based on you stating that it's upsetting other devices on the network and even when only point to point. If it was the former problem the device manufacturer wouldn't be doing much business with this device.

    The last time I had this problem, only 6 months ago, it was the second cause, and upping the inter-message transmit delay to 200ms worked for me.

    Hope this makes some sense to you, and gets you towards a working system.
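To put rough numbers on the timing discussed above, here is a small Python sketch of character time, the Modbus RTU inter-frame silent interval (3.5 character times, with a fixed 1.75 ms commonly used above 19200 baud), and the duration of a read-holding-registers reply. It assumes the 11-bit RTU character from the Modbus serial line specification; a device running one stop bit with no parity would be 10 bits, so `bits_per_char` is adjustable. The function names are illustrative only.

```python
def char_time_s(baud, bits_per_char=11):
    """Seconds needed to send one character on the wire."""
    return bits_per_char / baud


def silent_interval_s(baud, bits_per_char=11):
    """Minimum idle time between RTU frames (3.5 character times)."""
    if baud > 19200:
        return 0.00175  # fixed value commonly used at high baud rates
    return 3.5 * char_time_s(baud, bits_per_char)


def read_response_time_s(baud, n_registers, bits_per_char=11):
    """Wire time of a function 03/04 reply carrying n_registers."""
    # address + function + byte count + 2 bytes per register + 2 CRC bytes
    n_bytes = 1 + 1 + 1 + 2 * n_registers + 2
    return n_bytes * char_time_s(baud, bits_per_char)
```

At 1200 baud one character lasts about 9.2 ms and an 8-register reply occupies the pair for roughly 190 ms, so a transmitter that lets go one character early or late is off by several milliseconds. At 19200 baud the same character lasts under 0.6 ms.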

  • In reply to lsocorro:

    I second the thought of Isocorro, as this would rule out a fundamental packet format issue, but the fact you were getting good comms occasionally might rule that out already.

    Worth a try though as it totally rules out the 485 transmitter corruption possibility.

  • In reply to John Rezabek:

    You should have two and only two terminators on the network, regardless of the number of devices. These should be at the far ends of the twisted-pair cable. They are there to stop line reflections.

    Be careful that there are no built-in terminators in the devices; these can sometimes be physically switched in or out of the circuit using DIP switches on the device, and sometimes by a software setting in the device.
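A quick way to see why extra terminators hurt is to compute the load the driver sees when several are in parallel. The sketch below assumes 120 ohm terminators, which is typical for RS-485 cable but is not stated anywhere in this thread; check the actual values on your hardware.

```python
def parallel_resistance(resistances):
    """Equivalent resistance in ohms of resistors connected in parallel."""
    if not resistances:
        raise ValueError("need at least one resistor")
    return 1.0 / sum(1.0 / r for r in resistances)


# Assumed 120 ohm terminators; values are illustrative only.
two_terminators = parallel_resistance([120.0, 120.0])
three_terminators = parallel_resistance([120.0, 120.0, 120.0])
```

Two terminators give about 60 ohms across the pair, and a third one left switched in by accident pulls that down to about 40 ohms. RS-485 drivers are generally rated for a differential load of around 54 ohms, so each extra terminator costs signal amplitude, which a weak driver may not be able to spare.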

  • In reply to IntuitiveNeil:

    I would like to try the RS 232 option but it doesn't appear to be an option for this instrument. I tried 19.2 kbaud with the B&B isolator and there were zero good messages. Lower baud rate helps. I will make sure I have no more than two terminators (assume the serial card has one?). With the isolator, I presume I have two independent "segments" each of which should have two terminators. Heading over to try the transmit delay to see if we get any improvement.

  • In reply to IntuitiveNeil:

    Still testing . . . hooked up a "BusHealth" scope meter and found serious distortion on the "reply" portion of the signal. It went away when the "terminator" option on the host side was switched off. BusHealth statistics all good after this at 2400 baud / no parity (purely the physical signal - looks at jitter / overshoot / voltage levels / rise time and maybe one or two other measurements). There is still what I believe is an artifact on the trailing square wave of the device message - it stays high for a couple of wavelengths then has a slow decay. The slow decay seemed to conclude well before the next host message was generated, but I decided to try IntuitiveNeil's transmission delay tuning suggestion. At 1000 ms the ratio of good / bad appeared to be somewhat better. Increased it as high as 10000 ms but no appreciable improvement. With all the quiet time I thought, why not increase retries, so I increased them from the default 1 to 10. Now it appears about 25 to 35% of messages have "good" replies, and the status of dataset registers remains good as well.

    B&B isolator + low baud rate + transmission delay + 10 retries looks like it might be a winner.

    I will have a look at stats after lunch then try adding another dataset. If the communications hang in there I think this solution will suffice.
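A rough way to see why retries keep the dataset status good even with a poor per-message success rate is sketched below in Python. It assumes each attempt succeeds independently with the same probability and that "retries" means additional attempts after the first; both are assumptions about the link and the serial card, not documented behavior.

```python
def success_with_retries(p_single, retries):
    """Probability that at least one of (1 + retries) attempts succeeds."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    attempts = 1 + retries
    return 1.0 - (1.0 - p_single) ** attempts
```

With a 30% per-attempt success rate, `success_with_retries(0.3, 10)` comes out at about 0.98, compared with about 0.51 for a single retry. If the errors come in bursts rather than independently, the real figure will be lower.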