Monitoring a Cisco 675 with Unix
CBOS Versions and General Configuration
Update: Fri Oct 12 08:21:05 PDT 2001:
We're now running CBOS version 2.4.3. You can once again extract byte traffic from SNMP via MRTG in the same way you could for 2.3.5 (In byte counts are still wrong or 0, so extract counts using Out byte counts only). Further, the SNMP agent seems to be more stable under NET-SNMP snmpwalk queries. Even so, we are now using telnet/expect scripts to extract the most accurate data possible; see below for our current configuration. SNMP traps now correctly carry the eth0 address instead of 0.0.0.0.
There are some changes regarding IP filters with 2.4.3. For instance, once you set any rule, a "Deny All" rule is automatically added for you. For some help see these two threads (1, 2) on filtering. Also, passwords are now stored MD5 encrypted. I note that a "Commander" password was inserted into the configuration even though I don't use Commander. Hmmm.
Update: Fri Apr 6 08:21:05 PDT 2001:
N.B. The CBOS 2.4.1 SNMP service is different from that in previous versions, I think in order to make MMI (auto-provisioning) available. You won't be able to extract accurate byte traffic from SNMP via MRTG or any other SNMP query. I think I was also able to crash the SNMP agent/engine using snmpwalk. It stopped responding to even the simplest of queries and wouldn't work again until I rebooted the modem.
We're using version 2.3.5.012 of CBOS, and we're in bridging mode as per our ISP (olywa.net). In version 2.2.0, queries for stats on eth0 left a curious alarm in the log:
Feb 20 19:30:11 cisco 000:00:01:44 TCP Alarm MTU value returned by get_ip_mtu was zero
Check your version using the CBOS command "show version":
cbos>show version
Cisco Broadband Operating System
CBOS (tm) 675 Software (C675-I-M), Version v2.3.5.012 - Release Software
Copyright (c) 1986-2000 by cisco Systems, Inc.
Compiled May 9 2000 15:20:16
NVRAM image at 0x10359d90
*** RFC1483 Bridging Mode Enabled ***
Note that I do not advocate or endorse updating your CBOS. If you choose to upgrade, I strongly encourage you to know exactly what you are doing and how to recover from a disaster. I've had to recover from a fouled upgrade, so don't fool yourself by thinking it won't happen to you. See also Setting up the Cisco 675 for general information and other links. Postyware is a site where one can find CBOS images for both the Cisco 675 and 677 DSL modems.
Measurement and Telemetry with Unix tools
It's very easy to set up the Cisco 675 for remote syslogging. These CBOS commands have the Cisco send syslog messages to a remote host:
cbos#set syslog on
SYSLOG is enabled
cbos#set syslog remote 192.168.1.254
SYSLOG will now send messages to 192.168.1.254
Cisco sends syslog traffic with the UUCP facility, so in your syslog configuration, override the usual UUCP handling and instead do something like this:
*.info,uucp.none        /var/log/messages
uucp.*                  /var/log/cisco.log
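Once the modem's messages land in /var/log/cisco.log they are easy to pick apart with a script. Here is a minimal sketch in Python, assuming every line looks like the alarm quoted earlier: syslog timestamp, host name, a ddd:hh:mm:ss modem uptime, a subsystem, a severity word, then free text. The field names are guesses from that one sample, not a documented CBOS format.

```python
import re

# Assumed layout, inferred from a single sample line; not a documented CBOS format.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+"  # syslog timestamp
    r"(?P<host>\S+)\s+"                              # name of the logging host
    r"(?P<uptime>\d+:\d\d:\d\d:\d\d)\s+"             # modem uptime, ddd:hh:mm:ss
    r"(?P<subsystem>\S+)\s+"                         # e.g. TCP
    r"(?P<severity>\S+)\s+"                          # e.g. Alarm
    r"(?P<message>.*)$"                              # free text
)


def parse_cisco_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def uptime_seconds(uptime):
    """Convert a ddd:hh:mm:ss uptime string to seconds."""
    days, hours, minutes, seconds = (int(part) for part in uptime.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


sample = ("Feb 20 19:30:11 cisco 000:00:01:44 TCP Alarm "
          "MTU value returned by get_ip_mtu was zero")
fields = parse_cisco_line(sample)
print(fields["severity"], uptime_seconds(fields["uptime"]))
```

Lines that do not fit the assumed pattern come back as None, so anything unexpected in the log is skipped rather than misparsed.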
I've set up our Cisco 675 to be monitored with SNMP for packet transfer information, and this information is presented via MRTG.
I also enabled SNMP traps and send them to a host running snmptrapd (from net-snmp). This daemon can be set up to execute programs or scripts on receiving certain traps: I've set it up to send me mail when the modem sends a "link-up" or "cold start" trap.
I'm doing this in a NetBSD environment, but I'm sure FreeBSD or Linux would serve just as well.
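The trap-to-mail step can be sketched as a small handler that snmptrapd runs through a traphandle directive. snmptrapd feeds such a program the sending host name on the first line of standard input, the transport address on the second, and one "OID value" pair per line after that. The script below is an illustration in Python, not the handler actually in use here; the mail addresses and the local SMTP server are placeholders.

```python
import smtplib
from email.message import EmailMessage

# Generic trap identifiers from SNMPv2-MIB and IF-MIB.
WATCHED = {
    "coldStart": "1.3.6.1.6.3.1.1.5.1",
    "linkUp": "1.3.6.1.6.3.1.1.5.4",
}
SNMP_TRAP_OID = "1.3.6.1.6.3.1.1.4.1"  # snmpTrapOID


def parse_trap(text):
    """Split snmptrapd handler input into host, address and varbind pairs."""
    lines = [line for line in text.splitlines() if line.strip()]
    host, address = lines[0], lines[1]
    varbinds = []
    for line in lines[2:]:
        oid, _, value = line.partition(" ")
        varbinds.append((oid, value.strip()))
    return host, address, varbinds


def watched_trap(varbinds):
    """Return the name of a watched trap, or None for anything else."""
    for oid, value in varbinds:
        if "snmpTrapOID" in oid or oid.lstrip(".").startswith(SNMP_TRAP_OID):
            for name, numeric in WATCHED.items():
                # The value may be symbolic or numeric depending on output options.
                if value.endswith(name) or value.endswith(numeric):
                    return name
    return None


def build_mail(host, address, trap, to_addr="root@localhost"):
    """Compose the notification; both addresses are placeholders."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["From"] = "snmptrapd@localhost"
    msg["Subject"] = f"Cisco 675 trap: {trap} from {host}"
    msg.set_content(f"{trap} received from {host} ({address})\n")
    return msg


def main(stream):
    """Read one trap from stream and mail it if it is one we watch."""
    host, address, varbinds = parse_trap(stream.read())
    trap = watched_trap(varbinds)
    if trap is not None:
        with smtplib.SMTP("localhost") as smtp:  # assumes a local mail server
            smtp.send_message(build_mail(host, address, trap))
```

To install it as a handler, add a call to main(sys.stdin) at the bottom of the file (with import sys) and point a traphandle line in snmptrapd.conf at the script.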
The cisco 675 CBOS configuration for 2.3.5 and above goes like this:
cbos#set snmp on
SNMP enabled
cbos#set snmp manager
SET SNMP MANAGER takes 5 arguments
IP Address, Community, [read] [write] [both], enable/disable, all/critical
i.e. set snmp manager 10.0.0.2 public read on all
The above means that 10.0.0.2 is the IP address of the SNMP Manager who will use the community string public and has permission to read and also receives all types(both critical and informational) of SNMP trap messages
cbos#set snmp manager 184.108.40.206 foobar both enable all
Added SNMP Manager
Here is the MRTG configuration we use with CBOS versions 2.4.1 and 2.4.3:
########
#
# 640 kbits/s = 655360 bits/s = 81920 Bytes/s
#   * .87 = 71270 Bytes/s = 70 kB/second
#
# 272 kbits/second = 278528 bits/s = 34816 Bytes/s
#   * .87 = 30290 Bytes/s = 29.58 kB/second
#
# 512 kbits/s = 524288 bits/s = 65536 Bytes/s
#   * .87 = 57016 Bytes/s = 56 kB/second
#
# 218 kbits/s = 222822 bits/s = 27853 Bytes/s
#   * .87 = 24232 Bytes/s = 23.67 kB/s
#
Target[cisco]: `/bin/cat /usr/local/etc/.ciscoby`
AbsMax[cisco]: 81920
MaxBytes1[cisco]: 71270
MaxBytes2[cisco]: 30290
Title[cisco]: Cisco 675 ((WAN0-0 / WAN Port)
PageTop[cisco]: <H1>Traffic Analysis for Cisco 675 RADSL Modem</H1>
 <TABLE>
 <TR><TD>System:</TD><TD>Cisco 675 RADSL Modem</TD></TR>
 <TR><TD>Interface:</TD><TD>ATM Bridge (WAN0-0 / WAN)</TD></TR>
 <TR><TD>IP:</TD><TD>None (bridge)</TD></TR>
 <TR><TD>Max Speed:</TD><TD>70/30 KBytes/s</TD></TR>
 </TABLE>

# Cisco Signal/Noise measurement
#
Target[cisco.sn]: `/bin/cat /usr/local/etc/.ciscosn`
MaxBytes[cisco.sn]: 45
Title[cisco.sn]: Cisco S/N
PageTop[cisco.sn]: <H1>Cisco S/N for DSL network</H1>
YLegend[cisco.sn]: db
ShortLegend[cisco.sn]: db
LegendO[cisco.sn]: S/N:
Unscaled[cisco.sn]: dwmy
Options[cisco.sn]: nopercent, gauge

# Cisco CRC and RS errors
#
Target[cisco.errs]: `/bin/cat /usr/local/etc/.ciscoerrs`
MaxBytes[cisco.errs]: 500
Title[cisco.errs]: Cisco RS and CRC errors
PageTop[cisco.errs]: <H1>Cisco RS and CRC Errors for DSL network</H1>
YLegend[cisco.errs]: errs/hr
ShortLegend[cisco.errs]: errs/hr
LegendI[cisco.errs]: RS :
LegendO[cisco.errs]: CRC:
ThreshMaxO[cisco.errs]: 400
ThreshProgO[cisco.errs]: /usr/local/etc/domail
Options[cisco.errs]: nopercent, perhour

# Cisco RS errors
#
Target[cisco.rs]: `/bin/cat /usr/local/etc/.ciscors`
MaxBytes[cisco.rs]: 100
Title[cisco.rs]: Cisco Reed Solomon errors
PageTop[cisco.rs]: <H1>Cisco Reed Solomon Errors for DSL network</H1>
YLegend[cisco.rs]: errs/sec
ShortLegend[cisco.rs]: errs/sec
LegendI[cisco.rs]: Uncorrected RS:
LegendO[cisco.rs]: Corrected RS:

Here /usr/local/etc/.ciscoby is created by the script cisprep. This script runs cisco.ex on the host that speaks directly to the Cisco 675. The output is piped back to cisprep, parsed into the various data we want to plot, and left in a handful of files. cisprep runs just before mrtg, both from cron. For example:
*/5 * * * * if test -x /etc/mrtg.conf; then /usr/local/etc/cisprep > /dev/null 2>&1 && /usr/pkg/bin/mrtg /etc/mrtg.conf > /var/log/mrtg.log 2>&1; fi
(I often use the execute bit on a configuration file to indicate whether I want that service to run).
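For a backtick target such as the /bin/cat commands above, MRTG expects four lines of output: the first value, the second value, an uptime string, and the target name. A preparation script therefore only has to leave a file in that shape. The sketch below shows one way to do it in Python; it is an illustration, not the cisprep script itself, and the function names are made up for the example.

```python
import os
import tempfile


def mrtg_record(first, second, uptime="", name=""):
    """Format the four lines MRTG reads from an external target."""
    return f"{int(first)}\n{int(second)}\n{uptime}\n{name}\n"


def write_atomically(path, text):
    """Write text to a temporary file in the same directory, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        os.chmod(tmp, 0o644)   # mkstemp creates the file readable by its owner only
        os.replace(tmp, path)  # the rename is atomic on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A call such as write_atomically("/usr/local/etc/.ciscoby", mrtg_record(in_bytes, out_bytes, uptime, "cisco")) would then produce the file the first target reads. Renaming into place means a reader never sees a half-written file, which matters mostly if the two jobs are ever run independently.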
Previously with CBOS 2.3.5:
Here is the MRTG configuration we used with 2.3.5. It turns out that much of the SNMP measurement facility is broken in the current releases of CBOS for the 675. What does seem to work is SNMP measurement of outgoing byte and packet counts. I track outbound traffic on wan0 (SNMP ifIndex 2), and outbound traffic on eth0 (SNMP ifIndex 1) to simulate inbound traffic on wan0.
########
#
# 87% of theoretical is real max
#
# 640 kbits/s = 655360 bits/s = 81920 Bytes/s
#   * .87 = 71270 Bytes/s = 70 kB/second
#
# 272 kbits/second = 278528 bits/s = 34816 Bytes/s
#   * .87 = 30290 Bytes/s = 29.58 kB/second
#
# 512 kbits/s = 524288 bits/s = 65536 Bytes/s
#   * .87 = 57016 Bytes/s = 56 kB/second
#
# 218 kbits/s = 222822 bits/s = 27853 Bytes/s
#   * .87 = 24232 Bytes/s = 23.67 kB/s
#
#
Target[cisco]: ifOutOctets.1&ifOutOctets.2:public@mycisco675
AbsMax[cisco]: 81920
MaxBytes1[cisco]: 71270
MaxBytes2[cisco]: 30290
Title[cisco]: Cisco 675 ((WAN0-0 / WAN Port)
PageTop[cisco]: <H1>Traffic Analysis for Cisco 675 RADSL Modem</H1>
 <TABLE>
 <TR><TD>System:</TD><TD>Cisco 675 RADSL Modem</TD></TR>
 <TR><TD>Interface:</TD><TD>ATM Bridge (WAN0-0 / WAN)</TD></TR>
 <TR><TD>IP:</TD><TD>None (bridge)</TD></TR>
 <TR><TD>Max Speed:</TD><TD>70/30 KBytes/s</TD></TR>
 </TABLE>
WithPeak[cisco]: wm
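The arithmetic in the comment block above is easy to redo for a different line rate. A small Python sketch, assuming (as the comments do) that 1 kbit is 1024 bits and that 87% of the theoretical rate is the realistic maximum; the function name is made up for the example:

```python
def usable_bytes_per_second(kbits_per_second, efficiency=0.87):
    """Line rate in kbit/s -> realistic maximum in bytes/s."""
    bits = kbits_per_second * 1024   # the comments use 1 kbit = 1024 bits
    return round(bits / 8 * efficiency)


for rate in (640, 272, 512):
    print(rate, "kbit/s ->", usable_bytes_per_second(rate), "bytes/s")
```

For 640 and 272 kbit/s this gives 71270 and 30290, the values used for MaxBytes1 and MaxBytes2.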
I have also arranged to collect information periodically to track error counts and S/N. This is done via an expect script and some Perl scripts (ciserr2mrtg, cissn2mrtg) that pull the data from the expect output and translate it into records that MRTG can use. More MRTG configurations:
Target[cisco.errs]: `/usr/local/etc/ciserr2mrtg`
MaxBytes[cisco.errs]: 1000
Title[cisco.errs]: Cisco errors
PageTop[cisco.errs]: <H1>Cisco Errors for DSL network</H1>
YLegend[cisco.errs]: errs/hr
ShortLegend[cisco.errs]: errs/hr
LegendI[cisco.errs]: RS:
LegendO[cisco.errs]: CRC:
Options[cisco.errs]: nopercent, perhour
#Unscaled[cisco.errs]: dwm

Target[cisco.sn]: `/usr/local/etc/cissn2mrtg`
MaxBytes[cisco.sn]: 50
Title[cisco.sn]: Cisco S/N
PageTop[cisco.sn]: <H1>Cisco S/N for DSL network</H1>
YLegend[cisco.sn]: db
ShortLegend[cisco.sn]: db
LegendI[cisco.sn]: S/N:
LegendO[cisco.sn]: S/N:
Options[cisco.sn]: nopercent, gauge
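The two targets are treated differently. With gauge, MRTG plots the S/N value as reported; without it, the error counts are treated as ever-increasing counters and, with perhour, shown as a per-hour rate. The conversion is roughly the following. This is a sketch in Python, not MRTG's own code, and it assumes the modem's counters restart from zero after a retrain or reboot, so a decrease is treated as a reset rather than a wrap.

```python
def errors_per_hour(previous, current, elapsed_seconds):
    """Rate of change between two counter samples, in errors per hour.

    Returns None when the counter went backwards, which this sketch
    assumes means the modem reset its counters.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    if current < previous:
        return None
    return (current - previous) * 3600 / elapsed_seconds


# 50 new errors over a five-minute polling interval
print(errors_per_hour(100, 150, 300))
```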
cisstat is also a script I use for immediate inspection of the modem:
% cisstat
Name    Rate U/D  Power U/D  S/N (db)
wan0    272/640   0.7/6.7    38

Name    Ipkts     CRCerrs    RSerrs      Opkts     Oerrs
eth0    184536    0          0           275360    0
wan0-0  93300     554 1%     81/11966    86288     31 0%
Cisco 675, Qwest and Line Noise: My Story, April 2001
After Qwest updated the DSLAMs in our area, and our ISP troubleshot all of their connections, I could see that we were still getting significant error rates (CRC and RS errors). As I was able to determine later, at the worst of times this was in the range of 500 uncorrected RS errors per second without any significant TCP/IP traffic.
The progression of events was
Elapsed time of events: 2-3 hours between retrains
I poked around in the Cisco configuration, and I also noted from a couple of websites that you can control the local and remote transmission power, but initially this didn't change things.
I ended up calling Qwest for troubleshooting. They had me reset the NVRAM and set the modem back to the default configuration. That didn't change anything. I was eventually assigned a line test ticket. When I got home the next afternoon, a tech called me. With the phones in the house disconnected, he checked the line quality: he came back with an apparent loop length of 4800 ft (as the crow flies, we're about 2500 ft from our CO, so I guess this is about right), and reported no errors and a pretty clean line.
After speaking to him and reconnecting the rest of the house, the connection statistics changed dramatically. I watched it overnight and the connection is still in great shape. Errors are on the order of a few an hour, and although the S/N has dropped to 28 db or so, it's solid at +/- 1 db, not like the variation of 10 db I was seeing before. The next day I called back and waited 90 minutes to talk to a tech about this. When I got through, she said it may have been a "locked port at the CO," maybe caused by lightning or some other event (it's been some time since we've had lightning). She said sending the test signal down the line probably "unlocked the port."
For illustrative purposes, I've captured the images from MRTG to allow you to see before and after: (14hrs is about the time I was playing with the modem, and then got a call from the tech. Time increases right to left).
First: packet transfer (blue, 100% = 0% loss) and RTT to ISP router (green, ms)
Second: S/N (blue). In this graph you can see the difference between the highly
variable S/N ratio and the very constant ratio after the change. You can also
see where I started at a S/N of 24 db and was getting some non-zero rate of
corrected RS errors (see below); when I adjusted the txpower to #4 I jumped
to a S/N of 28 db and lost almost all errors, except on a sporadic basis.
Third: CRC (blue) and bulk RS (green) errors per hour. The RS errors appear
in blocks in the early half of the graph because they rapidly outstrip the
range of this graph.
Fourth: corrected RS (blue) and uncorrected RS (green) errors per second.
In the early part of the graph (right hand side) you can see the increasing
rate of both corrected and uncorrected errors until they are chopped off:
that's the point at which the modem would retrain.
From our base configuration I changed two things:
Update: 5/6/2001
Oddly enough, I also learned that getting the best S/N does not result in the lowest number of uncorrected Reed Solomon errors. When set to txpower as above, my S/N went up, but my uncorrected Reed Solomon errors also went up significantly. I couldn't keep my modem trained for more than half an hour at a time. I am currently using txpower 1 (default), and although I get a substantial number of corrected RS errors (40/second), I get very few uncorrected RS errors, and the modem stays trained for tens of hours at a time. CRC errors, while I do get them on occasion, are not due to a storm of uncorrected RS errors, but are more likely related to distant traffic (nothing I can do about that), and they come in ones and twos.