The OpenNMS server requires Postgres, Sun's J2SDK 1.4, and Apache Tomcat to run. The front-end and back-end operate independently, so the back-end (monitoring and database updating) can be restarted without interrupting the front-end (HTML user interface and database querying). Conversely, the front-end can be disabled without stopping the back-end from monitoring. The latter is particularly useful when performing OpenNMS package upgrades or making configuration changes. The Postgres database must be operational for either the front-end user interface or the back-end monitoring to function. For more resources on the inner workings of OpenNMS, see the Further Information section near the end of this document.
OpenNMS must be configured at this point. If you start the server without
a proper configuration, ERROR and WARN messages will show up in your
log files. Configuration includes adding your IP addresses to the appropriate
places in capsd-configuration.xml, discovery-configuration.xml, and snmp-config.xml.
Make appropriate modifications to your poller-configuration.xml to
disable polling of certain services (for example, telnet on Cisco devices). For quick
node name resolution (or if you do not want DNS delays to affect
processing or DNS queries to generate network traffic) modify your
/etc/hosts with appropriate device names. SNMP names can also be
used if your devices are configured properly.
The Postgres server must be running prior to starting the back-end (OpenNMS pollers) or the front-end (web interface). The typical startup order is:
bash#/etc/init.d/postgresql start (Postgres)
bash#/etc/init.d/opennms start (OpenNMS)
bash#/etc/init.d/tomcat4 start (Tomcat User Interface)
Shutdown is normally the reverse process, using the stop
directive rather than start. It is important that Postgres starts
first and is shut down last. This happens automatically
with the init scripts on a complete system startup or shutdown.
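The ordering above could be wrapped in a small helper so that the reverse order on shutdown is not left to memory. This is only a sketch, assuming the three init scripts named in this section exist on the system; the function name onms_stack and the DRY_RUN variable are hypothetical and not part of OpenNMS.

```shell
#!/bin/sh
# Sketch only: start or stop the three services in the order described above.
# onms_stack and DRY_RUN are illustrative names, not part of OpenNMS.

# Start order: Postgres first, then OpenNMS, then Tomcat.
ONMS_SERVICES="postgresql opennms tomcat4"

# The function body is a subshell, so "exit" only leaves the function.
onms_stack() (
    action=$1
    case "$action" in
        start) order=$ONMS_SERVICES ;;
        stop)
            # Reverse the list so that Postgres is shut down last.
            order=""
            for svc in $ONMS_SERVICES; do
                order="$svc $order"
            done
            ;;
        *) echo "usage: onms_stack start|stop" >&2; exit 2 ;;
    esac
    for svc in $order; do
        if [ -n "${DRY_RUN:-}" ]; then
            # Print the command instead of running it.
            echo "/etc/init.d/$svc $action"
        else
            "/etc/init.d/$svc" "$action" || exit 1
        fi
    done
)
```

Running it with DRY_RUN=1 first (for example, DRY_RUN=1 onms_stack stop) prints the commands without touching any service, which is a cheap way to confirm the order before using it as root.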
The following command would add 192.168.0.1 as a new monitored host and initiate discovery of default services.
/usr/share/opennms/bin/send-event.pl uei.opennms.org/internal/discovery/newSuspect -i 192.168.0.1
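When several hosts need to be added, the same command can be repeated for each address. The sketch below assumes a plain list of addresses, one per line, on standard input; the function name add_suspects and the SEND_EVENT variable are hypothetical, while the script path, the UEI, and the -i flag are the ones used in the command above.

```shell
#!/bin/sh
# Sketch only: send one newSuspect event per address read from standard input.
# add_suspects and SEND_EVENT are illustrative names, not part of OpenNMS.

SEND_EVENT=${SEND_EVENT:-/usr/share/opennms/bin/send-event.pl}
NEW_SUSPECT_UEI="uei.opennms.org/internal/discovery/newSuspect"

add_suspects() {
    while read -r ip; do
        # Skip blank lines and comment lines in the address list.
        case "$ip" in
            ''|'#'*) continue ;;
        esac
        "$SEND_EVENT" "$NEW_SUSPECT_UEI" -i "$ip"
    done
}
```

A possible invocation would be add_suspects < /root/new-hosts.txt, where the file name is only an example.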
bash#cp /etc/opennms/map.disable /etc/opennms/map.enable
bash#/etc/init.d/opennms restart
This section covers very basic operations on the ONMS Postgres database.
If it is suspected that the database has gained entropy, a full vacuum of the database (as in the commands below) is a good idea.
bash~#/etc/init.d/tomcat4 stop
bash~#/etc/init.d/opennms stop
bash~#su - postgres
bash~$/usr/lib/postgresql/bin/vacuumdb -v -f -d opennms
bash~$exit
bash~#/etc/init.d/opennms start
bash~#/etc/init.d/tomcat4 start
This process can be used to 'reset' the server's database while
preserving the OpenNMS configuration data. A system administrator who suspects
a problem or corruption in the database can use this process
to clean out the entire Postgres database and restart it using the
current configuration; that is, current device nodes will be preserved,
but events for those nodes will be purged. The RRD data will
be preserved unless manually removed from /usr/share/opennms/share/.
bash~#su - postgres
bash~$dropdb opennms
bash~$exit
bash~#/usr/share/opennms/bin/install.pl -q /usr/share/opennms/etc/create.sql -l /usr/lib/postgresql/lib/opennms
Three backup files are created with this process. In our example
they will be 062303_onms_sql.tar.gz, 062303_onms_etc.tar.gz and 062303_onms_rrd.tar.gz.
These files should be created at the same time to ensure that the
RRD data, the SQL database and the configuration files match. It
is necessary to capture the configuration and the databases at the
same time in order to start an ONMS server with restored data.
(Not tested, though this may work while ONMS is running with only the web UI shut down.)
bash~#/etc/init.d/tomcat4 stop
bash~#/etc/init.d/opennms stop
bash~#su - postgres
bash~$/usr/lib/postgresql/bin/vacuumdb -v -f -d opennms
bash~$pg_dumpall > 062303_onms_sql
bash~$tar cvfz 062303_onms_sql.tar.gz 062303_onms_sql
bash~$scp 062303_onms_sql.tar.gz backupuser@backupserver:
bash~$rm 062303_onms_sql*
bash~$exit
bash~#tar cvfzP 062303_onms_etc.tar.gz /etc/opennms/* -R
bash~#scp 062303_onms_etc.tar.gz backupuser@backupserver:
bash~#rm 062303_onms_etc.tar.gz
bash~#tar cvfzP 062303_onms_rrd.tar.gz /usr/share/opennms/share/* -R
bash~#scp 062303_onms_rrd.tar.gz backupuser@backupserver:
bash~#rm 062303_onms_rrd.tar.gz
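The 062303 prefix in the example appears to be a date stamp (month, day, two-digit year). Deriving all three file names from a single stamp helps keep the set matched. The following is a sketch under that assumption; the function name onms_backup_names is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the three archive names for one backup run.
# onms_backup_names is an illustrative name, not an OpenNMS tool.

onms_backup_names() {
    # Use the stamp given as $1, or today's date as MMDDYY (e.g. 062303).
    stamp=${1:-$(date +%m%d%y)}
    for part in sql etc rrd; do
        echo "${stamp}_onms_${part}.tar.gz"
    done
}
```

Called without an argument, the function stamps the names with the current date; called as onms_backup_names 062303, it reproduces the names used in this section.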
bash~#scp backupuser@backupserver:062303* .
bash~#su - postgres
bash~$dropdb opennms
bash~$exit
bash~#/usr/share/opennms/bin/install.pl -q /usr/share/opennms/etc/create.sql -l /usr/lib/postgresql/lib/opennms
bash~#su - postgres
bash~$tar xvfz /root/062303_onms_sql.tar.gz
bash~$psql -f 062303_onms_sql opennms
There will be some ERROR messages regarding structures that already exist; these can be safely ignored.
bash~$exit
bash~#tar xvfzP 062303_onms_etc.tar.gz
bash~#rm /usr/share/opennms/share/* -rf
bash~#tar xvfzP 062303_onms_rrd.tar.gz
bash~#rm 062303*
Restart Tomcat4 and OpenNMS
bash#/etc/init.d/tomcat4 start
bash#/etc/init.d/opennms start
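Because the restore begins by dropping the database, it may be worth confirming that all three archives were actually copied back before running dropdb. The check below is a sketch only; the function name check_backup_set is hypothetical, and the file naming follows the 062303 example used in this section.

```shell
#!/bin/sh
# Sketch only: confirm that all three archives for one stamp are present and non-empty.
# check_backup_set is an illustrative name, not an OpenNMS tool.

# The function body is a subshell, so its variables do not leak into the caller.
check_backup_set() (
    dir=$1
    stamp=$2
    status=0
    for part in sql etc rrd; do
        f="$dir/${stamp}_onms_${part}.tar.gz"
        # -s is true only for a file that exists and has a size greater than zero.
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    # The function's exit status is the result of this final test.
    [ "$status" -eq 0 ]
)
```

For example, check_backup_set /root 062303 returns a non-zero status and names the missing file if any of the three archives is absent.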
With the configuration described above, and the sshd daemon installed
and running via the apt-get install ssh command, it is a good idea
to use TCP wrappers to ensure that only ssh clients can gain access
to the OpenNMS system without a local console. This second layer of security
ensures that a basic level of protection (assuming there is also a firewall
configured) is maintained at all times. This is particularly useful
when modifying and/or testing new firewall configurations on-the-fly
using a Gtk package such as fwbuilder. Below are the /etc/hosts.allow
and /etc/hosts.deny files that allow only ssh clients to connect to the
OpenNMS server. (This assumes the MTA is configured as above, with .forward files in the home
directory, and is likely running as a daemon; for exim this is determined
by /etc/init.d/exim, which scans the inetd.conf file at startup.)
/etc/hosts.allow:
SSHD:ALL
/etc/hosts.deny:
ALL:ALL
inetd.conf
A basic OpenNMS installation requires no services to be launched
via the inetd 'super-server'. If the OpenNMS server is being monitored
by itself or another OpenNMS server, it is a good idea to run services
like exim as daemons rather than from the inetd 'super-server' to avoid
service startup timeouts that can generate false outages. (The exim
mail transport service does this periodically when run from the inetd
'super-server'.) Additionally, internal services such as daytime and
discard are legacy services that are not required for present-day operation and
can be used by outsiders to gain knowledge of a specific system configuration.
After the basic installation outlined in this document, the inetd.conf
file should have all lines commented out. An example of the commented-out
inetd.conf is below:
# /etc/inetd.conf: see inetd(8) for further informations.
#
# Internet server configuration database
#
# Lines starting with "#:LABEL:" or "#<off>#" should not
# be changed unless you know what you are doing!
#
# If you want to disable an entry so it isn't touched during
# package updates just comment it out with a single '#' character.
#
# Packages should modify this file by using update-inetd(8)
#
# <service_name> <sock_type> <proto> <flags> <user> <server_path> <args>
#
#:INTERNAL: Internal services
#echo stream tcp nowait root internal
#echo dgram udp wait root internal
#chargen stream tcp nowait root internal
#chargen dgram udp wait root internal
#discard stream tcp nowait root internal
#discard dgram udp wait root internal
#daytime stream tcp nowait root internal
#daytime dgram udp wait root internal
#time stream tcp nowait root internal
#time dgram udp wait root internal
#:STANDARD: These are standard services.
#:BSD: Shell, login, exec and talk are BSD protocols.
#:MAIL: Mail, news and uucp services.
#smtp stream tcp nowait mail /usr/sbin/exim exim -bs
#:INFO: Info services
#:BOOT: Tftp service is provided primarily for booting. Most sites
# run this only on machines acting as "boot servers."
#:RPC: RPC based services
#:HAM-RADIO: amateur-radio services
#:OTHER: Other services
#<off># netbios-ns dgram udp wait root /usr/sbin/tcpd /usr/sbin/nmbd -a
#<off># netbios-ssn stream tcp nowait root /usr/sbin/tcpd /usr/sbin/smbd
Once the change is complete, the inetd daemon and exim mail transport
service should be restarted using the commands below (otherwise
SMTP will be down). Exim will automatically detect the change to
inetd.conf and launch itself as a daemon on future reboots.
bash~#/etc/init.d/inetd restart
bash~#/etc/init.d/exim restart
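One way to confirm that every entry in inetd.conf really is commented out is to list the lines that are neither comments nor blank. This is a sketch only; the function name active_inetd_entries is hypothetical.

```shell
#!/bin/sh
# Sketch only: list inetd.conf lines that are neither comments nor blank.
# active_inetd_entries is an illustrative name.

active_inetd_entries() {
    conf=${1:-/etc/inetd.conf}
    # Drop comment lines (including "#<off>#" entries) and blank lines;
    # whatever remains is a service that inetd would still launch.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$conf"
}
```

With a fully commented-out file the function prints nothing and returns a non-zero status, because grep reports that no lines were selected.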
This would be -> New Firewall in fwbuilder, turn on/off some defaults with basic explanation: VPN, SNMP, FTP/SSH/WEB, MAIL/DNS, NFS/SMB, SQL, ICMP... use the GUI.
The ONMS system logs all activity to /var/log/opennms. Messages
within the log files contain either FATAL, ERROR, WARN, INFO or DEBUG
tags. The size and verbosity of the log files can be controlled by
modifying /etc/opennms/log4j.properties. By default, each file is
set to INFO level (not as verbose as DEBUG) and will grow to 100MB.
Each log file will "spawn" four times (i.e. 400MB total).
All of these settings can be changed.
To view error messages that contain the ERROR tag, use the following
command:
bash#grep "ERROR" /var/log/opennms/* | more
Replacing ERROR with another tag such as FATAL or WARN will display
all log messages with that particular tag. Logs should contain no
FATAL messages, minimal ERROR messages (ideally none), and possibly
many WARN and INFO messages, which should at least be fully understood.
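To see at a glance how the logs compare with that expectation, the grep above can be repeated for each tag and the matches counted. The following is a sketch only; the function name count_by_level is hypothetical, and it counts lines containing the tag anywhere, just as the grep command above does.

```shell
#!/bin/sh
# Sketch only: count log lines per severity tag across the OpenNMS log directory.
# count_by_level is an illustrative name.

# The function body is a subshell, so its variables do not leak into the caller.
count_by_level() (
    dir=${1:-/var/log/opennms}
    for level in FATAL ERROR WARN INFO DEBUG; do
        # cat joins the files so that grep -c reports a single total.
        # grep exits non-zero when the count is 0, hence the "|| true".
        n=$(cat "$dir"/* 2>/dev/null | grep -c "$level" || true)
        echo "$level $n"
    done
)
```

A healthy server would be expected to show FATAL 0 and a small ERROR count in the output.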
A command like watch -d "ls -al" allows one to monitor
exactly which log files are changing while OpenNMS is restarting.
Tarus describes the process of using these commands together as:
Watch the output of the "watch" command. The log files should steadily grow. First eventd.log, then capsd and collectd (usually the largest), followed by poller and finally threshd. After threshd.log has some content, you should see rtc.log and then rtcdata.log populate. When rtcdata.log has data, "Calculating" should be gone. If it stops before then, do this in the logs directory:
grep FATAL *
grep ERROR *
and look for anything suspicious. -T
Related log messages will often share a common 'Pool-fiber' name, as
shown below. Time and date can be used to correlate messages between
different log files. Always try to determine the root cause, or initial
error message, as the final error message is often preceded by more
meaningful log messages. Related log messages are often separated
by other unimportant messages, so creative use of grep and | is often
required. Remember that many different situations can result
in the same message. For help on troubleshooting messages, please
see the Log Troubleshooting section.
A device has other private interfaces that are not reachable by ONMS (but automatically detected and added to the interface table)
INFO [Capsd Rescan Pool-fiber1] FtpPlugin: FtpPlugin: Unable to test host 10.8.9.2, no route available...
WARN [Capsd Rescan Pool-fiber1] IfCollector: IfCollector: No route to host 10.8.9.2, continuing protocol scans.
INFO [Capsd Rescan Pool-fiber1] TcpPlugin: TcpPlugin: Could not connect to host 10.8.9.2, no route to host
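Since related messages carry the same bracketed name, one way to pull them together is to search every log file for that exact string. This is a sketch only; the function name messages_for_thread is hypothetical.

```shell
#!/bin/sh
# Sketch only: show every message logged under one bracketed name, across all log files.
# messages_for_thread is an illustrative name.

# The function body is a subshell, so its variables do not leak into the caller.
messages_for_thread() (
    thread=$1
    dir=${2:-/var/log/opennms}
    # -F treats the pattern as a fixed string, so "[" and "]" are matched
    # literally rather than being read as a regular-expression character class.
    grep -F "[$thread]" "$dir"/*
)
```

For the example above, messages_for_thread "Capsd Rescan Pool-fiber1" would list all three lines. Including the closing bracket in the search string keeps Pool-fiber1 from also matching Pool-fiber10.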
Log files can grow very quickly, making it difficult to find valuable
information within them. To minimize the effort required to troubleshoot
problems, it is a good idea to back up the current logs and purge
them before performing an OpenNMS restart. DO NOT DELETE /var/log/*
as this will disable the Linux server. The procedure below outlines
a typical restart scenario where the log files are backed up to the
directory /root.
bash#/etc/init.d/opennms stop
bash#tar cvfz /root/061703_onms_logs.tar.gz /var/log/opennms/*
bash#rm -rf /var/log/opennms/*
bash#/etc/init.d/opennms start
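The archive and purge steps of the procedure above could be combined so that the logs are removed only if the archive was written successfully. The following is a sketch only, assuming OpenNMS has already been stopped; the function name archive_logs is hypothetical, and unlike the tar command above it stores paths relative to the log directory.

```shell
#!/bin/sh
# Sketch only: archive a log directory under a date-stamped name, then empty it.
# archive_logs is an illustrative name; stop OpenNMS before calling it.

# The function body is a subshell, so "exit" only leaves the function.
archive_logs() (
    logdir=${1:-/var/log/opennms}
    dest=${2:-/root}
    stamp=$(date +%m%d%y)
    archive="$dest/${stamp}_onms_logs.tar.gz"
    # Refuse to run on anything but an existing directory.
    [ -d "$logdir" ] || { echo "no such directory: $logdir" >&2; exit 1; }
    # Purge the logs only if the archive was written successfully.
    tar czf "$archive" -C "$logdir" . || exit 1
    # ${logdir:?} aborts if the variable is empty, so this can never expand to "/*".
    rm -rf "${logdir:?}"/*
    echo "$archive"
)
```

The function prints the path of the archive it created, so the result can be copied elsewhere or inspected with tar tzf.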