
5. OpenNMS Administration

The OpenNMS server requires Postgres, Sun's J2SDK 1.4, and Apache Tomcat to run. The front-end and back-end operate independently, so the back-end (monitoring and database updating) can be restarted without interrupting the front-end (HTML user interface and database querying). Conversely, the front-end can be disabled without stopping the back-end from monitoring. The latter is particularly useful when performing OpenNMS package upgrades or making configuration changes. The Postgres database must be operational for either the front-end user interface or the back-end monitoring to function. For more resources on the inner workings of OpenNMS, see the Further Information section near the end of this document.

5.1 Server Startup and Shutdown

OpenNMS must be configured at this point. If you start the server without a proper configuration, ERROR and WARN messages will show up in your log files. Configuration includes adding your IP addresses to the appropriate places in capsd-configuration.xml, discovery-configuration.xml and snmp-config.xml. Make appropriate modifications to your poller-configuration.xml to disable polling of certain services (for example, telnet on Cisco devices). For quick node name resolution (or if you do not want DNS delays to affect processing or DNS queries to generate network traffic), add the appropriate device names to your /etc/hosts. SNMP names can also be used if your devices are configured properly.

The Postgres server must be running prior to starting the back-end (OpenNMS pollers) or the front-end (web interface). The typical startup order is:

  1. bash#/etc/init.d/postgresql start (Postgres)
  2. bash#/etc/init.d/opennms start (OpenNMS)
  3. bash#/etc/init.d/tomcat4 start (Tomcat User Interface)

Shutdown is normally the reverse process, using the stop directive rather than start. It is important that Postgres is started first and shut down last. This happens automatically with the init scripts on a complete system startup or shutdown.
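
The ordering rule can be captured in a small wrapper. The following is only a sketch: the function names and the DRY_RUN variable are made up for illustration, and the init script names are the ones listed above. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: start or stop the three services in the documented order.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Startup order from the list above; shutdown uses the reverse.
START_ORDER="postgresql opennms tomcat4"

reverse_words() {
    # Print the arguments in reverse order on one line.
    out=""
    for w in "$@"; do
        out="$w${out:+ $out}"
    done
    echo "$out"
}

onms_services() {
    # $1 is "start" or "stop".
    case "$1" in
        start) order="$START_ORDER" ;;
        stop)  order="$(reverse_words $START_ORDER)" ;;
        *)     echo "usage: onms_services start|stop" >&2; return 1 ;;
    esac
    for svc in $order; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "/etc/init.d/$svc $1"
        else
            # Give up at the first failure so nothing starts without Postgres.
            "/etc/init.d/$svc" "$1" || return 1
        fi
    done
}

# Show the shutdown sequence (Tomcat first, Postgres last).
onms_services stop
```

Setting DRY_RUN=0 before calling onms_services would run the init scripts for real, which requires root.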

5.2 Adding a new Monitored Host

The following command would add 192.168.0.1 as a new monitored host and initiate discovery of default services.

/usr/share/opennms/bin/send-event.pl uei.opennms.org/internal/discovery/newSuspect -i 192.168.0.1
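
To add several hosts at once, the same command can be wrapped in a loop. This is a minimal sketch: add_suspects and DRY_RUN are illustrative names, the addresses are placeholders, and the send-event.pl path and arguments are the ones shown above. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: send a newSuspect event for each address in a list.
# DRY_RUN=1 (the default here) prints the commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"
SEND_EVENT=/usr/share/opennms/bin/send-event.pl
UEI=uei.opennms.org/internal/discovery/newSuspect

add_suspects() {
    # Each argument is one IP address to hand to discovery.
    for ip in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "$SEND_EVENT $UEI -i $ip"
        else
            "$SEND_EVENT" "$UEI" -i "$ip" || echo "failed: $ip" >&2
        fi
    done
}

# Placeholder addresses, not taken from a real network.
add_suspects 192.168.0.1 192.168.0.2
```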

5.3 Enabling Maps

  1. Enable the feature in opennms
    bash#cp /etc/opennms/map.disable /etc/opennms/map.enable
    
     
    
  2. Restart OpenNMS
    bash#/etc/init.d/opennms restart
    
     
    

5.4 Database Maintenance

This section covers very basic operations on the ONMS Postgres database.

How to Perform a database Vacuum

A vacuum is a good idea if the database is suspected of having accumulated dead rows or fragmentation over time.

  1. Stop OpenNMS and Tomcat (Though it could probably work while ONMS is running).
    bash~#/etc/init.d/tomcat4 stop
    bash~#/etc/init.d/opennms stop
    
     
    
  2. From the root user, become the Postgres user.
    bash~#su - postgres
    
     
    
  3. Perform a vacuum on the OpenNMS database.
    bash~$/usr/lib/postgresql/bin/vacuumdb -v -f -d opennms 
    
     
    
  4. Restart OpenNMS and Tomcat.
    bash~#/etc/init.d/opennms start
    bash~#/etc/init.d/tomcat4 start
    
     
    


How to Purge and Recreate the OpenNMS Database

This process can be used to 'reset' the server's database while preserving the OpenNMS configuration data. A system administrator who suspects a problem or corruption in the database can use this process to clean out the entire Postgres database and restart it using the current configuration; that is, the current device nodes will be preserved, but the events for those nodes will be purged. The RRD data will be preserved unless manually removed from /usr/share/opennms/share/.

  1. Become the Postgres user from the root user.
    bash~#su - postgres
    
     
    
  2. Purge the OpenNMS database.
    bash~$dropdb opennms
    
     
    
  3. Become the root user again
    bash~$exit
    
     
    
  4. Recreate the OpenNMS database (as root user).
    bash~#/usr/share/opennms/bin/install.pl -q /usr/share/opennms/etc/create.sql -l /usr/lib/postgresql/lib/opennms 
    
     
    

5.5 Backing Up and Restoring OpenNMS Configuration

Three backup files are created with this process. In our example they will be 062303_onms_sql.tar.gz, 062303_onms_etc.tar.gz and 062303_onms_rrd.tar.gz. These files should be created at the same time so that the RRD data, the SQL database and the configuration files match; an ONMS server can only be started from restored data if the configuration and the databases were captured together.

Stop OpenNMS and Tomcat

(Not tested, though this may work while ONMS is running with only the web UI shut down.)

bash~#/etc/init.d/tomcat4 stop
bash~#/etc/init.d/opennms stop

Dump and Backup the OpenNMS SQL database

  1. Shutdown Tomcat4
    bash~#/etc/init.d/tomcat4 stop
    
     
    
  2. Shutdown OpenNMS
    bash~#/etc/init.d/opennms stop
    
     
    
  3. Become the Postgres user
    bash~#su - postgres
    
     
    
  4. Perform a vacuum on the OpenNMS database.
    bash~$/usr/lib/postgresql/bin/vacuumdb -v -f -d opennms 
    
     
    
  5. Dump all databases (This is OK for most users where OpenNMS is the only Postgres database)
    bash~$pg_dumpall > 062303_onms_sql
    
     
    
  6. Tar the dump
    bash~$tar cvfz 062303_onms_sql.tar.gz 062303_onms_sql
    
     
    
  7. Optionally move the tarball somewhere. A simple backup script could perform this task.
    bash~$scp 062303_onms_sql.tar.gz backupuser@backupserver:
    
     
    
  8. Optionally remove the dump files once they have been transferred
    bash~$rm 062303_onms_sql* 
    
     
    
  9. Logout of the postgres user
    bash~$exit
    
     
    

Backup the OpenNMS Configuration files

  1. Tar up the configuration.
    bash~#tar cvfzP 062303_onms_etc.tar.gz /etc/opennms/* -R
    
     
    
  2. Optionally move it somewhere.
    bash~#scp 062303_onms_etc.tar.gz backupuser@backupserver:
    
     
    
  3. Optionally remove the tarball once it has been transferred.
    bash~#rm 062303_onms_etc.tar.gz
    
     
    

Backup the OpenNMS RRD database files and reports

  1. Tar up the RRD database files and reports.
    bash~#tar cvfzP 062303_onms_rrd.tar.gz /usr/share/opennms/share/* -R
    
     
    
  2. Optionally move it somewhere.
    bash~#scp 062303_onms_rrd.tar.gz backupuser@backupserver:
    
     
    
  3. Optionally remove the tarball once it has been transferred.
    bash~#rm 062303_onms_rrd.tar.gz
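
The three backups above lend themselves to a simple script. The following is a sketch under several assumptions: it is run as root with OpenNMS and Tomcat already stopped, the date prefix is taken to be MMDDYY (inferred from the 062303 example), and the names backup_name, run, backup_onms, STAMP and DRY_RUN are illustrative. It archives the directories themselves rather than their contents via a wildcard, which also picks up dotfiles; tar recurses into directories by default. As written it only prints the commands.

```shell
#!/bin/sh
# Sketch of a backup wrapper for the SQL dump, configuration and RRD data.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"
# Date prefix such as 062303; the MMDDYY format is an assumption.
STAMP="${STAMP:-$(date +%m%d%y)}"

backup_name() {
    # $1 is one of: sql, etc, rrd
    echo "${STAMP}_onms_$1.tar.gz"
}

run() {
    # Print the command in dry-run mode (quoting is lost in the printout),
    # otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_onms() {
    # SQL dump and tarball are created in the postgres user's home directory,
    # as in the manual steps.
    run su - postgres -c "pg_dumpall > ${STAMP}_onms_sql && tar cvfz $(backup_name sql) ${STAMP}_onms_sql"
    # Configuration and RRD data, stored with absolute names (P).
    run tar cvfzP "$(backup_name etc)" /etc/opennms
    run tar cvfzP "$(backup_name rrd)" /usr/share/opennms/share
}

backup_onms
```

Copying the tarballs to a backup server and removing the local copies could be added to backup_onms in the same way.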
    
     
    

Restore a Previous OpenNMS Backup

  1. Install Debian and OpenNMS if they are not already setup on the destination server.
  2. From the new OpenNMS server transfer the backup files as root
    bash~#scp backupuser@backupserver:062303* .
    
     
    
  3. Become the database superuser from root
    bash~#su - postgres
    
     
    
  4. Purge the current (perhaps empty) OpenNMS database.
    bash~$dropdb opennms
    
     
    
  5. Become the root user again
    bash~$exit
    
     
    
  6. Recreate the OpenNMS database (as root user).
    bash~#/usr/share/opennms/bin/install.pl -q /usr/share/opennms/etc/create.sql -l /usr/lib/postgresql/lib/opennms 
    
     
    
  7. Become the database superuser from root
    bash~#su - postgres
    
     
    
  8. Untar and restore the OpenNMS SQL database
    bash~$tar xvfz /root/062303_onms_sql.tar.gz
    bash~$psql -f 062303_onms_sql opennms
    
     
    

    There will be some ERROR messages regarding structures that already exist; these can be safely ignored.

  9. Become the root user again
    bash~$exit
    
     
    
  10. Untar the OpenNMS configuration
    bash~#tar xvfzP 062303_onms_etc.tar.gz
    
     
    

    Note about using tar: The 'P' option stores files under their absolute path names. If the backups were not created with it, the files can be untarred by appending '--directory /' so that they are extracted relative to the root directory.
  11. Delete the RRD database files and reports that currently exist. (left over if this ONMS server had other uses)
    bash~#rm /usr/share/opennms/share/* -rf
    
     
    
  12. Restore the RRD database and report file backup
    bash~#tar xvfzP 062303_onms_rrd.tar.gz
    
     
    
  13. Remove the local backup tarballs (optional)
    bash~#rm 062303*
    
     
    

Restart Tomcat4 and OpenNMS

  1. Restart tomcat4
    bash#/etc/init.d/tomcat4 start
    
     
    
  2. Restart OpenNMS
    bash#/etc/init.d/opennms start
    
     
    

5.6 Some Security for the OpenNMS server

Modifying TCP Wrappers

With the configuration described above, and the sshd daemon installed and running via the apt-get install ssh command, it is a good idea to use TCP wrappers to ensure that, apart from the local console, only ssh clients can gain access to the OpenNMS system. This secondary layer ensures that a basic level of security is maintained at all times (assuming there is also a firewall configured). This is particularly useful when modifying and/or testing new firewall configurations on-the-fly using a Gtk package such as fwbuilder. Below are the /etc/hosts.allow and /etc/hosts.deny files that allow only ssh clients to connect to the OpenNMS server. (This assumes the MTA is set up as above with .forward files in the home directories and is likely running as a daemon; for exim this is determined by /etc/init.d/exim, which scans the inetd.conf file at startup.)

/etc/hosts.allow

SSHD:ALL

/etc/hosts.deny

ALL:ALL

Modifying inetd.conf

A basic OpenNMS installation requires no services to be launched via the inetd 'super-server'. If the OpenNMS server is being monitored by itself or by another OpenNMS server, it is a good idea to run services like exim as daemons rather than from the inetd 'super-server' to avoid service startup timeouts that can generate false outages. (The exim mail transport service does this periodically when run from the inetd 'super-server'.) Additionally, internal services such as daytime and discard are legacy services that are not required for present-day operation and can be used by outsiders to gain knowledge of a specific system configuration. Following the basic installation outlined in this document, the inetd.conf should appear with all lines commented out. An example of the commented-out inetd.conf is below:

# /etc/inetd.conf:  see inetd(8) for further informations.
#
# Internet server configuration database
#
# Lines starting with "#:LABEL:" or "#<off>#" should not
# be changed unless you know what you are doing!
#
# If you want to disable an entry so it isn't touched during
# package updates just comment it out with a single '#' character.
#
# Packages should modify this file by using update-inetd(8)
#
# <service_name> <sock_type> <proto> <flags> <user> <server_path> <args>
#
#:INTERNAL: Internal services
#echo           stream  tcp     nowait  root    internal
#echo           dgram   udp     wait    root    internal
#chargen        stream  tcp     nowait  root    internal
#chargen        dgram   udp     wait    root    internal
#discard                stream  tcp     nowait  root    internal
#discard                dgram   udp     wait    root    internal
#daytime                stream  tcp     nowait  root    internal
#daytime        dgram   udp     wait    root    internal
#time           stream  tcp     nowait  root    internal
#time           dgram   udp     wait    root    internal
 
#:STANDARD: These are standard services.
 
#:BSD: Shell, login, exec and talk are BSD protocols.
 
#:MAIL: Mail, news and uucp services.
#smtp            stream  tcp     nowait  mail    /usr/sbin/exim exim -bs
 
#:INFO: Info services
 
#:BOOT: Tftp service is provided primarily for booting.  Most sites
# run this only on machines acting as "boot servers."
 
#:RPC: RPC based services
 
#:HAM-RADIO: amateur-radio services
 
#:OTHER: Other services
#<off># netbios-ns      dgram   udp     wait    root    /usr/sbin/tcpd  /usr/sbin/nmbd -a
#<off># netbios-ssn     stream  tcp     nowait  root    /usr/sbin/tcpd  /usr/sbin/smbd

Once the change is complete, the inetd daemon and exim mail transport service should be restarted using the commands below (otherwise SMTP will be down). Exim will automatically detect the change to inetd.conf and launch itself as a daemon on future reboots.

bash~#/etc/init.d/inetd restart
bash~#/etc/init.d/exim restart

Simple iptables firewall

This would be: New Firewall in fwbuilder, then turn some defaults on or off with a basic explanation (VPN, SNMP, FTP/SSH/WEB, MAIL/DNS, NFS/SMB, SQL, ICMP) using the GUI.

5.7 OpenNMS Log Files

The ONMS system logs all activity to /var/log/opennms. Messages within the log files contain either FATAL, ERROR, WARN, INFO or DEBUG tags. The size and verbosity of the log files can be controlled by modifying /etc/opennms/log4j.properties. By default, each file is set to INFO level (not as verbose as DEBUG) and will grow to 100MB. Each log file will "spawn" four times (i.e. 400MB total). All of these settings can be changed.


Viewing Error Messages

To view error messages that contain the ERROR tag, use the following command:

bash#grep "ERROR" /var/log/opennms/* | more

Replacing ERROR with another tag such as FATAL or WARN will display all log messages with that particular tag. Logs should contain no FATAL messages, minimal ERROR messages (ideally none), and possibly many WARN and INFO messages that should at least be fully understood.
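
For a quick overview before reading individual messages, the same grep can be run once per tag to get a count. This is a rough sketch; count_tags and LOG_DIR are illustrative names. Like the command above, it matches the tag anywhere in a line, so a message that merely mentions the word is counted too.

```shell
#!/bin/sh
# Sketch: count log lines per severity tag across a log directory.
LOG_DIR="${LOG_DIR:-/var/log/opennms}"

count_tags() {
    # $1 is the directory holding the log files.
    for tag in FATAL ERROR WARN INFO DEBUG; do
        # Concatenate all files so grep -c yields a single total.
        # grep -c exits non-zero on a zero count, hence the "|| true".
        n=$(cat "$1"/* 2>/dev/null | grep -c "$tag" || true)
        echo "$tag $n"
    done
}

count_tags "$LOG_DIR"
```

A non-zero FATAL count, or an ERROR count that grows between runs, is a cue to look at the individual messages with grep as shown above.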

A command like watch -d "ls -al" allows one to monitor exactly what log files are changing while OpenNMS is restarting. Tarus describes the process of using these commands together as:

Watch the output of the "watch" command. The log files should steadily grow. First eventd.log, then capsd and collectd (usually the largest), followed by poller and finally threshd. After threshd.log has some content, you should see rtc.log and then rtcdata.log populate. When rtcdata.log has data, "Calculating" should be gone. If it stops before then, do this in the logs directory: "grep FATAL *" and "grep ERROR *", and look for anything suspicious. -T

Related log messages will often share a common 'Pool-fiber' thread name, as shown below. Time and date can be used to correlate messages between different log files. Always try to determine the root cause, or initial error message, as the final error message is often preceded by more meaningful log messages. Related log messages are often separated by other unimportant messages, so creative use of grep and | is often required. Remember that many different situations can result in the same message. For help on troubleshooting messages please see the Log Troubleshooting section.

A log message example

capsd.log

A device has other private interfaces that are not reachable by ONMS (but automatically detected and added to the interface table)

INFO  [Capsd Rescan Pool-fiber1] FtpPlugin: FtpPlugin: Unable to test host 10.8.9.2, no route available...
WARN  [Capsd Rescan Pool-fiber1] IfCollector: IfCollector: No route to host 10.8.9.2, continuing protocol scans. 
INFO  [Capsd Rescan Pool-fiber1] TcpPlugin: TcpPlugin: Could not connect to host 10.8.9.2, no route to host 
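
One way to gather related messages is to search every log file for the bracketed thread name. The sketch below assumes the thread name appears between square brackets exactly as in the example above; by_thread and LOG_DIR are illustrative names.

```shell
#!/bin/sh
# Sketch: list every message logged by one thread, across all log files.
LOG_DIR="${LOG_DIR:-/var/log/opennms}"

by_thread() {
    # $1 is the thread name as it appears between [ ], $2 the log directory.
    # -F treats the pattern as a fixed string, so the brackets are literal.
    # The extra /dev/null forces grep to prefix each match with its file name.
    grep -F -- "[$1]" "$2"/* /dev/null
}

by_thread "Capsd Rescan Pool-fiber1" "$LOG_DIR" 2>/dev/null || echo "no matching messages"
```

Piping the result through a second grep for an address such as 10.8.9.2 narrows it to one device.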

Resetting the Log Files

Log files can grow very quickly, making it difficult to find valuable information within them. To minimize the effort required to troubleshoot problems, it is a good idea to back up the current logs and purge them before performing an OpenNMS restart. DO NOT DELETE /var/log/* as this will disable the Linux server. The procedure below outlines a typical restart scenario where the log files are backed up to the directory /root.

  1. Stop OpenNMS
    bash#/etc/init.d/opennms stop
    
     
    
  2. Tar up the current log files
    bash#tar cvfz /root/061703_onms_logs.tar.gz /var/log/opennms/*
    
     
    
  3. Reset the OpenNMS log directory
    bash#rm -rf /var/log/opennms/*
    
     
    
  4. Start OpenNMS
    bash#/etc/init.d/opennms start
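
The procedure above could be scripted roughly as follows. This is a sketch, not a tested tool: reset_logs, LOG_DIR and BACKUP_DIR are illustrative names, and the MMDDYY date prefix is inferred from the 061703 example. Unlike the manual steps, it archives the files relative to the log directory (tar -C), and it clears the logs only if the archive was written. Nothing happens unless the script is invoked with the argument "run".

```shell
#!/bin/sh
# Sketch: archive the OpenNMS logs, then clear the log directory.
LOG_DIR="${LOG_DIR:-/var/log/opennms}"
BACKUP_DIR="${BACKUP_DIR:-/root}"

reset_logs() {
    # $1 log directory, $2 backup directory, $3 date stamp (e.g. 061703)
    archive="$2/${3}_onms_logs.tar.gz"
    # Refuse to act unless the log directory exists.
    if [ -z "$1" ] || [ ! -d "$1" ]; then
        echo "no such log directory: $1" >&2
        return 1
    fi
    # Archive first; leave the logs alone if tar fails.
    tar cfz "$archive" -C "$1" . || return 1
    # ${1:?} aborts instead of expanding to "/*" if $1 were ever empty.
    rm -rf "${1:?}"/*
    echo "$archive"
}

# Run only when asked explicitly, e.g.: sh reset_logs.sh run
if [ "${1:-}" = "run" ]; then
    /etc/init.d/opennms stop
    reset_logs "$LOG_DIR" "$BACKUP_DIR" "$(date +%m%d%y)"
    /etc/init.d/opennms start
fi
```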
    
     
    

