Recently in Solaris 10 Category

I ran into this very cryptic one while setting up Nagios at Joyent. I copied my plugins from one NRPE client to a new server. Three of my checks used check_procs, and all of them failed with a message like this:

check_procs
System call sent warnings to stderr: pst3: This program can only be run by the root user!

To make this even more annoying, sudo did not fix it. The same error message was displayed. What was the problem? File permissions! The error message should say “This program must be owned by the root user!” The fix:

sudo chown root:root pst3
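
Before changing anything, it can help to confirm that ownership really is the problem. Here is a minimal sketch that uses only POSIX ls and awk. The function names and the plugin path in the comments are my own assumptions, so adjust them for your install.

```shell
#!/bin/sh
# Print the numeric owner UID of a file (third column of "ls -ldn").
owner_uid() {
    ls -ldn "$1" | awk '{ print $3 }'
}

# Succeed only if the file is owned by root (UID 0).
owned_by_root() {
    [ "$(owner_uid "$1")" = "0" ]
}

# Example usage (hypothetical plugin path, not taken from the post):
# plugin=/usr/local/nagios/libexec/pst3
# owned_by_root "$plugin" || sudo chown root:root "$plugin"
```

If owned_by_root fails for pst3 on the new server, the copy most likely left the file owned by whichever user did the copying.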

If you’re new to Solaris, like me, you may have noticed an annoying limitation of the default ps program built into Solaris. It truncates the commands at 80 characters! To make matters worse, the man pages don’t mention anything about this issue.

The man pages offer no help because this is a kernel issue. It was decided back in 1994 that 80 characters should be enough. Since the kernel does not store more than 80 characters, ps has no way of accessing this information. Ironically, later that decade the same company invented their “language to end all languages”, Java. Unfortunately, Java’s design typically resulted in CLASSPATH values much longer than 80 characters being prepended to many commands.

There is an easy solution, though. Rather than fixing the problem, Sun left us the old ps tool. While this does not use their fancy new API and is probably much less efficient, it somehow has access to the full command for all running processes. It’s officially deprecated, so use it at your own risk, but here’s the equivalent of ps -ef:

/usr/ucb/ps axww

Note the different path to the command. Time to set up an alias.
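
One way to set that up is a small wrapper function in your shell profile instead of a plain alias. This is only a sketch: the names full_ps_cmd and psfull are made up for illustration, and the fallback to ps -ef on machines without /usr/ucb/ps is my own assumption.

```shell
#!/bin/sh
# Choose the ps invocation that shows untruncated command lines.
# Falls back to "ps -ef" (an assumption) when /usr/ucb/ps is missing.
full_ps_cmd() {
    if [ -x /usr/ucb/ps ]; then
        echo "/usr/ucb/ps axww"
    else
        echo "ps -ef"
    fi
}

# Hypothetical wrapper name; extra arguments are passed through to ps.
psfull() {
    $(full_ps_cmd) "$@"
}
```

With this in place, something like psfull | grep java should show the whole command line, CLASSPATH included, on a Solaris box.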


I'm using Nagios to monitor some services on my Solaris 10 systems hosted at Joyent. Until now I've just been using check_http to monitor everything that I cared about. Times change, though, and now I need to monitor disk space, free memory, and CPU load on many systems. I like to keep things simple, so I decided that it's time to install NRPE.

Building Nagios 3 and the other plugins was a breeze so I figured that this would be no problem. I downloaded NRPE and did the typical install steps. This is what I saw:

$  ./configure
... lots of configure output ...
$  gmake all
cd ./src/; gmake ; cd ..
gmake[1]: Entering directory `/home/eng/nrpe-2.12/src'
gcc -g -O2 -I/usr/local/include/openssl -I/usr/local/include -DHAVE_CONFIG_H -o nrpe nrpe.c utils.c -L/usr/local/lib  -lssl -lcrypto -lnsl -lsocket  ./snprintf.o 
nrpe.c: In function `get_log_facility':
nrpe.c:617: error: `LOG_AUTHPRIV' undeclared (first use in this function)
nrpe.c:617: error: (Each undeclared identifier is reported only once
nrpe.c:617: error: for each function it appears in.)
nrpe.c:619: error: `LOG_FTP' undeclared (first use in this function)
gmake[1]: *** [nrpe] Error 1
gmake[1]: Leaving directory `/home/eng/nrpe-2.12/src'

*** Compile finished ***

If the NRPE daemon and client compiled without any errors, you
can continue with the installation or upgrade process.

Read the PDF documentation (NRPE.pdf) for information on the next
steps you should take to complete the installation or upgrade.

Eeek! That sure is an ugly error. At first I assumed that this was a configuration issue, but that should have come up during the ./configure. I ended up doing what you're never supposed to do: I hacked the code. The rest of the installation went by the book.

Just go into src/nrpe.c and delete the only two lines that reference LOG_AUTHPRIV and LOG_FTP. In v2.12 I found them in the middle of an if-else series.
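
To find those lines without scrolling through the whole file, a quick search does the job. This is a minimal sketch. The function name is hypothetical, and it uses awk rather than grep -E because I am assuming the stock Solaris grep may not accept that flag.

```shell
#!/bin/sh
# Print "line-number:text" for every line that mentions one of the two
# syslog facilities the compiler reported as undeclared.
find_missing_facilities() {
    awk '/LOG_AUTHPRIV|LOG_FTP/ { print NR ":" $0 }' "$1"
}

# Example usage from the top of the unpacked source tree:
# find_missing_facilities src/nrpe.c
```

Look at each match in context before deleting anything. The matches sit inside an if-else chain, so the branch has to be removed in a way that leaves the rest of the chain intact.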


I was making a release today to one of my servers at Joyent. As part of the release I ran a short script written in Java. Java complained that it could not allocate memory to create a JVM! This is a bad sign on a production system. After some poking using top and the much more useful (on Solaris) prstat, I discovered that /lib/svc/bin/svc.configd was taking up 95% of my memory! It appears to have a memory leak.

I checked out the brief man page. It seemed pretty important so I was afraid to kill the process. Some googling around for a restart solution proved my fears baseless. It's OK to kill this process. It will restart by itself.

I killed svc.configd and it came back right away without incident. My memory was freed up.
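
For anyone who wants to repeat the hunt, here is a rough sketch of how the biggest memory users can be listed from plain ps output. The helper name top_rss is made up, and the commented pkill line is just one possible way to kill the process, not necessarily what I typed at the time.

```shell
#!/bin/sh
# Read "rss pid command" rows on stdin and print the N rows with the
# largest resident set size (default N is 5).
top_rss() {
    sort -k1,1nr | head -n "${1:-5}"
}

# Example usage (sed drops the ps header line):
# ps -eo rss,pid,comm | sed 1d | top_rss 3
#
# On Solaris, "prstat -s rss" gives a similar live view sorted by RSS.
#
# Once svc.configd is confirmed as the culprit, kill it by name as root:
# pkill svc.configd
```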

Time to start monitoring memory usage on my OpenSolaris zones.

