Controlling Weblogic Node Manager with Solaris SMF as non-root


First, I made a few modifications to the Node Manager start/stop scripts (on WebLogic 8.1).

In startNodeManager.sh, I added a -D flag so that the Node Manager process is easy to spot in 'ps' output, by adding this line:

----------------------------------------------------------------------------
JAVA_OPTIONS="${JAVA_OPTIONS} -Dnodemanager"
----------------------------------------------------------------------------

Then, in the actual java start commands, I added ${JAVA_OPTIONS} to each line that launches Node Manager:

----------------------------------------------------------------------------
"${JAVA_HOME}/bin/java" ${JAVA_OPTIONS} ${JAVA_VM} ${MEM_ARGS} \
 -Djava.security.policy="${WL_HOME}/server/lib/weblogic.policy" \
 -Dweblogic.nodemanager.javaHome="${JAVA_HOME}" \
 -DListenAddress="${LISTEN_ADDRESS}" -DListenPort="${LISTEN_PORT}" \
 weblogic.NodeManager
----------------------------------------------------------------------------

...etc.

Then I created a Node Manager stop script:


stopNodeManager.sh

----------------------------------------------------------------------------
#!/bin/sh
# *************************************************************************
# This script can be used to stop the WebLogic NodeManager
#

USERNAME="weblogic"

PID=`ps -fu ${USERNAME} | grep java | grep "nodemanager" | awk '{print $2}'`

if [ -n "${PID}" ]
then
   kill ${PID}
fi
----------------------------------------------------------------------------
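
If more than one process matches, the lookup above yields several PIDs at once. Here is a minimal sketch of the same lookup written as a reusable filter. The function name nodemanager_pids is made up for illustration, and it assumes the marker string is actually visible in the ps output on your system:

```shell
#!/bin/sh
# Hypothetical helper (not part of WebLogic): reads "ps -f" style lines on
# stdin and prints the PID (second column) of every line that mentions both
# "java" and the marker string passed as $1.
nodemanager_pids() {
    grep java | grep "$1" | awk '{print $2}'
}

# Intended use in stopNodeManager.sh:
#   PIDS=`ps -fu "${USERNAME}" | nodemanager_pids nodemanager`
#   if [ -n "${PIDS}" ]; then kill ${PIDS}; fi
```

Leaving ${PIDS} unquoted in the kill command is deliberate, so that each PID is passed as a separate argument.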

Verify that the paths are correct and that the scripts start and stop Node Manager properly. Then I created the SMF manifest (as root):


nodemanager.xml
----------------------------------------------------------------------------
<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<!--
    nodemanager.xml : Weblogic NodeManager manifest, Kyle Reynolds
    2006-07-02
-->

<service_bundle type='manifest' name='nodemanager'>
<service name='application/management/nodemanager/weblogic' type='service' version='1'>

   <single_instance />

   <dependency
      name='multi-user-server'
      grouping='require_any'
      restart_on='error'
      type='service'>
      <service_fmri value='svc:/milestone/multi-user-server:default' />
   </dependency>

   <exec_method
      type='method'
      name='start'
      exec='/u01/app/weblogic/bea81sp6/user_projects/domains/mydomain/startNodeManager'
      timeout_seconds='120' >
      <method_context>
         <method_credential user='weblogic' group='weblogic' />
      </method_context>
   </exec_method>


   <exec_method
      type='method'
      name='stop'
      exec='/u01/app/weblogic/bea81sp6/weblogic81/server/bin/stopNodeManager.sh'
      timeout_seconds='120' >
      <method_context>
         <method_credential user='weblogic' group='weblogic' />
      </method_context>
   </exec_method>

        <property_group name='start' type='method'>
                <propval name='action_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
                <propval name='modify_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
                <propval name='value_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
        </property_group>
        <property_group name='stop' type='method'>
                <propval name='action_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
                <propval name='modify_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
                <propval name='value_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
        </property_group>
        <property_group name='general' type='framework'>
                <propval name='action_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
                <propval name='value_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
                <propval name='modify_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
        </property_group>

   <instance name='default' enabled='false' />

   <stability value='Unstable' />

   <template>
      <common_name>
         <loctext xml:lang='C'>NodeManager</loctext>
      </common_name>
   </template>

</service>
</service_bundle>
----------------------------------------------------------------------------

Notice the property groups and values in the above manifest, for example:

----------------------------------------------------------------------------
        <property_group name='start' type='method'>
                <propval name='action_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
                <propval name='modify_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
                <propval name='value_authorization' type='astring'
                        value='solaris.smf.manage.nodemanager/weblogic' />
        </property_group>
----------------------------------------------------------------------------

These are what allow the user "weblogic" to start and stop the service, which by default only root is allowed to do.

So check the paths in the manifest, then run (as root):

----------------------------------------------------------------------------
svccfg validate nodemanager.xml
----------------------------------------------------------------------------

to make sure there are no syntax errors. Next, I had to set up RBAC to allow the "weblogic" user to manage the service.

In /etc/security/auth_attr, add (as root):

----------------------------------------------------------------------------
solaris.smf.manage.nodemanager/weblogic:::Nodemanager Management::
----------------------------------------------------------------------------

Then run the usermod command (as root):

----------------------------------------------------------------------------
usermod -A solaris.smf.manage.nodemanager/weblogic weblogic
----------------------------------------------------------------------------
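
To confirm that the authorization was actually assigned, auths(1) lists the authorizations granted to a user. Below is a rough sketch of a check built on it. The helper name has_auth is hypothetical, and it assumes auths prints a single comma-separated list; it only matches exact names and does not expand wildcard grants such as solaris.*:

```shell
#!/bin/sh
# Hypothetical helper: succeed if auths(1) lists authorization $2 for user $1.
# Assumes comma-separated output; wildcard grants are NOT expanded.
has_auth() {
    list=`auths "$1" | tr -d ' '`
    case ",${list}," in
        *",$2,"*) return 0 ;;
    esac
    return 1
}

# Intended use (as root or as the weblogic user):
#   has_auth weblogic solaris.smf.manage.nodemanager/weblogic \
#       && echo "authorization present"
```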

Now RBAC is set up and we just need to import the manifest (as root):

----------------------------------------------------------------------------
svccfg import nodemanager.xml
----------------------------------------------------------------------------

Now, as the "weblogic" user, you can control the service:

----------------------------------------------------------------------------
% svcs -a | grep nodemanager
online         Jul_05   svc:/application/management/nodemanager/weblogic:default

% svcadm disable application/management/nodemanager/weblogic

% svcadm enable application/management/nodemanager/weblogic
----------------------------------------------------------------------------
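
If you want a script to wait for the service to come up after enabling it, something along these lines could work. This is only a sketch: the function names are invented for illustration, and it relies on "svcs -H -o state" printing just the state column with no header line:

```shell
#!/bin/sh
# Print the current SMF state of an instance (e.g. online, offline, maintenance).
service_state() {
    svcs -H -o state "$1"
}

# Hypothetical helper: poll once per second until the instance is online.
# Returns 0 when online, 2 if it lands in maintenance, 1 if tries run out.
wait_until_online() {
    fmri="$1"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        state=`service_state "$fmri"`
        case "$state" in
            online) return 0 ;;
            maintenance) echo "$fmri is in maintenance" >&2; return 2 ;;
        esac
        tries=`expr "$tries" - 1`
        sleep 1
    done
    return 1
}

# Intended use:
#   svcadm enable application/management/nodemanager/weblogic
#   wait_until_online svc:/application/management/nodemanager/weblogic:default
```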


A different approach for starting/stopping the nodemanager:

After the line(s) that start the Node Manager, write its process ID (PID) to a file. For example:


pid_file="/var/run/weblogic/nodemanager.pid"
"${JAVA_HOME}/bin/java" ${JAVA_OPTIONS} ${JAVA_VM} ...
echo $! > $pid_file

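For this to work, the java command has to be started in the background (with a trailing &), because $! holds the PID of the most recent background command. The /var/run/weblogic directory also needs to be created first and needs to be writable by the user running WebLogic (let's call that user weblogic). So, as root, you'd do: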

mkdir -m0755 /var/run/weblogic
chown weblogic /var/run/weblogic

Having done that, stopNodeManager.sh can be as short as:

#!/bin/sh
# *************************************************************************
# This script can be used to stop the WebLogic NodeManager
#
# Fetch the value of $pid_file from the startNodeManager.sh script
eval `grep \^pid_file= ....startNodeManager.sh`
if [ -r "$pid_file" ]; then
    kill `cat "$pid_file"`
    rm "$pid_file"
fi
# EOF #

I had to do this on my site, because in the "ps" output, your (admittedly clever) "hack" of adding "-Dnodemanager" did not show up:

--($ ~)-- ps -ef | grep webl
bea81 820 16140 0 07:49:01 ? 0:17 /opt/apps/bea81/jdk142_05/bin/java -Xms512m -Xmx512m -Dweblogic.management.serv
bea81 785 16140 0 07:48:48 ? 0:17 /opt/apps/bea81/jdk142_05/bin/java -Xms512m -Xmx512m -Dweblogic.management.serv
bea81 788 16140 0 07:48:55 ? 0:17 /opt/apps/bea81/jdk142_05/bin/java -Xms512m -Xmx512m -Dweblogic.management.serv

Thanks! Nice workaround for the tag not showing up in ps.

I realize that this is common practice for many applications, but I typically try to avoid storing the PID in a file. If it's not done correctly and the process dies abnormally, the PID file is not cleaned up, and you end up with a confused script that thinks the process is still running. So I try to always find the PID dynamically, but as you've illustrated, that's not always reliable either...
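
One way to soften the stale PID file problem is to check that the recorded PID still belongs to a live process before trusting it. A minimal sketch, assuming the PID file holds a single number; the function name pid_file_is_live is invented here, and "kill -0" only tests whether a signal could be sent, so it also fails for a live process owned by another user:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if file $1 is readable, holds a purely
# numeric PID, and "kill -0" can reach that process.
pid_file_is_live() {
    [ -r "$1" ] || return 1
    pid=`cat "$1"`
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -0 "$pid" 2>/dev/null
}

# Intended use in a stop script:
#   if pid_file_is_live "$pid_file"; then
#       kill `cat "$pid_file"`
#   fi
#   rm -f "$pid_file"
```

This does not guard against the PID having been reused by an unrelated process, so it narrows the problem without removing it.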

Assuming this is Solaris, did you try using the Berkeley ps? /usr/ucb/ps might show the -D tag.

Thanks again for the tip. I appreciate it!

Hi,

Just one hint for people who like to use ps to find the nodemanager or any other java process: use /usr/ucb/ps instead of the default ps (/usr/bin/ps). You have to use the options 'ps -axww' to see the full command including all arguments. We usually use this "trick" for starting and stopping java in both SMF and Sun Cluster stop/start/monitor scripts.

//Tommy

Hi WLS Masters,

If you see this after running svcs -a | grep nodemanager:
---------------------------------------------------------------------------------
maintenance 23:38:21 svc:/application/management/nodemanager/weblogic:default
---------------------------------------------------------------------------------

after enabling the service, check the log (/var/svc/log/application-management-nodemanager-weblogic:default.log):

---------------------------------------------------------------------------------
[ Apr 6 00:21:17 Method or service exit timed out. Killing contract 188 ]
[ Apr 6 00:21:17 Method "start" failed due to signal KILL ]
---------------------------------------------------------------------------------

So, there is an issue :).

Solution:

Add the following lines to nodemanager.xml:
---------------------------------------------------------------------------------
<property_group name='startd' type='framework'>
   <propval name='duration' type='astring' value='child' />
   <propval name='ignore_error' type='astring'
      value='core,signal' />
   <propval name='utmpx_prefix' type='astring' value='co' />
</property_group>
---------------------------------------------------------------------------------

The solution was found here: http://forums.sun.com/thread.jspa?threadID=5357141

Best Regards,

Boyan Boychev

Hi Richard,
Thanks for the excellent post. Great information, and this process worked very well for us for WebLogic 10 on Solaris 10, after a couple of weeks of struggling with several issues!
Incidentally, we updated stopNodeManager.sh slightly to remove the hard-coded username, i.e.

USERNAME=`who am i|awk '{print $1}'`

Many thanks,

Andy