Writing Solaris Service Management Facility (SMF) service manifests

SMF services are typically daemons that stay in the background waiting for requests to serve; when a request comes in, the daemon wakes up, serves it, and then waits for the next request.

Services are built using different software development platforms and languages, but they all share one common aspect, which we are going to discuss here: the service manifest, which describes the service to SMF and lets SMF manage and understand the service life cycle.

To write a service manifest we should be familiar with XML and with the service manifest schema located at /usr/share/lib/xml/dtd/service_bundle.dtd.1. This DTD specifies which elements can be used to describe a service to SMF.

The next thing we need is a text editor, preferably an XML-aware one. The GNOME gedit editor can do the task for us.

A service manifest file is composed of six important elements, which are listed below:

  • The service declaration: specifies the service name, type, and instantiation model
  • Zero or more dependency declaration elements: specify the service dependencies
  • Life cycle methods: specify the start, stop, and refresh methods
  • Property groups: specify which property groups the service description has
  • Stability element: specifies how stable the service interface is across version changes
  • Template element: provides more human-readable information about the service

To describe a service, the first thing we need to do is identify the service itself. The following snippet shows how we can declare a service named jws.

<service name='network/jws' type='service' version='1'>
<create_default_instance enabled='false'/>

The first line specifies the service name, version, and type. The name attribute forms the FMRI of the service, which in this case will be svc:/network/jws. In the second line we are telling SMF to create a single default instance of this service, which will be svc:/network/jws:default. We can use the create_default_instance element to control automatic creation of the default instance.

All of the elements which we are going to mention in the following sections of this article are immediate child elements of the service element which itself is a child element of the service_bundle element.

The next important element is the dependency declaration element. We can have one or more of these elements in our service manifest.

<dependency name='net-physical' grouping='require_all' restart_on='none' type='service'>
<service_fmri value='svc:/network/physical'/>
</dependency>

Here we are declaring that our service depends on the svc:/network/physical service, which needs to be online before our service can start. The values for the grouping attribute are as follows:

  • The require_all grouping means that all services marked with this grouping must be online before our service can come online
  • The require_any grouping means that any one of the services in this grouping being online suffices for our service to come online
  • The optional_all grouping means the presence of the marked services is optional; our service works with or without them
  • The exclude_all grouping specifies services that conflict with our service; our service cannot come online in their presence

The next important elements specify how SMF should start, stop, and refresh the service. For these tasks we use three exec_method elements as follows:

<exec_method name='start' type='method' exec='/opt/jws/runner start' timeout_seconds='60'/>

This is the start method; SMF will invoke whatever the exec attribute specifies when it wants to start the service.

<exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'/>

SMF will terminate the process it started for the service using a kill signal. By default it uses SIGTERM, but we can specify our own signal; for example, we can use ':kill -9' or ':kill -HUP' or any other signal we find appropriate for terminating our service.
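For example, a stop method that sends SIGHUP instead of the default SIGTERM could look like the following snippet:

<exec_method name='stop' type='method' exec=':kill -HUP' timeout_seconds='60'/>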

<exec_method name='refresh' type='method' exec='/opt/jws/runner reload_conf' timeout_seconds='60'/>

The final method we should describe is the refresh method, which should reload the service configuration without disturbing its function.

The start and stop method descriptions are required to be present in the manifest in order to import it into the SMF repository, but the refresh method description and implementation are optional.

The timeout_seconds='60' attribute specifies that SMF should wait 60 seconds before aborting the method execution and retrying it. We can use a longer timeout when we know that the execution of the method takes longer, or a shorter one when we know that our service starts sooner.

The property group elements specify which property groups SMF should associate with the service. We can use as many of the SMF built-in property groups as we need, or define our own property groups.

<property_group name='startd' type='framework'>
<propval name='ignore_error' type='astring' value='core,signal'/>
</property_group>

The above snippet shows how we can use a framework built-in property group. The built-in property groups and their properties have meaning for the SMF framework and change the way it deals with our service. For example, the above snippet says that SMF should ignore termination of this service caused by a core dump or by a kill signal sent from another application.
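We can also define our own property group to hold service-specific settings. The following is only a sketch; the group name, property name, and path are hypothetical:

<property_group name='jws' type='application'>
<propval name='config_file' type='astring' value='/opt/jws/conf/jws.conf'/>
</property_group>

The service can then read such values at run time, for example with the svcprop command.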

The next important element is the stability element, which specifies whether the property names, service name, dependencies, and other service attributes may change in the next release. For example, the following snippet specifies that the service attributes may change in the next release.

<stability value='Unstable'/>

Finally we have the template element, which contains descriptive information about the service. For example, we can have something like the following, which describes our service for human users.

<template>
<common_name>
<loctext xml:lang='C'>Java Network Server</loctext>
</common_name>
<manpage title='JWS' section='1M' manpath='/usr/share/man'/>
</template>

The entire element is self-describing, except for the manpage element, which specifies where the man utility should look for the man page of the jws service.

All of these elements should be saved into an XML file with the following structure:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle ...>
<service ...>
<single_instance .../>
<dependency .../>
<exec_method .../>
<property_group .../>
<stability .../>
<template ...>...</template>
</service>
</service_bundle>

This is a raw representation of what a bare-bones service manifest can look like. The syntax is not accurate, and the attributes and child elements are omitted to make it easier to scan.

The service_bundle and service elements, along with the start and stop exec_method elements, are mandatory, while the other elements are optional.
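Putting the pieces together, a minimal but complete manifest for our hypothetical jws service could look like the following sketch (the paths and names are the assumed examples used earlier in this article):

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='jws'>
<service name='network/jws' type='service' version='1'>
<create_default_instance enabled='false'/>
<single_instance/>
<dependency name='net-physical' grouping='require_all' restart_on='none' type='service'>
<service_fmri value='svc:/network/physical'/>
</dependency>
<exec_method name='start' type='method' exec='/opt/jws/runner start' timeout_seconds='60'/>
<exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'/>
<property_group name='startd' type='framework'>
<propval name='ignore_error' type='astring' value='core,signal'/>
</property_group>
<stability value='Unstable'/>
<template>
<common_name>
<loctext xml:lang='C'>Java Network Server</loctext>
</common_name>
<manpage title='JWS' section='1M' manpath='/usr/share/man'/>
</template>
</service>
</service_bundle>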

Now we should install the service into the SMF repository and then administer it. To install the service we can use the following command.

# svccfg import /path/to/manifest.xml

When we import the service manifest into the service repository, SMF scans the service profiles and checks whether this service is required to be enabled, either directly or because a dependent service needs it. If required, SMF uses the start method specified in the manifest to start the service. But before starting it, SMF checks its dependencies and recursively starts any required services that are not already online.
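For example, assuming the jws service from this article, after importing the manifest we could enable the service and inspect its state as follows:

# svcadm enable network/jws
# svcs -l network/jws

The svcs -l subcommand prints the service's state, its dependencies, and the location of its log file.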

Reviewing the XML schema located at /usr/share/lib/xml/dtd/service_bundle.dtd.1, and staying tuned for my next few entries, is recommended for gaining a better understanding of SMF and Solaris service administration and management.

Solaris fault administration using fmadm command

In this article we will study how we can use the fmadm command to get the list of faulty components along with detailed information about each fault.

Before starting, we should have a command console open; then we can proceed with using the fmadm command. The most basic form of the fmadm command is its faulty subcommand, as follows:

# fmadm faulty

When there are no errors in the system, this command shows nothing and exits normally, but with a faulty component the output is different. For example, in the following sample we have a faulty ZFS pool because some of its underlying devices are missing.

Starting from the top we have:
  • Identification record: consists of the timestamp, a unique event ID, a message ID telling us which knowledge repository article to consult for learning more about the problem and troubleshooting it, and finally the fault severity, which can be Minor or Major.
  • Fault class: lets us know which device hierarchy is causing this fault
  • Affects: tells us which component of our system is affected and how. In this instance some devices are missing, and therefore the fault manager has taken the ZFS pool out of service.
  • Problem in: shows more details about the root of the problem; in this case, the device ID.
  • Description: refers us to a knowledge base article discussing this type of fault
  • Response: shows what action(s) the fault manager executed to repair the problem
  • Impact: describes the effect of the fault on the overall system stability and on the component itself
  • Action: a quick tip on the next step the administrator should follow to troubleshoot the problem. This step is fully described in the knowledge base article referred to in the description field.

The following figure shows the output of proceeding with the suggested action.

As we can see, the same article we were referred to is mentioned here again. We can see that two of the three devices have failed and fpool had no replica for those failed devices to replace them automatically.

If we had a mirrored pool and one of the devices were out of service, the system could automatically take corrective action and continue working in a degraded state until we replaced the faulty device.

The fault management framework is a pluggable framework consisting of diagnosis engines and subsystem agents. The agents and diagnosis engines contain the logic for assessing a problem, performing corrective actions if possible, and filing the relevant fault record into the fault database.

To see a list of agents and engines plugged into the fault management framework we can use the config subcommand of the fmadm command. The following figure shows the output of this command.

As we can see in the figure, there are two engines deployed with OpenSolaris: eft and zfs-diagnosis. The eft engine, whose name stands for Eversholt Fault Diagnosis Language, is responsible for assessing and analyzing hardware faults, while zfs-diagnosis is a ZFS-specific engine that analyzes and diagnoses ZFS problems.

The fmadm is a powerful utility which can perform much more than what we discussed. Here are a few other tasks we can perform using fmadm.
We can use the repaired subcommand of the fmadm utility to notify the FMD about a fault being repaired, so that it changes the component status and allows the component to be enabled and utilized again.

For example to notify the FMD about repairing the missing underlying device of the ZFS pool we can use the following command.

# fmadm repaired  zfs://pool=fpool/vdev=7f8fb1c77433c183

We can rotate the log files created by the FMD, when we want to keep a log file in a specific state or when we want a fresh log, using the rotate subcommand as shown below.

# fmadm rotate errlog
# fmadm rotate fltlog

The fltlog and errlog are two log files residing in the /var/fm/fmd/ directory storing all event information regarding faults and the errors causing them.

To learn more about the fmadm command we can use the man pages and the short help provided with the distribution. The following commands show how to invoke the man page and the short help message.

# man fmadm
# fmadm --help

A good starting resource is the FM wiki page located at http://wikis.sun.com/display/OpenSolarisInfo/Fault+Management

Monitoring ZFS pools performance using zpool iostat

Performance, performance, performance; this is what we hear in all software development and management sessions. ZFS provides a few utility commands to monitor the performance of one or more pools.

You may remember that we used the fsstat command to monitor UFS performance metrics. We can use the iostat subcommand of the zpool command to monitor the performance metrics of ZFS pools.

The iostat subcommand provides some options and arguments which we can see in its syntax shown below:

iostat [-v] [pool] ... [interval [count]]

The -v option shows detailed performance metrics about the pools it monitors, including their underlying devices. We can pass as many pool names as we want to monitor, or omit the pool name so that the command shows performance metrics for all pools. The interval and count specify how many times we want the sampling to happen and the interval between subsequent samples.

For example, we can use the following command to view detailed performance metrics of fpool, sampled 100 times at 5-second intervals.

# zpool iostat -v fpool 5 100
The output for this command is shown in the following figure.

The first row shows the capacity stats of the entire pool, including how much space was used at sampling time and how much was available. The second row shows how many read and write operations were performed during the interval, and finally the last columns show the bandwidth used for reading from and writing to the pool.

The zpool iostat subcommand retrieves some of its information from the read-only attributes of the requested pools and from system metadata, and calculates the other outputs by collecting samples at each interval.

Managing Logical network interfaces in Solaris

Like other operating systems, Solaris lets us assign multiple IP addresses to a network interface. These secondary addresses are called logical interfaces, and we can use them to make one machine with a single network interface own multiple IP addresses for different purposes. We may need to assign multiple IP addresses to an interface to make it available to both internal and external networks, or for testing purposes.

We should have one network interface configured in our system in order to create additional logical interfaces.

We are going to add a logical interface to the e1000g1 interface with a static IP address. Before doing so, let's see what network interfaces we have, using the ifconfig command.

Now to add the logical interface we only need to execute the following command:

# ifconfig e1000g1 addif <IP_address> up
Invoking this command performs the following tasks:
  • Creates a logical interface named e1000g1:1. The naming scheme for logical interfaces follows interface_name:logical_interface_number, where the number can be from 1 to 4096.
  • Assigns the given address as the logical interface's IP address, along with its netmask and broadcast address.

Now if we invoke the ifconfig -a command, the output should contain the logical interface status as well. The following figure shows a fragment of the ifconfig -a output.

The operating system does not retain this configuration over a system reboot; to make the configuration persistent we need to make some changes in the interface configuration file. For example, to make the configuration we applied in this recipe persistent, the content of /etc/hostname.e1000g1 should be something similar to the following snippet.
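As a sketch, assuming placeholder addresses (they are examples, not values from this recipe), the file could look like this:

192.168.10.5 netmask 255.255.255.0 up
addif 192.168.10.6 netmask 255.255.255.0 up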

The first line, as we discussed in recipe 3 of this chapter, assigns the given address to the interface, and the second line adds the logical interface with the given address. To remove a logical interface we can simply unplumb it using the ifconfig command as shown below.

# ifconfig e1000g1:1 unplumb

When we create a logical interface, OpenSolaris registers that interface in the network, and any packet received by the interface is delivered to the same stack that handles the underlying physical interface.


Configuring Solaris Link Aggregation (Ethernet bonding)

Link aggregation, commonly known as Ethernet bonding, allows us to enhance network availability and performance by combining multiple network interfaces into an aggregation that acts as a single network interface with greatly enhanced availability and performance.

When we aggregate two or more network interfaces, we form a new network interface on top of those physical interfaces, combined at the link layer.
We need at least two network interfaces in our machine to create a link aggregation. The interfaces must be unplumbed in order for us to use them in a link aggregation. The following command shows how to unplumb a network interface.
# ifconfig e1000g1 down unplumb
We should also disable NWAM, because link aggregation and NWAM cannot coexist.
# svcadm disable network/physical:nwam
The interfaces we want to use in a link aggregation must not be part of a virtual interface; otherwise it will not be possible to create the aggregation. To ensure that an interface is not part of a virtual interface, check the output of the following command.
# dladm show-link
The following figure shows that my e1000g0 has a virtual interface on top of it, so it cannot be used in an aggregation.
To delete the virtual interface we can use the dladm command as follow
# dladm delete-vlan vlan0
Link aggregation, as the name suggests, works at the link layer, and therefore we use the dladm command to make the necessary configuration. We use the create-aggr subcommand of dladm with the following syntax to create aggregation links.
dladm create-aggr [-l interface_name]* aggregation_name
In this syntax we should have at least two occurrences of the -l interface_name option, followed by the aggregation name.
Assuming that we have e1000g0 and e1000g1 at our disposal, the following command configures an aggregation link on top of them.
# dladm create-aggr -l e1000g0 -l e1000g1 aggr0
Now that the aggregation is created, we can configure its IP allocation in the same way that we configure a physical or virtual network interface. The following command plumbs the aggr0 interface, assigns a static IP address to it, and brings the interface up.
# ifconfig aggr0 plumb <IP_address> up
Now we can use the ifconfig command to see the status of our new aggregated interface.
# ifconfig aggr0
The result of the above command should be similar to the following figure.
To get a list of all available network interfaces, either virtual or physical, we can use the dladm command as follows:
# dladm show-link
And to get a list of aggregated interfaces we can use another subcommand of dladm as follow.
# dladm show-aggr
The output for previous dladm commands is shown in the following figure.
We can change an aggregation's underlying interfaces by adding an interface to, or removing one from, the aggregation using the add-aggr and remove-aggr subcommands of dladm. For example:
# dladm add-aggr -l e1000g2 aggr0
# dladm remove-aggr -l e1000g1 aggr0
The aggregation we created will survive a reboot, but our ifconfig configuration will not, unless we persist it using the interface configuration files.
To make the aggregation IP configuration persistent we just need to create the /etc/hostname.aggr0 file with the following content:
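As a sketch, assuming a placeholder address (an example, not a value from this recipe), the file could contain a single line such as:

192.168.10.7 netmask 255.255.255.0 up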
The interface configuration files are discussed in recipe 2 and 3 of this chapter in great details. Reviewing them is recommended.
To delete an aggregation we can use delete-aggr subcommand of dladm command. For example to delete aggr0 we can use the following commands.
# ifconfig aggr0 down unplumb
# dladm delete-aggr aggr0
As you can see, before we could delete the aggregation we had to bring down its interface and unplumb it.
In recipe 11 we discussed IPMP, which allows us to achieve high availability by grouping network interfaces and, when required, automatically failing over the IP address of any failed interface to a healthy one. In this recipe we saw how we can join a group of interfaces together for better performance. By grouping a set of aggregations we can have the high availability that IPMP offers along with the performance boost that link aggregation offers.
Link aggregation works at layer 2, meaning that the aggregation groups the interfaces at layer 2 of the network stack, where network packets are dealt with. At this layer the network layer's packets are created with, or extracted from, frames carrying the designated IP address of the aggregation and then delivered to the lower or higher layer accordingly.

Configuring DHCP server in Solaris

A Dynamic Host Configuration Protocol (DHCP) server leases IP addresses to clients that are connected to the network and have a DHCP client enabled on their network interface.
Before we can set up and start the DHCP server we need to install the DHCP configuration packages. Detailed information about installing packages is provided in chapter 1, but to save time we can use the following command to install the packages.
# pkg install SUNWdhcs
After installing these packages we can continue with the next step.
How to do it…
The first step in setting up the DHCP server is creating the storage and initial settings for it. The following command does the trick for us.
# dhcpconfig -D -r SUNWfiles -p /fpool/dhcp_fs -a -d domain.nme -h files -l 86400
In the above command we used several parameters and options; each one of them is explained below.
  • The -D specifies that we are setting up a new instance of the DHCP service.
  • The -r SUNWfiles specifies the storage type. Here we are using plain-text storage, while SUNWbinfiles (binary files) and SUNWnisplus (NIS+) are available as well.
  • The -p /fpool/dhcp_fs specifies the absolute path to where the configuration files should be stored.
  • The -a specifies the DNS server to use on the LAN. We can specify multiple comma-separated DNS server addresses.
  • The -d domain.nme specifies the network domain name.
  • The -h files specifies where the host information should be stored. Other values are nisplus and dns.
  • The -l 86400 specifies the lease time in seconds.
Now that the initial configuration is created we should proceed to the next step and create a network.
# dhcpconfig -N -m  -t
Parameters we used in the above command are explained below.
  • The -N specifies the network address.
  • The -m specifies the network mask to use for the network
  • The -t specifies the default gateway
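As a sketch with assumed example values, creating a hypothetical 192.168.1.0/24 network with its default gateway at 192.168.1.1 could look like the following:

# dhcpconfig -N 192.168.1.0 -m 255.255.255.0 -t 192.168.1.1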
All of the configurations we created are stored in the DHCP server configuration files. We can manage them using the dhtadm command. For example, to view all of the current DHCP server configuration entries we can use the following command.
# dhtadm -P
This command’s output is similar to the following figure.
Each command we invoked previously is stored as a macro with a unique name in the DHCP configuration storage. Later on we will use these macros in subsequent commands.
Now we need to create a network of addresses to lease. The following command adds the addresses we want to lease.
# pntadm -C
If we need to reserve an address for a specific host, or for a specific interface in a host, we should add the required configuration to ensure that the host or interface receives the designated IP address. For example:
# pntadm -A -f MANUAL -i 01001BFC92BC10 -m -y
In the above command we have:
  • The -A adds the IP address.
  • The -f MANUAL sets the MANUAL flag in order to only assign this IP address to the MAC address specified.
  • The -i 01001BFC92BC10 sets the MAC address of the host this entry is assigned to.
  • The -m specifies the macro this host is going to use.
  • The -y asks the command to verify that the macro entered actually exists.
  • The final operand specifies the network the address is assigned to.
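As a sketch with assumed example values, reserving the hypothetical address 192.168.1.50 on the 192.168.1.0 network (whose macro carries the network's name) could look like the following:

# pntadm -A 192.168.1.50 -f MANUAL -i 01001BFC92BC10 -m 192.168.1.0 -y 192.168.1.0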
Finally, we should restart the DHCP server for all the changes to take effect. The following command restarts the corresponding service.
#  svcadm restart dhcp-server
When we set up the DHCP service, we store the related configuration in the storage of our choice. When we start the service, it reads the configuration from the storage and waits dormant until it receives a request for leasing an IP address. The service then checks the configuration and, if an IP address is available for lease, leases it to the client.
Prior to leasing the IP address, the DHCP service checks all leasing conditions, such as a reservation of a specific IP address for a particular client, to ensure that it leases the right address.
We can use the DHCP Manager GUI application to configure a DHCP server. The DHCP Manager can also migrate the DHCP storage from one format to another. To install the DHCP Manager package we can use the following command.
# pkg install SUNWdhcm
Now we can invoke the DHCP Manager using the following command, which opens the DHCP Manager welcome page shown in the following figure.
# dhcpmgr