Introducing NIO.2 (JSR 203) Part 6: Filtering directory content and walking over a file tree

In this part we will look at how the directory tree walker and the directory stream reader work. Both are long-requested features that were not part of core Java before Java 7.

First, let's see what the directory stream reader is. This API lets us filter the content of a directory on the file system and extract the entries that match our filter criteria. Because entries are read as we iterate rather than loaded all at once, the feature also works for very large folders with thousands of files.

For filtering, we can match the file name against a glob pattern (the same syntax PathMatcher uses), or we can filter the directory content based on file attributes, for example the file permissions or the file size.

The following sample code shows how to use the DirectoryStream along with a filter. To filter by file name pattern instead, we can use another overload of the newDirectoryStream method which accepts the pattern string instead of the filter.

public class DirectoryStream2 {

    public static void main(String args[]) throws IOException {
        // getting the default file system and a path on it
        FileSystem fs = FileSystems.getDefault();
        Path p = fs.getPath("/usr/bin");

        // creating a directory stream filter
        DirectoryStream.Filter<Path> filter = new DirectoryStream.Filter<Path>() {

            public boolean accept(Path file) throws IOException {
                long size = Attributes.readBasicFileAttributes(file).size();
                String perm = PosixFilePermissions.toString(Attributes.readPosixFileAttributes(file).permissions());
                if (size > 8192L && perm.equalsIgnoreCase("rwxr-xr-x")) {
                    return true;
                }
                return false;
            }
        };
        // creating a directory stream with the newly developed filter
        DirectoryStream<Path> ds = p.newDirectoryStream(filter);
        try {
            Iterator<Path> it = ds.iterator();
            while (it.hasNext()) {
                Path pp = it.next();
                System.out.println(pp.getName());
            }
        } finally {
            // release the underlying directory handle
            ds.close();
        }
    }
}

The above code is self-explanatory, so I will not explain it any further than the inline comments.
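The sample above is written against the pre-release Java 7 builds. In the Java 7 release the same operations are static methods on java.nio.file.Files, so here is a minimal sketch of both filtering styles with the released API. The class name DirectoryFilters and its two method names are only illustrative and are not part of the JDK.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; the names are illustrative, not from the JDK.
final class DirectoryFilters {

    // Lists entries whose file name matches a glob such as "*.txt".
    static List<String> listByGlob(Path dir, String glob) throws IOException {
        List<String> names = new ArrayList<>();
        // try-with-resources closes the stream and releases the directory handle
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : ds) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names); // directory order is unspecified, so sort for stable output
        return names;
    }

    // Lists regular files larger than minSize bytes using an attribute-based filter.
    static List<String> listLargerThan(Path dir, final long minSize) throws IOException {
        DirectoryStream.Filter<Path> filter = new DirectoryStream.Filter<Path>() {
            @Override
            public boolean accept(Path file) throws IOException {
                return Files.isRegularFile(file) && Files.size(file) > minSize;
            }
        };
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir, filter)) {
            for (Path entry : ds) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

For example, DirectoryFilters.listByGlob(Paths.get("/usr/bin"), "*.sh") would return the sorted names of the shell scripts in that directory, if there are any.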

The next subject of this entry is directory tree walking, or the file visitor API. This API allows us to walk over a file system tree and execute any operation we need on the files we find. The good news is that we can scan down to any depth we require using the provided API.

With the directory tree walking API, we register a visitor class with the tree walker, and for each entry the walker comes across, either a file or a folder, it calls our visitor methods. So the first thing we need is a visitor to register with the tree walker. The following snippet shows a simple visitor which only prints the file type using the Files.probeContentType() method.

class visit extends SimpleFileVisitor<Path> {

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        try {
            System.out.println(Files.probeContentType(file));

        } catch (IOException ex) {
            Logger.getLogger(visit.class.getName()).log(Level.SEVERE, null, ex);
        }
        return super.visitFile(file, attrs);
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
        return super.postVisitDirectory(dir, exc);
    }

    @Override
    public FileVisitResult preVisitDirectory(Path dir) {
        return super.preVisitDirectory(dir);
    }

    @Override
    public FileVisitResult preVisitDirectoryFailed(Path dir, IOException exc) {
        return super.preVisitDirectoryFailed(dir, exc);
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) {
        return super.visitFileFailed(file, exc);
    }
}

As you can see we extended the SimpleFileVisitor and we have visitor methods for all possible cases.
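The visitor above follows the pre-release signatures. In the Java 7 release, preVisitDirectory also receives the directory's BasicFileAttributes, and there is no separate preVisitDirectoryFailed callback; a directory that cannot be opened is reported through visitFileFailed. The value each method returns steers the walk. Below is a small sketch, assuming the released API, of a visitor that skips a whole subtree by directory name. SkippingVisitor and its fields are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical visitor: records visited files and skips directories with a given name.
final class SkippingVisitor extends SimpleFileVisitor<Path> {
    private final String skippedDirName;
    final List<String> visited = new ArrayList<>();
    final List<String> failed = new ArrayList<>();

    SkippingVisitor(String skippedDirName) {
        this.skippedDirName = skippedDirName;
    }

    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
        Path name = dir.getFileName(); // null for a root such as "/"
        if (name != null && name.toString().equals(skippedDirName)) {
            // do not descend into this directory at all
            return FileVisitResult.SKIP_SUBTREE;
        }
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        visited.add(file.getFileName().toString());
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) {
        // remember the failure and keep walking instead of aborting the traversal
        failed.add(file.toString());
        return FileVisitResult.CONTINUE;
    }
}
```

Returning FileVisitResult.TERMINATE from any of these methods would stop the walk immediately, which is handy when we are searching for a single file.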

Now that we have the visitor class, the rest of the code is straightforward. The following sample shows how to walk over the /home/masoud directory down to two levels.

public class FileVisitor {

    public static void main(String args[]) {

        FileSystem fs = FileSystems.getDefault();
        Path p = fs.getPath("/home/masoud");
        visit v = new visit();
        Files.walkFileTree(p, EnumSet.allOf(FileVisitOption.class), 2, v);

    }
}
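To make the depth limit concrete, here is a sketch of the same kind of walk against the released Java 7 API, where Files.walkFileTree declares IOException and FileVisitOption contains only FOLLOW_LINKS. The starting directory counts as depth 0. One detail worth knowing: a directory sitting exactly at the maximum depth is not opened and is handed to visitFile instead, so the sketch checks the attributes before recording an entry. DepthLimitedWalk and regularFilesWithin are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;

// Hypothetical helper; the names are illustrative, not from the JDK.
final class DepthLimitedWalk {

    // Returns the sorted names of regular files found at most maxDepth levels below root.
    static List<String> regularFilesWithin(Path root, int maxDepth) throws IOException {
        final List<String> names = new ArrayList<>();
        // noneOf(...) means symbolic links are not followed
        Files.walkFileTree(root, EnumSet.noneOf(FileVisitOption.class), maxDepth,
                new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                        // unopened directories at maxDepth also arrive here, so filter them out
                        if (attrs.isRegularFile()) {
                            names.add(file.getFileName().toString());
                        }
                        return FileVisitResult.CONTINUE;
                    }
                });
        Collections.sort(names);
        return names;
    }
}
```

With a depth of 2, a file at root/d1/b.txt is reported while root/d1/d2/c.txt is not.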

You can grab the latest version of Java 7 aka Dolphin from here and from the same page you can grab the latest JavaDoc which is generated from the same source code that is used to generate the binary bits. Other entries of this series are under the NIO.2 tag of my weblog.

JavaZone 2010 sessions I am going to attend

I will be attending JavaZone 2010 both as a speaker presenting NIO.2 and as an attendee, sitting and learning the new technologies other speakers kindly share. After looking at the list of sessions, here are the sessions I decided to attend. It is really hard to decide which speaker and subject to choose. All speakers are well-known engineers and developers, and the selected subjects are exceptionally good.

I may change a few of these items as the JZ2010 staff update the agenda, but the majority of the sessions will be what I have already selected. See you at JZ2010 and at the JourneyZone after the conference itself.

8th of September:

  1. Emergent Design — Neal Ford
  2. JRuby: Now With More J! — Nick Sieger
  3. Howto: Implement Collaborative Filtering with Map/Reduce – Ole-Martin Mørk
  4. Building a scalable search engine with Apache Solr, Hibernate Shards and MySQL – Aleksander Stensby, Jaran Nilsen
  5. The not so dark art of Performance Tuning — Dan Hardiker, Kirk Pepperdine
  6. Creating modular applications with Apache Aries and OSGi — Alasdair Nottingham
  7. Java 7 NIO.2: The I/O API for Future — Masoud Kalali -> I am Speaking here
  8. The evolution of data grids, from local caching to distributed computing – Bjørn Vidar Bøe

9th of September:

  1. NOSQL with Java — Aslak Hellesøy
  2. CouchDB and the web — Knut O. Hellan
  3. Decision Making in Software Teams — Tim Berglund
  4. A Practical Introduction to Apache Buildr — Alex Boisvert
  5. Architecture Determines Performance — Randy Stafford
  6. Surviving Java Maintenance – A mini field manual — Mårten Haglind
  7. Using the latest Java Persistence API 2.0 features — Arun Gupta

Introducing NIO.2 (JSR 203) Part 5: Watch Service and Change Notification

For a long time, Java developers used in-house solutions to monitor the file system for changes. Some developed general-purpose libraries to ease the task for others with the same requirement, and both commercial and free/open source libraries exist, such as http://jnotify.sourceforge.net/, http://jpathwatch.wordpress.com/ and http://www.teamdev.com/jxfilewatcher, among others. Java 7 comes with NIO.2, or JSR 203, which provides a native file system watch service.

The watch service provided in Java 7 uses the underlying file system functionality to watch for changes. So if we are running on Windows, MacOS or Linux, the watch service does not impose polling overhead on our application as long as the underlying OS and file system provide the functionality Java needs to register for change notifications. If the underlying file system does not provide this, which I doubt is the case for any mainstream file system, Java falls back to a rudimentary polling mechanism to keep the code working, but performance will degrade.

Of the libraries mentioned, jpathwatch has the same API as Java 7, which makes it easier to migrate an IO-based application from an older version of Java to Java 7 when the time arrives.

The whole story starts with WatchService, which we use to register our interest in watching a path. The WatchService itself is an interface with several implementations for different file systems and operating systems.
We have four types to work with when we are developing a system with file system watch capability.
  1. A Watchable: A watchable is an object of a class implementing the Watchable interface. In our case this is the Path class, which is one of the central classes in NIO.2.
  2. A set of event types: We use it to specify which types of events we are interested in, for example whether we want to receive creation, deletion, … events. In our case we will use StandardWatchEventKind, which implements WatchEvent.Kind<T>.
  3. An event modifier: An event modifier qualifies how a Watchable is registered with a WatchService. In our case we will not use any modifier, as there is no implementation of this interface included in the JDK distribution so far.
  4. The Watcher: This is the watcher that watches some watchable. In our case the watcher watches the file system for changes. The type is java.nio.file.WatchService, and we will use the FileSystem object to create a watcher for the file system.

Now that we know the basics, let's see what a complete sample looks like, and then we will break the sample down into parts and discuss them one by one.

import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKind;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class WatchSer {
    public static void main(String args[]) throws InterruptedException {
        try {
            FileSystem fs = FileSystems.getDefault();
            WatchService ws = null;
            try {
                ws = fs.newWatchService();
            } catch (IOException ex) {
                Logger.getLogger(WatchSer.class.getName()).log(Level.SEVERE, null, ex);
            }
            Path path = fs.getPath("/home/masoud/Pictures");
            path.register(ws, StandardWatchEventKind.ENTRY_CREATE, StandardWatchEventKind.ENTRY_MODIFY, StandardWatchEventKind.OVERFLOW, StandardWatchEventKind.ENTRY_DELETE);

            WatchKey k = ws.take();

            List<WatchEvent<?>> events = k.pollEvents();
            for (WatchEvent<?> object : events) {
                if (object.kind() == StandardWatchEventKind.ENTRY_MODIFY) {
                    System.out.println("Modify: " + object.context().toString());
                }
                if (object.kind() == StandardWatchEventKind.ENTRY_DELETE) {
                    System.out.println("Delete: " + object.context().toString());
                }
                if (object.kind() == StandardWatchEventKind.ENTRY_CREATE) {
                    System.out.println("Created: " + object.context().toString());
                }
            }
        } catch (IOException ex) {
            Logger.getLogger(WatchSer.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}

At the beginning of the code we create a FileSystem object and then create a WatchService for the file system underneath the FileSystem object we created previously.

            FileSystem fs = FileSystems.getDefault();
            WatchService ws = null;
            try {
                ws = fs.newWatchService();
            } catch (IOException ex) {
                Logger.getLogger(WatchSer.class.getName()).log(Level.SEVERE, null, ex);
            }

At the next step we create a Path object which basically is our watchable, what we want to watch, and then register it with the WatchService object we created in the previous step.

  Path path = fs.getPath("/home/masoud/Pictures");
   path.register(ws, StandardWatchEventKind.ENTRY_CREATE, StandardWatchEventKind.ENTRY_MODIFY, StandardWatchEventKind.OVERFLOW, StandardWatchEventKind.ENTRY_DELETE);

Next we get a key which represents the registration of a watchable with a watch service.
This WatchKey object represents our access door to events propagated by the WatchService object.

After getting the key, we can poll for the events that have arrived at the key. A file system event may be represented by multiple WatchService events. For example, a rename is represented by a deletion event and a creation event.

 WatchKey key = ws.take();
 List<WatchEvent<?>> events = key.pollEvents();

Now that we have the list of events, we can iterate over it and see whether we are interested in each event or not.

for (WatchEvent<?> object : events) {
                if (object.kind() == StandardWatchEventKind.ENTRY_MODIFY) {
                    System.out.println("Modify: " + path.toRealPath(true)+"/"+ object.context().toString());

                }
                if (object.kind() == StandardWatchEventKind.ENTRY_DELETE) {
                    System.out.println("Delete: " +  path.toRealPath(true)+"/"+ object.context().toString());
                }
                if (object.kind() == StandardWatchEventKind.ENTRY_CREATE) {
                    System.out.println("Created: " +  path.toRealPath(true)+"/"+ object.context().toString());
                }
            }
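The snippets above use the pre-release names. In the Java 7 release the constants live in StandardWatchEventKinds (with a trailing s) and toRealPath takes LinkOption arguments instead of a boolean. The event context is a path relative to the watched directory, so resolving it against that directory is a tidy way to build the full path. Here is a minimal sketch under those assumptions. WatchEvents and describe are hypothetical names, and the sketch also handles OVERFLOW, which carries no path.

```java
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;

// Hypothetical helper; the names are illustrative, not from the JDK.
final class WatchEvents {

    // Convenience overload for events taken from WatchKey.pollEvents().
    static String describe(Path watchedDir, WatchEvent<?> event) {
        return describe(watchedDir, event.kind(), event.context());
    }

    // Builds a one-line description of an event kind and its context.
    static String describe(Path watchedDir, WatchEvent.Kind<?> kind, Object context) {
        if (kind == StandardWatchEventKinds.OVERFLOW) {
            // OVERFLOW has no usable context: events were lost or discarded
            return "Overflow: some events may have been lost";
        }
        // for the ENTRY_* kinds the context is the entry's path relative to watchedDir
        Path changed = watchedDir.resolve((Path) context);
        if (kind == StandardWatchEventKinds.ENTRY_CREATE) {
            return "Created: " + changed;
        }
        if (kind == StandardWatchEventKinds.ENTRY_DELETE) {
            return "Delete: " + changed;
        }
        if (kind == StandardWatchEventKinds.ENTRY_MODIFY) {
            return "Modify: " + changed;
        }
        return "Unknown (" + kind.name() + "): " + changed;
    }
}
```

Inside the loop we would then simply print WatchEvents.describe(path, object) for each event.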

In our sample code we run the watch service in the application thread, although we could use multiple threads to handle the events. One reason the WatchService is not implemented with the listener pattern is the need to use multiple threads to deal with the events in cases where there are thousands of events or event processing takes longer than the application can tolerate.
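As a rough sketch of that idea, the watch loop can live in its own Runnable so the application thread stays free. This version assumes the released Java 7 names (StandardWatchEventKinds). Unlike the sample above, which takes a single key and exits, it keeps taking keys and resets each one, because a key that is not reset receives no further events. WatchLoop and stop are hypothetical names, and printing stands in for whatever real processing we would hand to worker threads.

```java
import java.io.IOException;
import java.nio.file.ClosedWatchServiceException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

// Hypothetical class; the names are illustrative, not from the JDK.
final class WatchLoop implements Runnable {
    private final WatchService ws;

    WatchLoop(Path dir) throws IOException {
        WatchService service = dir.getFileSystem().newWatchService();
        try {
            dir.register(service,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_MODIFY,
                    StandardWatchEventKinds.ENTRY_DELETE);
        } catch (IOException | RuntimeException e) {
            service.close(); // do not leak the service if registration fails
            throw e;
        }
        this.ws = service;
    }

    @Override
    public void run() {
        try {
            while (true) {
                WatchKey key = ws.take(); // blocks until events are queued for a key
                for (WatchEvent<?> event : key.pollEvents()) {
                    System.out.println(event.kind().name() + ": " + event.context());
                }
                if (!key.reset()) {
                    break; // the directory is no longer accessible
                }
            }
        } catch (ClosedWatchServiceException e) {
            // stop() was called from another thread: leave the loop quietly
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }

    // Closing the service makes a blocked take() throw ClosedWatchServiceException.
    void stop() throws IOException {
        ws.close();
    }
}
```

We would start it with new Thread(new WatchLoop(path)).start() and call stop() when the application shuts down.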