Tcl Directory Scanning Library Module

The Tcl directory scanning library module has the file name dscan.tcl.

Introduction

The directory scanning module (API) traverses one or more directories from a given entry point, matching file and directory names against a pattern. The API does not use recursion to traverse the directory tree, so it is not subject to the stack limitations and excessive memory consumption typical of recursive algorithms.

This Tcl version differs from the C version in that it uses the glob command, which returns a list of matches instead of one match at a time. I use a global list to store the matches and return only one match at a time. Since Tcl has no conventional data structures (i.e., records or structures), the list of active directories is stored in an array, with each element of the array containing a list.
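The holding-list idea can be illustrated with a minimal sketch. The names hold, hold_fill and hold_next are hypothetical and are not part of dscan.tcl; the sketch only shows how a list returned by glob can be handed out one entry per call.

```tcl
# Hypothetical sketch: glob returns every match at once, so park the
# results in a global holding list and hand them out one per call.
set hold {}

# Fill the holding list with the matches found in one directory.
proc hold_fill {dir pattern} {
    global hold
    set hold [glob -nocomplain -directory $dir -- $pattern]
}

# Store the next held name in the caller's variable.
# Returns 1 if a name was available, 0 when the list is empty.
proc hold_next {varName} {
    global hold
    upvar 1 $varName name
    if {[llength $hold] == 0} {
        return 0
    }
    set name [lindex $hold 0]
    set hold [lrange $hold 1 end]
    return 1
}

# Usage: print every entry of the current directory, one per call.
hold_fill [pwd] *
while {[hold_next item]} {
    puts $item
}
```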

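Here is a list of procedures in this library module:

   dscan_findfirst
   dscan_findnext
   dscan_end
   dscan_find
   dscan_sub
   dscan_chop_path
   dscan_set_flags
   dscan_set_current_path
   dscan_ll_delete
   dscan_set_mvars
   dscan_delete_mvars
   dscan_ll_find
   dscan_ll_add
   dscan_set_hold
   dscan_ll_debug
   dscan_header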

Using the API

To use the API, you start by calling the procedure dscan_findfirst which will initialize all the module variables. You should then enter a loop which will call the procedure dscan_findnext. Here is a simple sample program:

set TCLDIR /home/rick/gpl/tcl
source $TCLDIR/lib/stdhead.tcl
source $TCLDIR/lib/dscan.tcl

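   set fname ""
   # Start the scan at the root; fname receives the full path of each match.
   set ret [dscan_findfirst "/" "*.log" "fds" fname]

   if {$ret == 0} {
      puts "not found"
      exit
   }

   # ret holds the item type code and becomes zero when the scan is done.
   while {$ret} {
      puts "found:ret=$ret,fname=$fname"
      set ret [dscan_findnext fname]
   }

   exit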

The first three lines of the above sample program set the Future Lab Tcl base directory and source all the required modules. Refer to the guide on running the Future Lab Tcl applications for complete information.

You supply the scanning criteria to the dscan_findfirst procedure. The first parameter is the starting point in the file system. The second parameter is the name to match, which may include wildcards. The third parameter holds the scan flags, and the last parameter is the variable in which the found item (full path and name) is returned. Both dscan_findfirst and dscan_findnext return a non-zero value if a match is found. The return code identifies the item type (file, directory or symbolic link). The type codes are defined in this module as:

   DSCAN_TYPE_FILE
   DSCAN_TYPE_DIR
   DSCAN_TYPE_SYMLINK
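How an item's type might be determined can be sketched with the standard file type command. This is an illustration only: the MY_TYPE_ variables and the procedure classify_item are hypothetical, and the numeric values are placeholders, since this guide does not state the values of the real DSCAN_TYPE_ codes.

```tcl
# Hypothetical stand-ins; the real codes are defined by dscan.tcl.
set MY_TYPE_FILE    1
set MY_TYPE_DIR     2
set MY_TYPE_SYMLINK 3

# Return a type code for path, or 0 if it is missing or of another kind.
proc classify_item {path} {
    global MY_TYPE_FILE MY_TYPE_DIR MY_TYPE_SYMLINK
    # file type does not follow symbolic links and raises an error
    # when the path does not exist, hence the catch.
    if {[catch {file type $path} kind]} {
        return 0
    }
    switch -- $kind {
        file      { return $MY_TYPE_FILE }
        directory { return $MY_TYPE_DIR }
        link      { return $MY_TYPE_SYMLINK }
        default   { return 0 }
    }
}
```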

The flags (third) parameter may consist of any of the flags:

   f - scan for files
   d - scan for directories
   s - include sub-directories
   l - scan for symbolic links

One of f, d or l must be supplied. Scanning support for symbolic links is only available in Unix or Unix-like operating systems.
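The flag rules above can be sketched as a small parser. The procedure parse_scan_flags and its array keys are hypothetical; rejecting unknown letters is an assumption of this sketch, and dscan_set_flags may behave differently.

```tcl
# Hypothetical parser: fills the caller's array with one boolean per
# flag. Returns 1 if the flags are usable, 0 otherwise.
proc parse_scan_flags {flags arrName} {
    upvar 1 $arrName opt
    array set opt {files 0 dirs 0 subdirs 0 links 0}
    foreach ch [split $flags ""] {
        switch -- $ch {
            f { set opt(files) 1 }
            d { set opt(dirs) 1 }
            s { set opt(subdirs) 1 }
            l { set opt(links) 1 }
            default { return 0 }
        }
    }
    # At least one item type (f, d or l) must be requested.
    return [expr {$opt(files) || $opt(dirs) || $opt(links)}]
}

# Usage: "fds" requests files and directories, including sub-directories.
if {[parse_scan_flags "fds" opt]} {
    puts "files=$opt(files) dirs=$opt(dirs) subdirs=$opt(subdirs) links=$opt(links)"
}
```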

In the above example, the starting point is the root (/) searching for files (f) and directories (d) that have the .log extension. Sub-directories will also be scanned.

The memory that is used by dscan will automatically be unset when there are no longer any matching items. If, however, you do not repeatedly call dscan_findnext to logically conclude the scan, the memory in use by dscan will remain allocated. In this case, call the procedure dscan_end to force dscan to unset all memory used.
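As an illustration of stopping a scan early, the sketch below returns only the first match and then releases the scan state. The wrapper name first_match is hypothetical; the dscan calls follow the signatures documented in this guide, and dscan.tcl is assumed to have been sourced before the wrapper is called.

```tcl
# Return the first file matching pattern under root, or "" if none.
# Assumes dscan.tcl has been sourced before this procedure is called.
proc first_match {root pattern} {
    set fname ""
    set ret [dscan_findfirst $root $pattern "fs" fname]
    if {$ret == 0} {
        # The scan ran to its natural end, so dscan has already
        # released its memory.
        return ""
    }
    # The scan is abandoned after one hit; release the memory explicitly.
    dscan_end
    return $fname
}
```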

Keep in mind that dscan may take some time to locate specific items, especially when scanning a large disk with many files and directories. Dscan has been built to use the logging manager if it is active. If you really want to know exactly what dscan is doing, enable the log and be prepared for a large volume of output.

Theory of Operation

Dscan maintains an array of all directories currently being scanned. It starts by scanning the initial directory supplied in the procedure dscan_findfirst. All items in the directory are compared. The name of each sub-directory located is placed in a string delimited by DSCAN_SUB_DELIM. When there are no more items in the directory, each sub-directory name is appended to the current path in turn and that directory is scanned in the same way. When there are no more sub-directories, one level is removed from the current path and the cycle starts over again. Each time a specific directory has no more sub-directories to scan, its element in the array is deleted.

This method has the benefit of very low memory consumption. The maximum number of elements in the array is tied directly to the depth of a specific directory structure.
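A non-recursive traversal of this general kind can be sketched as follows. This is not dscan's implementation: for brevity it keeps the pending directories in a plain list used as a stack rather than in an array of delimited strings, and it collects all matches at once instead of returning one per call. The procedure name walk_files is hypothetical.

```tcl
# Hypothetical iterative walk: returns every file under root whose name
# matches pattern, without any recursive procedure calls.
proc walk_files {root pattern} {
    set found {}
    set pending [list $root]
    while {[llength $pending] > 0} {
        # Pop the most recently added directory (depth-first order).
        set dir [lindex $pending end]
        set pending [lrange $pending 0 end-1]

        # Collect the matching files in this directory.
        foreach f [glob -nocomplain -directory $dir -types f -- $pattern] {
            lappend found $f
        }
        # Queue the sub-directories, skipping symbolic links to avoid
        # cycles. On Unix, * does not match names that start with a dot.
        foreach sub [glob -nocomplain -directory $dir -types d -- *] {
            if {[file type $sub] ne "link"} {
                lappend pending $sub
            }
        }
    }
    return $found
}

# Usage: list every .log file below the current directory.
foreach f [walk_files [pwd] *.log] {
    puts $f
}
```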

Platform Specific Notes

Dscan operates in the same basic way on every platform. There are, however, some differences in the way the Tcl glob command is implemented on each platform, especially with regard to file name case sensitivity.
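If a caller needs case-insensitive matching regardless of platform, one possible approach is to list every entry with glob and filter the names with string match -nocase. The helper glob_nocase below is hypothetical and is not part of dscan. Note that string match patterns are close to, but not identical to, glob patterns (for example, brace alternatives are not supported).

```tcl
# Hypothetical helper: list the entries of dir whose names match
# pattern, ignoring case regardless of how the platform's glob behaves.
proc glob_nocase {dir pattern} {
    set out {}
    foreach path [glob -nocomplain -directory $dir -- *] {
        if {[string match -nocase $pattern [file tail $path]]} {
            lappend out $path
        }
    }
    return $out
}
```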

As noted above, support for scanning symbolic links is limited to Unix-like operating systems.

Module Dependencies

The following modules are required along with this module:

Module Definition Files

This module requires the definition file:

Module Procedures

dscan_findfirst

Declaration  : proc dscan_findfirst {path fname flags pfname}
Parameters   :
      Name   : path
      Type   : string
Description  : starting path

      Name   : fname
      Type   : string
Description  : name/pattern to locate

      Name   : flags
      Type   : string
Description  : scan flags (f, d, s, l)

      Name   : pfname
      Type   : string
Description  : returned item full path and name

Returns      : DSCAN_TYPE_FILE, DSCAN_TYPE_DIR or DSCAN_TYPE_SYMLINK upon match, zero otherwise

This procedure should be called first when starting a directory scan. The scan will proceed from the starting path continuing until a match is found or until there are no more directories to scan.

dscan_findnext

Declaration  : proc dscan_findnext {pfname}
Parameters   :
      Name   : pfname
      Type   : string
Description  : returned item full path and name

Returns      : DSCAN_TYPE_FILE, DSCAN_TYPE_DIR or DSCAN_TYPE_SYMLINK upon match, zero otherwise

This procedure will scan for subsequent files or directories. The procedure dscan_findfirst must already have been called to initialize the scan.

dscan_end

Declaration  : proc dscan_end {}

This procedure will unset the scan array and wipe all the global variables in use by dscan. Normally, when the scan process is brought to a natural conclusion (no more items found), dscan will automatically do this for you. If, however, you stop the scan before all items have been found, the memory used by dscan is still allocated. By calling this procedure, you can force dscan to release all used memory.

dscan_find

Declaration  : proc dscan_find {fname}
Parameters   :
      Name   : fname
      Type   : string
Description  : output full path and name

Returns      : DSCAN_TYPE_FILE, DSCAN_TYPE_DIR or DSCAN_TYPE_SYMLINK upon match, zero otherwise

This procedure will scan the current directory for an item match.

dscan_sub

Declaration  : proc dscan_sub {}

Returns      : 1 if another directory was located, 0 otherwise

This procedure will first attempt to obtain the name of one of the sub-directories in the current path that has been stored in the array. If that fails, one directory level is removed from the current path.

dscan_chop_path

Declaration  : proc dscan_chop_path {}

Returns      : 1 upon success, 0 otherwise

This procedure will remove one directory level from the current path.
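Removing one directory level can be sketched with the standard file dirname command. The procedure chop_path below is hypothetical: unlike dscan_chop_path, which takes no arguments and works on the module's current path, it operates on a variable named by the caller and stops only at the top of the file system.

```tcl
# Hypothetical sketch: strip the last component from the path held in
# the caller's variable. Returns 1 if a level was removed, 0 at the top.
proc chop_path {varName} {
    upvar 1 $varName path
    set parent [file dirname $path]
    if {$parent eq $path} {
        return 0
    }
    set path $parent
    return 1
}
```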

dscan_set_flags

Declaration  : proc dscan_set_flags {flags}
Parameters   :
      Name   : flags
      Type   : string
Description  : scan flags

Returns      : 1 upon success, 0 otherwise

This procedure will set the global scan flags based on the supplied scan flags.

dscan_set_current_path

Declaration  : proc dscan_set_current_path {p}
Parameters   :
      Name   : p
      Type   : string
Description  : directory path

Returns      : 1 upon success, 0 otherwise

This procedure will set the global current path to the directory path.

dscan_ll_delete

Declaration  : proc dscan_ll_delete {p}
Parameters   :
      Name   : p
      Type   : string
Description  : full directory path

This procedure will unset the full directory path entry from the array.

dscan_set_mvars

Declaration  : proc dscan_set_mvars {path fname}
Parameters   :
      Name   : path
      Type   : string
Description  : starting directory path

      Name   : fname
      Type   : string
Description  : target file name/pattern

Returns      : 1 upon success, 0 otherwise

This procedure will set the global variables to the supplied starting directory and the target file name/pattern.

dscan_delete_mvars

Declaration  : proc dscan_delete_mvars {}

This procedure will either unset or wipe the global variables used by dscan.

dscan_ll_find

Declaration  : proc dscan_ll_find {p}
Parameters   :
      Name   : p
      Type   : string
Description  : directory path

Returns      : 1 if the array element was found, 0 otherwise

This procedure will attempt to locate an array entry by the directory path.

dscan_ll_add

Declaration  : proc dscan_ll_add {p}
Parameters   :
      Name   : p
      Type   : string
Description  : directory path

Returns      : 1 if the add was successful, 0 otherwise

This procedure will add an entry to the array with the supplied directory path.
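The add, find and delete operations on an array keyed by directory path can be sketched as follows. The array name scan and the procedure names are hypothetical, and refusing to add a path that is already present is an assumption of this sketch rather than documented dscan behavior.

```tcl
# Hypothetical array keyed by directory path; each element holds the
# list of sub-directories still waiting to be scanned.
array set scan {}

# Add an empty entry for path p. Returns 1 if added, 0 if it existed.
proc ll_add {p} {
    global scan
    if {[info exists scan($p)]} {
        return 0
    }
    set scan($p) {}
    return 1
}

# Returns 1 if an entry for path p exists, 0 otherwise.
proc ll_find {p} {
    global scan
    return [info exists scan($p)]
}

# Remove the entry for path p; a missing entry is silently ignored.
proc ll_delete {p} {
    global scan
    unset -nocomplain scan($p)
}
```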

dscan_set_hold

Declaration  : proc dscan_set_hold {glist type}
Parameters   :
      Name   : glist
      Type   : list
Description  : list of directory items

      Name   : type
      Type   : string
Description  : item type

This procedure will add each item in the list of directory items to the holding list.

dscan_ll_debug

Declaration  : proc dscan_ll_debug {}

This procedure will output the entire current contents of the scan linked list to the current log destination. For this procedure to operate properly, the logging manager must have already been started by calling the procedure logman_start.

dscan_header

Declaration  : proc dscan_header {mname}
Parameters   :
      Name   : mname
      Type   : string
Description  : procedure name

This procedure will create a log entry (if enabled) to indicate entry into a dscan procedure.


Copyright © 2005-2006 Future Lab, Last Updated Jul 01, 2006