BACKASCII

John Osudar
Electronics Department
Argonne National Laboratory
205 A-051
9700 South Cass Avenue
Argonne, IL 60439-4837

Phone numbers:
    FTS:   972-7505
    (312) 972-7505

Electronic mail addresses:
    Bitnet:  B35049 at ANLCMT
    MFENET:  B35049@AN2

BACKASCII consists of a pair of utility programs, BACKPACK and
BACKUNPACK. These programs were originally designed to convert BACKUP
saveset files to and from a packed ASCII format, so that such files
could be sent over communications links (e.g. networks such as Bitnet)
which do not accept, or incorrectly translate, eight-bit binary data,
and possibly even some seven-bit printable characters.

BACKPACK works by counting the frequency of each byte value in each
record of a file, writing a header indicating which values have been
replaced by substitute characters, and then ASCIIfying and compressing
the record. Identical consecutive values may be replaced by repeat
counts, and the most frequently occurring values are replaced by single
printable characters from the set "G"-"Z" and "a"-"z". Less common
values are represented by their two-character hexadecimal equivalents.

The resulting "packed" file has variable-length records of at most 80
characters, and contains only upper-case and lower-case letters, along
with digits, periods, and commas. Options may be specified to restrict
the use of lower-case letters as well, for use in cases where only
upper-case data can be transferred, or to expand the set of replacement
characters to include all non-blank printable ASCII characters (except
the hex digits, comma, period, and dollar sign) for cases where all
those are known to transmit correctly. (The dollar sign and space are
never included in any packed file. The reason is that the dollar sign
might occur in column 1, preventing the "distribution" file from
working -- the line would appear to be a DCL command -- and some
communications software does strange things with spaces, such as delete
trailing ones.)

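The substitution step described above can be illustrated with a short
Python sketch. This is not BACKPACK's code (BACKPACK is written in VAX
Fortran) and it does not produce BACKPACK's file format: the header,
the repeat counts, the 80-character record limit, and the rule for
deciding whether a replacement is worth its header cost are all left
out. The names build_table, encode_record, and decode_record are made
up for this illustration. Hex is written in upper case here because
"a"-"f" belong to the substitute set.

```python
from collections import Counter

# Substitute characters named in the text: "G"-"Z" and "a"-"z" (46 in all).
SUBSTITUTES = [chr(c) for c in range(ord("G"), ord("Z") + 1)] + \
              [chr(c) for c in range(ord("a"), ord("z") + 1)]


def build_table(record):
    """Map the most frequent byte values of one record to single characters."""
    counts = Counter(record)
    # Most frequent first; ties broken by byte value so the result is stable.
    ranked = sorted(counts, key=lambda b: (-counts[b], b))
    # zip stops at the shorter sequence, so at most 46 values get a substitute.
    return {value: ch for value, ch in zip(ranked, SUBSTITUTES)}


def encode_record(record, table):
    """Emit the substitute character if there is one, else two hex digits."""
    return "".join(table.get(b, "%02X" % b) for b in record)


def decode_record(text, table):
    """Invert encode_record, given the same table (hypothetical helper)."""
    reverse = {ch: value for value, ch in table.items()}
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] in reverse:
            # A single substitute character stands for one byte.
            out.append(reverse[text[i]])
            i += 1
        else:
            # Anything else is taken to be a two-character hex pair.
            out.append(int(text[i:i + 2], 16))
            i += 2
    return bytes(out)


if __name__ == "__main__":
    record = bytes(range(50)) + bytes([7] * 5)
    table = build_table(record)
    packed = encode_record(record, table)
    assert decode_record(packed, table) == record
    print(len(record), "bytes ->", len(packed), "characters")
```

In this sketch a record with more than 46 distinct byte values falls
back to hex for the least frequent ones, which is the case the text
refers to as not having enough replacement characters.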
BACKUNPACK reads the packed file, and recreates the original file. The
original filename and extension are written into the packed file by
BACKPACK, so the unpacked file will have the same name as the original.
BACKUNPACK knows how to unpack upper/lowercase, uppercase-only, and
"extended" format packed files.

Although BACKASCII was designed to process BACKUP savesets, which are
fixed-length record files with a large recordsize, it can be used to
process any file of fixed-length or variable-length records, e.g. a VMS
executable image, a graphics metafile, etc. (Note that, once the
recordsize is large enough that there aren't enough replacement
characters to handle all replaceable byte values, increased recordsize
reduces efficiency. However, small records tend to have more
unreplaceable byte values, i.e. values that produce replacement savings
of less than three bytes.)

When used with BACKUP savesets created with /BLOCK=2048, packed files
that are 10% to 20% smaller than the original saveset have been seen.
With graphics metafiles averaging over a block per record, packed files
were up to 30% larger than the original. Packed EXE files vary, from
around 10% smaller than the original to as much as 20% larger. Using
the "extended" format, an additional 2% to 5% size reduction was seen
on savesets, while graphics metafiles were only 20% to 25% larger than
the original file.

BACKASCII is the command procedure that drives both utilities. It is
executed with parameters as follows:

    @BACKASCII command inputfile {outputfile}

"command" may be:

    PACK     to produce a packed upper/lowercase output file
    PACK/U   to produce a packed uppercase-only output file
    PACK/X   to produce a packed full-ASCII output file
    UNPACK   to unpack a packed file

"inputfile" is the name of the file being processed.

"outputfile" is required on the PACK commands, and optional on UNPACK.
If specified on UNPACK, it should include the device and/or directory
where the file is to be written; the filename and file type of the
original file (as contained in the packed file header) will be
retained.

BACKDIST is a template command procedure that includes the
(uncommented) source of BACKUNPACK, and provides a place to insert a
packed file. By editing BACKDIST and inserting a packed file where
indicated, the resulting command procedure can be sent to another VMS
site, where executing it will produce a copy of the original file.
(Fortran is required on the receiving node in order to make use of
this.)

Note that BACKUNPACK is written entirely in VAX Fortran, and calls no
system services or other VMS-specific routines directly. It should be
relatively easy to convert BACKUNPACK to work with a non-VMS Fortran-77
compiler, in which case it would be possible to ship packed files of
binary data (e.g. graphics metafiles) to non-VMS hosts and unpack them
into original form at the destination.

Also note that the packing and unpacking processes require a fairly
large amount of CPU time, although this version has been optimized to
work a lot more efficiently than the original version. (On a 785,
typical packing and unpacking rates of 50 (unpacked) blocks per
CPU-second have been observed -- i.e. a 6000-block saveset might
require two MINUTES of 785 CPU time to pack or unpack!)
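
The CPU-time figure quoted above is simple division, and the same
estimate can be made for other file sizes. The Python fragment below
is only a calculator for that arithmetic. The function name
cpu_seconds is made up here, and the default rate is the 50 blocks per
CPU-second reported above for a 785; other machines will differ.

```python
def cpu_seconds(blocks, blocks_per_cpu_second=50.0):
    """Estimate CPU seconds to pack or unpack a file of the given size."""
    return blocks / blocks_per_cpu_second


if __name__ == "__main__":
    seconds = cpu_seconds(6000)
    # 6000 blocks at 50 blocks per CPU-second is 120 seconds, or 2 minutes.
    print(seconds, "CPU-seconds =", seconds / 60, "CPU-minutes")
```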