Freeware Command Line Sort Utility CMSort


 

Overview

Do you (still) need to sort plain text files with DOS, WINDOWS, UNIX, MAC (or even mixed!) end-of-line marks? Or files with fixed-length records? Then you should take a look at CMSort, a freeware command line sort utility for Windows 95/98/NT/2000/XP.

If you have any questions or comments, feel free to contact the author. CMSort was built with Borland Delphi.


 

Features

CMSort 1.6 has the following main features:

How does CMSort work? CMSort reads records from the input file until the configured memory limit is reached. The records are then sorted and written to a temporary file. This is repeated until all records have been processed. Finally, all temporary files are merged into the output file.
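The procedure described above is a classic external merge sort. The following Python sketch illustrates the idea only; it is not CMSort's actual Delphi code. The function names and the `max_bytes` parameter are hypothetical, and records are assumed to be newline-terminated text lines.

```python
import heapq
import os
import tempfile


def _write_run(lines, key):
    """Sort one in-memory chunk and write it to a temporary file."""
    lines.sort(key=key)
    fd, path = tempfile.mkstemp(suffix=".run", text=True)
    with os.fdopen(fd, "w") as tmp:
        tmp.writelines(lines)
    return path


def external_sort(in_path, out_path, max_bytes=1_000_000, key=None):
    """Sort the lines of in_path into out_path with bounded memory.

    max_bytes is a hypothetical stand-in for the memory limit:
    once a chunk reaches that size it is sorted and spilled to disk.
    """
    run_paths = []
    with open(in_path, "r") as src:
        chunk, size = [], 0
        for line in src:
            if not line.endswith("\n"):
                line += "\n"  # terminate the last record so merged lines stay separate
            chunk.append(line)
            size += len(line)
            if size >= max_bytes:
                run_paths.append(_write_run(chunk, key))
                chunk, size = [], 0
        if chunk:
            run_paths.append(_write_run(chunk, key))

    # Merge all sorted temporary files into the output file.
    runs = [open(p, "r") for p in run_paths]
    try:
        with open(out_path, "w") as dst:
            dst.writelines(heapq.merge(*runs, key=key))
    finally:
        for f in runs:
            f.close()
        for p in run_paths:
            os.remove(p)
```

`heapq.merge` reads the temporary files lazily, so only one line per temporary file is held in memory during the merge phase.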

 
 

Users' Review

On June 28, 2001 I received the following mail:

I just wanted to drop you a line to congratulate you on a superb piece of work when it comes to CMSort. I had a large file (130,000,000 bytes - 10,000,000 records ... each a digit from 0 to 999,999,999) which nothing I had would sort. I could play with it in Access97, but nothing else would touch it (and Access didn't like it). So, since the file was created, and managed with Visual Basic (which worked quite well for generating and picking out some data), I didn't feel like writing a homegrown sorting algorithm (I'm experimenting with random numbers, and a means of generating unique values which resulted in the file in question...) Anyhow, your software clocks in at 18 minutes and 16 seconds to sort my data. (I'm running a PII 300, and I used the default settings, ie. CMSort <infile> <outfile>). Very impressive! To think, I was just impressed that the program made it through the data (I ran out of patience with Access97 and killed the process after close to an hour... it wasn't getting anywhere and it was slowing my machine down!) So, to make a long story short... Thanks! You made my day.

 
 

Screenshot

Screenshot of CMSort

 
 

Example 1

Suppose you have a file CUSTOMER.TXT with customer orders as follows. It includes three header lines, which CMSort leaves unsorted when given the option /H=3.
1234567890123456789012345678901234567890123
Cust.   Name         Order           Return
No.                  Date
1004711 Miller & Co. 1999-12-06    1,207.23
1004713 Topsoft      2000-01-04    2,521.95
1004747 MCP & Co.    2000-01-04    7,356.88
1004799 Eftpos       1999-12-06   23,122.56

Execution of
cmsort /H=3 /S=22,10 /N=33,11- CUSTOMER.TXT CUSTOMER.SOR
will sort this file by order date (ascending) and return (descending). The result is:
1234567890123456789012345678901234567890123
Cust.   Name         Order           Return
No.                  Date
1004799 Eftpos       1999-12-06   23,122.56
1004711 Miller & Co. 1999-12-06    1,207.23
1004747 MCP & Co.    2000-01-04    7,356.88
1004713 Topsoft      2000-01-04    2,521.95
Explanation of command line arguments:

/H=3 leave the first three lines (the header) unsorted
/S=22,10 first part of key is a string, beginning at position 22, length 10 bytes, sort ascending (default)
/N=33,11- second part of key is numeric, beginning at position 33, length 11 bytes, sort descending (-)
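To make the key definition concrete, here is a small Python sketch that reproduces the result of Example 1. It is an illustration, not CMSort's implementation: the function name is hypothetical, and stripping the thousands separators before converting to a number is an assumption about how the numeric field is interpreted.

```python
def sort_fixed_columns(lines, header_lines=3):
    """Sort by order date (ascending), then by return (descending)."""
    header, body = lines[:header_lines], lines[header_lines:]

    def key(line):
        # /S=22,10: string key at 1-based position 22, length 10
        order_date = line[21:31]
        # /N=33,11-: numeric key at 1-based position 33, length 11
        # Assumption: commas are thousands separators and are dropped.
        amount = float(line[32:43].replace(",", ""))
        # Negate the number to get a descending order for this key part.
        return (order_date, -amount)

    return header + sorted(body, key=key)


customer = [
    "1234567890123456789012345678901234567890123",
    "Cust.   Name         Order           Return",
    "No.                  Date",
    "1004711 Miller & Co. 1999-12-06    1,207.23",
    "1004713 Topsoft      2000-01-04    2,521.95",
    "1004747 MCP & Co.    2000-01-04    7,356.88",
    "1004799 Eftpos       1999-12-06   23,122.56",
]

for row in sort_fixed_columns(customer):
    print(row)
```

Positions on the CMSort command line are 1-based, so position 22 with length 10 corresponds to the Python slice `[21:31]`.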

 
 

Example 2

This example shows how to ignore duplicate records. Duplicate records are recognized by the defined key, not by the whole line. If you want to exclude duplicate lines, you must perform an additional sort beforehand, using the whole line as the key. The following log file contains user ID, user name, and last access time:
055 Maas       2001-02-05 07:31:55
087 Mechenbier 2001-02-05 08:01:23
024 Hesselbein 2001-02-05 08:15:16
055 Maas       2001-02-05 08:44:24
089 Kruft      2001-02-05 09:05:07
087 Mechenbier 2001-02-05 09:31:13

Execution of
cmsort /S=1,3 /D LOG.TXT LOG.SOR
will sort the log file by user ID (ascending) without duplicates. The result is:
024 Hesselbein 2001-02-05 08:15:16
055 Maas       2001-02-05 08:44:24
087 Mechenbier 2001-02-05 09:31:13
089 Kruft      2001-02-05 09:05:07
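The key-based duplicate handling can be sketched in Python as follows. This is an illustration under assumptions, not CMSort's implementation: the function name is hypothetical, and keeping the last record seen for each key is chosen only because it reproduces the sample output above. Which duplicate CMSort retains in general is not stated here.

```python
def sort_unique_by_key(lines, start=1, length=3):
    """Sort by a fixed-position key and keep one record per key.

    start is 1-based, as in /S=1,3.
    """
    def key(line):
        return line[start - 1:start - 1 + length]

    last_seen = {}
    for line in lines:
        # Assumption: a later record with the same key replaces the earlier one.
        last_seen[key(line)] = line

    return [last_seen[k] for k in sorted(last_seen)]


log = [
    "055 Maas       2001-02-05 07:31:55",
    "087 Mechenbier 2001-02-05 08:01:23",
    "024 Hesselbein 2001-02-05 08:15:16",
    "055 Maas       2001-02-05 08:44:24",
    "089 Kruft      2001-02-05 09:05:07",
    "087 Mechenbier 2001-02-05 09:31:13",
]

for row in sort_unique_by_key(log):
    print(row)
```

Because only the first three bytes form the key, the two records for user 055 count as duplicates even though their access times differ.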

 
 

Run-time Comparison

Using an input file with 300,000 records of 9-digit integers (total file size 3,300,000 bytes) on a 266 MHz Pentium with 32 MB RAM running Windows 98, the following times were measured for CMSort and for CUSORT (a relatively fast DOS sorting tool).

Program  Memory Usage  Elapsed Time  Notes
CUSORT          24 KB  547 seconds   75 MB additional memory on HDD needed
CMSort          24 KB   32 seconds   6.5 MB on HDD needed
CMSort         100 KB   31 seconds
CMSort        1024 KB   30 seconds

 
 

Download

The archive file cmsort.zip contains the following files:

Before downloading CMSort, you have to accept my license and disclaimer agreement.

Download CMSort release 1.6 here (57 KB ZIP archive).
