Re: sort email by date

by Ted Davis <tdavis@[EMAIL PROTECTED] > Feb 25, 2008 at 08:21 AM

On Sun, 24 Feb 2008 19:47:46 -0800, droid wrote:

> I have "received" emails from the early nineties to present.  These are
> concatenated together in two huge files.
> 
> My problem is that segments of one file need to be inserted into the
> other file.  Also, there would be many dupes.
> 
> I was hoping to find a utility that could do the sorting after joining
> them with 'cat', then use Thunderbird to remove the dupes.
> 
> But from the replies here, it appears this is more complicated than I
> supposed.

A better approach, based on personal experience, is to split the files
into separate year files, then let Thunderbird sort them any way you want
for index display.  I've been doing this for people who never clean out
their inbox for quite a few years.  It's fairly easy in awk ... provided
the header marker is either constant or there are a limited number of
them.

I haven't written a script for Thunderbird files yet, but I need to -
maybe I can do it today.

....

This script, written in full procedural format for ease of conversion to
other formats, splits Thunderbird mailboxes into year files.

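BEGIN{
	OutFile = ""
}
{
# First line of header recognition and year extraction (Thunderbird)
	if( $0 ~ /^From - / ) {
# Verification: 7 fields, the last of which is all numbers.
		if( NF == 7 ) {
			if( $7 !~ /[^0-9]/ ) {
# If the filename changes, it is necessary to close the previous one.
# Compare full filenames: $7 alone never equals OutFile.
				NewFile = $7 ".mailbox"
				if( NewFile != OutFile ) {
					if( OutFile != "" ) {
						close( OutFile )
					}
					OutFile = NewFile
				}
			}
		}
	}
# Append (>>) so that reopening a year file does not truncate it;
# delete old *.mailbox files before rerunning.  Lines before the
# first header are skipped because there is no file for them yet.
	if( OutFile != "" ) {
		print $0 >> OutFile
	}
}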

The first line of the headers is in this format

From - Thu Feb 02 09:10:14 2006
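A quick way to see why the script tests for seven fields is to let awk split that sample line with its default whitespace field separator and print the field count and the last field:

```sh
# Split the sample header line on whitespace and show NF and the year field.
echo 'From - Thu Feb 02 09:10:14 2006' | awk '{ print NF, $7 }'
# prints: 7 2006
```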

To help prevent triggering on spurious lines in messages and forwards from
Outlook, the line is identified, then tested for number of fields and for
a pure number in the seventh field.  Since I wrote it in a very open
format, with each test on a separate line, it should be reasonably easy to
convert it for other first line formats.
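As a rough sketch of such a conversion (not something I have run against real mailboxes), the marker pattern can be passed in as a variable. The variable name Marker, the file sample.mbox, and the addresses in it are made up for illustration; the example assumes the classic "From sender date" first line, which also has seven fields with the year last. This version compares full output filenames and appends with >>, so a year that shows up again later in the input is added to rather than overwritten.

```sh
# Hypothetical sample: two messages plus a body line that starts with "From "
# and has 7 fields, but whose last field is not a number.
printf '%s\n' \
  'From alice@example.com Thu Feb 02 09:10:14 2006' \
  'Subject: first' \
  '' \
  'From the desk of Bob, with thanks' \
  'From bob@example.com Mon Mar 05 10:00:00 2007' \
  'Subject: second' > sample.mbox

# Output is appended, so remove results of any earlier run first.
rm -f 2006.mailbox 2007.mailbox

awk -v Marker='^From ' '
{
# Same three tests as before, with the marker supplied from outside.
	if( $0 ~ Marker ) {
		if( NF == 7 ) {
			if( $7 !~ /[^0-9]/ ) {
				NewFile = $7 ".mailbox"
				if( NewFile != OutFile ) {
					if( OutFile != "" ) close( OutFile )
					OutFile = NewFile
				}
			}
		}
	}
# Nothing is written until the first valid header has been seen.
	if( OutFile != "" ) print $0 >> OutFile
}
' sample.mbox
```

With this input, the "desk of Bob" line passes the marker and field-count tests but fails the all-digits test, so it stays in 2006.mailbox as part of the first message. If the year is not the last field in some other format, the field number and count would have to change as well.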

-- 
T.E.D. (tdavis@[EMAIL PROTECTED])
 



