Re: checking for corrupted files

by Ed Morton <morton@[EMAIL PROTECTED] > Jan 29, 2008 at 09:42 PM

On 1/29/2008 4:09 PM, Seb wrote:
> Hi,
> 
> I would like to check if some simple text files have been corrupted.  A
> manual/visual check of a few of the files shows that some of them
> contain "garbage" characters in them.  I can't directly "see" what those
> characters are, but they can be found at any part of the file.  The
> information I'm after is simply the name of the file that is corrupted.
> So I thought the following would do:
> 
> awk '/[^[:alnum:]]/ {print FILENAME}' *
> 
> Since, if IIUC, [:alnum:] represents all alphabet letters (upper and
> lower case) and all digits, punctuation marks and symbols,

alnum = ALpha NUMeric, i.e. alphabetic and numeric characters, no
punctuation marks, symbols or anything else.
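
To see what that means for the original command, here is a small sketch. The helper name has_non_alnum is made up for illustration, and it assumes an awk that understands POSIX character classes (gawk does). It reports whether a line contains at least one character outside [:alnum:]:

```shell
# Hypothetical helper: succeeds (exit status 0) if the given text contains
# any character that is not a letter or a digit.
has_non_alnum() {
    printf '%s\n' "$1" |
        awk '/[^[:alnum:]]/ {found = 1} END {exit (found ? 0 : 1)}'
}

# Ordinary text is flagged because of the space and the period.
if has_non_alnum 'Hello world.'; then
    echo 'flagged: Hello world.'
fi

# Only a line made purely of letters and digits passes.
if ! has_non_alnum 'Hello123'; then
    echo 'clean: Hello123'
fi
```

So /[^[:alnum:]]/ would most likely print the name of nearly every text file, since almost any line has a space or punctuation mark in it.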

> which are
> part of the uncorrupted files.  Basically, print the file name if a line
> does NOT match any of these characters.  Is this a good way to spot
> those corrupted files?

Beats me, since I don't know what your files can legally contain and so
don't know what it means for them to be "corrupted", but to detect control
characters you'd use the "[:cntrl:]" character class:

awk '/[[:cntrl:]]/ {print FILENAME}' *

Note though, that the presence of control characters doesn't always mean a
file's been corrupted (e.g. people use control-Ls to separate functions in
source code to force form-feed to printers). Maybe some other character
class or combination of such would be more appropriate. See
http://www.gnu.org/software/gawk/manual/gawk.html#table_002dchar_002dclasses
for the list.
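
As one possible combination, and only a sketch: tabs are control characters too, so /[[:cntrl:]]/ also flags any file that contains a tab. The variant below assumes "corrupted" means "contains a byte that is neither printable nor whitespace", which may or may not fit your files. The function name list_suspect_files is made up for illustration, and LC_ALL=C is an assumption that makes the classes byte-oriented, so non-ASCII bytes get flagged as well. It also prints each file name once rather than once per matching line:

```shell
# Hypothetical helper: print the name of each file that has at least one
# line containing a byte that is neither printable nor whitespace.
# Tabs and form feeds are tolerated because they belong to [:space:].
list_suspect_files() {
    LC_ALL=C awk '/[^[:print:][:space:]]/ && !seen[FILENAME]++ {print FILENAME}' "$@"
}
```

You would call it the same way as the one-liner above, e.g. list_suspect_files * in the directory holding the files. If tabs or form feeds should count as corruption in your case, plain [[:cntrl:]] is the simpler choice.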

	Ed.




