GaryScott wrote:
(snip)
> If I have control over the file format, I generally mark up the data
> file to indicate the type of data on each record (millions of ways to
> do that). You could just begin each line with a process control text
> string like:
> Title: This is the title
> Data: a=1.2, b=6.0...
(snip)
> and so on. If you have a large number of these, it would probably be
> faster than scanning the entire line for valid/invalid characters.
I know many programs that have a title line, then many data
lines, then another title line (for the next batch) and
many data lines, etc. Title lines should be rare enough that
the time needed to scan one isn't significant, but if the count
of data lines is wrong, it is nice to know.
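
For what it's worth, here is a rough Python sketch of reading a file
tagged that way and grouping the data lines under their title, so the
count per batch can be checked. The names read_batches and parse_data
are made up for illustration, and it assumes each data line holds
comma-separated name=value pairs with numeric values, as in the quoted
example. The real format may well differ.

```python
def parse_data(text):
    """Turn 'a=1.2, b=6.0' into {'a': 1.2, 'b': 6.0} (assumed field layout)."""
    fields = {}
    for item in text.split(","):
        name, _, value = item.partition("=")
        fields[name.strip()] = float(value)
    return fields


def read_batches(lines):
    """Group tagged lines into (title, [data records]) batches."""
    batches = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        # Only the text before the first colon is inspected to pick the record type.
        tag, sep, rest = line.partition(":")
        tag = tag.strip()
        if not sep:
            raise ValueError(f"line {number}: no record tag")
        if tag == "Title":
            batches.append((rest.strip(), []))
        elif tag == "Data":
            if not batches:
                raise ValueError(f"line {number}: data before any title")
            batches[-1][1].append(parse_data(rest))
        else:
            raise ValueError(f"line {number}: unknown tag {tag!r}")
    return batches


if __name__ == "__main__":
    sample = [
        "Title: This is the title",
        "Data: a=1.2, b=6.0",
        "Data: a=2.0, b=3.5",
    ]
    for title, records in read_batches(sample):
        # Compare len(records) with whatever count the batch should have.
        print(title, len(records))
```

The caller can compare len(records) for each batch against the count it
expects and complain if they differ.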
> You can add other types of data as well. If you want to confuse
> somebody, you can make it look like html...
> <title>This is title 1
> <data>a=1.2, b=6.0...
> Wouldn't that be fun.
Then you should probably go to XML, which is designed for
this problem. Among other things, every open tag needs a
close tag, though there are also empty tags. (In HTML, many
tags don't have a matching close, and early parsers were very
loose on the syntax, resulting in much non-standard code
that must be supported.)
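
As a small illustration of that strictness, here is a sketch using
Python's standard xml.etree.ElementTree parser. The helper name
is_well_formed and the <batch> and <end/> elements are invented for the
example. The HTML-looking version with unclosed tags is rejected, while
the version with matching close tags (and one empty tag) parses.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Tags opened but never closed, as in the HTML-like example.
loose = "<batch><title>This is title 1<data>a=1.2, b=6.0</batch>"

# Every open tag has a close tag; <end/> is an empty tag.
strict = (
    "<batch><title>This is title 1</title>"
    "<data>a=1.2, b=6.0</data><end/></batch>"
)

if __name__ == "__main__":
    print(is_well_formed(loose))
    print(is_well_formed(strict))
```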
-- glen

