"c gordon liddy" <c@[EMAIL PROTECTED]
> writes:
> 2 different cats.
>
> I've been going through chp 8 of K&R and wanted to write a standard cat
> function with a little more functionality than existing solns: I want to
> code behavior for the -v switch. It occurs to me that there "should" be
> source out there for this and googled for "cat.c unix source" . The
> second hit I got was this:
>
> /*
> * Concatenate files.
> */
>
> #include <stdio.h>
> #include <sys/types.h>
> #include <sys/stat.h>
>
> char stdbuf[BUFSIZ];
>
> main(argc, argv)
> char **argv;
> {
> int fflg = 0;
> register FILE *fi;
> register c;
> int dev, ino = -1;
> struct stat statb;
>
> setbuf(stdout, stdbuf);
> for( ; argc>1 && argv[1][0]=='-'; argc--,argv++) {
> switch(argv[1][1]) {
>
> Holy smokes! This must count as archeology for unix systems.
> Funky-looking main call and register as a type as opposed to storage
> specifier. This gave me a pretty good idea what I wasn't looking for.
Yes, that code is archaic, but it's of some historical interest (to
show how much the language has improved if nothing else).
It makes heavy use of "implicit int", which is discouraged in C90
(though the standard doesn't say so) and was dropped completely in C99.
In "register c;", register is a storage specifier, just as it is in
modern C; the declaration is equivalent to "register int c;".
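To make the comparison concrete, here is a small sketch of the same kind
of declarations with every type spelled out, as C99 and later require.
The function name has_leading_option is made up for illustration; only
the old-style lines shown in the comment come from the quoted source.

```c
/* Old style from the quoted source (relies on implicit int):
 *
 *     main(argc, argv)
 *     char **argv;
 *     {
 *         register c;
 *
 * The same kind of declarations with the types written out.
 * has_leading_option is a hypothetical helper, not part of cat.c;
 * it mirrors the test "argc>1 && argv[1][0]=='-'" from the old loop. */
int has_leading_option(int argc, char **argv)
{
    register int c;   /* "register" is the storage class, "int" the type */

    if (argc <= 1)
        return 0;     /* no arguments after the program name */
    c = argv[1][0];
    return c == '-';
}
```

A modern main would likewise be declared as
"int main(int argc, char **argv)".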
> I then hit on:
>
> http://www.openbsd.org/cgi-bin/cvsweb/src/bin/cat/cat.c?rev=1.14&content-type=text/plain
>
> The first thing a person notices is the stack of non-standard headers.
> Their inclusion is the usual reason for lack of topicality of unix
> questions. My platform and my target consist of my non-unix machine;
> not only do I not know what's in those headers, I don't have them.
Most of the non-standard stuff is probably not strictly necessary, but
it can be used to improve performance or to provide various bells and
whistles.
> Past that is the main control:
> while ((ch = getopt(argc, argv, "benstuv")) != -1)
[...]
getopt is non-standard, as I'm sure you know.
[...]
> The only case I'm to consider is 'v', so I won't need all of this.
> getopt will be something that I have to code from scratch. Out of
> curiosity, what header is it defined in?
It varies. Consult your system's documentation, or Google it. If
your system doesn't provide it, there are open-source implementations
out there. (For that matter, there are plenty of open-source
implementations of "cat", but I suppose that would defeat your
purpose.)
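If -v really is the only option you care about, you may not need getopt
at all. The following is a rough sketch, in standard C only, of one way
to scan the arguments by hand. The name parse_v_flag and its return
convention are my own invention, and it makes no attempt to reproduce
everything getopt does.

```c
#include <string.h>

/* Hypothetical stand-in for getopt() when "-v" is the only option.
 * Sets *vflag to 1 if -v is seen, 0 otherwise.  Returns the index of
 * the first operand (file name), or -1 if an unrecognised option
 * letter is found. */
int parse_v_flag(int argc, char **argv, int *vflag)
{
    int i;

    *vflag = 0;
    for (i = 1; i < argc; i++) {
        const char *arg = argv[i];
        const char *p;

        if (arg[0] != '-' || arg[1] == '\0')
            break;              /* an operand; a lone "-" is left for the caller */
        if (strcmp(arg, "--") == 0) {
            i++;                /* explicit end of options */
            break;
        }
        for (p = arg + 1; *p != '\0'; p++) {
            if (*p != 'v')
                return -1;      /* unknown option letter */
            *vflag = 1;
        }
    }
    return i;
}
```

A caller would check for -1, then treat argv[i] through argv[argc-1] as
the files to copy, reading stdin if there are none.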
> Moving along is:
> if (bflag || eflag || nflag || sflag || tflag || vflag)
> cook_args(argv);
> , so if any flag gets set we cook the args. Maybe instead, we cook with
> the args. In this process we traverse through:
>
> } else if (vflag)
> { if (!isascii(ch))
> { if (putchar('M') == EOF || putchar('-') == EOF)
> break;
> ch = toascii(ch);
> }
> if (iscntrl(ch)) {
> if (putchar('^') == EOF ||
> putchar(ch == '\177' ? '?' :
> ch | 0100) == EOF)
> break;
> continue;
> }
> I did my best to get this on the screen. The parts I don't understand
> here follow the double pipe, which I read as "inclusive or." In the
> first if clause, it would appear that 'M' is substituted for non-ascii
> chars. What does
> || putchar('-') == EOF)
> do beyond this?
It uses a common convention (at least it's common on Unix) for
displaying non-printable characters. Control characters in the range
0 to 31 are represented as a '^' followed by another character,
usually an uppercase letter; it's determined by adding 64 to the
value. (On old keyboards, the control key actually worked by clearing
a bit in the 7-bit or 8-bit value that was transmitted.) The DEL
character, 127, is represented as ^?; this is a special case.
Characters with the high bit set, in the range 128 to 255, are called
"meta" characters (some old keyboards had a "meta" key that set this
bit), and are represented as "M-" followed by the representation of
the corresponding 7-bit character. For example, character 129 would
be printed as M-^A.
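putchar() returns EOF on failure. Since || evaluates its left operand
first, the condition writes 'M', then writes '-' only if the first
write succeeded, and the loop is abandoned if either write fails.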
All this (except the EOF part) is very specific to the ASCII character
set, something that's not specified by the C standard, but it should
give you enough information to understand what the code is doing (with
a bit of work).
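Here is a minimal sketch of the convention described above, written as
a function that fills a buffer instead of calling putchar so that it is
easy to try out. It assumes ASCII, the name show_visible is made up,
and it treats every byte uniformly, so it is not a drop-in copy of what
the OpenBSD code does.

```c
#include <stddef.h>

/* Write the "^X" / "M-" style representation of byte ch into out,
 * which must have room for at least 5 chars ("M-^?" plus the '\0').
 * Returns the number of characters written, not counting the '\0'.
 * Assumes ASCII; show_visible is a hypothetical name. */
size_t show_visible(unsigned char ch, char *out)
{
    size_t n = 0;

    if (ch >= 128) {            /* "meta": high bit set */
        out[n++] = 'M';
        out[n++] = '-';
        ch -= 128;              /* continue with the 7-bit character */
    }
    if (ch < 32) {              /* control character: ^@ through ^_ */
        out[n++] = '^';
        out[n++] = (char)(ch + 64);
    } else if (ch == 127) {     /* DEL is the special case */
        out[n++] = '^';
        out[n++] = '?';
    } else {
        out[n++] = (char)ch;    /* printable: shown as itself */
    }
    out[n] = '\0';
    return n;
}
```

With this, character 129 comes out as "M-^A", character 1 as "^A", and
character 127 as "^?".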
> Similarly, I'm out of my depth with what follows the double pipe in the
> second if clause.
> || putchar(ch == '\177' ? '?' : ch | 0100) == EOF)
>
> Wouldn't \177 be a tri-graph? A perfectly-acceptable explanation might
> be that it's beyond the scope of my present endeavor and can be omitted.
No, it's not a trigraph; trigraphs are introduced by a double question
mark. It's a character constant that uses an escape sequence. '\177'
is the character whose integer value is 177 in octal, or 127 in
decimal; it's the ASCII DEL character. "ch | 0100" yields the value
of ch with the 0100 bit (64 in decimal) forced on; it's a terse way of
mapping control-A (1) to 'A' and so forth. The conditional expression
is used
to handle the fact that mapping DEL to "^?" is a special case.
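If it helps, the expression can be pulled out into a tiny function and
tried on a few values. This is only a sketch: caret_partner is a name I
made up, and the results in the comment assume ASCII.

```c
/* The character printed after '^' for an ASCII control character,
 * using the same expression as the quoted code.  For values 0 to 31
 * the 0100 bit is clear, so "ch | 0100" gives the same result as
 * "ch + 64".  caret_partner is a hypothetical name. */
int caret_partner(int ch)
{
    return ch == '\177' ? '?' : ch | 0100;
}

/* Assuming ASCII:
 *   caret_partner(1)      is 'A'   (control-A is shown as ^A)
 *   caret_partner(0)      is '@'   (NUL is shown as ^@)
 *   caret_partner('\177') is '?'   (DEL is shown as ^?)
 */
```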
--
Keith Thompson (The_Other_Keith) <kst-u@[EMAIL PROTECTED]>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

