On Mar 28, 8:11 am, Keith Thompson <ks...@[EMAIL PROTECTED]> wrote:
> c gordon liddy <grumpy196...@[EMAIL PROTECTED]> writes:
>
> > "Keith Thompson" <ks...@[EMAIL PROTECTED]> wrote in message news:
> > 87hcerqseb....@[EMAIL PROTECTED]
> >> "c gordon liddy" <c...@[EMAIL PROTECTED]> writes:
> [...]
> >>> Similarly, I'm out of my depth with what follows the double pipe in
> >>> the second if clause.
> >>> || putchar(ch == '\177' ? '?' : ch | 0100) == EOF)
>
> >>> Wouldn't \177 be a tri-graph? A perfectly-acceptable explanation
> >>> might be that it's beyond the scope of my present endeavor and can be
> >>> omitted.
>
> >> No, it's not a trigraph; trigraphs are introduced by a double
> >> question mark. It's a character constant that uses an escape
> >> sequence. '\177' is the character whose integer value is 177 in
> >> octal, or 127 in decimal; it's the ASCII DEL character. "ch | 0100"
> >> yields the value of ch with a certain bit forced on; it's a terse way
> >> of mapping control-A (1) to 'A' and so forth. The conditional
> >> expression is used to handle the fact that mapping DEL to "^?" is a
> >> special case.
>
> > I think I could study the above for a long time and not really get
> > it. It's interesting but not germane to something that can be done in
> > standard C. I have a double problem with the double pipe here. Not
> > only is that which is on the right hand side of it obfuscated C, I
> > don't get the control mechanism. To me, it looks like
> > if this then that or the other.
>
> The quoted code, as far as I can tell, *is* standard C. I don't
> believe it's deliberately obfuscated; rather, it's unusually terse,
> written in a style that favors packing lots of information into
> complex expressions rather than breaking it down into separate
> statements.
>
> You can skip it and go on to something easier if you like, but you
> might consider taking one more stab at it.
>
> Let's take a look at the statement:
>
>     if (iscntrl(ch)) {
>         if (putchar('^') == EOF ||
>             putchar(ch == '\177' ? '?' :
>                 ch | 0100) == EOF)
>             break;
>         continue;
>     }
>
>     if ch is a control character then
>         if printing '^' fails *or* printing another character fails then
>             break out of the loop (give up)
>         end if
>         Printing succeeded; nothing more to do here: "continue"
>     end if
>
> iscntrl(ch) returns true if ch is a "control character". In this
> context, it tells us that it's a non-printable character that we want
> to represent as a '^' followed by another character (^G for the ASCII
> BEL character, ^? for DEL).
>
> Within the if statement we see two calls to putchar(), one to print
> the '^' character and one to print whatever follows it. Both results
> are compared against EOF (which indicates failure); if either
> putchar() fails, we break out of the loop.
>
> The part before the "||" is reasonably clear: try to print a '^'
> character and check whether the attempt failed. "||" is a
> short-circuit operator, evaluating its right operand only if the left
> operand is false, so if the first putchar call fails we won't attempt
> the second one.
>
> Now let's look at the part after the "||":
>
> putchar(ch == '\177' ? '?' : ch | 0100) == EOF
>
> We've covered the higher level control flow, so we're down to figuring
> out what the heck
>
> ch == '\177' ? '?' : ch | 0100
>
> means. Some parentheses might make it clearer:
>
> (ch == '\177') ? ('?') : (ch | 0100)
>
> If ch is equal to '\177' (character 177 octal, 127 decimal, ASCII
> DEL), the expression yields '?'. The result is that we print a '?'
> after the '^'.
>
> Otherwise (for any other control character), the result is (ch |
> 0100). 0100, since it begins with '0', is an octal constant, equal to
> 64, a power of 2. "|" is the bitwise "or" operator.
>
> The binary value of 0100 is 01000000. Suppose the value of ch is 7
> (ASCII BEL, which we're going to want to print as "^G"). 7 is
> 00000111. Applying bitwise or to these two operands gives us
> 01000111, which is 0107 in octal, or 71 in decimal, or 'G' in ASCII.
>
> 0100 (octal) is being used as a bit mask; it has a single bit set to
> 1, and all others set to 0. (ch | 0100) yields the value of ch with
> the bit in that particular position turned on. As it happens, that's
> a terse way to specify a transformation from a control character to
> the corresponding letter.
>
> Note that (ch + 64) would have worked just as well in this context
> (since we know the bit we want to turn on isn't already on). The
> author probably chose to write "ch | 0100" because he thought of the
> operation as setting a bit, not as the equivalent addition.
>
> Here's a much more verbose chunk of code that does the same thing.
> I've kept the "ch | 0100" idiom, but expanded everything else. The
> original code is more terse than I tend to like; the following is much
> too verbose for my taste, but it might be clearer. (I've compiled it,
> but I haven't tested it.)
>
>     if (iscntrl(ch)) {
>         /* ch is a control character */
>         int result;
>
>         /*
>          * The two characters we want to print. The first is '^';
>          * we don't know yet what the second is.
>          */
>         int ch1 = '^';
>         int ch2;
>
>         /* Try to print the first character. */
>         result = putchar(ch1);
>         if (result == EOF) {
>             /* Failed, terminate the loop */
>             break;
>         }
>
>         if (ch == '\177') {
>             /* ch is DEL, we want "^?" */
>             ch2 = '?';
>         }
>         else {
>             /*
>              * ch is another control character.
>              * Transform 1 to 'A', 2 to 'B', etc. using
>              * our intimate knowledge of ASCII encoding.
>              */
>             ch2 = ch | 0100;
>         }
>
>         /* Print as above */
>         result = putchar(ch2);
>         if (result == EOF) {
>             break;
>         }
>
>         /* Both characters printed; skip the rest of the loop body */
>         continue;
>     }
>
> --
> Keith Thompson (The_Other_Keith) <ks...@[EMAIL PROTECTED]
>
> Nokia
> "We must do something. This is something. Therefore, we must do this."
> -- Antony Jay and Jonathan Lynn, "Yes Minister"
K&R 7.5 has the text that includes the cat function that is alluded to
in 8.1. The filecopy there uses characters instead of buffers to do
its business. I believe it is better suited to my current task than
using buffers. The part needing revision to account for the -v
behavior appears as an external, void function. Main makes the
adjustment for the output to go to stdout.
/* filecopy */
void filecopy(FILE *ifp, FILE *ofp)
{
    int c;

    while ((c = getc(ifp)) != EOF)
        putc(c, ofp);
}
I don't know whether I'll be able to get the job done with one int, so
I'll put c in reserve and use ch to match the source I snipped from
the bsd site. I've further added symbols to match Keith's verbose
version.
/* filecopy */
void filecopy(FILE *ifp, FILE *ofp)
{
    int c;
    int ch;
    int result;

    while ((ch = getc(ifp)) != EOF)
        putc(ch, ofp);
}
So, I've got to exchange this for the putc statement:
    if (iscntrl(ch)) {
        /* ch is a control character */
        int result;
        /*
         * The two characters we want to print. The first is '^';
         * we don't know yet what the second is.
         */
        int ch1 = '^';
        int ch2;
        /* Try to print the first character. */
        result = putchar(ch1);
        if (result == EOF) {
            /* Failed, terminate the loop */
            break;
        }
        if (ch == '\177') {
            /* ch is DEL, we want "^?" */
            ch2 = '?';
        }
        else {
            /*
             * ch is another control character.
             * Transform 1 to 'A', 2 to 'B', etc. using
             * our intimate knowledge of ASCII encoding.
             */
            ch2 = ch | 0100;
        }
        /* Print as above */
        result = putchar(ch2);
        if (result == EOF) {
            break;
        }
    }
So I think I'm ready to take this to a compiler. I'm on someone
else's laptop. It probably does have a compiler, but its owner is in
an online naval battle. Our girlfriends are at the theatre. I love
theatre when I don't have to go.
Because I have to make the keystrokes, I'll finish with the caller.
No non-standard headers here:
#include <stdio.h>
#include <ctype.h>   /* iscntrl(), needed once filecopy handles -v */

int main(int argc, char **argv)
{
    FILE *fp;
    void filecopy(FILE *, FILE *);

    if (argc < 2)
        printf("die\n");
    else
        while (--argc > 0)
            if ((fp = fopen(*++argv, "r")) == NULL)
            {
                printf("catv can't open %s\n", *argv);
                return 1;
            }
            else
            {
                filecopy(fp, stdout);
                fclose(fp);
            }
    return 0;
}
--
c gordon liddy