I need to extract comments from C/C++/Java source files. Normally
comments are thrown away; they are defined as lexer rules like these
(in ANTLR v3):
// Single-line comments
SL_COMMENT
: '//' ( ~('\r'|'\n')* ) '\r'? '\n' {skip();}
;
// Multi-line comments
ML_COMMENT
: '/*' (options {greedy=false;} : .)* '*/' {skip();}
;
Basically I need something like this:
comment : sl_comment | ml_comment ;
sl_comment : '//' COMMENT_LINE ;
ml_comment : '/*' COMMENT_TEXT '*/' ;
How do I define COMMENT_LINE and COMMENT_TEXT? It gets more confusing
because I am also ignoring whitespace elsewhere in the grammar.
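To make the goal concrete, here is a minimal sketch in plain Java of the
output I am after. It is hand-written and sits outside ANTLR, so it is not
the grammar I am asking about. The names CommentExtractor, extract and
skipLiteral are made up for illustration, and it assumes that a "//" or
"/*" inside a string or character literal should not count as a comment.

```java
import java.util.ArrayList;
import java.util.List;

public class CommentExtractor {

    /** Returns every comment in src, in source order, delimiters included. */
    public static List<String> extract(String src) {
        List<String> comments = new ArrayList<>();
        int n = src.length();
        int i = 0;
        while (i < n) {
            char c = src.charAt(i);
            if (c == '"' || c == '\'') {
                // Skip literals so that "//" inside a string is not a comment.
                i = skipLiteral(src, i, c);
            } else if (src.startsWith("//", i)) {
                // Single-line comment: runs up to, but not including, the line end.
                int end = i;
                while (end < n && src.charAt(end) != '\n' && src.charAt(end) != '\r') {
                    end++;
                }
                comments.add(src.substring(i, end));
                i = end;
            } else if (src.startsWith("/*", i)) {
                // Multi-line comment: runs to the first "*/" (non-greedy),
                // or to end of input if it is never closed.
                int close = src.indexOf("*/", i + 2);
                int end = (close < 0) ? n : close + 2;
                comments.add(src.substring(i, end));
                i = end;
            } else {
                i++;
            }
        }
        return comments;
    }

    /** Returns the index just past the literal that opens at start. */
    private static int skipLiteral(String src, int start, char quote) {
        int i = start + 1;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '\\') {
                i += 2; // skip the escaped character
                continue;
            }
            if (c == quote || c == '\n') {
                return i + 1; // closed, or unterminated at end of line
            }
            i++;
        }
        return src.length();
    }
}
```

What I would like is the same list of comments, but produced by the ANTLR
lexer/parser rather than by a separate scanner like this one.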
Any hints or pointers would be appreciated,
-- Amal