Regexp::Common::comment -- provide regexes for comments.
use Regexp::Common qw /comment/;
while (<>) {
    /$RE{comment}{C}/    and print "Contains a C comment\n";
    /$RE{comment}{C++}/  and print "Contains a C++ comment\n";
    /$RE{comment}{PHP}/  and print "Contains a PHP comment\n";
    /$RE{comment}{Java}/ and print "Contains a Java comment\n";
    /$RE{comment}{Perl}/ and print "Contains a Perl comment\n";
    /$RE{comment}{awk}/  and print "Contains an awk comment\n";
    /$RE{comment}{HTML}/ and print "Contains an HTML comment\n";
}
use Regexp::Common qw /comment RE_comment_HTML/;
while (<>) {
    $_ =~ RE_comment_HTML() and print "Contains an HTML comment\n";
}
Please consult the Regexp::Common manpage for a general description of how this interface works.
Do not use this module directly, but load it via Regexp::Common.
This module gives you regular expressions for comments in various languages.
Below, the comments of each of the languages are described.
The patterns are available as $RE{comment}{LANG}, for each language LANG. Some languages have variants; the entry for each language describes how to get the patterns for its variants.
Unless mentioned otherwise, {-keep} sets $1, $2, $3 and $4 to the entire comment, the opening marker, the content of the comment, and the closing marker (for many languages, the latter is a newline) respectively.
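As a minimal sketch of the {-keep} behaviour described above, the following applies the C pattern to a one-line string and prints the four captures. The sample input and the variable names are illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical sample input: one line of C with a trailing comment.
my $source = "int x = 1; /* set x */\n";

my ($whole, $open, $body, $close);
if ($source =~ /$RE{comment}{C}{-keep}/) {
    # Copy the captures right away, before another match can overwrite them.
    ($whole, $open, $body, $close) = ($1, $2, $3, $4);
    print "entire comment: [$whole]\n";
    print "opening marker: [$open]\n";
    print "content:        [$body]\n";
    print "closing marker: [$close]\n";
}
```

For patterns whose entry says that only $1 is set, the remaining captures should not be relied upon.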
Comments start with a backslash (\), and last till the end of the line. See http://homepages.cwi.nl/%7Esteven/abc/.
Comments start with --, and last till the end of the line.
Comments start with # or //, and last till the end of the line.
Comments start with ; and last till the end of the line. See also http://www.wurb.com/if/devsys/12.
Comments start with --, and last till the end of the line. See also http://w1.132.telia.com/~u13207378/alan/manual/alanTOC.html.
Comments start with the keyword comment, and end with a ;. See http://www.masswerk.at/algol60/report.htm.
Comments are delimited either by #, or by one of the keywords co or comment. The keywords should not be part of another word. See http://westein.arb-phys.uni-dortmund.de/~wb/a68s.txt. With {-keep}, only $1 will be set, returning the entire comment.
Comments start with /* and end with */.
Comments start with # and end at the end of the line.
Comments start with /* and end with */.
For the mvEnterprise variant of BASIC, the pattern is available as $RE{comment}{BASIC}{mvEnterprise}. Comments in this language start with a !, a * or the keyword REM, and last till the end of the line. See http://www.rainingdata.com/products/beta/docs/mve/50/ReferenceManual/Basic.pdf.
With {-keep}, $1 will be set to the entire comment. This pattern requires perl 5.8.0 or newer.
Comments start with // and continue till the end of the line. See also http://www.catseye.mb.ca/esoteric/b-juliet/index.html.
Comments start and end with a ;. See http://www.catseye.mb.ca/esoteric/befunge/98/spec98.html.
Comments start with <?c_ and end with c_?>. See http://www.livejournal.com/doc/server/bml.index.html.
The language uses only eight characters: <, >, [, ], +, -, . and ,. Any other characters are considered comments. With {-keep}, $1 is set to the entire comment.
Comments start with /* and end with */.
Comments start with /* and end with */. See http://cs.uas.arizona.edu/classes/453/programs/C--Spec.html.
Comments either start with // and last till the end of the line, or start with /* and end with */. If {-keep} is used, only $1 will be set, and it is set to the entire comment.
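To illustrate a pattern that accepts both comment forms and sets only $1 under {-keep}, here is a small sketch using the C++ pattern named in the synopsis. The sample source text is made up, and quoting the hash key as 'C++' outside of a regex is a precaution taken in this sketch rather than something the text above prescribes.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical input containing one comment of each form.
my $source = "int a = 1; // line comment\nint b = 2; /* block\n   comment */\n";

# Fetch the pattern once; with {-keep}, $1 holds the entire comment.
my $cpp = $RE{comment}{'C++'}{-keep};

my @comments;
while ($source =~ /$cpp/g) {
    push @comments, $1;
}

print "Found ", scalar(@comments), " comments\n";
print "[$_]\n" for @comments;
```

The match is purely lexical, so a // or /* sequence inside a string literal would be reported as a comment as well.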
Comments either start with // and last till the end of the line, or start with /* and end with */. If {-keep} is used, only $1 will be set, and it is set to the entire comment. See http://msdn.microsoft.com/library/default.asp.
Comments start with (*, end with *), and can be nested. See http://www.cs.caltech.edu/courses/cs134/cs134b/book.pdf and http://pauillac.inria.fr/caml/index-eng.html.
Comments either start with // and last till the end of the line, or start with /* and end with */. If {-keep} is used, only $1 will be set, and it is set to the entire comment. See http://developer.nvidia.com/attach/3722.
In CLU, a comment starts with a percent sign (%), and ends with the next newline. See ftp://ftp.lcs.mit.edu:/pub/pclu/CLU-syntax.ps and http://www.pmg.lcs.mit.edu/CLU.html.
Comments start with a semicolon (;) and last till the end of the line. See http://www.rbnn.com/cql/.
Comments start with //, and end with the end of the line.
Comments either start with //, or are nested comments delimited with /* and */. Under {-keep}, only $1 will be set, returning the entire comment. This pattern requires perl 5.6.0 or newer.
Comments either start with // and last till the end of the line, or start with /* and end with */. If {-keep} is used, only $1 will be set, and it is set to the entire comment. JavaScript is Netscape's implementation of ECMAScript. See http://www.ecma-international.org/publications/files/ecma-st/Ecma-262.pdf, and http://www.ecma-international.org/publications/standards/Ecma-262.htm.
Comments start with --, and last till the end of the line.
Comments start with { and end with }. See http://wouter.fov120.com/false/false.txt
Comments either start with // and last till the end of the line, or start with /* and end with */. If {-keep} is used, only $1 will be set, and it is set to the entire comment.
Comments start with \, and end with the end of the line. See also http://docs.sun.com/sb/doc/806-1377-10.
Comments start with !, and end at the end of the line. The pattern for this is given by $RE{comment}{Fortran}. Fixed form Fortran, which has been obsoleted, has comments that start with C, c or * in the first column, or with ! anywhere but the sixth column. The pattern for this is given by $RE{comment}{Fortran}{fixed}. See also http://www.cray.com/craydoc/manuals/007-3692-005/html-007-3692-005/.
Comments start and end with a ;.
Comments start with # and last for the rest of the line.
Comments start and end with a ,. See http://www.dangermouse.net/esoteric/haifu.html.
Comments are delimited with {- and -}. Under {-keep}, only $1 will be set, returning the entire comment. This pattern requires perl 5.6.0 or newer.
A comment declaration starts with <!, and ends with a >. Inside this declaration, we have zero or more comments. Comments start with -- and end with --, and are optionally followed by whitespace. The pattern $RE{comment}{HTML} recognizes those comment declarations (and hence more than a comment). Note that this is not the same as something that starts with <!-- and ends with -->, because the following will be matched completely:
<!-- First Comment -- --> Second Comment <!-- -- Third Comment -->
Do not be fooled by what your favourite browser thinks is an HTML comment.
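If {-keep} is used, the following are returned:
$1 captures the entire comment declaration.
$2 captures the MDO (markup declaration open), <!.
$3 captures the content between the MDO and the MDC.
$4 captures the (last) comment, without the surrounding dashes.
$5 captures the MDC (markup declaration close), >.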
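The following sketch runs the HTML pattern against the sample string quoted above, to show that the whole string is treated as one comment declaration. The capture numbering used for $2, $4 and $5 reflects my understanding of this pattern's {-keep} behaviour and should be treated as an assumption; the variable names are illustrative.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# The sample string from the note on comment declarations.
my $html = '<!-- First Comment -- --> Second Comment <!-- -- Third Comment -->';

my ($declaration, $mdo, $last_comment, $mdc);
if ($html =~ /$RE{comment}{HTML}{-keep}/) {
    # Assumed numbering: $1 whole declaration, $2 "<!", $4 last comment, $5 ">".
    ($declaration, $mdo, $last_comment, $mdc) = ($1, $2, $4, $5);
    print "matched everything\n" if $declaration eq $html;
    print "opening: [$mdo]\n";
    print "last comment: [$last_comment]\n";
    print "closing: [$mdc]\n";
}
```

A browser would typically treat "Second Comment" as visible text here, which is the difference the note warns about.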
Comments either start with ! (which cannot be followed by a \), or are nested comments delimited with !\ and \!. Under {-keep}, only $1 will be set, returning the entire comment. This pattern requires perl 5.6.0 or newer.
Comments start with # and end at the next newline. See http://www.toolsofcomputing.com/IconHandbook/IconHandbook.pdf, http://www.cs.arizona.edu/icon/index.htm, and http://burks.bton.ac.uk/burks/language/icon/index.htm.
Comments start with one of the keywords NOT or N'T, and can optionally be preceded by the keywords DO and PLEASE. If both keywords are used, PLEASE precedes DO. Keywords are separated by whitespace.
Comments start with NB., and last till the end of the line. See http://www.jsoftware.com/books/help/primer/contents.htm, and http://www.jsoftware.com/.
Comments either start with // and last till the end of the line, or start with /* and end with */. If {-keep} is used, only $1 will be set, and it is set to the entire comment.
Comments start with /** and end with */. If {-keep} is used, only $1 will be set, and it is set to the entire comment. See http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html#format.
Comments either start with // and last till the end of the line, or start with /* and end with */. If {-keep} is used, only $1 will be set, and it is set to the entire comment. JavaScript is Netscape's implementation of ECMAScript. See http://www.mozilla.org/js/language/E262-3.pdf, and http://www.mozilla.org/js/language/.
Comments start with % and end at the end of the line.
Comments start with a semicolon (;) and last till the end of the line.
Comments start with /* and end with */.
Comments start with ;, and last till the end of the line.
Comments start with --, and last till the end of the line. See also http://www.lua.org/manual/manual.html.
In M (aka MUMPS), comments start with a semi-colon, and last till the end of a line. The language specification requires the semi-colon to be preceded by one or more linestart characters. Those characters default to a space, but that's configurable. This requirement of preceding the comment with linestart characters is not tested for. See ftp://ftp.intersys.com/pub/openm/ism/ism64docs.zip, http://mtechnology.intersys.com/mproducts/openm/index.html, and http://mcenter.com/mtrc/index.html.
Comments start with # and continue to the end of the line, including the newline. The pattern $RE{comment}{m4} matches such comments. In m4, it is possible to change the starting token though. See http://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf, http://www.cs.stir.ac.uk/~kjt/research/pdf/expl-m4.pdf, and http://www.gnu.org/software/m4/manual/.
In Modula-2, comments start with (*, and end with *). Comments may be nested. See http://www.modula2.org/.
In Modula-3, comments start with (*, and end with *). Comments may be nested. See http://www.m3.org/.
Comments start with # and last for the rest of the line.
There are one-line comments starting with # (like Perl), and multiline comments delimited by /* and */ (like C). Under {-keep}, only $1 will be set. See also http://www.nickle.org.
Comments start with (* and end with *). See http://www.oberon.ethz.ch/oreport.html.
$RE{comment}{Pascal}
Comments start with either { or (*, and end with } or *). This means that {*) and (*} are considered to be comments. Many Pascal applications don't allow this. See http://www.pascal-central.com/docs/iso10206.txt
$RE{comment}{Pascal}{Alice}
Comments start with { and end with }. Comments are not allowed to contain newlines. See http://www.templetons.com/brad/alice/language/.
$RE{comment}{Pascal}{Delphi}, $RE{comment}{Pascal}{Free} and $RE{comment}{Pascal}{GPC}
Comments either start with // and last till the end of the line, are delimited with { and }, or are delimited with (* and *). Patterns for those comments are given by $RE{comment}{Pascal}{Delphi}, $RE{comment}{Pascal}{Free} and $RE{comment}{Pascal}{GPC} respectively. These patterns only set $1 when {-keep} is used, which will then include the entire comment. See http://info.borland.com/techpubs/delphi5/oplg/, http://www.freepascal.org/docs-html/ref/ref.html and http://www.gnu-pascal.de/gpc/.
$RE{comment}{Pascal}{Workshop}
Comments are delimited with { and }, delimited with (* and *), delimited with /* and */, or start and end with a double quote ("). When {-keep} is used, only $1 is set, and returns the entire comment.
Comments start with ! and last till the end of the line, or start with /* and end with */. With {-keep}, $1 will be set to the entire comment.
Comments start with # or // and last till the end of the line, or are delimited by /* and */. With {-keep}, $1 will be set to the entire comment.
Comments start with either . or ;, and end with the next newline. See http://www.mmcctech.com/pl-b/plb-0010.htm.
Comments start with /* and end with */.
Comments start with -- and run till the end of the line, or start with /* and end with */.
Comments start with #, and continue till the end of the line.
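As an illustration of a single-line comment pattern, this sketch collects every comment from a short piece of Perl source using the Perl pattern from the synopsis. The sample source and variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical Perl source; a single-quoted heredoc avoids interpolation.
my $code = <<'END';
my $x = 1; # first
my $y = 2;
# second
END

# With {-keep}, $1 is the entire comment, including the closing newline.
my @comments;
while ($code =~ /$RE{comment}{Perl}{-keep}/g) {
    push @comments, $1;
}

print "Found ", scalar(@comments), " comments\n";
print $_ for @comments;
```

Because the pattern does no parsing, a # inside a string or regex in the scanned source would also be picked up as the start of a comment.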
Comments start with //, and last till the end of the line.
Comments start with #, and continue till the end of the line.
Comments start with ` (a backtick), and continue till the end of the line.
In QML, comments start with # and last till the end of the line. See http://www.questionmark.com/uk/qml/overview.doc.
Comments start with # and end with the following newline. See http://www.r-project.org/.
Comments start with ; and last till the end of the line.
Comments start with # and last till the end of the line.
Comments start with ;, and last till the end of the line. See http://schemers.org/.
Comments start with # and end at the end of the line.
Comments start and end with a ;. See http://www.catseye.mb.ca/esoteric/shelta/index.html.
There are two forms of comments. First, there is the line comment, which starts with a # and includes the rest of the line (just like Perl). Second, there is the multiline, nested comment, which is delimited by (* and *). Under {-keep}, only $1 is set, and is set to the entire comment. This pattern needs at least Perl version 5.6.0. See http://www.cs.berkeley.edu/~ug/slide/docs/slide/spec/spec_frame_intro.shtml.
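Comments start with % and last for the rest of the line.
Comments start and end with a double quote (").
Comments start with ;, and last till the end of the line.
Comments start and end with ". Double quotes can appear inside comments by doubling them.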
MySQL does not follow the standard. Instead, it allows comments that start with a # or -- (that's two dashes and a space) ending with the following newline, and comments starting with /*, and ending with the next ; or */ that isn't inside single or double quotes. A pattern for this is returned by $RE{comment}{SQL}{MySQL}. With {-keep}, only $1 will be set, and it returns the entire comment.
Comments start with # and continue till the end of the line.
Comments start with % and end at the end of the line.
Comments start with \", and continue till the end of the line.
Comments start with // and continue to the end of the line. See http://www.ubercode.com.
Comments start with ", and end at the end of the line.
Comments start with ||, and end with !!.
Comments start with ;, and continue till the end of the line.
Comments start with a ' character, and end at the following newline. See http://dave2.rocketjump.org/rad/zzthelp/lang.html.
See the Regexp::Common manpage for a general description of how to use this interface.
Damian Conway (damian@conway.org)
This package is maintained by Abigail (regexp-common@abigail.be).
Bound to be plenty.
For a start, there are many common regexes missing. Send them in to regexp-common@abigail.be.
This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail.
This module is free software, and may be used under any of the following licenses:
1) The Perl Artistic License. See the file COPYRIGHT.AL.
2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2.
3) The BSD Licence. See the file COPYRIGHT.BSD.
4) The MIT Licence. See the file COPYRIGHT.MIT.