RTF::Writer - for generating documents in Rich Text Format
    use RTF::Writer;
    my $rtf = RTF::Writer->new_to_file("greetings.rtf");
    $rtf->prolog( 'title' => "Greetings, hyoomon" );
    $rtf->number_pages;
    $rtf->paragraph(
      \'\fs40\b\i',  # 20pt, bold, italic
      "Hi there!"
    );
    $rtf->close;
This module is for generating documents in Rich Text Format.
This module is a class; an object belonging to this class acts like an output filehandle, and calling methods on it causes RTF text to be written.
Incidentally, this module also exports a few useful functions, upon request.
The following documentation assumes some familiarity with the RTF Specification. Users not already intimately familiar with RTF should look at RTF::Cookbook and/or my book RTF Pocket Guide from O'Reilly, http://www.oreilly.com/catalog/rtfpg/
IO::File->new(...)
).
"Stuff\n\t\tUmmm\n"
causes 'Stuff\line \tab \tab Ummm\line ' to be written. See rtfesc(x) for further details of escaping.
Unless $RTF::Writer::AUTO_NL (normally on) has been turned off, the item written will be followed by a (presumably harmless) newline character if the last character of the string is a digit or a lowercase letter, to delimit any code in there from any following text. This is so that (\'\i', "foo!") emits '\i[newline]foo!' (which does what you expected), instead of '\ifoo!', which looks like an RTF command "ifoo" followed by a plaintext "!".
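The newline rule described above can be sketched in a few lines of plain Perl. This is only an illustration of the rule as stated, not RTF::Writer's own code; the name delimit_code and the local $AUTO_NL variable are hypothetical stand-ins.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $RTF::Writer::AUTO_NL; not the module's variable.
our $AUTO_NL = 1;

# Return an RTF code string, followed by a newline when AUTO_NL is on and
# the code ends in a digit or a lowercase letter.
sub delimit_code {
    my ($code) = @_;
    return $code . "\n" if $AUTO_NL and $code =~ /[a-z0-9]\z/;
    return $code;
}

print delimit_code('\i'),    "foo!\n";   # a newline separates \i from foo!
print delimit_code('\par}'), "foo!\n";   # ends in "}", so nothing is added
```

With the flag turned off, the first call would return '\i' unchanged, and the following text would run into the command name.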
You can nest these array-references, like:
    $h->print(
      \'\col2',
      [ \'\pard',
        "It is now ",
        [ \'\f1',
          scalar(localtime), " local, or ",
          scalar(gmtime), " GMT.",
        ],
        " -- if you're ",
        [ \'\i', "keeping track.", ],
      ],
      \'\par\page',
    );
The return value of the print() method is currently always the value 1, although this may change.
Since emitting a prolog opens a "{"-group, calling $h->prolog(...) sets a flag in $h so that when you call $h->close(), a closing "}" will automatically be written before the stream object is actually closed.
The options to the prolog() method are passed as a list of keys and values, for controlling the contents of the prolog written. The options are listed below, roughly with the most important options first.
(Be careful with the spelling of these options. Some are rather odd, because they are (mostly) based on the name of the relevant RTF command, and a systematic naming scheme for commands is one thing you won't find in RTF!)
\'\froman Times New Roman'. If the value of the "fonts" parameter is a scalar ref, then it is taken to be a reference to code of your own that expresses the whole font table.
If you don't specify a value for the "fonts" option, then you get a font table with one entry, "Times New Roman".
You should be sure to declare all fonts that you switch to in your document (as with \'\f3', to change the current font to what's declared in entry 3 (counting from 0) in the font table).
If you don't stipulate any value for 'colors', then you get a table consisting of three colors: null/default (undef), 100% red ([255,0,0]), and 100% blue ([0,0,255]).
You can freely ignore concerns of color tables if you don't use color-changing codes in your document (like \'\cf2', to switch the text foreground color to what's declared at entry 2 (starting from 0) in the color table).
If no value is specified, then RTF::Writer puts a string noting the value of $0 (typically the filespec to the current Perl program), and the version of RTF::Writer used.
The meanings of all of these are explained in greater detail in the RTF spec.
(stat($thing)[9]) or time(); or you may pass a reference to a timelist, like [localtime($whatever)].
If no defined value for revtime is stipulated in the call to prolog(...), then the current value of time() is used. Explicitly pass a value of undef to suppress emitting any 'creatim' value.
prolog(...) then the current value of time() is used. Explicitly pass a value of undef to suppress emitting any 'creatim' value.
    $h->printf(
      \'{\i "%s"} was found in %2.2f percent of matches\par',
      $word, 100 * $count / $total
    );
The page numbering consists of just putting the page number at the top-right of each page. If you provide items in the list (...), then they are prepended to the page number. Example:
$h->number_pages("Lexicon, p.");
Or:
$h->number_pages(\'\b\fs30\f2', "page ");
The work that a declaration has to do is best explained in this diagram of a bordered three-cell table (first cell containing "Foo ya!"), placed near a left margin (shown as the line of colons). The things in brackets are not on the page, but just for our reference:
    :    [..w1...]
    :            [......w2.......]
    :                            [...w3....]
    [.A..]     [.B.]           [.B.]
    :
    :    +-------+---------------+---------+
    :    | Foo   | Bar baz       | Yee!    |
    :    | ya!   | quuxi quuxo   |         |
    :    |       | quaqua.       |         |
    :    +-------+---------------+---------+
    :
    [.A..]     [.B.]           [.B.]
    [..r1........]
    [.....r2.....................]
    [........r3............................]
Here the horizontal dimensions of the three-celled table are expressed in terms of: A, the distance from the current left margin; B, the minimum distance between the content of the cells (or you can think of this as twice the internal left or right borders in each cell); and then EITHER [w1, w2, w3], expressing the width of each cell, OR [r1, r2, r3], expressing each cell's right end's distance from the current left margin. All distances are, of course, in twips.
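To make the two ways of expressing the same geometry concrete, here is a small sketch that turns a list of widths into the equivalent list of "reaches". It assumes that each reach is simply the left offset A plus the running total of the widths so far; that relationship is my reading of the diagram, and widths_to_reaches is a hypothetical helper, not something RTF::Writer provides.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RTF::Writer): convert cell widths into
# right-edge distances from the left margin. Assumes reach N is the left
# offset plus the sum of the first N widths. All values are in twips.
sub widths_to_reaches {
    my ($left_start, @widths) = @_;
    my $x = $left_start;
    my @reaches;
    for my $width (@widths) {
        $x += $width;
        push @reaches, $x;
    }
    return @reaches;
}

my @reaches = widths_to_reaches(0, 1500, 1900, 1200);
print "@reaches\n";   # prints "1500 3400 4600"
```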
Options to RTF::Writer::TableRowDecl->new( ...options... ) are:
You must provide a value for $trdecl, or a fatal error results.
If you provide fewer items than $trdecl declares cells, then you get empty cells to fill out the row. If you provide more items than $trdecl declares cells, then the width of the last declared cell is used in figuring the width of the additional cells for this row.
Example:
    my $decl = RTF::Writer::TableRowDecl->new('widths' => [1500,1900]);
    $h->row($decl, "Stuff", "Hmmm");
    $h->row($decl, [\'\ul', 'Foo'], 'Bar', \'\bullet');
    $h->row($decl, "Hooboy.");
This creates a table resembling:
    +-------------+-------------------+
    | Stuff       | Hmmm              |
    +-------------+-------------------+-------------------+
    | _Foo_       | Bar               | *                 |
    +-------------+-------------------+-------------------+
    | Hooboy.     |                   |
    +-------------+-------------------+
Note that you MUST NOT use '\par' commands in any items you emit in row cells!
The $h->row(...) method is a wrapper for producing elementary tables in RTF, with the minimum of parameters; the myriad other options that tables can have (for example, changing borders) are not supported. If you really need to generate tables fancier than what $h->row(...) can produce, start off reading the RTF spec, reading the source for row() (and the RTF::Writer::TableRowDecl class), and progress from there. Note that MSWord has been known to crash when given malformed RTF table code.
$h->table($trdecl, [...row1 items...], [...row2 items...], ... );
    $h->paragraph(
      "See here: ",
      $h->image( 'filename' => "foo.png", ),
    );
The legal options are explained below:
(The filename option above is required, but the following options are all generally optional -- although some RTF processors may be finicky if you set some of the following but not others, for no apparent reason. When in doubt, test.)
(The default is to do neither, as you'd get from a cropping value of 0.)
$h->image(...), but has three differences: First, it is a shortcut for this:
$h->paragraph( \'\qc', $h->image( ...params...), );
Secondly, whereas $h->image(...) returns the image data (as an RTF scalarref), $h->image_paragraph(...) doesn't return much of anything.
Thirdly, $h->image_paragraph(...) is often much more memory-efficient, since it can write the image data to a file as it's RTF-ified, instead of building it all up in memory.
$h; this generally (assuming you'd called $h->prolog) involves just writing a final close-brace to $h, and then closing whatever filehandle or file $h writes to (unless we're writing to a string, in which case we just discard $h's reference to it).
After you call $h->close, you should not call any other methods with $h!
Note that you don't have to explicitly call $h->close -- when an unclosed RTF::Writer object goes out of scope (or, more precisely speaking, when its refcount hits zero), then something equivalent to calling $h->close is done automatically for you.
In addition to any of the above methods, you can use any RTF command (and optional integer arguments) as a valid method name, by just capitalizing its first letter, as shown below:
For example, $h->Page() is the same as $h->print(\'\page').
The same as $h->print( [ \'\foo', ... ] );
For example: $h->I('stuff') is the same as $h->print([\'\i', 'stuff']).
For example: $h->Cols2() is the same as $h->print(\'\cols2').
The same as $h->print( [ \'\foo123', ... ] );
For example: $h->F2('stuff') is the same as $h->print([\'\f2', 'stuff']).
For example: $h->Li_1440() is the same as $h->print(\'\li-1440').
The same as $h->print( [ \'\foo-123', ... ] );
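The naming rule behind these method names (lowercase the first letter, and read an underscore before the number as a minus sign) can be sketched as a plain function. This only illustrates the rule as the examples above show it; method_to_command is a hypothetical name, and the module's own dispatch code may well differ.

```perl
use strict;
use warnings;

# Hypothetical illustration of the naming rule; not RTF::Writer's own code.
# Maps a capitalized method name to the RTF command text it stands for,
# or returns nothing if the name doesn't fit the pattern.
sub method_to_command {
    my ($name) = @_;
    # Uppercase first letter, optional lowercase letters, then an optional
    # integer argument in which a leading "_" stands for a minus sign.
    $name =~ /\A([A-Z][a-z]*)(_?\d+)?\z/ or return;
    my ($word, $arg) = ($1, defined $2 ? $2 : '');
    $arg =~ tr/_/-/;
    return '\\' . lcfirst($word) . $arg;
}

for my $method (qw(Page I Cols2 F2 Li_1440)) {
    printf "%-8s => %s\n", $method, method_to_command($method);
}
```

Run as is, this lists \page, \i, \cols2, \f2 and \li-1440 for the five method names used in the examples above.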
None of these functions are exported by default, but they can be exported on request, as in:
use RTF::Writer qw(inches cm rtfesc);
in($x)
inches(1.5) returns 2160, because an inch and a half is exactly 2160 twips. The return value of these functions is always an integer, as fractions of twips are not used in RTF.
pt($x)
points(54) returns 1080, because fifty-four points is exactly 1080 twips. The return value of these functions is always an integer, as fractions of twips are not used in RTF.
cm(x)
cm(1.5) returns 850, because 1.5cm is approximately 850 twips (i.e., it's 850, when rounded to the nearest whole number). Since twips and points are both defined in terms of inches (1440 twips = 72 points = 1 inch), conversion between cm and these other units is approximate.
The return value of cm($x) is always an integer, as fractions of twips are not used in RTF.
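The arithmetic behind these three conversions is short enough to sketch directly. The functions below are an illustrative re-implementation under the stated ratios (1440 twips = 72 points = 1 inch, and 2.54 cm to the inch), with hypothetical names chosen so they don't clash with the real exports; in actual code you would import inches, points, and cm from RTF::Writer instead.

```perl
use strict;
use warnings;

# Illustrative re-implementation of the arithmetic only; these names are
# hypothetical and are not the functions RTF::Writer exports.
use constant TWIPS_PER_INCH  => 1440;
use constant TWIPS_PER_POINT => 20;     # 1440 twips / 72 points
use constant CM_PER_INCH     => 2.54;

# Round to the nearest whole twip (non-negative input assumed).
sub round_twips { return int($_[0] + 0.5) }

sub inches_to_twips { return round_twips($_[0] * TWIPS_PER_INCH) }
sub points_to_twips { return round_twips($_[0] * TWIPS_PER_POINT) }
sub cm_to_twips     { return round_twips($_[0] * TWIPS_PER_INCH / CM_PER_INCH) }

printf "%d %d %d\n",
    inches_to_twips(1.5), points_to_twips(54), cm_to_twips(1.5);
# prints "2160 1080 850"
```

The cm case shows why that conversion is only approximate: 1.5cm works out to about 850.39 twips before rounding.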
In void context (i.e., where you aren't capturing the return value), this in-place alters the values you pass it.
In scalar or list context, it doesn't alter the original(s), but returns an escaped copy of what you pass in.
To control alignment of cells, specify align => "direction direction direction...", where each direction is one of these alphabetic strings for the given directions (based on the abbreviated English names for map directions and canvas directions):
    NW  N  NE        TL  T  TR
      \ | /            \ | /
    W - C - E        L - C - R
      / | \            / | \
    SW  S  SE        BL  B  BR
For example, align => "nw c" means that the first cell will be aligned to the northwest (a.k.a. the top-left), and that the second cell (and any cells thereafter) will be aligned to the center.
An acceptable alternate syntax is align => ['nw', 'c'] -- i.e., to pass a reference to an array of 'direction' items, instead of just passing a single scalar of whitespace-separated directions.
(Note that alignment syntax and cell border syntax may look a bit alike, but are really very different; try not to mix them up.)
To specify what borders occur on cells, use one of the following syntaxes:
borders => 1, or borders => 'all', to turn on a simple border for all sides of all cells. This is the default -- so if you don't specify a borders => something option, it will be as if you specified borders => 1.
borders => 0, or borders => 'none', to turn off all borders for all cells. In previous versions of RTF::Writer, this was the default.
Or use this complex syntax for finer control:
borders => [ cellborders, cellborders, ... ],
...where each cellborders is a string in the form "border border border", where, in turn, each border is a substring in the form "direction-thickness-type", "direction-type", "direction-thickness", or "direction".
Alternately, cellborders can be one of these shorter values:
direction is either "all", or a combination of some of the uppercase or lowercase letters N, S, E, W, T, B, R, L. (Of course, the first four are synonymous with the other four, respectively.)
thickness (by default, 15) is an integer between 1 and 75, specifying the thickness of the border, in twips.
And type (by default, "s") is one of these, as specified in the RTF spec:
    s          : Single-thickness border
    th         : Double-thickness border
    sh         : Shadowed border
    db         : Double border
    dot        : Dotted border
    hair       : Hairline border
    dash       : Dashed border
    inset      : Inset border
    dashsm     : Dashed border (small)
    dashd      : Dot-dashed border
    dashdd     : Dot-dot-dashed border
    outset     : Outset border
    triple     : Triple border
    tnthsg     : Thick-thin border (small)
    thtnsg     : Thin-thick border (small)
    tnthtnsg   : Thin-thick-thin border (small)
    tnthmg     : Thick-thin border (medium)
    thtnmg     : Thin-thick border (medium)
    tnthtnmg   : Thin-thick-thin border (medium)
    tnthlg     : Thick-thin border (large)
    thtnlg     : Thin-thick border (large)
    tnthtnlg   : Thin-thick-thin border (large)
    wavy       : Wavy border
    wavydb     : Double wavy border
    dashdotstr : Striped border
    emboss     : Embossed border
    engrave    : Engraved border
    frame      : Border resembles a "frame"
Not all of the above are supported by all RTF readers. If you're concerned about portability, consider sticking to the core set of just the first six listed above.
Also, the syntax borders => cellspec is accepted as a synonym for borders => [cellspec], for when you're specifying just a single cellspec, for use with the first and all subsequent cells.
Cell border syntax is best shown by example:
borders=> [ "ns-30-db w-25", "all-10-wavy", "none", 13 ],
That means that the first cell should have a 30-twip-thick double border on the top and bottom (north and south) and a 25-twip-thick single border on the west (and no border on the east side); the second cell should have a 10-twip-thick wavy border on all sides; the third cell should have no borders on any sides; and the fourth (and any additional) cells should have a 13-twip-thick single border on all sides.
Incidentally, when a particular cellspec contains apparently contradictory declarations, the last one is the one that has an effect. For example, consider "all-20-db w-10-s" -- the first part turns on 20-twip double borders on all sides, and the second part turns on a 10-twip single border on the west side. Since the second part is last, that's the one that has an effect -- so just the north, south, and east sides actually get a 20-twip double border, and the west side gets the 10-twip single border.
(This means that if you say "w-10-s all-20-db", the first part will have no effect, because the second part will override the west-side declaration.)
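The "last declaration wins" behavior can be pictured as filling in a per-side table from left to right. The sketch below does that for a single cellspec string, using the defaults given above (thickness 15, type "s"). It is a simplified illustration, not the module's parser: resolve_cellspec is a hypothetical name, and no validation of thickness ranges or type names is attempted.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of RTF::Writer: resolve one cellspec into
# a hash of side => [thickness, type]. Declarations are applied left to
# right, so a later one overrides an earlier one for the same side.
sub resolve_cellspec {
    my ($spec) = @_;
    $spec = 'all-15-s' unless defined $spec and length $spec;
    $spec = lc $spec;
    return {} if $spec eq 'none';
    $spec = "all-${spec}-s" if $spec =~ /\A\d+\z/;

    my %synonym = (t => 'n', b => 's', l => 'w', r => 'e');
    my %sides;
    for my $border (grep { length } split /[\s,]+/, $spec) {
        my ($dir, @rest) = split /-/, $border;
        my ($thickness, $type) = (15, 's');      # documented defaults
        for my $part (@rest) {
            if ($part =~ /\A\d+\z/) { $thickness = $part }
            else                    { $type      = $part }
        }
        my @targets = $dir eq 'all'
            ? qw(n s e w)
            : map { $synonym{$_} || $_ } split(//, $dir);
        $sides{$_} = [ $thickness, $type ] for @targets;
    }
    return \%sides;
}

my $sides = resolve_cellspec('all-20-db w-10-s');
printf "%s: %d-twip %s\n", $_, @{ $sides->{$_} } for sort keys %$sides;
```

For "all-20-db w-10-s" this reports a 20-twip double border for e, n, and s, and a 10-twip single border for w; swapping the two parts gives the 20-twip double border on every side.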
If you'd prefer a more formal grammar for this all, this should help:
    borderdec := 'borders' =>
           '0'       # no borders at all
         | '1'       # same as ["all-15-s"]
         | [ cellspec, cellspec, ... ]
         | cellspec  # default for one-cell form of the above

    cellspec :=
           "" | undef  # same as "all-15-s"
         | int         # same as "all-INT-s"  (note: 2 <= int <= 75)
         | "none"      # no borders on this cell
         | (border ( ', ' . border )* )
             # a list of border expressions separated by
             # a comma (and/or whitespace, in fact)

    border :=
           direction-thickness-type  # For example, "nse-15-s"
         | direction-type            # same as "DIR-15-TYPE"
         | direction-thickness       # same as "DIR-THICK-s"
         | direction                 # same as "DIR-15-s"

    direction := "all" | qr/^[nsewtblrNSEWTBLR]+$/
        # Note that "nw" doesn't mean the direction northwest, but
        # simultaneously the north and west sides.

    thickness := integer in the range 1 - 75

    type := "s" | "th" | "sh" | "db" | "dot" | "hair" | (etc)
The book RTF Pocket Guide from O'Reilly. http://www.oreilly.com/catalog/rtfpg/
Copyright 2001,2,3 Sean M. Burke.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.
The author of this document is not affiliated with the Microsoft corporation.
Product and company names mentioned in this document may be the trademarks or service marks of their respective owners. Trademarks and service marks are not identified, although this must not be construed as the author's expression of validity or invalidity of each trademark or service mark.
Sean M. Burke, <sburke@cpan.org>