xpdf-no-select-disable

author: Calvin Morrison <calvin@pobox.com> 2023-04-05 14:13:39 -0400
committer: Calvin Morrison <calvin@pobox.com> 2023-04-05 14:13:39 -0400
commit: 835e373b3eeaabcd0621ed6798ab500f37982fae (patch)
tree: dfa16b0e2e1b4956b38f693220eac4e607802133 /doc/pdftohtml.cat
1 files changed, 144 insertions, 0 deletions
diff --git a/doc/pdftohtml.cat b/doc/pdftohtml.cat
new file mode 100644
index 0000000..5ddbfa0
--- /dev/null
+++ b/doc/pdftohtml.cat
@@ -0,0 +1,144 @@
+pdftohtml(1)                General Commands Manual               pdftohtml(1)
+
+
+
+NAME
+       pdftohtml  -  Portable Document Format (PDF) to HTML converter (version
+       4.04)
+
+SYNOPSIS
+       pdftohtml [options] PDF-file HTML-dir
+
+DESCRIPTION
+       Pdftohtml converts Portable Document Format (PDF) files to HTML.
+
+       Pdftohtml reads the PDF file, PDF-file, and places an HTML file for
+       each page, along with auxiliary images, in the directory HTML-dir.
+       The HTML directory will be created; if it already exists, pdftohtml
+       reports an error.
+
+CONFIGURATION FILE
+       Pdftohtml  reads  a  configuration  file at startup.  It first tries to
+       find the user's private config file, ~/.xpdfrc.  If that doesn't exist,
+       it looks for a system-wide config file, typically /etc/xpdfrc (but this
+       location can be changed when pdftohtml is built).   See  the  xpdfrc(5)
+       man page for details.
+
+OPTIONS
+       Many  of  the following options can be set with configuration file com-
+       mands.  These are listed in square brackets with the description of the
+       corresponding command line option.
+
+       -f number
+              Specifies the first page to convert.
+
+       -l number
+              Specifies the last page to convert.
+
+       -z number
+              Specifies  the  initial  zoom  level.  The default is 1.0, which
+              means 72dpi, i.e., 1 point in the PDF file will be  1  pixel  in
+              the  HTML.   Using  '-z 1.5', for example, will make the initial
+              view 50% larger.
+
+       -r number
+              Specifies the resolution, in DPI, for background  images.   This
+              controls the pixel size of the background image files.  The ini-
+              tial zoom level is controlled by the '-z' option.  Specifying  a
+              larger '-r' value will allow the viewer to zoom in farther with-
+              out upscaling artifacts in the background.
+
+       -vstretch number
+              Specifies a vertical stretch factor.  Setting this  to  a  value
+              greater  than  1.0  will stretch each page vertically, spreading
+              out the lines.  This also  stretches  the  background  image  to
+              match.
+
+       -embedbackground
+              Embeds  the  background image as base64-encoded data directly in
+              the HTML file, rather than storing it as a separate file.
+
+       -nofonts
+              Disable extraction of embedded  fonts.   By  default,  pdftohtml
+              extracts  TrueType and OpenType fonts.  Disabling extraction can
+              work around problems with buggy fonts.
+
+       -embedfonts
+              Embeds any extracted fonts as base64-encoded  data  directly  in
+              the HTML file, rather than storing them as separate files.
+
+       -skipinvisible
+              Don't draw invisible text.  By default, invisible text (commonly
+              used in OCR'ed PDF files) is drawn as transparent (alpha=0) HTML
+              text.   This  option  tells  pdftohtml to discard invisible text
+              entirely.
+
+       -allinvisible
+              Treat all text as invisible.  By default,  regular  (non-invisi-
+              ble)  text  is not drawn in the background image, and is instead
+              drawn with HTML on top of the image.  This option  tells  pdfto-
+              html  to  include  the regular text in the background image, and
+              then draw it as transparent (alpha=0) HTML text.
+
+       -formfields
+              Convert AcroForm text and checkbox fields  to  HTML  input  ele-
+              ments.  This also removes text (e.g., underscore characters) and
+              erases background image content (e.g., lines or  boxes)  in  the
+              field areas.
+
+       -table Use  table  mode when performing the underlying text extraction.
+              This will generally produce better output when the  PDF  content
+              is  a  full-page table.  NB: This does not generate HTML tables;
+              it just changes the way text is split up.
+
+       -opw password
+              Specify the owner password for the  PDF  file.   Providing  this
+              will bypass all security restrictions.
+
+       -upw password
+              Specify the user password for the PDF file.
+
+       -verbose
+              Print  a status message (to stdout) before processing each page.
+              [config file: printStatusInfo]
+
+       -q     Don't print any messages or errors.  [config file: errQuiet]
+
+       -cfg config-file
+              Read config-file in place of ~/.xpdfrc or the system-wide config
+              file.
+
+       -v     Print copyright and version information.
+
+       -h     Print usage information.  (-help and --help are equivalent.)
+
+BUGS
+       Some  PDF  files contain fonts whose encodings have been mangled beyond
+       recognition.  There is no way (short of OCR) to extract text from these
+       files.
+
+EXIT CODES
+       The Xpdf tools use the following exit codes:
+
+       0      No error.
+
+       1      Error opening a PDF file.
+
+       2      Error opening an output file.
+
+       3      Error related to PDF permissions.
+
+       99     Other error.
+
+AUTHOR
+       The  pdftohtml software and documentation are copyright 1996-2022 Glyph
+       & Cog, LLC.
+
+SEE ALSO
+       xpdf(1),  pdftops(1),  pdftotext(1),  pdfinfo(1),  pdffonts(1),  pdfde-
+       tach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5)
+       http://www.xpdfreader.com/
+
+
+
+                                  18 Apr 2022                     pdftohtml(1)
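
The options and exit codes documented in the added man page can be exercised from a script. The following is a minimal sketch, not part of this commit: the helper names build_command, initial_view_pixels, and describe_exit_code are hypothetical, and only the flags (-f, -l, -z, -r, -q) and exit codes listed in the man page are used. The pixel calculation follows the man page's statement that a zoom of 1.0 maps 1 PDF point to 1 pixel.

```python
from typing import List, Optional, Tuple

# Exit codes as listed in the EXIT CODES section of the man page.
EXIT_CODES = {
    0: "No error.",
    1: "Error opening a PDF file.",
    2: "Error opening an output file.",
    3: "Error related to PDF permissions.",
    99: "Other error.",
}


def build_command(
    pdf_file: str,
    html_dir: str,
    first: Optional[int] = None,
    last: Optional[int] = None,
    zoom: Optional[float] = None,
    resolution: Optional[int] = None,
    quiet: bool = False,
) -> List[str]:
    """Build an argv list following the SYNOPSIS: pdftohtml [options] PDF-file HTML-dir."""
    argv = ["pdftohtml"]
    if first is not None:
        argv += ["-f", str(first)]       # first page to convert
    if last is not None:
        argv += ["-l", str(last)]        # last page to convert
    if zoom is not None:
        argv += ["-z", str(zoom)]        # initial zoom level (1.0 means 72 dpi)
    if resolution is not None:
        argv += ["-r", str(resolution)]  # background image resolution in DPI
    if quiet:
        argv.append("-q")                # suppress messages and errors
    argv += [pdf_file, html_dir]
    return argv


def initial_view_pixels(
    width_pt: float, height_pt: float, zoom: float = 1.0
) -> Tuple[int, int]:
    """Initial view size in pixels: at zoom 1.0, 1 PDF point is 1 pixel."""
    return (round(width_pt * zoom), round(height_pt * zoom))


def describe_exit_code(code: int) -> str:
    """Map a pdftohtml exit status to the description from the man page."""
    return EXIT_CODES.get(code, f"Unknown exit code {code}.")


if __name__ == "__main__":
    # A US Letter page is 612 x 792 points; '-z 1.5' makes the view 50% larger.
    print(build_command("in.pdf", "out", first=1, last=3, zoom=1.5))
    print(initial_view_pixels(612, 792, zoom=1.5))
    print(describe_exit_code(3))
```

The list returned by build_command can be passed to subprocess.run, and the resulting returncode to describe_exit_code. Because pdftohtml reports an error when HTML-dir already exists, a caller would need to choose a directory name that is not yet in use.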