
Epub Formats

Re: Looking for Electronic Publishing formats...

Tim Oey (oey@apple.com)
Tue, 20 Apr 1993 00:50:46 GMT

BTW, for those who are curious about what setext is, here is a response I
recently sent to an individual who e-mailed me.  Enjoy!


BRANDAUER CARL M <brandy@rintintin.Colorado.EDU> asks:
> What is 'setext' and where can one find it?

'setext' (structure enhanced text) is a budding public domain standard
for formatting plain ascii text so that it is as readable as possible for
humans on essentially any computer terminal.  You might consider setext
to be a set of style conventions for human readable ascii text files
(newsletters, documentation, etc).  The raw setext form is extremely
readable and the small amount of extra work is well worth it.

The great strength of setext is that it is useable on essentially all
computer viewing devices -- with the lowest common denominator being a
VT100, 80 character by 24 line monospaced terminal with a
non-word-wrapping text display.
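
To make that lowest common denominator concrete, here is a rough Python
sketch (my own illustration, not part of setext) that checks whether a
text fits such a terminal.  The function names and the tab width of 8
are assumptions made for this example only.

```python
def overlong_lines(text, width=80):
    """Return (line_number, length) for every line wider than `width` columns."""
    result = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Assumption: tabs expand to 8-column stops, as on most terminals.
        length = len(line.expandtabs(8))
        if length > width:
            result.append((number, length))
    return result


def screens(text, height=24):
    """Split the text into consecutive screenfuls of at most `height` lines."""
    lines = text.splitlines()
    return [lines[i:i + height] for i in range(0, len(lines), height)]
```

A text for which overlong_lines() returns an empty list will not be cut
off or wrapped on a non-word-wrapping 80 column display.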

Adopting a consistent structure also makes setext more amenable to
"machine" readers so people can construct setext browsers that take
advantage of the well structured text to allow humans to access the text
in an even faster and easier fashion.  A setext browser can turn several
megabytes of text into an exceptionally useable encyclopedia or reference
manual.  The best example of this is currently Easy View -- a Macintosh
browser available on AOL, Compuserve, and probably quite a few FTP sites.
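
As a minimal sketch of what such a "machine" reader might do, the Python
below builds an outline from a setext file.  It assumes only what the
sample document under item 5 shows: a heading is a non-blank line
immediately followed by a line made up entirely of dashes.  The real
setext rules may well be richer, and this is not how Easy View is
implemented; the names used are made up for the example.

```python
import re

# Assumption: an underline is a line of three or more dashes and nothing else.
UNDERLINE = re.compile(r"^-{3,}\s*$")


def outline(text):
    """Return (line_number, title) for each dash-underlined heading."""
    lines = text.splitlines()
    headings = []
    for index in range(len(lines) - 1):
        title = lines[index].strip()
        is_heading = (
            title
            and not UNDERLINE.match(lines[index])
            and UNDERLINE.match(lines[index + 1])
        )
        if is_heading:
            headings.append((index + 1, title))  # 1-based line numbers
    return headings
```

A browser could list the titles returned by outline() and jump straight
to the corresponding line numbers, which is the kind of non-sequential
access described above.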

As for where to find setext:

1) The TidBITS electronic newsletter is distributed in setext format.
TidBITS is available from most of the commercial online services (AOL,
Compuserve, etc) and also from comp.sys.mac.digest.

2) For a summary document about setext send an email to
  fileserver@tidbits.com with the single word "setext" in the
  subject line.

3) Contact Ian Feldman (the standard author/keeper) at
setext-list@random.se.

4) If you have a Macintosh available, get Easy View from AOL, Compuserve,
or some FTP site.

5) The following is an example of a setext document (not including the
begin & end lines):
-----------------------------------------------------------------begin

  Administrivia
---------------
  Tim Oey's original headers have been appended last. This is
  an example of how his submission could or would have looked
  had it been piped through a setext formatter (one of many
  possible layouts). Tim's assessment of the "lowest common
  denominator being ascii" should be rephrased, however, as
  "ASCII text formatted and OPTIMIZED for the lowest common-
  denominator hardware still for many more years to come: a
  VT100 terminal screen of 80 char*24 lines of _monospaced_
  character output." Those unable to read this in the Easy View
  appl on the Mac may care to know that, despite obvious lack
  of any visible "coding" elements, this text contains enough
  of mechanically-parseable structure to allow it to appear as
  an outline in that browser's window, thus permitting access
  to its many parts in non-sequential manner while certain of
  its elements would display as richtext (styled) output.
  __ianf


The Question
------------
  by Tim Oey <oey@apple.com>

  In article <1qff1hINNf5u@uwm.edu> Gregory R Block,
  gblock@csd4.csd.uwm.edu writes:

> I'm looking into electronic formats for printed materials.
> Specifically, I'd like to know publications being distributed in
> electronic or electronic and printed form.  I'd like to work on
> some "viewers" for these formats.

  You've struck a chord (at least for me).  I too am looking for
  electronic formats for printed materials.  Particularly for use in
  electronic forums such as BikeNet on America Online.

  After some investigation, it's become apparent that there are many
  viewpoints and considerations to take into account when asking
  this question.


Considerations
--------------

(1) What do you want the target audience to see (hear, use)?
    Formatted text? Fonts? Graphics? Color? B/W? Sound? Video?
    Hypertext navigation?

(2) How will transmission occur? Floppies? CDs? LANs? High speed
    networks? Internet? Phone lines? Commercial Online Networks?

(3) What hardware platform will it be viewed on? ASCII terminals?
    DOS PCs? Unix boxes? Mainframes? Macintoshes?

(4) What software does it need (for viewing as well as creation)?
    Text editor? Wordprocessor? Free viewer? Bundled viewer?

(5) How archivable (is that a word?) does it need to be? Should the
    format be compact? Searchable? Searchable across a set of
    electronic editions?

(6) How convertible does the format need to be? Should it be
    available in a variety of formats? ASCII? Formatted text? RTF?
    SGML? Other?


Comments & possible solutions
-----------------------------
  Depending on your needs and audience you may end up with

(1) a lowest common denominator solution (where everyone has basic
  hardware and software and no one can be expected to upgrade), or

(2) a highest common denominator solution (where you can dictate
  what everyone will be able to use, for instance buy everyone a
  Macintosh and high speed net connections to allow mixed text,
  graphics, video, and sound) or

(3) something in between.

  The lowest common denominator is plain ascii text.  A cool
  variation on this theme is setext -- a recent "standard" for
  formatting plain ascii text to make it more readable for humans as
  well as machines.  Its machine interpretability allows developers
  to create better ways to view setext.

  For a summary document about setext send an email to
  fileserver@tidbits.com with the single word "setext" in the
  subject line.  Or send e-mail to Ian Feldman (the standard author/
  keeper) at setext-list@random.se.  See also the TidBITS postings in
  comp.sys.mac.digest, which are setext documents.  Easy View
  (a free viewer) is an excellent "smart" text browser that
  understands setext format (and some others).  It is available from
  America Online and some other sources.

  As for the highest common denominator, the varied Macintosh
  formats seem to be the most widely available/ distributable/
  flexible/ powerful ones around.


The Middle Ground
-----------------
  The middle ground is where there is a lot of action these days.

**SGML** is a very functional standard, but it's hard to handle for
  lots of people and there don't seem to be many inexpensive SGML
  browsers or editors around.  Its strength is that it only captures
  the logical structure of documents.  Its weakness is that it only
  captures the logical structure of documents.

**RTF** is a standard interchange format that's fairly flexible.
  Several popular wordprocessing programs on PCs and Macs can
  handle it (Wordperfect, Microsoft Word, etc).  It does reasonably
  well with text formatting but graphics within docs can be a
  problem between platforms.

**GRAPHICS** can cause lots of problems.  Formats include MacPaint,
  PICT, TIFF, EPS, CGM, GIF, fax group 3 & group 4, and many
  others.

**COMPOUND DOCUMENTS** are the toughest.  At the moment proprietary
  formats (such as MS Word, Wordperfect, HyperCard, ToolBook, Frame,
  Interleaf, etc) handle these the best.  But some are readable on
  other platforms, some aren't.  Plus everyone needs the appropriate
  software.

  As for other compound document viewers, AppleLink from Apple (an
  e-mail program), SuperGlue from Portfolio Software, Inc.
  (408-252-0420), and Disk Paper (not sure if this became available
  or not) are all roughly based on the PICT format and work well on
  the Macintosh.

  Currently Adobe's Acrobat appears to be the best hope for non-
  multimedia interplatform compound document electronic publishing.
  There are some smaller vendors who will also likely compete in
  this arena (Common Ground from No Hands Software??).  They will
  initially run on at least the Mac and PC and eventually Unix
  platforms as well.  However, as of April 14, 1993 they are not yet
  shipping.  Hopefully we'll soon see how well they perform in
  practice.


In My Opinion
-------------
  My primary audience is Mac, DOS PC, and Apple II users who are
  connected to online networks (in particular America Online).

  As for my own electronic publishing needs, setext seems to be the
  most flexible, useable, and readable format for text documents.
  The Easy View reader (runs on Macintosh) is really quite good.
  I'm hoping someone will create a comparable one for DOS PCs.

  Graphics are more difficult.  I want a format for both clip art
  and for maps.  Maps are extremely challenging since they generally
  have a large format as well as lots of detail.  For color pictures
  on screen, GIF is pretty good but I have yet to see GIF handle
  hi-res (300 dpi) graphics well.  Also, GIF seems to be a totally
  bit-mapped standard.  I also need a vector-graphics standard.
  PICT on the Mac handles both, but people on other platforms have a
  tougher time handling it.

  A good compound document format is currently out of my reach.  For
  the moment, I'm going to rely on RTF.  However, graphics don't
  seem to survive the trip between Macs & PCs that well at this time.

  BTW, surprisingly enough, the best cross platform program I've
  found is FileMaker Pro, from Claris (a flatfile database
  application - IMHO the best one on any platform).  The PC and Mac
  versions use completely interchangeable files, even to the point
  that a PC and a Mac can have the same file open at the same time
  (!) (FileMaker Pro on both platforms is automatically multi-user).
  Even graphics come across OK on both platforms.


Conclusion
----------
  So there you have it -- at least as far as creating & distributing
  electronic versions of paper documents.  The multimedia arena may
  take a lot longer to shake out.  Many electronically published
  documents have used a multimedia format, but they have usually
  been viewable on only one platform.  These formats look whizzy
  but don't seem very good afterwards.

  As far as I can tell, setext (and plain ascii in general) is
  probably going to be one of the best interchange formats for a
  very, very long time to come.

  We're all going to have to wait for a compact, multi-purpose,
  interchangeable compound document format.

  Enjoy!

--Tim

-----------------------------------------------------------------end
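
The styled elements mentioned in the Administrivia note can be picked
out mechanically too.  The Python sketch below recognizes only the
markers visible in the example above (**bold** words, a single
_underlined_ word, and "> " quoted lines); that these are the exact
setext rules is my assumption, and the function names are invented for
the illustration.

```python
import re

# Assumption: bold text is wrapped in double asterisks on a single line.
BOLD = re.compile(r"\*\*([^*\n]+)\*\*")
# Assumption: an underlined word is a single word wrapped in underscores.
UNDERLINED = re.compile(r"(?<![A-Za-z0-9_])_([A-Za-z0-9]+)_(?![A-Za-z0-9_])")


def styled_spans(line):
    """Return (style, text) pairs for the inline markers found in one line."""
    spans = [("bold", match.group(1)) for match in BOLD.finditer(line)]
    spans += [("underline", match.group(1)) for match in UNDERLINED.finditer(line)]
    return spans


def is_quote(line):
    """True if the line is quoted text, marked by a leading '> '."""
    return line.startswith("> ")
```

Anything these patterns do not match is left alone, so the raw text
stays just as readable on a plain terminal.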

That's all, enjoy!


-------
Tim Oey                     Work: oey@apple.com        |
Apple Computer, MS: 75-6H   Home: oey@aol.com          |    _~C      __C
20400 Stevens Creek Blvd.         TheCyclist@aol.com   |  ='\<,    ='\<,
Cupertino, CA 95014         Applelink: OEY             | (&)/(\)  (&)/(\)
Voice: 408-974-7282         America Online: Oey        |
Fax:   408-974-8983                         TheCyclist |
--------------------Don't pollute, bike commute!!------------------------
