2004-11-19 : These are design notes. I had a bunch of stuff in my head this morning, and right
now it's all gone vague and disorganized. Feh. Hopefully it will come back to me as I type.
The Problem
I would like to create something to make it easy to present web comics.
Goals / Forces To Resolve
- Would like to extract the common functionality such as rendering a page with first, prev,
next, last buttons. There is some functionality that is common to basically all comics and other
sequential art. Would like to factor that common behavior out so that creating a new archive
would mean just adding the raw data. Would like to have canned views, such as an index view
and a calendar control, etc.
- Need flexibility with the content of a day (node). Sometimes there are comics that have both
a picture and commentary, etc. Maybe there are multiple pictures for a day. (If the file naming
were strict, there wouldn't be a way to generate two different file names for the same day.)
A single node may have multiple resources.
- Need flexibility with the ordering. Binding every node to a single day is limiting. What if
you want to have more than one update a day? What if you don't even have a natural mapping to
days? (For example, a couple of pages published all at once each month, or a large historical
archive.) However, a by-date binding is very common and should be easily supported. Sometimes,
pages will have different naming schemes, like chapter cover pages having different names from
inner pages. This breaks any strict file naming requirement.
- Would like it to be as easy to update as possible. Initial use would be to create archives
of strips by hand. Would be nice if there was enough information that a tool could look at the
data directories and identify newly added pages and semi-automatically add them to the data
structure. Would even like to have tools that could do a web scrape and append to the archive.
With enough information in the archive, it could automatically check backward to see if any
entries had been missed, or notice that the comic hadn't changed and "today" had already been
downloaded yesterday.
- "Commercialization", or making the infrastructure available to other sites, would add new
requirements. These requirements are not important when creating the preliminary version, since
distributing it to other sites would not happen for a long time (or, until a lot of functionality
has been implemented). However, nothing in the original design should preclude this.
- For example, repository access has to be scalable. Scanning the directory on every page view
to determine the list of files (with an implied sequence by date embedded in the file names) does
not scale well as the size of the sequence grows. An XML file (or the data structure recovered
from one) can be cached and queried much faster.
- A "commercial" package would also require a web interface so artists could upload new comics
and maintain their sequences very easily.
- A "commercial" package would need things like access control to limit which users can see
"premium" features and prevent people from looking ahead (so the artist can build up a buffer).
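A rough sketch of the caching point above, in Python only because it is short to write (these notes have not settled on a language). It assumes the repository lives in one XML file; the class name `RepositoryCache`, the file name `repository.xml`, and the `parse_count` counter are all made up for illustration. The file is parsed once and re-parsed only when its modification time changes.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


class RepositoryCache:
    """Parse the repository file once; re-parse only when the file changes."""

    def __init__(self, path):
        self.path = path
        self._mtime_ns = None  # modification time seen at the last parse
        self._root = None      # parsed root element
        self.parse_count = 0   # number of real parses, kept only for the demo

    def root(self):
        mtime_ns = os.stat(self.path).st_mtime_ns
        if self._root is None or mtime_ns != self._mtime_ns:
            self._root = ET.parse(self.path).getroot()
            self._mtime_ns = mtime_ns
            self.parse_count += 1
        return self._root


def demo():
    """Two page views hit the cache; an update to the file forces one re-parse."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "repository.xml")
        with open(path, "w") as f:
            f.write("<repository><nodes/></repository>")
        cache = RepositoryCache(path)
        cache.root()
        cache.root()
        parses_before_update = cache.parse_count

        with open(path, "w") as f:
            f.write('<repository><nodes><node id="c1p1"/></nodes></repository>')
        # Push the timestamp forward so the change is visible even on
        # filesystems with coarse modification times.
        stat = os.stat(path)
        os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
        node_id = cache.root().find("nodes/node").get("id")
        return parses_before_update, cache.parse_count, node_id
```

A page view would call `root()` and never touch the data directories, so the cost per view no longer grows with a directory scan.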
Design
Concepts:
- node: A node represents a unit in a sequence. It's basically the equivalent of one day
for a web comic. However, there is no requirement that a node map directly to a day - there
may be more than one node for a day, as in a LiveJournal.
- resource: Resources are the artifacts associated with a node. A simple web comic may
just have a full size comic image file as the single resource for a node. However, some web comics
have commentary from day to day. The day's commentary would also be a resource. There might be
a thumbnail image as well. Not all resources are displayed in all views. A web comic like Sluggy
Freelance has multiple images, plus layout html, plus possibly special style info (for black
backgrounds during Halloween storylines, etc.).
- sequence: A sequence is simply an ordered list of nodes. A repository may have more
than one sequence, and a node may be a member of more than one sequence. For example, say a web
comic displays fan art every Wednesday. The "daily" sequence would have mixed comics and fan art.
The "story-only" sequence would have only the comics, skipping the fan art. The "fan-art" sequence
would have only fan art. Thus a fan-art node would be a member of both the daily sequence and the
fan-art sequence.
- partition: identifies sections of a sequence and gives it a hierarchy. Allows a story
to be broken up into chapters.
- repository: top-level data structure that holds all the relevant data.
- view: a transformation of the data in the repository into a presentable form. For
example, there may be a daily view, which shows one day with a full size comic, and there may be
an index view, which shows every day on one page as a thumbnail.
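To check that the concepts above hang together, here is a minimal in-memory sketch of node, resource, sequence and repository (partitions and views are left out). All class, field and method names are assumptions for illustration, as are the node ids and file names in the example. It replays the fan-art case: one node that belongs to two sequences.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    kind: str      # e.g. "img" or "comment"
    src: str       # where the artifact lives
    id: str = ""   # optional name, e.g. "main" or "thumb"


@dataclass
class Node:
    id: str
    resources: list = field(default_factory=list)
    date: str = ""   # optional: not every node maps to a day
    title: str = ""


@dataclass
class Repository:
    nodes: dict = field(default_factory=dict)      # node id -> Node
    sequences: dict = field(default_factory=dict)  # sequence id -> ordered node ids

    def add(self, node, *sequence_ids):
        """Store the node and append it to each named sequence."""
        self.nodes[node.id] = node
        for sequence_id in sequence_ids:
            self.sequences.setdefault(sequence_id, []).append(node.id)

    def sequences_of(self, node_id):
        """Return the ids of every sequence that contains the node."""
        return sorted(s for s, ids in self.sequences.items() if node_id in ids)


repo = Repository()
repo.add(Node("tue", [Resource("img", "tue.gif", "main")]), "daily", "story-only")
repo.add(Node("wed", [Resource("img", "wed-fanart.gif", "main")]), "daily", "fan-art")
repo.add(Node("thu", [Resource("img", "thu.gif", "main")]), "daily", "story-only")
```

Sequences hold node ids rather than nodes, so the Wednesday node is stored once and simply listed in both "daily" and "fan-art".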
XML is convenient because it allows freeform data when necessary. This would make adding
the resources to the node easy. Plus you could use XSLT to transform the resources for a node
to produce the view. I like the idea of adding a standard mechanism for indicating that the
contents of an XML element are actually located in another file. That could make updating much
easier and reduce contention should there ever be a significant number of updates.
Need to define a standard file layout, and standard elements and attributes. For example,
the date mapping should be standard so that a common calendar control can be used. Similarly for
partition information. There should be a standard way to indicate the preferred view transform
for a node (so that one node with an unusual layout can be handled easily), and possibly the
preferred view transform for a repository or a sequence (so that the standard transform doesn't
have to be specified in every node, only the special cases). Also, a node name, for index views.
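The preferred-transform idea amounts to a lookup that falls back from node to sequence to repository. A minimal sketch, assuming each level is a plain dictionary with an optional "transform" entry; the function name and the transform names "page" and "gallery" are hypothetical ("pageWithComment" is the one used in the sample further down).

```python
def resolve_transform(node, sequence, repository):
    """Pick the most specific transform: node first, then sequence, then repository."""
    for scope in (node, sequence, repository):
        name = scope.get("transform")
        if name:
            return name
    raise LookupError("no transform configured at any level")


repository = {"transform": "page"}                 # site-wide default
daily = {"id": "daily"}                            # sequence with no preference
fan_art = {"id": "fan-art", "transform": "gallery"}
plain_node = {"id": "c1p2"}
special_node = {"id": "c1p1", "transform": "pageWithComment"}
```

Only the unusual node and the unusual sequence carry a setting; everything else inherits the repository default.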
Need to define the standard components and views. For example, Full page w/ prev, next
buttons. Calendar view. TOC/index view. Hierarchical locator dropdown.
Some of the desired functionality would be better represented in a full programming language
like Java, but access to such is difficult from within web pages. There are multiple standards for
web site scripting (ASP, PHP, ASP.NET, JSP) which means that, for example, if significant effort
was spent developing an ASP implementation, it would not be "commercializable" to a site based on
PHP.
Need to research whether there is an easy way to invoke Java from both ASP and PHP.
{repository}
  {sequences}
    {sequence id="main"}c1p1 c1p2 c1p3 c1p4{/sequence} {!-- how should this be delimited? --}
  {/sequences}
  {transforms /} {!-- bind transform names to xslt files, etc. --}
  {nodes}
    {node id="c1p1" date="19990909" title="The Fun Begins"} {!-- id used in sequence;
        standard date mapping?; title used in TOC? --}
      {altTransform use="pageWithComment" /} {!-- identify a custom transform by name;
          may also specify xslt file, etc. --}
      {img id="main" src="chapter1\pg1.gif" width="700" height="500" /}
      {comment src="chapter1\pg1.txt" /}
      {img id="thumb" src="chapter1\t-pg1.gif" width="70" height="50" /}
    {/node}
    {nodes href="chapter2\nodex.xml" /} {!-- make recursive, so that we can link in other files.
        should still load into same namespace. Or, multiple top-level. --}
  {/nodes}
{/repository}
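A quick sanity check that the layout above can be loaded, sketched in Python with the standard library's ElementTree. Assumptions: the braces are written as real angle brackets, node ids inside a sequence are delimited by whitespace (still an open question above), every child of a node other than altTransform counts as a resource, and the href include is left out. The function name `load` and the dictionary shape are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Raw string so the backslashes in the sample paths are kept as written.
SAMPLE = r"""
<repository>
  <sequences>
    <sequence id="main">c1p1 c1p2 c1p3 c1p4</sequence>
  </sequences>
  <transforms />
  <nodes>
    <node id="c1p1" date="19990909" title="The Fun Begins">
      <altTransform use="pageWithComment" />
      <img id="main" src="chapter1\pg1.gif" width="700" height="500" />
      <comment src="chapter1\pg1.txt" />
      <img id="thumb" src="chapter1\t-pg1.gif" width="70" height="50" />
    </node>
  </nodes>
</repository>
"""


def load(xml_text):
    """Return (sequences, nodes) as plain dictionaries."""
    root = ET.fromstring(xml_text)
    # Assumption: ids in a sequence are separated by whitespace.
    sequences = {s.get("id"): (s.text or "").split() for s in root.iter("sequence")}
    nodes = {}
    for node in root.iter("node"):
        alt = node.find("altTransform")
        nodes[node.get("id")] = {
            "date": node.get("date"),
            "title": node.get("title"),
            "transform": alt.get("use") if alt is not None else None,
            "resources": [(c.tag, c.get("id"), c.get("src"))
                          for c in node if c.tag != "altTransform"],
        }
    return sequences, nodes


sequences, nodes = load(SAMPLE)
# Ids listed in a sequence but not defined as nodes (the sample defines only c1p1).
missing = [i for i in sequences["main"] if i not in nodes]
```

The `missing` list is the kind of consistency check an update tool could run after adding pages.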
Look at Alien Dice for partition model. I had a pretty good one, but I don't remember
the details.
There is a bit of a difference between views that look at a single node (page view) and
views that look over all the nodes (calendar view). How significant is this? I guess in some
sense every view must be over the entire set of nodes, or how could the page view generate the
navigation buttons? Perhaps all views are "centered" on a node but are allowed to traverse the
entire data structure. A TOC view would basically ignore which node it was centered on.
Interestingly, such a design allows for scoped TOCs, like showing all the pages of just
the current chapter in thumbnail. (The hierarchical locator dropdown is also an example of a
scoped TOC.)
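The "centered on a node" idea can be sketched with two small functions: one computes the navigation buttons for a page view, the other a TOC scoped to the current chapter. The partition model is not settled in these notes, so the `(name, first id, last id)` tuples below are purely an assumption, as are the function names and the chapter 2 node ids.

```python
def navigation(sequence, node_id):
    """Return first/prev/next/last node ids for a view centered on node_id."""
    i = sequence.index(node_id)  # raises ValueError if the node is not in the sequence
    return {
        "first": sequence[0],
        "prev": sequence[i - 1] if i > 0 else None,
        "next": sequence[i + 1] if i + 1 < len(sequence) else None,
        "last": sequence[-1],
    }


def scoped_toc(sequence, partitions, node_id):
    """Return (partition name, node ids) for the partition containing node_id.

    partitions is a list of (name, first_id, last_id) tuples, a hypothetical
    stand-in for whatever the real partition model turns out to be.
    """
    for name, first_id, last_id in partitions:
        start, end = sequence.index(first_id), sequence.index(last_id)
        chunk = sequence[start:end + 1]
        if node_id in chunk:
            return name, chunk
    # Node is outside every partition: fall back to an unscoped TOC.
    return None, list(sequence)


main = ["c1p1", "c1p2", "c1p3", "c2p1", "c2p2"]
chapters = [("Chapter 1", "c1p1", "c1p3"), ("Chapter 2", "c2p1", "c2p2")]
```

Both functions take the whole sequence plus the current node, which matches the guess that every view sees the entire structure and only differs in how much attention it pays to the center node.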
It looks like most controls could render with just a sequence id and a node id, plus some
style information. In fact, the sequence id may not even be necessary - some controls may only
work on certain sequences (calendar only works on by-date, etc.). Or, the sequence id is more
of a "configure"-time parameter than a run-time parameter (hierarchical locator,
prev/next buttons).