2012-02-13 23:35:58 +00:00
Oyster is designed as a 'proactive document cache', meaning that it takes URLs
and keeps a local copy up to date depending on user-specified criteria.

Data Model
==========
tracked - metadata for tracked resources
    _id : internal id
    _random : a random integer used for sorting
    url : URL of the resource
    doc_class : string indicating the document class; allows different
                settings/hooks for some documents
    metadata : dictionary of extra user-specified attributes
    versions : list of dictionaries with the following keys:
        timestamp : UTC timestamp
        <storage_key> : storage_id
                        (may be s3_url, gridfs_id, etc.)
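As an illustration, a single tracked document might look roughly like the one built below. This is a minimal sketch in plain Python, not Oyster's actual code: the `make_tracked_doc` and `add_version` helpers, the example URL, the `doc_class` value, the metadata contents, and the choice of `gridfs_id` as the storage key are all assumptions made for the example.

```python
import datetime
import random
import sys


def make_tracked_doc(url, doc_class, metadata=None):
    """Build a dict shaped like an entry in `tracked` (hypothetical helper)."""
    return {
        # `_id` is left out; this sketch assumes the database assigns it on insert.
        "_random": random.randint(0, sys.maxsize),  # random integer used for sorting
        "url": url,
        "doc_class": doc_class,
        "metadata": metadata or {},
        "versions": [],
    }


def add_version(doc, storage_key, storage_id, timestamp=None):
    """Append a version entry keyed by the storage backend's storage_key."""
    if timestamp is None:
        timestamp = datetime.datetime.now(datetime.timezone.utc)
    doc["versions"].append({"timestamp": timestamp, storage_key: storage_id})
    return doc


# Example values are made up for illustration.
doc = make_tracked_doc("http://example.com/report.pdf", "default", {"source": "example"})
add_version(doc, "gridfs_id", "abc123")
```

Each entry in `versions` carries a timestamp plus one key named after the storage backend, so the same document could in principle hold versions stored in different backends.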

logs - capped log collection
    action : log entry
    url : URL that the action relates to
    error : boolean error flag
    timestamp : UTC timestamp for log entry
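To make the "capped" behaviour concrete, the sketch below imitates it in memory with `collections.deque`. It assumes "capped" means a fixed-size collection that discards its oldest entries once full (as MongoDB capped collections do). The deque is only a stand-in for illustration, and the `log_action` helper, the capacity of 3, and the action strings are hypothetical.

```python
import collections
import datetime

# In-memory stand-in for a capped collection: holds at most 3 entries (arbitrary size).
logs = collections.deque(maxlen=3)


def log_action(action, url, error=False):
    """Append a log entry with the four fields listed above (hypothetical helper)."""
    entry = {
        "action": action,
        "url": url,
        "error": error,
        "timestamp": datetime.datetime.now(datetime.timezone.utc),
    }
    logs.append(entry)
    return entry


# Write five entries into a collection that can only hold three.
for i in range(5):
    log_action("update", "http://example.com/doc%d" % i, error=(i == 4))

# Only the three most recent entries remain; the two oldest were dropped.
remaining_urls = [entry["url"] for entry in logs]
```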

status - internal state
    update_queue : size of update queue

Storage Interface
=================
    storage_key : key under which the storage id is stored on each version
    put(tracked_doc, data, content_type) -> id
    get(id) -> file-like object
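A backend satisfying this interface could be sketched as follows. This is a toy in-memory implementation for illustration only, not one of Oyster's real backends: the class name `InMemoryStorage`, the `memory_id` storage key, and the use of `uuid` for ids are all assumptions.

```python
import io
import uuid


class InMemoryStorage:
    """Toy backend following the put/get interface above (hypothetical)."""

    # Key under which returned ids would be stored on each entry in `versions`.
    storage_key = "memory_id"

    def __init__(self):
        self._blobs = {}

    def put(self, tracked_doc, data, content_type):
        """Store raw bytes and return an id that get() accepts.

        `tracked_doc` is unused here; a real backend might use it to
        derive a path or key (an assumption, not documented behaviour).
        """
        blob_id = uuid.uuid4().hex
        self._blobs[blob_id] = (data, content_type)
        return blob_id

    def get(self, id):
        """Return a file-like object wrapping the stored bytes."""
        data, _content_type = self._blobs[id]
        return io.BytesIO(data)


storage = InMemoryStorage()
tracked_doc = {"url": "http://example.com/", "versions": []}
blob_id = storage.put(tracked_doc, b"<html></html>", "text/html")
# Record the returned id on the document under the backend's storage_key.
tracked_doc["versions"].append({storage.storage_key: blob_id})
content = storage.get(blob_id).read()
```

Because the id is recorded under `storage_key`, a reader of a `versions` entry can tell which backend to ask for the content.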