44 lines
1.3 KiB
ReStructuredText
44 lines
1.3 KiB
ReStructuredText
saucebrush |release|
|
|
====================
|
|
|
|
Overview
|
|
--------
|
|
|
|
saucebrush is a tool for writing ETL pipelines in pure python.
|
|
|
|
The basic premise of saucebrush is that you write `Recipe` that can then
|
|
be applied to data. A `Recipe` is a pipeline consisting of `sources`,
|
|
`filters`, and `sinks`.
|
|
|
|
A `source` is a simple object that yields one data one piece at a time.
|
|
An example of a source might be a CSV file or database, it is also possible
|
|
to write your own sources.
|
|
|
|
A `filter` is a function that takes a single record and returns a modified
|
|
version of that record. Writing a filter is as simple as writing a function
|
|
that modifies a single record in the desired way. A fairly comprehensive
|
|
suite of common filters is also available making it possible to do common
|
|
tasks without writing any of your own filters.
|
|
|
|
An `emitter` is actually a special case `filter` that doesn't modify
|
|
the record but instead writes data out in some way. Emitters can be hooked
|
|
in anywhere in your pipeline but are typically placed at the end to
|
|
save the results of a recipe. Similarly to `sources` filters exist for most
|
|
common formats (CSV, various SQL dialects, etc.) and it is also possible
|
|
to write your own emitter.
|
|
|
|
Contents:
|
|
|
|
.. toctree::
|
|
:maxdepth: 2
|
|
|
|
|
|
|
|
Indices and tables
|
|
==================
|
|
|
|
* :ref:`genindex`
|
|
* :ref:`modindex`
|
|
* :ref:`search`
|
|
|