Dat is an open source project for building automated, reproducible data pipelines that sync.
Dat Alpha Now Available
The Alpha release provides our first stable API and is intended for developers and early adopters. Read more here. The Beta release is now under active development.
Learn more at our getting started guide.
Everything in dat is built from streaming, non-blocking components, so you can work with large datasets and get immediate, real-time results without running out of RAM.
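To illustrate the general idea of streaming (this is not dat's own code), here is a minimal Python sketch that processes rows one at a time and emits a result after each row, so memory use stays flat no matter how large the input is. The function names `read_rows` and `running_total` and the sample data are hypothetical, chosen only for this example.

```python
import csv
import io
from typing import Iterable, Iterator


def read_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed CSV row at a time instead of loading the whole input."""
    yield from csv.DictReader(lines)


def running_total(rows: Iterable[dict], field: str) -> Iterator[float]:
    """Emit an updated total after every row, so results arrive immediately."""
    total = 0.0
    for row in rows:
        total += float(row[field])
        yield total


# Hypothetical sample input; in practice this could be a large file or a pipe.
sample = io.StringIO("name,amount\na,1.5\nb,2.5\nc,4.0\n")

# Each stage pulls one row at a time from the stage before it.
totals = list(running_total(read_rows(sample), "amount"))
```

Because each stage is a generator, only the current row is held in memory, and a consumer can act on the first total before the last row has been read.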
Made with Modules
Dat stores data locally, but you can easily configure it to store its tabular data in the database of your choice (e.g. PostgreSQL) and its files in external file stores (e.g. Google Drive).
You can stream data in and out of dat from the command line using any program that can read from stdin or write to stdout (e.g. R, Python, or Ruby), or you can use dat's built-in HTTP REST API.
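As a rough sketch of the producer side of such a pipe, the Python script below writes one JSON record per line to standard output. Newline-delimited JSON is an assumption made for this example, as is the helper name `write_ndjson`; check the getting started guide for the formats and commands dat actually accepts.

```python
import json
import sys
from typing import Iterable, TextIO


def write_ndjson(records: Iterable[dict], out: TextIO) -> int:
    """Write each record as one JSON line and return how many were written."""
    count = 0
    for record in records:
        out.write(json.dumps(record, sort_keys=True) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical records; a real script would read them from its own source.
    rows = ({"id": i, "value": i * i} for i in range(3))
    write_ndjson(rows, sys.stdout)
```

Saved under a hypothetical name such as `produce.py`, the script's output can be piped into any command that reads standard input. The exact dat invocation is deliberately left out here because it is not given in this text.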
