Crowdsourcing the liberation of data trapped in documents!

CrowData is a tool for collaborating on the verification or release of data that would otherwise be hard or impossible to extract with automated tools.

When to use Crowdata?


Similar projects

Crowdata was inspired by ProPublica's Free the Files project and by The Guardian's MPs' Expenses and Sarah Palin's Emails projects.
It grew out of La Nacion's need to turn scanned-image PDFs into a comprehensible, structured dataset, and to ask for the community's help in cataloging the expenses that catch their attention.

Here are some projects that do the same for specific cases.


Crowdata is an open source project that began when Manuel Aristaran was an Open News fellow at La Nacion in 2013. It was released as free software when Gabriela Rodriguez continued its development for VozData in 2014. Thanks to Cristian Bertelegni and La Nacion for contributing to the code.

It now relies on contributions from people and organizations. Please use it, comment on it, and contribute improvements through pull requests on GitHub.


  • Fork the repo
  • Clone your fork
  • Create a branch for your changes
  • Open a pull request on GitHub and clearly describe your changes
