Preparing for iNaturalist

via make.wordpress.org => original post link

Today we merged a large and significant set of changes to the iNaturalist DAG, contributed by @beccawidom! This PR includes a number of changes, namely:

The transformation steps have changed from “CSV -> Postgres -> TSV -> Postgres” to “CSV -> Postgres -> Postgres”. This significantly reduces disk usage, run time, and processing overhead, and was a necessary change in order to process all of the iNaturalist data in a reasonable timeframe. It also serves as a proof of concept for future bulk data imports, since the transformation and data cleaning steps now happen entirely in SQL (an Openverse first!).

Images are now connected with the Catalog of Life, which provides English vernacular names. This should help improve search relevancy compared with relying on scientific names alone.
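
To make the "CSV -> Postgres -> Postgres" idea a bit more concrete, here is a minimal sketch of the general pattern: load raw CSV rows into staging tables, then clean and join them in a single SQL statement, with no intermediate file. It is not the actual DAG code. It uses Python's built-in SQLite as a stand-in for Postgres so it can run anywhere, and every table name, column name, and sample row is hypothetical. Falling back to the scientific name when no vernacular name exists is also just an assumption made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows standing in for an iNaturalist photo export.
PHOTOS_CSV = """photo_id,taxon_id,license
1,100,CC-BY
2,101,CC0
3,100,
"""

# Hypothetical lookup standing in for Catalog of Life vernacular names.
TAXA_CSV = """taxon_id,scientific_name,vernacular_name
100,Danaus plexippus,Monarch
101,Quercus alba,
"""


def load_csv(conn, table, text):
    """Step 1 (CSV -> database): copy raw rows into a staging table as-is."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)


def transform(conn):
    """Step 2 (database -> database): clean and join entirely in SQL."""
    conn.execute(
        """
        CREATE TABLE image AS
        SELECT
            p.photo_id AS foreign_id,
            LOWER(p.license) AS license,
            -- Prefer the vernacular name; fall back to the scientific name.
            COALESCE(NULLIF(t.vernacular_name, ''), t.scientific_name) AS title
        FROM staging_photos AS p
        JOIN staging_taxa AS t ON t.taxon_id = p.taxon_id
        WHERE p.license <> ''  -- drop rows with no license
        """
    )


def run():
    conn = sqlite3.connect(":memory:")
    load_csv(conn, "staging_photos", PHOTOS_CSV)
    load_csv(conn, "staging_taxa", TAXA_CSV)
    transform(conn)
    query = "SELECT foreign_id, license, title FROM image ORDER BY foreign_id"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in run():
        print(row)
```

Because the cleaning and the name lookup both happen inside the database, nothing needs to be written back out to disk between the load and the final table, which is where the savings described above would come from.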