Developing systems for non-model organism de novo genome assembly and annotation

Note: This project is now complete. BioCommons support is continuing through community engagement.


All are welcome to join the genome assembly or genome annotation communities.


The objectives for this activity are to thoughtfully design, with extensive and representative community engagement, and then implement a service (or a number of closely related services) to support communities of researchers wishing to undertake de novo genome assembly and genome annotation across a variety of taxa, including but not limited to plants and animals.

The mature service(s) required for genome assembly and annotation are intended to be developed iteratively over 2+ years, progressing from basic to advanced functionality.

Specific topics within this program of work include: 

  • Documenting the tools and infrastructure currently used by the research community for de novo genome assembly and annotation, including challenges faced that prevent working in a collaborative and/or efficient manner.

  • Eliciting collective, agreed requirements from the community for shared national infrastructure that could be built to support de novo genome assembly and genome annotation.

  • Producing community-endorsed infrastructure roadmap documents outlining the features of shared national infrastructure that could be built to support both de novo genome assembly and genome annotation.

  • Producing design specifications outlining the features and attributes of de novo genome assembly and genome annotation services (based on the infrastructure roadmap documents).

  • Working closely with the BioCloud implementation team to ensure systems for de novo genome assembly are deployed for piloting by a core set of researchers, and to ensure services that have enjoyed a successful pilot phase are then deployed as part of the BioCommons (e.g. a hosted Apollo Platform to support manual collaborative curation of genome annotations).

  • Continued planning for establishment of a shared infrastructure to support genome annotation.



Project partners and contributors include