My work on data sanitization for functional genomics data is out in Cell. In this study, we addressed the most obvious source of genome privacy leakage, the next-generation sequencing reads of functional genomics data, with practical solutions. Our goal is to allow researchers to share raw functional genomics data while preserving both privacy and utility. We hope that this will see widespread use and will democratize access to the thousands of ChIP-seq, single-cell or bulk RNA-seq, ATAC-seq, and other raw functional genomics datasets that are currently behind firewalls.
We tried to create robust software with a detailed README. Check it out here. Our GitHub repository also contains Docker images and WDL scripts to run it in any environment. Please feel free to contact me or open an issue on GitHub if you have any feedback, whether on the implementation or on improving privacy and utility.