MethPipe Release History

MethPipe: a computational pipeline for analyzing bisulfite sequencing data
http://smithlabresearch.org/software/methpipe/

Release History (can also be viewed on our GitHub repo):

* Version 5.0.0 [2021-07-01]

  • This major release of MethPipe changes the first steps in the pipeline by directly accepting SAM/BAM input files from a wide range of mappers. Continuing from release 4.1.2-alpha, we have discontinued to-mr and replaced it with format_reads, which standardizes the SAM output of different mappers into a format understood by the subsequent MethPipe tools.

    The new pipeline, including each tool available in MethPipe, is documented in the manual under docs/methpipe-manual.pdf. The manual also describes how to map reads to a FASTA reference genome using abismal.

    Besides the transition to SAM input, we have addressed the following issues:

    • Issue #161 was fixed: when compiled with clang (the default compiler on macOS), several programs were failing and reporting “bad input”.
    • Issues #177, #183 and #184 were addressed: FASTA files whose sequence names contained spaces were causing problems in methcounts. FASTA inputs can now have arbitrary sequence names as long as the first word of each chromosome name is unique.
    • Issue #180 has been fixed, and users can now set LDFLAGS and CPPFLAGS when building from a clone of the repository.
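
    The uniqueness requirement on chromosome names can be checked before running the pipeline. The Python sketch below is an illustration only and is not part of MethPipe; the function name `duplicated_first_words` is hypothetical. It takes the first whitespace-delimited word of each FASTA header line and reports any word that occurs more than once.

```python
from collections import Counter
from typing import Iterable, List


def duplicated_first_words(fasta_lines: Iterable[str]) -> List[str]:
    """Return first words of FASTA headers that appear more than once."""
    counts = Counter()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # An empty header has no first word; record it as "".
            counts[fields[0] if fields else ""] += 1
    return sorted(name for name, n in counts.items() if n > 1)


# Made-up headers: two of them share the first word "chr1".
example = [
    ">chr1 Homo sapiens chromosome 1",
    "ACGT",
    ">chr2 Homo sapiens chromosome 2",
    "ACGT",
    ">chr1 alternate assembly",
    "ACGT",
]
print(duplicated_first_words(example))
```

    An empty result means every chromosome name has a unique first word.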

* Version 4.1.2-alpha [2020-12-20]

  • This is a pre-release that focuses on the full transition from the MR to the SAM file format. Much of the functionality of 4.1.2 is similar to 4.1.1, but mr files are no longer supported. This means some major changes in installation and the downstream pipeline:

    • HTSLib is no longer an optional dependency. It can be installed through apt or brew, or by following its installation guidelines. HTSLib is used for BAM file decompression, so BAM and SAM files can be used interchangeably.
    • We recommend mapping reads using the new abismal software tool, which generates SAM output.
    • to-mr is replaced by format_reads, which must be run on abismal output as well as on SAM files generated by other mappers. This tool converts paired-end reads to single-end by merging mates, and puts single-end mapped reads into the standardized format used by downstream tools. For SAM files from other mappers, the mapper of origin must be specified with the -f flag (currently supported mappers are abismal, bismark and bsmap).
    • bsrate, duplicate-remover, methcounts and methstates now use the formatted SAM files as input.

* Version 4.1.1 [2020-06-23]

# Bug fixes

  • Fixes a problem with the MappedRead class that was crashing programs that used it.

* Version 4.0.0 [2019-07-30]

# Enhancements

  • Major feature added: capacity to read gzip format in many tools
  • Code now links to HTSlib for any BAM/SAM format I/O (optional)
  • This version requires that the smithlab_cpp library is built separately (this will likely change in a future release).

* Version 3.4.3 [2017-03-06 Mon]
# Bug fixes

  1. Fixed a bug in the `dmr` program, which used to report fewer DMRs than it should have.

# Enhancements

  1. The `methcounts` format now allows user-specified header lines, starting with “#”, to provide more information. All downstream programs now accept these header lines.
  2. The manual is updated with information about the new WGBS mapper `walt`.
  3. The whole package is now based on C++11, allowing future enhancements.
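
  For scripts that read `methcounts` files outside of MethPipe, the header lines described in item 1 need to be separated from the data. The Python sketch below is one way to do that and is not taken from the MethPipe sources; the function name `split_header` is hypothetical, and the record in the example is made up, since the column layout is not specified in these notes.

```python
from typing import Iterable, List, Tuple


def split_header(lines: Iterable[str]) -> Tuple[List[str], List[List[str]]]:
    """Separate '#' header lines from whitespace-delimited data records."""
    header, records = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            header.append(line[1:].strip())
        else:
            records.append(line.split())
    return header, records


# Made-up input: one header line and one illustrative record.
lines = ["# sample: example\n", "chr1\t100\t+\tCpG\t0.75\t8\n"]
header, records = split_header(lines)
print(header)
print(records)
```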

* Version 3.4.2 [2015-11-13 Fri]
# Bug fixes

  1. Compressed source archives generated by GitHub did not contain the smithlab_cpp submodule, so the package downloaded from the website could not be compiled.

* Version 3.4.1 [2015-11-11 Wed]
# Bug fixes

  1. Fixed bug in methcounts caused by improperly checking for write-privileges before opening output file stream

* Version 3.4.0 [2015-11-01 Sun]
# Bug fixes

  1. Fixed bug in bsrate causing abort while processing reads that hang over the end of a chromosome (and added warning message).
  2. The mutation tracking in methcounts introduced a bug in merge-methcounts when parsing mutated contexts. merge-methcounts now only counts non-mutated site information and doesn’t break while parsing mutated contexts.
  3. duplicate-remover statistics tracking now functions properly: previously the good_bases_out value was not correct when using the sequence info option.
  4. allelicmeth no longer throws a “could not convert” exception when converting the index of the last CpG back to genomic coordinates.
  5. hmr and pmd no longer skip the last domain of each chromosome
  6. merge-bsrate output now includes previously omitted headers, consistent with bsrate output
  7. Fixed hypermr numerical issues by renormalizing learned posterior and transition probabilities at each iteration of the Baum-Welch training.
  8. to-mr now ignores discordant pairs
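
  The renormalization mentioned in item 7 can be illustrated in a few lines. This Python sketch is not the hypermr code; it only shows the general idea, under the assumption that transition probabilities are stored as a row-stochastic matrix whose rows drift away from summing to 1 through floating-point error. The function name `renormalize_rows` is hypothetical.

```python
from typing import List


def renormalize_rows(matrix: List[List[float]]) -> List[List[float]]:
    """Scale each row so its entries sum to 1; leave all-zero rows unchanged."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([x / total for x in row] if total > 0 else list(row))
    return result


# Rows that have drifted slightly away from summing to 1.
drifted = [[0.9000001, 0.1000002], [0.2999999, 0.6999998]]
fixed = renormalize_rows(drifted)
print([sum(row) for row in fixed])
```

  Applying such a step after every training iteration keeps small errors from accumulating across iterations.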

# Enhancements

  1. WALT, our new space-seeded wildcard bisulfite read mapper, has been integrated into our manual as the de facto mapper in methpipe. It can be cloned from https://github.com/smithlabcode/walt
  2. radmeth, a recently developed tool for multi-factor, multi-replicate differential methylation analysis, has been integrated into methpipe. A detailed description of the functionality is available in the manual.
  3. bigwig_to_methcounts.py introduced as a tool to convert tracks downloaded from MethBase on the UCSC genome browser to methcounts format. A description is in the manual.
  4. methcounts memory usage has been halved, and the program now prints an estimate of memory usage for each chromosome
  5. methcounts now prints every cytosine in the reference by default, even if an entire chromosome is not covered, to maintain line number consistency across samples
  6. merge-methcounts now provides an option to output the merged methylomes in a table format for easy piped downstream analysis. It now prints the union of CpG sites from its input files, even when the number of CpG sites differs between files.
  7. roimethstat memory usage reduced drastically by loading only CpGs within target regions into memory.
  8. Added option to hmr to specify random number generator seed, allowing user to exactly reproduce results if necessary.
  9. hmr now reports the “effective genome proportion,” the percentage of the genome not in deserts, in the verbose output.
  10. hmr no longer has nondeterministic behavior.
  11. The levels output format has changed slightly and is now technically YAML.

# Organizational changes

  1. Substantial changes to the manual to introduce WALT as our new standard read mapper
  2. Removed the experimental directory from the makefile build to prevent users from running programs that are not rigorously tested or production-ready
  3. Removed some outdated and MethBase-related material.

* Version 3.3.1 [2014-08-05 Tue]
# Bug fixes

  1. Fixed a small epiread input bug introduced in amrfinder in the most recent release.

* Version 3.3.0 [2014-08-04 Mon]
# Bug fixes

  1. Fixed corner-case rounding issues in several programs caused by inconsistent use of various rounding functions. They have been replaced by std::tr1::round.
  2. Removed the buffer used by methcounts and allelicmeth, which led to unpredictable memory usage in jackpot regions of unrealistically high coverage. This change has stabilized the memory usage of these two programs at 8GB for the human genome.
  3. amrfinder no longer fails silently on incorrectly formatted epiread input files

# Enhancements

  1. methcounts now produces strand- and context-specific methylation levels and reports cytosines that have mutated away from the reference genome.
  2. methcounts now has an option to produce only CpG methylation levels. This limits the output size from 25GB to 1GB for the human genome.
  3. A new tool, symmetric-cpgs, has been added under utils to merge information of symmetric CpGs so that the output of methcounts is compatible with other analysis tools.
  4. levels now produces all information the methcounts statistics file contained as well as methylation levels for each cytosine context and mutation rate information.
  5. methdiff now contains an option to restrict output to sites with nonzero coverage
  6. methdiff output is now more consistent with methcounts output; the changes are detailed in the manual
  7. library and merge-methylomes have been updated to reflect changes to methcounts output.
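
  The idea behind merging symmetric CpGs (item 3) can be sketched as follows. This Python example is an illustration under stated assumptions and not the symmetric-cpgs implementation: each site is assumed to be a tuple (position, strand, level, coverage), the cytosine on the “-” strand of a CpG is assumed to sit one position after the one on the “+” strand, and the merged level is taken as the coverage-weighted mean. The function name `merge_symmetric` is hypothetical.

```python
from typing import Dict, List, Tuple

Site = Tuple[int, str, float, int]  # (position, strand, level, coverage)


def merge_symmetric(sites: List[Site]) -> List[Tuple[int, float, int]]:
    """Merge each '+' site at p with the '-' site at p + 1, if present.

    '-' sites without a '+' partner are dropped in this sketch.
    """
    minus: Dict[int, Tuple[float, int]] = {
        pos: (level, cov) for pos, strand, level, cov in sites if strand == "-"
    }
    merged = []
    for pos, strand, level, cov in sites:
        if strand != "+":
            continue
        m_level, m_cov = minus.get(pos + 1, (0.0, 0))
        total = cov + m_cov
        # Coverage-weighted mean; a site with no reads keeps level 0.0.
        combined = (level * cov + m_level * m_cov) / total if total else 0.0
        merged.append((pos, combined, total))
    return merged


# Made-up sites: one full CpG pair and one '+' site without a partner.
sites = [(100, "+", 0.5, 4), (101, "-", 1.0, 4), (200, "+", 0.25, 8)]
print(merge_symmetric(sites))
```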

# Organizational changes

  1. Substantial changes to the manual to highlight the change in usage of methcounts output with downstream analysis tools
  2. A PDF copy of the manual is included in /docs.

* Version 3.2.0 [2014-06-19 Thu]
# Bug fixes

  1. Fixed strand error for methcounts when dealing with 0 coverage sites at the tail of chromosomes. This prevents an error in merge-methcounts.
  2. Removed some unused variables to prevent unnecessary compilation warning messages.
  3. Fixed a compilation error on Mac OS X 10.9 Mavericks caused by the samtools dependency.
  4. Fixed a bug in to-mr when the input file name does not have an extension.
  5. Fixed silent failure in amrfinder when no AMRs were found
  6. Fixed a bug in amrfinder related to conservatively biased likelihood values in low coverage samples.
  7. Fixed a bug in amrfinder related to the likelihood penalty for imbalanced alleles.

# Enhancements

  1. merge-methcounts now reports a more specific error message when input files are not synchronized.
  2. The manual now describes usage of the following programs: methentropy, fastLiftOver.

# Organizational changes

  1. hmr_plant has been renamed to hypermr to reflect its more general usage.
  2. A PDF copy of the manual is included in /docs.
  3. merge-bsrate, merge-methcounts, and duplicate-remover moved to the /utils directory
  4. Deprecated programs reorder and clipmates have been removed.
  5. Directories created for mlml and amrfinder and related programs to differentiate standalone published tools from core methpipe functions.

* Version 3.1.1 [2014-05-07 Wed]

  1. Integrated MLML into analysis directory.
  2. Enhancements to amrfinder and other programs
  3. MethPipe is now on GitHub: https://github.com/smithlabcode/methpipe

* Version 3.0.1 [2013-10-04 Fri]

  1. Fixed a bug when reading sequence file containing multiple chromosomes in methcounts, allelicmeth and amrfinder
  2. Updated to-mr substantially to use the samtools APIs. It now natively accepts SAM and BAM files as input. Bugs in interpreting the meaning of FLAGs in bismark output are fixed
  3. Thanks to Mark Robinson and Charity Law for pointing out the issues above
  4. Streamlined I/O of epiread files; fixed some bugs and changed the default parameters in amrfinder and amrtester

* Version 3.0.0 [2013-09-06 Fri]

  1. Initial public release accompanying the submission of the MethPipe paper