CLC Genomics Workbench 6.5.1 released

Release date: 2013-10-21


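New features:

VCF export can now report variants in diploid form, so that the exported VCF file can be filtered with other software that relies on two alleles being reported per line. As part of this change, the CLCAD tag has been replaced by CLCAD2. If you export VCF files from a workflow, please read the special note that follows.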

Special note on VCF export

For VCF export, counts from the variant track are put in CLCAD2 tags and coverage in PL tags. The values of the CLCAD2 tag follow the order of REF and ALT, with one value for the REF and one for each ALT. When the GT field contains no observation of a given REF or ALT, the corresponding CLCAD2 value will be 0 (for example, for a homozygous variant the CLCAD2 value for the REF will be 0). This does not mean that the original mapping had no reads with that sequence; it means that the variant track being exported does not contain that variant.
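To illustrate the ordering described above, the following Python sketch pairs the CLCAD2 values of one VCF data line with its REF and ALT alleles. The function name `clcad2_counts` and the example record are hypothetical, and the sketch assumes a single-sample file whose FORMAT column lists CLCAD2.

```python
def clcad2_counts(vcf_line):
    """Map each allele (REF first, then every ALT) to its CLCAD2 count.

    Assumes a tab-separated, single-sample VCF data line in which the
    FORMAT column (9th field) contains a CLCAD2 key.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    format_keys = fields[8].split(":")
    sample_values = fields[9].split(":")
    sample = dict(zip(format_keys, sample_values))

    # CLCAD2 follows the order REF, ALT1, ALT2, ...
    alleles = [ref] + alt.split(",")
    counts = [int(value) for value in sample["CLCAD2"].split(",")]
    if len(counts) != len(alleles):
        raise ValueError("CLCAD2 must hold one value per REF/ALT allele")
    return dict(zip(alleles, counts))


# Hypothetical record for a homozygous variant: the REF count is 0
# because the exported variant track holds no reference allele here.
example = "chr1\t100\t.\tA\tG\t.\t.\t.\tGT:CLCAD2\t1:0,25"
print(clcad2_counts(example))
```

A REF count of 0 in such a record should be read as "not present in the exported track", not as "no reads supported the reference".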

When exporting VCF files, there are three options:

Reference sequence track

Since the VCF format specifies that reference and allele sequences cannot be empty, deletions and insertions have to be padded with bases from the reference sequence. The export needs access to the reference sequence track in order to find the neighboring bases.
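The padding step can be sketched as follows. This is a minimal illustration of the VCF convention (pad with the preceding reference base, or with the following base when the event starts at the first position), not the Workbench's actual implementation. The function name `pad_indel` and its 0-based `start` argument are assumptions made for the example.

```python
def pad_indel(reference, start, ref_allele, alt_allele):
    """Return (POS, REF, ALT) with VCF-style padding.

    reference  : full reference sequence as a string
    start      : 0-based index of the first affected reference base
                 (for an insertion, the base the new sequence precedes)
    ref_allele : deleted/replaced bases ("" for a pure insertion)
    alt_allele : inserted/replacing bases ("" for a pure deletion)
    POS in the result is 1-based, as in VCF.
    """
    if ref_allele and alt_allele:
        # Neither allele is empty, so no padding is needed.
        return start + 1, ref_allele, alt_allele
    if start == 0:
        # No preceding base exists: pad with the base after the event.
        following = reference[len(ref_allele)]
        return 1, ref_allele + following, alt_allele + following
    preceding = reference[start - 1]
    # POS moves one base to the left, onto the padding base.
    return start, preceding + ref_allele, preceding + alt_allele


reference = "ACGTAC"
print(pad_indel(reference, 2, "GT", ""))   # deletion of GT
print(pad_indel(reference, 3, "", "TT"))   # insertion of TT before T
```

Because the padding base comes from the reference, the export cannot be done from the variant track alone, which is why the reference sequence track has to be supplied.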

Enforce diploid export

For variants that are homozygous, the VCF export will create one entry in the GT field, unless you choose to Enforce diploid export. If you do that, a homozygous variant will be reported with two entries, separated by "/". If you export a variant track that has been filtered, there can be situations where there is only one heterozygous variant at a given position. In this case, the CLC Genomics Workbench will use a "." to denote an unknown genotype, so the GT field will be "1/.".

It is important to note that this Enforce diploid export option will create a diploid format of the VCF file, but it is not able to recover any inconsistencies in the variant track used as input. If the variant track has three variants at a given position, three genotypes will be output. Or if the variant track has two variants at the same position that both postulate to be homozygous, they will be output as two heterozygous variants. When exporting data created by the variant callers of CLC Genomics Workbench, this is usually not a problem, but when applying this diploid scheme to data that has been imported into the CLC Genomics Workbench from other sources, the data can be inconsistent with a diploid model.
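The cases described above can be summarised in a short Python sketch. It only mirrors the behaviour stated in this note; the function name `enforce_diploid_gt` and the `(allele_index, zygosity)` input representation are hypothetical.

```python
def enforce_diploid_gt(calls):
    """Build a GT string for one position under enforced diploid export.

    calls: list of (allele_index, zygosity) tuples for the variants found
           at the position, where allele_index is 0 for REF and 1.. for
           ALT, and zygosity is "homozygous" or "heterozygous".
    """
    if len(calls) == 1:
        index, zygosity = calls[0]
        if zygosity == "homozygous":
            # A homozygous variant is reported with two entries.
            return f"{index}/{index}"
        # A lone heterozygous variant: the other allele is unknown.
        return f"{index}/."
    # Several variants: one genotype entry per variant, even if that
    # yields three entries or contradicts the stated zygosity.
    return "/".join(str(index) for index, _ in calls)


print(enforce_diploid_gt([(1, "homozygous")]))
print(enforce_diploid_gt([(1, "heterozygous")]))
print(enforce_diploid_gt([(1, "homozygous"), (2, "homozygous")]))
```

The last call shows the inconsistency the note warns about: two variants that each claim to be homozygous come out as a single heterozygous-looking genotype.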

Exceptions

Some chromosomes can be excepted from the enforced diploid export. For a human genome, that would be relevant for the mitochondrion and for male X and Y chromosomes. For this option, you can select which chromosomes should be excepted. They will be exported in the standard way without assuming there should be two genotypes, and homozygous calls will just have one value in the GT field.
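As a rough sketch of how the exception list interacts with the diploid option for a homozygous call, consider the following Python function. The name `homozygous_gt` and the chromosome names "X", "Y" and "MT" are assumptions for illustration; actual names depend on the reference genome in use.

```python
def homozygous_gt(chromosome, allele_index, enforce_diploid, exceptions=frozenset()):
    """Return the GT value for a homozygous call.

    Excepted chromosomes are exported in the standard way, with a single
    value, even when diploid export is enforced.
    """
    if enforce_diploid and chromosome not in exceptions:
        return f"{allele_index}/{allele_index}"
    return str(allele_index)


# Hypothetical exception list for a male human sample.
male_exceptions = frozenset({"X", "Y", "MT"})
print(homozygous_gt("1", 1, True, male_exceptions))
print(homozygous_gt("MT", 1, True, male_exceptions))
```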

Heat maps: a legend can now be shown on the heat map.


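Improvements:

  • The variant comparison tools can now be used in workflows.

  • When importing nucleotide sequence files in GenBank format, the Workbench now determines whether a sequence is DNA or RNA from the sequence itself rather than from the description in the file.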


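Bug fixes:

  • Fixed: ensembl-style GTF files were interpreted incorrectly when imported with Download Genome or Import | Tracks. The problem only affects Genomics Workbench 6.5. If you used Download Genome or Import | Tracks in Genomics Workbench 6.5 to obtain gene annotations, we strongly recommend that you delete the annotation files and download or import the annotations again. Versions before 6.5 are not affected.

  • Fixed: an error when importing variant files from UCSC, affecting variants on the negative strand whose allele sequence is longer than one base. This also affects dbSNP tracks downloaded with Download Genome. We strongly recommend that you delete any variant files imported or downloaded from UCSC and import or download them again.

  • Fixed: when several control reads tracks were supplied to Filter Against Control Reads, only the first one was used. If you have filtered variants against several control reads tracks at once, we strongly recommend that you rerun the analysis.

  • Fixed: quality control of sequencing data using vector information from UniVec failed.

  • Fixed: ChIP-seq analysis could not be run without a control sample.

  • Fixed: GFF export failed.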

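  • SAM/BAM import: reads mapped to reference sequences that were not provided during import are no longer included among the unmapped reads. These reads are not imported at all, and the history of the imported data records how many reads were ignored.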

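  • The information shown in the License dialog has been improved.

  • Fixed: in the track view, coloring reads on quality score did not affect unaligned ends.

  • Fixed: in the track view, coloring reads on quality score did not handle paired reads correctly.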

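  • Improvements and fixes to the Indel and Structural Variation tool:

      • Improved detection of insertions and deletions from self-mapping evidence, which is particularly relevant for amplicon data.

      • Fixed: some variants that were in fact insertions or deletions were reported as 'replacements'.

      • Fixed: an error in structural variant detection when examining long unaligned ends.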

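  • Fixed: in the trim dialog, an adapter list could not be deselected without resetting the settings to their defaults.

  • Fixed: in trio analysis, homozygous variants on the X chromosome, the Y chromosome and the mitochondrial genome were incorrectly marked as de novo when they were not found in the father.

  • Fixed: SAM and BAM export now supports direct gzip and zip compression.

  • Fixed: copying information from the Folder Contents view did not work.

  • Fixed: an out-of-memory error was reported when building a phylogenetic tree by maximum likelihood with bootstrapping.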
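The UCSC variant import fix listed above concerns negative-strand alleles longer than one base. These notes do not describe the cause of the bug, so the following Python sketch only illustrates why such alleles are a sensitive case: for a single base, complementing is enough to move an allele to the forward strand, whereas a longer allele must be reversed as well. The function name `to_forward_strand` is hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def to_forward_strand(allele, strand):
    """Return the allele as it reads on the forward strand.

    strand is "+" or "-". Alleles on "-" are reverse-complemented;
    characters other than A, C, G and T are left unchanged.
    """
    if strand == "-":
        return allele.translate(_COMPLEMENT)[::-1]
    return allele


print(to_forward_strand("A", "-"))    # single base: complement only
print(to_forward_strand("AG", "-"))   # longer allele: order matters too
```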
