Package vcf
Class VcfRec
java.lang.Object
  vcf.VcfRec
All Implemented Interfaces:
IntArray, DuplicatesGTRec, GTRec, MarkerContainer
public final class VcfRec extends java.lang.Object implements GTRec
Class VcfRec represents a VCF record. If one allele in a diploid genotype is missing, then both alleles are set to missing. Instances of class VcfRec are immutable.
-
-
Method Summary
int allele1(int sample)
Returns the first allele for the specified sample or -1 if the allele is missing.
int allele2(int sample)
Returns the second allele for the specified sample or -1 if the allele is missing.
int[] alleles()
Returns an array of length this.size() whose j-th element is equal to this.get(j).
java.lang.String filter()
Returns the FILTER field.
java.lang.String format()
Returns the FORMAT field.
java.lang.String[] formatData(java.lang.String formatCode)
Returns an array of length this.size() containing the specified FORMAT subfield data for each sample.
int formatIndex(java.lang.String formatCode)
Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
java.lang.String formatSubfield(int subfieldIndex)
Returns the specified FORMAT subfield.
static VcfRec fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRec instance from a VCF record and its GL or PL format subfield data.
static VcfRec fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord)
Constructs and returns a new VcfRec instance from a VCF record and its GT format subfield data.
static VcfRec fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRec instance from a VCF record and its GT, GL, and PL format subfield data.
int get(int hap)
Returns the specified allele for the specified haplotype or -1 if the allele is missing.
float gl(int sample, int allele1, int allele2)
Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype.
static int gtIndex(int a1, int a2)
Returns the VCF genotype index for the specified pair of alleles.
boolean hasFormat(java.lang.String formatCode)
Returns true if the specified FORMAT subfield is present, and returns false otherwise.
java.lang.String info()
Returns the INFO field.
boolean isPhased()
Returns true if every genotype for each sample is a phased, non-missing genotype, and returns false otherwise.
boolean isPhased(int sample)
Returns true if the genotype for the specified sample has non-missing alleles and is either haploid or diploid with a phased allele separator, and returns false otherwise.
Marker marker()
Returns the marker.
int nFormatSubfields()
Returns the number of FORMAT subfields.
java.lang.String qual()
Returns the QUAL field.
java.lang.String sampleData(int sample)
Returns the data for the specified sample.
java.lang.String sampleData(int sample, int subfieldIndex)
Returns the specified data for the specified sample.
java.lang.String sampleData(int sample, java.lang.String formatCode)
Returns the specified data for the specified sample.
Samples samples()
Returns the list of samples.
int size()
Returns the number of haplotypes.
java.lang.String toString()
Returns the VCF record.
VcfHeader vcfHeader()
Returns the VCF meta-information lines and the VCF header line.
-
-
-
Field Detail
-
GL_FORMAT
public static final java.lang.String GL_FORMAT
The VCF FORMAT code for log-scaled genotype likelihood data: "GL".
See Also: Constant Field Values
-
PL_FORMAT
public static final java.lang.String PL_FORMAT
The VCF FORMAT code for phred-scaled genotype likelihood data: "PL".
See Also: Constant Field Values
-
-
Method Detail
-
gtIndex
public static int gtIndex(int a1, int a2)
Returns the VCF genotype index for the specified pair of alleles.
Parameters:
a1 - the first allele
a2 - the second allele
Returns:
the VCF genotype index for the specified pair of alleles
Throws:
java.lang.IllegalArgumentException - if a1 < 0 || a2 < 0
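The VCF specification orders diploid genotypes so that genotype j/k with j <= k has index k(k+1)/2 + j. The following is a minimal sketch of that formula, assuming gtIndex follows the specification's ordering and accepts the two alleles in either order. The class GtIndexSketch is hypothetical and is not part of the vcf package.

```java
/** Hypothetical stand-in illustrating the VCF specification's genotype ordering. */
final class GtIndexSketch {

    /** Returns k*(k+1)/2 + j, where j = min(a1, a2) and k = max(a1, a2). */
    static int gtIndex(int a1, int a2) {
        if (a1 < 0 || a2 < 0) {
            throw new IllegalArgumentException("negative allele: " + a1 + ", " + a2);
        }
        int lo = Math.min(a1, a2);
        int hi = Math.max(a1, a2);
        return hi * (hi + 1) / 2 + lo;
    }

    public static void main(String[] args) {
        // Genotypes 0/0, 0/1, 1/1, 0/2, 1/2, 2/2 map to indices 0 through 5.
        System.out.println(gtIndex(0, 0) + " " + gtIndex(0, 1) + " " + gtIndex(1, 1)
                + " " + gtIndex(0, 2) + " " + gtIndex(1, 2) + " " + gtIndex(2, 2));
    }
}
```

Under this ordering, a marker with n alleles has n(n+1)/2 unordered diploid genotypes, which is the expected number of comma-separated values in a GL or PL subfield.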
-
fromGT
public static VcfRec fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord)
Constructs and returns a new VcfRec instance from a VCF record and its GT format subfield data.
Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT format field corresponding to the specified vcfHeader object
Returns:
a new VcfRec instance
Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GT format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
-
fromGL
public static VcfRec fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRec instance from a VCF record and its GL or PL format subfield data. If both GL and PL format subfields are present, the GL format subfield will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GL or PL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
Returns:
a new VcfRec instance
Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GL or PL format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
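In the VCF specification, GL values are log10-scaled likelihoods and PL values are phred-scaled (PL = -10 log10 of the likelihood). The sketch below converts either encoding to likelihoods normalized so that the largest value is 1.0. It is an illustration only: the class LikelihoodSketch and its methods are hypothetical, and the thresholding step controlled by maxLR is deliberately not modeled because its exact rule is not stated here.

```java
/** Hypothetical helper: converts GL or PL values to max-normalized likelihoods. */
final class LikelihoodSketch {

    /** Converts log10-scaled GL values to likelihoods scaled so the maximum is 1.0. */
    static float[] fromGL(double[] gl) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : gl) {
            max = Math.max(max, v);
        }
        float[] like = new float[gl.length];
        for (int j = 0; j < gl.length; ++j) {
            // Subtracting the maximum in log space makes the largest likelihood 1.0.
            like[j] = (float) Math.pow(10.0, gl[j] - max);
        }
        return like;
    }

    /** Converts phred-scaled PL values by first mapping them to log10 scale. */
    static float[] fromPL(int[] pl) {
        double[] gl = new double[pl.length];
        for (int j = 0; j < pl.length; ++j) {
            gl[j] = -pl[j] / 10.0;
        }
        return fromGL(gl);
    }

    public static void main(String[] args) {
        float[] like = fromPL(new int[] {0, 10, 20});
        System.out.println(like[0] + " " + like[1] + " " + like[2]);
    }
}
```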
-
fromGTGL
public static VcfRec fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRec instance from a VCF record and its GT, GL, and PL format subfield data. If the GT format subfield is present and non-missing, the GT format subfield is used to determine genotype likelihoods. Otherwise the GL or PL format subfield is used to determine genotype likelihoods. If both the GL and PL format subfields are present, only the GL format subfield will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT, a GL or a PL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
Returns:
a new VcfRec instance
Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GT, GL, or PL format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
-
qual
public java.lang.String qual()
Returns the QUAL field.
Returns:
the QUAL field
-
filter
public java.lang.String filter()
Returns the FILTER field.
Returns:
the FILTER field
-
info
public java.lang.String info()
Returns the INFO field.
Returns:
the INFO field
-
format
public java.lang.String format()
Returns the FORMAT field. Returns the empty string ("") if the FORMAT field is missing.
Returns:
the FORMAT field
-
nFormatSubfields
public int nFormatSubfields()
Returns the number of FORMAT subfields.
Returns:
the number of FORMAT subfields
-
formatSubfield
public java.lang.String formatSubfield(int subfieldIndex)
Returns the specified FORMAT subfield.
Parameters:
subfieldIndex - a FORMAT subfield index
Returns:
the specified FORMAT subfield
Throws:
java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
-
hasFormat
public boolean hasFormat(java.lang.String formatCode)
Returns true if the specified FORMAT subfield is present, and returns false otherwise.
Parameters:
formatCode - a FORMAT subfield code
Returns:
true if the specified FORMAT subfield is present
-
formatIndex
public int formatIndex(java.lang.String formatCode)
Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
Parameters:
formatCode - a FORMAT subfield code
Returns:
the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and -1 otherwise
-
sampleData
public java.lang.String sampleData(int sample)
Returns the data for the specified sample.
Parameters:
sample - a sample index
Returns:
the data for the specified sample
Throws:
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
-
sampleData
public java.lang.String sampleData(int sample, java.lang.String formatCode)
Returns the specified data for the specified sample.
Parameters:
sample - a sample index
formatCode - a FORMAT subfield code
Returns:
the specified data for the specified sample
Throws:
java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
-
sampleData
public java.lang.String sampleData(int sample, int subfieldIndex)
Returns the specified data for the specified sample.
Parameters:
sample - a sample index
subfieldIndex - a FORMAT subfield index
Returns:
the specified data for the specified sample
Throws:
java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
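In a VCF record, the FORMAT field and each per-sample field are colon-delimited, and the position of a code in FORMAT gives the position of its value in each sample field. The sketch below shows that lookup on plain strings. It is not the VcfRec implementation: the class FormatSketch is hypothetical, and returning "." for a dropped trailing subfield is an assumption based on the VCF specification's allowance for dropping trailing fields.

```java
/** Hypothetical helper showing how FORMAT codes index per-sample subfields. */
final class FormatSketch {

    /** Returns the index of formatCode within the FORMAT field, or -1 if absent. */
    static int formatIndex(String format, String formatCode) {
        String[] codes = format.split(":");
        for (int j = 0; j < codes.length; ++j) {
            if (codes[j].equals(formatCode)) {
                return j;
            }
        }
        return -1;
    }

    /** Returns the subfield at subfieldIndex from one sample's colon-delimited data. */
    static String sampleData(String sampleField, int subfieldIndex) {
        if (subfieldIndex < 0) {
            throw new IndexOutOfBoundsException("subfieldIndex: " + subfieldIndex);
        }
        String[] data = sampleField.split(":");
        // ASSUMPTION: a dropped trailing subfield is reported as missing (".").
        return subfieldIndex < data.length ? data[subfieldIndex] : ".";
    }

    public static void main(String[] args) {
        String format = "GT:DP:GL";
        String sample = "0|1:12:-0.1,-1.0,-5.0";
        int index = formatIndex(format, "DP");
        System.out.println(index + " " + sampleData(sample, index));
    }
}
```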
-
formatData
public java.lang.String[] formatData(java.lang.String formatCode)
Returns an array of length this.size() containing the specified FORMAT subfield data for each sample. The k-th element of the array is the specified FORMAT subfield data for the k-th sample.
Parameters:
formatCode - a FORMAT subfield code
Returns:
an array of length this.size() containing the specified FORMAT subfield data for each sample
Throws:
java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
-
samples
public Samples samples()
Description copied from interface: GTRec
Returns the list of samples.
-
vcfHeader
public VcfHeader vcfHeader()
Returns the VCF meta-information lines and the VCF header line.
Returns:
the VCF meta-information lines and the VCF header line
-
marker
public Marker marker()
Description copied from interface: MarkerContainer
Returns the marker.
Specified by:
marker in interface MarkerContainer
Returns:
the marker
-
allele1
public int allele1(int sample)
Description copied from interface: DuplicatesGTRec
Returns the first allele for the specified sample or -1 if the allele is missing. The two alleles for a sample are arbitrarily ordered if this.isPhased(sample) == false.
Specified by:
allele1 in interface DuplicatesGTRec
Parameters:
sample - a sample index
Returns:
the first allele for the specified sample
-
allele2
public int allele2(int sample)
Description copied from interface: DuplicatesGTRec
Returns the second allele for the specified sample or -1 if the allele is missing. The two alleles for a sample are arbitrarily ordered if this.isPhased(sample) == false.
Specified by:
allele2 in interface DuplicatesGTRec
Parameters:
sample - a sample index
Returns:
the second allele for the specified sample
-
get
public int get(int hap)
Description copied from interface: DuplicatesGTRec
Returns the specified allele for the specified haplotype or -1 if the allele is missing. The two alleles for a sample at a marker are arbitrarily ordered if this.isPhased(hap/2) == false.
Specified by:
get in interface DuplicatesGTRec
Specified by:
get in interface IntArray
Parameters:
hap - a haplotype index
Returns:
the specified allele for the specified haplotype
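The description above implies that haplotype hap belongs to sample hap/2. The sketch below illustrates one way the haplotype-indexed view could relate to the per-sample alleles. It assumes, without confirmation from this page, that even haplotype indices map to the first allele and odd indices to the second. The class HapSketch and its array arguments are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper relating haplotype indices to per-sample alleles. */
final class HapSketch {

    /** Returns the allele on haplotype hap; haplotypes 2s and 2s+1 belong to sample s. */
    static int get(int[] allele1, int[] allele2, int hap) {
        if (hap < 0 || hap >= 2 * allele1.length) {
            throw new IndexOutOfBoundsException("hap: " + hap);
        }
        int sample = hap / 2;
        // ASSUMPTION: even haplotypes carry the first allele, odd ones the second.
        return (hap % 2 == 0) ? allele1[sample] : allele2[sample];
    }

    /** Returns an array whose j-th element equals get(allele1, allele2, j). */
    static int[] alleles(int[] allele1, int[] allele2) {
        int[] out = new int[2 * allele1.length];
        for (int h = 0; h < out.length; ++h) {
            out[h] = get(allele1, allele2, h);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] a1 = {0, 1, -1};  // -1 marks a missing allele
        int[] a2 = {1, 1, -1};
        System.out.println(Arrays.toString(alleles(a1, a2)));  // prints [0, 1, 1, 1, -1, -1]
    }
}
```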
-
alleles
public int[] alleles()
Description copied from interface: DuplicatesGTRec
Returns an array of length this.size() whose j-th element is equal to this.get(j).
Specified by:
alleles in interface DuplicatesGTRec
Returns:
an array of length this.size() whose j-th element is equal to this.get(j)
-
isPhased
public boolean isPhased(int sample)
Description copied from interface: DuplicatesGTRec
Returns true if the genotype for the specified sample has non-missing alleles and is either haploid or diploid with a phased allele separator, and returns false otherwise.
Specified by:
isPhased in interface DuplicatesGTRec
Parameters:
sample - a sample index
Returns:
true if the genotype for the specified sample is a phased, non-missing genotype
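In VCF, a GT value separates alleles with '|' when phased and '/' when unphased, and uses '.' for a missing allele. The sketch below applies the rule described above to a raw GT string. The class PhasedSketch is hypothetical and works on the GT subfield text rather than on a VcfRec.

```java
/** Hypothetical helper applying the phased, non-missing rule to a GT string. */
final class PhasedSketch {

    /** Returns true for a non-missing haploid genotype or a non-missing '|' genotype. */
    static boolean isPhased(String gt) {
        if (gt.isEmpty() || gt.indexOf('.') >= 0) {
            return false;  // at least one allele is missing
        }
        if (gt.indexOf('/') >= 0) {
            return false;  // unphased allele separator
        }
        return true;  // haploid ("1") or phased diploid ("0|1")
    }

    public static void main(String[] args) {
        for (String gt : new String[] {"0|1", "0/1", ".|1", "1", "."}) {
            System.out.println(gt + " -> " + isPhased(gt));
        }
    }
}
```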
-
isPhased
public boolean isPhased()
Description copied from interface: DuplicatesGTRec
Returns true if every genotype for each sample is a phased, non-missing genotype, and returns false otherwise.
Specified by:
isPhased in interface DuplicatesGTRec
Returns:
true if the genotype for each sample is a phased, non-missing genotype
-
gl
public float gl(int sample, int allele1, int allele2)
Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype. Returns 1.0f if the corresponding genotype determined by the isPhased(), allele1(), and allele2() methods is consistent with the specified ordered genotype, and returns 0.0f otherwise.
Parameters:
sample - the sample index
allele1 - the first allele index
allele2 - the second allele index
Returns:
the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype
Throws:
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
java.lang.IndexOutOfBoundsException - if allele1 < 0 || allele1 >= this.marker().nAlleles()
java.lang.IndexOutOfBoundsException - if allele2 < 0 || allele2 >= this.marker().nAlleles()
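The consistency rule for a called genotype can be sketched as follows. A phased genotype is consistent only with the same ordered allele pair, while an unphased genotype is consistent with either order. Treating a missing genotype as consistent with every ordered pair is an assumption made for this illustration and is not stated on this page. The class GlFromGtSketch and its parameters are hypothetical.

```java
/** Hypothetical helper: 0/1 likelihood of an ordered genotype given a called GT. */
final class GlFromGtSketch {

    /**
     * Returns 1.0f if the called genotype (a1, a2, phased) is consistent with the
     * ordered genotype (allele1, allele2), and 0.0f otherwise.
     */
    static float gl(boolean phased, int a1, int a2, int allele1, int allele2) {
        if (a1 < 0 || a2 < 0) {
            // ASSUMPTION: a missing genotype carries no information.
            return 1.0f;
        }
        boolean sameOrder = (a1 == allele1 && a2 == allele2);
        boolean swapped = (a1 == allele2 && a2 == allele1);
        return (sameOrder || (!phased && swapped)) ? 1.0f : 0.0f;
    }

    public static void main(String[] args) {
        System.out.println(gl(true, 0, 1, 1, 0));   // phased 0|1 against ordered (1, 0)
        System.out.println(gl(false, 0, 1, 1, 0));  // unphased 0/1 against ordered (1, 0)
    }
}
```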
-
size
public int size()
Description copied from interface: DuplicatesGTRec
Returns the number of haplotypes.
Specified by:
size in interface DuplicatesGTRec
Specified by:
size in interface IntArray
Returns:
the number of haplotypes
-
toString
public java.lang.String toString()
Returns the VCF record.
Overrides:
toString in class java.lang.Object
Returns:
the VCF record
-
-