FingerprintsBitVector
use FingerprintsBitVector;
use FingerprintsBitVector qw(:coefficients);
use FingerprintsBitVector qw(:all);
FingerprintsBitVector class provides the following methods:
new, BaroniUrbaniSimilarityCoefficient, BuserSimilarityCoefficient, CosineSimilarityCoefficient, DennisSimilarityCoefficient, DiceSimilarityCoefficient, EuclidSimilarityCoefficient, FoldFingerprintsBitVectorByDensity, FoldFingerprintsBitVectorBySize, ForbesSimilarityCoefficient, FossumSimilarityCoefficient, GetBitsAsBinaryString, GetBitsAsHexadecimalString, GetBitsAsRawBinaryString, GetFingerprintsBitDensity, GetSupportedSimilarityCoefficients, HamannSimilarityCoefficient, IsFingerprintsBitVector, IsSubSet, JacardSimilarityCoefficient, Kulczynski1SimilarityCoefficient, Kulczynski2SimilarityCoefficient, ManhattanSimilarityCoefficient, MatchingSimilarityCoefficient, McConnaugheySimilarityCoefficient, NewFromBinaryString, NewFromHexadecimalString, NewFromRawBinaryString, OchiaiSimilarityCoefficient, PearsonSimilarityCoefficient, RogersTanimotoSimilarityCoefficient, RussellRaoSimilarityCoefficient, SimpsonSimilarityCoefficient, SkoalSneath1SimilarityCoefficient, SkoalSneath2SimilarityCoefficient, SkoalSneath3SimilarityCoefficient, StringifyFingerprintsBitVector, TanimotoSimilarityCoefficient, TverskySimilarityCoefficient, WeightedTanimotoSimilarityCoefficient, WeightedTverskySimilarityCoefficient, YuleSimilarityCoefficient
The methods that create fingerprints bit vectors from strings and the methods that calculate similarity coefficients between two bit vectors can also be invoked as class functions.
The FingerprintsBitVector class is derived from the BitVector class, which provides the functionality to manipulate bits.
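For two fingerprints bit vectors A and B of the same size, let:
Na = Number of bits set to "1" in A
Nb = Number of bits set to "1" in B
Nc = Number of bits set to "1" in both A and B
Nd = Number of bits set to "0" in both A and B
Nt = Number of bits set to "1" or "0" in A or B (size of A or B) = Na + Nb - Nc + Nd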
Then, various similarity coefficients [ Ref. 40 - 42 ] for a pair of bit vectors A and B are defined as follows:
BaroniUrbani: ( SQRT( Nc * Nd ) + Nc ) / ( SQRT ( Nc * Nd ) + Nc + ( Na - Nc ) + ( Nb - Nc ) ) ( same as Buser )
Buser: ( SQRT ( Nc * Nd ) + Nc ) / ( SQRT ( Nc * Nd ) + Nc + ( Na - Nc ) + ( Nb - Nc ) ) ( same as BaroniUrbani )
Cosine: Nc / SQRT ( Na * Nb ) (same as Ochiai)
Dice: (2 * Nc) / ( Na + Nb )
Dennis: ( Nc * Nd - ( ( Na - Nc ) * ( Nb - Nc ) ) ) / SQRT ( Nt * Na * Nb)
Euclid: SQRT ( ( Nc + Nd ) / Nt )
Forbes: ( Nt * Nc ) / ( Na * Nb )
Fossum: ( Nt * ( ( Nc - 1/2 ) ** 2 ) ) / ( Na * Nb )
Hamann: ( ( Nc + Nd ) - ( Na - Nc ) - ( Nb - Nc ) ) / Nt
Jaccard: Nc / ( ( Na - Nc) + ( Nb - Nc ) + Nc ) = Nc / ( Na + Nb - Nc ) (same as Tanimoto)
Kulczynski1: Nc / ( ( Na - Nc ) + ( Nb - Nc ) ) = Nc / ( Na + Nb - 2 * Nc )
Kulczynski2: ( ( Nc / 2 ) * ( 2 * Nc + ( Na - Nc ) + ( Nb - Nc) ) ) / ( ( Nc + ( Na - Nc ) ) * ( Nc + ( Nb - Nc ) ) ) = 0.5 * ( Nc / Na + Nc / Nb )
Manhattan: ( ( Na - Nc ) + ( Nb - Nc ) ) / Nt = ( Na + Nb - 2 * Nc ) / Nt
Matching: ( Nc + Nd ) / Nt
McConnaughey: ( Nc ** 2 - ( Na - Nc ) * ( Nb - Nc) ) / ( Na * Nb )
Ochiai: Nc / SQRT ( Na * Nb ) (same as Cosine)
Pearson: ( ( Nc * Nd ) - ( ( Na - Nc ) * ( Nb - Nc ) ) ) / SQRT ( Na * Nb * ( Na - Nc + Nd ) * ( Nb - Nc + Nd ) )
RogersTanimoto: ( Nc + Nd ) / ( ( Na - Nc ) + ( Nb - Nc ) + Nt ) = ( Nc + Nd ) / ( Na + Nb - 2 * Nc + Nt )
RussellRao: Nc / Nt
Simpson: Nc / MIN ( Na, Nb)
SkoalSneath1: Nc / ( Nc + 2 * ( Na - Nc) + 2 * ( Nb - Nc) ) = Nc / ( 2 * Na + 2 * Nb - 3 * Nc )
SkoalSneath2: ( 2 * Nc + 2 * Nd ) / ( Nc + Nd + Nt )
SkoalSneath3: ( Nc + Nd ) / ( ( Na - Nc ) + ( Nb - Nc ) ) = ( Nc + Nd ) / ( Na + Nb - 2 * Nc )
Tanimoto: Nc / ( ( Na - Nc) + ( Nb - Nc ) + Nc ) = Nc / ( Na + Nb - Nc ) (same as Jaccard)
Tversky: Nc / ( alpha * ( Na - Nc ) + ( 1 - alpha) * ( Nb - Nc) + Nc ) = Nc / ( alpha * ( Na - Nb ) + Nb)
Yule: ( ( Nc * Nd ) - ( ( Na - Nc ) * ( Nb - Nc ) ) ) / ( ( Nc * Nd ) + ( ( Na - Nc ) * ( Nb - Nc ) ) )
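As a worked example of the formulas above, the following minimal sketch counts bits in two equal-length strings of 0s and 1s and evaluates a few of the coefficients. It is a standalone illustration in plain Perl and does not call the FingerprintsBitVector module. The helper names BitCounts and Coefficients are hypothetical, and the sketch assumes that Na and Nb count the bits set to 1 in A and B, Nc the bits set to 1 in both, Nd the bits set to 0 in both, and Nt the vector size.

```perl
use strict;
use warnings;

# Count Na, Nb, Nc, Nd and Nt for two equal-length strings of 0s and 1s.
# Hypothetical helper; not part of FingerprintsBitVector.
sub BitCounts {
    my ($A, $B) = @_;
    die "Bit strings must be of the same size\n" unless length($A) == length($B);

    my ($Na, $Nb, $Nc, $Nd) = (0, 0, 0, 0);
    for my $Index (0 .. length($A) - 1) {
        my $BitA = substr($A, $Index, 1);
        my $BitB = substr($B, $Index, 1);
        $Na++ if $BitA eq '1';
        $Nb++ if $BitB eq '1';
        $Nc++ if $BitA eq '1' && $BitB eq '1';
        $Nd++ if $BitA eq '0' && $BitB eq '0';
    }
    return ($Na, $Nb, $Nc, $Nd, length($A));
}

# Evaluate a few coefficients using the formulas listed above.
# No guard against zero denominators (for example, two all-zero vectors).
sub Coefficients {
    my ($A, $B, $Alpha) = @_;
    my ($Na, $Nb, $Nc, $Nd, $Nt) = BitCounts($A, $B);
    return (
        Tanimoto => $Nc / ($Na + $Nb - $Nc),
        Dice     => (2 * $Nc) / ($Na + $Nb),
        Cosine   => $Nc / sqrt($Na * $Nb),
        Matching => ($Nc + $Nd) / $Nt,
        Tversky  => $Nc / ($Alpha * ($Na - $Nb) + $Nb),
    );
}

my %Values = Coefficients('11010010', '10011010', 0.5);
printf "%-8s %.4f\n", $_, $Values{$_} for sort keys %Values;
```

For these two strings Na = 4, Nb = 4, Nc = 3, Nd = 3 and Nt = 8, so Tanimoto is 3 / 5 and Dice is 6 / 8. With alpha = 0.5 the Tversky value coincides with Dice.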
The values of the Tanimoto/Jaccard and Tversky coefficients depend only on those bits which are set to "1" in both A and B. In order to take all bit positions into account, modified versions of Tanimoto [ Ref. 42 ] and Tversky [ Ref. 43 ] have been developed.
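Let:
Na' = Number of bits set to "0" in A
Nb' = Number of bits set to "0" in B
Nc' = Number of bits set to "0" in both A and B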
Tanimoto': Nc' / ( ( Na' - Nc') + ( Nb' - Nc' ) + Nc' ) = Nc' / ( Na' + Nb' - Nc' )
Tversky': Nc' / ( alpha * ( Na' - Nc' ) + ( 1 - alpha) * ( Nb' - Nc' ) + Nc' ) = Nc' / ( alpha * ( Na' - Nb' ) + Nb')
Then:
WeightedTanimoto = beta * Tanimoto + (1 - beta) * Tanimoto'
WeightedTversky = beta * Tversky + (1 - beta) * Tversky'
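The weighted form can be sketched in plain Perl as follows. This is an illustration only and does not call the FingerprintsBitVector module. The helper names TanimotoOverBit and WeightedTanimoto are hypothetical, and the sketch assumes that the primed counts Na', Nb' and Nc' are taken over bits set to 0 in the same way that Na, Nb and Nc are taken over bits set to 1.

```perl
use strict;
use warnings;

# Tanimoto computed over the positions whose value equals $Bit:
# '1' gives the plain form, '0' gives the primed form (assumption).
# Hypothetical helper; not part of FingerprintsBitVector.
sub TanimotoOverBit {
    my ($A, $B, $Bit) = @_;
    die "Bit strings must be of the same size\n" unless length($A) == length($B);

    my ($Na, $Nb, $Nc) = (0, 0, 0);
    for my $Index (0 .. length($A) - 1) {
        my $InA = substr($A, $Index, 1) eq $Bit;
        my $InB = substr($B, $Index, 1) eq $Bit;
        $Na++ if $InA;
        $Nb++ if $InB;
        $Nc++ if $InA && $InB;
    }
    return $Nc / ($Na + $Nb - $Nc);
}

# WeightedTanimoto = beta * Tanimoto + (1 - beta) * Tanimoto'
sub WeightedTanimoto {
    my ($A, $B, $Beta) = @_;
    return $Beta * TanimotoOverBit($A, $B, '1')
         + (1 - $Beta) * TanimotoOverBit($A, $B, '0');
}

printf "%.4f\n", WeightedTanimoto('11010010', '10010000', 0.5);
```

With beta = 1 the result reduces to the plain Tanimoto coefficient; with beta = 0 only the bits set to 0 contribute.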
Creates a new FingerprintsBitVector object of size Size and returns it. Bit numbers range from 0 to Size - 1.
Returns value of BaroniUrbani similarity coefficient between two same size FingerprintsBitVectors
Returns value of Buser similarity coefficient between two same size FingerprintsBitVectors
Returns value of Cosine similarity coefficient between two same size FingerprintsBitVectors
Returns value of Dennis similarity coefficient between two same size FingerprintsBitVectors
Returns value of Dice similarity coefficient between two same size FingerprintsBitVectors
Returns value of Euclid similarity coefficient between two same size FingerprintsBitVectors
Folds FingerprintsBitVector by recursively reducing its size by half until bit density of set bits is greater than or equal to specified Density and returns folded FingerprintsBitVector.
Folds FingerprintsBitVector by recursively reducing its size by half until size is less than or equal to specified Size and returns folded FingerprintsBitVector
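The two folding strategies described above can be sketched on plain strings of 0s and 1s. The text does not say how the two halves are merged at each step; this sketch assumes a bitwise OR of the first half with the second half, which is a common folding scheme but may differ from what the module does. The helper names BitDensity, FoldOnce, FoldBySize and FoldByDensity are hypothetical, and bit density is taken to be the fraction of bits set to 1.

```perl
use strict;
use warnings;

# Fraction of bits set to 1 in a string of 0s and 1s.
sub BitDensity {
    my ($Bits) = @_;
    return ($Bits =~ tr/1//) / length($Bits);
}

# Halve the size once by OR-ing the first half with the second half.
# ASSUMPTION: OR-folding; the documentation does not specify the merge rule.
sub FoldOnce {
    my ($Bits) = @_;
    my $Half   = length($Bits) / 2;
    my $First  = substr($Bits, 0, $Half);
    my $Second = substr($Bits, $Half);
    return join '', map {
        (substr($First, $_, 1) eq '1' || substr($Second, $_, 1) eq '1') ? '1' : '0'
    } 0 .. $Half - 1;
}

# Fold until the size is less than or equal to $Size.
sub FoldBySize {
    my ($Bits, $Size) = @_;
    $Bits = FoldOnce($Bits)
        while length($Bits) > $Size && length($Bits) % 2 == 0;
    return $Bits;
}

# Fold until the density of set bits is greater than or equal to $Density.
sub FoldByDensity {
    my ($Bits, $Density) = @_;
    $Bits = FoldOnce($Bits)
        while BitDensity($Bits) < $Density
           && length($Bits) > 1
           && length($Bits) % 2 == 0;
    return $Bits;
}

my $Bits = '1000000000100000';
print FoldBySize($Bits, 4), "\n";
print FoldByDensity($Bits, 0.25), "\n";
```

Both loops also stop when the current size is odd, since such a vector cannot be split into two equal halves.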
Returns value of Forbes similarity coefficient between two same size FingerprintsBitVectors
Returns value of Fossum similarity coefficient between two same size FingerprintsBitVectors
Returns fingerprints as a binary ASCII string containing 0s and 1s
Returns fingerprints as a hexadecimal string
Returns fingerprints as a raw binary string containing packed bit values for each byte
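The relationship between the three string forms can be illustrated with Perl's built-in pack and unpack. This sketch does not call the module, and the bit order within each byte is an assumption: the 'B*' template places the first character in the highest bit of each byte, while 'b*' would use the opposite order. The module's own output may follow either convention.

```perl
use strict;
use warnings;

my $BinaryString = '1101001010011010';

# Pack the 0/1 characters into bytes, eight bits per byte.
# ASSUMPTION: descending bit order within each byte ('B*').
my $RawBinaryString = pack('B*', $BinaryString);

# Hexadecimal digits of the packed bytes, high nybble first.
my $HexadecimalString = unpack('H*', $RawBinaryString);

# Round trip from the packed bytes back to the ASCII form.
my $RoundTrip = unpack('B*', $RawBinaryString);

print length($RawBinaryString), " bytes, hex $HexadecimalString\n";
print "round trip ok\n" if $RoundTrip eq $BinaryString;
```

The raw form needs one byte per eight bits, against one character per bit for the binary ASCII form and one character per four bits for the hexadecimal form.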
Returns the bit density of FingerprintsBitVector, that is, the fraction of its bits that are set to 1
Returns an array containing names of supported similarity coefficients
Returns value of Hamann similarity coefficient between two same size FingerprintsBitVectors
Returns 1 or 0 based on whether Object is a FingerprintsBitVector object
Returns 1 or 0 based on whether the first fingerprints bit vector is a subset of the second fingerprints bit vector.
For a bit vector to be a subset of another bit vector, both vectors must be of the same size and the bit positions set in first vector must also be set in the second bit vector.
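The subset rule stated above can be sketched on plain strings of 0s and 1s. The helper name IsSubSetOfBits is hypothetical; this is an illustration of the rule, not the module's own implementation.

```perl
use strict;
use warnings;

# Returns 1 if both strings have the same size and every bit set in $A
# is also set in $B; returns 0 otherwise.
sub IsSubSetOfBits {
    my ($A, $B) = @_;
    return 0 unless length($A) == length($B);
    for my $Index (0 .. length($A) - 1) {
        return 0 if substr($A, $Index, 1) eq '1' && substr($B, $Index, 1) ne '1';
    }
    return 1;
}

print IsSubSetOfBits('10010000', '11010010'), "\n";   # every set bit of A is set in B
print IsSubSetOfBits('11010010', '10010000'), "\n";   # A has set bits that B lacks
```

Note that under this rule an all-zero vector is a subset of any vector of the same size.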
Returns value of Jaccard similarity coefficient between two same size FingerprintsBitVectors
Returns value of Kulczynski1 similarity coefficient between two same size FingerprintsBitVectors
Returns value of Kulczynski2 similarity coefficient between two same size FingerprintsBitVectors
Returns value of Manhattan similarity coefficient between two same size FingerprintsBitVectors
Returns value of Matching similarity coefficient between two same size FingerprintsBitVectors
Returns value of McConnaughey similarity coefficient between two same size FingerprintsBitVectors
Creates a new FingerprintsBitVector using BinaryString and returns new FingerprintsBitVector object
Creates a new FingerprintsBitVector using HexadecimalString and returns new FingerprintsBitVector object
Creates a new FingerprintsBitVector using RawBinaryString and returns new FingerprintsBitVector object
Returns value of Ochiai similarity coefficient between two same size FingerprintsBitVectors
Returns value of Pearson similarity coefficient between two same size FingerprintsBitVectors
Returns value of RogersTanimoto similarity coefficient between two same size FingerprintsBitVectors
Returns value of RussellRao similarity coefficient between two same size FingerprintsBitVectors
Returns value of Simpson similarity coefficient between two same size FingerprintsBitVectors
Returns value of SkoalSneath1 similarity coefficient between two same size FingerprintsBitVectors
Returns value of SkoalSneath2 similarity coefficient between two same size FingerprintsBitVectors
Returns value of SkoalSneath3 similarity coefficient between two same size FingerprintsBitVectors
Returns a string containing information about FingerprintsBitVector object
Returns value of Tanimoto similarity coefficient between two same size FingerprintsBitVectors
Returns value of Tversky similarity coefficient between two same size FingerprintsBitVectors
Returns value of WeightedTanimoto similarity coefficient between two same size FingerprintsBitVectors
Returns value of WeightedTversky similarity coefficient between two same size FingerprintsBitVectors
Returns value of Yule similarity coefficient between two same size FingerprintsBitVectors
BitVector.pm, Fingerprints.pm, PathLengthFingerprints.pm
Copyright (C) 2004-2008 Manish Sud. All rights reserved.
This file is part of MayaChemTools.
MayaChemTools is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.