NAME
InfoFingerprintsSDFiles.pl - List information about fingerprints data in SDFile(s)
SYNOPSIS
InfoFingerprintsSDFiles.pl SDFile(s)...
InfoFingerprintsSDFiles.pl [-a, --all] [--AverageBitDensity] [--BitDensity] [-c, --count]
[-d, --detail InfoLevel] [--DataCheck] [-e, --empty] [--FingerprintsField FieldLabel]
[--FingerprintsType] [--FingerprintsDescription] [--FingerprintsSize]
[--FingerprintsBitStringFormat] [--FingerprintsBitOrder]
[--FingerprintsVectorValuesType] [--FingerprintsVectorValuesFormat]
[-h, --help] [--NumOfOnBits] [--NumOfNonZeroValues]
[-w, --WorkingDir DirName] SDFile(s)...
DESCRIPTION
List information about fingerprints data in SDFile(s): number of rows containing
fingerprints data, type of fingerprints vector, description and size of fingerprints, bit density
and average bit density for bit-vector fingerprints strings, and so on.
Multiple SDFile names are separated by spaces. The valid file extensions are .sdf
and .sd. All other file names are ignored. All the SD files in the current directory
can be specified either by *.sdf or by the current directory name.
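The extension rule above can be illustrated with a short Python sketch. It is not taken from the script itself: the function name select_sd_files is hypothetical, and the case-insensitive comparison is an assumption rather than documented behavior.

```python
import os

# Extensions named in the description above.
VALID_EXTENSIONS = {".sdf", ".sd"}

def select_sd_files(names):
    """Keep names ending in .sdf or .sd; other names are ignored.

    Assumption: extensions are compared case-insensitively.
    """
    return [n for n in names
            if os.path.splitext(n)[1].lower() in VALID_EXTENSIONS]

print(select_sd_files(["SampleFPHex.sdf", "SampleFPBin.sd", "Sample.csv"]))
```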
Format of fingerprint strings data in SDFile(s) is automatically detected. The current release
of MayaChemTools supports the following types of fingerprint bit-vector and vector strings:
FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes;1024;
HexadecimalString;Ascending;00000000000000000000000040000000000000
000000000000000000000000000020200000000000000000004000000000000...
FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes;1024;
BinaryString;Ascending;0000000000000000000000000000000000000000000
000000000000001000000001000000000010000000000000000000010000000...
FingerprintsVector;PathLengthCount:AtomicInvariantsAtomTypes;27;
NumericalValues;IDsAndValuesPairsString;C 8 O 1 C:C 8 C:O 2 C:C:C 9
C:C:O 3 C:O:C 1 C:C:C:C 10 C:C:C:O 4 C:C:O:C 3 C:C:C:C:C 10 ...
FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;000000000
000000000000000000000000000000000000000000000000001000000000000
000000000010000000000001001000000000000000000001000000000000000...
FingerprintsBitVector;MACCSKeyBits;166;HexadecimalString;Ascending;0000
002000002010008040084010080100902805e1
FingerprintsBitVector;MACCSKeyBits;322;BinaryString;Ascending;1100000000
0000001000001000010011000001100000001000000000000000101000000000
0000000000000000000000000000000000000000000000000000100000000000...
FingerprintsBitVector;MACCSKeyBits;322;HexadecimalString;Ascending;3001
48c060400041000000000000000100000000000000000000000000000000500
000000000000000
FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesString;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
FingerprintsVector;MACCSKeyCount;322;OrderedNumericalValues;ValuesString;
2 1 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 1 0 0 0 0 1 0 0 7 1 0 0 0 0 0 2 1 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 2 0 0 0 0 0 0 0 0 ...
FingerprintsVector;ExtendedConnectivity:AtomicInvariantsAtomTypes;14;
AlphaNumericalValues;ValuesString;333564680 1142173602 14814699391
977749791 2006158649 291020918 443330853 692611812 816539344173
1657806 2039728782 931045615 1273931663 1317501190
FingerprintsVector;ExtendedConnectivity:FunctionalClassAtomTypes;11;
AlphaNumericalValues;ValuesString;862102353 981185303 12517955598
10600886 885767127 1452087973 1878436093 2029559552 1465773182
1530666307 2113761516
FingerprintsVector;TopologicalAtomPairs:AtomicInvariantsAtomTypes;23;
NumericalValues;IDsAndValuesString;C.X1.BO1.H3-D1-C.X3.BO4 C.X2.BO3.H1-
D1-C.X2.BO3.H1 C.X2.BO3.H1-D1-C.X3.BO4 C.X2.BO3.H1-D1-N.X2.BO2.H1
C.X3.BO4-D1-C.X3.BO4 C.X3.BO4-D1-O.X1.BO2 C.X1.BO1.H3-D2-C.X2.BO3.H1
C.X1.BO1.H3-D2-C.X3.BO4...; 1 1 2 2 1 1 1 1 1 3 1 1 1 1 1 1 1 1 1 2 1 1 1
FingerprintsVector;TopologicalAtomPairs:AtomicInvariantsAtomTypes;23;
NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3-D1-C.X3.BO4 1
C.X2.BO3.H1-D1-C.X2.BO3.H1 1 C.X2.BO3.H1-D1-C.X3.BO4 2 C.X2.BO3.H1-
D1-N.X2.BO2.H1 2 C.X3.BO4-D1-C.X3.BO4 1 C.X3.BO4-D1-O.X1.BO2 1
C.X1.BO1.H3-D2-C.X2.BO3.H1 1 C.X1.BO1.H3-D2-C.X3.BO4 1
C.X2.BO3.H1-D2-C.X2.BO3.H1 1 C.X2.BO3.H1-D2-C.X3.BO4 3...
FingerprintsVector;TopologicalAtomTorsions:AtomicInvariantsAtomTypes;11;
NumericalValues;IDsAndValuesString;C.X1.BO1.H3-C.X3.BO4-C.X2.BO3.H1-
N.X2.BO2.H1 C.X1.BO1.H3-C.X3.BO4-C.X3.BO4-C.X2.BO3.H1 C.X1.BO1.H3-
C.X3.BO4-C.X3.BO4-O.X1.BO2 C.X2.BO3.H1-C.X2.BO3.H1-C.X3.BO4-C.X3.BO4
C.X2.BO3.H1-C.X2.BO3.H1-C.X3.BO4-O.X1.BO2...;
1 1 1 1 1 1 1 1 1 1 1
FingerprintsVector;TopologicalAtomTorsions:AtomicInvariantsAtomTypes;11;
NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3-C.X3.BO4-C.X2.BO3.H1-
N.X2.BO2.H1 1 C.X1.BO1.H3-C.X3.BO4-C.X3.BO4-C.X2.BO3.H1 1 C.X1.BO1.H3-
C.X3.BO4-C.X3.BO4-O.X1.BO2 1 C.X2.BO3.H1-C.X2.BO3.H1-C.X3.BO4-
C.X3.BO4 1 C.X2.BO3.H1-C.X2.BO3.H1-C.X3.BO4-O.X1.BO2 1 C.X2.BO3.H1-
C.X2.BO3.H1-N.X2.BO2.H1-C.X2.BO3.H1 1 C.X2.BO3.H1-C.X3.BO4-C.X3.BO4-
C.X2.BO3.H1 1...
FingerprintsVector;TopologicalPharmacophoreAtomPairs;150;
OrderedNumericalValues;ValuesString;1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 2 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0
0 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0...
FingerprintsVector;TopologicalPharmacophoreAtomPairs;150;
OrderedNumericalValues;IDsAndValuesString;H-D1-H H-D1-HBA H-D1-HBD
H-D1-NI H-D1-PI HBA-D1-HBA HBA-D1-HBD HBA-D1-NI HBA-D1-PI HBD-D1-HBD
HBD-D1-NI HBD-D1-PI NI-D1-NI NI-D1-PI PI-D1-PI H-D2-H H-D2-HBA ...;
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 2 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 ...
FingerprintsVector;TopologicalPharmacophoreAtomTriplets;4960;
OrderedNumericalValues;ValuesString;0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0...
FingerprintsVector;TopologicalPharmacophoreAtomTriplets;4960;
OrderedNumericalValues;IDsAndValuesString;Ar1-Ar1-Ar1 Ar1-Ar1-H1
Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-Ar1-NI1 Ar1-Ar1-PI1 Ar1-H1-H1 ...;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0...
FingerprintsVector;AtomNeighborhoods:AtomicInvariantsAtomTypes;10;
AlphaNumericalValues;ValuesString;NR0-C.X2.BO3.H1-ATC1:NR1-C.X2.BO3.H1-
ATC1:NR1-C.X3.BO4-ATC1:NR2-C.X2.BO3.H1-ATC1:NR2-C.X3.BO4-ATC1:NR2-
N.X1.BO1.H2-ATC1 NR0-C.X2.BO3.H1-ATC1:NR1-C.X2.BO3.H1-ATC1:NR1-
C.X3.BO4-ATC1:NR2-C.X2.BO3.H1-ATC1:NR2-C.X3.BO4-ATC2 NR0-C.X2.BO3.H1-
ATC1:NR1-C.X2.BO3.H1-ATC2:NR2-C.X2.BO3.H1-ATC1:NR2-C.X3.BO4-ATC1...
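The strings listed above all appear to share a semicolon-delimited layout. The following Python sketch splits such a string into its parts, assuming the field order suggested by the examples (type, description, size, then two format fields, then the data). The function name and dictionary keys are hypothetical and are not part of MayaChemTools.

```python
def parse_fingerprints_string(text):
    """Split a fingerprints string into labeled parts.

    Assumption: the field order is inferred from the examples above.
    """
    parts = text.strip().split(";")
    kind = parts[0]
    info = {"type": kind, "description": parts[1], "size": int(parts[2])}
    if kind == "FingerprintsBitVector":
        info["bit_string_format"] = parts[3]   # BinaryString or HexadecimalString
        info["bit_order"] = parts[4]           # Ascending or Descending
        info["data"] = parts[5]                # the bit string itself
    elif kind == "FingerprintsVector":
        info["values_type"] = parts[3]         # e.g. OrderedNumericalValues
        info["values_format"] = parts[4]       # e.g. ValuesString
        info["data"] = parts[5:]               # one or two data fields
    else:
        raise ValueError("unrecognized fingerprints string type: " + kind)
    return info

# The MACCSKeyBits hexadecimal example from the list above, joined onto one line.
example = ("FingerprintsBitVector;MACCSKeyBits;166;HexadecimalString;"
           "Ascending;0000002000002010008040084010080100902805e1")
info = parse_fingerprints_string(example)
print(info["type"], info["description"], info["size"])
```

For vector strings in IDsAndValuesString format, the IDs and the values arrive as two separate semicolon-delimited fields, which is why the sketch keeps the data as a list in that branch.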
OPTIONS
- -a, --all
-
List all the available information.
- --AverageBitDensity
-
List average bit density of fingerprint bit-vector strings.
- --BitDensity
-
List bit density of fingerprints bit-vector strings data in each row.
- -c, --count
-
List number of rows containing fingerprints bit-vector or vector strings data. This
is the default behavior.
- -d, --detail InfoLevel
-
Level of information to print about lines being ignored. Default: 1. Possible values:
1, 2 or 3.
- --DataCheck
-
Validate fingerprints data specified using --FingerprintsField and list information
about missing and invalid data.
- -e, --empty
-
List number of rows containing no fingerprints data.
- --FingerprintsField FieldLabel
-
Fingerprints field label to use during listing of fingerprints information for SDFile(s).
Default value: first data field label containing the word Fingerprints in its label.
- --FingerprintsType
-
List types of fingerprint strings: FingerprintsBitVector or FingerprintsVector.
- --FingerprintsDescription
-
List types of fingerprints: PathLengthBits, PathLengthCount, MACCSKeyCount,
ExtendedConnectivity and so on.
- --FingerprintsSize
-
List size of fingerprints.
- --FingerprintsBitStringFormat
-
List format of fingerprint bit-vector strings: BinaryString or HexadecimalString.
- --FingerprintsBitOrder
-
List order of bits in fingerprint bit-vector strings: Ascending or Descending.
- --FingerprintsVectorValuesType
-
List type of values in fingerprint vector strings: OrderedNumericalValues, NumericalValues or
AlphaNumericalValues.
- --FingerprintsVectorValuesFormat
-
List format of values in fingerprint vector strings: ValuesString, IDsAndValuesString,
IDsAndValuesPairsString, ValuesAndIDsString or ValuesAndIDsPairsString.
- -h, --help
-
Print this help message.
- --NumOfOnBits
-
List number of on bits in fingerprints bit-vector strings data in each row.
- --NumOfNonZeroValues
-
List number of non-zero values in fingerprints vector strings data in each row.
- -w, --WorkingDir DirName
-
Location of working directory. Default: current directory.
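The default for --FingerprintsField described above (the first data field label containing the word Fingerprints) can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, the record is a toy, and the case-sensitive substring match is an assumption about behavior that this page does not specify.

```python
import re

# SD file data items start with a header line such as "> <FieldLabel>".
DATA_HEADER = re.compile(r">.*?<([^<>]+)>")

def find_fingerprints_field(record_text, word="Fingerprints"):
    """Return the first data field label containing `word`, or None."""
    for line in record_text.splitlines():
        match = DATA_HEADER.match(line)
        if match and word in match.group(1):
            return match.group(1)
    return None

# Toy record with an empty structure block and two data fields.
sample_record = """\
Mol1
  toy

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <MolID>
Mol1

> <PathLengthFingerprints>
FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes;8;BinaryString;Ascending;01000100

$$$$
"""

print(find_fingerprints_field(sample_record))
```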
EXAMPLES
To count the number of rows containing fingerprints bit-vector or vector strings data present in a
data field with the substring Fingerprint in its label, type:
% InfoFingerprintsSDFiles.pl SampleFPBin.sdf
% InfoFingerprintsSDFiles.pl SampleFPHex.sdf
% InfoFingerprintsSDFiles.pl SampleFPcount.sdf
To list all available information about fingerprints bit-vector or vector strings data present in a
data field with the substring Fingerprint in its label, type:
% InfoFingerprintsSDFiles.pl -a SampleFPHex.sdf
% InfoFingerprintsSDFiles.pl -a SampleFPcount.sdf
To list all available information about fingerprints bit-vector or vector strings data present in a
data field named Fingerprints, type:
% InfoFingerprintsSDFiles.pl -a --FingerprintsField Fingerprints
SampleFPHex.sdf
% InfoFingerprintsSDFiles.pl -a --FingerprintsField Fingerprints
SampleFPcount.sdf
To list bit density, average bit density, and number of on bits for fingerprints bit-vector strings data
present in a data field with the substring Fingerprint in its label, type:
% InfoFingerprintsSDFiles.pl --BitDensity --AverageBitDensity
--NumOfOnBits SampleFPBin.sdf
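As a rough illustration of the quantities requested above, the Python sketch below counts on bits and derives a bit density. It assumes that bit density means the number of on bits divided by the declared fingerprint size, and that any padding bits in a hexadecimal string are zero. Neither assumption is confirmed by this page, and the function names are hypothetical.

```python
def count_on_bits(bit_string, bit_string_format):
    """Count on bits in a BinaryString or HexadecimalString."""
    if bit_string_format == "BinaryString":
        return bit_string.count("1")
    if bit_string_format == "HexadecimalString":
        # Each hexadecimal digit contributes the on bits of its 4-bit value.
        return sum(bin(int(digit, 16)).count("1") for digit in bit_string)
    raise ValueError("unsupported bit string format: " + bit_string_format)

def bit_density(bit_string, bit_string_format, size):
    """Assumed definition: on bits divided by declared fingerprint size."""
    return count_on_bits(bit_string, bit_string_format) / size

def average_bit_density(densities):
    """Mean of the per-row bit densities."""
    return sum(densities) / len(densities)

# Two toy 8-bit rows, one in each format.
rows = [("01000100", "BinaryString", 8), ("f0", "HexadecimalString", 8)]
densities = [bit_density(bits, fmt, size) for bits, fmt, size in rows]
print(densities, average_bit_density(densities))
```

Counting on bits per hexadecimal digit does not depend on the bit order, so the sketch ignores the Ascending or Descending field.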
To list vector values type, format and number of non-zero values for fingerprints vector strings
data present in a data field with the substring Fingerprint in its label, along with fingerprints type
and description, type:
% InfoFingerprintsSDFiles.pl --FingerprintsType --FingerprintsDescription
--FingerprintsVectorValuesType --FingerprintsVectorValuesFormat
--NumOfNonZeroValues SampleFPcount.sdf
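The number of non-zero values reported for a vector string can be sketched in Python as below. The sketch covers only the three value formats that appear in the examples in DESCRIPTION, assumes IDs contain no spaces, and handles numerical values only. The function names and the toy strings are hypothetical.

```python
def extract_values(values_format, data_fields):
    """Pull the value tokens out of the data part of a vector string."""
    if values_format == "ValuesString":
        return data_fields[0].split()
    if values_format == "IDsAndValuesString":
        # IDs and values are two separate semicolon-delimited fields.
        return data_fields[1].split()
    if values_format == "IDsAndValuesPairsString":
        # Tokens alternate: ID value ID value ...
        return data_fields[0].split()[1::2]
    raise ValueError("format not covered by this sketch: " + values_format)

def count_non_zero_values(fingerprints_string):
    """Count numerical values that differ from zero."""
    parts = fingerprints_string.strip().split(";")
    values = extract_values(parts[4], parts[5:])
    return sum(1 for value in values if float(value) != 0)

# Toy strings; the descriptions and sizes are made up for illustration.
toy_values = "FingerprintsVector;ToyCount;6;OrderedNumericalValues;ValuesString;0 2 0 0 1 0"
toy_pairs = "FingerprintsVector;ToyCount;3;NumericalValues;IDsAndValuesPairsString;C 8 O 0 C:C 8"
print(count_non_zero_values(toy_values), count_non_zero_values(toy_pairs))
```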
AUTHOR
Manish Sud
SEE ALSO
InfoFingerprintsTextFiles.pl, SimilarityMatrixSDFiles.pl, SimilarityMatrixTextFiles.pl, 
AtomNeighborhoodsFingerprints.pl, ExtendedConnectivityFingerprints.pl, 
MACCSKeysFingerprints.pl, PathLengthFingerprints.pl, 
TopologicalAtomPairsFingerprints.pl, TopologicalAtomTorsionsFingerprints.pl, 
TopologicalPharmacophoreAtomPairsFingerprints.pl, TopologicalPharmacophoreAtomTripletsFingerprints.pl
COPYRIGHT
Copyright (C) 2004-2010 Manish Sud. All rights reserved.
This file is part of MayaChemTools.
MayaChemTools is free software; you can redistribute it and/or modify it under
the terms of the GNU Lesser General Public License as published by the Free
Software Foundation; either version 3 of the License, or (at your option)
any later version.