Function | SQL example |
---|---|
bit_set | Select bit_set(0::bit(20),5); Update nci.structure set myfp = bit_set(myfp, 7) where matches(smiles,'[N,n]'); |
nbits_set | Select nbits_set(vfp)::real/length(vfp) from nci.structure limit 50; |
isbit_set | Select count(smiles) from nci.structure where isbit_set(myfp, 7); |
bit_contains | Select count(smiles) from nci.structure where bit_contains(vfp, fp('c1ccccc1C(=O)NC',2048,64,25)); |
bit_density | Select count(smiles) from nci.structure where bit_density(gfp) > 0.5; |
contains | Select count(smiles) from nci.structure where contains(gfp, fp('c1ccccc1C(=O)NC')); |
orsum | Select orsum(bit_set(0::bit(166), bit)) from fragment.public166keys where matches('c1ccccc1C(=O)NC',smarts); |
Bit strings in PostgreSQL can be thought of as a character string of 0's or 1's, although they are actually stored efficiently as bits in a string of bytes. The external representation, as seen when a bit string is selected, is a character string of 0's and 1's. Bit #1 appears on the left. There are some built-in functions for handling bit strings, as well as &(AND) and |(OR), and ~(NOT) operators. The functions additional here are provided to help handle bit strings typically used for fingerprints.
bit_set(bit Astring, integer ibit) returns bit
This function reads a bit string Astring and returns another bit string with the ibit'th bit set. It does not modify Astring. Bit #1 is the first bit at the left of the string.
nbits_set(bit Astring) returns integer
This function returns the number of bits in Astring which are set to 1.
isbit_set(bit Astring, integer ibit) returns boolean
This function returns true if the ibit'th bit in Astring is set to 1; otherwise it returns false. Bit #1 is the first bit at the left of the string.
bit_contains(bit Astring, bit Bstring) returns boolean
This function returns true if all the bits in Bstring that are set to 1 are also set to 1 in Astring; otherwise it returns false. If the two strings are of unequal size, a folded version of the longer string is compared to the shorter. Folding is possible only if both strings' size is a power of two. If folding is not possible, an error is reported.
bit_density(bit Astring) returns real
This function returns the ratio of the number of bits in Astring that are set to 1, to the size of the string. This will always be between 0 and 1.
contains(bit Astring, bit Bstring) returns boolean
This function returns true if all the bits in Bstring that are set to 1 are also set to 1 in Astring; otherwise it returns false. The two strings must be of equal size.
orsum(bit Astring) returns bit
This is an aggregate function, like sum or count, that operates on an aggregate of values. It combines each bit string of the aggregate by ORing corresponding bits together. Each bit string of the aggregate must have the same size. It is used in the fragment.public166keys function to compute a fingerprint. Consider the following two SQL statements:
select bit_set(0::bit(166),bit) from fragment.public166keys where matches('c1ccccc1C(=O)NC',smarts);
select orsum(bit_set(0::bit(166),bit)) from fragment.public166keys where matches('c1ccccc1C(=O)NC',smarts);The first statement returns14 rows, because there are 14 smarts in the table fragment.public166keys that match c1ccccc1C(=O)NC. Each row is a fingerprint part, of size 166 because of 0:bit(166), but having only one bit set, as given by the value in the bit column of fragment.public166keys. The second statement returns one row which is the OR of each of the 14 bit strings - the full fragment/key fingerprint for c1ccccc1C(=O)NC.