This table of contents shows all chapter, section and sub-section titles. Click on a chapter to show that chapter's examples. Not all chapters contain examples. Click on a page to show computer code or SQL examples on that page. Pages without links do not contain any computer code or SQL. In some cases, examples appears on a page not shown in this table of contents. Click on the next lower numbered page to locate code on that page.

Chapter Page
1. Introduction 1
2. Relational Database Fundamentals 5
2.1 Introduction 5
2.2 Tables, Rows and Columns 5
2.3 External and Internal Representations Of Data 7
2.4 Advantages Over Spreadsheets 8
2.4.1 Size and Speed 8
2.4.2 Multiple Users 8
2.5 Relationships Among Tables 9
2.5.1 One-to-many Relationships 9
2.5.2 One-to-one Relationships 11
2.5.3 Many-to-many Relationships 12
2.6 Entity Relationship Diagrams 12
2.7 Uniqueness 14
2.8 Sequences 14
2.9 Keys 15
2.9.1 Primary Keys 15
2.9.2 Foreign Keys 15
2.10 Constraints 16
2.11 Indexes 16
2.12 Joining Tables 16
2.13 Normal Forms 17
2.13.1 First Normal Form 17
2.13.2 Second Normal Form 18
2.13.3 Third Normal Form 19
2.13.4 Summary Of Normal Forms 20
3. Structured Query Language (SQL) 21
3.1 Introduction 21
3.2 Databases, Schemas, Tables, Rows and Columns 21
3.3 Create 22
3.4 Insert 23
3.5 Select 24
3.6 Update and Delete 25
3.7 SQL Functions 26
3.7.1 Regular Functions 26
3.7.2 Aggregate Functions 27
3.8 Domains, Triggers and Views 28
3.9 Unions, Intersections and Differences 29
4. Relational Database Management Systems 31
4.1 Introduction 31
4.2 Standard SQL 32
4.3 A Sampling Of Differences 32
4.4 Server and Client 33
4.5 Compatibility 35
5. Client and Web Applications 37
5.1 Introduction 37
5.2 Command Line Programs 37
5.3 Web-Based Applications 38
5.4 Client Applications 39
5.5 SQL Interfaces In Various Languages 41
5.5.1 Perl 43
5.5.2 Python 44
5.5.3 PHP 44
5.5.4 Java 45
6. Data Storage, Searching and Manipulation 47
6.1 Introduction 47
6.2 General Schema Design Decisions 47
6.3 Sample Schema For Tracking Chemical Samples 49
6.4 Schemas For Pubchem Data 53
6.4.1 BioAssay Data 54
6.4.2 Substances 56
6.4.3 Compounds 58
6.5 Data Constraints and Data Integrity 60
6.6 Developing Complex SQL 63
6.7 Sub-Select Statements 66
6.8 Views 67
7. Computer Representations Of Molecular Structures 71
7.1 Introduction 71
7.2 SMILES Representation Of Molecular Structure 72
7.3 Extensions To SQL For Chemical Structures 72
7.4 SMARTS Representation Of Molecular Searches 74
7.5 SMILES and SMARTS Quirks 76
7.5.1 Hydrogen Atoms 76
7.5.2 Aromaticity 77
7.5.3 Tautomers 77
7.5.4 Valence 80
7.5.5 Chirality 80
7.5.6 Isotopes 81
7.5.7 Salts and Mixtures 81
7.5.8 InChI and Canonical SMILES 82
7.6 SMILES and Inorganic Structures 82
7.7 Other SMILES Extensions 82
7.8 Input and Output Of Molecular Structures 83
7.9 Useful SQL Extensions 85
7.10 SMILES As A SQL Data Type 86
7.10.1 Domains 86
7.10.2 Triggers 87
7.11 Summary 88
8. Molecular Fragments and Fingerprints 91
8.1 Introduction 91
8.2 Fragments 91
8.2.1 Fragment Keys 92
8.2.2 MACCS Keys and Other Fragment Keys 95
8.3 Fingerprints 95
8.4 Similarity Measures 96
8.5 Computing Fragment-Based Properties 96
9. Reactions and Transformations 99
9.1 Introduction 99
9.2 Reaction SMILES 99
9.3 Transformations 100
9.3.1 Unimolecular Transformations 101
9.3.2 Multi-component Transformations 104
9.4 Canonical Reaction SMILES 106
10. PostgreSQL Extensions 109
10.1 Introduction 109
10.2 Composite Data Types 109
10.3 Composite Data Type For Experimental Values 111
10.4 Array Data Types For 2- and 3-Dimensional Coordinates 115
10.5 Functions In Other Languages 117
10.5.1 Plpgsql 117
10.5.2 Plperl, Plpython, Pltcl 118
10.5.3 Core Chemical Functions 119
10.5.4 C Language Functions 120
10.6 Object RDBMS 121
11. 3-Dimensional Molecular Structure Tables 123
11.1 Introduction 123
11.2 Using Tables Instead Of Files 123
11.3 Molfile and Other Common File Formats 124
11.4 Processing SDF Files 125
11.5 Using Tables Instead Of Files In Client Programs 131
11.6 File Import, Export and Conversions 132
11.7 Functions Using 3-Dimensional Atomic Coordinates 133
11.8 Conformations 135
11.9 Other Representations Of 3-Dimensional Molecular Structure 136
12. More On Client and Web Interfaces To RDBMS 137
12.1 Introduction 137
12.2 Store All Possible Data In The RDBMS 139
12.3 Advanced SQL Techniques 140
12.3.1 Placeholders In SQL Statements 141
12.3.2 Bind Values In SQL Statements 142
12.4 Web Applications 143
12.5 R Programs 147
12.5.1 Hierarchical Clustering 147
12.5.2 Linear Models 148
13. Applications 155
13.1 Introduction 155
13.2 Compound Registration 155
13.3 Experimental Chemical and Biological Data Integration 162
13.4 Data From External Sources 164
13.5 Utilities 167
13.5.1 molgrep 168
13.5.2 molcat 168
13.5.3 molview 169
13.5.5 molrandom 170
13.5.6 molnear 171
13.5.7 molsame 171
Appendix 173
A.1 Introduction 173
A.2 Symbols and Bonds From Smiles 173
A.3 Normalizing Data 175
A.4 SQL Functions 176
A.4.1 Public166keys 176
A.4.2 Orsum 176
A.4.3 Tanimoto 176
A.4.4 Euclid 177
A.4.5 Hamming 177
A.4.6 Nbits_set 177
A.4.7 Amw 177
A.4.8 Tpsa 181
A.5 Tables Used In Functions 182
A.5.1 Amw 183
A.5.2 Tpsa 183
A.5.3 Public166keys 183
A.6 Core Function Implementation For PostgreSQL 188
A.6.1 PerlMol/plperlu 191
A.6.2 FROWNS/plpythonu 197
A.6.3 OpenBabel/python 203
A.7 C Language PostgreSQL Functions 205
A.8 Database Utilities Dbutils 206
A.9 Loading Files Into Simple Tables 207
A.9.1 Smiloader 208
A.9.2 Sdfloader 210