All major ZINC by-property subsets have been updated in the past month or so

Dear ZINC Fans

The "all purchasable" subset (#6) was updated this morning. All the major subsets of ZINC have been updated within the last month or so.

Since we have loaded about a quarter of a million new molecules in the past month, we will immediately begin new exports again. Subsets will appear in order of export speed, fastest first: fragment-like, then lead-like, then drug-like, then everything.

The subsets are available for download here:

Thank you for your interest in ZINC!


New ZINC subset releases. Dec 3, 2010

Dear ZINC Fans

We have just released updates of some ZINC subsets, as follows:

Lead-like (#1) - 4.2 M
Clean-drug-like (#13) - 3.75 M
Everything (#10) - 19.5 M

We have been informed that the "clean-drug-like" subset was not fully exported. We regret this error and have restarted the export. It should be ready on Monday.


Thanksgiving 2011

Dear ZINC Users

It is Thanksgiving, time to remember all we are grateful for:

* our participating vendors who send us regular updates of their catalogs, so that we can provide you with timely information. http://zinc.docking.org/vendor0/

* our commercial software colleagues who make their software available to us, so that we can provide you with ZINC. http://zinc.docking.org/ack.shtml

* our users, i.e. YOU! Thanks for your helpful feedback, your patience, and for pushing us to improve.

In the last week or so we have updated the following subsets:
* drug-like (#3) - 13.4M

More broken molecules removed from ZINC: carbanions

Dear ZINC Fans

We do not know how carbanions ever got into ZINC, but there they are. Today, we finally put them out of their misery. We deleted 23,565 molecules matching the following embarrassing patterns:

[cH-] 3035
[CH-] 3646
[C@@-] 2671
[C@-] 2250
[C-] 2806
[c-] 9317

The counts sum to 23,725 rather than 23,565 because some molecules had two pathologies and were counted under both patterns.

Actually, most of these molecules should have been invisible to you; they were lurking silently in the shadows. They are all gone now. All future subsets will have these removed.
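The gap between the per-pattern counts and the number of deleted molecules can be illustrated with a small sketch. It assumes the six patterns are matched as literal bracket-atom tokens in SMILES strings; the post does not say how the matching was actually done, and the function names and example SMILES below are hypothetical.

```python
import re
from collections import Counter

# The six bracket-atom tokens and counts listed in the post.
POSTED_COUNTS = {
    "[cH-]": 3035, "[CH-]": 3646, "[C@@-]": 2671,
    "[C@-]": 2250, "[C-]": 2806, "[c-]": 9317,
}
DELETED_TOTAL = 23565

# Matches any bracket atom, e.g. "[CH-]" or "[N+]".
BRACKET_ATOM = re.compile(r"\[[^\]]*\]")

def carbanion_tokens(smiles):
    """Return the set of listed carbanion tokens present in one SMILES string."""
    return {tok for tok in BRACKET_ATOM.findall(smiles) if tok in POSTED_COUNTS}

def tally(smiles_list):
    """Count molecules per pattern, and molecules flagged at least once."""
    per_pattern = Counter()
    flagged = 0
    for smi in smiles_list:
        hits = carbanion_tokens(smi)
        if hits:
            flagged += 1              # each molecule is deleted only once
            per_pattern.update(hits)  # but is counted under every pattern it matches
    return per_pattern, flagged

# Hypothetical input: the second string matches two different patterns.
per_pattern, flagged = tally(["[cH-]1cccc1", "CC[CH-]C(=O)[C-](C)C", "CCO"])
print(sum(per_pattern.values()), flagged)                # 3 2

# Same effect in the posted numbers: the excess is the double counting.
print(sum(POSTED_COUNTS.values()) - DELETED_TOTAL)       # 160
```

A molecule carrying two different patterns adds one to each pattern's count but only one to the number of deleted molecules, which is why the per-pattern total is larger.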

80,000 lost molecules found and now back in ZINC

Due to bookkeeping and other accumulated errors, about 80,000 molecules were not visible in ZINC.

We have fixed this by rebuilding these molecules. They will appear in ZINC search starting on Dec 1st, and will start to appear in ZINC subsets effective immediately, although it takes time to update all the subsets.


2-aminopyridines cleaned up

We cleaned up the 2-aminopyridines, which often had only the protonated form and not the neutral form; this is nearly always wrong. We found 24,464 such molecules. They have all been rebuilt, and should not be a problem any more. All future subsets will have this problem fixed.

Three ring systems cleaned up

Dear ZINC Users

Today we repaired molecules matching three SMARTS patterns in ZINC that had issues, as follows:

1. 6,511 cases of imidazo[1,2-a]pyridine-2-ester or -2-amide, having the SMARTS pattern [O,N]C(=O)c2cn1ccccc1n2. For instance, the esters 05983327, 05983011, and 07240926, or the amide 12630083 (these are ZINC IDs). We deleted the wrongly protonated rings.

2. 24,110 cases of imidazo[2,1-b]thiazole, having the SMARTS pattern c2cn1ccsc1n2. For instance, 08281641, 04864557, 12531418. We deleted the wrongly protonated rings.

ChEMBL08 is out

ChEMBL08 was released over the weekend. Better act quickly before it is obsoleted by ChEMBL09.

Updating reference SMILES of molecules in ZINC

Dear ZINC Users

We updated the canonical SMILES of molecules in ZINC today. The 3D representations of the molecules are unchanged.

As you know, SMILES can often be written in multiple ways. Today we applied a new canonicalization scheme. You should notice absolutely nothing, except that the 2D representations of molecules should be better in a few cases.

Altogether, approximately 1 M of the 18 M molecules in ZINC were affected.
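To illustrate why one molecule can have several valid SMILES and what a canonicalization step does, here is a toy sketch. It is not the scheme applied to ZINC, which the post does not describe. It handles only acyclic molecules built from the single-letter atoms C, N, O, and S with single bonds, and the function names are hypothetical.

```python
def parse(smiles):
    """Parse a tiny SMILES subset (C, N, O, S, branches) into atoms and adjacency."""
    atoms, bonds, stack, prev = [], {}, [], None
    for ch in smiles:
        if ch == "(":
            stack.append(prev)          # remember the branch point
        elif ch == ")":
            prev = stack.pop()          # return to the branch point
        elif ch in "CNOS":
            idx = len(atoms)
            atoms.append(ch)
            bonds[idx] = []
            if prev is not None:        # bond the new atom to the previous one
                bonds[prev].append(idx)
                bonds[idx].append(prev)
            prev = idx
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return atoms, bonds

def rooted(atoms, bonds, node, parent):
    """Write the tree below `node` with its child subtrees in sorted order."""
    kids = sorted(rooted(atoms, bonds, n, node) for n in bonds[node] if n != parent)
    if not kids:
        return atoms[node]
    # All children but the last become parenthesised branches.
    return atoms[node] + "".join(f"({k})" for k in kids[:-1]) + kids[-1]

def canonical(smiles):
    """Pick one string per molecule: the smallest rooted string over all start atoms."""
    atoms, bonds = parse(smiles)
    return min(rooted(atoms, bonds, i, None) for i in range(len(atoms)))

# Three ways of writing ethanol collapse to a single string.
print({canonical(s) for s in ["CCO", "OCC", "C(O)C"]})
# Dimethyl ether is a different molecule and gets a different string.
print(canonical("COC") != canonical("CCO"))   # True
```

The chosen string need not be the most readable one; the point is only that every way of writing the same molecule maps to the same string, so the 3D structure is untouched while the stored SMILES may change.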

-- John

