Why can't we predict which molecules can react with each other?

StarWorms

... or why does it seem that way?

So let's say we have the sequence of part of a DNA molecule. Let's say we want to make a drug that can bind to it. Why can't we put the sequences into a computer, which can calculate it for us? I don't understand why this seemingly cannot be done: We know the target sequence, we surely must know the physical interactions between atoms. We can work out how a known drug can bind to its known target. Are we even able to deduce the approximate shape that the drug must have to bind?

It just seems really odd to me that we can't seem to calculate this sort of stuff.
 
I won't pretend to know anything about DNA, or what you mean by "binding" to it, but I'd imagine that the problems are:

1. How to synthesise the drug (technical feasibility, mass production, cost-benefit ratio, etc)
2. Determining what side-effects the drug has (i.e. law of unintended consequences)
3. There must be a million different chemicals that will bind to a particular protein -- which one should we use? (i.e. rather than can we use)
 
Molecular modeling is pretty darn computing intensive - there are a few supercomputers employed for doing pretty much what you ask for, but it is still much much much much much much much much much ... slower than simply doing high-throughput testing with libraries of chemicals and going from there. Screening real libraries also means you only test chemicals that are actually feasible to synthesize.
But the notion that it is impossible is not true. That said, DNA is a very simple molecule and seldom a feasible drug target - proteins are what you want interactions for, and those are by no means simple or even well known for the most part...
 
I was actually referring to 'Why can't computers show us an example of a drug that will bind to molecule x' rather than the problems with actually making it. I'm talking about why can't we design them.

Most of our knowledge of what will bind seems to come from biology rather than chemistry. For example Helix-turn-helix and zinc fingers. It just seems that we can stand back and say "Oh yes, I can see how that binds, it makes perfect sense", but then we can't make one that binds, ourselves.

I mean, I'm not surprised that we can't design a perfect molecule with a computer, but I'm surprised that the computer seems to be a useless tool in this area. Although I suppose that could be partly because we don't want the designed molecule to bind too strongly.
 
I am not sure where you get this idea. There is quite a lot of literature on computing interactions between (mostly small) molecules. The problem is not that it cannot be done but that you need huge computing power to compute interactions involving at least one large molecule.
Also, we know a lot about the interaction of molecules at the atomic level - hydrogen bonds, ionic interactions, van der Waals forces etc. So I would actually say most of our understanding of interactions comes from there. The reason we look at big features of proteins (like the ones you cite) is that it is MUCH easier to determine that a protein has an alpha helix than to determine exactly how its individual atoms are positioned - and without that knowledge we can only guess how it works, even if the guess is well informed by what we know about the effects that helices, for example, have on the behavior of a protein.

I don't think it is really a problem of not being able to compute these things but rather having all relevant (i.e. structural) information about the target (which we usually do not have) and having sufficient computing power. We know pretty well how the physical interaction works (there is no chemistry involved here - unless you want to not only compute binding but also reaction with each other).
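To put a rough number on the "huge computing power" point, here is a minimal sketch of the kind of pairwise sum a classical force field evaluates. It is only an illustration under stated assumptions: the function names, the single shared epsilon/sigma pair, and the atom dictionaries are made up for this example, and real packages use per-atom-type parameters, cutoffs and many more terms.

```python
import itertools
import math

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2), as commonly used
# in force fields (the exact digits vary slightly between packages).
COULOMB_K = 332.06


def pair_energy(a, b, epsilon=0.1, sigma=3.4):
    """Lennard-Jones 12-6 plus Coulomb energy for one atom pair (kcal/mol).

    epsilon and sigma are illustrative defaults, not real atom-type values.
    """
    r = math.dist(a["xyz"], b["xyz"])  # distance in Angstrom
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    coulomb = COULOMB_K * a["q"] * b["q"] / r
    return lennard_jones + coulomb


def total_energy(atoms):
    """Sum the pair energy over every unique pair of atoms."""
    return sum(pair_energy(a, b) for a, b in itertools.combinations(atoms, 2))


def n_pairs(n_atoms):
    """Number of unique atom pairs, which grows as n squared."""
    return n_atoms * (n_atoms - 1) // 2


if __name__ == "__main__":
    for n in (50, 5000, 50000):
        print(n, "atoms ->", n_pairs(n), "pairs per energy evaluation")
```

A 50-atom drug-like molecule needs 1,225 pair terms per evaluation, while a 50,000-atom protein-plus-water system needs about 1.25 billion if no cutoff is applied, and a simulation repeats that evaluation for every step.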
 
Just to add to that:
We don't know the shape of most drug targets. Complex molecules cannot be modelled perfectly, although we can generally guess a rough shape from recognising amino acid sequences that lead to certain familiar shapes.

We know very accurately the shape of many drugs used in drug libraries, and I believe that companies do indeed use drugs of similar shapes first, when they find that one molecule seems to have an effect.
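As a rough illustration of "try similar molecules first", here is a minimal sketch that ranks a library by Tanimoto similarity to a known hit. Note the swap: this compares sets of structural-feature indices (a fingerprint), which is a common stand-in for similarity, not a true 3D shape comparison. The feature sets and compound names are invented for the example.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here: nothing to compare
    return len(a & b) / len(a | b)


def rank_by_similarity(hit, library):
    """Return (name, similarity) pairs, most similar to the hit first."""
    scored = [(name, tanimoto(hit, features)) for name, features in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    known_hit = {1, 2, 3, 4}          # hypothetical fingerprint of the hit
    library = {                        # hypothetical compound library
        "compound_A": {1, 2, 3, 5},
        "compound_B": {7, 8, 9},
        "compound_C": {1, 2, 3, 4, 6},
    }
    for name, similarity in rank_by_similarity(known_hit, library):
        print(name, round(similarity, 2))
```

The ranking only says which compounds to test first; whether they actually bind still has to come from the assay.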

Crystallisation and crystallography are difficult, laborious techniques, as you know, and have only been done for a few very important proteins.

Molecules that bind to DNA are relatively unimportant because they need to penetrate the cell and nuclear membranes for an effect.
 
The whole grid computing movement, or at least most of it, is aimed at computing binding sites for possible drugs vs proteins. Hook up a few million PCs and use up all their spare cycles, and you get plenty of computing power.

The trouble is finding what inputs to give to the process.
 
:lol: I actually have two of those on the go on my computer. I forgot about them!
 
Rational drug design has been around since the mid-1980s. It is possible to calculate the spatial arrangement of the target receptor protein (the "lock") and then design a small molecule with the right functional groups in the right positions to form an alternative "key". Among the first products designed in this fashion to reach the market were the neuraminidase inhibitors that are used to treat and prevent flu infections.

Unfortunately, rational drug design did not turn out to be the drug industry's answer to the hit-and-miss approach, because the right spatial arrangement of functional groups is not a solid predictor of efficacy or toxicity.
 
Crystallisation and crystallography are difficult, laborious techniques, as you know, and have only been done for a few very important proteins.

Also, many proteins have been impossible to crystallize.

@Thread: Not enough base data on the structure of proteins and DNA, and not enough computing power even if we had the data.

And remember that interactions are rarely 1-on-1: there is a whole melange of stuff going on in the cell that will affect the binding, which hugely increases the number of factors needed to make an accurate prediction.
 
I'm guessing you want a way to automate the design of an inhibitor protein that binds directly to a DNA sequence to prevent its activation. So for instance, a way to custom order proteins guaranteed to eliminate the symptoms of a genetic disease.

Predicting molecule binding at the atomic level is still very hypothetical and prone to error. The computational modeling is also very time consuming and costly. Predicting how a custom-designed protein sequence will fold into a three-dimensional protein, and how that protein will interact with other proteins, is similarly difficult.

It's not that science doesn't have a clue, but that the science (mainly thermodynamics on the molecular/atomic scale) relies heavily on empirical studies of known protein systems. It's not a matter of just manipulating atoms to spell "IBM" with some kind of nano tool, but a matter of having more and more successful real-world examples from which to predict how a brand new system will behave.

Obviously if we couldn't do anything, pharmaceuticals wouldn't be made. We can work out solutions using educated guesses, but there is no perfectly engineered system for doing a complete design by computer, as the computer models are incomplete, since our understanding is incomplete. Hence, pharmaceutical discovery is still very expensive, and big business.

I would put some of the problem down to understanding interactions at the sub-atomic (quantum?) level and applying that knowledge back up to the traditional sciences. Some of the problem is also that crystallography (an empirical method of studying molecules and their interactions at the atomic level) is very difficult. It is hard to always get the specific molecule you want, in the specific reaction, crystallized and analyzed.

 
StarWorms said:
I was actually referring to 'Why can't computers show us an example of a drug that will bind to molecule x' rather than the problems with actually making it. I'm talking about why can't we design them.

Calculating the exact shape (conformation) of even quite a small molecule takes a considerable amount of computing power. You've got van der Waals forces, steric interactions, hydrogen bonding, resonance structures, and a load of other factors to account for. On top of those you've got the problem that a molecule may not necessarily have the same shape at different temperatures, pHs, or in the presence of certain metal ions.

Now consider the fact that a molecule in solution often won't have a single conformation - it'll have several with similar energies (which determine the relative proportions of molecules in each conformation). Only some of these will bind to a given protein, and the one that binds may not be the most abundant conformation in solution.
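The link between conformer energies and their relative proportions is the Boltzmann distribution. A minimal sketch, assuming the energies are relative free energies in kcal/mol and ignoring degeneracy; the function name and the example energies are made up for illustration.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_populations(energies_kcal, temperature_k=298.15):
    """Fraction of molecules in each conformer at thermal equilibrium."""
    rt = R_KCAL * temperature_k
    lowest = min(energies_kcal)  # shift so the largest weight is exactly 1
    weights = [math.exp(-(e - lowest) / rt) for e in energies_kcal]
    partition = sum(weights)
    return [w / partition for w in weights]


if __name__ == "__main__":
    # Three hypothetical conformers, energies relative to the lowest one.
    for energy, fraction in zip([0.0, 0.5, 1.5], boltzmann_populations([0.0, 0.5, 1.5])):
        print(f"{energy:.1f} kcal/mol -> {fraction:.1%}")
```

With these made-up numbers the split at room temperature is roughly 66% / 28% / 5%, so a conformer only 1.5 kcal/mol above the minimum is still present, and it could be the one that binds.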

As others have pointed out, we don't know the exact shape of many proteins - the closest we can really get is from crystal structures, but a lot of proteins won't crystallise, and even the crystal structures we do have are far from perfect. Have you considered the possibility that the protein may change shape slightly to bind to a small molecule? Structures of ligands bound to proteins are incredibly scarce. Here the best you can do is look at molecules similar to known binders, and screen for binding with the desired characteristics.

I've used computer programs for docking small molecules to proteins, and have learnt that it really only provides a starting guess to work from.
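To show why docking output is only a starting guess, here is a deliberately crude sketch of what a docking search does: slide a rigid ligand over a rigid receptor on a grid and keep the lowest-scoring position. Everything here is a toy assumption (2D points, one Lennard-Jones-style score, translation only, no rotation, no flexibility, no water); it is not how any particular docking program works.

```python
import itertools
import math


def lj(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones 12-6 score for one atom pair at distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def score(receptor, ligand, shift):
    """Sum pair scores with the ligand translated by shift = (dx, dy)."""
    total = 0.0
    for rx, ry in receptor:
        for lx, ly in ligand:
            r = math.hypot(lx + shift[0] - rx, ly + shift[1] - ry)
            total += lj(max(r, 0.3))  # clamp so overlapping atoms do not divide by zero
    return total


def grid_dock(receptor, ligand, span=3.0, step=0.25):
    """Try every translation on a square grid; return (best_score, best_shift)."""
    n = int(round(span / step))
    offsets = [i * step for i in range(-n, n + 1)]
    candidates = (
        (score(receptor, ligand, (dx, dy)), (dx, dy))
        for dx, dy in itertools.product(offsets, offsets)
    )
    return min(candidates)


if __name__ == "__main__":
    receptor = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.8)]  # hypothetical pocket atoms
    ligand = [(0.0, 0.0), (1.0, 0.0)]                # hypothetical two-atom ligand
    print(grid_dock(receptor, ligand))
```

Even this toy has to score hundreds of positions, and it finds the best pose only for the score it was given; adding rotations, ligand conformers, a flexible protein and solvent is what makes the real problem expensive and the answers approximate.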
 
A friend of mine did a diploma in computational chemistry, so I have some second-hand knowledge on this topic.

Computation power is very limited in the sense that you can only model a part of the protein together with the binding molecule dynamically. You have to have a good starting point for the protein's conformation, otherwise you won't reach the "correct" conformation within a sensible timeframe. Additionally, you can only include a certain amount of "environment" like water, ions and so on. You have to be very good at setting these parameters to reach a model that works for proteins in real solutions. It can be, for instance, that ions or water molecules are adsorbed in the active centre of the protein and are necessary for the substrate-protein interaction, and so on.

Other, more "classic" reasons why it's not so easy to design molecules are things like polarity, solubility and stability, as they determine
-intake into the body (the right hydrophobicity to be taken up in the digestive system yet still dissolve in the bloodstream),
-rate of decay of the molecule (there are lots of mechanisms for getting rid of xenobiotics, yet you don't want your medication to stay in the body for too long, obviously)
 
The answer is 'de novo protein synthesis'.
http://en.wikipedia.org/wiki/Protein_folding#Computational_prediction_of_protein_tertiary_structure
http://en.wikipedia.org/wiki/Protein_folding

Wikipedia said:
Because of the many possible ways of folding, there can be many possible structures. A peptide consisting of just five amino acids can fold into over 100 billion possible structures.[15]
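To get a feel for how quickly such counts grow, here is a back-of-the-envelope sketch. It assumes each residue independently picks one of a small fixed number of backbone states, which is a made-up simplification for illustration and not the model behind the Wikipedia figure; the sampling rate is equally hypothetical.

```python
def conformation_count(n_residues, states_per_residue=3):
    """Number of chain conformations if every residue has the same few states."""
    return states_per_residue ** n_residues


def years_to_enumerate(n_conformations, per_second=1e12):
    """Years needed to visit every conformation at a fixed sampling rate."""
    seconds_per_year = 365.25 * 24 * 3600
    return n_conformations / (per_second * seconds_per_year)


if __name__ == "__main__":
    for n in (5, 20, 100):
        count = conformation_count(n)
        print(n, "residues:", count, "conformations,",
              f"{years_to_enumerate(count):.3g} years at 1e12 per second")
```

Even with only three states per residue, a 100-residue chain has about 5e47 conformations, so brute-force enumeration is hopeless and prediction methods have to be smarter than exhaustive search.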


Wikipedia said:
The determination of the folded structure of a protein is a lengthy and complicated process, involving methods like X-ray crystallography and NMR. One of the major areas of interest is the prediction of native structure from amino-acid sequences alone using bioinformatics and computational simulation methods.

EDIT: oops, forgot I already responded a while ago. D'OH!!
 
Another complicating thing with proteins is so-called post-translational modification. This means that after synthesis proteins get modified by enzymes - amino acids (AAs) can be cut off, disulfide bridges can be formed by oxidation, other parts can be oxidised (proline to hydroxyproline comes to mind), sugars can be added, metals can be incorporated in vital parts, the C-terminal AA can be decarboxylated, etc.
I'm not saying de-novo sequencing is useless, merely that after de-novo sequencing those reactions have to be applied as well, and the knowledge on modification patterns is rather incomplete. Not to mention that because of those modifications all proteins have several (dozen) isomers. The good thing is that not all of those modifications influence the conformation of the active centre.
 