Natural Science Forum / Biology / Biology / December 2007

What exactly was sequenced by the Human Genome Project?

bennetthaselton@gmail.com - 20 Dec 2007 02:12 GMT
I've read many alleged descriptions of what the Human Genome Project
did but I still don't clearly understand what they mean by having
"sequenced" the human genome.

If an individual person's DNA consists of a sequence G, A, T and C
arranged in codons of 3 each, but every person's DNA is different,
then is it possible to explain, *in terms of the G, A, T, C sequence*:
- what did we already know before the Human Genome Project?
- what new knowledge did we have after the Human Genome Project was
finished?  (Wikipedia says for example 'mapping "the human genome"
involves sequencing multiple variations of each gene'.  Does that mean
deciding on a starting and ending "boundary" for each gene, which
occurs at the same position in the sequence for every human, and then
counting all the possible variations of it?)
and
- what do we still not know?  (I assume this category includes, for
example, we don't know where in the sequence to find the genes for
specific traits like the "gene for a large right big toe".)
Bob - 20 Dec 2007 05:06 GMT
>I've read many alleged descriptions of what the Human Genome Project
>did but I still don't clearly understand what they mean by having
>"sequenced" the human genome.

Good for you. :-) It is not simple.

>If an individual person's DNA consists of a sequence G, A, T and C
>arranged in codons of 3 each, but every person's DNA is different,
>then is it possible to explain, *in terms of the G, A, T, C sequence*:
>- what did we already know before the Human Genome Project?

Very little. We had fragmentary information at various levels. For
simplicity, let's set this aside for the moment.

>- what new knowledge did we have after the Human Genome Project was
>finished?  

We have a "reference" genome. DNA from one person (or actually from a
small mixture of people) has been sequenced. That is, we know _this_
sequence of A, G, C and T, from "one end to the other" (with some
reservations). We do not know how typical it is, and we have almost no
info on variability. Further, the sequence alone does not provide any
information about where genes start and stop.

The general idea is that it is a starting point, that we will build
on. Thus the idea of a "reference" genome.

There are separate projects on looking for variations.
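
To make "reference plus variations" concrete, here is a minimal Python
sketch of comparing one individual's sequence against a reference. The
sequences are made up for illustration, and the sketch assumes the two
are already aligned and of equal length (real genomes also differ by
insertions and deletions, which this ignores).

```python
def find_differences(reference, individual):
    """Return (position, ref_base, ind_base) for every mismatching position."""
    if len(reference) != len(individual):
        raise ValueError("this sketch assumes equal-length, pre-aligned sequences")
    return [
        (pos, ref_base, ind_base)
        for pos, (ref_base, ind_base) in enumerate(zip(reference, individual))
        if ref_base != ind_base
    ]


# Hypothetical toy sequences, not real human DNA.
reference = "ATGGCGTACGTTAGC"
individual = "ATGGCGTACATTAGC"

differences = find_differences(reference, individual)
print(differences)
```

Positions are counted from 0. The point is only that a variation is
described relative to the reference, which is why the reference has to
exist first.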

The process of trying to identify genes along the sequence is called
annotation. It typically requires more info than just the DNA sequence
to find genes. Some annotation was reported in the early reports, but
it continues.

Also note that the complete genome sequences for two specific
individuals (Venter and Watson) were announced recently.

>(Wikipedia says for example 'mapping "the human genome"
>involves sequencing multiple variations of each gene'.  

That is confusing things.

>Does that mean
>deciding on a starting and ending "boundary" for each gene, which
[quoted text clipped - 4 lines]
>example, we don't know where in the sequence to find the genes for
>specific traits like the "gene for a large right big toe".)

As with so many parts of this, that will vary. There are traits where
we do know the gene. For example, the genes for hemoglobin have long
been known. However, the more complex the trait, the less likely we
know the many genes that are involved.

I think it is probably best to think of "the human genome" -- esp that
reference genome -- as a dataset that people will use for various
purposes. Alone, it doesn't tell us much. But it allows better
progress on many fronts.

Hope that helps some, while still trying to be brief here.

bob
bennetthaselton@gmail.com - 20 Dec 2007 16:55 GMT
> On Wed, 19 Dec 2007 18:12:59 -0800 (PST), bennetthasel...@gmail.com
> wrote:
[quoted text clipped - 35 lines]
> Also note that the complete genome sequences for two specific
> individuals (Venter and Watson) were announced recently.

OK, but I'm confused about what is special about that, if the
completion of the Human Genome Project meant the complete sequencing
of DNA from a small mixture of people.  If it was such a big milestone
to sequence all the DNA from one individual, why didn't they just do
that from the very beginning, instead of sequencing DNA from the
"small mixture"?

> >(Wikipedia says for example 'mapping "the human genome"
> >involves sequencing multiple variations of each gene'.  
[quoted text clipped - 14 lines]
> been known. However, the more complex the trait, the less likely we
> know the many genes that are involved.

Thanks for that explanation.  How do they look for the gene for a
particular trait?  Do they observe directly the way that DNA sequences
influence synthesis of a particular protein, or do they just observe
that people with one trait have one sequence of DNA in a particular
place, while people with another trait have a different sequence?

After mapping 1 person's DNA, suppose scientists then map the DNA of
100 different individuals.  You could then choose some trait such that
50 people in your group have it and 50 don't, and then look for places
in the DNA sequence where everybody in the first group has property X
and everybody in the second group has property Y.  Would this be a
reliable way of finding if a specific part of DNA sequence correlated
with a particular trait, and hence that that part of the DNA sequence
would in fact be the gene for that trait?
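
The comparison described above can be sketched in a few lines of
Python. This is only an illustration under strong assumptions: the
sequences are invented, they are already aligned and equal in length,
and the function name is hypothetical. It reports positions where
everyone in one group shares one base and everyone in the other group
shares a different base.

```python
def perfectly_associated_positions(with_trait, without_trait):
    """Positions where each group is uniform internally but the groups differ."""
    length = len(with_trait[0])
    hits = []
    for pos in range(length):
        bases_with = {seq[pos] for seq in with_trait}
        bases_without = {seq[pos] for seq in without_trait}
        uniform = len(bases_with) == 1 and len(bases_without) == 1
        if uniform and bases_with != bases_without:
            hits.append(pos)
    return hits


# Hypothetical toy data: three people with the trait, three without.
with_trait = ["ACGTAC", "ACGTAT", "ACGTAC"]
without_trait = ["ACCTAC", "ACCTAT", "ACCTAC"]

hits = perfectly_associated_positions(with_trait, without_trait)
print(hits)
```

Position 2 is reported because the groups split cleanly there (G
versus C). Position 5 is not, because both groups contain a mix of C
and T. A clean split like this is a correlation, so it would flag a
candidate region rather than prove that the position is the gene.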

> I think it is probably best to think of "the human genome" -- esp that
> reference genome -- as a dataset that people will use for various
> purposes. Alone, it doesn't tell us much. But it allows better
> progress on many fronts.
Bob - 21 Dec 2007 03:47 GMT
>> On Wed, 19 Dec 2007 18:12:59 -0800 (PST), bennetthasel...@gmail.com
>> wrote:

>> Also note that the complete genome sequences for two specific
>> individuals (Venter and Watson) were announced recently.
[quoted text clipped - 5 lines]
>that from the very beginning, instead of sequencing DNA from the
>"small mixture"?

Not sure I know that offhand.

With the public genome project, it may have been a "requirement" that
no one's personal information be revealed.

Why Celera chose to do a mixture, I don't know; perhaps it is in the
records somewhere. (Perhaps Venter was not ready to agree to
disclosure of his genome when that project started.)

Much of the Celera data was from Venter's DNA. The Venter genome that
was finally released was based on that, plus additional data as
needed. In fact, they sorted out the separate sequences of each
chromosome of his diploid set. They do not know which is from mother
and which is from father, though finding out is under consideration.
(His mother is still alive, as I recall.)

The Watson genome was a new project, designed to show off a new
sequencing technique.

>> >- what do we still not know?  (I assume this category includes, for
>> >example, we don't know where in the sequence to find the genes for
[quoted text clipped - 10 lines]
>that people with one trait have one sequence of DNA in a particular
>place, while people with another trait have a different sequence?

Both of those are logical.

The big problem is limited info and expensive technologies. So what
one does depends on what is known -- and there is a lot of
bootstrapping.

With the basic genome at hand, people are trying to assign function to
putative genes. This is easy for some, hard for others. But the more
we know, the easier it is to proceed.

Data from animal models offers clues -- that need to be followed up.
In fact, with mice, one can intentionally change the gene and see what
the effect is. This year's Nobel in Medicine was actually about this.
Since mice are very much like people... (Seriously, a caution. Mouse
data gives clues, but some are right and some are not.)

>After mapping 1 person's DNA, suppose scientists then map the DNA of
>100 different individuals.  

Ok, let's pause there. The cost of sequencing a person's genome is
still on the order of 1-10 million dollars (US). The dream (and goal)
is to get that down to more like a thousand dollars per genome -- at
which point it will become practical to routinely sequence people’s
whole genomes. But for now, one will only sequence 100 individuals for
a particular gene (or genetic region) known (or considered likely) to
be of interest. If the gene is known, good. If not, then we are back
to that problem -- but more and more genes are getting assigned.

>You could then choose some trait such that
>50 people in your group have it and 50 don't, and then look for places
[quoted text clipped - 3 lines]
>with a particular trait, and hence that that part of the DNA sequence
>would in fact be the gene for that trait?

Indeed. In fact, a method long used to try to find the gene is similar
to what you say, except not involving sequencing (which is too
expensive). Instead, it involves the simpler method of mapping
restriction fragments, which reflects the sequence (but at a gross
level, and can miss things).
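
A rough Python sketch of why restriction fragments reflect the sequence
only at a gross level. It assumes a single enzyme that cuts at GAATTC
(the EcoRI recognition site, cut after the first G); the sequences and
the function name are made up for illustration, and the list of
fragment lengths stands in for what would be measured on a gel.

```python
def fragment_lengths(sequence, site="GAATTC", cut_offset=1):
    """Lengths of the fragments left after cutting at every occurrence of `site`."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Hypothetical toy sequences, each 20 bases long.
person_a = "AAGAATTCCCGGGAATTCTT"  # two cut sites
person_b = "AAGAATTCCCGGGAGTTCTT"  # one base changed inside the second site
person_c = "AAGAATTCTCGGGAATTCTT"  # one base changed outside any site

lengths_a = fragment_lengths(person_a)
lengths_b = fragment_lengths(person_b)
lengths_c = fragment_lengths(person_c)
print(lengths_a, lengths_b, lengths_c)
```

The change in person_b destroys a cut site, so the fragment pattern
differs from person_a. The change in person_c falls between sites, so
the pattern is identical to person_a and the difference goes
undetected, which is the sense in which the method can miss things.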

bob
 