Hey guys, in this video we're going to talk about indirect protein sequencing via genomic analyses. Up until this point in our course, we've only focused on direct protein sequencing methods such as tandem mass spectrometry or Edman degradation. Direct protein sequencing is used on already extracted or isolated proteins, and it's able to directly identify the sequence of unknown proteins in a sample. However, direct protein sequencing does not account for how biochemists obtain most of their protein sequencing data. Most protein sequencing data is actually derived indirectly from genomic analyses, that is, by translating the nucleotide sequences of genes into amino acid sequences.

And so this brings up the question: why is most of the protein sequencing data obtained via genomic analyses? Well, it turns out that it saves a boatload of time. Working with DNA is significantly easier than working with proteins in a lab. That's because proteins are really sensitive to lots of conditions, and they can be pretty easily denatured if the temperature is off or if the pH is different, whereas DNA is more resistant to decomposing and breaking apart. Because DNA is more stable, it's easier to work with, and that allows us to work with DNA faster. It also turns out that DNA sequencing is significantly faster, cheaper, more efficient, and more informative than direct protein sequencing: direct protein sequencing only gives us the amino acid sequence, but DNA sequencing gives us the nucleotide sequence, and from that nucleotide sequence we can derive the amino acid sequence using the genetic code. So overall, genomic analyses allow us to collect more data, and more protein sequencing data, faster.
And so that begs the question: if genomic analysis lets us obtain more protein sequencing data faster, why do we even need direct protein sequencing? Well, it turns out that we can't just scrap direct protein sequencing, because it has its own set of advantages. One of them is that genomic analyses cannot identify an unknown protein sample on their own, and that's something direct protein sequencing can easily do. That's because genomic analyses need a DNA sample, so if we only have an unknown protein, or just protein, we're not able to perform genomic analyses on it. In addition to that, unlike genomic analyses, direct protein sequencing via tandem mass spectrometry can reveal chemically modified amino acid residues. That allows us to identify proteins that are modified, such as lipoproteins, for instance, which are proteins that are covalently attached to lipids. Genomic analysis does not reveal chemically modified amino acid residues, but direct protein sequencing can. So that's another reason why we can't just scrap all of the direct protein sequencing techniques.

The rest of this video is going to refresh our memories on how the genetic code works, which is what allows us to perform genomic analyses. Recall from our previous videos that the genetic code reveals the connection between the codons of nucleic acids and the amino acids of proteins. In our example below, we're going to use the genetic code to reveal the peptide sequence in the example shown over here on the right.
On the left here we have the genetic code. Recall that the genetic code is read from the codons of the mRNA, and each codon has three nucleotides. In this genetic code chart, we have the first base of the codon on the left, the second base of the codon on the top, and the third base of the codon over here on the right. The first base of the codon limits us to one particular row, the second base limits us to one particular column, and the third base limits us to a specific position in a box.

What you'll see here is that we have a DNA coding sequence provided, and you can see that it has a 5' end and a 3' end. We know that this DNA coding sequence can be converted into an mRNA sequence through the process of transcription, represented here by this arrow. The mRNA sequence is going to be exactly the same as the DNA coding sequence up above, except that all of the T's, all of the thymines, are going to be converted into U's, or uracils, because mRNA only has uracils. So these two thymines here are going to be converted into uracils in our mRNA sequence.

Now that we have our mRNA sequence, we know that the genetic code reads the mRNA sequence in codons, which are sets of three nucleotides. Our first codon is these first three nucleotides, AUG. The first base of our codon is A, so it limits us to this row. The second base of our codon is U, so it limits us to one particular column. The overlap between these two is this box right here. And then the third base of our codon is G.
And so that limits us to this particular position within the box, which is AUG, and the AUG codon corresponds with a methionine amino acid residue. That's why we have methionine as the first residue on the N-terminal end of our peptide. Moving on to our next codon, we have GCU. The first base, G, limits us to this row, then C limits us to this column, so now we're in this box, and then U limits us to this one particular position, GCU, which is an alanine amino acid residue. So over here we can put an A for alanine in that position.

We can continue through this process and move on to our next codon, which is GGC. The first G is here in this row, and the second base is also G, which limits us to this column, so now we're in this box. Then C limits us to GGC, which is glycine, so glycine is our next residue. By now you guys are probably remembering how this works, so we can fill out the rest of these codons. After GGC, we have the codon starting with CG, then AGC, and then last but not least the codon starting with AA. The CG codon corresponds with an arginine, AGC corresponds with a serine, and the AA codon corresponds with a lysine.

And so what we can see is that the amino acid sequence of our peptide is revealed through genomic analyses. We obtain the DNA, sequence it, and then, following transcription and translation with the genetic code, we're able to obtain the sequence of our peptide. This is an indirect method for sequencing our peptides, and that's exactly how indirect sequencing via genomic analyses works. In our next couple of videos we'll get some practice using the genetic code and indirect protein sequencing, so I'll see you guys in those practice videos.
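If it helps to see the walkthrough above as code, here is a minimal Python sketch of the same two steps: swap T for U to get the mRNA, then read it three bases at a time against the standard genetic code. The function names (`transcribe`, `translate`) are just illustrative, and the 18-base DNA sequence is an assumed stand-in, not necessarily the exact one on screen. It was chosen to give the same residues as the video (Met, Ala, Gly, Arg, Ser, Lys), and the third bases of the arginine and lysine codons in particular are guesses.

```python
from itertools import product

# Bases in the usual chart order: first base picks the row,
# second base picks the column, third base picks the slot in the box.
BASES = "UCAG"

# One-letter amino acids of the standard genetic code in that same order
# ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry codon -> amino acid lookup table.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Coding-strand DNA (5'->3') to mRNA: same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA codon by codon, stopping at a stop codon if one appears."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Assumed example sequence (illustrative only, see the note above).
coding_dna = "ATGGCTGGCCGGAGCAAG"
mrna = transcribe(coding_dna)
print(mrna)             # AUGGCUGGCCGGAGCAAG
print(translate(mrna))  # MAGRSK
```

The one-letter output `MAGRSK` reads as Met-Ala-Gly-Arg-Ser-Lys, from the N-terminal end to the C-terminal end.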
2
Problem
Use the genetic code above & the coding DNA sequence below to determine the protein sequence.
3
Problem
Suppose the sequence below is a template DNA sequence. What is the corresponding protein sequence?
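As a hint for the template-strand case, here is a small Python sketch of the extra step involved. The mRNA is complementary to the template strand and runs antiparallel to it, so the template has to be reversed and complemented before reading codons. The helper name `template_to_mrna` and the six-base example are hypothetical, and the sketch assumes the template is written 5' to 3'. If the problem writes it 3' to 5', skip the reversal.

```python
# Base pairing used during transcription: template base -> mRNA base.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_to_mrna(template_dna: str) -> str:
    """Template DNA written 5'->3' to the mRNA written 5'->3'.

    Reversing handles the antiparallel orientation; the lookup handles
    base pairing (A-U, T-A, G-C, C-G).
    """
    return "".join(TEMPLATE_TO_RNA[base] for base in reversed(template_dna.upper()))


# Hypothetical template strand, not the one from the problem.
print(template_to_mrna("AGCCAT"))  # AUGGCU
```

From there, the mRNA is read in codons with the genetic code exactly as in the coding-strand case.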
4
Problem
Even when the sequence of nucleotides for a gene is available and genomic analyses can be performed, direct chemical techniques on the physical protein are still required to determine:
A
The molecular weight of a simple protein.
B
The N-terminal amino acid residue.
C
The total number of amino acid residues in the protein.