{"cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# Exercise 2.4: ORF detection\n", "\n", "This exercise was inspired by [Libeskind-Hadas and Bush, *Computing for Biologists*, Cambridge University Press, 2014](https://www.cs.hmc.edu/CFB).\n", "\n", "
"]}, {"cell_type": "markdown", "metadata": {}, "source": ["**a)** Write a function, `longest_orf()`, that takes a DNA sequence as input and finds the longest open reading frame (ORF) in the sequence (we will not consider reverse complements). A sequence fragment constitutes an ORF if the following are all true.\n", "\n", "1. It begins with `ATG`.\n", "2. It ends with any of `TGA`, `TAG`, or `TAA`.\n", "3. The total number of bases is a multiple of 3.\n", "\n", "Note that the sequence `ATG` may appear in the middle of an ORF. So, for example,\n", "\n", " GGATGATGATGTAAAAC\n", "\n", "has two ORFs, `ATGATGATGTAA` and `ATGATGTAA`. You would return the first one, since it is longer of these two.\n", "\n", "*Hint: The statement for this problem is a bit ambiguous as it is written. What other specification might you need for this function?*"]}, {"cell_type": "markdown", "metadata": {}, "source": ["**b)** Use your function to find the longest ORF from the section of the *Salmonella* genome we are investigating."]}, {"cell_type": "markdown", "metadata": {}, "source": ["**c)** Write a function that converts a DNA sequence into a protein sequence. You can of course use the `bootcamp_utils` module."]}, {"cell_type": "markdown", "metadata": {}, "source": ["**d)** Translate the longest ORF you generated in part (b) into a protein sequence and perform a [BLAST search](http://blast.ncbi.nlm.nih.gov/). Search for the protein sequence (a blastp query). What gene is it?"]}, {"cell_type": "markdown", "metadata": {}, "source": ["**e)** [*Bonus challenge*] Modify your function to return the `n` longest ORFs. Compute the five longest ORFs for the *Salmonella* genome section we are working with. Perform BLAST searches on them. What are they?"]}, {"cell_type": "markdown", "metadata": {}, "source": ["
"]}], "metadata": {"anaconda-cloud": {}, "kernelspec": {"display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12"}}, "nbformat": 4, "nbformat_minor": 4}