:py:mod:`pyranges.get_fasta`
============================

.. py:module:: pyranges.get_fasta


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   pyranges.get_fasta.get_fasta


.. py:function:: get_fasta(gr, path=None, pyfaidx_fasta=None)

   Get fasta sequence.

   :param gr: Coordinates.
   :type gr: PyRanges
   :param path: Path to the FASTA file. It is indexed with pyfaidx if no index is found.
   :type path: str
   :param pyfaidx_fasta: Alternative way to provide the FASTA target, as a pyfaidx.Fasta object.
   :type pyfaidx_fasta: pyfaidx.Fasta

   :returns: Sequences, one per interval.
   :rtype: Series

   .. note::

      Sorting the PyRanges is likely to improve the speed.
      Intervals on the negative strand will be reverse complemented.

   .. warning::

      The sequence names in the FASTA headers and the chromosome names in gr must be the same.

   .. rubric:: Examples

   >>> gr = pr.from_dict({"Chromosome": ["chr1", "chr1"],
   ...                    "Start": [5, 0], "End": [8, 5]})

   >>> gr
   +--------------+-----------+-----------+
   | Chromosome   |     Start |       End |
   | (category)   |   (int32) |   (int32) |
   |--------------+-----------+-----------|
   | chr1         |         5 |         8 |
   | chr1         |         0 |         5 |
   +--------------+-----------+-----------+
   Unstranded PyRanges object has 2 rows and 3 columns from 1 chromosomes.
   For printing, the PyRanges was sorted on Chromosome.

   >>> tmp_handle = open("temp.fasta", "w+")
   >>> _ = tmp_handle.write("> chr1\n")
   >>> _ = tmp_handle.write("ATTACCAT")
   >>> tmp_handle.close()

   >>> seq = pr.get_fasta(gr, "temp.fasta")
   >>> seq
   0      CAT
   1    ATTAC
   dtype: object

   >>> gr.seq = seq
   >>> gr
   +--------------+-----------+-----------+------------+
   | Chromosome   |     Start |       End | seq        |
   | (category)   |   (int32) |   (int32) | (object)   |
   |--------------+-----------+-----------+------------|
   | chr1         |         5 |         8 | CAT        |
   | chr1         |         0 |         5 | ATTAC      |
   +--------------+-----------+-----------+------------+
   Unstranded PyRanges object has 2 rows and 4 columns from 1 chromosomes.
   For printing, the PyRanges was sorted on Chromosome.
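The coordinate convention implied by the example output (``Start`` is 0-based and inclusive, ``End`` is exclusive) and the reverse complementing of negative-strand intervals can be sketched in plain Python. This is only an illustration under those assumptions, not how pyranges or pyfaidx implement the lookup; ``extract`` and ``_COMPLEMENT`` are hypothetical names introduced here.

```python
# Hypothetical helper illustrating the coordinate convention; not part of pyranges.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract(sequence, start, end, strand="+"):
    """Return sequence[start:end]; reverse complement it when strand is '-'."""
    piece = sequence[start:end]  # 0-based start, exclusive end
    if strand == "-":
        # Complement each base, then reverse the result.
        piece = piece.translate(_COMPLEMENT)[::-1]
    return piece


chrom = "ATTACCAT"  # same sequence as written to temp.fasta in the example
print(extract(chrom, 5, 8))       # CAT
print(extract(chrom, 0, 5))       # ATTAC
print(extract(chrom, 5, 8, "-"))  # ATG
```

The first two calls reproduce the ``CAT`` and ``ATTAC`` values from the doctest; the third shows what the same interval would yield on the negative strand.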