RangeFrame
The RangeFrame is the parent class of PyRanges. It supports interval-based operations that do not require the data to contain Chromosome and Strand information. It is a subclass of pandas.DataFrame.
- class pyranges1.range_frame.range_frame.RangeFrame(*args, **kwargs)
Class for range based operations.
A table with Start and End columns. Parent class of PyRanges. Subclass of pandas DataFrame.
- cluster_overlaps(*, match_by: str | Iterable[str] | None = None, cluster_column: str = 'Cluster', slack: int = 0) RangeFrame
Give overlapping intervals a common id.
- Parameters:
match_by (str or list, default None) – If provided, only intervals with an equal value in column(s) match_by may be considered as overlapping.
slack (int, default 0) – Length by which the criteria of overlap are loosened. A value of 1 clusters also bookended intervals. Higher slack values cluster more distant intervals (with a maximum distance of slack-1 between them).
cluster_column (str, default "Cluster") – Name of the cluster column added to the output.
- Returns:
RangeFrame with an ID column added, named by cluster_column (“Cluster” by default).
- Return type:
RangeFrame
See also
RangeFrame.merge_overlaps: combine overlapping intervals into one
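The effect of slack on clustering can be sketched in plain Python. This is only an illustration of the semantics described above, not the library's implementation: `cluster_ids` is a hypothetical helper, intervals are assumed to be half-open (start, end) tuples, and the ids it produces (starting at 0) need not match the values the library writes.

```python
def cluster_ids(intervals, slack=0):
    """Assign a shared id to intervals that overlap (hypothetical helper).

    Assumption: intervals are half-open (start, end) tuples, and two
    neighbours are clustered when the gap between them is smaller than slack.
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i])
    ids = [0] * len(intervals)
    cluster, max_end = -1, None
    for i in order:
        start, end = intervals[i]
        if max_end is None or start - max_end >= slack:
            cluster += 1  # gap too large: open a new cluster
            max_end = end
        else:
            max_end = max(max_end, end)  # extend the current cluster
        ids[i] = cluster
    return ids


intervals = [(1, 5), (5, 8), (9, 12), (20, 25)]
print(cluster_ids(intervals))           # slack=0: no true overlaps here
print(cluster_ids(intervals, slack=1))  # bookended (1, 5) and (5, 8) cluster
print(cluster_ids(intervals, slack=2))  # a gap of 1 is bridged as well
```

With slack=2 the maximum bridged distance is 1, matching the "slack-1" rule stated for the parameter.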
- combine_interval_columns(function: Literal['intersect', 'union', 'swap'] | CombineIntervalColumnsOperation = 'intersect', *, start: str = 'Start', end: str = 'End', start2: str = 'Start_b', end2: str = 'End_b', drop_old_columns: bool = True) RangeFrame
Use two pairs of columns representing intervals to create a new start and end column.
The function is designed as post-processing after join_overlaps to aggregate the coordinates of the two intervals. By default, the new start and end columns will be the intersection of the intervals.
- Parameters:
function ({"intersect", "union", "swap"} or Callable, default "intersect") – How to combine the self and other intervals: “intersect”, “union”, or “swap”. If a callable is passed, it should take four Series arguments: start1, end1, start2, end2; and return a tuple of two integers: (new_starts, new_ends).
start (str, default "Start") – Column name for Start of first interval
end (str, default "End") – Column name for End of first interval
start2 (str, default "Start_b") – Column name for Start of second interval
end2 (str, default "End_b") – Column name for End of second interval
drop_old_columns (bool, default True) – Whether to drop the above mentioned columns.
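As a rough illustration of the three named modes, the sketch below combines two interval pairs element-wise on plain lists. `combine` is a hypothetical helper, not part of the library, and the reading of “swap” as "take the second interval's coordinates" is an assumption.

```python
def combine(function, start1, end1, start2, end2):
    """Combine two interval pairs element-wise (hypothetical helper).

    Works on plain lists; the documented method operates on DataFrame columns.
    """
    if function == "intersect":
        # Shared part: latest start, earliest end.
        return ([max(a, b) for a, b in zip(start1, start2)],
                [min(a, b) for a, b in zip(end1, end2)])
    if function == "union":
        # Enclosing span: earliest start, latest end.
        return ([min(a, b) for a, b in zip(start1, start2)],
                [max(a, b) for a, b in zip(end1, end2)])
    if function == "swap":
        # Assumption: the second interval's coordinates become the new ones.
        return list(start2), list(end2)
    raise ValueError(f"unknown function: {function!r}")


start1, end1 = [1, 10], [8, 20]
start2, end2 = [5, 15], [12, 18]
print(combine("intersect", start1, end1, start2, end2))
print(combine("union", start1, end1, start2, end2))
```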
- copy(*args, **kwargs) RangeFrame
Make a copy of this object’s indices and data.
When deep=True (default), a new object will be created with a copy of the calling object’s data and indices. Modifications to the data or indices of the copy will not be reflected in the original object (see notes below).
When deep=False, a new object will be created without copying the calling object’s data or index (only references to the data and index are copied). Any changes to the data of the original will be reflected in the shallow copy (and vice versa).
Note
The deep=False behaviour as described above will change in pandas 3.0. Copy-on-Write will be enabled by default, which means that the “shallow” copy that is returned with deep=False will still avoid making an eager copy, but changes to the data of the original will no longer be reflected in the shallow copy (or vice versa). Instead, it makes use of a lazy (deferred) copy mechanism that will copy the data only when any changes to the original or shallow copy are made.
You can already get the future behavior and improvements through enabling copy on write:
pd.options.mode.copy_on_write = True
- Parameters:
deep (bool, default True) – Make a deep copy, including a copy of the data and the indices. With deep=False neither the indices nor the data are copied.
- Returns:
Object type matches caller.
- Return type:
Series or DataFrame
Notes
When deep=True, data is copied but actual Python objects will not be copied recursively, only the reference to the object. This is in contrast to copy.deepcopy in the Standard Library, which recursively copies object data (see examples below).
While Index objects are copied when deep=True, the underlying numpy array is not copied for performance reasons. Since Index is immutable, the underlying data can be safely shared and a copy is not needed.
Since pandas is not thread safe, see the gotchas when copying in a threading environment.
When copy_on_write in pandas config is set to True, the copy_on_write config takes effect even when deep=False. This means that any changes to the copied data would make a new copy of the data upon write (and vice versa). Changes made to either the original or copied variable would not be reflected in the counterpart. See Copy_on_Write for more information.
Examples
>>> s = pd.Series([1, 2], index=["a", "b"])
>>> s
a    1
b    2
dtype: int64
>>> s_copy = s.copy()
>>> s_copy
a    1
b    2
dtype: int64
Shallow copy versus default (deep) copy:
>>> s = pd.Series([1, 2], index=["a", "b"])
>>> deep = s.copy()
>>> shallow = s.copy(deep=False)
Shallow copy shares data and index with original.
>>> s is shallow
False
>>> s.values is shallow.values and s.index is shallow.index
True
Deep copy has own copy of data and index.
>>> s is deep
False
>>> s.values is deep.values or s.index is deep.index
False
Updates to the data shared by shallow copy and original are reflected in both (NOTE: this will no longer be true for pandas >= 3.0); deep copy remains unchanged.
>>> s.iloc[0] = 3
>>> shallow.iloc[1] = 4
>>> s
a    3
b    4
dtype: int64
>>> shallow
a    3
b    4
dtype: int64
>>> deep
a    1
b    2
dtype: int64
Note that when copying an object containing Python objects, a deep copy will copy the data, but will not do so recursively. Updating a nested data object will be reflected in the deep copy.
>>> s = pd.Series([[1, 2], [3, 4]])
>>> deep = s.copy()
>>> s[0][0] = 10
>>> s
0    [10, 2]
1     [3, 4]
dtype: object
>>> deep
0    [10, 2]
1     [3, 4]
dtype: object
When Copy-on-Write is enabled, the shallow copy is not modified when the original data is changed:
>>> with pd.option_context("mode.copy_on_write", True):
...     s = pd.Series([1, 2], index=["a", "b"])
...     copy = s.copy(deep=False)
...     s.iloc[0] = 100
...     s
a    100
b      2
dtype: int64
>>> copy
a    1
b    2
dtype: int64
- count_overlaps(other: RangeFrame, *, match_by: str | list[str] | None = None, slack: int = 0) Series
Count the number of overlaps per interval.
For each interval in self, count how many intervals in other overlap with it. The overlap computation is based on the start and end coordinates, with an optional slack parameter to adjust the overlap threshold by temporarily extending the intervals.
- Parameters:
other (RangeFrame) – The RangeFrame whose intervals are compared against those in self for overlap counting.
match_by (str or list, default None) – Column(s) to group intervals by when determining overlaps. Only intervals with equal values in the specified column(s) will be considered as overlapping.
slack (int, default 0) – Temporarily extend intervals in self by this many units on both ends before checking for overlaps, thereby loosening the overlap threshold.
- Returns:
A pandas Series where each element corresponds to the number of overlapping intervals in other for the corresponding interval in self.
- Return type:
pd.Series
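The counting rule, including the temporary extension by slack, can be sketched as follows. `count_overlaps_sketch` is a hypothetical helper written for illustration; it assumes half-open (start, end) intervals and uses a simple quadratic scan rather than whatever the library does internally.

```python
def count_overlaps_sketch(self_intervals, other_intervals, slack=0):
    """Count, per interval in self, the overlapping intervals in other.

    Assumption: half-open intervals; self is extended by slack on both ends
    only for the comparison, the stored coordinates are left untouched.
    """
    counts = []
    for start, end in self_intervals:
        lo, hi = start - slack, end + slack  # temporary extension
        counts.append(sum(1 for s, e in other_intervals if s < hi and e > lo))
    return counts


self_iv = [(1, 5), (10, 15), (30, 40)]
other_iv = [(4, 6), (5, 11), (14, 20)]
print(count_overlaps_sketch(self_iv, other_iv))           # strict overlaps
print(count_overlaps_sketch(self_iv, other_iv, slack=1))  # bookended also count
```

With slack=1 the first interval (1, 5) additionally picks up the bookended interval (5, 11).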
- drop(*args, **kwargs) RangeFrame | None
Drop specified labels from rows or columns.
Remove rows or columns by specifying label names and corresponding axis, or by directly specifying index or column names. When using a multi-index, labels on different levels can be removed by specifying the level. See the user guide for more information about the now unused levels.
- Parameters:
labels (single label or list-like) – Index or column labels to drop. A tuple will be used as a single label and not treated as a list-like.
axis ({0 or 'index', 1 or 'columns'}, default 0) – Whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’).
index (single label or list-like) – Alternative to specifying axis (labels, axis=0 is equivalent to index=labels).
columns (single label or list-like) – Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).
level (int or level name, optional) – For MultiIndex, level from which the labels will be removed.
inplace (bool, default False) – If False, return a copy. Otherwise, do operation in place and return None.
errors ({'ignore', 'raise'}, default 'raise') – If ‘ignore’, suppress error and only existing labels are dropped.
- Returns:
DataFrame with the specified index or column labels removed, or None if inplace=True.
- Return type:
DataFrame or None
- Raises:
KeyError – If any of the labels is not found in the selected axis.
See also
DataFrame.loc: Label-location based indexer for selection by label.
DataFrame.dropna: Return DataFrame with labels on given axis omitted where (all or any) data are missing.
DataFrame.drop_duplicates: Return DataFrame with duplicate rows removed, optionally only considering certain columns.
Series.drop: Return Series with specified index labels removed.
Examples
>>> df = pd.DataFrame(np.arange(12).reshape(3, 4),
...                   columns=['A', 'B', 'C', 'D'])
>>> df
   A  B   C   D
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11
Drop columns
>>> df.drop(['B', 'C'], axis=1)
   A   D
0  0   3
1  4   7
2  8  11
>>> df.drop(columns=['B', 'C'])
   A   D
0  0   3
1  4   7
2  8  11
Drop a row by index
>>> df.drop([0, 1])
   A  B   C   D
2  8  9  10  11
Drop columns and/or rows of MultiIndex DataFrame
>>> midx = pd.MultiIndex(levels=[['llama', 'cow', 'falcon'],
...                              ['speed', 'weight', 'length']],
...                      codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2],
...                             [0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> df = pd.DataFrame(index=midx, columns=['big', 'small'],
...                   data=[[45, 30], [200, 100], [1.5, 1], [30, 20],
...                         [250, 150], [1.5, 0.8], [320, 250],
...                         [1, 0.8], [0.3, 0.2]])
>>> df
                big     small
llama   speed   45.0    30.0
        weight  200.0   100.0
        length  1.5     1.0
cow     speed   30.0    20.0
        weight  250.0   150.0
        length  1.5     0.8
falcon  speed   320.0   250.0
        weight  1.0     0.8
        length  0.3     0.2
Drop a specific index combination from the MultiIndex DataFrame, i.e., drop the combination 'falcon' and 'weight', which deletes only the corresponding row
>>> df.drop(index=('falcon', 'weight'))
                big     small
llama   speed   45.0    30.0
        weight  200.0   100.0
        length  1.5     1.0
cow     speed   30.0    20.0
        weight  250.0   150.0
        length  1.5     0.8
falcon  speed   320.0   250.0
        length  0.3     0.2
>>> df.drop(index='cow', columns='small')
                big
llama   speed   45.0
        weight  200.0
        length  1.5
falcon  speed   320.0
        weight  1.0
        length  0.3
>>> df.drop(index='length', level=1)
                big     small
llama   speed   45.0    30.0
        weight  200.0   100.0
cow     speed   30.0    20.0
        weight  250.0   150.0
falcon  speed   320.0   250.0
        weight  1.0     0.8
- join_overlaps(other: RangeFrame, *, join_type: Literal['inner', 'left', 'outer', 'right'] = 'inner', multiple: Literal['first', 'all', 'last', 'contained'] = 'all', match_by: str | Iterable[str] | None = None, slack: int = 0, suffix: str = '_b', contained_intervals_only: bool = False, report_overlap_column: str | None = None, preserve_input_order: bool = True) RangeFrame
Join RangeFrame objects based on overlapping intervals.
Find pairs of overlapping intervals between self and other and combine their attributes. Each row in the output contains columns from both intervals, including their start and end positions. By default, only overlapping intervals are included, but the join_type parameter controls how intervals without overlaps are handled.
- Parameters:
other (RangeFrame) – The RangeFrame to join with.
join_type ({"inner", "left", "right", "outer"}, default "inner") – Specifies how to handle intervals that do not overlap. “inner” returns only overlapping intervals, “left” returns all intervals from self (with missing values for non-overlapping intervals from other), “right” returns all intervals from other, and “outer” returns all intervals from both.
multiple ({"all", "first", "last"}, default "all") – Determines which overlapping interval(s) to report when multiple intervals in other overlap the same interval in self. “all” reports all overlaps (which may lead to duplicate rows), “first” reports only the overlapping interval with the smallest start in other, and “last” reports only the overlapping interval with the largest end in other.
match_by (str or list, default None) – If provided, only intervals with matching values in the specified column(s) will be joined.
slack (int, default 0) – Temporarily extend intervals in self by this many units on both ends before checking for overlaps.
suffix (str, default "_b") – Suffix to append to columns from the other RangeFrame in the output.
contained_intervals_only (bool, default False) – If True, only join intervals from self that are entirely contained within an interval from other.
report_overlap_column (str or None, default None) – If provided, add a column with this name reporting the amount of overlap between joined intervals. The overlap is computed as the minimum of the end positions minus the maximum of the start positions.
preserve_input_order (bool, default True) – Whether to preserve the original input order in the result. If False, rows may be returned in algorithm/output order instead, which can be faster for large results.
- Returns:
A new RangeFrame containing the joined intervals with columns from both input RangeFrames. The indices of the input RangeFrames are not preserved in the output.
- Return type:
RangeFrame
Notes
Attributes from the other RangeFrame may have their column names modified by appending the specified suffix.
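To make the inner join and the overlap length concrete, here is a minimal pure-Python sketch. `join_overlaps_sketch` and the "Overlap" key are hypothetical names chosen for illustration, intervals are assumed half-open, and the nested loop stands in for whatever indexing the library actually uses. The overlap length follows the rule given for report_overlap_column: minimum of the ends minus maximum of the starts.

```python
def join_overlaps_sketch(left, right, slack=0, suffix="_b"):
    """Inner-join two lists of (start, end) tuples on overlap (hypothetical)."""
    rows = []
    for start, end in left:
        for start_b, end_b in right:
            # Left interval is extended by slack on both ends for the test only.
            if start_b < end + slack and end_b > start - slack:
                rows.append({
                    "Start": start,
                    "End": end,
                    "Start" + suffix: start_b,
                    "End" + suffix: end_b,
                    # min of the ends minus max of the starts
                    "Overlap": min(end, end_b) - max(start, start_b),
                })
    return rows


left = [(1, 10), (20, 30)]
right = [(5, 12), (8, 9), (40, 50)]
rows = join_overlaps_sketch(left, right)
for row in rows:
    print(row)
```

The interval (20, 30) has no partner in `right`, so an inner join drops it; a "left" join would keep it with missing values in the suffixed columns.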
- max_disjoint_overlaps(*, slack: int = 0, match_by: str | Iterable[str] | None = None, preserve_input_order: bool = True) RangeFrame
Find the maximal disjoint set of intervals.
Returns a subset of the rows in self so that no two intervals overlap, choosing those that maximize the number of intervals in the result.
- Parameters:
slack (int, default 0) – Length by which the criteria of overlap are loosened. A value of 1 implies that bookended intervals are considered overlapping. Higher slack values allow more distant intervals (with a maximum distance of slack-1 between them).
match_by (str or list, default None) – If provided, only intervals with an equal value in column(s) match_by may be considered as overlapping.
preserve_input_order (bool, default True) – Whether to preserve the original input order in the result. If False, rows may be returned in algorithm/output order instead, which can be faster for large results.
- Returns:
RangeFrame with maximal disjoint set of intervals.
- Return type:
RangeFrame
See also
RangeFrame.merge_overlaps: merge intervals into non-overlapping superintervals
RangeFrame.cluster_overlaps: annotate overlapping intervals with common ID
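A maximal set of non-overlapping intervals can be found with the classic earliest-end greedy strategy, sketched below. Whether the library uses this exact algorithm is not stated here, so treat `max_disjoint_sketch` as a hypothetical illustration of the problem; it assumes half-open intervals and returns the chosen intervals ordered by end rather than in input order.

```python
def max_disjoint_sketch(intervals, slack=0):
    """Greedy earliest-end selection of mutually non-overlapping intervals.

    Assumption: two intervals conflict when the gap between them is
    smaller than slack (so slack=1 also treats bookended pairs as conflicts).
    """
    chosen = []
    last_end = None
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if last_end is None or start - last_end >= slack:
            chosen.append((start, end))
            last_end = end
    return chosen


intervals = [(1, 10), (2, 4), (4, 6), (5, 9), (9, 12)]
print(max_disjoint_sketch(intervals))           # bookended pairs may coexist
print(max_disjoint_sketch(intervals, slack=1))  # bookended pairs now conflict
```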
- merge_overlaps(*, count_col: str | None = None, match_by: str | Iterable[str] | None = None, slack: int = 0) RangeFrame
Merge overlapping intervals into one.
Merge overlapping intervals into a single superinterval by uniting intervals that overlap, optionally allowing a small gap (specified by slack) between intervals to be merged. The resulting RangeFrame will contain the merged intervals, and if count_col is provided, a column with the counts of merged intervals will be included.
- Parameters:
count_col (str or None, default None) – Name of the column to store the count of intervals merged into each superinterval. If None, no count column is added.
match_by (str or list, default None) – Column(s) to group intervals by before merging. Only intervals with equal values in the specified column(s) will be considered as overlapping.
slack (int, default 0) – Allow this many units between intervals to still consider them overlapping.
- Returns:
A RangeFrame with merged (super) intervals. Metadata columns, index, and order are not necessarily preserved.
- Return type:
RangeFrame
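The merge itself is a single sweep over the sorted intervals, as in the sketch below. `merge_overlaps_sketch` is a hypothetical helper; it assumes half-open intervals and assumes the same boundary rule as described for cluster_overlaps (a gap smaller than slack is bridged). The example data avoids gaps exactly equal to slack, where that assumption would matter.

```python
def merge_overlaps_sketch(intervals, slack=0):
    """Merge overlapping (start, end) tuples; return (start, end, count)."""
    merged = []  # each entry is [start, end, number_of_merged_intervals]
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] < slack:
            merged[-1][1] = max(merged[-1][1], end)  # grow the superinterval
            merged[-1][2] += 1
        else:
            merged.append([start, end, 1])           # start a new superinterval
    return [tuple(m) for m in merged]


intervals = [(1, 5), (3, 8), (11, 15), (25, 30)]
print(merge_overlaps_sketch(intervals))           # only (1, 5) and (3, 8) merge
print(merge_overlaps_sketch(intervals, slack=5))  # the gap of 3 is bridged too
```

The third element of each tuple plays the role of the optional count column.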
- nearest_ranges(other: RangeFrame, *, match_by: str | Iterable[str] | None = None, suffix: str = '_b', exclude_overlaps: bool = False, k: int = 1, dist_col: str | None = 'Distance', direction: Literal['any', 'forward', 'backward'] = 'any', preserve_input_order: bool = True) RangeFrame
Find closest interval.
For each interval in self RangeFrame, the columns of the nearest interval in other RangeFrame are appended.
- Parameters:
other (RangeFrame) – RangeFrame to find nearest interval in.
exclude_overlaps (bool, default False) – If True, intervals in other that overlap an interval in self are not reported as nearest.
direction ({"any", "forward", "backward"}, default "any", i.e. both directions) – Whether to only look for nearest in one direction.
match_by (str or list, default None) – If provided, only intervals with an equal value in column(s) match_by may be matched.
k (int, default 1) – Number of nearest intervals to fetch.
suffix (str, default "_b") – Suffix to give columns with shared name in other.
dist_col (str or None, default "Distance") – Optional column to store the distance in.
preserve_input_order (bool, default True) – Whether to preserve the original input order in the result. If False, rows may be returned in algorithm/output order instead, which can be faster for large results.
- Returns:
A RangeFrame with columns representing nearest interval horizontally appended.
- Return type:
RangeFrame
See also
RangeFrame.join_overlaps: Has a slack argument to find intervals within a distance.
- overlap(other: RangeFrame, multiple: Literal['first', 'all', 'last', 'contained'] = 'all', slack: int = 0, *, contained_intervals_only: bool = False, match_by: str | Iterable[str] | None = None, preserve_input_order: bool = True) RangeFrame
Return overlapping intervals.
Returns the intervals in self which overlap with those in other.
- Parameters:
other (RangeFrame) – RangeFrame to find overlaps with.
multiple ({"all", "first", "last"}, default "all") – Which intervals to report when multiple intervals in other overlap the same interval in self. The default “all” reports all overlapping subintervals, which will have duplicate indices. “first” reports, for each interval in self, only the overlapping subinterval with the smallest Start in other. “last” reports only the overlapping subinterval with the largest End in other.
slack (int, default 0) – Intervals in self are temporarily extended by slack on both ends before overlap is calculated, so that non-overlapping intervals count as overlapping if they are less than slack apart. For example, slack=1 reports bookended intervals.
contained_intervals_only (bool, default False) – Whether to report only intervals that are entirely contained in an interval of ‘other’.
match_by (str or list, default None) – If provided, only overlapping intervals with an equal value in column(s) match_by are reported.
preserve_input_order (bool, default True) – Whether to preserve the original input order in the result. If False, rows may be returned in algorithm/output order instead, which can be faster for large results.
- Returns:
A RangeFrame with overlapping intervals.
- Return type:
RangeFrame
See also
RangeFrame.intersect: report overlapping subintervals
RangeFrame.set_intersect: set-intersect RangeFrame (e.g. merge then intersect)
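The difference between plain overlap and contained_intervals_only can be sketched as a filter over self. `overlap_sketch` is a hypothetical helper that assumes half-open intervals and reports each interval of self at most once; how slack interacts with containment is not specified above, so the sketch ignores slack in that mode.

```python
def overlap_sketch(self_intervals, other_intervals, slack=0,
                   contained_intervals_only=False):
    """Keep intervals of self that overlap (or lie within) one in other."""
    kept = []
    for start, end in self_intervals:
        for s, e in other_intervals:
            if contained_intervals_only:
                hit = s <= start and end <= e  # fully inside an other interval
            else:
                hit = s < end + slack and e > start - slack
            if hit:
                kept.append((start, end))
                break  # report each self interval once
    return kept


self_iv = [(1, 5), (4, 9), (20, 30)]
other_iv = [(0, 6), (9, 12)]
print(overlap_sketch(self_iv, other_iv))
print(overlap_sketch(self_iv, other_iv, contained_intervals_only=True))
```

Here (4, 9) overlaps (0, 6) but is not contained in it, so only the first call keeps it.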
- reindex(*args, **kwargs) RangeFrame
Conform DataFrame to new index with optional filling logic.
Places NA/NaN in locations having no value in the previous index. A new object is produced unless the new index is equivalent to the current one and copy=False.
- Parameters:
labels (array-like, optional) – New labels / index to conform the axis specified by ‘axis’ to.
index (array-like, optional) – New labels for the index. Preferably an Index object to avoid duplicating data.
columns (array-like, optional) – New labels for the columns. Preferably an Index object to avoid duplicating data.
axis (int or str, optional) – Axis to target. Can be either the axis name (‘index’, ‘columns’) or number (0, 1).
method ({None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}) – Method to use for filling holes in reindexed DataFrame. Please note: this is only applicable to DataFrames/Series with a monotonically increasing/decreasing index.
None (default): don’t fill gaps
pad / ffill: Propagate last valid observation forward to next valid.
backfill / bfill: Use next valid observation to fill gap.
nearest: Use nearest valid observations to fill gap.
copy (bool, default True) – Return a new object, even if the passed indexes are the same.
Note
The copy keyword will change behavior in pandas 3.0. Copy-on-Write will be enabled by default, which means that all methods with a copy keyword will use a lazy copy mechanism to defer the copy and ignore the copy keyword. The copy keyword will be removed in a future version of pandas.
You can already get the future behavior and improvements through enabling copy on write:
pd.options.mode.copy_on_write = True
level (int or name) – Broadcast across a level, matching Index values on the passed MultiIndex level.
fill_value (scalar, default np.nan) – Value to use for missing values. Defaults to NaN, but can be any “compatible” value.
limit (int, default None) – Maximum number of consecutive elements to forward or backward fill.
tolerance (optional) – Maximum distance between original and new labels for inexact matches. The values of the index at the matching locations must satisfy the equation abs(index[indexer] - target) <= tolerance.
Tolerance may be a scalar value, which applies the same tolerance to all values, or list-like, which applies variable tolerance per element. List-like includes list, tuple, array, Series, and must be the same size as the index and its dtype must exactly match the index’s type.
- Return type:
DataFrame with changed index.
See also
DataFrame.set_index: Set row labels.
DataFrame.reset_index: Remove row labels or move them to new columns.
DataFrame.reindex_like: Change to same indices as other DataFrame.
Examples
DataFrame.reindex supports two calling conventions:
(index=index_labels, columns=column_labels, ...)
(labels, axis={'index', 'columns'}, ...)
We highly recommend using keyword arguments to clarify your intent.
Create a dataframe with some fictional data.
>>> index = ['Firefox', 'Chrome', 'Safari', 'IE10', 'Konqueror']
>>> df = pd.DataFrame({'http_status': [200, 200, 404, 404, 301],
...                    'response_time': [0.04, 0.02, 0.07, 0.08, 1.0]},
...                   index=index)
>>> df
           http_status  response_time
Firefox            200           0.04
Chrome             200           0.02
Safari             404           0.07
IE10               404           0.08
Konqueror          301           1.00
Create a new index and reindex the dataframe. By default values in the new index that do not have corresponding records in the dataframe are assigned NaN.
>>> new_index = ['Safari', 'Iceweasel', 'Comodo Dragon', 'IE10',
...              'Chrome']
>>> df.reindex(new_index)
               http_status  response_time
Safari               404.0           0.07
Iceweasel              NaN            NaN
Comodo Dragon          NaN            NaN
IE10                 404.0           0.08
Chrome               200.0           0.02
We can fill in the missing values by passing a value to the keyword fill_value. Because the index is not monotonically increasing or decreasing, we cannot use arguments to the keyword method to fill the NaN values.
>>> df.reindex(new_index, fill_value=0)
               http_status  response_time
Safari                 404           0.07
Iceweasel                0           0.00
Comodo Dragon            0           0.00
IE10                   404           0.08
Chrome                 200           0.02
>>> df.reindex(new_index, fill_value='missing')
              http_status response_time
Safari                404          0.07
Iceweasel         missing       missing
Comodo Dragon     missing       missing
IE10                  404          0.08
Chrome                200          0.02
We can also reindex the columns.
>>> df.reindex(columns=['http_status', 'user_agent'])
           http_status  user_agent
Firefox            200         NaN
Chrome             200         NaN
Safari             404         NaN
IE10               404         NaN
Konqueror          301         NaN
Or we can use “axis-style” keyword arguments
>>> df.reindex(['http_status', 'user_agent'], axis="columns")
           http_status  user_agent
Firefox            200         NaN
Chrome             200         NaN
Safari             404         NaN
IE10               404         NaN
Konqueror          301         NaN
To further illustrate the filling functionality in reindex, we will create a dataframe with a monotonically increasing index (for example, a sequence of dates).
>>> date_index = pd.date_range('1/1/2010', periods=6, freq='D')
>>> df2 = pd.DataFrame({"prices": [100, 101, np.nan, 100, 89, 88]},
...                    index=date_index)
>>> df2
            prices
2010-01-01   100.0
2010-01-02   101.0
2010-01-03     NaN
2010-01-04   100.0
2010-01-05    89.0
2010-01-06    88.0
Suppose we decide to expand the dataframe to cover a wider date range.
>>> date_index2 = pd.date_range('12/29/2009', periods=10, freq='D')
>>> df2.reindex(date_index2)
            prices
2009-12-29     NaN
2009-12-30     NaN
2009-12-31     NaN
2010-01-01   100.0
2010-01-02   101.0
2010-01-03     NaN
2010-01-04   100.0
2010-01-05    89.0
2010-01-06    88.0
2010-01-07     NaN
The index entries that did not have a value in the original data frame (for example, ‘2009-12-29’) are by default filled with NaN. If desired, we can fill in the missing values using one of several options.
For example, to back-propagate the last valid value to fill the NaN values, pass bfill as an argument to the method keyword.
>>> df2.reindex(date_index2, method='bfill')
            prices
2009-12-29   100.0
2009-12-30   100.0
2009-12-31   100.0
2010-01-01   100.0
2010-01-02   101.0
2010-01-03     NaN
2010-01-04   100.0
2010-01-05    89.0
2010-01-06    88.0
2010-01-07     NaN
Please note that the NaN value present in the original dataframe (at index value 2010-01-03) will not be filled by any of the value propagation schemes. This is because filling while reindexing does not look at dataframe values, but only compares the original and desired indexes. If you do want to fill in the NaN values present in the original dataframe, use the fillna() method.
See the user guide for more.
- sort_by_position() RangeFrame
Sort by Start and End columns.
- sort_ranges(by: str | Iterable[str] | None = None, *, natsort: bool = True, sort_rows_reverse_order: Sequence[bool] | None = None) RangeFrame
Sort RangeFrame according to Start, End, and any other columns given.
For uses not covered by this function, use DataFrame.sort_values().
- Parameters:
by (str or list of str, default None) – Additional column(s) to sort by. To control the sort priority, list the columns in the desired order as part of the ‘by’ argument.
natsort (bool, default True) – Whether to use natural sorting for the columns in by.
sort_rows_reverse_order (sequence of bools or None, default None) – Whether to sort these rows in the reverse order for the starts and ends.
- Returns:
Sorted RangeFrame. The index is preserved. Use .reset_index(drop=True) to reset the index.
- Return type:
RangeFrame
- subtract_overlaps(other: RangeFrame, match_by: str | Iterable[str] | None = None, *, preserve_input_order: bool = True) RangeFrame
Subtract intervals, i.e. return non-overlapping subintervals.
Identify intervals in other that overlap with intervals in self; return self with the overlapping parts removed.
- Parameters:
other (RangeFrame) – RangeFrame to subtract.
match_by (str or list, default None) – If provided, only intervals with an equal value in column(s) match_by may be considered as overlapping.
preserve_input_order (bool, default True) – Whether to preserve the original input order in the result. If False, rows may be returned in algorithm/output order instead, which can be faster for large results.
- Returns:
RangeFrame with subintervals from self that do not overlap with any interval in other. Columns and index are preserved.
- Return type:
RangeFrame
Warning
The returned RangeFrame may have duplicate index values. Call .reset_index(drop=True) to reset the index.
See also
RangeFrame.overlap: use with invert=True to return all intervals without overlap
RangeFrame.complement_ranges: return the internal complement of intervals, i.e. its introns.
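Subtraction can split one interval of self into several pieces, which is why duplicate index values may appear in the result. The sketch below shows the idea on plain tuples; `subtract_overlaps_sketch` is a hypothetical helper that assumes half-open intervals and is not the library's implementation.

```python
def subtract_overlaps_sketch(self_intervals, other_intervals):
    """Return the parts of each self interval not covered by other."""
    result = []
    others = sorted(other_intervals)
    for start, end in self_intervals:
        cursor = start  # left edge of the part not yet examined
        for s, e in others:
            if e <= cursor or s >= end:
                continue  # no overlap with the remaining part
            if s > cursor:
                result.append((cursor, s))  # uncovered piece before other
            cursor = max(cursor, e)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))  # uncovered tail
    return result


self_iv = [(1, 10), (20, 30)]
other_iv = [(3, 5), (8, 12), (40, 50)]
print(subtract_overlaps_sketch(self_iv, other_iv))
```

The interval (1, 10) yields two pieces, (1, 3) and (5, 8); in a DataFrame both rows would carry the index label of the original row.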