rcx_tk.sequence
Functions
- process_sequence_file: Processes a metadata file, keeping and renaming specific columns.
- process_sequence: Processes the metadata dataframe.
- cleanup: Removes the file Name column and moves the sampleName column.
- validate_injection_order: Validates if injectionOrder is of integer type.
- derive_additional_metadata: Derives additional metadata columns.
- rearrange_columns: Rearranges the columns.
- validate_filenames_column: Validates the file names.
- add_local_order: Returns the localOrder value, i.e. the last n-digits after the last underscore.
- add_sequence_identifier: Returns the sequenceIdentifier value, i.e. everything before last _[digits].
- Splits a filename into the non-numeric prefix and trailing numeric suffix.
- Returns the subjectIdentifier value, i.e. everything between [digit_] and [_digit].
Module Contents
- rcx_tk.sequence.process_sequence_file(file_path: str, out_path: str) -> None
Processes a metadata file, keeping and renaming specific columns.
- rcx_tk.sequence.process_sequence(df: pandas.DataFrame) -> pandas.DataFrame
Processes the metadata dataframe.
- Parameters:
df (pd.DataFrame) – The metadata dataframe.
- Returns:
A metadata dataframe with rearranged and newly derived columns.
- Return type:
pd.DataFrame
- rcx_tk.sequence.cleanup(df: pandas.DataFrame) -> pandas.DataFrame
Removes the file Name column and moves the sampleName column.
- Parameters:
df (pd.DataFrame) – The metadata dataframe.
- Returns:
The processed dataframe.
- Return type:
pd.DataFrame
- rcx_tk.sequence.validate_injection_order(df: pandas.DataFrame) -> bool
Validates if injectionOrder is of integer type.
- Parameters:
df (pd.DataFrame) – The metadata dataframe.
- Returns:
Whether injectionOrder is of integer type.
- Return type:
bool
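As an illustration of what such a check can look like, here is a minimal sketch. It assumes that injectionOrder is a column of the dataframe and that "integer type" refers to the column's pandas dtype; the helper name `injection_order_is_integer` is hypothetical and is not part of rcx_tk.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype


def injection_order_is_integer(df: pd.DataFrame) -> bool:
    """Hypothetical stand-in, not the rcx_tk implementation.

    Assumes injectionOrder is a dataframe column and checks its dtype.
    """
    return bool(is_integer_dtype(df["injectionOrder"]))


# Integer values give an integer dtype; floats and strings do not.
ok = pd.DataFrame({"injectionOrder": [1, 2, 3]})
bad = pd.DataFrame({"injectionOrder": [1.5, 2.0, 3.0]})
print(injection_order_is_integer(ok), injection_order_is_integer(bad))
```

A dtype check like this is cheap, but it also rejects whole numbers stored as floats (for example after a column with missing values has been read from a spreadsheet), so the actual function may treat such cases differently.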
- rcx_tk.sequence.derive_additional_metadata(df: pandas.DataFrame) -> pandas.DataFrame
Derives additional metadata columns.
- Parameters:
df (pd.DataFrame) – The metadata dataframe.
- Returns:
The processed dataframe.
- Return type:
pd.DataFrame
- rcx_tk.sequence.rearrange_columns(df: pandas.DataFrame) -> pandas.DataFrame
Rearranges the columns.
- Parameters:
df (pd.DataFrame) – The metadata dataframe.
- Returns:
The processed dataframe.
- Return type:
pd.DataFrame
- rcx_tk.sequence.validate_filenames_column(df: pandas.DataFrame) -> None
Validates the file names.
- Parameters:
df (pd.DataFrame) – A dataframe to process.
- Raises:
ValueError – If any file name is invalid.
- rcx_tk.sequence.add_local_order(file_name: str) -> int
Returns the localOrder value, i.e. the last n-digits after the last underscore.
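The rule described here (the digits after the last underscore) can be sketched with a regular expression. This is only an illustration: the helper name `local_order_sketch`, the sample file name, and the choice to raise `ValueError` when no numeric suffix is present are assumptions, and the sketch expects a file name without an extension.

```python
import re


def local_order_sketch(file_name: str) -> int:
    """Hypothetical illustration, not the rcx_tk implementation."""
    # Match an underscore followed by digits at the very end of the name.
    match = re.search(r"_(\d+)$", file_name)
    if match is None:
        # Assumption: a name without a numeric suffix is treated as an error.
        raise ValueError(f"No trailing _[digits] in file name: {file_name!r}")
    return int(match.group(1))


# Hypothetical file name, used only to show the rule.
print(local_order_sketch("1_QC_Batch-A_15"))  # prints 15
```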
- rcx_tk.sequence.add_sequence_identifier(file_name: str) -> str
Returns the sequenceIdentifier value, i.e. everything before last _[digits].
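A comparable sketch for the sequenceIdentifier rule keeps everything before the final underscore-plus-digits suffix. As above, the helper name `sequence_identifier_sketch`, the sample file name, and the `ValueError` on a missing suffix are assumptions made for illustration, not the library's confirmed behavior.

```python
import re


def sequence_identifier_sketch(file_name: str) -> str:
    """Hypothetical illustration, not the rcx_tk implementation."""
    # The greedy group captures as much as possible, so only the last
    # _[digits] block is left outside the captured prefix.
    match = re.fullmatch(r"(.*)_\d+", file_name)
    if match is None:
        # Assumption: a name without a numeric suffix is treated as an error.
        raise ValueError(f"No trailing _[digits] in file name: {file_name!r}")
    return match.group(1)


# Hypothetical file name, used only to show the rule.
print(sequence_identifier_sketch("1_QC_Batch-A_15"))  # prints 1_QC_Batch-A
```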