Fingerprints and Descriptors
- class Smi2Fp(radius=3, fpSize=2048)[source]
Calculate Morgan fingerprints from SMILES strings
- get_np(smiles)[source]
Convert a SMILES string to a numpy array with Morgan fingerprint bits.
- Parameters:
smiles – SMILES string
- Returns:
numpy array with Morgan fingerprint bits
- get_np_counts(smiles)[source]
Convert a SMILES string to a numpy array with Morgan fingerprint counts.
- Parameters:
smiles – SMILES string
- Returns:
numpy array with Morgan fingerprint counts
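As a rough illustration of the difference between bit and count fingerprints, the sketch below calls RDKit's rdFingerprintGenerator module directly. It is not the Smi2Fp implementation; the helper names morgan_bits and morgan_counts are hypothetical, the radius and size are copied from the Smi2Fp signature above, and returning None for unparsable SMILES is an assumption.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

# Defaults copied from the Smi2Fp signature; the helpers below are hypothetical.
RADIUS = 3
FP_SIZE = 2048
generator = rdFingerprintGenerator.GetMorganGenerator(radius=RADIUS, fpSize=FP_SIZE)


def morgan_bits(smiles):
    """Return a 0/1 vector of Morgan bits, or None if the SMILES cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return generator.GetFingerprintAsNumPy(mol)


def morgan_counts(smiles):
    """Return per-position Morgan feature counts, or None if the SMILES cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return generator.GetCountFingerprintAsNumPy(mol)


bits = morgan_bits("CCO")
counts = morgan_counts("CCO")
```

Both arrays have length fpSize. The bit vector only records whether a position is set, while the count vector records how many features hashed to it.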
- mol2morgan_fp(mol, radius=2, nBits=2048)[source]
Convert an RDKit molecule to a Morgan fingerprint. To avoid the RDKit deprecation warning, wrap the call in rdBase.BlockLogs():
from rdkit import rdBase
with rdBase.BlockLogs():
    uru.smi2numpy_fp("CCC")
- smi2morgan_fp(smi, radius=2, nBits=2048)[source]
Convert a SMILES to a Morgan fingerprint. To avoid the RDKit deprecation warning, wrap the call in rdBase.BlockLogs():
from rdkit import rdBase
with rdBase.BlockLogs():
    uru.smi2numpy_fp("CCC")
- mol2numpy_fp(mol, radius=2, n_bits=2048)[source]
Convert an RDKit molecule to a numpy array with Morgan fingerprint bits. Borrowed from https://iwatobipen.wordpress.com/2019/02/08/convert-fingerprint-to-numpy-array-and-conver-numpy-array-to-fingerprint-rdkit-memorandum/
- smi2numpy_fp(smi, radius=2, nBits=2048)[source]
Convert a SMILES to a numpy array with Morgan fingerprint bits
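A minimal sketch of the bit-vector-to-numpy conversion, with log blocking applied around the legacy fingerprint call. It assumes the deprecation warning comes from RDKit's older AllChem.GetMorganFingerprintAsBitVect function, which may not match what the library calls internally; the helper name mol_to_numpy_bits is hypothetical.

```python
import numpy as np
from rdkit import Chem, DataStructs, rdBase
from rdkit.Chem import AllChem


def mol_to_numpy_bits(mol, radius=2, n_bits=2048):
    """Hypothetical helper: Morgan bit vector copied into a numpy array."""
    # BlockLogs silences RDKit log output (including deprecation warnings)
    # for as long as the context manager is active.
    with rdBase.BlockLogs():
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    # ConvertToNumpyArray resizes the target array to the fingerprint length.
    arr = np.zeros((0,), dtype=np.uint8)
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr


arr = mol_to_numpy_bits(Chem.MolFromSmiles("CCC"))
```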
- class RDKitDescriptors(desc_names=None, hide_progress=False, skip_fragments=False)[source]
Calculate RDKit descriptors for molecules or SMILES.
Provide methods to compute descriptor vectors for a single molecule or SMILES, and to produce pandas DataFrames for lists of molecules or SMILES.
Attributes
- desc_names
Sorted list of descriptor names that will be calculated.
- hide_progress
Whether to hide progress bars when processing lists.
Initialize descriptor calculator.
- Parameters:
desc_names – Optional list of descriptor names to use. If not provided, the full RDKit descriptor list is used.
hide_progress – If true, progress bars are disabled when processing lists.
skip_fragments – If true, descriptors whose names contain “fr_” are excluded.
- Returns:
None
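The name selection described by these parameters could be sketched as follows. The function select_descriptor_names is hypothetical and only mirrors the documented behaviour; whether skip_fragments is also applied to a user-supplied desc_names list is an assumption.

```python
from rdkit.Chem import Descriptors


def select_descriptor_names(desc_names=None, skip_fragments=False):
    """Hypothetical helper mirroring the documented constructor options."""
    # Fall back to every descriptor RDKit registers when no names are given.
    if desc_names is None:
        desc_names = [name for name, _ in Descriptors.descList]
    # Fragment-count descriptors carry "fr_" in their names.
    if skip_fragments:
        desc_names = [name for name in desc_names if "fr_" not in name]
    return sorted(desc_names)


all_names = select_descriptor_names()
no_fragments = select_descriptor_names(skip_fragments=True)
```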
- update_descriptors(index_list)[source]
Update the descriptor names to only include those at the specified indices.
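Index-based selection of this kind amounts to a list lookup. The sketch below uses a hypothetical keep_by_index function and an illustrative name list, not the class's actual state.

```python
def keep_by_index(desc_names, index_list):
    """Hypothetical helper: keep only the names at the given positions."""
    return [desc_names[i] for i in index_list]


# Illustrative names only; a real calculator holds many more.
names = ["BalabanJ", "MolLogP", "MolWt", "TPSA"]
kept = keep_by_index(names, [1, 2])
```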
- calc_mol(mol)[source]
Calculate descriptors for an RDKit molecule.
- Parameters:
mol (Mol) – RDKit molecule
- Returns:
A numpy array with descriptor values
- Return type:
ndarray
- calc_smiles(smiles)[source]
Calculate descriptors for a SMILES string.
- Parameters:
smiles (str) – SMILES string
- Returns:
A numpy array with descriptor values
- Return type:
ndarray
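For orientation, a descriptor vector can be assembled from RDKit's Descriptors.descList, which pairs each descriptor name with its function. This is a sketch under that assumption, not the class's code; calc_descriptors and the two-name list are hypothetical.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import Descriptors

# Map each descriptor name to the RDKit function that computes it.
DESC_FUNCTIONS = dict(Descriptors.descList)


def calc_descriptors(mol, desc_names):
    """Hypothetical helper: evaluate the named descriptors for one molecule."""
    return np.array([DESC_FUNCTIONS[name](mol) for name in desc_names], dtype=float)


values = calc_descriptors(Chem.MolFromSmiles("CCO"), ["MolWt", "TPSA"])
```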
- clean_descriptors(desc_in)[source]
Remove descriptor columns that contain any NaN or infinite values.
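The column filter described here can be expressed with a finite-value mask. The sketch assumes the descriptors arrive as a 2D array with one row per molecule; the real method may accept a different input type, and drop_non_finite_columns is a hypothetical name.

```python
import numpy as np


def drop_non_finite_columns(desc):
    """Hypothetical helper: drop every column holding a NaN or infinite value."""
    desc = np.asarray(desc, dtype=float)
    # A column is kept only if all of its entries are finite.
    keep = np.isfinite(desc).all(axis=0)
    return desc[:, keep], np.flatnonzero(keep)


raw = [
    [1.0, float("nan"), 3.0],
    [4.0, 5.0, float("inf")],
    [7.0, 8.0, 9.0],
]
cleaned, kept_columns = drop_non_finite_columns(raw)
```

Returning the retained column indices alongside the cleaned matrix makes it possible to keep a separate list of descriptor names in step with the data.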
- class RDKitProperties[source]
Calculate RDKit properties
- calc_mol(mol)[source]
Calculate properties for an RDKit molecule
- Parameters:
mol (Mol) – RDKit molecule
- Returns:
a numpy array with properties
- Return type:
ndarray
- pandas_smiles(smi_list)[source]
Calculates properties for a list of SMILES strings and returns them as a pandas DataFrame.
- Parameters:
smi_list (List[str]) – List of SMILES strings
- Returns:
DataFrame with calculated properties. Each row corresponds to a SMILES string and each column to a property.
- Return type:
pd.DataFrame
- pandas_mols(mol_list)[source]
Calculates properties for a list of RDKit molecules and returns them as a pandas DataFrame.
- Parameters:
mol_list (List[Mol]) – List of RDKit molecules
- Returns:
DataFrame with calculated properties. Each row corresponds to a molecule and each column to a property.
- Return type:
pd.DataFrame
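A table with one row per SMILES and one column per property can be built with RDKit's rdMolDescriptors.Properties calculator, as sketched below. Whether RDKitProperties uses this calculator is an assumption, as is filling rows with NaN when a SMILES fails to parse; properties_frame is a hypothetical name.

```python
import pandas as pd
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors

# RDKit's built-in property calculator and the names it reports.
property_calc = rdMolDescriptors.Properties()
property_names = list(property_calc.GetPropertyNames())


def properties_frame(smi_list):
    """Hypothetical helper: one row per SMILES, one column per property."""
    rows = []
    for smi in smi_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            # Assumption: unparsable SMILES become a row of NaN values.
            rows.append([float("nan")] * len(property_names))
        else:
            rows.append(list(property_calc.ComputeProperties(mol)))
    return pd.DataFrame(rows, columns=property_names)


df = properties_frame(["CCO", "c1ccccc1"])
```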
- class Ro5Calculator[source]
A class used to calculate Lipinski’s Rule of Five properties for a given molecule.
Attributes
- names: List[str]
A list of names of the properties to be calculated.
- functions: List[Callable[[Mol], float]]
A list of functions used to calculate the properties.
Methods
- calc_mol(mol: Mol) -> np.ndarray
Calculates properties for an RDKit molecule.
- calc_smiles(smi: str) -> Optional[np.ndarray]
Calculates properties for a SMILES string.
- pandas_smiles(smiles_list: List[str]) -> pd.DataFrame
Calculates properties for a list of SMILES strings and returns them as a pandas DataFrame.
- pandas_mols(mol_list: List[Mol]) -> pd.DataFrame
Calculates properties for a list of RDKit molecules and returns them as a pandas DataFrame.
Initialize the Ro5Calculator class. Takes no arguments.
- Returns:
None
- calc_mol(mol)[source]
Calculate properties for an RDKit molecule
- Parameters:
mol (Mol) – RDKit molecule
- Returns:
a numpy array with properties
- Return type:
np.ndarray
- calc_smiles(smi)[source]
Calculate properties for a SMILES string
- Parameters:
smi (str) – SMILES string
- Returns:
a numpy array with properties, or None if the SMILES cannot be parsed
- Return type:
Optional[np.ndarray]
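The paired names and functions lists suggest a simple pattern: apply each function to the molecule and collect the results. The sketch below follows that pattern with four commonly used Rule of Five properties. The property set, the thresholds in ro5_violations, and the None return for unparsable SMILES are all assumptions rather than the class's confirmed behaviour.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, Lipinski

# Hypothetical property set; the class's own names and functions may differ.
names = ["MolWt", "LogP", "HBD", "HBA"]
functions = [Descriptors.MolWt, Crippen.MolLogP, Lipinski.NumHDonors, Lipinski.NumHAcceptors]


def calc_mol(mol):
    """Apply each property function to the molecule, in order."""
    return np.array([func(mol) for func in functions], dtype=float)


def calc_smiles(smi):
    """Return the property array, or None when the SMILES cannot be parsed."""
    mol = Chem.MolFromSmiles(smi)
    return calc_mol(mol) if mol is not None else None


def ro5_violations(props):
    """Count violations of the usual Lipinski limits (500, 5, 5, 10)."""
    mol_wt, logp, hbd, hba = props
    return int(mol_wt > 500) + int(logp > 5) + int(hbd > 5) + int(hba > 10)


aspirin = calc_smiles("CC(=O)Oc1ccccc1C(=O)O")
```

Callers should check for None before using the result, since the return type is optional.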
- pandas_smiles(smiles_list)[source]
Calculates properties for a list of SMILES strings and returns them as a pandas DataFrame.
- Parameters:
smiles_list (List[str]) – List of SMILES strings
- Returns:
DataFrame with calculated properties. Each row corresponds to a SMILES string and each column to a property.
- Return type:
pd.DataFrame
- pandas_mols(mol_list)[source]
Calculates properties for a list of RDKit molecules and returns them as a pandas DataFrame.
- Parameters:
mol_list (List[Mol]) – List of RDKit molecules
- Returns:
DataFrame with calculated properties. Each row corresponds to a molecule and each column to a property.
- Return type:
pd.DataFrame