Module alvadesc
Wrapper for the alvaDesc command line application (alvaDescCLI).
Requirements
The alvaDescCLIWrapper package requires:
- Python 3.5 or higher
- A licensed copy of alvaDesc installed on the same computer
- Minimum alvaDesc version: 1.0.14
A few examples of use
1: Calculate two descriptors for three molecules on Windows:
from alvadesccliwrapper.alvadesc import AlvaDesc
aDesc = AlvaDesc(r'C:\Program Files\Alvascience\alvaDesc\alvaDescCLI.exe') # Windows default alvaDescCLI.exe location (raw string, so '\a' is not read as an escape sequence)
aDesc.set_input_SMILES(['C#N', 'CCCC', 'CC(=O)OC1=CC=CC=C1C(=O)O'])
if not aDesc.calculate_descriptors(['MW', 'AMW']):
    print('Error: ' + aDesc.get_error())
else:
    print(aDesc.get_output_descriptors())
    print(aDesc.get_output())
The output is the list of descriptor names, followed by a list of lists of floats (one inner list per molecule) containing the requested descriptors:
['MW', 'AMW']
[[27.03, 9.01], [58.14, 4.1529], [180.17, 8.5795]]
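Each row of the output corresponds to one molecule and each column to one descriptor name, so the two lists can be combined into one record per molecule. The following is a minimal sketch in plain Python. It uses the values shown above as literals so that it runs without alvaDesc, and it assumes the rows are in the same order as the input SMILES. The names `smiles`, `names`, `matrix`, and `records` are illustrative only.

```python
# Values copied from the example above; in real use they would come from
# aDesc.get_output_descriptors() and aDesc.get_output().
smiles = ['C#N', 'CCCC', 'CC(=O)OC1=CC=CC=C1C(=O)O']
names = ['MW', 'AMW']
matrix = [[27.03, 9.01], [58.14, 4.1529], [180.17, 8.5795]]

# One dict per molecule ({descriptor name: value}), keyed by the input SMILES.
# Assumption: rows are returned in the same order as the input molecules.
records = {smi: dict(zip(names, row)) for smi, row in zip(smiles, matrix)}

print(records['CCCC']['MW'])  # 58.14
```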
2: Calculate all descriptors for an input file on Linux:
from alvadesccliwrapper.alvadesc import AlvaDesc
aDesc = AlvaDesc('/usr/bin/alvaDescCLI') # Linux default alvaDescCLI location
aDesc.set_input_file('./myfile.sdf', 'MDL')
if not aDesc.calculate_descriptors('ALL'): # with alvaDesc v2.0.0 you can also use ALL2D keyword
    print('Error: ' + aDesc.get_error())
else:
    print(aDesc.get_output())
3: Calculate the MACCS 166 fingerprint for the molecules contained in an MDL file on Linux:
from alvadesccliwrapper.alvadesc import AlvaDesc
aDesc = AlvaDesc('/usr/bin/alvaDescCLI') # Linux default alvaDescCLI location
aDesc.set_input_file('./myfile.sdf', 'MDL')
if not aDesc.calculate_fingerprint('MACCSFP'):
    print('Error: ' + aDesc.get_error())
else:
    print(aDesc.get_output())
The result is a list of strings containing the requested fingerprint:
['0000000000000000000000000000000000010000000000000000000000000000100000000000000110101111011...']
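Each fingerprint string is a sequence of '0' and '1' characters, so it can be decoded with ordinary string operations. The sketch below lists the positions of the set bits. The helper `on_bits` and the 16-character string are hypothetical and serve only as an illustration; the string is not real alvaDesc output.

```python
def on_bits(fingerprint):
    """Return the zero-based positions of the '1' characters in a bit string."""
    return [i for i, bit in enumerate(fingerprint) if bit == '1']

# Hypothetical 16-bit string for illustration; real fingerprints are longer.
example_fp = '0000100000010011'

print(on_bits(example_fp))    # [4, 11, 14, 15]
print(example_fp.count('1'))  # 4
```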
4: Calculate the ECFP fingerprint with size 1024 saving the result to a text file on macOS:
from alvadesccliwrapper.alvadesc import AlvaDesc
aDesc = AlvaDesc('/Applications/alvaDesc.app/Contents/MacOS/alvaDescCLI') # macOS default alvaDescCLI location
aDesc.set_input_file('./myfile.sdf', 'MDL')
aDesc.set_output_file('./test.txt')
if not aDesc.calculate_fingerprint('ECFP', 1024):
    print('Error: ' + aDesc.get_error())
# the result is in the output file
#else:
#    print(aDesc.get_output())
5: Run a script file created with alvaDescGUI on Windows:
from alvadesccliwrapper.alvadesc import AlvaDesc
aDesc = AlvaDesc() # Windows is the default
# set_ functions are ignored when using run_script
#aDesc.set_input_file('./myfile.sdf', 'MDL')
#aDesc.set_output_file('./test.txt')
if not aDesc.run_script('./myscript.adscr'):
    # this can also happen if the script does not write its output to stdout
    print('Error: ' + aDesc.get_error())
else:
    print(aDesc.get_output())
Classes
class AlvaDesc (exePath='C:/Program Files/Alvascience/alvaDesc/alvaDescCLI.exe')
-
Initialize the alvaDesc command line wrapper.
Args
exePath
:str
- alvaDescCLI executable file path (by default the path is set for the Windows version)
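Since the default exePath only suits Windows, code meant to run on several operating systems has to pass the path explicitly. One possible approach, sketched below, is to pick the path from `sys.platform`. The helper `default_cli_path` and the dictionary `DEFAULT_CLI_PATHS` are hypothetical and not part of the wrapper; the paths are the default locations quoted in the usage examples, and an installation in a different location needs its own path.

```python
import sys

# Default install locations as quoted in the usage examples of this page.
DEFAULT_CLI_PATHS = {
    'win32': 'C:/Program Files/Alvascience/alvaDesc/alvaDescCLI.exe',
    'linux': '/usr/bin/alvaDescCLI',
    'darwin': '/Applications/alvaDesc.app/Contents/MacOS/alvaDescCLI',
}

def default_cli_path(platform=None):
    """Return the default alvaDescCLI path for a platform (sys.platform if None)."""
    platform = sys.platform if platform is None else platform
    for prefix, path in DEFAULT_CLI_PATHS.items():
        if platform.startswith(prefix):
            return path
    raise ValueError('No default alvaDescCLI path known for platform: ' + platform)

print(default_cli_path('linux'))  # /usr/bin/alvaDescCLI
```

The returned string can then be passed to the constructor, for example `AlvaDesc(default_cli_path())`.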
Methods
def calculate_descriptors(self, descriptors)
-
Calculate the requested descriptors.
Args
descriptors
:str or list[str]
- list of descriptors (e.g., ['MW', 'AMW']) or a single descriptor (e.g., 'ALL' or 'MW')
Returns
bool
- The return value. True for success, False otherwise.
def calculate_fingerprint(self, fingerprint_type, fingerprint_size=1024)
-
Calculate the requested fingerprint.
Args
fingerprint_type
:str
- type of fingerprint: 'ECFP' or 'PFP' or 'MACCSFP'
fingerprint_size
:int
- fingerprint size (1024 by default); not used for MACCSFP
Returns
bool
- The return value. True for success, False otherwise.
def get_descriptors(self)
-
Get the list of all available descriptors.
Returns
list[str]
- the list of all available descriptors.
def get_error(self)
-
Return the error message of the previous execution.
Returns
str
- the error message of the previous execution.
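The calculation methods report failure through a False return value, and the message is then read with get_error. Callers who prefer exceptions could wrap this pattern, as in the sketch below. `AlvaDescError`, `descriptors_or_raise`, and `StubAlvaDesc` are hypothetical names and not part of the wrapper; the stub class exists only so that the example runs without alvaDesc installed.

```python
class AlvaDescError(RuntimeError):
    """Hypothetical exception type; not provided by the wrapper."""

def descriptors_or_raise(adesc, descriptors):
    """Calculate descriptors and return (names, matrix); raise on failure."""
    if not adesc.calculate_descriptors(descriptors):
        raise AlvaDescError(adesc.get_error())
    return adesc.get_output_descriptors(), adesc.get_output()

class StubAlvaDesc:
    """Stand-in with the same method names, used only to make the sketch runnable."""
    def __init__(self, ok):
        self._ok = ok
    def calculate_descriptors(self, descriptors):
        return self._ok
    def get_error(self):
        return 'stub failure'
    def get_output_descriptors(self):
        return ['MW']
    def get_output(self):
        return [[27.03]]

print(descriptors_or_raise(StubAlvaDesc(ok=True), ['MW']))  # (['MW'], [[27.03]])
try:
    descriptors_or_raise(StubAlvaDesc(ok=False), ['MW'])
except AlvaDescError as exc:
    print('Error: ' + str(exc))  # Error: stub failure
```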
def get_output(self)
-
Return the output of the previous execution.
Returns
list[list[float]] or list[str]
- the output of the previous execution.
Note
For descriptor calculations, the result is a list of lists that can be read as a matrix: the number of rows equals the number of molecules and the number of columns equals the number of requested descriptors. For fingerprint calculations, the result is a list of strings.
def get_output_descriptors(self)
-
Return the names of the descriptors calculated in the previous execution.
Returns
list[str]
- the names of the descriptors calculated in the previous execution.
Note
The names are in the same order as the columns of the result of get_output.
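Because the names and the matrix columns share the same order, the values of a single descriptor can be looked up by name. A minimal sketch, using the values from the first usage example as literals; the helper `descriptor_column` is hypothetical and not part of the wrapper.

```python
def descriptor_column(names, matrix, name):
    """Return the values of one descriptor for all molecules, in row order."""
    col = names.index(name)  # raises ValueError if the descriptor was not calculated
    return [row[col] for row in matrix]

# In real use: names = aDesc.get_output_descriptors(); matrix = aDesc.get_output()
names = ['MW', 'AMW']
matrix = [[27.03, 9.01], [58.14, 4.1529], [180.17, 8.5795]]

print(descriptor_column(names, matrix, 'AMW'))  # [9.01, 4.1529, 8.5795]
```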
def run_script(self, file_path)
-
Run alvaDesc with a script file.
Args
file_path
:str
- script file path
Returns
bool
- The return value. True only if the script writes its output to stdout, False otherwise.
Notes
- When run_script is used, the other 'set_' functions, except for set_threads, are ignored
- NaN values are identified only if the Missing_String is set to 'na' in the script (i.e., <Missing_String value="na"/>)
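The note on missing values matters whenever script output is processed as numbers. The sketch below shows one way a missing marker such as 'na' could be mapped to NaN while parsing. It is an illustration only and not the wrapper's actual parsing code; `parse_value` and the sample tokens are hypothetical.

```python
import math

def parse_value(token, missing='na'):
    """Convert one output token to float, mapping the missing marker to NaN."""
    return float('nan') if token == missing else float(token)

# Hypothetical tokens from one output row, with a missing value in the middle.
row = [parse_value(t) for t in ['27.03', 'na', '9.01']]

print(math.isnan(row[1]))  # True
```

With this kind of parsing, a numeric missing marker would be read as an ordinary value rather than as NaN, which is one reason to keep the marker set to 'na'.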
def set_input_SMILES(self, SMILES)
-
Set the input molecules using the SMILES format.
Args
SMILES
:str or list[str]
- list of SMILES (e.g., ['CC', 'CCC']) or a single molecule (e.g., 'CC')
Note
This is an alternative to set_input_file.
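Arguments that accept either a single string or a list of strings are commonly normalized to a list before use. How the wrapper does this internally is not shown here; the sketch below, with the hypothetical helper `as_list`, only illustrates the idea for callers who build their own input lists.

```python
def as_list(value):
    """Wrap a single string in a list; copy any other iterable into a list."""
    return [value] if isinstance(value, str) else list(value)

print(as_list('CC'))           # ['CC']
print(as_list(['CC', 'CCC']))  # ['CC', 'CCC']
```

The isinstance check is needed because calling list('CC') directly would split the SMILES into its characters, giving ['C', 'C'].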
def set_input_file(self, file_path, file_type)
-
Set the input file path.
Args
file_path
:str
- input file path
file_type
:str
- input file type (SMILES or MDL or SYBYL or HYPERCHEM)
Note
This is an alternative to set_input_SMILES.
def set_output_file(self, file_path)
-
Set the output file path.
Args
file_path
:str
- output file path
Note
If file_path is not an empty string, the results are written to that file and are not available through get_output.
def set_threads(self, num_threads)
-
Set the number of threads to be used during the calculation.
Args
num_threads
:int
- number of threads; use 0 to let alvaDescCLI automatically determine the appropriate number of threads.