PySSA Code Rules
Description
Python is the main programming language of the PySSA project. This document describes the rules that must be followed whenever the source code is extended. The rule of thumb is to follow the Google style guide for Python, but anything mentioned in this document overrides the Google style guide. These rules may be outdated and can change at any time.
Linting
You have to run ruff over your code to check for static errors.
The configuration to use is defined in pyproject.toml.
Do NOT commit any code that has not been linted with ruff or that still contains unfixed ruff errors.
Type annotation
Python is a dynamically typed language, but this project makes heavy use of type annotations. The decision emphasizes robust and less error-prone code. Therefore, you have to use Python's type annotation feature.
Annotations of Python built-ins
Annotate variables with Python built-in types wherever possible.
i: int = 0
program_name: str = "PySSA"
Annotations of PySSA built-ins
Annotate variables with PySSA types wherever PySSA data structures are used.
protein_pairs_for_analysis: list['protein_pair.ProteinPair'] = []
Annotations of library built-ins
Annotate variables with library types wherever data types of libraries are used.
import numpy as np
distances_of_amino_acid_pairs: np.ndarray = np.array([])
Annotations of return values
If a function or method has a return value that will not be used, the call must be wrapped in the rvoid function.
The rvoid function is the only function that is imported as a function and not as a module:
from pyssa.util.void import rvoid
from pyssa.util import main_window_util
rvoid(main_window_util.setup_app_settings(self.app_settings))
# rvoid indicates that there is a return value, but it is not used
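The implementation of rvoid is not shown in this document. As a rough sketch, assuming it simply accepts the returned value and discards it, a stand-in could look like the following. The function body and the helper setup_app_settings defined here are hypothetical and only illustrate the call pattern.

```python
from typing import Any


def rvoid(a_return_value: Any) -> None:
    """Hypothetical stand-in for pyssa.util.void.rvoid (assumed behavior).

    Accepts any return value and deliberately discards it.
    """
    return None


def setup_app_settings(the_app_settings: dict) -> dict:
    """Hypothetical helper that returns a value the caller does not need."""
    the_app_settings["is_setup"] = True
    return the_app_settings


app_settings: dict = {}
# The return value exists, but wrapping the call in rvoid marks it as intentionally unused.
rvoid(setup_app_settings(app_settings))
```

The wrapped function still runs normally; only its return value is dropped, which makes the intent visible to a reader.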
Naming conventions
- Package: snake_case
- Module: snake_case
- Class: PascalCase
- Method: snake_case
  - private: _ prefix (single underscore)
    def _create_directory_structure(self) -> None:
  - slots: __ prefix (double underscore)
    def __slot_open_project(self) -> None:
- Function: snake_case
- Variable: snake_case
  - argument: a_var_name or an_var_name, if no specific variable is meant.
    def export_protein_as_pdb_file(a_filepath: str) -> None:
  - argument: the_var_name, if a specific variable is meant.
    def load_settings(the_app_settings: 'settings.Settings') -> None:
  - method/function scope: tmp_ prefix
    ... tmp_destination_filepath: str = "/home/rhel_user/scratch/log.txt" ...
- Global variable: g_ prefix + snake_case
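Taken together, the conventions above could be applied as in the following sketch. All names in it (g_default_workspace, ProjectExporter, export_example and so on) are made up for illustration and are not part of PySSA.

```python
import pathlib

g_default_workspace: pathlib.Path = pathlib.Path("workspace")  # global variable: g_ prefix


class ProjectExporter:  # class: PascalCase
    """Illustrative class, not part of PySSA."""

    def __init__(self, the_workspace_path: pathlib.Path) -> None:
        # the_ prefix: one specific path (the workspace of this exporter) is meant
        self._workspace_path: pathlib.Path = the_workspace_path

    def build_export_filepath(self, a_filename: str) -> pathlib.Path:  # method: snake_case
        # a_ prefix: any filename may be passed in
        tmp_export_path: pathlib.Path = self._create_export_path()  # local variable: tmp_ prefix
        return tmp_export_path / a_filename

    def _create_export_path(self) -> pathlib.Path:  # private method: single underscore
        return self._workspace_path / "export"


def export_example() -> pathlib.Path:  # function: snake_case
    """Builds an example export filepath inside the default workspace."""
    tmp_exporter: ProjectExporter = ProjectExporter(g_default_workspace)
    return tmp_exporter.build_export_filepath("protein.pdb")
```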
Imports
Never use wildcard imports. Always import the module, not the class or function itself.
from pymol import cmd # Correct: Module is imported
from pymol import * # Wrong! Wildcard import
from os.path import exists # Wrong! Function/Class import
import numpy as np
import pandas as pd
Exception handling
Argument checks
Always check for None:
def copy_fasta_file(a_source_filepath, a_destination_filepath):
    if a_source_filepath is None:
        logger.error(f"The argument 'a_source_filepath' is illegal: {a_source_filepath}!")
        raise exception.IllegalArgumentError("An argument is illegal.")
    if a_destination_filepath is None:
        logger.error(f"The argument 'a_destination_filepath' is illegal: {a_destination_filepath}!")
        raise exception.IllegalArgumentError("An argument is illegal.")
Raise IllegalArgumentError if the unmodified argument is not usable for the function/method:
import os
import pathlib

def copy_fasta_file(a_source_filepath: pathlib.Path, a_destination_filepath: pathlib.Path):
    ...
    if not os.path.exists(a_source_filepath):  # argument is unmodified
        raise exception.IllegalArgumentError("An argument is illegal.")
Raise a custom exception if the argument is modified and the modified value is not usable for the function/method:
import os
import pathlib

def copy_fasta_file(a_source_filepath: pathlib.Path, a_destination_filepath: pathlib.Path):
    ...
    if not os.path.exists(a_source_filepath.parent):  # .parent is a modified version of the argument
        raise exception.DirectoryNotFoundError("")
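The three kinds of checks can be combined in one function. The sketch below is self-contained, so it defines its own stand-ins for IllegalArgumentError and DirectoryNotFoundError and uses the standard logging module; the real PySSA exception classes and logger setup may differ. For illustration, the modified-argument check is applied to the parent directory of the destination, and the copy itself is done with shutil.copy, which is an assumption about what the function does.

```python
import logging
import pathlib
import shutil

logger = logging.getLogger(__name__)


class IllegalArgumentError(Exception):
    """Stand-in for the PySSA exception (assumed to be a plain Exception subclass)."""


class DirectoryNotFoundError(Exception):
    """Stand-in for the PySSA exception (assumed to be a plain Exception subclass)."""


def copy_fasta_file(a_source_filepath: pathlib.Path, a_destination_filepath: pathlib.Path) -> None:
    """Copies a fasta file after validating both arguments."""
    # 1. None checks
    if a_source_filepath is None:
        logger.error("The argument 'a_source_filepath' is illegal: %s!", a_source_filepath)
        raise IllegalArgumentError("An argument is illegal.")
    if a_destination_filepath is None:
        logger.error("The argument 'a_destination_filepath' is illegal: %s!", a_destination_filepath)
        raise IllegalArgumentError("An argument is illegal.")
    # 2. The unmodified argument is not usable: IllegalArgumentError
    if not a_source_filepath.exists():
        raise IllegalArgumentError("An argument is illegal.")
    # 3. A modified version of the argument (.parent) is not usable: custom exception
    if not a_destination_filepath.parent.exists():
        raise DirectoryNotFoundError("The destination directory does not exist.")
    shutil.copy(a_source_filepath, a_destination_filepath)
```

The order matters: the None checks come first so that the later attribute accesses (.exists(), .parent) never run on None.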
try-except blocks
Always wrap cmd commands of the PyMOL API in a try-except block.
import pymol
from pymol import cmd

try:
    cmd.scene(f"{tmp_protein_pair.protein_1.get_molecule_object()}"
              f"{tmp_protein_pair.protein_2.get_molecule_object()}",
              action="recall")
except pymol.CmdException:
    logger.error("...")
    raise ...
Communication
QMainWindow & QDialogs
The communication between any QMainWindow and QDialog is done with signals and slots. This ensures that no unauthorized memory access violations occur.
How-to
- Define a custom pyqtsignal in the QDialog class:
...
class DialogAddModel(Qt.QtWidgets.QDialog):
    """Class for a dialog to add proteins to a project."""

    """A pyqtsignal that is used to hand-over the protein structure information."""
    return_value = pyqtSignal(tuple)  # this is a custom PyQt signal
...
- Emit the signal where communication should occur.
...
def add_model(self) -> None:
    """Emits a custom pyqtsignal and closes the dialog."""
    self.return_value.emit((self.ui.txt_add_protein.text(), True))
    self.close()
...
- Connect the signal in the QMainWindow with the QDialog object and the slot function:
...
def add_existing_protein(self) -> None:
    """Opens a dialog to add an existing protein structure to the project."""
    self.tmp_dialog = dialog_add_model.AddProteinView()
    self.tmp_dialog.return_value.connect(self.post_add_existing_protein)  # here is the connection
    self.tmp_dialog.show()
...
...
def post_add_existing_protein(self, return_value: tuple):  # in this case the value is a tuple
    ...
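The flow of the three steps can be tried out without a running Qt application. The sketch below replaces pyqtSignal with a small hypothetical stand-in class (SketchSignal) that only supports connect and emit. It is not the PyQt API: real PyQt signals are declared as class attributes and also take care of delivery across threads, which this sketch does not attempt. SketchDialog and SketchMainWindow are likewise illustrative names.

```python
from typing import Callable


class SketchSignal:
    """Hypothetical, simplified stand-in for a PyQt signal (not the PyQt API)."""

    def __init__(self) -> None:
        self._slots: list[Callable[[tuple], None]] = []

    def connect(self, a_slot: Callable[[tuple], None]) -> None:
        """Registers a slot that is called on every emit."""
        self._slots.append(a_slot)

    def emit(self, a_value: tuple) -> None:
        """Calls all connected slots with the given value."""
        for tmp_slot in self._slots:
            tmp_slot(a_value)


class SketchDialog:
    """Plays the role of the QDialog."""

    def __init__(self) -> None:
        self.return_value = SketchSignal()  # step 1: define the custom signal

    def add_model(self, a_protein_name: str) -> None:
        self.return_value.emit((a_protein_name, True))  # step 2: emit the signal


class SketchMainWindow:
    """Plays the role of the QMainWindow."""

    def __init__(self) -> None:
        self.received_values: list[tuple] = []
        self.tmp_dialog = SketchDialog()
        # step 3: connect the signal of the dialog with the slot function
        self.tmp_dialog.return_value.connect(self.post_add_existing_protein)

    def post_add_existing_protein(self, return_value: tuple) -> None:
        self.received_values.append(return_value)
```

The dialog never holds a reference to the main window; it only emits, and the main window decides which slot reacts.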
Threading
Within PySSA, the custom tea package is used whenever multithreading is necessary.
Do NOT use the Action class directly; only use the Task class!
Usage
...
def opens_project(self):
    """Initiates the task to open an existing project."""
    self._active_task = tasks.LegacyTask(self.__async_open_project, post_func=self.__await_open_project)
    self._active_task.start()

def __async_open_project(self) -> tuple:
    """Runs in the separate QThread and does CPU-bound work."""
    tmp_project_path = pathlib.Path(f"{self._workspace_path}/{self._view.ui.txt_open_selected_project.text()}")
    return ("result", project.Project.deserialize_project(tmp_project_path, self._application_settings))

def __await_open_project(self, a_result: tuple):
    """Runs after the QThread finished."""
    ...
The Task class gets an "async" function and optionally an "await" function.
The name of the function that runs in the QThread must start with the prefix __async (double underscore). The name of the function that runs after the QThread has finished must start with the prefix __await.
This design decision is based on intuition: the __async function runs asynchronously in the QThread, and the __await function waits for the QThread (the __async function) to finish.
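The API of the tea package is not documented here, so the following sketch uses a hypothetical SketchTask built on threading.Thread purely to illustrate how an __async function and an __await function are paired. Unlike a Qt-based task, it runs the "await" function on the calling thread inside wait(). The classes, their methods, and the project name are assumptions made for this example.

```python
import threading
from typing import Callable, Optional


class SketchTask:
    """Hypothetical stand-in for a task class; not the API of the tea package."""

    def __init__(self, an_async_func: Callable[[], tuple],
                 post_func: Optional[Callable[[tuple], None]] = None) -> None:
        self._async_func = an_async_func
        self._post_func = post_func
        self._result: tuple = ()
        self._thread = threading.Thread(target=self._run)

    def _run(self) -> None:
        # Executed in the worker thread.
        self._result = self._async_func()

    def start(self) -> None:
        self._thread.start()

    def wait(self) -> None:
        # Blocks until the worker is done, then hands the result to the "await" function.
        self._thread.join()
        if self._post_func is not None:
            self._post_func(self._result)


class SketchController:
    """Illustrative controller that pairs an __async and an __await method."""

    def __init__(self) -> None:
        self.opened_project_name: str = ""
        self._active_task: Optional[SketchTask] = None

    def open_project(self) -> None:
        self._active_task = SketchTask(self.__async_open_project, post_func=self.__await_open_project)
        self._active_task.start()

    def wait_for_active_task(self) -> None:
        if self._active_task is not None:
            self._active_task.wait()

    def __async_open_project(self) -> tuple:
        """Runs in the worker thread."""
        return ("result", "demo_project")

    def __await_open_project(self, a_result: tuple) -> None:
        """Runs after the worker thread finished."""
        self.opened_project_name = a_result[1]
```

Because both methods start with a double underscore, Python's name mangling applies; referencing them as self.__async_open_project from inside the same class still works.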
Database
PySSA uses a separate SQLite database for each project.
Basic interaction with DatabaseX classes
The interaction is managed through the DatabaseManager class.
The interaction with the manager from a controller is done through the DatabaseThread class. The DatabaseThread has a queue which accepts objects of the type DatabaseOperation.
To run an INSERT statement from a controller, you have to create a DatabaseOperation object with the SQLQueryType (in this case INSERT_...) and put it into the queue of the DatabaseThread.
The example below shows the same pattern for a DELETE statement:
def _delete_protein(self):
    """Deletes an existing protein from the project."""
    tmp_protein: "protein.Protein" = self._view.ui.proteins_tree_view.currentIndex().data(enums.ModelEnum.OBJECT_ROLE)
    # Below is the creation of the DatabaseOperation object
    tmp_database_operation = database_operation.DatabaseOperation(enums.SQLQueryType.DELETE_EXISTING_PROTEIN,
                                                                  (0, tmp_protein.get_id()))
    # Here the DatabaseOperation object will be put into the queue of the DatabaseThread
    self._database_thread.put_database_operation_into_queue(tmp_database_operation)
    # -- The rest of the function
    self._interface_manager.get_current_project().delete_specific_protein(tmp_protein.get_molecule_object())
    self._interface_manager.refresh_protein_model()
    self._interface_manager.refresh_main_view()
Adding new SQL queries to the DatabaseThread
Every SQL statement has to be implemented in the DatabaseManager class!
For the DatabaseThread class to work properly, each SQL statement of the database manager must be wrapped in a wrapper function, and that function must be mapped to an appropriate SQLQueryType enum member.
An example of a wrapper function:
@staticmethod
def __wrapper_delete_existing_protein(the_db_manager, the_buffered_data: tuple):
    # It is important to unpack the first element of the tuple with an _ !
    _, tmp_protein_id = the_buffered_data
    the_db_manager.delete_existing_protein(tmp_protein_id)
An example of the mapping process:
def _setup_operations_mapping(self):
    self._operations_mapping = {
        enums.SQLQueryType.INSERT_NEW_PROTEIN: self.__wrapper_insert_new_protein,
        enums.SQLQueryType.DELETE_EXISTING_PROTEIN: self.__wrapper_delete_existing_protein
    }

class SQLQueryType(enum.Enum):
    """An enum for all possible sql queries for the database thread."""
    INSERT_NEW_PROTEIN = 'insert_new_protein'
    DELETE_EXISTING_PROTEIN = 'delete_existing_protein'
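To show how the queue, the wrapper functions, and the mapping fit together, here is a self-contained sketch using queue.Queue and threading.Thread from the standard library. SketchDatabaseManager, SketchDatabaseThread, the None sentinel used to stop the thread, the shape of the INSERT tuple, and the in-memory dictionary are assumptions made for this example. The real DatabaseManager works on SQLite, and the real DatabaseThread may be organized differently.

```python
import enum
import queue
import threading


class SQLQueryType(enum.Enum):
    INSERT_NEW_PROTEIN = "insert_new_protein"
    DELETE_EXISTING_PROTEIN = "delete_existing_protein"


class DatabaseOperation:
    """Simplified stand-in: a query type plus the buffered data tuple."""

    def __init__(self, a_sql_query_type: SQLQueryType, the_buffered_data: tuple) -> None:
        self.sql_query_type = a_sql_query_type
        self.buffered_data = the_buffered_data


class SketchDatabaseManager:
    """Hypothetical manager that keeps proteins in a dict instead of SQLite."""

    def __init__(self) -> None:
        self.proteins: dict[int, str] = {}

    def insert_new_protein(self, a_protein_id: int, a_protein_name: str) -> None:
        self.proteins[a_protein_id] = a_protein_name

    def delete_existing_protein(self, a_protein_id: int) -> None:
        del self.proteins[a_protein_id]


class SketchDatabaseThread(threading.Thread):
    """Hypothetical worker that consumes DatabaseOperation objects from a queue."""

    def __init__(self, the_db_manager: SketchDatabaseManager) -> None:
        super().__init__()
        self._db_manager = the_db_manager
        self._operations_queue: queue.Queue = queue.Queue()
        self._operations_mapping = {
            SQLQueryType.INSERT_NEW_PROTEIN: self.__wrapper_insert_new_protein,
            SQLQueryType.DELETE_EXISTING_PROTEIN: self.__wrapper_delete_existing_protein,
        }

    def put_database_operation_into_queue(self, a_database_operation: DatabaseOperation) -> None:
        self._operations_queue.put(a_database_operation)

    def stop(self) -> None:
        self._operations_queue.put(None)  # sentinel, an assumption of this sketch

    def run(self) -> None:
        while True:
            tmp_operation = self._operations_queue.get()
            if tmp_operation is None:
                break
            # Look up the wrapper for the query type and run it with the buffered data.
            tmp_wrapper = self._operations_mapping[tmp_operation.sql_query_type]
            tmp_wrapper(self._db_manager, tmp_operation.buffered_data)

    @staticmethod
    def __wrapper_insert_new_protein(the_db_manager, the_buffered_data: tuple) -> None:
        _, tmp_protein_id, tmp_protein_name = the_buffered_data  # tuple shape is assumed
        the_db_manager.insert_new_protein(tmp_protein_id, tmp_protein_name)

    @staticmethod
    def __wrapper_delete_existing_protein(the_db_manager, the_buffered_data: tuple) -> None:
        _, tmp_protein_id = the_buffered_data
        the_db_manager.delete_existing_protein(tmp_protein_id)


db_manager = SketchDatabaseManager()
database_thread = SketchDatabaseThread(db_manager)
database_thread.start()
database_thread.put_database_operation_into_queue(
    DatabaseOperation(SQLQueryType.INSERT_NEW_PROTEIN, (0, 1, "protein_a")))
database_thread.put_database_operation_into_queue(
    DatabaseOperation(SQLQueryType.INSERT_NEW_PROTEIN, (0, 2, "protein_b")))
database_thread.put_database_operation_into_queue(
    DatabaseOperation(SQLQueryType.DELETE_EXISTING_PROTEIN, (0, 1)))
database_thread.stop()
database_thread.join()
```

Since the queue is processed in order by a single thread, the controller never touches the database directly and the operations are applied in the sequence they were queued.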
Terminology
Path, dir, file & filepath
- Always use path if a directory path is meant.
- Always use dir if a directory name is meant.
- Always use filepath if an absolute path to a file is meant.
- Always use file if a name of a file is meant.
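Applied to variable names, the four terms could be used as in this short sketch. It uses pathlib.PurePosixPath so that it behaves the same on every operating system; the variable names are illustrative only.

```python
import pathlib

# filepath: absolute path to a file
log_filepath: pathlib.PurePosixPath = pathlib.PurePosixPath("/home/rhel_user/scratch/log.txt")
# file: name of the file
log_file: str = log_filepath.name
# path: directory path
scratch_path: pathlib.PurePosixPath = log_filepath.parent
# dir: directory name
scratch_dir: str = scratch_path.name
```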
Difference between TODO and fixme
- Add a # TODO if there is a task which needs to be done.
- Add a # fixme if there is an important note which needs to be quickly found.
Code formatting
The overall code formatting is done with the auto-formatter black. The formatter is applied when the pre-commit hooks are run.
Editor folds
Always wrap argument checks in an editor fold (Ctrl+Alt+T). Example:
# <editor-fold desc="Checks">
if the_fasta_path is None:
    logger.error("The argument 'the_fasta_path' is illegal.")
    raise exception.IllegalArgumentError("")
# </editor-fold>