Automating the identification of code-smells in a Python project

Working on large projects in Python, I quickly learned how important it is to maintain high code quality. Code-smells – i.e. minor potential problems that may indicate deeper structural deficiencies – appear over time in almost every project.

Our aim was to create a process that identifies code-smells continuously, before they reach production. Here is how we did it, step by step.

#0 What is a code-smell?

Code smell is a term used to refer to code fragments that signal potential structural problems, hindering the maintenance and development of a project. These indicators, while not always causing errors, often lead to technical debt and complicate application development.

Typical examples include unnecessary dependencies, overly complex structures and code duplication. Identifying and removing them helps to maintain high code quality.
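To make the idea concrete, here is a minimal sketch of one of these smells, code duplication, and a possible refactor. The booking helpers and their names are invented for illustration and are not taken from any real codebase.

```python
# Hypothetical booking helpers; every name here is invented for illustration.

# Smell: the same email check is copy-pasted into two functions.
def create_booking(payload: dict) -> dict:
    email = payload.get("email", "")
    if "@" not in email:
        raise ValueError("invalid email")
    return {"action": "create", "email": email.strip().lower()}


def update_booking(payload: dict) -> dict:
    email = payload.get("email", "")
    if "@" not in email:
        raise ValueError("invalid email")
    return {"action": "update", "email": email.strip().lower()}


# Possible refactor: the shared rule lives in one helper,
# so a future fix to the validation lands in exactly one place.
def normalise_email(payload: dict) -> str:
    email = payload.get("email", "")
    if "@" not in email:
        raise ValueError("invalid email")
    return email.strip().lower()


def save_booking(action: str, payload: dict) -> dict:
    return {"action": action, "email": normalise_email(payload)}
```

Neither version is broken, which is exactly why this counts as a smell rather than a bug: the duplicated version works today but invites the two copies to drift apart later.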

#1 Problem: Identifying and managing code-smells in Python

In the context of TravelTech, every second of delay or minor interface error can mean a lost booking or an unhappy customer. Therefore, maintaining the highest quality code is not just a matter of good practice – it is also a business imperative.

While working on Getaway, we noticed several problems that regularly appeared in the code:

Unused imports and variables – they clutter the code, and unused imports add needless overhead every time a module is loaded.
Inconsistent formatting – within a team, a variety of coding styles can lead to chaos and make code difficult to maintain.
Code duplication and complexity – they introduce a risk of bugs and make updates difficult.
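As an illustration of the first problem, the sketch below finds unused imports with Python's standard `ast` module. It is a deliberately simplified stand-in for what dedicated tools do, not their actual implementation: it ignores `__all__`, star imports and names used only inside strings. The function name and the sample module are assumptions made for this example.

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in `source`."""
    tree = ast.parse(source)
    imported: set[str] = set()
    used: set[str] = set()

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports cannot be tracked this way
                    imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            # Covers plain uses and the base of attribute access (os in os.path)
            used.add(node.id)

    return sorted(imported - used)


# Hypothetical module with two imports that are never used.
SAMPLE = '''
import os
import json
from typing import List, Optional

def load(path: str) -> Optional[dict]:
    with open(os.path.join(path, "data.txt")) as handle:
        return {"first_line": handle.readline()}
'''

print(find_unused_imports(SAMPLE))  # ['List', 'json']
```

Real linters handle many more edge cases, which is one reason to rely on them rather than on a home-grown script.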

In order to effectively address these issues, we needed tools that could automatically recognise code-smells in real time, enabling quick fixes.

#2 Solution: Integration of linters

To set the standard for code quality control, we adopted several linters and related tools that became part of our pre-commit process. Each had its own task in detecting and solving specific problems:

Ruff – a fast linter that helped identify and correct style issues. Ruff allowed fixes to be applied immediately, reducing the number of formatting corrections made after coding was completed.
Autoflake – a tool for removing unused imports and variables. Thanks to Autoflake, we were able to eliminate these elements before they accumulated, which had a positive effect on the clarity of the code.
Flake8 and Pylint – these linters introduced more detailed analysis of complex problems in code, such as redundancies or potential logic errors. Flake8 provided quick inspection results, while Pylint offered more extensive reports on code structure.
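For readers who want to try the four tools together outside of pre-commit, the following is a minimal sketch of a runner script. The order of the tools, the target path and the function names are assumptions for this example, not Getaway's actual configuration. It assumes the tools are installed and on the `PATH`, and uses only their basic command-line forms (`ruff check --fix`, `autoflake --in-place --recursive` with its two removal flags, `flake8` and `pylint`).

```python
import subprocess


def build_commands(target: str) -> list[list[str]]:
    """Return one command per tool, in the order they will be run."""
    return [
        # Ruff: lint and apply the fixes it considers safe
        ["ruff", "check", "--fix", target],
        # Autoflake: rewrite files, dropping unused imports and variables
        [
            "autoflake",
            "--in-place",
            "--recursive",
            "--remove-all-unused-imports",
            "--remove-unused-variables",
            target,
        ],
        # Flake8: quick style and error report
        ["flake8", target],
        # Pylint: slower, more detailed structural report
        ["pylint", target],
    ]


def run_all(target: str) -> int:
    """Run every tool and return how many of them reported problems."""
    failures = 0
    for command in build_commands(target):
        print("$", " ".join(command))
        try:
            result = subprocess.run(command, check=False)
        except FileNotFoundError:
            print(f"{command[0]} is not installed, skipping")
            continue
        if result.returncode != 0:
            failures += 1
    return failures
```

Calling `run_all("src")` prints each command before running it and returns the number of tools that exited with a non-zero status. Note that the first two commands modify files in place, so it is safest to run this on a clean working tree.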

Closed

Identifying code-smells #3456

Rayan, 10 days ago

Prerequisites

✅ Creating a stable pre-commit process for identifying code-smells in a Python project
✅ Selection of tools to eliminate problems such as unused imports, code duplication, inconsistent formatting

Description of the problem

We have noticed that some problems are repeated in the code, which can increase the risk of errors and hinder the development of the project. Has anyone tested different linters in one set?

RuffUser, 8 days ago

We used Ruff in combination with Autoflake and Flake8. Ruff is great for quick style fixes and Autoflake perfectly removes unused elements. The problem can be inconsistency between the different linters.

Neoncube, 7 days ago

We have added Pylint for more thorough code structure analysis. Although it's slower, it's worth the time investment, especially on large projects. Using it with pre-commit was crucial in eliminating code-smells at an early stage.

Ryan, 6 days ago

Sounds promising. Do you use any additional hooks to guard against other problems, such as committing large files?

Neoncube, 6 days ago

Yes, we have added merge conflict checking, JSON and YAML validation and detection of private keys. This helps to avoid problems even before the changes are committed. Pre-commit really does work!

Ryan, 3 days ago

Thanks! The implementation of this process sounds solid. I think I'm going to check out this configuration at our place.

Neoncube, 2 days ago

It is worth investing in such automation, especially in large projects. Fewer revisions in the later stages of development.

#3 Formatting using pre-commit hooks

Each of the tools was wired into pre-commit, which ensured that they ran before a commit was created. Alongside the linters, every commit also triggered the following checks:

Verification of merge conflicts in the repository.
Checking for inadvertently oversized files.
Verification that a private key has not been added to the repository.
Validation of JSON files.
Validation of YAML files.

This approach allowed us to identify code-smells on an ongoing basis and prevent problematic code from going into production. Unit and integration tests further strengthened our quality control, and the same setup can be reused across other Python projects.
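In the widely used pre-commit-hooks collection, checks like the five listed above are available as the hooks `check-merge-conflict`, `check-added-large-files`, `detect-private-key`, `check-json` and `check-yaml`. To show roughly what such checks look for, here is a simplified standard-library sketch. It is not the hooks' actual implementation: the size limit, the regular expressions and all function names are assumptions. YAML validation is left out because it needs a third-party parser.

```python
import json
import re

MAX_FILE_BYTES = 500 * 1024  # assumed limit for this sketch

# Lines that start with Git's conflict markers
CONFLICT_MARKER = re.compile(r"^(<{7} |={7}$|>{7} )", re.MULTILINE)
# PEM headers such as "-----BEGIN RSA PRIVATE KEY-----"
PRIVATE_KEY_HEADER = re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----")


def is_valid_json(text: str) -> bool:
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def check_staged_file(name: str, data: bytes) -> list[str]:
    """Return a list of problems found in one file about to be committed."""
    problems = []
    if len(data) > MAX_FILE_BYTES:
        problems.append("file is too large")

    text = data.decode("utf-8", errors="replace")
    if CONFLICT_MARKER.search(text):
        problems.append("merge conflict marker")
    if PRIVATE_KEY_HEADER.search(text):
        problems.append("private key")
    if name.endswith(".json") and not is_valid_json(text):
        problems.append("invalid JSON")
    return problems


print(check_staged_file("settings.json", b'{"debug": '))  # ['invalid JSON']
```

A commit would be rejected whenever any file returns a non-empty list, which is the same fail-fast behaviour pre-commit provides out of the box.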

#4 Implementation in three steps

The process of implementing these tools and procedures followed three main steps:

Tool research and selection – we analysed tools such as Ruff, Autoflake, Flake8 and Pylint to assess their effectiveness in identifying code-smells in our project.
Integration with pre-commit hooks – setting these tools to run before commits allowed code to be checked immediately for problems even before it was saved to the repository.
Automated testing and verification – finally, we combined our linters with unit and integration tests to ensure full stability and consistency of the code.

#5 Effects and benefits

The integration of Ruff, Autoflake, Flake8 and Pylint tools as part of an automated commit flow has brought a number of benefits to Getaway:

Increased code consistency – with pre-commit hooks, our team was able to maintain a consistent working style, eliminating the need to manually review code for formatting.
Improved code management – removal of unused imports and variables has improved code readability and reduced code loading times.
Faster development cycles – early identification of code-smells at commit has significantly improved the debugging and refactoring process, enabling faster deployment of fixes.

The implementation of these automated quality checks has proved crucial to maintaining a high standard of code and has allowed the team to focus on developing new features rather than constantly fixing bugs.

This approach has not only accelerated our work, but has also laid a solid foundation for future Python projects, ensuring that they are flexible and of high quality right from the programming stage.
