Coding in style¶
Please check the complete Python style guide PEP 8 or Google Python Style Guide. You can also check the detailed Tidyverse R style guide.
Tools for coding styles¶
However, instead of reading the whole style guide, we recommend you to use many existing tools to help you write code in a consistent style.
Formatting¶
Formatting is the process of organizing your code to make it more readable.
- Python: You can check the VS Code tutorial for python formatting here. We recommend using VS Code extensions (e.g. autopep8) to format your code.
- R: The R package styler allows you to interactively restyle selected text, files, or entire projects. It includes an RStudio add-in, the easiest way to re-style existing code.
Linting¶
Linting is the process of analyzing your code to check for errors and enforce coding standards.
Here are some popular linting tools for Python
:
- Pylint: it is a static type checker and linting tool for Python.
- Black: it is a code formatter that automatically formats your code to adhere to a consistent style.
Please check the VS Code tutorial for python linting here.
The most popular linting tools for R
:
- lintr: it has been integrated into the RStudio and VS Code R extensions.
Tip:AI-assisted styling
You can also ask AI to help you style your code. You can document all of your code style preferences and attached the instructions when asking AI to style your code. For example, you can prompt:
Basic coding style principles¶
Despite the existence of many tools, we still list some basic coding style principles here.
Naming objects¶
Names should be descriptive and self-documenting:
# Bad example - not self-documenting
def func(y):
# z is gravitational force
z = 9.81
# y is time in seconds
y = 5
# x is displacement
x = 1/2 * z * y**2
return x
# Good example - self-documenting
def compute_displacement(time_in_seconds):
gravitational_force = 9.81
displacement = 1/2 * gravitational_force * time_in_seconds**2
return displacement
# Even better - with parameters
def compute_displacement(time_in_seconds,
gravitational_force=9.81):
displacement = 1/2 * gravitational_force * time_in_seconds**2
return displacement
Use either snake_case
or camelCase
but make it consistent:
# snake_case (Python preferred style)
def calculate_average(number_list):
return sum(number_list) / len(number_list)
# camelCase (less common in Python)
def calculateAverage(numberList):
return sum(numberList) / len(numberList)
Python name conventions:
- Module names should use
snake_case
. - Class names should use
CapWords
. - Function and variable names should use
snake_case
. - Constant names should use
UPPERCASE
. _single_leading_underscore
: weak “internal use” indicator. E.g. from M import * does not import objects whose names start with an underscore.single_trailing_underscore_
: used by convention to avoid conflicts with Python keyword, e.g.class_
.__double_leading_underscore
: when naming a class attribute, invokes name mangling (inside class FooBar, __boo becomes _FooBar__boo; see below).__double_leading_and_trailing_underscore__
: “magic” objects or attributes that live in user-controlled namespaces. E.g.__init__
,__import__
or__file__
. Never invent such names; only use them as documented.
Documentation¶
Use docstrings to explain what the module, class, or function does.
# Bad example
def f(x, y):
a = x + y # What does 'a' represent?
return a
# Good example
def calculate_total_price(item_price, tax_rate):
"""Calculate final price including tax.
Args:
item_price (float): Original price of the item
tax_rate (float): Tax rate as a decimal (e.g., 0.1 for 10%)
Returns:
float: Total price including tax
"""
total_price = item_price * (1 + tax_rate)
return total_price
Spacing and Operators¶
Indentation¶
Use 4 spaces per indentation level. Avoid mixing tabs and spaces:
# Good indentation
def process_data():
if condition:
do_something()
for item in items:
process_item()
# Bad indentation (mixing spaces)
def process_data():
if condition:
do_something()
for item in items:
process_item()
Maximum Line Length¶
Limit all lines to a maximum of 79 characters:
# Bad - too long
result = some_long_function_name(very_long_parameter_name1, very_long_parameter_name2, very_long_parameter_name3)
# Good - split into multiple lines
result = some_long_function_name(
very_long_parameter_name1,
very_long_parameter_name2,
very_long_parameter_name3
)
Imports¶
Group imports in the following order: 1. Standard library 2. Third-party 3. Local application/library specific
# Good import organization
import os
import sys
import numpy as np
import pandas as pd
from myproject.utils import helper
from myproject.core import main
# Bad import organization
from myproject.utils import helper
import sys
import pandas as pd
from myproject.core import * # Avoid wildcard imports
import os