Python CLI Monolith vs Modular: 4000 LOC, 18 Subcommands (2026)
An r/Python post asked when to split. Decision rule: coupling + navigability, not LOC. Module-per-subcommand pattern; framework choice secondary.
The post: a 4,000 LOC single-file Python CLI built on argparse with 18 subcommands, stdlib plus pyyaml only, tests already in a separate directory. The OP asked when to split. The honest answer: coupling, not LOC.
Why single-file works
The OP's reasons are real. One file to grep. One wheel to ship. No package-layout decisions for new contributors to learn. For tools in this size range, single-file is a feature, not a smell.
What LOC alone misses
4K LOC with high coupling (subcommands share state heavily) reads differently than 4K LOC with 18 mostly-independent subcommands sharing only a few utility functions. The second case is genuinely splittable; the first probably shouldn't be.
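The difference is concrete. A minimal sketch (illustrative, not the OP's code) of the two shapes: subcommands coupled through shared mutable state vs a subcommand that touches only its own arguments.

```python
# Illustrative contrast: state-coupled vs independent subcommands.
_cache = {}  # mutable module-level state shared across subcommands

def cmd_sync():
    # Coupled: mutates state that cmd_report later reads.
    _cache['items'] = ['a', 'b']

def cmd_report():
    # Coupled: only correct if cmd_sync ran first -- hard to extract alone.
    return len(_cache.get('items', []))

def cmd_lint(path):
    # Independent: depends only on its argument; trivially splittable.
    return path.endswith('.yml')
```

The first pair has to move together (or the shared state has to be made explicit); the third function can be lifted into its own module untouched.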
Real signals to split
- "I can't navigate it anymore" — valid on its own.
- A subcommand has 500+ LOC of independent logic.
- A subcommand has an independent test surface.
- A subcommand has a different developer-velocity concern (different team, different release cadence).
What to ignore
- "LOC threshold reached." Arbitrary.
- "Modularity is best practice." Best-practice for what?
- "Adding pytest test files per module." Tests are already separate.
The split shape
Module-per-subcommand is the most common shape that works. Top-level shared utils stay common; each subcommand owns its module + tests. argparse subparsers route to the right module via importlib.
tool/
    __init__.py
    __main__.py      # argparse dispatcher with importlib routing
    shared/          # common utils (parse_yaml, call_api, pretty_print)
        __init__.py
        io.py
        api.py
    commands/
        __init__.py
        analyze.py   # depends on shared.io, shared.api
        migrate.py
        deploy.py
        ...
tests/
    test_shared.py
    test_analyze.py
    ...
The dispatcher
import argparse, importlib

def main():
    parser = argparse.ArgumentParser()
    sub = parser.add_subparsers(dest='command', required=True)
    for name in ['analyze', 'migrate', 'deploy']:  # ...18 subcommands
        sub.add_parser(name)
    args, rest = parser.parse_known_args()
    mod = importlib.import_module(f'tool.commands.{args.command}')
    return mod.run(rest)

if __name__ == '__main__':
    raise SystemExit(main())  # propagate each command's exit code to the shell
What stays the same
Distribution. Single wheel still ships. Onboarding still simple — a new contributor reads commands/<name>.py for the subcommand they care about. Debugging — grep across the package; modern editors handle this easily. The wins from the OP's monolithic setup don't evaporate.
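For concreteness, here is what a per-subcommand module might look like under a dispatcher of this shape: it receives the leftover argv and owns its own flags. Names and flags here are hypothetical, not taken from the OP's tool.

```python
# commands/analyze.py -- sketch of one subcommand module (names are hypothetical).
# The dispatcher hands over the leftover argv; each command parses its own flags.
import argparse

def run(argv):
    parser = argparse.ArgumentParser(prog='tool analyze')
    parser.add_argument('path')
    parser.add_argument('--format', choices=['text', 'json'], default='text')
    args = parser.parse_args(argv)
    # Real analysis logic would live here (e.g. calling shared.io, shared.api).
    print(f'analyzing {args.path} as {args.format}')
    return 0  # exit code, returned up through main()
```

Each module stays runnable and testable in isolation: tests call `run([...])` directly with an argv list, no dispatcher required.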
The 4K LOC threshold question
At ~5K LOC with high coupling, splitting often hurts. At ~5K LOC with low coupling, splitting often helps. The OP's 4K LOC sits just below that range; the answer depends on the coupling audit, not on hitting a magic number.
Refactor cost
A 4K LOC single-file with 18 subcommands typically splits in 3-6 hours when coupling is moderate. The work is mostly mechanical: identify shared utils, extract them, move per-subcommand logic to its module, update imports, run the existing test suite.
What about Click / Typer / Fire?
At the OP's scale, framework choice matters less than the architectural choice of when (and how) to split. Click is the natural upgrade from argparse if you want decorator-driven cleanliness. Typer is fine for type-hint-heavy codebases. Fire is for internal tools where ergonomics beats UX polish. None of these change the monolith-vs-modular question directly.
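If the move to Click did happen, the hand-rolled dispatcher collapses into a group, with each commands/<name>.py registering itself via decorators. A minimal sketch under that assumption (names illustrative):

```python
# Sketch: the argparse dispatcher rewritten as a Click group (illustrative names).
import click

@click.group()
def cli():
    """Entry point; Click builds the subcommand table from decorators."""

@cli.command()
@click.argument('path')
def analyze(path):
    # Each commands/<name>.py module would attach itself to the group like this.
    click.echo(f'analyzing {path}')
```

The routing logic disappears, but the module-per-subcommand layout question is unchanged, which is why the framework choice is secondary.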
What to do this week
Run the coupling audit. List each subcommand's shared imports, shared state, and shared helpers. If most subcommands hit the same 3-5 utils and don't share state, splitting is fine and probably worthwhile. If subcommands share heavy state, keep monolithic.
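The audit can be partly mechanized. A rough sketch (hypothetical helper, not a polished tool) that parses the monolith with ast and reports, for each top-level function, which other module-level names it references:

```python
# coupling_audit.py -- rough sketch: for each top-level function in a module,
# list the other module-level names it references. Hypothetical helper.
import ast

def audit(source):
    tree = ast.parse(source)
    toplevel = {n.name for n in tree.body
                if isinstance(n, (ast.FunctionDef, ast.ClassDef))}
    report = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            used = {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}
            report[node.name] = sorted(used & (toplevel - {node.name}))
    return report
```

Run it over the single file (e.g. `audit(open('tool.py').read())`) and eyeball the output: if most subcommand functions reference the same 3-5 helpers and nothing else, the split is mechanical.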
Verified-online May 2026 against argparse, Click, and Typer documentation, plus the source post.