docs: update research and project dictionary

josecelano · josecelano · commit 84204ced895c · 2025-08-13T19:01:03.000+01:00
diff --git a/docs/redesign/phase3-design/research/01-tools-evaluation.md b/docs/redesign/phase3-design/research/01-tools-evaluation.md
diff --git a/docs/redesign/phase3-design/research/02-language-selection-for-tooling.md b/docs/redesign/phase3-design/research/02-language-selection-for-tooling.md
@@ -0,0 +1,243 @@
+# Language Selection for Automation Tooling
+
+## Key Requirements
+
+The primary requirements for the selected language are:
+
+1. **Cross-Platform Compatibility**: Must run seamlessly on Linux, macOS, and
+   Windows.
+2. **Performance**: Should be fast enough for tasks like file I/O, data
+   processing, and network requests.
+3. **Ecosystem and Libraries**: A rich ecosystem with libraries for common
+   automation tasks is crucial.
+4. **Ease of Use and Learning Curve**: Should be accessible to a wide range of
+   contributors.
+5. **Tooling and IDE Support**: Excellent tooling and IDE support are essential
+   for developer productivity.
+6. **Developer Experience**: The language should be productive and easy for
+   contributors to learn and use, enabling rapid development and maintenance.
+7. **Public Codebase Availability**: The volume of publicly available code is a
+   key factor for AI-assisted development. A larger and more diverse codebase
+   allows for better training of AI models, leading to more accurate and
+   relevant code generation, faster prototyping, and more effective
+   problem-solving.
+8. **Community and Contributor Pool**: A large, active community and a readily
+   available pool of potential contributors are vital for the long-term health
+   and sustainability of the project. This ensures better support, more
+   third-party libraries, and a higher likelihood of attracting developers.
+
+## Language Candidates
+
+The following languages have been identified as strong candidates:
+
+1. **Python**: A high-level, dynamically typed language renowned for its
+   simplicity, readability, and extensive ecosystem in the automation and
+   DevOps space.
+2. **Go (Golang)**: A statically typed, compiled language developed by Google,
+   designed for building simple, reliable, and efficient software. It is the
+   de facto language of the cloud-native ecosystem (Kubernetes, Docker,
+   Prometheus, OpenTofu).
+3. **Rust**: A statically typed, compiled language focused on performance,
+   safety, and concurrency. The Torrust project itself uses Rust, but Rust's
+   suitability for high-level orchestration scripts needs to be evaluated.
+4. **Perl**: A high-level, general-purpose, interpreted, dynamic programming
+   language. It has a long history of being used for system administration
+   and automation tasks.
+5. **Shell Scripting (Baseline)**: The current approach. It serves as a
+   baseline for comparison.
+
+## Comparison
+
+### Evaluation Criteria
+
+| Criterion                          | Python                 | Go                     | Rust                 | Perl               | Shell Script                   |
+| :--------------------------------- | :--------------------- | :--------------------- | :------------------- | :----------------- | :----------------------------- |
+| **Ease of Testing**                | ⭐⭐⭐⭐⭐ (Excellent) | ⭐⭐⭐⭐ (Very Good)   | ⭐⭐⭐ (Good)        | ⭐⭐⭐ (Good)      | ⭐ (Poor)                      |
+| **Ecosystem & Libraries**          | ⭐⭐⭐⭐⭐ (Excellent) | ⭐⭐⭐⭐ (Very Good)   | ⭐⭐⭐ (Good)        | ⭐⭐ (Fair)        | ⭐⭐ (Fair)                    |
+| **Plugin Architecture**            | ⭐⭐⭐⭐ (Very Good)   | ⭐⭐⭐⭐ (Very Good)   | ⭐⭐⭐⭐ (Very Good) | ⭐⭐⭐ (Good)      | ⭐ (Poor)                      |
+| **Standard Library**               | ⭐⭐⭐⭐⭐ (Excellent) | ⭐⭐⭐⭐ (Very Good)   | ⭐⭐⭐ (Good)        | ⭐⭐ (Fair)        | ⭐⭐ (Fair)                    |
+| **Infrastructure Adoption**        | ⭐⭐⭐⭐ (Very Good)   | ⭐⭐⭐⭐⭐ (Excellent) | ⭐⭐⭐ (Growing)     | ⭐⭐⭐ (Legacy)    | ⭐⭐⭐⭐ (Widespread)          |
+| **Developer Experience**           | ⭐⭐⭐⭐⭐ (Excellent) | ⭐⭐⭐⭐ (Very Good)   | ⭐⭐ (Steep Curve)   | ⭐⭐ (Steep Curve) | ⭐⭐⭐ (Good for simple tasks) |
+| **Public Codebase Availability**   | ⭐⭐⭐⭐⭐ (Excellent) | ⭐⭐⭐⭐ (Very Good)   | ⭐⭐⭐ (Good)        | ⭐⭐⭐ (Good)      | ⭐⭐ (Fair)                    |
+| **Community and Contributor Pool** | ⭐⭐⭐⭐⭐ (Excellent) | ⭐⭐⭐⭐ (Very Good)   | ⭐⭐⭐⭐ (Very Good) | ⭐⭐ (Fair)        | ⭐⭐⭐⭐⭐ (Ubiquitous)        |
+| **Overall Suitability**            | **Excellent**          | **Excellent**          | **Good**             | **Fair**           | **Poor**                       |
+
+---
+
+## Detailed Analysis
+
+### 1. Python
+
+- **Testing**: Excellent. The `pytest` framework is incredibly powerful and
+  flexible, making it easy to write clean, maintainable tests. The
+  `unittest` module is built-in. Mocking and patching are straightforward.
+- **Libraries**: Unmatched ecosystem for automation.
+  - **Cloud SDKs**: Mature and well-supported libraries for all major cloud
+    providers (AWS Boto3, Azure, GCP).
+  - **OpenTofu**: The `python-terraform` library provides a wrapper, but
+    it's not as integrated as the Go provider SDK.
+  - **Parsing**: Native `json` (plus `tomllib` since Python 3.11), and robust
+    libraries like `PyYAML` and `toml`.
+- **Extensibility**: Very good. Python's dynamic nature and support for entry
+  points make plugin systems relatively easy to implement.
+- **Adoption**: Widely used. Ansible, a major configuration management tool,
+  is built in Python. Many cloud provider SDKs have first-class Python
+  support.
+- **Developer Experience**: Excellent. The syntax is clean and readable,
+  leading to high productivity. It's a great language for scripting and
+  building high-level logic.
+- **Public Codebase Availability**: Excellent. Python is one of the most popular
+  languages on GitHub, with a vast and diverse range of projects. This
+  provides an enormous dataset for training AI models, leading to excellent
+  AI-assisted development.
+- **Community and Contributor Pool**: Excellent. Python has a massive, active, and welcoming
+  community. This makes it easy to find help, libraries, and potential
+  contributors.
+- **Downsides**: It's dynamically typed, which can lead to runtime errors.
+  Performance is lower than compiled languages, but this is rarely a
+  bottleneck for orchestration scripts.
+
+### 2. Go (Golang)
+
+- **Score**: ⭐⭐⭐⭐ (Very Good)
+- **Testing**: Very Good. Testing is a first-class citizen, built into the
+  toolchain. It's simple to write unit tests, benchmarks, and examples.
+  Table-driven tests are a common and effective pattern.
+- **Libraries**: Very Good.
+  - **Cloud SDKs**: Official and well-maintained SDKs for all major cloud
+    providers.
+  - **OpenTofu**: **Excellent support**. Go is the native language of
+    Terraform, OpenTofu, Packer, and most HashiCorp tools. The official
+    provider development kits are in Go.
+  - **Parsing**: Excellent support for JSON, YAML, and TOML.
+- **Extensibility**: Very good. Interfaces and packages provide a solid
+  foundation for building extensible systems.
+- **Adoption**: **The standard for cloud-native tools**. Docker, Kubernetes,
+  Prometheus, and Terraform are all written in Go. This is its biggest
+  strength.
+- **Developer Experience**: Very good. The language is simple, compilation is
+  fast, and it produces a single, statically linked binary, which simplifies
+  deployment immensely.
+- **Public Codebase Availability**: Very Good. Go is prevalent in the cloud-native space,
+  with many high-profile open-source projects (Docker, Kubernetes, etc.)
+  providing a rich source of high-quality code for AI training.
+- **Community and Contributor Pool**: Very Good. Go has a strong and growing community,
+  particularly in the infrastructure and backend development space.
+- **Downsides**: Error handling can be verbose (`if err != nil`). The lack of
+  generics was a long-standing pain point until Go 1.18 introduced them.
+
+### 3. Rust
+
+- **Score**: ⭐⭐⭐ (Good)
+- **Testing**: Good. The testing framework is built-in and supports unit and
+  integration tests. However, it's generally more verbose than Python's or
+  Go's.
+- **Libraries**: Good, but less mature for high-level orchestration compared
+  to Python and Go.
+  - **Templates**: `Tera` (a Jinja2-like engine) and `Handlebars` are
+    available.
+  - **OpenTofu**: No mature libraries. Interacting with OpenTofu would
+    likely require wrapping the CLI.
+- **Extensibility**: Very good. Traits and enums make for a powerful and
+  safe plugin system.
+- **Adoption**: Growing, but not a mainstream choice for DevOps tooling yet.
+  The learning curve is steep.
+- **Developer Experience**: Good, but can be challenging. The borrow checker,
+  while providing safety, adds complexity that may not be necessary for
+  orchestration scripts.
+- **Public Codebase Availability**: Good. The amount of public Rust code is
+  growing rapidly, especially in systems programming, WebAssembly, and CLI
+  tools. The quality is generally high.
+- **Community and Contributor Pool**: Very Good. Rust has a passionate, helpful, and rapidly
+  growing community.
+- **Downsides**: Steep learning curve. The focus on safety and performance is
+  often overkill for high-level automation scripts.
+
+### 4. Perl
+
+- **Score**: ⭐⭐ (Fair)
+- **Suitability**: Perl is a powerful and mature language, often praised for its
+  text-processing capabilities. It was a de facto standard for system
+  administration and web development (CGI scripts) for many years. However, its
+  popularity has declined, and it's often considered a legacy language.
+- **Ecosystem**: The Comprehensive Perl Archive Network (CPAN) is vast but can
+  be difficult to navigate. Many libraries are old and may not be actively
+  maintained.
+- **Extensibility**: Good. Perl's module system is powerful, but the syntax
+  can be dense and difficult to read, making it less approachable for new
+  contributors.
+- **Adoption**: Low for new projects. It's still used in many legacy
+  systems, but it's rarely chosen for new toolchains.
+- **Developer Experience**: Fair. Perl's "There's more than one way to do
+  it" (TMTOWTDI) philosophy can lead to code that is difficult to read and
+  maintain. The syntax is often criticized for being "write-only."
+- **Public Codebase Availability**: Good. CPAN is one of the oldest and largest
+  code repositories. However, much of the code is legacy, which might be less
+  relevant for modern AI training.
+- **Community and Contributor Pool**: Fair. While the core community is dedicated, it is much
+  smaller and less active in new projects compared to Python, Go, or Rust.
+- **Downsides**: The syntax is complex and often considered "ugly." Because
+  the community has shrunk, finding developers with Perl experience can be
+  difficult.
+
+### 5. Shell Scripting (Baseline)
+
+- **Score**: ⭐ (Poor)
+- **Testing**: Poor. Testing shell scripts is notoriously difficult. Linters
+  like `shellcheck` help, but robust testing requires significant effort.
+- **Libraries**: N/A. Relies on system binaries (`curl`, `jq`, `sed`, `awk`).
+- **Extensibility**: Poor. Extending shell scripts is manual and error-prone.
+- **Adoption**: Ubiquitous, but not ideal for complex logic.
+- **Developer Experience**: Poor for anything beyond simple scripts. Lack of
+  modern language features makes it hard to maintain.
+- **Public Codebase Availability**: Fair. Countless shell scripts are available
+  online, but they often lack standardization, documentation, and quality
+  control, making reuse difficult.
+- **Community and Contributor Pool**: Excellent. The user base is massive, but it is not a
+  formal community. Finding skilled contributors for a structured project can
+  be challenging.
+- **Downsides**: Error handling is fragile, and it's easy to write
+  unmaintainable code. Not suitable for building a robust, extensible
+  toolchain.
+
+## Decision
+
+**Go** is the recommended language for the new Torrust Tracker automation
+toolchain.
+
+## Rationale
+
+While Python is an extremely strong contender and would also be a valid choice,
+**Go's unparalleled alignment with the modern cloud-native and Infrastructure
+as Code ecosystem makes it the superior choice for this specific project.**
+
+1. **Native IaC Ecosystem**: Terraform, OpenTofu, Packer, and nearly all major
+   cloud-native tools are written in Go. By using Go, we are aligning with the
+   language of the tools we are automating. This provides access to the best
+   SDKs, libraries, and community expertise. We can directly use the same
+   libraries that OpenTofu providers use.
+2. **Single Binary Deployment**: Go compiles to a single, statically linked
+   binary with no external runtime dependencies. This dramatically simplifies
+   the deployment and distribution of our new installer. We can ship one file
+   per target OS and architecture, without worrying about Python versions,
+   virtual environments, or dependency conflicts.
+3. **Performance and Concurrency**: While performance is not the primary
+   concern, Go's efficiency and built-in support for concurrency are
+   significant advantages. This will be beneficial for running tasks in
+   parallel, such as provisioning multiple resources or checking multiple
+   endpoints simultaneously.
+4. **Static Typing and Simplicity**: Go's static typing catches many errors at
+   compile time, a significant improvement over shell scripts and Python. Its
+   simplicity and small number of language features make it easy to learn and
+   maintain, which is crucial for an open-source project with many
+   contributors.
+5. **Strong Standard Library**: Go's standard library is excellent for
+   building command-line tools and network services, covering most of our needs
+   without requiring numerous third-party dependencies.
+
+While Rust is the language of the main Torrust project, it is not the best fit
+for this high-level orchestration tool. The complexity and development
+overhead of Rust are not justified for a tool that primarily glues together
+other processes and APIs. Using Go for tooling and Rust for the core tracker
+application is a common and effective polyglot strategy, playing to the
+strengths of each language.
diff --git a/project-words.txt b/project-words.txt
@@ -5,6 +5,7 @@ Ashburn
 Automatable
 autoport
 bantime
+Boto
 buildx
 cdmon
 cdrom
@@ -16,6 +17,7 @@ codel
 commoninit
 conntrack
 containerd
+CPAN
 CPUS
 crontabs
 dialout
@@ -136,6 +138,7 @@ tfstate
 tfvars
 tlsalpn
 tlsv
+TMTOWTDI
 tulpn
 UEFI
 usermod