One year ago, by chance, I participated in a key project for the company, which required long-term travel. The development team consisted of around 20 people, and the schedule was tight. Being in a different location, without the support of the company’s technical team and with inconvenient remote communication, many things became more difficult. Often, when encountering problems, we had to figure things out on our own. There’s a saying that a development team is like an engine; once it’s started, it can produce results and outputs. However, if the direction is off, the further the engine runs, the harder it is to stop, and the project may spiral out of control. Fortunately, our project was successfully launched, completed the e-commerce promotion, and was delivered smoothly. However, there were some risk points and lessons learned that are worth summarizing.
Usually, the development manager or architect joins the project earlier than the developers in order to understand the requirements, perform system analysis, select appropriate technologies, and set development plans and standards. During the technical specification stage, the development manager or architect should coordinate all developers, establish technical standards, and keep communicating with the developers so that each one understands the modules or subsystems they are responsible for and the architecture is implemented correctly.
Most companies have a document for basic coding standards (if not, you may want to consider job-hopping). It is generally independent of any one project and covers things like naming conventions, comment standards, SQL standards, etc. Additionally, the JDK version must be the same everywhere, in local development environments and on the servers, and project names, package names, database names, table names, and common fields (e.g., version for optimistic locking) must be defined up front.
Here, I want to emphasize two aspects: project standards and package directory standards.
Project Standards: The boundaries of each project module must be clearly defined, especially in distributed systems. Developers need to fully understand the framework’s hierarchy, such as when to define DTOs and when to define Domains. If this is not clear, the project will grow larger throughout its lifecycle, and when problems surface, they can be disastrous.
Package Directory Standards: Define the package directory layout from the top level down as far as possible, and give developers minimal permission to create new packages. Simply put, developers should create as few packages as possible. This is similar to outsourced projects in Japan, where the goal is to unify standards. Additionally, avoid having two project modules with the same package path, which could lead to conflicts when one module references the other.
After the system is decomposed, it’s essential to define the boundaries and responsibilities of the components (subsystems or modules). This is the issue developers are most concerned about early in the project. If this isn’t clearly defined, the system could face the risk of refactoring later on.
For example, not all base information and configuration data should be placed in the base data module. Only base data and configuration that span systems, services, or modules should belong to the base data management scope. Additionally, the interfaces for the base data service should be general, minimizing the need for custom interfaces for specific systems, services, or modules. The base data service should not depend on upper-layer services.
At the beginning of the project, aside from some basic modules like utility classes being packaged into Release versions, most other modules will be SNAPSHOT versions. As the project gradually goes live, version control becomes increasingly important, especially in distributed systems where RPC services ship jars that client systems reference. Here, I recommend reading the "Semantic Versioning 2.0.0" specification, a versioning convention written by Tom Preston-Werner, founder of Gravatar and co-founder of GitHub.
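To make the idea concrete, here is a minimal sketch of how a client system might decide whether a newer RPC jar is a drop-in upgrade under semantic versioning. The class `SemVer` and the method `isCompatibleUpgrade` are hypothetical names for illustration, and the sketch deliberately ignores pre-release and build metadata ordering.

```java
import java.util.Comparator;

// Hypothetical helper: handles plain MAJOR.MINOR.PATCH only. Any suffix such
// as "-SNAPSHOT" or "+build.5" is stripped and not compared.
final class SemVer implements Comparable<SemVer> {
    private static final Comparator<SemVer> ORDER =
            Comparator.comparingInt((SemVer v) -> v.major)
                    .thenComparingInt(v -> v.minor)
                    .thenComparingInt(v -> v.patch);

    final int major;
    final int minor;
    final int patch;

    private SemVer(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    static SemVer parse(String text) {
        String core = text.split("[-+]", 2)[0]; // drop pre-release/build suffix
        String[] parts = core.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH: " + text);
        }
        return new SemVer(Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]));
    }

    // A client built against this version can move to the candidate without
    // code changes when the major version matches and the candidate is not
    // older. Major version 0 is treated as unstable, so nothing is assumed.
    boolean isCompatibleUpgrade(SemVer candidate) {
        return major > 0 && major == candidate.major && candidate.compareTo(this) >= 0;
    }

    @Override
    public int compareTo(SemVer other) {
        return ORDER.compare(this, other);
    }
}
```

With this rule, a client on 1.4.2 would accept 1.5.0 but treat 2.0.0 as a breaking change that needs review.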
Define the relationship between the main branch (trunk), branches (branches), and baselines (tags) according to the project’s development model, and establish the release process standards for the related environments.
In distributed system development, it’s unthinkable to not have automated builds. Often, a project has many subprojects deployed across dozens or even hundreds of servers. Without automated builds, developers could spend significant time packaging and releasing services. If a testing bug needs to be fixed immediately, it could take a long time, leaving the entire team exhausted.
Logs in production environments must not be output to the console. In project code, logs and exceptions should be output through the Logger object; System.out.println and similar methods should not be used, and exceptions should not be printed with e.printStackTrace(). Logs should be split by date and size to prevent files from becoming too large to open. The log level in production must not be set to DEBUG. If debugging is necessary, enable it only for specific classes rather than for the entire environment, because environment-wide DEBUG output could cause the system to freeze due to file locks.
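The following is a minimal sketch of these rules using java.util.logging from the JDK so that it stays self-contained; a real project would more likely use its own logging framework, and the class `OrderService` and its methods are hypothetical. Handler configuration (rolling files split by date and size) is assumed to live in external configuration and is not shown.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical service used only to illustrate the logging rules.
final class OrderService {
    private static final Logger LOG = Logger.getLogger(OrderService.class.getName());

    // Returns false instead of throwing so the caller can decide how to react.
    static boolean cancel(String orderId) {
        try {
            if (orderId == null || orderId.isEmpty()) {
                throw new IllegalArgumentException("orderId is required");
            }
            // FINE is the java.util.logging counterpart of DEBUG.
            LOG.log(Level.FINE, "Cancelling order {0}", orderId);
            return true;
        } catch (IllegalArgumentException e) {
            // Hand the exception to the logger so the stack trace goes to the
            // configured handlers rather than through e.printStackTrace().
            LOG.log(Level.WARNING, "Cancel rejected", e);
            return false;
        }
    }

    // Raises the level for one class only, leaving the rest of the system
    // at its normal production level.
    static void enableDebugFor(Class<?> type) {
        Logger.getLogger(type.getName()).setLevel(Level.FINE);
    }
}
```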
In a distributed business processing system, there are many cross-service calls, and different types of exceptions need to be handled at different levels. As the system expands, the handling methods need to adapt to more types of exceptions. This requires unified cross-service exception definitions and the ability to quickly locate the source and cause of a cross-service exception. Generally, this is achieved by defining unified exception codes across the system, where each code records the cause and identifies the system that raised it.
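One possible shape for such a unified exception is sketched below. The class name `ServiceException` and the `<SYSTEM>-<NUMBER>` code format are assumptions made for illustration, not a convention taken from the project described here.

```java
import java.util.Locale;

// Hypothetical unified exception: every service throws this type so callers
// can tell which system failed and why from the code alone.
final class ServiceException extends RuntimeException {
    private final String system; // originating system, e.g. "INVENTORY"
    private final int number;    // cause number within that system

    ServiceException(String system, int number, String message, Throwable cause) {
        super(message, cause);
        this.system = system;
        this.number = number;
    }

    String system() {
        return system;
    }

    // Assumed code format: system name, dash, zero-padded cause number.
    String code() {
        return String.format(Locale.ROOT, "%s-%04d", system, number);
    }

    // Prefix the code so it shows up in every log line and stack trace.
    @Override
    public String getMessage() {
        return "[" + code() + "] " + super.getMessage();
    }
}
```

A calling service can then wrap a downstream failure, for example `new ServiceException("INVENTORY", 12, "stock locked", cause)`, and the code travels with the exception across service boundaries.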
Batch updates must use the unified batch update method provided, as this significantly improves performance. If there is a special scenario that prevents this, approval is required. Otherwise, it could become a performance bottleneck.
Services like SMS and email should be prioritized in development, as almost every module may involve them.
If multithreading is used in the system, avoid creating threads arbitrarily. Use a unified thread pool as much as possible and encapsulate common calling methods and return results.
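As a sketch of what a unified pool with an encapsulated calling method and return result could look like, consider the wrapper below. The names `SharedExecutor`, `submit`, and `Result`, as well as the pool size and queue capacity, are illustrative assumptions that would need tuning for a real workload.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical single entry point for background work in the application.
final class SharedExecutor {
    private static final AtomicInteger THREAD_SEQ = new AtomicInteger();

    // Bounded pool and queue; when both are full the caller runs the task
    // itself, which slows producers down instead of dropping work.
    private static final ExecutorService POOL = new ThreadPoolExecutor(
            4, 4, 60L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(100),
            runnable -> {
                Thread t = new Thread(runnable, "biz-pool-" + THREAD_SEQ.incrementAndGet());
                t.setDaemon(true);
                return t;
            },
            new ThreadPoolExecutor.CallerRunsPolicy());

    // Common return type: either a value or the error that the task threw.
    static final class Result<T> {
        final T value;
        final Throwable error;

        Result(T value, Throwable error) {
            this.value = value;
            this.error = error;
        }

        boolean ok() {
            return error == null;
        }
    }

    static <T> CompletableFuture<Result<T>> submit(Supplier<T> task) {
        return CompletableFuture.supplyAsync(task, POOL)
                .handle((value, error) -> new Result<T>(value, unwrap(error)));
    }

    // supplyAsync wraps task failures in CompletionException; expose the cause.
    private static Throwable unwrap(Throwable error) {
        if (error instanceof CompletionException && error.getCause() != null) {
            return error.getCause();
        }
        return error;
    }
}
```

Routing all asynchronous work through one method like this keeps thread naming, queue limits, and error handling in a single place.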
11.1 Avoid Long Transactions: Avoid creating transactions that lock multiple rows of data for long periods. Split transactions or switch to asynchronous handling where possible.
11.2 Distributed Transactions: Try to avoid relying on database-level distributed transactions to ensure data consistency. Instead, use asynchronous methods, compensation mechanisms, and idempotence to achieve eventual data consistency.
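Idempotence is what makes retries and compensation safe: a repeated message must not apply its effect twice. The sketch below illustrates the idea with an in-memory map keyed by request ID. The class `IdempotentHandler` is hypothetical, and the map is only a stand-in for a durable store such as a database table with a unique key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical handler that runs an action at most once per request ID and
// replays the recorded result on every retry.
final class IdempotentHandler<R> {
    // Stand-in for durable storage; results are lost on restart.
    private final Map<String, R> processed = new ConcurrentHashMap<>();

    // The action must return a non-null result, otherwise nothing is
    // recorded and a retry would run it again.
    R handle(String requestId, Supplier<R> action) {
        // computeIfAbsent is atomic per key, so concurrent retries of the
        // same request execute the action only once.
        return processed.computeIfAbsent(requestId, id -> action.get());
    }
}
```

A consumer that receives the same message twice, for example after a timeout and resend, gets the first result back without charging or shipping a second time.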
Caching is often overlooked, only becoming an issue when performance bottlenecks arise later in the project, at which point the cost of refactoring is very high. A caching strategy, whether using Ehcache, Guava, or Redis, should be determined early on and implemented by the development team.
When using Memcached or Redis, data structure usage must be standardized. For example, developers should not each store data as ad hoc key/value pairs in whatever form they like, as this leads to a messy and disorganized cache environment.
Determine early whether to implement shared disks physically or use a distributed file system.
Writing unit tests should become a habit. Additionally, unit tests should be side-effect-free. A well-defined unit test should yield the same result every time it runs, provided no external conditions have changed. For example, inserting test data into the database and then verifying it in the test is unreliable, as the database may change. Using in-memory databases or generating and deleting test data automatically during unit tests is more appropriate.
One reason many colleagues are reluctant to write unit tests is the many dependencies (remote service calls, Redis, web services, etc.). If a service goes down for any reason, the test will fail. To address this pain point, mock objects can be introduced. Mockito is a framework that can simulate such behaviors, allowing developers to focus on business logic without the need to prepare various dependency environments.
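The underlying idea can be shown without any framework: depend on an interface and substitute a stand-in during the test. Mockito generates such stand-ins for you; to keep this sketch free of external dependencies, it uses a hand-written lambda stub instead. The names `InventoryClient` and `OrderValidator` are hypothetical.

```java
// Illustrative types only; none of these names come from a real library.
final class MockingSketch {
    // Hypothetical remote dependency, in production backed by an RPC call.
    interface InventoryClient {
        int available(String sku);
    }

    // Business logic under test; it only sees the interface, so a test can
    // pass in a stub instead of a live service.
    static final class OrderValidator {
        private final InventoryClient inventory;

        OrderValidator(InventoryClient inventory) {
            this.inventory = inventory;
        }

        boolean canFulfil(String sku, int quantity) {
            return quantity > 0 && inventory.available(sku) >= quantity;
        }
    }
}
```

In a test, `new MockingSketch.OrderValidator(sku -> 5)` behaves as if five units were always in stock, so the test gives the same result whether or not the real inventory service is running.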