10 — Best Practices, CI/CD, and Common Pitfalls


Project structure

Keep simulations small and composable. Extract reusable parts into helper classes.

For Gradle projects, replace src/test/ with src/gatling/ (see 01 — Setup).

src/test/
├── java/com/example/
│   ├── simulations/
│   │   ├── SmokeSimulation.java      ← 1 user, quick sanity check
│   │   ├── LoadSimulation.java       ← normal expected load
│   │   └── StressSimulation.java     ← pushes system to limit
│   ├── scenarios/
│   │   ├── AuthScenarios.java        ← login/logout chains
│   │   └── ShopScenarios.java        ← browse, add to cart, checkout
│   └── config/
│       ├── HttpConfig.java           ← shared protocol builder
│       └── SimConfig.java            ← env-driven load numbers
└── resources/
    ├── data/
    │   └── users.csv
    └── bodies/
        └── createOrder.json

Externalise load parameters

Never hardcode user counts and durations. Read them from system properties or environment variables so CI can override them.

Java

public class SimConfig {
    public static final int USERS =
        Integer.getInteger("USERS", 50);
    public static final int RAMP_SECONDS =
        Integer.getInteger("RAMP", 30);
    public static final int STEADY_SECONDS =
        Integer.getInteger("STEADY", 120);
}
scn.injectOpen(
    rampUsers(SimConfig.USERS).during(Duration.ofSeconds(SimConfig.RAMP_SECONDS)),
    constantUsersPerSec(SimConfig.USERS / 10.0)
        .during(Duration.ofSeconds(SimConfig.STEADY_SECONDS))
)

Run with:

mvn gatling:test -DUSERS=200 -DRAMP=60 -DSTEADY=300

Kotlin

object SimConfig {
    val users = System.getProperty("USERS", "50").toInt()
    val rampSeconds = System.getProperty("RAMP", "30").toLong()
    val steadySeconds = System.getProperty("STEADY", "120").toLong()
}

Shared HTTP protocol

Define the protocol once and reuse it across all simulations:

// Java — HttpConfig.java
public class HttpConfig {
    public static HttpProtocolBuilder build() {
        return http
            .baseUrl(System.getProperty("BASE_URL", "http://localhost:8080"))
            .acceptHeader("application/json")
            .contentTypeHeader("application/json");
    }
}
// Kotlin
object HttpConfig {
    fun build() = http
        .baseUrl(System.getProperty("BASE_URL", "http://localhost:8080"))
        .acceptHeader("application/json")
        .contentTypeHeader("application/json")
}
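
A simulation can then wire the shared builders together. A minimal Java sketch, assuming the usual Gatling static imports and the SimConfig class from above (the class, scenario, and endpoint names are illustrative):

```java
// Assumes: import static io.gatling.javaapi.core.CoreDsl.*;
//          import static io.gatling.javaapi.http.HttpDsl.*;
import io.gatling.javaapi.core.ScenarioBuilder;
import io.gatling.javaapi.core.Simulation;
import java.time.Duration;

public class LoadSimulation extends Simulation {

    // Scenario defined once; the protocol comes from the shared builder
    ScenarioBuilder scn = scenario("Shop")
        .exec(http("Get items").get("/items"));

    {
        setUp(
            scn.injectOpen(
                rampUsers(SimConfig.USERS)
                    .during(Duration.ofSeconds(SimConfig.RAMP_SECONDS)))
        ).protocols(HttpConfig.build());
    }
}
```

Because both the protocol and the load numbers live in one place, every simulation in the project stays a few lines long.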

CI/CD integration

GitHub Actions

name: Load test

on:
  push:
    branches: [main]

jobs:
  gatling:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          java-version: '21'
          distribution: 'temurin'

      - name: Run smoke test
        run: |
          mvn gatling:test \
            -Dgatling.simulationClass=com.example.SmokeSimulation \
            -DBASE_URL=https://staging.example.com

      - name: Upload report
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: gatling-report
          path: target/gatling/

GitLab CI

load-test:
  image: maven:3.9-eclipse-temurin-21
  script:
    - mvn gatling:test
      -Dgatling.simulationClass=com.example.LoadSimulation
      -DBASE_URL=$STAGING_URL
      -DUSERS=100
  artifacts:
    when: always
    paths:
      - target/gatling/
    expire_in: 7 days

Assertions as CI gates

setUp(scn.injectOpen(rampUsers(100).during(Duration.ofSeconds(60))))
    .protocols(httpProtocol)
    .assertions(
        global().responseTime().percentile(95).lt(500),
        global().successfulRequests().percent().gt(99.0)
    );

If an assertion fails, Gatling exits with code 2 — your CI pipeline fails automatically.


Common pitfalls

1. Don't use Thread.sleep

Gatling runs on a non-blocking engine. Calling Thread.sleep blocks a real OS thread and kills scalability.

// BAD — blocks a real OS thread (and won't even compile as-is:
// Thread.sleep throws a checked InterruptedException)
exec(session -> { Thread.sleep(1000); return session; })

// GOOD — non-blocking pause of 1 second
.pause(1)

2. Don't share mutable state between users

Session is per-user. Avoid static mutable fields.

// BAD — all users share this counter (race condition)
private static int counter = 0;
exec(session -> { counter++; return session.set("n", counter); })

// GOOD — use session or a thread-safe AtomicInteger if you really need it
exec(session -> session.set("n", session.userId()))
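
If a shared counter is genuinely required (for example to hand out unique IDs across all virtual users), an AtomicInteger avoids the race. A minimal sketch, independent of the Gatling DSL (class and method names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class Counters {
    // One shared counter for all virtual users; incrementAndGet is atomic,
    // so no two users can ever receive the same value.
    static final AtomicInteger NEXT_ID = new AtomicInteger();

    static int nextId() {
        return NEXT_ID.incrementAndGet();
    }
}
```

In a scenario you would then write `exec(session -> session.set("n", Counters.nextId()))`.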

3. Don't forget think time

Real users don't hammer endpoints as fast as possible. Add realistic pauses:

.pause(Duration.ofMillis(500), Duration.ofSeconds(2))

4. Warm up the JIT

The first few seconds of a test can show inflated latencies because the JVM is JIT-compiling hot paths. Add a warm-up phase:

scn.injectOpen(
    atOnceUsers(5),                              // JVM warm-up
    nothingFor(Duration.ofSeconds(10)),
    rampUsers(200).during(Duration.ofSeconds(60))
)

5. Avoid large bodies in memory

Use RawFileBody or ElFileBody instead of building huge strings in-process.
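
For example, the createOrder.json file from the resources tree above can be referenced per request with ElFileBody, which also resolves #{...} placeholders per user. A sketch assuming the usual Gatling static imports (the endpoint name and path are illustrative):

```java
// Assumes: import static io.gatling.javaapi.core.CoreDsl.*;
//          import static io.gatling.javaapi.http.HttpDsl.*;
exec(
    http("Create order")
        .post("/orders")
        // File is loaded from the classpath (src/test/resources/bodies/),
        // not assembled as a giant in-memory string
        .body(ElFileBody("bodies/createOrder.json")).asJson()
)
```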

6. Don't run tests against production

Always point at a dedicated staging environment. Gatling can saturate a database or exhaust connection pools very quickly.

7. Keep request names stable

Gatling tracks metrics by request name. If names are dynamic (e.g. they contain timestamps), you get thousands of individual series in the report and stats.js becomes huge.

// BAD
http("Request at " + System.currentTimeMillis()).get("/items")

// GOOD
http("Get items").get("/items")
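
When the URL itself varies per user, keep the name static and move the variation into the path; Gatling's expression language resolves the placeholder per session. A sketch (the itemId attribute is illustrative, e.g. set by a feeder):

```java
// GOOD — one stable series in the report, dynamic path per user
http("Get item").get("/items/#{itemId}")
```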

Performance tuning the Gatling process itself

If you need >5,000 concurrent users on a single machine:

  1. Increase open file descriptors (each TCP connection = 1 fd):
ulimit -n 65535
  2. Tune the JVM heap (more users = more session objects):
JAVA_OPTS="-Xms512m -Xmx4g" mvn gatling:test
  3. Use a dedicated load generator: run Gatling on a separate machine from the system under test to avoid resource contention.

Checklist before running a load test

  • [ ] Tested the simulation with atOnceUsers(1) (smoke run)
  • [ ] Base URL points to staging, not production
  • [ ] Test data (feeders) are populated and correct
  • [ ] Assertions are defined in setUp
  • [ ] Load parameters are read from properties, not hardcoded
  • [ ] Think-time / pauses match realistic user behaviour
  • [ ] The team and ops are aware the test is running
  • [ ] Monitoring (APM, DB metrics, pod metrics) is active during the run